This comprehensive list of 725 machine learning terms was compiled by several machine learning experts and is worth bookmarking.
English term Chinese translation
0-1 Loss Function 0-1 loss function
Accept-Reject Sampling Method accept-reject sampling method
Accumulated Error Backpropagation cumulative error backpropagation
Accuracy accuracy
Acquisition Function acquisition function
Action actions
Activation Function activation function
Active Learning Active Learning
Adaptive Bitrate Algorithm Adaptive Bitrate Algorithm
Adaptive Boosting AdaBoost
Adaptive Gradient Algorithm AdaGrad
Adaptive Moment Estimation Algorithm Adam algorithm
Adaptive Resonance Theory adaptive resonance theory
Additive Model additive model
Affinity Matrix Affinity Matrix
Agent agent
Algorithm algorithm
Alpha-Beta Pruning α-β pruning
Anomaly Detection anomaly detection
Approximate Inference approximate inference
Area Under ROC Curve AUC
Artificial Intelligence Artificial Intelligence
Artificial Neural Network artificial neural networks
Artificial Neuron artificial neurons
Attention attention
Attention Mechanism Attention Mechanism
Attribute attribute
Attribute Space attribute space
Autoencoder autoencoder
Automatic Differentiation automatic differentiation
Autoregressive Model autoregressive model
Back Propagation backpropagation
Back Propagation Algorithm backpropagation algorithm
Back Propagation Through Time backpropagation through time
Backward Induction backward induction
Backward Search backward search
Bag of Words bag of words
Bandit bandit/slot machine
Base Learner base learner
Base Learning Algorithm base learning algorithm
Baseline baseline
Batch batch
Batch Normalization batch normalization
Bayes Decision Rule Bayes decision rule
Bayes Model Averaging Bayesian model averaging
Bayes Optimal Classifier Bayesian optimal classifier
Bayes' Theorem Bayes' theorem
Bayesian Decision Theory Bayesian decision theory
Bayesian Inference Bayesian inference
Bayesian Learning Bayesian Learning
Bayesian Network Bayesian network
Bayesian Optimization Bayesian optimization
Beam Search beam search
Benchmark benchmark
Belief Network belief network
Belief Propagation belief propagation
Bellman Equation Bellman Equation
Bernoulli Distribution Bernoulli distribution
Beta Distribution beta distribution
Between-Class Scatter Matrix between-class scatter matrix
BFGS BFGS
Bias bias/offset
Bias In Affine Function bias (offset)
Bias In Statistics bias
Bias Shift bias shift
Bias-Variance Decomposition bias-variance decomposition
Bias-Variance Dilemma bias-variance dilemma
Bidirectional Recurrent Neural Network bidirectional recurrent neural networks
Bigram bigram
Bilingual Evaluation Understudy BLEU
Binary Classification binary classification
Binomial Distribution binomial distribution
Binomial Test binomial test
Boltzmann Distribution Boltzmann distribution
Boltzmann Machine Boltzmann machine
Boosting Boosting
Bootstrap Aggregating Bagging
Bootstrap Sampling bootstrap sampling
Bootstrapping bootstrapping
Break-Even Point break-even point
Bucketing binning
Calculus of Variations variational method
Cascade-Correlation cascading correlation
Catastrophic Forgetting catastrophic forgetting
Categorical Distribution categorical distribution
Cell unit
Chain Rule Chain Rule
Chebyshev Distance Chebyshev Distance
Class class
Class-Imbalance class imbalance
Classification classification
Classification And Regression Tree Classification and Regression Tree
Classifier classifier
Clique clique
Cluster cluster
Cluster Assumption cluster assumption
Clustering clustering
Clustering Ensemble clustering ensemble
Co-Training co-training
Coding Matrix encoding matrix
Collaborative Filtering Collaborative Filtering
Competitive Learning Competitive Learning
Comprehensibility interpretability
Computation Graph computation graph
Computational Learning Theory computational learning theory
Conditional Entropy conditional entropy
Conditional Probability conditional probability
Conditional Probability Distribution conditional probability distribution
Conditional Random Field conditional random field
Conditional Risk Conditional Risk
Confidence confidence
Confusion Matrix confusion matrix
Conjugate Distribution conjugate distribution
Connection Weight connection weight
Connectionism connectionism
Consistency consistency
Constrained Optimization constrained optimization
Context Variable context variable
Context Vector context vector
Context Window Context window
Context Word context words
Contextual Bandit contextual bandit
Contingency Table contingency table
Continuous Attribute continuous attribute
Contrastive Divergence contrastive divergence
Convergence convergence
Convex Optimization convex optimization
Convex Quadratic Programming convex quadratic programming
Convolution convolution
Convolutional Kernel convolution kernel
Convolutional Neural Network convolutional neural networks
Coordinate Descent coordinate descent
Corpus corpus
Correlation Coefficient correlation coefficient
Cosine Similarity cosine similarity
Cost cost
Cost Curve cost curve
Cost Function cost function
Cost Matrix cost matrix
Cost-Sensitive cost-sensitive
Covariance covariance
Covariance Matrix covariance matrix
Critical Point critical point
Cross Entropy cross entropy
Cross Validation cross-validation
Curse of Dimensionality curse of dimensionality
Cutting Plane Algorithm cutting-plane method
Data Mining data mining
Data Set dataset
Davidon-Fletcher-Powell DFP
Decision Boundary Decision Boundary
Decision Function decision function
Decision Stump Decision stump
Decision Tree Decision tree
Decoder decoder
Decoding decoding
Deconvolution deconvolution
Deconvolutional Network Deconvolutional Network
Deduction deduction
Deep Belief Network Deep Belief Network
Deep Boltzmann Machine Deep Boltzmann Machine
Deep Convolutional Generative Adversarial Network deep convolutional generative adversarial network
Deep Learning deep learning
Deep Neural Network Deep Neural Network
Deep Q-Network deep Q-network
Delta-Bar-Delta Delta-Bar-Delta
Denoising denoising
Denoising Autoencoder denoising autoencoder
Denoising Score Matching denoising score matching
Density Estimation density estimation
Density-Based Clustering density clustering
Derivative derivative
Determinant determinant
Diagonal Matrix diagonal matrix
Dictionary Learning Dictionary Learning
Dimension Reduction dimension reduction
Directed Edge directed edge
Directed Graphical Model directed graphical model
Directed Separation directed separation
Dirichlet Distribution Dirichlet Distribution
Discriminative Model discriminative model
Discriminator discriminator
Discriminator Network discriminator network
Distance Measure distance measure
Distance Metric Learning Distance Metric Learning
Distributed Representation distributed representation
Diverge diverge
Divergence divergence
Diversity diversity
Diversity Measure diversity measure
Domain Adaptation domain adaptation
Dominant Eigenvalue dominant eigenvalue
Dominant Strategy dominant strategy
Down Sampling downsampling
Dropout dropout
Dropout Boosting dropout boosting
Dropout Method dropout method
Dual Problem dual problem
Dummy Node dummy node
Dynamic Bayesian Network Dynamic Bayesian Network
Dynamic Programming dynamic programming
Early Stopping early stopping
Eigendecomposition eigendecomposition
Eigenvalue eigenvalue
Element-Wise Product element-wise product
Embedding embedding
Empirical Conditional Entropy empirical conditional entropy
Empirical Distribution empirical distribution
Empirical Entropy empirical entropy
Empirical Error empirical error
Empirical Risk empirical risk
Empirical Risk Minimization empirical risk minimization
Encoder encoder
Encoding encoding
End-To-End end-to-end
Energy Function Energy function
Energy-Based Model energy-based model
Ensemble Learning ensemble learning
Ensemble Pruning ensemble pruning
Entropy entropy
Episode episode
Epoch epoch
Error error
Error Backpropagation Algorithm Error Backpropagation Algorithm
Error Backpropagation error backpropagation
Error Correcting Output Codes error-correcting output codes
Error Rate error rate
Error-Ambiguity Decomposition error-ambiguity decomposition
Estimator estimator
Euclidean Distance Euclidean distance
Evidence evidence
Evidence Lower Bound evidence lower bound
Exact Inference exact inference
Example example
Expectation expectation
Expectation Maximization expectation maximization
Expected Loss Expected Loss
Expert System Expert System
Exploding Gradient gradient explosion
Exponential Loss Function Exponential Loss Function
Factor factor
Factorization factorization
Feature feature
Feature Engineering Feature Engineering
Feature Map feature map
Feature Selection feature selection
Feature Vector feature vector
Feature Learning feature learning
Feedforward feedforward
Feedforward Neural Network feedforward neural network
Few-Shot Learning few-shot learning
Filter filter
Fine-Tuning fine-tuning
Fluctuation oscillation
Forget Gate forget gate
Forward Propagation forward propagation
Forward Stagewise Algorithm forward stagewise algorithm
Fractionally Strided Convolution fractionally strided convolution
Frobenius Norm Frobenius norm
Full Padding full padding
Functional functional
Functional Neuron functional neurons
Gated Recurrent Unit gated recurrent unit
Gated RNN Gated RNN
Gaussian Distribution Gaussian distribution
Gaussian Kernel Gaussian kernel
Gaussian Kernel Function Gaussian kernel function
Gaussian Mixture Model Gaussian mixture model
Gaussian Process Gaussian Process
Generalization Ability generalization ability
Generalization Error Generalization error
Generalization Error Bound generalization error upper bound
Generalize generalize
Generalized Lagrange Function generalized Lagrange function
Generalized Linear Model generalized linear model
Generalized Rayleigh Quotient generalized Rayleigh quotient
Generative Adversarial Network generative adversarial network
Generative Model generative model
Generator generator
Generator Network Generator Network
Genetic Algorithm genetic algorithm
Gibbs Distribution Gibbs distribution
Gibbs Sampling Gibbs sampling
Gini Index Gini index
Global Markov Property global Markov property
Global Minimum Global Minimum
Gradient gradient
Gradient Clipping gradient clipping
Gradient Descent gradient descent
Gradient Descent Method gradient descent method
Gradient Exploding Problem Gradient Explosion Problem
Gram Matrix Gram matrix
Graph Convolutional Network graph convolutional network
Graph Neural Network graph neural network
Graphical Model graphical model
Grid Search grid search
Ground Truth ground truth
Hadamard Product Hadamard product
Hamming Distance Hamming Distance
Hard Margin hard margin
Hebbian Rule Hebbian rule
Hidden Layer Hidden Layer
Hidden Markov Model Hidden Markov Model
Hidden Variable hidden variable
Hierarchical Clustering hierarchical clustering
Hilbert Space Hilbert Space
Hinge Loss Function hinge loss function
Hold-Out hold-out method
Hyperparameter hyperparameter
Hyperparameter Optimization hyperparameter optimization
Hypothesis hypothesis
Hypothesis Space hypothesis space
Hypothesis Test hypothesis testing
Identity Matrix identity matrix
Imitation Learning imitation learning
Importance Sampling Importance Sampling
Improved Iterative Scaling improved iterative scaling
Incremental Learning incremental learning
Independent and Identically Distributed independent and identically distributed
Indicator Function indicator function
Individual Learner individual learner
Induction induction
Inductive Bias inductive bias
Inductive Learning inductive learning
Inductive Logic Programming Inductive Logic Programming
Inference inference
Information Entropy Information entropy
Information Gain information gain
Inner Product inner product
Instance instance
Internal Covariate Shift internal covariate shift
Inverse Matrix inverse matrix
Inverse Resolution inverse resolution
Isometric Mapping isometric mapping
Jacobian Matrix Jacobian matrix
Jensen Inequality Jensen inequality
Joint Probability Distribution joint probability distribution
K-Armed Bandit Problem k-armed bandit problem
K-Fold Cross Validation k-fold cross-validation
Karush-Kuhn-Tucker Condition KKT condition
Karush–Kuhn–Tucker Karush–Kuhn–Tucker
Kernel Function kernel function
Kernel Method kernel method
Kernel Trick kernel trick
Kernelized Linear Discriminant Analysis kernelized linear discriminant analysis
KL Divergence KL divergence
L-BFGS L-BFGS
Label label
Label Space label space
Lagrange Duality Lagrange duality
Lagrange Multiplier Lagrange multiplier
Language Model language model
Laplace Smoothing Laplace smoothing
Laplacian Correction Laplace correction
Latent Dirichlet Allocation latent Dirichlet allocation
Latent Semantic Analysis latent semantic analysis
Latent Variable latent variable/hidden variable
Law of Large Numbers law of large numbers
Layer Normalization layer normalization
Lazy Learning lazy learning
Leaky ReLU leaky rectified linear unit
Learner Learner
Learning learning
Learning By Analogy learning by analogy
Learning Rate Learning Rate
Learning Vector Quantization Learning Vector Quantization
Least Square Method Least Squares
Least Squares Regression Tree Least Squares Regression Tree
Left Singular Vector left singular vector
Likelihood likelihood
Linear Chain Conditional Random Field linear-chain conditional random field
Linear Classification Model Linear Classification Model
Linear Classifier linear classifier
Linear Dependence linear dependence
Linear Discriminant Analysis Linear Discriminant Analysis
Linear Model linear model
Linear Regression linear regression
Link Function link function
Local Markov Property local Markov property
Local Minima local minima
Local Minimum local minimum
Local Representation local representation
Log Likelihood log-likelihood function
Log Linear Model log-linear model
Log-Likelihood log-likelihood
Log-Linear Regression log-linear regression
Logistic Function logistic function
Logistic Regression logistic regression
Logit logit (log-odds)
Long Short Term Memory Long Short Term Memory
Long Short-Term Memory Network Long short-term memory network
Loopy Belief Propagation loopy belief propagation
Loss Function loss function
Low Rank Matrix Approximation low rank matrix approximation
Machine Learning Machine Learning
Macro-R macro-averaged recall
Manhattan Distance Manhattan distance
Manifold manifold
Manifold Assumption manifold hypothesis
Manifold Learning manifold learning
Margin margin
Marginal Distribution marginal distribution
Marginal Independence marginal independence
Marginalization marginalization
Markov Chain Markov Chain
Markov Chain Monte Carlo Markov Chain Monte Carlo
Markov Decision Process Markov Decision Process
Markov Network Markov Network
Markov Process Markov Process
Markov Random Field Markov random field
Mask mask
Matrix matrix
Matrix Inversion matrix inversion
Max Pooling max pooling
Maximal Clique maximal clique
Maximum Entropy Model Maximum Entropy Model
Maximum Likelihood Estimation maximum likelihood estimation
Maximum Margin maximum margin
Mean Field mean field
Mean Pooling mean pooling
Mean Squared Error Mean Squared Error
Mean-Field mean-field
Memory Network Memory Network
Message Passing message passing
Metric Learning metric learning
Micro-R micro-averaged recall
Minibatch minibatch
Minimal Description Length Minimum Description Length
Minimax Game minimax game
Minkowski Distance Minkowski Distance
Mixture of Experts mixture of experts
Mixture-of-Gaussian Gaussian mixture
Model model
Model Selection model selection
Momentum Method momentum method
Monte Carlo Method Monte Carlo Method
Moral Graph moral graph
Moralization moralization
Multi-Class Classification multi-class classification
Multi-Head Attention Multi-Head Attention
Multi-Head Self-Attention Multi-Head Self-Attention
Multi-Kernel Learning multi-kernel learning
Multi-Label Learning multi-label learning
Multi-Layer Feedforward Neural Networks Multilayer Feedforward Neural Networks
Multi-Layer Perceptron multilayer perceptron
Multinomial Distribution multinomial distribution
Multiple Dimensional Scaling multidimensional scaling
Multiple Linear Regression Multiple Linear Regression
Multitask Learning multitask learning
Multivariate Normal Distribution multivariate normal distribution
Mutual Information mutual information
N-Gram Model n-gram model
Naive Bayes Classifier Naive Bayes classifier
Naive Bayes Naive Bayes
Nearest Neighbor Classifier nearest neighbor classifier
Negative Log Likelihood negative log likelihood function
Neighbourhood Component Analysis neighbourhood component analysis
Net Input net input
Neural Network Neural Network
Neural Turing Machine Neural Turing Machine
Neuron neurons
Newton Method Newton's Method
No Free Lunch Theorem no free lunch theorem
Noise-Contrastive Estimation noise-contrastive estimation
Nominal Attribute nominal attribute
Non-Convex Optimization non-convex optimization
Non-Metric Distance non-metric distance
Non-Negative Matrix Factorization non-negative matrix factorization
Non-Ordinal Attribute non-ordinal attribute
Norm norm
Normal Distribution normal distribution
Normalization normalization
Nuclear Norm nuclear norm
Number of Epochs number of epochs
Numerical Attribute numerical attribute
Object Detection object detection
Oblique Decision Tree Oblique decision tree
Occam's Razor Occam's razor
Odds odds
Off-Policy off-policy
On-Policy on-policy
One-Dependent Estimator one-dependent estimator
One-Hot one-hot
Online Learning online learning
Optimizer optimizer
Ordinal Attribute ordinal attribute
Orthogonal orthogonal
Orthogonal Matrix orthogonal matrix
Out-of-Bag Estimate out-of-bag estimate
Outlier outlier
Over-Parameterized over-parameterized
Overfitting overfitting
Oversampling oversampling
PAC-Learnable PAC-learnable
Padding padding
Pairwise Markov Property pairwise Markov property
Parallel Distributed Processing parallel distributed processing
Parameter parameter
Parameter Estimation parameter estimation
Parameter Space parameter space
Parameter Tuning parameter tuning
Parametric ReLU parametric rectified linear unit
Part-Of-Speech Tagging part-of-speech tagging
Partial Derivative partial derivative
Partially Observable Markov Decision Processes partially observable Markov decision processes
Partition Function partition function
Perceptron perceptron
Performance Measure performance measure
Perplexity perplexity
Pointer Network Pointer Network
Policy policy
Policy Gradient policy gradient
Policy Iteration policy iteration
Polynomial Kernel Function polynomial kernel function
Pooling pooling
Pooling Layer pooling layer
Positive Definite Matrix positive definite matrix
Post-Pruning post-pruning
Potential Function potential function
Power Method power method
Pre-Training pre-training
Precision precision
Prepruning pre-pruning
Primal Problem primal problem
Primary Visual Cortex primary visual cortex
Principal Component Analysis principal component analysis
Prior prior
Probabilistic Context-Free Grammar probabilistic context-free grammar
Probabilistic Graphical Model probabilistic graphical model
Probabilistic Model probabilistic model
Probability Density Function probability density function
Probability Distribution probability distribution
Probably Approximately Correct probably approximately correct
Proposal Distribution proposal distribution
Prototype-Based Clustering prototype-based clustering
Proximal Gradient Descent proximal gradient descent
Pruning pruning
Quadratic Loss Function quadratic loss function
Quadratic Programming quadratic programming
Quasi Newton Method quasi-Newton method
Radial Basis Function Radial Basis Function
Random Forest Random Forest
Random Sampling Random sampling
Random Search random search
Random Variable random variable
Random Walk random walk
Recall recall
Receptive Field receptive field
Reconstruction Error Reconstruction Error
Rectified Linear Unit rectified linear unit
Recurrent Neural Network recurrent neural networks
Recursive Neural Network recursive neural network
Regression regression
Regularization regularization
Regularizer regularizer (regularization term)
Reinforcement Learning reinforcement learning
Relative Entropy relative entropy
Reparameterization reparameterization
Representation representation
Representation Learning representation learning
Representer Theorem representer theorem
Reproducing Kernel Hilbert Space reproducing kernel Hilbert space
Rescaling rescaling
Reset Gate reset gate
Residual Connection residual connection
Residual Network residual network
Restricted Boltzmann Machine Restricted Boltzmann machine
Reward reward
Ridge Regression ridge regression
Right Singular Vector right singular vector
Risk risk
Robustness robustness
Root Node root node
Rule Learning rule learning
Saddle Point saddle point
Sample sample
Sample Complexity Sample complexity
Sample Space Sample Space
Scalar scalar
Selective Ensemble selective ensemble
Self Information self-information
Self-Attention self-attention
Self-Organizing Map self-organizing map
Self-Training self-training
Semi-Definite Programming semidefinite programming
Semi-Naive Bayes Classifiers semi-naive Bayes classifier
Semi-Restricted Boltzmann Machine Semi-restricted Boltzmann machine
Semi-Supervised Clustering semi-supervised clustering
Semi-Supervised Learning Semi-supervised learning
Semi-Supervised Support Vector Machine Semi-supervised support vector machine
Sentiment Analysis sentiment analysis
Separating Hyperplane separating hyperplane
Sequential Covering sequential covering
Sigmoid Belief Network Sigmoid belief network
Sigmoid Function Sigmoid function
Signed Distance signed distance
Similarity Measure Similarity Measure
Simulated Annealing simulated annealing
Simultaneous Localization And Mapping simultaneous localization and mapping
Singular Value singular value
Singular Value Decomposition singular value decomposition
Skip-Gram Model skip-gram model
Smoothing smoothing
Soft Margin soft margin
Soft Margin Maximization soft margin maximization
Softmax Softmax/soft maximization
Softmax Function Softmax function/soft maximization function
Softmax Regression Softmax regression/soft maximization regression
Softplus Function Softplus function
Span span (spanned subspace)
Sparse Coding Sparse encoding
Sparse Representation sparse representation
Sparsity sparsity
Specialization Specialization
Splitting Variable splitting variable
Squashing Function squashing function
Standard Normal Distribution Standard normal distribution
State state
State Value Function state value function
State-Action Value Function state-action value function
Stationary Distribution stationary distribution
Stationary Point stationary point
Statistical Learning statistical learning
Steepest Descent steepest descent method
Stochastic Gradient Descent stochastic gradient descent
Stochastic Matrix stochastic matrix
Stochastic Process stochastic process
Stratified Sampling stratified sampling
Stride stride
Structural Risk structural risk
Structural Risk Minimization Structural Risk Minimization
Subsample subsampling
Subsampling downsampling
Subset Search subset search
Subspace subspace
Supervised Learning supervised learning
Support Vector support vector
Support Vector Expansion support vector expansion
Support Vector Machine support vector machine
Surrogate Loss surrogate loss
Surrogate Function surrogate function
Surrogate Loss Function surrogate loss function
Symbolism symbolism
Tangent Propagation tangent propagation
Teacher Forcing teacher forcing
Temporal-Difference Learning temporal-difference learning
Tensor tensor
Test Error Test Error
Test Sample Test Sample
Test Set test set
Threshold threshold
Threshold Logic Unit Threshold Logic Unit
Threshold-Moving threshold-moving
Tied Weight tied weight
Tikhonov Regularization Tikhonov regularization
Time Delay Neural Network time-delay neural network
Time Homogeneous Markov Chain time-homogeneous Markov chain
Time Step time step
Token token
Tokenization tokenization
Tokenizer tokenizer
Topic Model topic model
Topic Modeling topic modeling
Trace trace
Training training
Training Error training error
Training Sample Training Samples
Training Set training set
Transductive Learning transductive learning
Transductive Transfer Learning transductive transfer learning
Transfer Learning Transfer Learning
Transformer Transformer
Transformer Model Transformer model
Transpose transpose
Transposed Convolution transposed convolution
Trial And Error trial and error
Trigram trigram
Turing Machine Turing Machine
Underfitting underfitting
Undersampling undersampling
Undirected Graphical Model undirected graphical model
Uniform Distribution uniform distribution
Unigram unigram
Unit unit
Universal Approximation Theorem universal approximation theorem
Universal Approximator universal approximator
Universal Function Approximator universal function approximator
Unknown Token unknown word
Unsupervised Layer-Wise Training unsupervised layer-by-layer training
Unsupervised Learning unsupervised learning
Update Gate update gate
Upsampling upsampling
V-Structure V-structure
Validation Set validation set
Validity Index validity index
Value Function Approximation value function approximation
Value Iteration value iteration
Vanishing Gradient Problem vanishing gradient problem
Vapnik-Chervonenkis Dimension VC dimension
Variable Elimination variable elimination
Variance variance
Variational Autoencoder variational autoencoder
Variational Inference variational inference
Vector vector
Vector Space Model Vector Space Model
Version Space version space
Viterbi Algorithm Viterbi algorithm
Vocabulary vocabulary
Warp thread bundle
Weak Learner weak learner
Weakly Supervised Learning Weakly Supervised Learning
Weight weights
Weight Decay weight decay
Weight Sharing weight sharing
Weighted Voting weighted voting
Whitening whitening
Winner-Take-All Winner-Take-All
Within-Class Scatter Matrix within-class scatter matrix
Word Embedding word embedding
Word Sense Disambiguation word sense disambiguation
Word Vector word vector
Zero Padding zero padding
Zero-Shot Learning zero-shot learning
Zipf's Law Zipf's law
Source: Artificial Intelligence AI Technology