Glossary
What is Batch Normalization?
Batch Normalization is a crucial technique in training deep learning models aimed at improving training speed and stability.
The core idea is to standardize the inputs of each layer: for every mini-batch, the activations are normalized to zero mean and unit variance. This was originally motivated as a way to reduce internal covariate shift, and in practice it permits higher learning rates and faster convergence.
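The per-mini-batch standardization described above can be sketched in a few lines. This is a minimal illustration (the function name and shapes are chosen for this example), assuming a 2-D batch of shape (batch_size, features) and a small epsilon for numerical stability:

```python
import numpy as np

def standardize_batch(x, eps=1e-5):
    # x: mini-batch of activations, shape (batch_size, features)
    mean = x.mean(axis=0)   # per-feature mean over the batch
    var = x.var(axis=0)     # per-feature variance over the batch
    # epsilon avoids division by zero when a feature has near-zero variance
    return (x - mean) / np.sqrt(var + eps)

# A batch whose features are far from zero mean / unit variance
x = np.random.randn(32, 4) * 3.0 + 5.0
x_hat = standardize_batch(x)
```

After this step, each feature of `x_hat` has approximately zero mean and unit variance within the batch.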
The importance of Batch Normalization lies in several aspects. First, it accelerates the training of neural networks, since normalized inputs make the optimization landscape smoother. Second, it improves generalization: the noise introduced by estimating statistics from each mini-batch acts as a mild regularizer, which can reduce overfitting and, in some cases, lessen the need for other techniques such as Dropout.
The operational mechanism involves computing the mean and variance of the current mini-batch and standardizing the input with these statistics. The standardized data is then adjusted through trainable scale and shift parameters (often written gamma and beta), so the network can recover any representation the normalization might have removed. During training, running averages of the mean and variance are also maintained; at inference time, these running statistics are used in place of per-batch statistics, since test-time inputs may arrive one example at a time.
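The full mechanism, including the trainable scale/shift and the running statistics used at inference, can be sketched as follows. This is an illustrative NumPy implementation, not code from any particular library; the function name, `momentum` convention, and argument layout are assumptions for this example:

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, running_mean, running_var,
                      training=True, momentum=0.1, eps=1e-5):
    # x: mini-batch, shape (batch_size, features)
    if training:
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        # update exponential moving averages for use at inference time
        running_mean = (1 - momentum) * running_mean + momentum * mean
        running_var = (1 - momentum) * running_var + momentum * var
    else:
        # inference: rely on statistics accumulated during training
        mean, var = running_mean, running_var
    x_hat = (x - mean) / np.sqrt(var + eps)   # standardize
    out = gamma * x_hat + beta                # trainable scale and shift
    return out, running_mean, running_var

# Usage: one training step on a batch of 64 examples with 3 features
x = np.random.randn(64, 3)
gamma, beta = np.ones(3), np.zeros(3)
rm, rv = np.zeros(3), np.ones(3)
out, rm, rv = batchnorm_forward(x, gamma, beta, rm, rv, training=True)
```

With gamma initialized to ones and beta to zeros, the output of a training step is simply the standardized batch; training then adjusts gamma and beta like any other parameters.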
However, Batch Normalization is not without drawbacks. With small batch sizes, the per-batch estimates of mean and variance become noisy and unstable, degrading both training and the running statistics used at inference. It is also a poor fit for some architectures, notably recurrent neural networks, where variable sequence lengths make per-timestep batch statistics awkward to maintain.
Related normalization methods such as Layer Normalization and Group Normalization have emerged to address these cases, normalizing over different axes to better accommodate various architectures and batch sizes. Overall, Batch Normalization has become an indispensable part of modern deep learning, significantly enhancing training efficiency and model performance.
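The key difference between these variants is the axis over which statistics are computed: Batch Normalization normalizes each feature across the batch, while Layer Normalization normalizes each example across its features. A minimal sketch of that contrast, using plain NumPy on a 2-D batch:

```python
import numpy as np

x = np.random.randn(8, 16)  # (batch, features)

# Batch Norm: statistics per feature, computed across the batch (axis 0)
bn = (x - x.mean(axis=0)) / x.std(axis=0)

# Layer Norm: statistics per example, computed across features (axis 1),
# so it works even with a batch size of one
ln = (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)
```

Because Layer Normalization's statistics do not depend on the batch, it avoids the small-batch instability noted above, which is one reason it is preferred in recurrent and transformer architectures.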