Preface
DL Note
Chap 0: Before Neural Networks
0.1 Hopfield Network
0.2 Boltzmann Machine
0.3 Hebbian Learning
0.4 Perceptron
Chap 1: PyTorch
1.1 Tensor
1.2 OOP
1.3 Computation Performance
1.4 Useful Tensor Operations
1.5 Autograd and Training Loop
1.6 Data Pipeline and Reproducibility
Chap 2: Neural Networks
2.1 Theory of Neural Networks
2.2 Basic Neural Network Block
2.3 CNN
2.4 RNN
2.5 GNN
2.6 GAN
2.7 Diffusion
2.8 Discrete Modeling & Training Paradigms
2.9 Loss Functions and Objectives
Chap 3: Optimization
3.1 Convex Optimization
3.2 Optimizers
3.3 Learning Rate Scheduling
3.4 Regularization
3.5 Gradient Clipping
3.6 Parameter Initialization
3.7 Advanced Convex Methods
3.8 Optimizer Engineering
Chap 4: Transformers and LLMs
4.1 Tokens, Vocabulary, and Embeddings
4.1B Tokenizers and BPE
4.2 Attention Mechanism
4.2B Attention Implementation
4.3 Masks and Positional Encoding
4.4 Model Structure
4.5 Decoder-Only and GPT-2
4.6 Next-Token Prediction Math
4.7 Qwen3 Technical Report
4.8 Post-Training and Preference Optimization
4.9 PEFT, LoRA, and QLoRA
4.10 LLM Training Systems
4.11 LLM Inference and KV Cache
Transformer-XL
LLaDA / Masked Diffusion LM
SSM & Mamba
LLM Inference Infrastructure
Epilogue
About
Deep Learning Notes
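These are the notes I took while studying deep learning. The main sources are Mu Li's Dive into Deep Learning (动手学深度学习), Andrew Ng's Deep Learning Specialization, and other materials I consulted along the way (including GPT).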