Quick Recap — Day 1
Everything we built yesterday in 60 seconds
Evolution
Symbolic AI → ML → DL → Transformers → LLMs
Foundations
Gradient descent, backpropagation, loss landscapes
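The gradient-descent step can be sketched in a few lines; the function, learning rate, and step count below are illustrative choices, not anything specific from Day 1:

```python
# Minimal gradient descent on f(x) = x^2, whose gradient is 2x.
# lr and steps are arbitrary illustrative hyperparameters.
def gradient_descent(grad, x0, lr=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # step opposite the gradient
    return x

x_min = gradient_descent(lambda x: 2 * x, x0=5.0)  # converges toward 0
```

Backpropagation is the algorithm that computes `grad` efficiently for deep networks; the update rule itself is exactly this loop.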
Tokenization
BPE subword pieces; roughly 4 characters ≈ 1 token for English text
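The 4-chars-per-token rule of thumb gives a quick cost estimate; it's a heuristic only, and a real BPE tokenizer gives exact counts:

```python
# Rough token-count estimate from the ~4 chars/token rule of thumb.
# Real tokenizers vary by language and vocabulary; this is a heuristic.
def estimate_tokens(text: str) -> int:
    return max(1, round(len(text) / 4))

n = estimate_tokens("Attention is all you need.")  # 26 chars -> ~6 tokens
```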
Embeddings
Words as vectors: King − Man + Woman ≈ Queen
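The analogy can be checked with vector arithmetic. The 2-d "embeddings" below are invented purely for illustration (real models learn hundreds of dimensions):

```python
import numpy as np

# Toy embeddings, invented for this sketch. Axes: [royalty, gender].
vecs = {
    "king":  np.array([0.9,  0.7]),
    "man":   np.array([0.1,  0.7]),
    "woman": np.array([0.1, -0.7]),
    "queen": np.array([0.9, -0.7]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

result = vecs["king"] - vecs["man"] + vecs["woman"]
nearest = max(vecs, key=lambda w: cosine(result, vecs[w]))  # -> "queen"
```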
Attention
softmax(Q·Kᵀ)→weights; weighted sum of V→output. Every token attends to every other.
Training
Pre-training → SFT → RLHF. Next-token prediction at scale.
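All three stages optimize variants of one objective: cross-entropy on the next token. A sketch with an invented 3-word vocabulary and made-up logits:

```python
import numpy as np

# Next-token prediction loss: cross-entropy between the model's
# distribution over the vocabulary and the actual next token.
def next_token_loss(logits, target_id):
    probs = np.exp(logits - logits.max())  # stable softmax
    probs /= probs.sum()
    return -np.log(probs[target_id])

logits = np.array([2.0, 0.5, 0.1])      # illustrative scores, 3-word vocab
loss = next_token_loss(logits, target_id=0)  # top-scored token is correct
```

Pre-training minimizes this loss over web-scale text; SFT and RLHF then steer the same model toward instruction-following and human preferences.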