Phase 07

Transformers Deep Dive

16 hands-on lessons that build AI from first principles in the browser. Reading is free; graded exercises and the certificate are included with lifetime access.

  1. Why Transformers — The Problems with RNNs
  2. Self-Attention from Scratch (graded)
  3. Multi-Head Attention (graded)
  4. Positional Encoding — Sinusoidal, RoPE, ALiBi (graded)
  5. The Full Transformer — Encoder + Decoder (graded)
  6. BERT — Masked Language Modeling (graded)
  7. GPT — Causal Language Modeling (graded)
  8. T5, BART — Encoder-Decoder Models (graded)
  9. Vision Transformers (ViT) (graded)
  10. Audio Transformers — Whisper Architecture (graded)
  11. Mixture of Experts (MoE) (graded)
  12. KV Cache, Flash Attention & Inference Optimization (graded)
  13. Scaling Laws (graded)
  14. Build a Transformer from Scratch — The Capstone
  15. Attention Variants — Sliding Window, Sparse, Differential (graded)
  16. Speculative Decoding — Draft, Verify, Repeat (graded)
Curriculum based on AI Engineering from Scratch by Rohit Ghumare (MIT, used under attribution).