Transformer from Scratch · Optimizations and Modern Variants
LoRA, MoE and Future Directions
Introduction
To close the course, this lesson surveys directions beyond a minimal Transformer: low-rank fine-tuning (LoRA), mixture-of-experts (MoE) models, quantization, longer context, and practical deployment tradeoffs.