Week 5 Outline
- Discuss experience summaries
- Review quiz answers
- Intro to attention (sections 3.1 to 3.4 in Raschka)
  - Seminal article on attention
  - Simple attention without trainable weights (section 3.3)
  - Attention with learnable weights (section 3.4)
  - Demo code
  - Project idea: reusing microGPT tokens
- Attention example
- Attention variants (remainder of chapter 3)
  - Causal (masked) attention
  - Random masking of attention weights (dropout)
  - Multi-head attention
  - Project ideas: ablation studies
- Revisit course learning goals
- Experience report ideas
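
As a companion to the attention items above, here is a minimal sketch of simple attention without trainable weights, with an optional causal mask. It is not the course demo code or Raschka's implementation: the function names `softmax` and `simple_attention` and the toy `tokens` values are assumptions made for illustration, and it uses plain Python lists rather than a tensor library.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def simple_attention(inputs, causal=False):
    """Return (context_vectors, attention_weights) for a list of token vectors."""
    dim = len(inputs[0])
    contexts, weights = [], []
    for i, query in enumerate(inputs):
        # Attention scores: dot product of the query token with every token.
        scores = [sum(q * k for q, k in zip(query, key)) for key in inputs]
        if causal:
            # Hide future positions so token i only attends to tokens 0..i.
            scores = [s if j <= i else -math.inf for j, s in enumerate(scores)]
        row = softmax(scores)
        weights.append(row)
        # Context vector: attention-weighted sum of all token vectors.
        contexts.append(
            [sum(w * vec[d] for w, vec in zip(row, inputs)) for d in range(dim)]
        )
    return contexts, weights

# Hypothetical 3-token, 2-dimensional embeddings, chosen only for illustration.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ctx, w = simple_attention(tokens)
ctx_causal, w_causal = simple_attention(tokens, causal=True)
print(w_causal[0])
```

With the causal mask on, the first token can only attend to itself, so its context vector equals its own embedding. Comparing `w` and `w_causal` row by row is one small way to start an ablation-style comparison.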