Week 5 Outline

  • Discuss experience summaries
  • Review quiz answers
  • Intro to attention (sections 3.1 to 3.4 in Raschka)
  • Project idea: reusing microGPT tokens
  • Attention example
  • Attention variants (remainder of chapter 3)
    • Causal (masked) attention
    • Random masking of attention weights (dropout)
    • Multi-head attention
  • Project ideas: ablation studies
  • Revisit course learning goals
  • Experience report ideas
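
For the attention example and the causal attention item, a minimal sketch in plain Python may help as a starting point for discussion. It is an illustration only, not Raschka's code: the function names `softmax` and `causal_attention` are hypothetical, and it assumes the queries, keys, and values have already been computed and are passed in as lists of vectors (no trainable weight matrices, no dropout, a single head).

```python
import math

def softmax(row):
    # Subtract the row maximum for numerical stability; exp(-inf) evaluates to 0.0.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(queries, keys, values):
    """Scaled dot-product attention in which token i attends only to tokens 0..i.

    Hypothetical helper for illustration; inputs are lists of equal-length vectors.
    """
    d_k = len(keys[0])
    weights = []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if j > i:
                # Future position: mask it out before the softmax.
                scores.append(float("-inf"))
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d_k))
        weights.append(softmax(scores))
    # Each context vector is a weighted sum of the value vectors.
    dim = len(values[0])
    context = [
        [sum(w * v[c] for w, v in zip(row, values)) for c in range(dim)]
        for row in weights
    ]
    return weights, context
```

Calling `causal_attention(x, x, x)` on a small list of embedding vectors `x` returns an attention-weight matrix whose entries above the diagonal are zero and whose rows each sum to 1, along with one context vector per token.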