LLM Components

  • Document selection
  • Tokenizing
  • Training input
  • Training elements
    • Architecture: transformer with attention
    • Forward prediction
    • Loss reporting
    • Weight updates (backpropagation)
    • Training loop
  • Generative completion (base model)
  • Fine-tuning for applications (e.g. instruction following, chat)
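The training elements above can be sketched end to end with a toy model. This is a minimal sketch, not a transformer: it uses a character-level bigram model (a single weight matrix) so that tokenizing, forward prediction, loss, backpropagation, the training loop, and generative completion are each visible in a few lines. The corpus, learning rate, and step count are illustrative choices, not values from the outline.

```python
import numpy as np

# Document selection + tokenizing: a toy corpus, character-level vocabulary.
text = "hello world"
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
itos = {i: ch for ch, i in stoi.items()}
tokens = [stoi[ch] for ch in text]

V = len(vocab)
rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, size=(V, V))  # weights: logits for next token given current

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Training input: (current token, next token) pairs.
xs = np.array(tokens[:-1])
ys = np.array(tokens[1:])

lr = 1.0
for step in range(200):                                    # training loop
    logits = W[xs]                                         # forward prediction
    probs = softmax(logits)
    loss = -np.log(probs[np.arange(len(xs)), ys]).mean()   # loss reporting
    # Weight updates (backpropagation): for softmax + cross-entropy,
    # the gradient of the logits is (probs - one_hot(targets)).
    dlogits = probs.copy()
    dlogits[np.arange(len(xs)), ys] -= 1
    dlogits /= len(xs)
    dW = np.zeros_like(W)
    np.add.at(dW, xs, dlogits)                             # accumulate per input token
    W -= lr * dW                                           # gradient descent step

# Generative completion (base model): greedily extend from a prompt.
tok = stoi["h"]
out = ["h"]
for _ in range(5):
    tok = int(np.argmax(W[tok]))
    out.append(itos[tok])
print("".join(out))
```

A real LLM replaces the single weight matrix with a transformer (attention over a context window) and the greedy argmax with sampling, but the loop structure is the same; compare with the corresponding stages in Raschka and microGPT.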

Compare to Figures 1.8 and 1.9 in Raschka

Compare to microGPT