Week 6 Outline

  • Discuss experience reports
  • XOR revisited (code, updated 6 May), training variations:
    • Activation functions
    • Number of hidden nodes
    • Loss functions
    • Extra inputs with random settings
    • Learning rate
  • Advanced variations
    • Storing weights
    • Adding more layers (easy)
    • Adding an attention head
    • Training on documents
  • Review for quiz
  • Next steps
  • Thursday: review XOR learning; incompatibility of BCE_loss and activation functions
  • Demo sentence generator for language learning project
  • Comments on downloading GPT-2 weights (Raschka 5.5)
  • Project proposal and experience reports
  • Quiz 3