GPT-3: Language Models are Few-Shot Learners

Presented by: Aman Singhal

Venue: Large Language and Vision Models Symposium, NYU Center for Data Science

Year: 2024

Abstract

An overview of GPT-3, OpenAI's 175-billion-parameter autoregressive language model, which achieves strong performance through few-shot learning without task-specific fine-tuning. The talk covers the model's architecture, training data, evaluation results, and limitations, as well as the evolution of subsequent large-scale models (PaLM, LLaMA, DALL-E, etc.) that built on GPT-3's foundational insights about scaling.
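To make the few-shot setting concrete, the sketch below shows how a task can be specified entirely in the prompt: a natural-language instruction followed by a handful of demonstrations, with no gradient updates to the model. This is an illustrative construction (the helper name and formatting are assumptions, not from the paper), using the English-to-French translation demonstrations shown in the GPT-3 paper.

```python
def build_few_shot_prompt(task_description, demonstrations, query):
    """Concatenate a task description, k labeled examples, and a new query
    into a single prompt string; the model completes the final 'Output:'."""
    lines = [task_description, ""]
    for source, target in demonstrations:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")  # the model is asked to continue from here
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("sea otter", "loutre de mer"), ("cheese", "fromage")],
    "peppermint",
)
print(prompt)  # this string would be passed to the language model as context
```

The same pattern covers zero-shot (no demonstrations) and one-shot (a single demonstration) evaluation simply by varying the length of the demonstration list.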

Presentation Slides

View slides in new window

Key Topics

  • GPT-3 architecture and scaling laws
  • Few-shot learning without fine-tuning
  • Training data and methodology
  • Benchmark evaluation results
  • Model limitations and challenges
  • Evolution of subsequent large-scale models (PaLM, LLaMA, DALL-E)
  • Impact on the field of natural language processing