GPT-3: Language Models are Few-shot Learners
Presented by: Aman Singhal
Venue: Large Language and Vision Models Symposium, NYU Center for Data Science
Year: 2024
Abstract
An overview of GPT-3, OpenAI's 175-billion-parameter autoregressive language model, which achieves strong performance through few-shot learning without task-specific fine-tuning. This talk covers the model's architecture, training data, evaluation results, and limitations, as well as the evolution of subsequent large-scale models (PaLM, LLaMA, DALL-E, etc.) that built on GPT-3's foundational insights about scaling.
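The few-shot setup described above amounts to prompt construction: the task is specified and demonstrated entirely in the model's context window, with no gradient updates. Below is a minimal sketch in Python of assembling such a prompt; the helper function, example pairs, and placeholder model call are illustrative assumptions, not part of the talk or of any actual GPT-3 API.

```python
def build_few_shot_prompt(task_description, examples, query):
    """Assemble in-context demonstrations into a single prompt string.

    In the few-shot setting, GPT-3 sees the task examples purely as text
    in its context window; no fine-tuning or weight updates are performed.
    """
    lines = [task_description, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical translation task with two in-context demonstrations.
    prompt = build_few_shot_prompt(
        "Translate English to French.",
        [("cheese", "fromage"), ("sea otter", "loutre de mer")],
        "peppermint",
    )
    print(prompt)
    # The assembled prompt would then be sent to the language model,
    # which completes the final "Output:" line from context alone.
```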
Key Topics
- GPT-3 architecture and scaling laws
- Few-shot learning without fine-tuning
- Training data and methodology
- Benchmark evaluation results
- Model limitations and challenges
- Evolution of subsequent large-scale models (PaLM, LLaMA, DALL-E)
- Impact on the field of natural language processing