Foundation Model

2021 · Active · Published

Key innovation

A single model trained at scale on broad data and subsequently adapted to many downstream tasks, replacing per-task models trained from scratch.

Category

Training

Abstraction level

Paradigm

Use cases

Large language models (LLMs) · Image generation · Multimodal understanding · Robotics (foundation models for manipulation and control) · Search and retrieval · Embeddings

How it works

1) Pretraining: the model learns general-purpose representations from a very large, diverse corpus, typically via self-supervised objectives (e.g., next-token prediction, masked language modeling, contrastive learning).

2) Adaptation: the same pretrained model is adapted to specific tasks via fine-tuning, instruction tuning, RLHF, prompting, or parameter-efficient adapters such as LoRA.

Scaling parameters, data, and compute is associated with 'emergent capabilities': abilities that are not observed in smaller models.
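To make the adaptation step concrete, here is a minimal sketch of a LoRA-style adapter in plain Python. It assumes a single frozen linear layer and adds a trainable low-rank update on top of it. The names `LoRALinear` and `matvec`, the 64x64 layer size, the rank of 4, and the `alpha` scaling value are all hypothetical choices for illustration; they are not taken from any particular library, and no training loop is shown.

```python
import random


def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Frozen weights W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # pretrained weights, never updated during adaptation
        self.scale = alpha / rank
        # A starts small and random, B starts at zero, so the update is zero at init.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_params(self):
        return len(self.W) * len(self.W[0])


# Stand-in for one pretrained layer (random values, assumed for the demo only).
rng = random.Random(1)
W = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(64)]
layer = LoRALinear(W, rank=4)

# 4*64 + 64*4 = 512 trainable values versus 64*64 = 4096 frozen ones.
print(layer.trainable_params(), layer.frozen_params())
```

Because `B` is initialized to zero, the adapted layer reproduces the pretrained layer exactly before any training; only `A` and `B` would be updated for a new task, which is why each additional task adds few parameters.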

Problem solved

Removes the need to train a separate model from scratch for each task — one large, general model adapts to many applications at low marginal cost.
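Prompting is the cheapest of these adaptation routes, since no weights change at all. The sketch below shows how two unrelated tasks can be expressed purely as different few-shot prompts intended for the same model. The function name `build_few_shot_prompt` and the `Input:` / `Output:` layout are assumptions made for this example, not a format prescribed by any specific model, and the call to an actual model is deliberately left out.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Return one prompt string: instruction, worked examples, then the new query."""
    lines = [instruction, ""]
    for source, target in examples:
        lines += [f"Input: {source}", f"Output: {target}", ""]
    # The prompt ends with an open "Output:" for the model to complete.
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


# Two unrelated tasks, expressed only as different prompts for the same model.
sentiment_prompt = build_few_shot_prompt(
    "Classify the sentiment as positive or negative.",
    [("I loved this film.", "positive"), ("The plot was dull.", "negative")],
    "A delightful surprise.",
)
translation_prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage")],
    "bread",
)
print(sentiment_prompt)
```

Switching tasks here means changing a string rather than training a new model, which is where the low marginal cost comes from.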

Evolution

Original paper · 2021 · Rishi Bommasani et al.

On the Opportunities and Risks of Foundation Models

Rishi Bommasani, Percy Liang, et al. (Stanford CRFM)

2018

BERT and GPT — pretraining + fine-tuning as the dominant NLP paradigm

Inflection point

BERT (Google) and GPT (OpenAI) established the 'pretrain-then-adapt' paradigm as the NLP standard.

2020

GPT-3 and emergent capabilities

Inflection point

GPT-3 demonstrated that scale gives rise to few-shot, in-context learning: the model handles new tasks from a handful of examples in the prompt, without task-specific fine-tuning.

2021

Stanford CRFM coins the term 'foundation model'

Inflection point

Bommasani et al. formalize the paradigm and introduce the name.

2022

Multimodal foundation models (CLIP, DALL-E, Flamingo)

Extension of the paradigm beyond text to images, video, and audio. CLIP and DALL-E were announced in 2021; Flamingo followed in 2022.

2023

Robotics foundation models (RT-2, RT-X)

Google DeepMind brings the paradigm to robotics by combining vision-language models (VLMs) with robot manipulation and control.

2024

Open-weight foundation models (Llama 3, Mistral)

Open-weight models become competitive with closed counterparts.

Sources

On the Opportunities and Risks of Foundation Models

Stanford CRFM / arXiv

Stanford Center for Research on Foundation Models (CRFM)

Official website

Stanford University