Inference

Prompt Engineering

2020ActiveUpdated: 6 May 2026Published

Key innovation

Demonstrated that language model behavior can be significantly controlled through careful formulation of input text instructions, without any modification of model weights.

How it works

An engineered prompt is a constructed input text containing: (1) system/role instructions, (2) optional few-shot examples (question-answer), (3) the current user query. Techniques like CoT ask the model to show reasoning ('think step by step'), improving answer quality on complex tasks. Prompt format directly influences which training patterns the model activates.

Problem solved

Language models are sensitive to input formulation: the same model can return significantly different results depending on how a question is phrased, the order of examples, or whether role instructions are added.

Implementation

Implementation pitfalls

Prompt sensitivity — small word changes yield different resultsMedium

LLMs are sensitive to word order, punctuation and formatting. A prompt working on GPT-4 may produce poor results on Claude or Gemini without modification.

Overfitting to a specific model and versionMedium

A prompt optimized for model X stops working after a model update. Lack of prompt versioning leads to production regressions.

Injection attacks in agentic systemsMedium

Untrusted input (e.g. webpage content) can override system instructions and hijack the agent — so-called prompt injection.

Evolution

2020

GPT-3 and few-shot prompting

Inflection point

Brown et al. demonstrate that GPT-3 can perform tasks via in-context examples, launching prompt engineering as a field.

2022

Chain-of-Thought prompting

Inflection point

Wei et al. show that asking models to reason step-by-step dramatically improves performance on arithmetic and reasoning tasks.

2023

System prompts in chat models

Chat APIs (OpenAI, Anthropic) standardize system prompt separation, enabling persona and instruction injection.

2023

Automated prompt optimization

Works like APE and DSPy propose automatic prompt optimization, reducing reliance on manual engineering.

Sources

Language Models are Few-Shot Learners (GPT-3)

Paper

Prompt Engineering Guide

Documentation