Architecture

LLM

2020 | Active | Published: 3 May 2026 | Updated: 3 May 2026

Key innovation

Scaling autoregressive language modeling to billions of parameters enabled emergent reasoning, instruction following, and general-purpose text generation capabilities.

How it works

A Transformer model is trained on tokens from a text corpus, learning to predict the next token given all preceding tokens (autoregression). At sufficient scale (parameters, data, compute), emergent capabilities arise: reasoning, in-context learning, and instruction following.
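The autoregressive loop itself can be illustrated without a neural network. The sketch below is a toy stand-in, not a Transformer: it assumes a bigram count table as the "model" and greedy decoding, and the names `train_bigram` and `generate` are hypothetical. What it shares with an LLM is the control flow: predict one token from the context, append it, and repeat.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, prompt, max_new_tokens):
    """Greedy autoregressive decoding: append the most frequent follower, repeat."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last token was never followed by anything in training
        next_token = followers.most_common(1)[0][0]
        out.append(next_token)
    return out


# Toy corpus; whitespace-separated words stand in for real subword tokens.
corpus = "the cat sat on the mat and the cat sat on the rug".split()
model = train_bigram(corpus)
generated = generate(model, ["the"], 4)
print(" ".join(generated))
```

A real LLM replaces the count table with a Transformer that conditions on the whole preceding context rather than only the last token, and usually samples from the predicted distribution instead of always taking the most likely token.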

Problem solved

Previous NLP models were narrowly specialized (separate models for translation, classification, QA). LLMs unify multiple language tasks within a single generic model.

Implementation

Implementation pitfalls

Hallucinations: model confidently generates false facts (Medium)

LLMs generate fluent text even when they have no knowledge of a given fact: instead of saying "I do not know", the model fabricates plausible-sounding details. This is critical in medical, legal, and financial applications.

Context window limit: information loss with long documents (Medium)

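LLMs have a finite context window (roughly 4k to 1M tokens, depending on the model). Input beyond the window has to be truncated or dropped, so the model loses access to the earlier information. Long documents require chunking combined with retrieval-augmented generation (RAG), or summarization.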
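The chunking step can be sketched as a sliding window with overlap, so that text cut at a chunk boundary still appears intact in a neighboring chunk. This is a minimal sketch under stated assumptions: the function name `chunk_tokens` and the `window` and `overlap` values are illustrative, and the placeholder strings stand in for the output of a real tokenizer.

```python
def chunk_tokens(tokens, window, overlap):
    """Split tokens into chunks of at most `window` tokens.

    Neighboring chunks share `overlap` tokens.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    step = window - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # this chunk already reaches the end of the document
    return chunks


# Ten placeholder tokens; a real pipeline would use the model's tokenizer.
document = [f"t{i}" for i in range(10)]
chunks = chunk_tokens(document, window=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

In a RAG setup each chunk would then be indexed, and only the chunks relevant to a query would be placed in the prompt. Larger overlap reduces the chance of splitting a relevant passage but increases the number of chunks to store and search.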

Prompt injection in agentic systems (Medium)

Malicious data in the agent environment (for example, webpage content or an email) can override system instructions and hijack the agent. This is especially dangerous for agents with tool access.
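One common way to limit the damage, shown here only as a hedged sketch, is to gate tool calls outside the model: once untrusted content has entered the context, sensitive actions require human confirmation. Everything in this example is an assumption for illustration, including the tool names, the `ToolCall` type, the `context_tainted` flag, and the three-way decision. It reduces the impact of an injection; it does not prevent the injection itself.

```python
from dataclasses import dataclass

# Hypothetical tool names, chosen for illustration only.
READ_ONLY_TOOLS = {"search_web", "read_page"}
SENSITIVE_TOOLS = {"send_email", "delete_file"}


@dataclass(frozen=True)
class ToolCall:
    name: str
    # True if untrusted content (webpage, email) entered the context
    # before the model proposed this call.
    context_tainted: bool


def authorize(call: ToolCall) -> str:
    """Return 'allow', 'confirm' (ask a human), or 'deny' for a proposed call."""
    if call.name in READ_ONLY_TOOLS:
        return "allow"
    if call.name in SENSITIVE_TOOLS:
        return "confirm" if call.context_tainted else "allow"
    return "deny"  # tools outside both lists are rejected by default


decisions = [
    authorize(ToolCall("read_page", context_tainted=True)),
    authorize(ToolCall("send_email", context_tainted=True)),
    authorize(ToolCall("send_email", context_tainted=False)),
    authorize(ToolCall("run_shell", context_tainted=False)),
]
print(decisions)
```

The design choice is that the check runs in ordinary code rather than in the prompt, so injected text cannot talk its way past it.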