Sakana Fugu is the flagship international commercial product from Japanese AI lab Sakana AI, delivering a full multi-agent orchestration system as a single foundation model with an OpenAI-compatible API endpoint. Beta launched on April 24, 2026; general availability (GA) with the Fugu Ultra model — June 22, 2026.
How it works
Fugu is itself a language model trained to decide when and to whom to delegate tasks from a pool of external agents (including Gemini 3.1 Pro, GPT 5.5, Opus 4.8). Decisions span: model selection, inter-agent communication patterns, verification, and response synthesis. From the outside, the user calls one API endpoint — inside, a coordinated team of experts works. The architecture builds on two ICLR 2026 papers: Trinity (Xu et al.) — an evolved LLM coordinator, and Conductor (Nielsen et al.) — learning to orchestrate agents in natural language.
Two variants
Fugu — balances strong performance with low latency, the default choice for everyday work. Integrates naturally with tools like Codex, chatbots, and interactive services. Allows opting specific models out of the pool (for teams with privacy and compliance requirements).
Fugu Ultra — tuned for maximum answer quality on hard, multi-step problems. Coordinates a deeper expert pool. Early users rely on it for AI research, paper reproduction, cybersecurity analysis, and literature and patent investigations.
Benchmark results
Fugu Ultra stands shoulder-to-shoulder with frontier leaders (Anthropic Fable 5, Mythos Preview) on the industry's most rigorous engineering, scientific, and reasoning benchmarks: GPQAD 95.1 (exceeds Gemini 3.1 high 94.4 and Opus 4.6 max 92.7), LCBv6 93.2, SWEPro 54.2 (vs Opus 4.6 max 53.4). In applications spanning AutoResearch, mechanical design, financial time series prediction, Rubik's Cube, Japanese handwriting analysis, and one-shot chess, Fugu consistently outperforms frontier models.
Recursive self-orchestration
A distinguishing feature is the ability to recursively call itself as an agent in the pool. The model reads its own prior responses as context and decides whether to revise its coordination strategy. This introduces a new test-time scaling axis — recursion depth can be tuned at inference time without retraining. A small model, by reading itself, reaches answers unattainable in a single pass.
Geopolitical context: AI sovereignty
Sakana AI positions Fugu as a practical hedge against single-vendor dependency risk. After export controls were imposed on Anthropic's Fable 5 and Mythos models, access can be revoked overnight. Fugu's agent pool is fully swappable — if one provider restricts access, Fugu dynamically routes around the disruption. This makes it critical infrastructure for finance, infrastructure, and government administration in the era of AI sovereignty.