LongCat-2.0

2.0 · Family: LongCat
Meituan's frontier LLM: 1.6T MoE parameters (~48B active per token), native 1M-token context via LongCat Sparse Attention; trained entirely on Chinese ASIC superpods with 35T+ tokens.
Limited access · Open source · Featured · LLM · LongCat
Context window
1M tokens (native, via LongCat Sparse Attention)
Parameters
1.6T total; ~48B active per token
Release date
30 December 2025
Access: Download · Deployment: Cloud, Local

Overview

LongCat-2.0 is a large-scale MoE (Mixture of Experts) language model with 1.6 trillion total parameters and approximately 48 billion activated per token, released by Meituan's LongCat team in December 2025 as the next step after the LongCat-Flash line. The model introduces several architectural improvements over the previous generation.
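
As a quick check on the figures above, the share of parameters active for any one token can be computed directly. This is a back-of-the-envelope sketch: the variable and function names are illustrative, and the 48B figure is itself approximate.

```python
# Parameter counts quoted in the overview (the active count is approximate).
TOTAL_PARAMS = 1.6e12   # 1.6 trillion total parameters
ACTIVE_PARAMS = 48e9    # ~48 billion parameters activated per token


def active_fraction(active: float, total: float) -> float:
    """Return the share of parameters used for a single token."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active per token: {fraction:.1%} of all parameters")
```

Roughly 3% of the weights take part in each token's forward pass. Per-token compute is therefore closer to that of a 48B dense model, although all 1.6T parameters still have to be stored and served.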

Training on Chinese AI ASIC superpods

Both the full training run and the large-scale deployment of LongCat-2.0 are built entirely on China-made AI ASIC superpods (no NVIDIA). Pre-training spans millions of accelerator-hours across more than 35 trillion tokens, with no rollbacks or irrecoverable loss spikes — a demonstration that frontier-scale training is feasible on alternative hardware platforms.

Native 1M context via LongCat Sparse Attention

To strengthen the model on long-horizon tasks, the team introduced LongCat Sparse Attention and trained LongCat-2.0 on hundreds of billions of tokens of 1M-context data. Together with dedicated post-training, this gives the model strong performance on coding and agentic tasks.
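
The internals of LongCat Sparse Attention are not described in this entry. The sketch below only illustrates the general motivation for sparse attention at this context length. It counts query-key pairs under dense causal attention and under a hypothetical scheme in which each query attends to at most `budget` keys. The `budget` value of 2,048 and both function names are assumptions chosen for illustration, not LongCat parameters.

```python
def dense_causal_pairs(n: int) -> int:
    """Query-key pairs when every token attends to itself and all earlier tokens."""
    return n * (n + 1) // 2


def budgeted_pairs(n: int, budget: int) -> int:
    """Query-key pairs when each token attends to at most `budget` keys.

    Hypothetical scheme for illustration only, not LongCat Sparse Attention.
    """
    if n <= budget:
        return dense_causal_pairs(n)
    # The first `budget` tokens see fewer than `budget` keys;
    # every later token sees exactly `budget` keys.
    return dense_causal_pairs(budget) + (n - budget) * budget


CONTEXT = 1_000_000  # 1M-token context from the model card
BUDGET = 2_048       # assumed per-query key budget (hypothetical)

dense = dense_causal_pairs(CONTEXT)
sparse = budgeted_pairs(CONTEXT, BUDGET)
print(f"dense:  {dense:,} pairs")
print(f"sparse: {sparse:,} pairs ({dense / sparse:.0f}x fewer)")
```

Dense cost grows quadratically with context length, while the budgeted count grows linearly. That difference is the usual reason long-context models adopt some form of sparse attention.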

Status as of December 2025: the model has been announced with a technical blog post; the weights are listed as 'coming soon' and have not yet been released. License: MIT.

Classification
LLM
Family: LongCat
Applications
Access & deployment
Download
Cloud, Local
Weights: Open source
Key parameters
Context: 1M tokens (native, via LongCat Sparse Attention)
Parameters: 1.6T total; ~48B active per token
Tools
Input: text

Technical specification

Context window
1M tokens (native, via LongCat Sparse Attention)
Parameters
1.6T total; ~48B active per token
Max output tokens
Not specified
License
MIT
Hardware requirements
Trained on China-made AI ASIC superpods (no NVIDIA), millions of accelerator-hours; inference requirements for the 1.6T MoE / 48B active model have not been publicly specified yet (weights 'coming soon' as of December 2025).
Features: Tool use
Modalities
Input: text
Output: text, code

Capabilities and applications

Native model capabilities
Coding
Generating, analysing and modifying source code.
Category: coding
Reasoning
The model's ability to reason logically and solve complex problems.
Category: reasoning
Agentic capability
The model's ability to autonomously plan and execute multi-step tasks by sequentially using tools, maintaining context, and adapting to intermediate results.
Category: planning
Function Calling
Category: planning
Long context
Maintaining coherence and focus across very long input context.
Category: language
Multi-step reasoning
Carrying out multi-step chains of reasoning across long, complex tasks.
Category: reasoning
Mathematical reasoning
The model's ability to solve mathematical tasks requiring multi-step reasoning — equations, proofs, combinatorics, geometry, calculus and competition-level problems.
Category: reasoning
Language modeling
Ability to predict subsequent tokens and generate coherent natural-language text based on the preceding context.
Category: language
Multilingual
Understanding and generating text in many languages.
Category: language
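
The agentic and function-calling capabilities listed above generally depend on the host application executing the tool calls that a model emits. The sketch below shows one common shape for that dispatch step. The JSON call format, the `TOOLS` registry, and the `add` tool are all hypothetical; the tool-call format actually used by LongCat-2.0 is not specified in this entry, and no model is called here.

```python
import json
from typing import Any, Callable, Dict


def add(a: float, b: float) -> float:
    """Example tool: add two numbers."""
    return a + b


# Hypothetical registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., Any]] = {"add": add}


def dispatch_tool_call(raw: str) -> str:
    """Parse a JSON tool call, run the named tool, and return a JSON result."""
    try:
        call = json.loads(raw)
        func = TOOLS[call["name"]]
        result = func(**call.get("arguments", {}))
        return json.dumps({"ok": True, "result": result})
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Return errors as data so the caller can feed them back to the model.
        return json.dumps({"ok": False, "error": f"{type(exc).__name__}: {exc}"})


# Stand-in for text a model might emit (assumed format).
model_output = '{"name": "add", "arguments": {"a": 2, "b": 3}}'
print(dispatch_tool_call(model_output))
```

Returning failures as structured data rather than raising lets an agent loop pass the error back to the model, which matches the "adapting to intermediate results" behaviour described under agentic capability.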
Application domains

Technical architecture

Core Architecture