MiniMax M2.7

MiniMax M2.7 is a MoE language model with 230B parameters (10B active during inference), released on March 18, 2026.

Active | Public access | Open weights | LLM | Reasoning model | Tool-using model

Context window: 200K tokens

Parameters: 230B (10B active)

Max output: 131,072 tokens

Release date: 18 March 2026

Producer: MiniMax

Access: API, Download

Deployment: Cloud, Local

Overview

MiniMax M2.7 is a large language model (LLM) developed by the Chinese company MiniMax Group Inc. (稀宇科技), released on March 18, 2026. It is a Mixture of Experts (MoE) model with 230 billion total parameters, of which only 10 billion are activated during a single inference pass. M2.7 is the direct successor to MiniMax M2.5 (February 2026) and represents the first iteration in the M2 series in which the model actively participated in its own evolution and refinement process.

Architecture and parameters

The model is based on a sparse Mixture of Experts architecture with 256 experts and a top-k routing mechanism. It employs multi-head causal self-attention augmented with Rotary Position Embeddings (RoPE) and Query-Key Root Mean Square Normalization (QK RMSNorm). The context window is 200,000 tokens, and the maximum output length is 131,072 tokens. Because only about 10B of the 230B parameters (roughly 4%) are activated per forward pass, latency and inference cost are lower than for a dense model of the same total size, while capabilities are described as comparable to dense models.
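To make the top-k routing idea concrete, here is a minimal sketch in plain Python. It is not MiniMax's implementation: the value of k, the choice to apply softmax over the selected logits only, and the names top_k_route and moe_forward are all assumptions made for illustration, and the toy uses 4 scalar "experts" instead of 256 feed-forward networks.

```python
import math


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and return (index, weight) pairs.

    Weights are a softmax over the selected logits only, so they sum to 1.
    This gating rule is an assumption; the source does not specify it.
    """
    if not 0 < k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest logits, highest first (ties keep index order).
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected experts.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the outputs of the k selected experts for one token."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Toy setup: each hypothetical expert just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 2.0]
routes = top_k_route(logits, k=2)
output = moe_forward(1.0, experts, logits, k=2)
```

Only the selected experts are evaluated for a given token, which is why the active parameter count can stay far below the total.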

Self-directed model evolution

A key distinguishing feature of M2.7 is that an internal version of the model actively participated in its own training process. In one experiment, the model autonomously optimized a programming scaffold over more than 100 rounds: it analyzed failure trajectories, modified code, ran evaluations, and decided whether to accept or revert each change, achieving a 30% performance improvement without human intervention. The model constructs complex agentic harnesses, updates its own memory, creates dozens of compound skills, and refines its own learning process based on experimental results.
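The accept-or-revert cycle described above can be sketched as a simple greedy loop. This is an illustrative toy, not the procedure MiniMax used: the "scaffold" here is a hypothetical dictionary of two numeric settings, the evaluation is a made-up scoring function, and the names optimize_scaffold, propose and evaluate are assumptions.

```python
import random


def optimize_scaffold(initial, propose, evaluate, rounds=100):
    """Greedy accept-or-revert loop: keep a change only if the score improves."""
    best, best_score = initial, evaluate(initial)
    history = []
    for step in range(rounds):
        candidate = propose(best)
        score = evaluate(candidate)
        accepted = score > best_score
        if accepted:
            best, best_score = candidate, score
        # On rejection `best` is left untouched, which acts as the revert.
        history.append((step, score, accepted))
    return best, best_score, history


# Hypothetical stand-ins for "modify the code" and "run the evaluation".
rng = random.Random(0)


def propose(config):
    """Return a copy of the config with one randomly chosen setting nudged."""
    changed = dict(config)
    key = rng.choice(sorted(changed))
    step = 1 if key == "retries" else 0.1
    changed[key] = round(changed[key] + rng.choice([-1, 1]) * step, 2)
    return changed


def evaluate(config):
    """Toy score that peaks at retries=3, temperature=0.2."""
    return -abs(config["retries"] - 3) - abs(config["temperature"] - 0.2)


start = {"retries": 0, "temperature": 1.0}
best, best_score, history = optimize_scaffold(start, propose, evaluate)
```

A loop like this can never end below its starting score, because every rejected change is discarded before the next round begins.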

Software engineering and agentic workflows

M2.7 is designed primarily for advanced programming tasks and long agentic chains. It supports log analysis, bug detection, refactoring, code security, machine learning tasks, and comprehensive end-to-end project delivery. The model natively supports Agent Teams (multi-agent collaboration) and dynamic tool retrieval. On the SWE-Pro benchmark it scored 56.22%, which MiniMax reports as matching GPT-5.3-Codex. It achieved 57.0% on Terminal Bench 2, 55.6% on VIBE-Pro, 76.5 on SWE Multilingual, and a medal rate of 66.6% on MLE Bench Lite (second among open-weight models).

Professional office work

The model demonstrates strong capabilities in editing office documents (Excel, Word, and PowerPoint), with support for multi-round, high-fidelity modifications. On the GDPval-AA benchmark it achieved an ELO score of 1495, the highest among open-weight models. It reached 46.3% accuracy on Toolathon and 62.7% on MM Claw, which MiniMax reports as close to Sonnet 4.6. The model maintains a 97% skill compliance rate across a set of 40 complex skills, each exceeding 2,000 tokens.

Availability and license

Model weights are publicly available on Hugging Face (MiniMaxAI/MiniMax-M2.7) and in the GitHub repository (MiniMax-AI/MiniMax-M2.7). The model can be run locally using SGLang, vLLM, or Transformers, as well as through the NVIDIA NIM Endpoint and Ollama. The license, described by MiniMax as "Modified-MIT," is in practice a non-commercial license: non-commercial use is permitted, while any commercial use requires prior written consent from MiniMax. M2.7 is the first model in the M2 series to depart from the fully permissive MIT license used in earlier versions (M2, M2.1, M2.5).

API pricing

Under the Pay-as-You-Go model, inference via the MiniMax API costs $0.30 per million input tokens and $1.20 per million output tokens. A MiniMax-M2.7-highspeed variant is also available at $0.60 per million input tokens and $2.40 per million output tokens, offering higher throughput at the same quality. Prompt caching costs $0.06 per million tokens for reads and $0.375 per million tokens for writes.
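As a worked example of these rates, the following sketch estimates the cost of a single request. The prices come from the paragraph above; everything else is an assumption made for illustration, including the function name estimate_cost, the use of the variant names as dictionary keys, the idea that cache rates are the same for both variants, and the idea that cached tokens are billed separately rather than also counted as input tokens.

```python
from decimal import Decimal

# USD per million tokens as listed above: (input, output).
PRICES = {
    "MiniMax-M2.7": (Decimal("0.30"), Decimal("1.20")),
    "MiniMax-M2.7-highspeed": (Decimal("0.60"), Decimal("2.40")),
}
# Assumption: the same cache rates apply to both variants.
CACHE_READ = Decimal("0.06")    # per million cached tokens read
CACHE_WRITE = Decimal("0.375")  # per million tokens written to the cache
MILLION = Decimal(1_000_000)


def estimate_cost(model, input_tokens, output_tokens,
                  cache_read_tokens=0, cache_write_tokens=0):
    """Return the estimated USD cost of one request as a Decimal."""
    input_rate, output_rate = PRICES[model]
    total = (
        Decimal(input_tokens) * input_rate
        + Decimal(output_tokens) * output_rate
        + Decimal(cache_read_tokens) * CACHE_READ
        + Decimal(cache_write_tokens) * CACHE_WRITE
    )
    return total / MILLION


# Example: a 120,000-token prompt with an 8,000-token reply on each variant.
standard = estimate_cost("MiniMax-M2.7", 120_000, 8_000)
highspeed = estimate_cost("MiniMax-M2.7-highspeed", 120_000, 8_000)
```

Decimal is used instead of float so that sums of small per-token charges do not pick up binary rounding noise. With these list prices the highspeed variant costs exactly twice the standard one for the same token counts.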

Classification

LLM, Reasoning model, Tool-using model

Access & deployment

Access: API, Download

Deployment: Cloud, Local

Weights: Open weights

Technical specification

Context window: 200K tokens

Parameters: 230B (10B active)

Max output tokens: 131,072 tokens per response

License: MiniMax Non-Commercial License (Modified-MIT, non-commercial; commercial use requires prior written authorization)

Features: Tool use

Modalities

Input: text

Output: text, code

Capabilities and applications

Native model capabilities

Reasoning

The model's ability to reason logically and solve complex problems.

Category: reasoning

Multi-step reasoning

Carrying out multi-step chains of reasoning across long, complex tasks.

Category: reasoning

Long context

Maintaining coherence and focus across very long input context.

Category: language

Coding

Generating, analysing and modifying source code.

Category: coding

Function calling

Category: planning

Structured output

Producing data in structured formats such as JSON.

Category: structured_generation

Multilingual

Understanding and generating text in many languages.

Category: language

Planning

Forming and executing action plans for complex tasks.

Category: planning

Streaming output

Category: reasoning

Benchmark results

11 benchmarks

SWE-Pro

pass@1

56.22%

📄 MiniMax official blog / Hugging Face model card

Real-world software engineering tasks. MiniMax reports this matches GPT-5.3-Codex level.

VIBE-Pro

pass@1

55.6%

📄 MiniMax official blog / Hugging Face model card

End-to-end full project delivery benchmark. Reported close to Opus 4.6.

Terminal Bench 2

accuracy

57.0%

📄 MiniMax official blog / Hugging Face model card

SWE Multilingual

score

76.5

📄 Hugging Face model card

Multi SWE Bench

score

52.7

📄 Hugging Face model card

NL2Repo

score

39.8

📄 Hugging Face model card

GDPval-AA (ELO)

ELO

1495

📄 MiniMax official blog / Hugging Face model card

Highest among open-weight models per MiniMax; the benchmark covers 45 models on professional office tasks.

Toolathon

accuracy

46.3%

📄 Hugging Face model card

MM Claw

accuracy

62.7%

📄 Hugging Face model card

End-to-end benchmark. MiniMax reports close to Sonnet 4.6.

MLE Bench Lite (Medal Rate, 22 competitions)

medal rate

66.6%

📄 Hugging Face model card

Average over three 24-hour autonomous runs. Behind only Opus 4.6 (75.7%) and GPT-5.4 (71.2%).

Artificial Analysis Intelligence Index

index score

50

📄 Artificial Analysis (https://artificialanalysis.ai/models/minimax-m2-7)

Score of 50 vs field average of 27 (non-reasoning open-weight models of similar size). As of March/April 2026.

Sources and related pages

12 sources

Web: MiniMax M2.7, official product page (minimax.io)
Blog: MiniMax M2.7: Early Echoes of Self-Evolution, official MiniMax blog (minimax.io)
Repo: MiniMaxAI/MiniMax-M2.7, Hugging Face model card (huggingface.co)
Repo: MiniMax-AI/MiniMax-M2.7, GitHub (github.com)
Docs: MiniMax API Models, official documentation (platform.minimax.io)
Docs: MiniMax Pay as You Go Pricing (platform.minimax.io)
Docs: MiniMax Release Notes, Models (platform.minimax.io)
Blog: NVIDIA Technical Blog: MiniMax M2.7 Advances Scalable Agentic Workflows on NVIDIA Platforms (developer.nvidia.com)
Web: Artificial Analysis, MiniMax M2.7 Intelligence Index & Specs (artificialanalysis.ai)
Web: Ollama, minimax-m2.7 library page (ollama.com)
Web: Decrypt: MiniMax Drops State-of-the-Art AI Agent Model, Then Quietly Changes the License (decrypt.co)
Repo: MiniMax-M2.7 LICENSE, GitHub (github.com)
