
MiniMax's M2-series model for coding, agentic workflows, and professional productivity.

Capabilities: tools, thinking, cloud
ollama run minimax-m2.7:cloud
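Beyond the CLI, the model can also be reached through Ollama's local REST API. A minimal sketch, assuming an Ollama server listening on the default port 11434 (the endpoint and response shape follow Ollama's standard `/api/chat` interface):

```python
import json
import urllib.request

# Sketch: calling minimax-m2.7:cloud through Ollama's local REST API.
# Assumes an Ollama server is running on the default port.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

def build_chat_request(model: str, prompt: str) -> dict:
    """JSON body for a single-turn, non-streaming /api/chat call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(prompt: str) -> str:
    """Send the prompt and return the assistant's reply text."""
    body = json.dumps(build_chat_request("minimax-m2.7:cloud", prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_CHAT_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

if __name__ == "__main__":
    print(chat("Write a one-line summary of quicksort."))
```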

Applications

Claude Code: ollama launch claude --model minimax-m2.7:cloud
Codex: ollama launch codex --model minimax-m2.7:cloud
OpenCode: ollama launch opencode --model minimax-m2.7:cloud
OpenClaw: ollama launch openclaw --model minimax-m2.7:cloud


Readme


MiniMax M2.7 is the first model in the M2-series to deeply participate in its own evolution, capable of building complex agent harnesses and completing highly elaborate productivity tasks through Agent Teams, complex Skills, and dynamic tool search.
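The "dynamic tool search" idea mentioned above — exposing only the tools relevant to the current request rather than the whole catalog — can be sketched as a scored lookup over tool descriptions. The tool registry and word-overlap scoring below are purely illustrative, not MiniMax's implementation:

```python
# Hypothetical sketch of dynamic tool search: score each registered
# tool's description against the query and surface only the top-k.
# Tool names and descriptions are invented for illustration.
TOOLS = {
    "read_file": "read the contents of a file from disk",
    "run_tests": "execute the project's test suite and report failures",
    "web_search": "search the web for documentation and examples",
    "edit_cell": "edit a spreadsheet cell in an excel workbook",
}

def search_tools(query: str, k: int = 2) -> list[str]:
    """Return the k tool names whose descriptions best overlap the query."""
    words = set(query.lower().split())
    scored = sorted(
        TOOLS,
        key=lambda name: len(words & set(TOOLS[name].split())),
        reverse=True,
    )
    return scored[:k]

print(search_tools("find why the test suite failures happen"))
```

In a real harness the surfaced tools would then be passed as the tool schema for the next model call, keeping the context small even when hundreds of tools are registered.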

Highlights

  • Professional software engineering. M2.7 delivers outstanding performance in real-world engineering scenarios, including end-to-end project delivery, log analysis and bug troubleshooting, code security, and machine learning. On SWE-Pro it scored 56.22%, matching GPT-5.3-Codex and approaching Opus's best result. Its scores on VIBE-Pro (55.6%) and Terminal Bench 2 (57.0%) demonstrate a deep understanding of complex engineering systems.

  • Professional work and complex environments. In the GDPval-AA evaluation across 45 models, M2.7 achieved an ELO score of 1495, the highest among open-source models. It handles complex editing in Excel, PPT, and Word with multi-round high-fidelity revisions, and maintains a 97% skill adherence rate across 40 complex skills (each exceeding 2,000 tokens). On Toolathon, M2.7 reached 46.3% accuracy, a global top-tier result.

  • Character consistency and entertainment. M2.7 possesses excellent character consistency and emotional intelligence. MiniMax has open-sourced OpenRoom, a Web GUI interaction system in which conversation drives real-time visual feedback and scene interactions, with characters proactively engaging their environment.


Benchmarks

Software Engineering

M2.7 reaches the level of state-of-the-art models across real-world programming tasks spanning multiple languages and system-level comprehension.

| Benchmark | M2.7 | Notes |
| --- | --- | --- |
| SWE-Pro | 56.22% | Matches GPT-5.3-Codex |
| VIBE-Pro | 55.6% | Nearly on par with Opus 4.6 |
| Terminal Bench 2 | 57.0% | Deep system-level understanding |
| SWE Multilingual | 76.5 | Multi-language engineering |
| Multi SWE Bench | 52.7 | Multi-repo tasks |
| NL2Repo | 39.8 | Natural language to repository |

Professional Work & Office

| Benchmark | M2.7 | Notes |
| --- | --- | --- |
| GDPval-AA (ELO) | 1495 | Highest among open-source models |
| Toolathon | 46.3% | Global top tier |
| MM Claw | 62.7% | Close to Sonnet 4.6 |
| Skill Adherence (40 skills) | 97% | Each skill >2,000 tokens |

Machine Learning (MLE Bench Lite)

In exploratory self-evolution tests across 22 ML competitions, M2.7 achieved an average medal rate of 66.6% across three 24-hour autonomous runs, second only to Opus 4.6 (75.7%) and GPT-5.4 (71.2%), tying with Gemini 3.1.
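The headline number is a simple average of per-run medal rates. The per-run medal counts below are invented for illustration; only the 22-competition suite and the three-run averaging come from the text above:

```python
# Illustrative arithmetic for the MLE Bench Lite average medal rate.
# The per-run medal counts are hypothetical; only the 22 competitions
# and the averaging over three runs are stated in the text.
COMPETITIONS = 22

def average_medal_rate(medals_per_run: list[int]) -> float:
    """Mean over runs of (medals won / competitions entered), as a percentage."""
    rates = [m / COMPETITIONS for m in medals_per_run]
    return 100 * sum(rates) / len(rates)

# Three hypothetical 24-hour runs winning 15, 14, and 15 medals:
print(round(average_medal_rate([15, 14, 15]), 1))
```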
