glm-5.2

Applications

Claude Code ollama launch claude --model glm-5.2:cloud

Codex App ollama launch codex-app --model glm-5.2:cloud

OpenClaw ollama launch openclaw --model glm-5.2:cloud

Hermes Agent ollama launch hermes --model glm-5.2:cloud

Codex ollama launch codex --model glm-5.2:cloud

OpenCode ollama launch opencode --model glm-5.2:cloud

GLM-5.2 is Z.ai’s flagship model for the era of long-horizon tasks.

With a truly usable 1M-token context window, it can handle project-level engineering context, execute long-running tasks more reliably, follow engineering standards more consistently, and complete the full development workflow from requirements to multi-platform deployment in a single task.

What’s new

Solid 1M context: A 1M-token context that stably sustains long-horizon work, not just accepts more tokens.
Advanced coding with flexible effort: Stronger coding capabilities with multiple thinking effort levels (High and Max) to balance performance against latency and compute.
Pure open: An MIT open-source license — technical access without borders.

Built for long-horizon tasks

Supporting long-horizon tasks starts with making long context engineering-usable: the model must maintain quality across long, messy coding-agent trajectories, not just accept more tokens. A 1M context is easy to claim but much harder to keep reliable under real engineering pressure. To this end, Z.AI substantially expanded 1M-context training for coding-agent scenarios — large-scale implementation, automated research, performance optimization, and complex debugging — producing a long-context system that is both wide in scope and solid in execution.

This shows up on three long-horizon coding benchmarks. On FrontierSWE, which measures whether an agent can complete open-ended technical projects at the scale of hours to tens of hours, GLM-5.2 trails Opus 4.8 by only 1% while edging out GPT-5.5 by 1% and Opus 4.7 by 11%. On PostTrainBench, where each agent is given an H100 GPU and judged by how much it improves small models through post-training, GLM-5.2 outperforms both Opus 4.7 and GPT-5.5, ranking second only to Opus 4.8. On SWE-Marathon, an ultra-long-horizon benchmark covering tasks like building compilers, optimizing kernels, and developing production-grade services, GLM-5.2 trails Opus 4.8 by 13% while remaining second only to the Opus series. Across all three, GLM-5.2 is the highest-ranked open-source model.

Stronger coding

On standard coding benchmarks, GLM-5.2 is the strongest open-source model, improving on GLM-5.1 by a wide margin: 81.0 vs. 62.0 on Terminal-Bench 2.1 and 62.1 vs. 58.4 on SWE-bench Pro. It also closes much of the gap to the closed-source frontier — on Terminal-Bench 2.1 (81.0) it lands within a few points of Claude Opus 4.8 (85.0) — while staying ahead of Gemini 3.1 Pro.

GLM-5.2 also introduces effort level control, letting users explicitly balance capability against speed and cost. At comparable token budgets, GLM-5.2 delivers substantially stronger agentic coding than GLM-5.1, with capability roughly positioned between Claude Opus 4.7 and Claude Opus 4.8 under similar token consumption. The Max effort level allocates additional computation when higher performance is needed on challenging tasks.

GLM-5.2 is Z.ai’s flagship model for the era of long-horizon tasks.

Applications

Models

Readme

What’s new

Built for long-horizon tasks

Stronger coding

Full benchmark table