Devstral is an agentic LLM for software engineering tasks. Devstral 2 models excel at using tools to explore codebases, editing multiple files and power software engineering agents. The model achieves remarkable performance on SWE-bench.

Applications

Claude Code ollama launch claude --model SimonPu/Devstral-Small

Codex App ollama launch codex-app --model SimonPu/Devstral-Small

OpenClaw ollama launch openclaw --model SimonPu/Devstral-Small

Hermes Agent ollama launch hermes --model SimonPu/Devstral-Small

Codex ollama launch codex --model SimonPu/Devstral-Small

OpenCode ollama launch opencode --model SimonPu/Devstral-Small

Updated:

. 2026/02/09 update new model support for Devstral-Small:2512-Q4_K_M

Devstral Small 2 24B Instruct 2512

Devstral is an agentic LLM for software engineering tasks. Devstral Small 2 excels at using tools to explore codebases, editing multiple files and power software engineering agents.
The model achieves remarkable performance on SWE-bench.

This model is an Instruct model in FP8, fine-tuned to follow instructions, making it ideal for chat, agentic and instruction based tasks for SWE use cases.

For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we invite companies to reach out to us.

Key Features

The Devstral Small 2 Instruct model offers the following capabilities: - Agentic Coding: Devstral is designed to excel at agentic coding tasks, making it a great choice for software engineering agents. - Lightweight: with its compact size of just 24 billion parameters, Devstral is light enough to run on a single RTX 4090 or a Mac with 32GB RAM, making it an appropriate model for local deployment and on-device use. - Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes. - Context Window: A 256k context window.

Updates compared to Devstral Small 1.1: - Vision Capabilities: Enables the model to analyze images and provide insights based on visual content, in addition to text. - Improved Performance: Devstral Small 2 is a step-up compared to its predecessors. - Attention Softmax Temperature: Devstral Small 2 uses the same architecture as Ministral 3 using rope-scaling as introduced by Llama 4 and Scalable-Softmax Is Superior for Attention. - Better Generalization: Generalises better to diverse prompts and coding environments.

Use Cases

AI Code Assistants, Agentic Coding, and Software Engineering Tasks. Leveraging advanced AI capabilities for complex tool integration and deep codebase understanding in coding environments.

Benchmark Results

Model/Benchmark	Size (B Parameters)	SWE Bench Verified	SWE Bench Multilingual	Terminal Bench 2
Devstral 2	123	72.2%	61.3%	32.6%
Devstral Small 2	24	68.0%	55.7%	22.5%

GLM 4.6	355	68.0%	–	24.6%
Qwen 3 Coder Plus	480	69.6%	54.7%	25.4%
MiniMax M2	230	69.4%	56.5%	30.0%
Kimi K2 Thinking	1000	71.3%	61.1%	35.7%
DeepSeek v3.2	671	73.1%	70.2%	46.4%

GPT 5.1 Codex High	–	73.7%	–	52.8%
GPT 5.1 Codex Max	–	77.9%	–	60.4%
Gemini 3 Pro	–	76.2%	–	54.2%
Claude Sonnet 4.5	–	77.2%	68.0%	42.8%

*Benchmark results presented are based on publicly reported values for competitor models.