
Mistral Small 3.1 is built with local deployment in mind. Like Gemma 3 27B, it is a mid-sized multimodal model with tens of billions of parameters, yet lightweight enough to run on a single Nvidia RTX 4090.

Mistral Small 3.1 is released under an Apache 2.0 license.


Model Card for Mistral-Small-3.1-24B-Instruct-2503

Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.
This model is an instruction-finetuned version of Mistral-Small-3.1-24B-Base-2503.

Mistral Small 3.1 can be deployed locally and is exceptionally “knowledge-dense,” fitting within a single RTX 4090 or a 32GB RAM MacBook once quantized.
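
As a quick illustration of local inference, here is a minimal Python sketch that queries a locally running copy of the model through the Ollama Python client. The model tag mistral-small3.1 is an assumption; check the tag your local install actually uses.

```python
# Minimal local-inference sketch. Assumes the `ollama` Python package
# (pip install ollama) and a local Ollama server that already has the
# model pulled, e.g. `ollama pull mistral-small3.1` (tag is an assumption).
import ollama

response = ollama.chat(
    model="mistral-small3.1",  # assumed tag for Mistral Small 3.1
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the Apache 2.0 license in one sentence."},
    ],
)
print(response["message"]["content"])
```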

It is ideal for:

  • Fast-response conversational agents.
  • Low-latency function calling.
  • Subject matter experts via fine-tuning.
  • Local inference for hobbyists and organizations handling sensitive data.
  • Programming and math reasoning.
  • Long document understanding.
  • Visual understanding.

For enterprises requiring specialized capabilities (increased context, specific modalities, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.

Learn more about Mistral Small 3.1 in our blog post.

Key Features

  • Vision: Vision capabilities enable the model to analyze images and provide insights based on visual content in addition to text.
  • Multilingual: Supports dozens of languages, including English, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Nepali, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Turkish, Ukrainian, Vietnamese, Arabic, Bengali, Chinese, Farsi.
  • Agent-Centric: Offers best-in-class agentic capabilities with native function calling and JSON outputting (see the sketch after this list).
  • Advanced Reasoning: State-of-the-art conversational and reasoning capabilities.
  • Apache 2.0 License: Open license allowing usage and modification for both commercial and non-commercial purposes.
  • Context Window: A 128k context window.
  • System Prompt: Maintains strong adherence and support for system prompts.
  • Tokenizer: Utilizes a Tekken tokenizer with a 131k vocabulary size.
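
To make the agent-centric features above concrete, here is a hedged sketch of native function calling through the Ollama Python client. The get_weather tool and the model tag are illustrative assumptions; the OpenAI-style tool schema is the format recent Ollama clients accept.

```python
# Hedged function-calling sketch via the Ollama Python client.
import ollama

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = ollama.chat(
    model="mistral-small3.1",  # assumed tag
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the model decided to call a tool, the calls appear on the message.
for call in response["message"].get("tool_calls") or []:
    print(call["function"]["name"], call["function"]["arguments"])
```

In a real agent loop you would execute the requested tool, append its result as a tool message, and call chat again so the model can compose the final answer.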

Benchmark Results

When available, we report numbers previously published by other model providers; otherwise, we re-evaluate them using our own evaluation harness.

Pretrain Evals

| Model | MMLU (5-shot) | MMLU Pro (5-shot CoT) | TriviaQA | GPQA Main (5-shot CoT) | MMMU |
|---|---|---|---|---|---|
| Small 3.1 24B Base | 81.01% | 56.03% | 80.50% | 37.50% | 59.27% |
| Gemma 3 27B PT | 78.60% | 52.20% | 81.30% | 24.30% | 56.10% |

Instruction Evals

Text

| Model | MMLU | MMLU Pro (5-shot CoT) | MATH | GPQA Main (5-shot CoT) | GPQA Diamond (5-shot CoT) | MBPP | HumanEval | SimpleQA (TotalAcc) |
|---|---|---|---|---|---|---|---|---|
| Small 3.1 24B Instruct | 80.62% | 66.76% | 69.30% | 44.42% | 45.96% | 74.71% | 88.41% | 10.43% |
| Gemma 3 27B IT | 76.90% | 67.50% | 89.00% | 36.83% | 42.40% | 74.40% | 87.80% | 10.00% |
| GPT4o Mini | 82.00% | 61.70% | 70.20% | 40.20% | 39.39% | 84.82% | 87.20% | 9.50% |
| Claude 3.5 Haiku | 77.60% | 65.00% | 69.20% | 37.05% | 41.60% | 85.60% | 88.10% | 8.02% |
| Cohere Aya-Vision 32B | 72.14% | 47.16% | 41.98% | 34.38% | 33.84% | 70.43% | 62.20% | 7.65% |

Vision

| Model | MMMU | MMMU PRO | MathVista | ChartQA | DocVQA | AI2D | MM MT-Bench |
|---|---|---|---|---|---|---|---|
| Small 3.1 24B Instruct | 64.00% | 49.25% | 68.91% | 86.24% | 94.08% | 93.72% | 7.3 |
| Gemma 3 27B IT | 64.90% | 48.38% | 67.60% | 76.00% | 86.60% | 84.50% | 7.0 |
| GPT4o Mini | 59.40% | 37.60% | 56.70% | 76.80% | 86.70% | 88.10% | 6.6 |
| Claude 3.5 Haiku | 60.50% | 45.03% | 61.60% | 87.20% | 90.00% | 92.10% | 6.5 |
| Cohere Aya-Vision 32B | 48.20% | 31.50% | 50.10% | 63.04% | 72.40% | 82.57% | 4.1 |

Multilingual Evals

| Model | Average | European | East Asian | Middle Eastern |
|---|---|---|---|---|
| Small 3.1 24B Instruct | 71.18% | 75.30% | 69.17% | 69.08% |
| Gemma 3 27B IT | 70.19% | 74.14% | 65.65% | 70.76% |
| GPT4o Mini | 70.36% | 74.21% | 65.96% | 70.90% |
| Claude 3.5 Haiku | 70.16% | 73.45% | 67.05% | 70.00% |
| Cohere Aya-Vision 32B | 62.15% | 64.70% | 57.61% | 64.12% |

Long Context Evals

| Model | LongBench v2 | RULER 32K | RULER 128K |
|---|---|---|---|
| Small 3.1 24B Instruct | 37.18% | 93.96% | 81.20% |
| Gemma 3 27B IT | 34.59% | 91.10% | 66.00% |
| GPT4o Mini | 29.30% | 90.20% | 65.80% |
| Claude 3.5 Haiku | 35.19% | 92.60% | 91.90% |

Basic Instruct Template (V7-Tekken)

<s>[SYSTEM_PROMPT]<system prompt>[/SYSTEM_PROMPT][INST]<user message>[/INST]<assistant response></s>[INST]<user message>[/INST]

<system prompt>, <user message>, and <assistant response> are placeholders.

Please make sure to use mistral-common as the source of truth for the exact prompt format and tokenization.
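
As a minimal sketch of that advice, the following Python snippet uses mistral-common to render and tokenize a chat. Loading the tokenizer via MistralTokenizer.from_hf_hub with that repo id is an assumption based on mistral-common's documented usage; adjust to however you obtain the tokenizer locally.

```python
# Hedged sketch: build the V7-Tekken prompt with mistral-common
# (pip install mistral-common) instead of templating by hand.
from mistral_common.protocol.instruct.messages import SystemMessage, UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

# from_hf_hub and the repo id are assumptions; see mistral-common docs.
tokenizer = MistralTokenizer.from_hf_hub("mistralai/Mistral-Small-3.1-24B-Instruct-2503")

tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(
        messages=[
            SystemMessage(content="You are a helpful assistant."),
            UserMessage(content="Hello!"),
        ]
    )
)
print(tokenized.text)         # rendered template, e.g. <s>[SYSTEM_PROMPT]...[/SYSTEM_PROMPT][INST]Hello![/INST]
print(len(tokenized.tokens))  # token ids ready to feed to an inference engine
```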

Citation

  1. Mistral Small 3.1
  2. Mistral-Small-3.1-24B-Instruct-2503