We introduce two new state-of-the-art models for local intelligence, on-device computing, and at-the-edge use cases. We call them les Ministraux: Ministral 3B and Ministral 8B.
Ministral-8B-Instruct-2410 is an instruction fine-tuned language model that significantly outperforms existing models of similar size. It is released under the Mistral Research License.
If you are interested in using Ministral-3B or Ministral-8B commercially (both outperform Mistral 7B), reach out to us.
For more details about les Ministraux please refer to our release blog post.
### Chat Template

The model expects conversations in the following format:

`<s>[INST]user message[/INST]assistant response</s>[INST]new user message[/INST]`
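For illustration, here is a minimal sketch of rendering this template with the transformers tokenizer. It assumes the Hugging Face checkpoint id `mistralai/Ministral-8B-Instruct-2410` and that the bundled chat template matches the format above:

```python
# Minimal sketch: rendering a conversation with the chat template via
# transformers. Assumes the mistralai/Ministral-8B-Instruct-2410 checkpoint
# ships a chat template matching the format above; adjust the id if needed.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Ministral-8B-Instruct-2410")

messages = [
    {"role": "user", "content": "user message"},
    {"role": "assistant", "content": "assistant response"},
    {"role": "user", "content": "new user message"},
]

# Render the conversation to a single prompt string; add_generation_prompt
# leaves the template open for the assistant's next turn.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```

Passing `tokenize=True` instead returns token ids directly, which can be fed to any generation backend.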
### Model Architecture

Feature | Value |
---|---|
Architecture | Dense Transformer |
Parameters | 8,019,808,256 |
Layers | 36 |
Heads | 32 |
Dim | 4096 |
KV Heads (GQA) | 8 |
Hidden Dim | 12288 |
Head Dim | 128 |
Vocab Size | 131,072 |
Context Length | 128k |
Attention Pattern | Ragged (128k,32k,32k,32k) |
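As a sanity check, the parameter count above can be reproduced from the other rows. The sketch below assumes a Llama-style dense decoder (GQA attention, SwiGLU feed-forward, RMSNorm, untied input/output embeddings); these architectural details are assumptions, not a published spec:

```python
# Sketch: reproducing the parameter count from the table above, assuming a
# Llama-style dense decoder (GQA attention, SwiGLU MLP, RMSNorm, untied
# embeddings). The architecture details are assumptions, not a published spec.
dim, layers, heads, kv_heads = 4096, 36, 32, 8
head_dim, hidden_dim, vocab = 128, 12288, 131072

attn = dim * heads * head_dim           # Q projection
attn += 2 * dim * kv_heads * head_dim   # K and V projections (GQA: 8 KV heads)
attn += heads * head_dim * dim          # output projection

mlp = 3 * dim * hidden_dim              # SwiGLU: gate, up, and down projections
norms = 2 * dim                         # two RMSNorm weight vectors per layer

per_layer = attn + mlp + norms
# Add input embeddings, untied output head, and the final norm.
total = layers * per_layer + 2 * vocab * dim + dim

print(total)  # 8019808256, matching the table
```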
### Base Model Benchmarks

#### Knowledge & Commonsense
Model | MMLU | AGIEval | Winogrande | ARC-c | TriviaQA |
---|---|---|---|---|---|
Mistral 7B Base | 62.5 | 42.5 | 74.2 | 67.9 | 62.5 |
Llama 3.1 8B Base | 64.7 | 44.4 | 74.6 | 46.0 | 60.2 |
Ministral 8B Base | 65.0 | 48.3 | 75.3 | 71.9 | 65.5 |
Gemma 2 2B Base | 52.4 | 33.8 | 68.7 | 42.6 | 47.8 |
Llama 3.2 3B Base | 56.2 | 37.4 | 59.6 | 43.1 | 50.7 |
Ministral 3B Base | 60.9 | 42.1 | 72.7 | 64.2 | 56.7 |
#### Code & Math
Model | HumanEval pass@1 | GSM8K maj@8 |
---|---|---|
Mistral 7B Base | 26.8 | 32.0 |
Llama 3.1 8B Base | 37.8 | 42.2 |
Ministral 8B Base | 34.8 | 64.5 |
Gemma 2 2B Base | 20.1 | 35.5 |
Llama 3.2 3B Base | 14.6 | 33.5 |
Ministral 3B Base | 34.2 | 50.9 |
#### Multilingual
Model | French MMLU | German MMLU | Spanish MMLU |
---|---|---|---|
Mistral 7B Base | 50.6 | 49.6 | 51.4 |
Llama 3.1 8B Base | 50.8 | 52.8 | 54.6 |
Ministral 8B Base | 57.5 | 57.4 | 59.6 |
Gemma 2 2B Base | 41.0 | 40.1 | 41.7 |
Llama 3.2 3B Base | 42.3 | 42.2 | 43.1 |
Ministral 3B Base | 49.1 | 48.3 | 49.5 |
### Instruct Model Benchmarks

#### Chat/Arena (gpt-4o judge)
Model | MT-Bench | Arena-Hard | WildBench |
---|---|---|---|
Mistral 7B Instruct v0.3 | 6.7 | 44.3 | 33.1 |
Llama 3.1 8B Instruct | 7.5 | 62.4 | 37.0 |
Gemma 2 9B Instruct | 7.6 | 68.7 | 43.8 |
Ministral 8B Instruct | 8.3 | 70.9 | 41.3 |
Gemma 2 2B Instruct | 7.5 | 51.7 | 32.5 |
Llama 3.2 3B Instruct | 7.2 | 46.0 | 27.2 |
Ministral 3B Instruct | 8.1 | 64.3 | 36.3 |
#### Code & Math
Model | MBPP pass@1 | HumanEval pass@1 | MATH maj@1 |
---|---|---|---|
Mistral 7B Instruct v0.3 | 50.2 | 38.4 | 13.2 |
Gemma 2 9B Instruct | 68.5 | 67.7 | 47.4 |
Llama 3.1 8B Instruct | 69.7 | 67.1 | 49.3 |
Ministral 8B Instruct | 70.0 | 76.8 | 54.5 |
Gemma 2 2B Instruct | 54.5 | 42.7 | 22.8 |
Llama 3.2 3B Instruct | 64.6 | 61.0 | 38.4 |
Ministral 3B Instruct | 67.7 | 77.4 | 51.7 |
#### Function Calling
Model | Internal bench |
---|---|
Mistral 7B Instruct v0.3 | 6.9 |
Llama 3.1 8B Instruct | N/A |
Gemma 2 9B Instruct | N/A |
Ministral 8B Instruct | 31.6 |
Gemma 2 2B Instruct | N/A |
Llama 3.2 3B Instruct | N/A |
Ministral 3B Instruct | 28.4 |
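As an illustration of exercising function calling, here is a hedged sketch using the transformers chat-template API. The `tools` argument assumes a recent transformers release that accepts tool definitions, and the `get_weather` schema is a hypothetical example, not part of the model card:

```python
# Sketch of a function-calling prompt, assuming a recent transformers version
# whose apply_chat_template accepts a `tools` argument and that this
# checkpoint's template supports tool definitions. get_weather is hypothetical.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Ministral-8B-Instruct-2410")

get_weather = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# Render a prompt that exposes the tool schema to the model; the model is
# then expected to emit a structured tool call for the runtime to execute.
prompt = tokenizer.apply_chat_template(
    messages, tools=[get_weather], tokenize=False, add_generation_prompt=True
)
print(prompt)
```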