761 downloads · Updated 9 months ago
| Name | Size | Context window | Input | Updated |
|---|---|---|---|---|
| mistral-small:24b-instruct-2501-q3_K_S | 10GB | 32K | Text | 9 months ago |
| mistral-small:24b-instruct-2501-q3_K_M | 11GB | 32K | Text | 9 months ago |
| mistral-small:24b-instruct-2501-q3_K_L | 12GB | 32K | Text | 9 months ago |
| mistral-small:24b-instruct-2501-q4_0 | 13GB | 32K | Text | 9 months ago |
| mistral-small:24b-instruct-2501-q4_1 | 15GB | 32K | Text | 9 months ago |
| mistral-small:24b-instruct-2501-q4_K_S | 14GB | 32K | Text | 9 months ago |
| mistral-small:24b-instruct-2501-q5_0 | 16GB | 32K | Text | 9 months ago |
| mistral-small:24b-instruct-2501-q5_1 | 18GB | 32K | Text | 9 months ago |
| mistral-small:24b-instruct-2501-q5_K_S | 16GB | 32K | Text | 9 months ago |
| mistral-small:24b-instruct-2501-q5_K_M | 17GB | 32K | Text | 9 months ago |
| mistral-small:24b-instruct-2501-q6_K | 19GB | 32K | Text | 9 months ago |
These are alternative quantization levels of Mistral’s new 24B Mistral Small 3. No fine-tuning has been done; these are purely quantized variants.
Benchmarks on M1 Max (64GB):
| Quant | Tok/sec |
|---|---|
| Q8_0 | 13.40 |
| Q6_K | 12.20 |
| Q5_K_M | 13.35 |
| Q5_K_S | 13.91 |
| Q5_1 | 15.16 |
| Q5_0 | 15.23 |
| Q4_K_M | 17.99 |
| Q4_K_S | 20.05 |
| Q4_1 | 20.50 |
| Q4_0 | 22.09 |
| Q3_K_L | 14.35 |
| Q3_K_M | 16.18 |
| Q3_K_S | 14.96 |
For easy prompts that can tolerate occasional mistakes, run Q4_0, the fastest quant in these benchmarks. For balanced quality with decent speed, use Q4_K_M. Avoid Q6_K, which was the slowest here.
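To try one of these quants locally, you can pull and run it with the Ollama CLI (assumes Ollama is installed and the server is running); `--verbose` prints the eval rate so you can compare tokens/sec on your own hardware:

```shell
# Download the recommended balanced quant (one of the tags listed above)
ollama pull mistral-small:24b-instruct-2501-q4_K_M

# Run it interactively; --verbose reports timing stats (including eval rate
# in tokens/s) after each response, which is how numbers like those in the
# table above can be measured
ollama run mistral-small:24b-instruct-2501-q4_K_M --verbose
```

Swap the tag for any other quant in the list (e.g. `…-q4_0` for speed) to benchmark it the same way.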