Alternative quantization levels, no fine-tuning

Updated 9 months ago

Models

11 models

mistral-small:24b-instruct-2501-q3_K_S · 10GB · 32K context window · Text
mistral-small:24b-instruct-2501-q3_K_M · 11GB · 32K context window · Text
mistral-small:24b-instruct-2501-q3_K_L · 12GB · 32K context window · Text
mistral-small:24b-instruct-2501-q4_0 · 13GB · 32K context window · Text
mistral-small:24b-instruct-2501-q4_1 · 15GB · 32K context window · Text
mistral-small:24b-instruct-2501-q4_K_S · 14GB · 32K context window · Text
mistral-small:24b-instruct-2501-q5_0 · 16GB · 32K context window · Text
mistral-small:24b-instruct-2501-q5_1 · 18GB · 32K context window · Text
mistral-small:24b-instruct-2501-q5_K_S · 16GB · 32K context window · Text
mistral-small:24b-instruct-2501-q5_K_M · 17GB · 32K context window · Text
mistral-small:24b-instruct-2501-q6_K · 19GB · 32K context window · Text

Readme

These are alternative quantization levels of Mistral's new 24B Mistral Small 3 model. No fine-tuning has been done; these are purely quantized.

Benchmarks on M1 Max (64GB):

Quant    Tok/sec
Q8_0     13.40
Q6_K     12.20
Q5_K_M   13.35
Q5_K_S   13.91
Q5_1     15.16
Q5_0     15.23
Q4_K_M   17.99
Q4_K_S   20.05
Q4_1     20.50
Q4_0     22.09
Q3_K_L   14.35
Q3_K_M   16.18
Q3_K_S   14.96
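Figures like the ones above can be derived from the timing fields that ollama reports for a non-streaming generate call: `eval_count` (output tokens generated) and `eval_duration` (time spent generating, in nanoseconds). A minimal sketch of the calculation, using made-up numbers for illustration:

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation throughput: output tokens divided by generation time.

    eval_count:       number of tokens the model produced
    eval_duration_ns: time spent producing them, in nanoseconds
    """
    return eval_count / (eval_duration_ns / 1e9)

# Hypothetical run: 512 tokens in 23.17 s gives roughly the Q4_0
# figure from the table above (~22.1 tok/sec).
print(round(tokens_per_second(512, 23_170_000_000), 2))
```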

For easy prompts that tolerate occasional mistakes, Q4_0 is the fastest choice. For balanced quality at decent speed, use Q4_K_M. Avoid Q6_K, the slowest quant in this benchmark.