Q6_K / Q5_K_M / Q4_K_S | mistral-small3.1:24b-instruct-2503

vision tools



Extra quants for Mistral-Small-3.1-24B

Q6_K / Q5_K_M / Q4_K_S

These were quantized with the Ollama client, so they support vision


On an RTX 4090 with 24GB of VRAM

Q8 KV Cache enabled

Leave 800 MB to 1 GB of VRAM free as a buffer
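
The Q8 KV cache mentioned above is a server-side setting in Ollama. A minimal sketch of the environment it needs, assuming the server is started from Python: `OLLAMA_KV_CACHE_TYPE` and `OLLAMA_FLASH_ATTENTION` are Ollama's own environment variables, while the helper name `ollama_server_env` is hypothetical and only for illustration.

```python
import os


def ollama_server_env(base=None):
    """Return a copy of the environment with Q8 KV cache settings for `ollama serve`.

    `base` is an optional mapping to start from; by default the current
    process environment is copied. This helper is illustrative, not part of Ollama.
    """
    env = dict(os.environ if base is None else base)
    # Ollama needs flash attention enabled to use a quantized KV cache.
    env["OLLAMA_FLASH_ATTENTION"] = "1"
    # q8_0 selects the 8-bit KV cache (the default is f16).
    env["OLLAMA_KV_CACHE_TYPE"] = "q8_0"
    return env


# Usage (not executed here):
#   import subprocess
#   subprocess.Popen(["ollama", "serve"], env=ollama_server_env())
```

The same two variables can be exported in the shell or set in the service definition instead, depending on how the Ollama server is launched.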


Q6_K: 35K context

Q5_K_M: 64K context

Q4_K_S: 100K context
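
The context sizes above only take effect if `num_ctx` is set explicitly when the model is called. The sketch below shows one way to pass them through the `options` field of Ollama's `/api/generate` request body, plus a rough estimate of how much VRAM a q8_0 KV cache would take at each size. Several things here are assumptions not stated in this README: "K" is read as 1024 tokens, the architecture values (40 layers, 8 KV heads, head dimension 128) are taken to be those of Mistral Small 24B, and the names `CONTEXT_TOKENS`, `kv_cache_gib`, and `generate_payload` are hypothetical.

```python
# Context sizes from the list above; "K" is ASSUMED to mean 1024 tokens.
CONTEXT_TOKENS = {
    "Q6_K": 35 * 1024,
    "Q5_K_M": 64 * 1024,
    "Q4_K_S": 100 * 1024,
}

# ASSUMED architecture values for Mistral Small 24B (not given in this README).
N_LAYERS = 40
N_KV_HEADS = 8
HEAD_DIM = 128

# q8_0 stores 32 int8 values plus one fp16 scale: 34 bytes per 32 elements.
Q8_0_BYTES_PER_ELEMENT = 34 / 32


def kv_cache_gib(num_ctx: int) -> float:
    """Rough size in GiB of a q8_0 KV cache holding num_ctx tokens."""
    elements_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM  # keys + values
    return num_ctx * elements_per_token * Q8_0_BYTES_PER_ELEMENT / 1024**3


def generate_payload(model_tag: str, quant: str, prompt: str) -> dict:
    """Build a JSON body for POST /api/generate with num_ctx set for the quant.

    model_tag is whatever tag the chosen quant was pulled under; it is left to
    the caller because the exact tag names are not listed in this README.
    """
    return {
        "model": model_tag,
        "prompt": prompt,
        "options": {"num_ctx": CONTEXT_TOKENS[quant]},
    }
```

Under these assumptions the KV cache alone comes to roughly 2.9 GiB for Q6_K, 5.3 GiB for Q5_K_M, and 8.3 GiB for Q4_K_S, which is consistent with smaller quants leaving room for longer contexts on a 24 GB card. Treat the figures as an estimate only: the model weights, the vision projector, and compute buffers also take VRAM.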