Q6_K / Q5_K_M / Q4_K_S | mistral-small3.1:24b-instruct-2503

tools · 5 months ago

e0a0e861d441 · 18GB · mistral3 · 24B · Q5_K_M

{ "num_ctx": 4096 }

Readme

Extra quants for Mistral-Small-3.1-24B

Q6_K / Q5_K_M / Q4_K_S

These were quantized with the Ollama client, so they retain Vision support.


Approximate maximum context on an RTX 4090 with 24GB of VRAM, with the Q8 KV cache enabled and 800MB to 1GB of VRAM left free as a buffer:

- Q6_K: 35K context
- Q5_K_M: 64K context
- Q4_K_S: 100K context
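
As a sketch of how these settings might be applied: Ollama's KV-cache quantization is controlled by the `OLLAMA_KV_CACHE_TYPE` server environment variable (`q8_0` for the Q8 cache used above, and it requires flash attention), while the context window is set with the `num_ctx` parameter in a Modelfile. The tag and model names below are placeholders; substitute the actual quant tag you pulled from this page.

```shell
# Serve with a q8_0-quantized KV cache (flash attention must be enabled for this)
OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q8_0 ollama serve &

# Build a variant of the Q5_K_M quant with a 64K context window
# (the FROM tag is illustrative; replace it with the actual quant tag)
cat > Modelfile <<'EOF'
FROM mistral-small3.1:24b-instruct-2503
PARAMETER num_ctx 65536
EOF
ollama create mistral-small3.1-64k -f Modelfile
```

With the Q4_K_S quant the same approach should leave room for `num_ctx 102400` (100K) within the buffer described above.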