Q6_K / Q5_K_M / Q4_K_S | mistral-small3.1:24b-instruct-2503
vision
tools
176 Pulls · Updated 12 days ago
5b3f2f03c049 · 15GB
model
arch mistral3 · parameters 24B · quantization Q4_K_S
15GB
params
{
  "num_ctx": 4096
}
17B
template
{{- range $index, $_ := .Messages }}
{{- if eq .Role "system" }}[SYSTEM_PROMPT]{{ .Content }}[/SYSTE
695B
system
You are Mistral Small 3.1, a Large Language Model (LLM) created by Mistral AI, a French startup head
1.5kB
Readme
Extra quants for Mistral-Small-3.1-24B
Q6_K / Q5_K_M / Q4_K_S
These were quantized with the ollama client, so the quants retain vision support.
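As a rough sketch of how quants like these can be produced, the ollama client can quantize at create time; the model name, Modelfile path, and source-weights filename below are hypothetical, not taken from this page:

```shell
# Modelfile pointing at full-precision weights (path is an assumption):
#   FROM ./Mistral-Small-3.1-24B-Instruct-2503-F16.gguf

# Quantize while creating. Client-side quantization keeps the vision
# projector intact, which is why these quants still support vision.
ollama create mistral-small3.1-extra:q4_k_s -f Modelfile --quantize q4_K_S
```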
On an RTX 4090 with 24GB of VRAM, with the Q8 KV cache enabled and 800MB to 1GB of VRAM left as a buffer, the following context sizes fit:
Q6_K: 35K context
Q5_K_M: 64K context
Q4_K_S: 100K context
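As a back-of-envelope check on these numbers, the KV cache cost per token can be estimated from the model architecture. The layer/head figures below are assumptions (the commonly reported Mistral Small 3.1 config), not stated on this page:

```python
# Estimate KV-cache VRAM to see why smaller weight quants plus a Q8 KV
# cache buy more context on a 24GB card. Architecture numbers are
# assumptions, not from this page.
N_LAYERS = 40        # assumed transformer depth
N_KV_HEADS = 8       # assumed grouped-query KV heads
HEAD_DIM = 128       # assumed per-head dimension

def kv_bytes_per_token(bytes_per_elem: float) -> float:
    # K and V each store n_kv_heads * head_dim values per layer.
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_elem

def kv_cache_gib(n_ctx: int, bytes_per_elem: float = 1.0) -> float:
    # bytes_per_elem = 1.0 for a Q8 KV cache, 2.0 for f16.
    return kv_bytes_per_token(bytes_per_elem) * n_ctx / 1024**3

for n_ctx in (35_000, 64_000, 100_000):
    print(f"{n_ctx:>7} ctx: {kv_cache_gib(n_ctx):5.2f} GiB at q8, "
          f"{kv_cache_gib(n_ctx, 2.0):5.2f} GiB at f16")
```

Under these assumptions, 100K context costs roughly 7.6GiB at q8, which lines up with the 15GB Q4_K_S weights plus a ~1GB buffer on a 24GB card. Note the model ships with `num_ctx` set to 4096, so to reach these windows you must raise it, e.g. `/set parameter num_ctx 100000` inside `ollama run`, or a `PARAMETER num_ctx` line in a Modelfile.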