JollyLlama/Mistral-Small-3.1-24B:Q4_K_S

Q6_K / Q5_K_M / Q4_K_S | mistral-small3.1:24b-instruct-2503

vision tools

176 Pulls · Updated 12 days ago

5b3f2f03c049 · 15GB

quantization: Q4_K_S

{ "num_ctx": 4096 }

{{- range $index, $_ := .Messages }} {{- if eq .Role "system" }}[SYSTEM_PROMPT]{{ .Content }}[/SYSTE

You are Mistral Small 3.1, a Large Language Model (LLM) created by Mistral AI, a French startup head

Readme

Extra quants for Mistral-Small-3.1-24B

Q6_K / Q5_K_M / Q4_K_S

These were quantized with the Ollama client, so they support vision.

On an RTX 4090 with 24GB of VRAM, with Q8 KV cache enabled and 800MB to 1GB of VRAM left free as a buffer, the following context sizes fit:

Q6_K: 35K context

Q5_K_M: 64K context

Q4_K_S: 100K context
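
The page lists a default of "num_ctx": 4096, so the larger contexts above have to be requested explicitly. Below is a minimal Python sketch of one way to do that through Ollama's local REST API (`/api/generate` with an `options.num_ctx` override). It rests on several assumptions not stated in this README: that "K" means 1024 tokens, that the Q6_K and Q5_K_M tags follow the same naming as the Q4_K_S tag, and that the server runs on the default `http://localhost:11434`. The Q8 KV cache is a server-side setting, not part of the request. In recent Ollama versions it is normally enabled with the `OLLAMA_KV_CACHE_TYPE=q8_0` and `OLLAMA_FLASH_ATTENTION=1` environment variables, but check the documentation for your version.

```python
import json
import urllib.request

# Context sizes reported above for an RTX 4090 (24GB) with Q8 KV cache.
# Assumption: "K" is read as 1024 tokens.
CONTEXT_BY_QUANT = {
    "Q6_K": 35 * 1024,
    "Q5_K_M": 64 * 1024,
    "Q4_K_S": 100 * 1024,
}


def build_request(quant: str, prompt: str) -> dict:
    """Build a non-streaming /api/generate payload for the given quant."""
    return {
        # Assumption: every quant is published under the same tag pattern.
        "model": f"JollyLlama/Mistral-Small-3.1-24B:{quant}",
        "prompt": prompt,
        "stream": False,
        # Override the default num_ctx of 4096 for this request.
        "options": {"num_ctx": CONTEXT_BY_QUANT[quant]},
    }


def generate(quant: str, prompt: str, host: str = "http://localhost:11434") -> str:
    """Send the request to a locally running Ollama server and return the text."""
    payload = json.dumps(build_request(quant, prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Q4_K_S", "Summarize this document: ...")` would then run with the 100K context, provided the server was started with the KV cache settings described above. If VRAM runs short, lowering the value in `CONTEXT_BY_QUANT` is the simplest adjustment.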