55 pulls · updated 6 days ago

Light version - GigaChat-3.1-Lightning: it performs at roughly GPT-4o level on arena leaderboards while remaining compact and fast (quantized to Q4_K_M).

tools 10b
ollama run second_constantine/gigachat3.1:10b

Details

0cb27546b529 · 6.5GB · deepseek2 · 10.7B · Q4_K_M

Template (truncated in the page preview): {{- if .Messages }} {{- if or .System .Tools }} {{- if .System }} {{ .System }} {{- end }} {{- if .T

Params: { "stop": [ "User:", "Assistant:" ] }

Readme

Based on https://huggingface.co/ai-sage/GigaChat3.1-10B-A1.8B-GGUF (repackaged with a deepseek-style Modelfile).
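For reference, repackaging a GGUF with a custom Modelfile looks roughly like this (a sketch: the local file name is hypothetical; the stop parameters match the params shown in the Details section above):

```
FROM ./GigaChat3.1-10B-A1.8B-Q4_K_M.gguf
PARAMETER stop "User:"
PARAMETER stop "Assistant:"
```

Built with `ollama create second_constantine/gigachat3.1:10b -f Modelfile`.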

| Feature  | Value |
|----------|-------|
| vision   | false |
| thinking | false |
| tools    | true  |
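Since the model advertises tool calling, here is a minimal sketch of a tool-calling request body in the shape Ollama's /api/chat endpoint accepts (the payload is only built and serialized here, not sent; the weather tool and its schema are illustrative, not part of this model):

```python
import json

payload = {
    "model": "second_constantine/gigachat3.1:10b",
    "messages": [
        {"role": "user", "content": "What is the weather in Moscow?"}
    ],
    # OpenAI-style function schema; "get_weather" is a made-up example tool.
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "stream": False,
}

# On a running Ollama instance this would be POSTed to
# http://localhost:11434/api/chat; here we only serialize it.
body = json.dumps(payload)
print(body[:60])
```

If the model decides to call the tool, the response message carries a `tool_calls` list instead of plain content.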
| Device | Speed, tokens/s | Context | Memory, GB | Versions |
|---|---|---|---|---|
| RTX 3090 24GB | ~230 | 256k | 21 (21.271) | Q4_K_M, 0.20.2 |
| RTX 2080 Ti 11GB | ~144 | 64k | 10 (10.434) | Q4_K_M, 0.20.2 |
| RTX 3070 Ti Mobile 8GB | ~145 | 15k | 7.5 (7.763) | Q4_K_M, 0.20.2 |
| i7-12700H + 3070 Ti Mobile 8GB | ~19 | 256k | 22 (71%/29% CPU/GPU) | Q4_K_M, 0.20.2 |
| M1 Max 32GB | ~79 | 256k | 21 (25.123) | Q4_K_M, 0.20.2 |
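The memory figures above are consistent with a 10.7B-parameter model at Q4_K_M plus a context-dependent KV cache. A back-of-the-envelope sketch (the bits-per-weight figure is a rough assumption about Q4_K_M's average, not a measured value):

```python
# Rough estimate of the quantized-weights footprint alone.
params = 10.7e9          # parameter count from the model card
bits_per_weight = 4.8    # assumed Q4_K_M average across tensors
weights_gb = params * bits_per_weight / 8 / 1e9
print(f"weights: ~{weights_gb:.1f} GB")
```

This lands close to the 6.5GB blob size; the gap up to the 10-22 GB totals in the table is mostly the KV cache, which grows with context length, which is why the 11GB card is limited to a 64k context while the 24GB card fits 256k.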