55 6 days ago

Light version - GigaChat-3.1-Lightning: it shows the level of GPT-4o in arenas, but at the same time remains compact and fast (quantized Q4_K_M)

tools 10b
ollama run second_constantine/gigachat3.1:10b

Applications

Claude Code
Claude Code ollama launch claude --model second_constantine/gigachat3.1:10b
Codex
Codex ollama launch codex --model second_constantine/gigachat3.1:10b
OpenCode
OpenCode ollama launch opencode --model second_constantine/gigachat3.1:10b
OpenClaw
OpenClaw ollama launch openclaw --model second_constantine/gigachat3.1:10b

Models

View all →

Readme

Based on https://huggingface.co/ai-sage/GigaChat3.1-10B-A1.8B-GGUF (with deepseek Modelfile)

Feature Value
vision false
thinking false
tools true
Device Speed, token/s Context Memory, gb Versions
RTX 3090 24gb ~230 256k 21 (21.271) Q4_K_M, 0.20.2
RTX 2080ti 11gb ~144 64k 10 (10.434) Q4_K_M, 0.20.2
RTX 3070ti Mobile 8gb ~145 15k 7.5 (7.763) Q4_K_M, 0.20.2
i7-12700H + 3070ti Mobile 8gb ~19 256k 22 (71%/29% CPU/GPU) Q4_K_M, 0.20.2
M1 Max 32gb ~79 256k 21 (25.123) Q4_K_M, 0.20.2