second_constantine/gigachat3.1

55 Downloads · Updated 6 days ago

Light version (GigaChat-3.1-Lightning): it is reported to perform at GPT-4o level in arena-style comparisons while remaining compact and fast (Q4_K_M quantization).

tools 10b

CLI

ollama run second_constantine/gigachat3.1:10b

cURL

curl http://localhost:11434/api/chat \
  -d '{
    "model": "second_constantine/gigachat3.1:10b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Python

from ollama import chat

response = chat(
    model='second_constantine/gigachat3.1:10b',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

JavaScript

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'second_constantine/gigachat3.1:10b',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Applications

Claude Code: ollama launch claude --model second_constantine/gigachat3.1:10b

Codex: ollama launch codex --model second_constantine/gigachat3.1:10b

OpenCode: ollama launch opencode --model second_constantine/gigachat3.1:10b

OpenClaw: ollama launch openclaw --model second_constantine/gigachat3.1:10b

Models

1 model

Name	Size	Context	Input	Updated
gigachat3.1:10b	6.5GB	256K	Text	6 days ago

Readme

Based on https://huggingface.co/ai-sage/GigaChat3.1-10B-A1.8B-GGUF (with a DeepSeek Modelfile).

Feature	Value
vision	false
thinking	false
tools	true
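
Since the table lists tools as supported, here is a minimal sketch of how a tool call might be wired up with the `ollama` Python client. The tool `add_two_numbers`, the `TOOLS` registry, and the helpers `dispatch` and `ask_with_tools` are hypothetical names introduced for illustration. The sketch assumes a recent client version that accepts plain Python functions in the `tools` argument, and it has not been verified against this particular model.

```python
from typing import Any, Callable, Dict, List


def add_two_numbers(a: int, b: int) -> int:
    """Add two numbers.

    Args:
        a: The first number.
        b: The second number.

    Returns:
        The sum of the two numbers.
    """
    # Models sometimes pass numeric arguments as strings, so coerce them.
    return int(a) + int(b)


# Hypothetical registry mapping tool names to local callables.
TOOLS: Dict[str, Callable[..., Any]] = {"add_two_numbers": add_two_numbers}


def dispatch(name: str, arguments: Dict[str, Any]) -> Any:
    """Run the tool the model asked for, or raise if it is unknown."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)


def ask_with_tools(model: str, question: str) -> List[Any]:
    """Send one question and execute any tool calls the model returns."""
    # Imported lazily so dispatch() can be used without the client installed.
    from ollama import chat

    response = chat(
        model=model,
        messages=[{"role": "user", "content": question}],
        tools=list(TOOLS.values()),
    )
    # tool_calls is empty or None when the model answers directly.
    return [
        dispatch(call.function.name, dict(call.function.arguments))
        for call in (response.message.tool_calls or [])
    ]
```

With an Ollama server running, `ask_with_tools('second_constantine/gigachat3.1:10b', 'What is 2 plus 3?')` would return the results of whichever tools the model chose to call. The list is empty if it answered in plain text instead.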

Device	Speed, tokens/s	Context	Memory, GB	Versions
RTX 3090 24 GB	~230	256K	21 (21.271)	Q4_K_M, 0.20.2
RTX 2080 Ti 11 GB	~144	64K	10 (10.434)	Q4_K_M, 0.20.2
RTX 3070 Ti Mobile 8 GB	~145	15K	7.5 (7.763)	Q4_K_M, 0.20.2
i7-12700H + RTX 3070 Ti Mobile 8 GB	~19	256K	22 (71%/29% CPU/GPU)	Q4_K_M, 0.20.2
M1 Max 32 GB	~79	256K	21 (25.123)	Q4_K_M, 0.20.2
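
The speeds above depend on hardware and context size, so it can help to measure your own. Below is a rough sketch of one way to do that. It is not necessarily how the figures in the table were obtained. It assumes the `eval_count` and `eval_duration` fields (the latter in nanoseconds) that Ollama returns with a chat response, and it uses the `num_ctx` option to set the context window. The helper names `tokens_per_second` and `measure` are hypothetical.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation speed from a token count and a duration in nanoseconds."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    return eval_count / (eval_duration_ns / 1e9)


def measure(model: str, prompt: str, num_ctx: int) -> float:
    """Run one chat request and return the generation speed in tokens/s."""
    # Imported lazily so tokens_per_second() works without the client installed.
    from ollama import chat

    response = chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        options={"num_ctx": num_ctx},  # context window to allocate for this run
    )
    return tokens_per_second(response.eval_count, response.eval_duration)
```

For example, `measure('second_constantine/gigachat3.1:10b', 'Hello!', 65536)` against a running Ollama server returns a single tokens/s figure for a 64K context. Averaging several runs with longer outputs should give a steadier number.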