104 downloads · updated 1 month ago

GigaChat3-10B-A1.8B is a dialogue model of the GigaChat family. It is based on a Mixture-of-Experts (MoE) architecture with 10B total parameters, of which 1.8B are active per token. The architecture also includes Multi-head Latent Attention (MLA) and Multi-Token Prediction (MTP).
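As a rough illustration of why the active parameter count is much smaller than the total, here is a minimal top-k MoE routing sketch in Python (toy sizes, plain numpy; not GigaChat3's actual implementation): for each token the router picks a few experts, and only those experts' weights are used.

import numpy as np

# Toy top-k MoE routing sketch (illustrative only; sizes are made up).
rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 16, 2
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    logits = x @ router                      # router score for each expert
    chosen = np.argsort(logits)[-top_k:]     # indices of the top_k experts
    scores = np.exp(logits[chosen])
    weights = scores / scores.sum()          # softmax over the chosen experts
    # Only the chosen experts are evaluated; the remaining experts' weights
    # stay idle for this token, which is the total-vs-active distinction.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

out = moe_forward(rng.standard_normal(d_model))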

ollama run Bored/gigachat3-10B-A1.8
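Once the model has been pulled, it can also be queried programmatically through Ollama's local REST API (default port 11434). A minimal sketch, assuming `ollama serve` is running and using only the Python standard library; the prompt text is just an example:

import json
import urllib.request

payload = {
    "model": "Bored/gigachat3-10B-A1.8",
    "messages": [{"role": "user", "content": "Hi! What can you do?"}],
    "stream": False,  # request a single JSON response instead of a stream
}
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.loads(resp.read())
print(reply["message"]["content"])  # the assistant's reply text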

Details

5415d36a3076 · 6.1GB · 1 month ago

Architecture: deepseek2 · Parameters: 10.7B · Quantization: Q4_K_S
System prompt: You are GigaChat, a smart assistant developed by Sber. …

Parameters: { "num_ctx": 8192, "stop": [ "<|start_header_id|>", "<|end_header_id|>", …

Template: {{ if .System }}<|start_header_id|>system<|end_header_id|> {{ .System }}<|eot_id|>{{ end }}{{ if .Pr…

Readme

No readme