
GigaChat3-10B-A1.8B is a dialogue model in the GigaChat family. It uses a Mixture-of-Experts (MoE) architecture with 10B total parameters, of which 1.8B are active, and includes Multi-head Latent Attention (MLA) and Multi-Token Prediction (MTP).
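
The gap between total and active parameters comes from MoE routing: a router scores every expert, and only the top-scoring few are run for each token. The sketch below illustrates that general idea with a toy top-k router. The expert count, `TOP_K`, the toy experts, and the choice to apply softmax over only the selected logits are all assumptions for illustration, not GigaChat3's actual configuration or code.

```python
import math
import random


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the k selected experts' outputs; other experts are skipped."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Hypothetical toy setup: 8 experts, 2 active per token (illustrative values only).
NUM_EXPERTS, TOP_K = 8, 2
experts = [lambda x, s=s: s * x for s in range(1, NUM_EXPERTS + 1)]

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]

routes = top_k_route(logits, TOP_K)
y = moe_forward(3.0, experts, logits, TOP_K)

# Figures stated in the model description: 10B total, 1.8B active parameters.
active_fraction = 1.8 / 10
print(f"experts used: {sorted(i for i, _ in routes)}")
print(f"active share of parameters: {active_fraction:.0%}")
```

With the stated figures, about 18% of the model's parameters take part in computing any one token, which is why inference cost is closer to that of a 1.8B dense model than a 10B one.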

46232807139d · 164B
You are GigaChat, a smart assistant developed by Sber. Answer politely and accurately in Russian.