1,659 downloads · Updated 11 months ago
24 models
bggpt:latest
5.5GB · 8K context window · Text · 11 months ago
bggpt:v0.1
4.4GB · 32K context window · Text · 1 year ago
bggpt:v0.2
4.4GB · 32K context window · Text · 1 year ago
bggpt:v1.0 (latest)
5.5GB · 8K context window · Text · 11 months ago
bggpt:2B-IT-v1.0.F16
5.2GB · 8K context window · Text · 11 months ago
bggpt:2B-IT-v1.0.Q4_K_M
1.7GB · 8K context window · Text · 11 months ago
bggpt:2B-IT-v1.0.Q4_K_S
1.6GB · 8K context window · Text · 11 months ago
bggpt:2B-IT-v1.0.Q5_K_M
1.9GB · 8K context window · Text · 11 months ago
bggpt:2B-IT-v1.0.Q5_K_S
1.9GB · 8K context window · Text · 11 months ago
bggpt:2B-IT-v1.0.Q6_K
2.2GB · 8K context window · Text · 11 months ago
bggpt:2B-IT-v1.0.Q8_0
2.8GB · 8K context window · Text · 11 months ago
bggpt:9B-IT-v1.0.F16
18GB · 8K context window · Text · 11 months ago
bggpt:9B-IT-v1.0.Q4_K_M
5.8GB · 8K context window · Text · 11 months ago
bggpt:9B-IT-v1.0.Q4_K_S
5.5GB · 8K context window · Text · 11 months ago
bggpt:9B-IT-v1.0.Q5_K_M
6.6GB · 8K context window · Text · 11 months ago
bggpt:9B-IT-v1.0.Q5_K_S
6.5GB · 8K context window · Text · 11 months ago
bggpt:9B-IT-v1.0.Q6_K
7.6GB · 8K context window · Text · 11 months ago
bggpt:9B-IT-v1.0.Q8_0
9.8GB · 8K context window · Text · 11 months ago
bggpt:27B-IT-v1.0.Q4_K_M
17GB · 8K context window · Text · 11 months ago
bggpt:27B-IT-v1.0.Q4_K_S
16GB · 8K context window · Text · 11 months ago
bggpt:27B-IT-v1.0.Q5_K_M
19GB · 8K context window · Text · 11 months ago
bggpt:27B-IT-v1.0.Q5_K_S
19GB · 8K context window · Text · 11 months ago
bggpt:27B-IT-v1.0.Q6_K
22GB · 8K context window · Text · 11 months ago
bggpt:27B-IT-v1.0.Q8_0
29GB · 8K context window · Text · 11 months ago
Meet BgGPT, a Bulgarian language model built on top of Google’s Gemma 2. BgGPT is distributed under the Gemma Terms of Use.
Versions 0.1 and 0.2 of the model were built on top of Mistral v0.1 and v0.2.
This model was created by the INSAIT Institute, part of Sofia University, in Sofia, Bulgaria.
The model was built on top of Google’s Gemma 2 2B, 9B and 27B open models. It was continuously pre-trained on around 100 billion tokens (85 billion in Bulgarian) using the Branch-and-Merge strategy INSAIT presented at EMNLP’24, allowing the model to gain outstanding Bulgarian cultural and linguistic capabilities while retaining its English performance.

During the pre-training stage, we used various datasets, including Bulgarian web-crawl data, freely available datasets such as Wikipedia, a range of specialized Bulgarian datasets sourced by the INSAIT Institute, and machine translations of popular English datasets. The model was then instruction-fine-tuned on a newly constructed Bulgarian instruction dataset created using real-world conversations. For more information, check our blog post.
ollama run todorov/bggpt
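A specific parameter size and quantization can be selected by appending one of the tags listed above, for example the 9B model at Q4_K_M quantization:

ollama run todorov/bggpt:9B-IT-v1.0.Q4_K_M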
Example (the prompt asks “When was Sofia University founded?”):
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "todorov/bggpt",
  "prompt": "Кога е основан Софийският университет?"
}'
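The Ollama server also exposes a chat-style route, /api/chat, which takes a list of role-tagged messages instead of a raw prompt. A minimal sketch with the same question, with streaming disabled so the reply arrives as a single JSON object:

curl -X POST http://localhost:11434/api/chat -d '{
  "model": "todorov/bggpt",
  "messages": [
    {"role": "user", "content": "Кога е основан Софийският университет?"}
  ],
  "stream": false
}'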
BgGPT
BgGPT-Gemma-2-2.6B-IT-v1.0 on Hugging Face
BgGPT-Gemma-2-9B-IT-v1.0 on Hugging Face
BgGPT-Gemma-2-27B-IT-v1.0 on Hugging Face