
BgGPT-v1.0 is a Bulgarian language model based on Google's Gemma 2 architecture. It is free to use and distributed under the Gemma Terms of Use. The model was developed by INSAIT, part of Sofia University St. Kliment Ohridski, in Sofia, Bulgaria.
It was built on top of Google's Gemma 2 open models through continued pre-training on approximately 100 billion tokens (85 billion in Bulgarian) using the Branch-and-Merge strategy. This training allows the model to develop Bulgarian cultural and linguistic capabilities while maintaining its English performance.
The pre-training used a variety of datasets, including Bulgarian web crawl data, Wikipedia, specialized Bulgarian datasets, and machine translations of popular English datasets. The model was then instruction-fine-tuned on a Bulgarian instruction dataset created from real-world conversations.


The model has been evaluated on standard English benchmarks, their Bulgarian translations, and Bulgarian-specific benchmarks.
Performance comparisons show that the model is competitive with other small open language models while retaining the English performance of the original Gemma 2 base models.
Multiple model sizes and quantizations are available:
| Model | Size | Context | Quantization |
|---|---|---|---|
| BgGPT-v1.0:2.6b | 1.7GB | 8K | Q4_K_M |
| BgGPT-v1.0:2.6b-q8 | 2.8GB | 8K | Q8_0 |
| BgGPT-v1.0:9b | 5.8GB | 8K | Q4_K_M |
| BgGPT-v1.0:9b-q8 | 9.8GB | 8K | Q8_0 |
| BgGPT-v1.0:27b | 17GB | 8K | Q4_K_M |
| BgGPT-v1.0:27b-q8 | 29GB | 8K | Q8_0 |
To use this model with Ollama, you can pull it using:
# 2.6B model
ollama pull s_emanuilov/BgGPT-v1.0:2.6b
# 9B model
ollama pull s_emanuilov/BgGPT-v1.0:9b
# 27B model
ollama pull s_emanuilov/BgGPT-v1.0:27b
# Q8 quantized versions (higher quality)
ollama pull s_emanuilov/BgGPT-v1.0:2.6b-q8
ollama pull s_emanuilov/BgGPT-v1.0:9b-q8
ollama pull s_emanuilov/BgGPT-v1.0:27b-q8
Then run it with:
ollama run s_emanuilov/BgGPT-v1.0:2.6b
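You can also call the model programmatically. The sketch below uses the `ollama` Python client (installed with `pip install ollama`) and assumes a local Ollama server with one of the tags above already pulled; the example prompt asks, in Bulgarian, when Sofia University was founded.

```python
# Minimal sketch: query BgGPT through a local Ollama server
# using the `ollama` Python client (pip install ollama).
import ollama

response = ollama.chat(
    model="s_emanuilov/BgGPT-v1.0:2.6b",  # any tag from the table above
    messages=[
        # "When was Sofia University founded?"
        {"role": "user", "content": "Кога е основан Софийският университет?"},
    ],
)

print(response["message"]["content"])
```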
To leverage the instruction fine-tuning, your prompt should begin with the beginning-of-sequence token `<bos>` and follow the Gemma 2 chat template. `<bos>` should appear only as the first token in a chat sequence.
For example:
<bos><start_of_turn>user
Кога е основан Софийският университет?<end_of_turn>
<start_of_turn>model
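Ollama normally applies the model's built-in prompt template when you use `ollama run` or the chat API, so the raw format above mainly matters if you construct the prompt yourself. A minimal sketch of that case, using the Python client's `generate()` call with `raw=True` (which tells Ollama not to apply its own template):

```python
# Minimal sketch: send a pre-formatted Gemma 2 chat prompt directly,
# bypassing Ollama's own template with raw=True.
import ollama

prompt = (
    "<bos><start_of_turn>user\n"
    "Кога е основан Софийският университет?<end_of_turn>\n"
    "<start_of_turn>model\n"
)

response = ollama.generate(
    model="s_emanuilov/BgGPT-v1.0:2.6b",
    prompt=prompt,
    raw=True,  # do not wrap the prompt in the model's template
)

print(response["response"])
```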
For optimal performance, we recommend the following text-generation parameters, which we have tested extensively with the model: