Swahili Gemma 1B

A fine-tuned Gemma 3 1B instruction model specialized for English-to-Swahili translation and Swahili conversational AI. The model accepts input in both English and Swahili but outputs responses exclusively in Swahili.

🚀 Quick Start

# Run the recommended Q4_K_M quantization
ollama run crane-ai-labs/swahili-gemma-1b:q4-k-m

# Try different quantizations based on your needs
ollama run crane-ai-labs/swahili-gemma-1b:q8-0    # Higher quality
ollama run crane-ai-labs/swahili-gemma-1b:q4-k-s  # Smaller size
ollama run crane-ai-labs/swahili-gemma-1b:f16     # Original quality

🌍 Language Capabilities

  • Input Languages: English + Swahili
  • Output Language: Swahili only
  • Primary Focus: English-to-Swahili translation and Swahili conversation

📊 Performance Metrics

Model Comparison

Model              Parameters  BLEU  chrF++  Efficiency*
Gemma 3 4B         4B          10.9  44.1     2.7
Swahili Gemma 1B   1B          27.6  56.8    27.6
Gemma 3 27B        27B         29.4  60.0     1.1
GPT-5 Mini         ~8B         31.8  62.4     4.0
Gemini 2.0 Flash   Large       35.6  64.6     N/A

*Efficiency = BLEU Score / Parameters (in billions)
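
To make the efficiency column concrete, here is a short Python sketch that recomputes it from the table above (GPT-5 Mini's ~8B size is an estimate, and Gemini 2.0 Flash is skipped because its parameter count is unpublished):

# Efficiency = BLEU score / parameters (in billions), recomputed
# from the comparison table; GPT-5 Mini's ~8B size is an estimate.
models = {
    'Gemma 3 4B':       (4.0, 10.9),
    'Swahili Gemma 1B': (1.0, 27.6),
    'Gemma 3 27B':      (27.0, 29.4),
    'GPT-5 Mini':       (8.0, 31.8),   # estimated parameter count
}

for name, (params_b, bleu) in models.items():
    print(f'{name}: {bleu / params_b:.1f} BLEU per billion parameters')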

Key Performance Insights

  • 🏆 Efficiency Leader: Achieves the highest BLEU-to-parameter ratio (27.6 BLEU per billion parameters)
  • 🚀 Size Advantage: Outperforms Gemma 3 4B (4x larger) by 153% on BLEU score
  • 💎 Competitive Quality: Reaches 94% of Gemma 3 27B's BLEU score with 27x fewer parameters
  • ⚡ Practical Deployment: Runs efficiently on consumer hardware while maintaining quality

Evaluation Details

  • Dataset: FLORES-200 English→Swahili (1,012 translation pairs)
  • Metrics: BLEU (bilingual evaluation understudy) and chrF++ (character F-score)
  • Evaluation: Zero-shot translation performance

📊 Available Quantizations

Quantization  Size    Quality    Use Case
f16           ~1.9GB  Highest    Maximum quality inference
f32           ~3.8GB  Highest    Research & benchmarking
q8-0          ~1.0GB  Very High  Production with ample resources
q5-k-m        ~812MB  High       Balanced quality/size
q4-k-m        ~769MB  Good       Recommended for most users
q4-k-s        ~745MB  Good       Resource-constrained environments
q3-k-m        ~689MB  Fair       Mobile/edge deployment
q2-k          ~658MB  Lower      Minimal resource usage

🤖 Model Details

  • Base Model: Gemma 3 1B Instruction Tuned
  • Specialization: English-to-Swahili translation and Swahili conversation
  • Context Length: 32K tokens
  • Architecture: Transformer with sliding window attention
  • Input Languages: English + Swahili
  • Output Language: Swahili only

⚙️ Generation Parameters

The model is optimized with the following parameters:

temperature: 0.3      # Focused, coherent responses
top_p: 0.95          # Nucleus sampling
top_k: 64            # Top-k sampling  
max_tokens: 128      # Response length limit
repeat_penalty: 1.1  # Reduces repetition
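
These map directly onto Ollama runtime options, so they can also be set per request. A minimal sketch with the ollama Python client (note that Ollama exposes the response-length limit as num_predict rather than max_tokens):

import ollama

# The recommended parameters above, passed as Ollama runtime options;
# 'num_predict' is Ollama's name for the max_tokens limit.
response = ollama.chat(
    model='crane-ai-labs/swahili-gemma-1b:q4-k-m',
    messages=[{'role': 'user', 'content': 'Tafsiri: The weather is nice today.'}],
    options={
        'temperature': 0.3,
        'top_p': 0.95,
        'top_k': 64,
        'num_predict': 128,
        'repeat_penalty': 1.1,
    },
)
print(response['message']['content'])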

💻 Usage Examples

Basic Translation

ollama run crane-ai-labs/swahili-gemma-1b:q4-k-m
>>> Translate this to Swahili: "Hello, how are you today?"

Swahili Conversation

ollama run crane-ai-labs/swahili-gemma-1b:q4-k-m
>>> Hujambo! Je, unaweza kunisaidia leo?

With Custom System Prompt

import ollama

response = ollama.chat(
    model='crane-ai-labs/swahili-gemma-1b:q4-k-m',
    messages=[
        {
            'role': 'system',
            'content': 'You are a helpful Swahili conversation assistant.'
        },
        {
            'role': 'user', 
            'content': 'Can you help me learn basic Swahili greetings?'
        }
    ]
)

print(response['message']['content'])

API Usage

curl http://localhost:11434/api/chat -d '{
  "model": "crane-ai-labs/swahili-gemma-1b:q4-k-m",
  "messages": [
    {
      "role": "user",
      "content": "Translate: Good morning, how did you sleep?"
    }
  ]
}'
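
By default, /api/chat streams the reply as newline-delimited JSON chunks. If you prefer a single JSON object, pass "stream": false; a minimal sketch using Python's requests library (any HTTP client works the same way):

import requests

# "stream": false makes /api/chat return one complete JSON object
# instead of a stream of newline-delimited partial responses.
resp = requests.post(
    'http://localhost:11434/api/chat',
    json={
        'model': 'crane-ai-labs/swahili-gemma-1b:q4-k-m',
        'messages': [
            {'role': 'user', 'content': 'Translate: Good morning, how did you sleep?'}
        ],
        'stream': False,
    },
)
print(resp.json()['message']['content'])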

🎯 Capabilities

  • Translation: English-to-Swahili translation
  • Conversational AI: Natural dialogue in Swahili
  • Summarization: Text summarization in Swahili
  • Writing: Creative and informational writing in Swahili
  • Question Answering: General knowledge responses in Swahili

🔧 Technical Specifications

  • Model Family: Gemma 3
  • Parameters: 1 billion
  • Precision: Multiple quantization levels available
  • Context Window: 32K tokens (best results within 4,096 tokens; see Limitations)
  • Architecture: Transformer with sliding window attention
  • Tokenizer: SentencePiece with 262K vocabulary

🏆 Performance

Swahili Gemma 1B delivers:

  • ✅ High-quality English-to-Swahili translation
  • ✅ Natural Swahili conversation
  • ✅ Effective text summarization
  • ✅ Fast inference on consumer hardware
  • ✅ Efficient memory usage

🔄 Chat Template

The model uses the official Gemma 3 chat template:

<start_of_turn>user
Your message here<end_of_turn>
<start_of_turn>model
Model response<end_of_turn>
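
Ollama applies this template automatically on every chat call, so you normally never type the turn markers yourself. For debugging, the template can be bypassed via the generate endpoint's raw mode; a minimal sketch, assuming the standard /api/generate behavior:

import requests

# Build the Gemma 3 turn structure by hand; "raw": true tells Ollama
# not to apply its own chat template on top of the prompt.
prompt = (
    '<start_of_turn>user\n'
    'Tafsiri kwa Kiswahili: Good evening<end_of_turn>\n'
    '<start_of_turn>model\n'
)

resp = requests.post(
    'http://localhost:11434/api/generate',
    json={
        'model': 'crane-ai-labs/swahili-gemma-1b:q4-k-m',
        'prompt': prompt,
        'raw': True,
        'stream': False,
    },
)
print(resp.json()['response'])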

🛠️ Installation & Setup

  1. Install Ollama (if not already installed):

    curl -fsSL https://ollama.com/install.sh | sh
    
  2. Pull the model:

    ollama pull crane-ai-labs/swahili-gemma-1b:q4-k-m
    
  3. Start chatting:

    ollama run crane-ai-labs/swahili-gemma-1b:q4-k-m
    

📚 Integration Examples

Python with ollama-python

import ollama

client = ollama.Client()

response = client.chat(
    model='crane-ai-labs/swahili-gemma-1b:q4-k-m',
    messages=[
        {'role': 'user', 'content': 'Translate to Swahili: Hello!'}
    ]
)

print(response['message']['content'])
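
For interactive applications, the same client can stream tokens as they are generated; a minimal sketch using ollama-python's stream=True mode:

import ollama

client = ollama.Client()

# stream=True returns an iterator of partial responses; printing each
# chunk's content as it arrives produces a live, token-by-token reply.
stream = client.chat(
    model='crane-ai-labs/swahili-gemma-1b:q4-k-m',
    messages=[{'role': 'user', 'content': 'Eleza kwa ufupi historia ya Kiswahili.'}],
    stream=True,
)

for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)
print()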

JavaScript with ollama-js

import { Ollama } from 'ollama'

const ollama = new Ollama({ host: 'http://127.0.0.1:11434' })

const response = await ollama.chat({
  model: 'crane-ai-labs/swahili-gemma-1b:q4-k-m',
  messages: [{ role: 'user', content: 'Translate to Swahili: Good morning!' }],
})

console.log(response.message.content)

🎨 Use Cases

  • Language Learning: English-Swahili translation practice
  • Cultural Preservation: Swahili content creation and documentation
  • Educational Tool: Learning assistant in Swahili
  • Content Localization: Translating content to Swahili
  • Conversational Practice: Swahili dialogue practice
  • Text Summarization: Summarizing content in Swahili

⚠️ Limitations

  • Language Output: Responds only in Swahili
  • Factual Knowledge: General knowledge only, not trained on specific factual datasets
  • No Coding/Math: Not designed for programming or mathematical tasks
  • Knowledge Cutoff: Training data has a knowledge cutoff date
  • Context Length: Although the architecture supports 32K tokens, quality is most reliable within 4,096 tokens
  • Specialized Domains: May require domain-specific fine-tuning for technical fields

📄 License

This model is released under the Gemma Terms of Use. Please review the terms before use.

🤝 Contributing

Found an issue or want to improve the model? We welcome:

  • Bug reports and feedback
  • Performance evaluations and benchmarks
  • Use case examples and integration guides
  • Documentation improvements

🙏 Acknowledgments

  • Google DeepMind: For the Gemma 3 base model, support, and guidance
  • Community: For Swahili language resources and datasets
  • Gilbert Korir (Msingi AI, Nairobi, Kenya)
  • Alfred Malengo Kondoro (Hanyang University, Seoul, South Korea)

Citation

If you use Swahili Gemma in your research or applications, please cite:

@misc{crane_ai_labs_2025,
    author    = {Bakunga Bronson and Kato Steven Mubiru and Lwanga Caleb and Gimei Alex and Kavuma Lameck and Roland Ganafa and Sibomana Glorry and Atuhaire Collins and JohnRoy Nangeso and Tukamushaba Catherine},
    title     = {Swahili Gemma: A Fine-tuned Gemma 3 1B Model for Swahili conversational AI},
    year      = {2025},
    url       = {https://huggingface.co/CraneAILabs/swahili-gemma-1b},
    organization = {Crane AI Labs}
}

Built with ❤️ by Crane AI Labs

Swahili Gemma - Your helpful Swahili AI companion