Swahili Gemma 1B

A fine-tuned Gemma 3 1B instruction model specialized for English-to-Swahili translation and Swahili conversational AI. The model accepts input in both English and Swahili but outputs responses exclusively in Swahili.

🚀 Quick Start

# Run the recommended Q4_K_M quantization
ollama run crane-ai-labs/swahili-gemma-1b:q4-k-m

# Try different quantizations based on your needs
ollama run crane-ai-labs/swahili-gemma-1b:q8-0    # Higher quality
ollama run crane-ai-labs/swahili-gemma-1b:q4-k-s  # Smaller size
ollama run crane-ai-labs/swahili-gemma-1b:f16     # Original quality

🌍 Language Capabilities

  • Input Languages: English + Swahili
  • Output Language: Swahili only
  • Primary Focus: English-to-Swahili translation and Swahili conversation

📊 Performance Metrics

Model Comparison

Model              Parameters  BLEU  chrF++  Efficiency*
Gemma 3 4B         4B          10.9  44.1     2.7
Swahili Gemma 1B   1B          27.6  56.8    27.6
Gemma 3 27B        27B         29.4  60.0     1.1
GPT-5 Mini         ~8B         31.8  62.4     4.0
Gemini 2.0 Flash   Large       35.6  64.6     N/A

*Efficiency = BLEU Score / Parameters (in billions)
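
To make the efficiency column concrete, here is a short Python sketch that recomputes it from the table above (GPT-5 Mini's ~8B size is an estimate, and Gemini 2.0 Flash is skipped because its parameter count is unpublished):

# Efficiency = BLEU score / parameters (in billions), recomputed
# from the comparison table; GPT-5 Mini's ~8B size is an estimate.
models = {
    'Gemma 3 4B':       (4.0, 10.9),
    'Swahili Gemma 1B': (1.0, 27.6),
    'Gemma 3 27B':      (27.0, 29.4),
    'GPT-5 Mini':       (8.0, 31.8),   # estimated parameter count
}

for name, (params_b, bleu) in models.items():
    print(f'{name}: {bleu / params_b:.1f} BLEU per billion parameters')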

Key Performance Insights

  • 🏆 Efficiency Leader: Achieves the highest BLEU-to-parameter ratio (27.6 BLEU per billion parameters)
  • 🚀 Size Advantage: Outperforms Gemma 3 4B (4x larger) by 153% on BLEU score
  • 💎 Competitive Quality: Reaches 94% of Gemma 3 27B's BLEU score with 27x fewer parameters
  • ⚡ Practical Deployment: Runs efficiently on consumer hardware while maintaining quality

Evaluation Details

  • Dataset: FLORES-200 English→Swahili (1,012 translation pairs)
  • Metrics: BLEU (bilingual evaluation understudy) and chrF++ (character F-score)
  • Evaluation: Zero-shot translation performance

📊 Available Quantizations

Quantization  Size    Quality    Use Case
f16           ~1.9GB  Highest    Maximum quality inference
f32           ~3.8GB  Highest    Research & benchmarking
q8-0          ~1.0GB  Very High  Production with ample resources
q5-k-m        ~812MB  High       Balanced quality/size
q4-k-m        ~769MB  Good       Recommended for most users
q4-k-s        ~745MB  Good       Resource-constrained environments
q3-k-m        ~689MB  Fair       Mobile/edge deployment
q2-k          ~658MB  Lower      Minimal resource usage

🤖 Model Details

  • Base Model: Gemma 3 1B Instruction Tuned
  • Specialization: English-to-Swahili translation and Swahili conversation
  • Context Length: 32K tokens
  • Architecture: Transformer with sliding window attention
  • Input Languages: English + Swahili
  • Output Language: Swahili only

⚙️ Generation Parameters

The model is optimized with the following parameters:

temperature: 0.3      # Focused, coherent responses
top_p: 0.95          # Nucleus sampling
top_k: 64            # Top-k sampling  
max_tokens: 128      # Response length limit
repeat_penalty: 1.1  # Reduces repetition
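
These map directly onto Ollama runtime options, so they can also be set per request. A minimal sketch with the ollama Python client (note that Ollama exposes the response-length limit as num_predict rather than max_tokens):

import ollama

# The recommended parameters above, passed as Ollama runtime options;
# 'num_predict' is Ollama's name for the max_tokens limit.
response = ollama.chat(
    model='crane-ai-labs/swahili-gemma-1b:q4-k-m',
    messages=[{'role': 'user', 'content': 'Tafsiri: The weather is nice today.'}],
    options={
        'temperature': 0.3,
        'top_p': 0.95,
        'top_k': 64,
        'num_predict': 128,
        'repeat_penalty': 1.1,
    },
)
print(response['message']['content'])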

💻 Usage Examples

Basic Translation

ollama run crane-ai-labs/swahili-gemma-1b:q4-k-m
>>> Translate this to Swahili: "Hello, how are you today?"

Swahili Conversation

ollama run crane-ai-labs/swahili-gemma-1b:q4-k-m
>>> Hujambo! Je, unaweza kunisaidia leo?

With Custom System Prompt

import ollama

response = ollama.chat(
    model='crane-ai-labs/swahili-gemma-1b:q4-k-m',
    messages=[
        {
            'role': 'system',
            'content': 'You are a helpful Swahili conversation assistant.'
        },
        {
            'role': 'user', 
            'content': 'Can you help me learn basic Swahili greetings?'
        }
    ]
)

print(response['message']['content'])

API Usage

curl http://localhost:11434/api/chat -d '{
  "model": "crane-ai-labs/swahili-gemma-1b:q4-k-m",
  "messages": [
    {
      "role": "user",
      "content": "Translate: Good morning, how did you sleep?"
    }
  ]
}'
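
By default, /api/chat streams the reply as newline-delimited JSON chunks. If you prefer a single JSON object, pass "stream": false; a minimal sketch using Python's requests library (any HTTP client works the same way):

import requests

# "stream": false makes /api/chat return one complete JSON object
# instead of a stream of newline-delimited partial responses.
resp = requests.post(
    'http://localhost:11434/api/chat',
    json={
        'model': 'crane-ai-labs/swahili-gemma-1b:q4-k-m',
        'messages': [
            {'role': 'user', 'content': 'Translate: Good morning, how did you sleep?'}
        ],
        'stream': False,
    },
)
print(resp.json()['message']['content'])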

🎯 Capabilities

  • Translation: English-to-Swahili translation
  • Conversational AI: Natural dialogue in Swahili
  • Summarization: Text summarization in Swahili
  • Writing: Creative and informational writing in Swahili
  • Question Answering: General knowledge responses in Swahili

🔧 Technical Specifications

  • Model Family: Gemma 3
  • Parameters: 1 billion
  • Precision: Multiple quantization levels available
  • Context Window: 32K tokens (best results within 4,096 tokens; see Limitations)
  • Architecture: Transformer with sliding window attention
  • Tokenizer: SentencePiece with 262K vocabulary

🏆 Performance

Swahili Gemma 1B delivers:

  • ✅ High-quality English-to-Swahili translation
  • ✅ Natural Swahili conversation
  • ✅ Effective text summarization
  • ✅ Fast inference on consumer hardware
  • ✅ Efficient memory usage

🔄 Chat Template

The model uses the official Gemma 3 chat template:

<start_of_turn>user
Your message here<end_of_turn>
<start_of_turn>model
Model response<end_of_turn>
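
Ollama applies this template automatically on every chat call, so you normally never type the turn markers yourself. For debugging, the template can be bypassed via the generate endpoint's raw mode; a minimal sketch, assuming the standard /api/generate behavior:

import requests

# Build the Gemma 3 turn structure by hand; "raw": true tells Ollama
# not to apply its own chat template on top of the prompt.
prompt = (
    '<start_of_turn>user\n'
    'Tafsiri kwa Kiswahili: Good evening<end_of_turn>\n'
    '<start_of_turn>model\n'
)

resp = requests.post(
    'http://localhost:11434/api/generate',
    json={
        'model': 'crane-ai-labs/swahili-gemma-1b:q4-k-m',
        'prompt': prompt,
        'raw': True,
        'stream': False,
    },
)
print(resp.json()['response'])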

🛠️ Installation & Setup

  1. Install Ollama (if not already installed):

    curl -fsSL https://ollama.com/install.sh | sh
    
  2. Pull the model:

    ollama pull crane-ai-labs/swahili-gemma-1b:q4-k-m
    
  3. Start chatting:

    ollama run crane-ai-labs/swahili-gemma-1b:q4-k-m
    

📚 Integration Examples

Python with ollama-python

import ollama

client = ollama.Client()

response = client.chat(
    model='crane-ai-labs/swahili-gemma-1b:q4-k-m',
    messages=[
        {'role': 'user', 'content': 'Translate to Swahili: Hello!'}
    ]
)

print(response['message']['content'])
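
For interactive applications, the same client can stream tokens as they are generated; a minimal sketch using ollama-python's stream=True mode:

import ollama

client = ollama.Client()

# stream=True returns an iterator of partial responses; printing each
# chunk's content as it arrives produces a live, token-by-token reply.
stream = client.chat(
    model='crane-ai-labs/swahili-gemma-1b:q4-k-m',
    messages=[{'role': 'user', 'content': 'Eleza kwa ufupi historia ya Kiswahili.'}],
    stream=True,
)

for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)
print()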

JavaScript with ollama-js

import { Ollama } from 'ollama'

const ollama = new Ollama({ host: 'http://127.0.0.1:11434' })

const response = await ollama.chat({
  model: 'crane-ai-labs/swahili-gemma-1b:q4-k-m',
  messages: [{ role: 'user', content: 'Translate to Swahili: Good morning!' }],
})

console.log(response.message.content)

🎨 Use Cases

  • Language Learning: English-Swahili translation practice
  • Cultural Preservation: Swahili content creation and documentation
  • Educational Tool: Learning assistant in Swahili
  • Content Localization: Translating content to Swahili
  • Conversational Practice: Swahili dialogue practice
  • Text Summarization: Summarizing content in Swahili

⚠️ Limitations

  • Language Output: Responds only in Swahili
  • Factual Knowledge: General knowledge only, not trained on specific factual datasets
  • No Coding/Math: Not designed for programming or mathematical tasks
  • Knowledge Cutoff: Training data has a knowledge cutoff date
  • Context Length: Although the architecture supports 32K tokens, quality is most reliable within 4,096 tokens
  • Specialized Domains: May require domain-specific fine-tuning for technical fields

📄 License

This model is released under the Gemma Terms of Use. Please review the terms before use.

🤝 Contributing

Found an issue or want to improve the model? We welcome:

  • Bug reports and feedback
  • Performance evaluations and benchmarks
  • Use case examples and integration guides
  • Documentation improvements

🙏 Acknowledgments

  • Google DeepMind: For the Gemma 3 base model, support, and guidance
  • Community: For Swahili language resources and datasets
  • Gilbert Korir (Msingi AI, Nairobi, Kenya)
  • Alfred Malengo Kondoro (Hanyang University, Seoul, South Korea)

Citation

If you use Swahili Gemma in your research or applications, please cite:

@misc{crane_ai_labs_2025,
    author    = {Bakunga Bronson and Kato Steven Mubiru and Lwanga Caleb and Gimei Alex and Kavuma Lameck and Roland Ganafa and Sibomana Glorry and Atuhaire Collins and JohnRoy Nangeso and Tukamushaba Catherine},
    title     = {Swahili Gemma: A Fine-tuned Gemma 3 1B Model for Swahili conversational AI},
    year      = {2025},
    url       = {https://huggingface.co/CraneAILabs/swahili-gemma-1b},
    organization = {Crane AI Labs}
}

Built with ❤️ by Crane AI Labs

Swahili Gemma - Your helpful Swahili AI companion