1,198 pulls · Updated 5 months ago

A 4B-parameter Persian-specialized language model built on Google's Gemma 3 architecture, fine-tuned on high-quality Persian text data while preserving multimodal capabilities for native-quality responses.

297e7b73b9be · 4.1GB · gemma3 · 3.88B · Q8_0

Template (truncated): {{- range $i, $_ := .Messages }} {{- $last := eq (len (slice $.Messages $i)) 1 }} {{- if or (eq .Rol

System prompt (truncated): شما یک دستیار هوش مصنوعی پیشرفته، متخصص در زبان فارسی و ("You are an advanced AI assistant, an expert in the Persian language and")

Parameters: { "stop": [ "<end_of_turn>" ], "temperature": 0.6 }


🌟 Gemma 3 Persian (v0)


gemma-3-persian is a Persian-specialized language model built on Google’s Gemma 3 architecture. This model has been fine-tuned on high-quality Persian text data to provide native-quality responses for Persian speakers while maintaining the multimodal capabilities of the base model.

The model uses QLoRA with 4-bit quantization to optimize for performance on consumer hardware while preserving the quality of responses in Persian.

🇮🇷 این مدل برای زبان فارسی بهینه‌سازی شده و می‌تواند به سوالات شما به صورت طبیعی پاسخ دهد.
(This model is optimized for the Persian language and can answer your questions naturally.)

⚡ Quick Start

Installation

First, ensure Ollama is installed on your system:

Linux/macOS:

curl -fsSL https://ollama.ai/install.sh | sh

Windows: Download from the official website

Pull the Model

ollama pull mshojaei77/gemma3persian

Run the Model

ollama run mshojaei77/gemma3persian

💬 Example Usage

Basic Conversation

> سلام، می‌توانی درباره تاریخ ایران به من اطلاعاتی بدهی؟
(Hello, can you give me information about the history of Iran?)

> می‌توانی یک شعر کوتاه برای من بنویسی؟
(Can you write a short poem for me?)

> این تصویر را توصیف کن: [IMAGE]
(Describe this image:)

Advanced Parameters

The ollama run command does not accept sampling flags on the command line; set parameters inside the interactive session instead:

ollama run mshojaei77/gemma3persian
>>> /set parameter temperature 0.7
>>> /set parameter top_p 0.9
>>> /set parameter num_ctx 8192
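The temperature, top_p, and context-length settings above can also be supplied programmatically through the `options` field of the Ollama Python client. A minimal sketch (it assumes a local Ollama server with the model already pulled; the helper name `ask` is ours, not part of the library):

```python
# Generation options matching the values shown above
options = {
    'temperature': 0.7,  # sampling temperature
    'top_p': 0.9,        # nucleus-sampling cutoff
    'num_ctx': 8192,     # context window in tokens
}

def ask(prompt: str) -> str:
    """Send one prompt with custom options (needs a running Ollama server)."""
    from ollama import chat  # imported here so the snippet loads without the server

    response = chat(
        model='mshojaei77/gemma3persian',
        messages=[{'role': 'user', 'content': prompt}],
        options=options,
    )
    return response['message']['content']
```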

🖥️ Programmatic Usage

Python with Ollama Library

Integrate the model directly in your Python applications:

from ollama import chat

# Initialize chat with the Persian Gemma model
response = chat(model='mshojaei77/gemma3persian', messages=[
  {
    'role': 'user',
    'content': 'سلام، می‌توانی خودت را معرفی کنی؟', # "Hello, can you introduce yourself?"
  },
])

# Print the model's response
print(response['message']['content'])
# Or access fields directly from the response object
print(response.message.content)
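The client also supports token-by-token streaming, which is useful for interactive UIs. A sketch under the same assumptions (running server, model pulled; the helper name `stream_reply` is ours): with `stream=True`, `chat` returns an iterator of partial messages instead of a single response.

```python
def stream_reply(prompt: str) -> str:
    """Print the reply as it is generated and return the full text.

    Requires the `ollama` package and a running Ollama server.
    """
    from ollama import chat  # imported here so the snippet loads without the server

    pieces = []
    for chunk in chat(
        model='mshojaei77/gemma3persian',
        messages=[{'role': 'user', 'content': prompt}],
        stream=True,  # yield partial messages as tokens arrive
    ):
        piece = chunk['message']['content']
        print(piece, end='', flush=True)
        pieces.append(piece)
    print()
    return ''.join(pieces)
```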

JavaScript/Node.js with Ollama

Use the model in your JavaScript or TypeScript applications:

import ollama from 'ollama'

async function chatWithGemmaPersian() {
  const response = await ollama.chat({
    model: 'mshojaei77/gemma3persian',
    messages: [{ 
      role: 'user', 
      content: 'لطفاً یک داستان کوتاه بنویس.' // "Please write a short story."
    }],
  })
  console.log(response.message.content)
}

chatWithGemmaPersian()

Python with LangChain Integration

Create more complex applications using LangChain's conversational memory (`ConversationChain` is deprecated in recent LangChain releases but still works for simple chains):

from langchain_ollama import ChatOllama
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# Set up the Gemma Persian model
llm = ChatOllama(
    model="mshojaei77/gemma3persian",
    temperature=0.7,
    num_predict=256
)

# Add a memory component for contextual conversations
memory = ConversationBufferMemory(return_messages=True)

# Create a conversation chain with memory
conversation = ConversationChain(llm=llm, memory=memory)

# Start chatting with memory in Persian
print(conversation.run(input="سلام، حالت چطور است؟"))  # "Hello, how are you?"
print(conversation.run(input="من درباره تاریخ ایران کنجکاو هستم."))  # "I'm curious about the history of Iran."
print(conversation.run(input="می‌توانی آخرین سوال من را یادآوری کنی؟"))  # "Can you remind me of my last question?"

🔍 Capabilities

Feature | Support | Notes
🇮🇷 Persian text generation | ✅ Excellent | Optimized for natural Persian language
🖼️ Image understanding | ✅ Supported | Inherited from the base Gemma 3 model
🎯 Instruction following | ✅ Strong | Fine-tuned on instruction datasets
💭 Creative writing | ✅ Good | Poetry, stories, and creative content
🧠 Knowledge retrieval | ✅ Basic | Limited to training data
💻 Code generation | ⚠️ Limited | Better in English than in Persian

🔧 Technical Details

  • Base Model: Google Gemma 3-4B
  • Training Dataset: mshojaei77/Persian_sft (681,000+ Persian texts)
  • Fine-Tuning: QLoRA with 4-bit quantization
  • Hardware Used: T4 GPU
  • Context Length: 8,192 tokens
  • Libraries: Hugging Face Transformers, PEFT, bitsandbytes
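The training setup listed above can be sketched with those libraries. This is illustrative only: the quantization settings follow standard QLoRA practice, while the LoRA rank and target modules are our assumptions, not published values.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization (QLoRA); fp16 compute suits the T4 GPU used for training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter config; rank and target modules here are illustrative assumptions
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```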

📊 Hardware Requirements

Hardware | Minimum | Recommended
RAM | 8GB | 16GB+
GPU VRAM | 4GB | 8GB+
Disk | 4GB free | 10GB+ free
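The disk figure can be sanity-checked with a back-of-the-envelope estimate: Q8_0 stores each weight as 8 bits plus a small per-block scale (about 1.06 bytes per weight), so ~3.88B parameters land near the published 4.1GB file size.

```python
# Rough file-size estimate for a 3.88B-parameter model quantized to Q8_0
params = 3.88e9            # reported parameter count
bytes_per_weight = 1.0625  # Q8_0: 32 int8 weights + one fp16 scale per block
size_gb = params * bytes_per_weight / 1e9
print(f"~{size_gb:.1f} GB")  # close to the published 4.1GB model file
```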

⚠️ Limitations

  • 4-bit quantization may occasionally reduce precision in complex reasoning
  • Limited by the training data available in Persian
  • May generate plausible but incorrect information
  • Not extensively safety-tuned for all scenarios
  • Knowledge cutoff from the base model training

🌐 Community & Support

📜 License

This model is subject to the Gemma license from Google.