gemma-2-2b-jpn-it-translate is a model tuned for translation tasks based on google/gemma-2-2b-jpn-it released by Google.

Although it has only 2 billion (2B) parameters, in some fields its translation quality approaches that of 7-billion-parameter (7B) models from a year ago. At around 5 GB, the file is relatively small, which allows fast execution.

Links

Original:

GGUF:

Examples

After being given an initial, system-prompt-like instruction (Japanese/English), the model is trained to output translated text (Japanese/English) for each subsequent user input. In the session below it acknowledges the instruction with "OK" before translating.

$ ollama run 7shi/gemma-2-jpn-translate:2b-instruct-q8_0
>>> Translate Japanese to English.
OK

>>> 吾輩は猫である。
The narrator is a cat.

>>> 名前はまだ無い。
I have no name yet.

>>> /clear
Cleared session context
>>> Translate English to Japanese.
OK

>>> The narrator is a cat.
語り手は猫です。

>>> I have no name yet.
まだ名前がありません。
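
The same exchange can be expressed as a chat message list. The following is a minimal sketch rather than part of the original README: the helper names build_instruct and build_messages are hypothetical, and the optional hints argument mirrors the "[key: value]" hint format used by the sample scripts in this README. Only writing_style with the values casual and business appears here, so any other key is an assumption.

```python
def build_instruct(src, dst, hints=None):
    """Return the first user message, e.g. 'Translate Japanese to English.'"""
    text = f"Translate {src} to {dst}."
    if hints:
        # One hint per line, written as [key: value]
        hint_lines = "\n".join(f"[{key}: {value}]" for key, value in hints.items())
        text += "\nWhen translating, please use the following hints:\n" + hint_lines
    return text


def build_messages(instruct, text):
    """Instruction turn, the "OK" acknowledgement, then the text to translate."""
    return [
        {"role": "user", "content": instruct},
        {"role": "assistant", "content": "OK"},
        {"role": "user", "content": text},
    ]


messages = build_messages(build_instruct("Japanese", "English"), "吾輩は猫である。")
for m in messages:
    print(m["role"], "|", m["content"])
```

The resulting list can be passed to ollama.chat(model=..., messages=messages), which is how the sample scripts in this README call the model.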

Japanese-English Translation sample script

import re
import ollama

model_name = "7shi/gemma-2-jpn-translate:2b-instruct-q8_0"

system_prompt = "You are a highly skilled professional Japanese-English and English-Japanese translator. Translate the given text accurately, taking into account the context and specific instructions provided. Steps may include hints enclosed in square brackets [] with the key and value separated by a colon:. Only when the subject is specified in the Japanese sentence, the subject will be added when translating into English. If no additional instructions or context are provided, use your expertise to consider what the most appropriate context is and provide a natural translation that aligns with that context. When translating, strive to faithfully reflect the meaning and tone of the original text, pay attention to cultural nuances and differences in language usage, and ensure that the translation is grammatically correct and easy to read. After completing the translation, review it once more to check for errors or unnatural expressions. For technical terms and proper nouns, either leave them in the original language or use appropriate translations as necessary. Take a deep breath, calm down, and start translating."
instruct = """Translate Japanese to English.\nWhen translating, please use the following hints:\n[writing_style: casual]"""

# Function that splits text into sentences
def split_sentences(text):
    sentences = []
    last = 0
    # Split at sentence-ending punctuation (ASCII and full-width forms)
    for match in re.finditer(r'[。!?！？…]', text):
        end = match.end()
        # Include any newlines that directly follow the punctuation
        while end < len(text) and text[end] == '\n':
            end += 1
        sentence = text[last:end]
        sentences.append(sentence)
        last = end
    # Append any remaining text
    if last < len(text):
        remaining = text[last:]
        sentences.append(remaining)
    # Split on newlines inside each sentence
    final_sentences = []
    for s in sentences:
        if '\n' in s:
            parts = s.split('\n')
            for i, part in enumerate(parts):
                if part:
                    # Re-attach the newline unless this is the last part
                    if i < len(parts) - 1:
                        final_sentences.append(part + '\n')
                    else:
                        final_sentences.append(part)
                # Keep the newline itself as a separate element
                if i < len(parts) - 1:
                    final_sentences.append('\n')
        else:
            final_sentences.append(s)
    return final_sentences

# Function to translate a sentence
def translate_sentence(sentence, previous_context):
    # Whitespace-only input (such as a bare newline) is returned unchanged
    if sentence.strip() == '':
        return sentence

    # Combine the previous context and the new sentence into one message list
    messages = previous_context + [
        {"role": "user", "content": sentence}
    ]

    response = ollama.chat(model=model_name, messages=messages)
    return response["message"]["content"]

from collections import deque

# Main processing function
def main(text):
    sentences = split_sentences(text)
    translated_sentences = []

    # Initialize context with system prompt
    instructs = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": instruct},
        {"role": "assistant", "content": "OK"}
    ]
    context = deque(maxlen=4)  # Maximum 4 elements (2 user, 2 assistant)

    for sentence in sentences:
        translation_context = instructs + list(context)
        translated_sentence = translate_sentence(sentence, translation_context)
        translated_sentences.append(translated_sentence)

        # Add new interactions to the context
        if sentence.strip() != '':
            context.append({"role": "user", "content": sentence})
            context.append({"role": "assistant", "content": translated_sentence})

    return translated_sentences

text = """こんにちは。私は田中です。今日はとても良い天気ですね。朝ごはんはパンとコーヒーを食べました。そのあとに散歩に行きました。公園にはたくさんの人がいました。子供たちは遊んでいました。
犬を連れている人もいました。私はベンチに座って本を読みました。風がとても気持ちよかったです。その後、友達とカフェに行きました。
カフェではコーヒーを飲みながらおしゃべりをしました。友達は最近引っ越したばかりだと言いました。新しい家の写真を見せてくれました。
とてもきれいな家でした。時間が経つのがあっという間でした。夕方になり、私は家に帰りました。夕食にはカレーを作りました。カレーはとても美味しかったです。今日一日、とても楽しかったです。"""

translated = main(text)
print(translated)

Result

['Hello.', 'My name is Tanaka.', "It's a nice day today, isn't it?", 'I had breakfast with bread and coffee.', 'After that I went for a walk.', 'There were many people in the park.', 'The children were playing there.', '\n', 'Some of them were walking their dogs.', 'I sat on a bench and read my book.', 'The wind felt so nice.', 'After that, I went to a cafe with my friends.', '\n', 'We talked and drank coffee at the cafe.', 'My friend said she had just moved recently.', 'She showed me a photo of her new house.', '\n', 'The house was very beautiful.', "Time flies when you're having fun.", 'As the evening fell, I went back home.', 'For dinner, I made curry.', 'The curry was delicious.', "Today's one day was very fun."]
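
The script above sends only the fixed instruction turns plus the two most recent exchanges with each request, because deque(maxlen=4) silently drops older entries. The sketch below illustrates which messages would be sent for each sentence. It runs offline: fake_translate is a hypothetical stand-in for the model call, the short sentences are placeholders, and the system message is left out for brevity.

```python
from collections import deque

# Fixed turns that are sent with every request (system message omitted here)
instructs = [
    {"role": "user", "content": "Translate Japanese to English."},
    {"role": "assistant", "content": "OK"},
]
context = deque(maxlen=4)  # two user/assistant pairs, as in the script


def fake_translate(sentence):
    """Hypothetical stand-in for ollama.chat so the sketch needs no model."""
    return f"<translation of {sentence}>"


sent_per_call = []
for sentence in ["A。", "B。", "C。", "D。"]:
    messages = instructs + list(context) + [{"role": "user", "content": sentence}]
    sent_per_call.append([m["content"] for m in messages])
    # Appending beyond maxlen discards the oldest entries automatically
    context.append({"role": "user", "content": sentence})
    context.append({"role": "assistant", "content": fake_translate(sentence)})

for contents in sent_per_call:
    print(len(contents), contents)
```

By the fourth request the first sentence and its translation are no longer included, so the prompt length stays bounded however long the input text is. The trade-off is that context from more than two sentences back is not available to the model.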

English-Japanese Translation sample script

import re
import ollama

model_name = "7shi/gemma-2-jpn-translate:2b-instruct-q8_0"

system_prompt = "You are a highly skilled professional Japanese-English and English-Japanese translator. Translate the given text accurately, taking into account the context and specific instructions provided. Steps may include hints enclosed in square brackets [] with the key and value separated by a colon:. Only when the subject is specified in the Japanese sentence, the subject will be added when translating into English. If no additional instructions or context are provided, use your expertise to consider what the most appropriate context is and provide a natural translation that aligns with that context. When translating, strive to faithfully reflect the meaning and tone of the original text, pay attention to cultural nuances and differences in language usage, and ensure that the translation is grammatically correct and easy to read. After completing the translation, review it once more to check for errors or unnatural expressions. For technical terms and proper nouns, either leave them in the original language or use appropriate translations as necessary. Take a deep breath, calm down, and start translating."
instruct = """Translate English to Japanese.\nWhen translating, please use the following hints:\n[writing_style: business]"""

# Function to split English sentences
def split_sentences(text):
    sentences = []
    # Split by newlines, periods, exclamation marks, question marks, or two or more consecutive spaces
    pattern = r'(?:\r?\n|\.|\!|\?|(?:\s{2,}))'
    splits = re.split(pattern, text)

    for split in splits:
        split = split.strip()
        if split:
            sentences.append(split)

    return sentences

# Function to translate a sentence
def translate_sentence(sentence, previous_context):
    if sentence.strip() == '':
        return sentence

    messages = previous_context + [
        {"role": "user", "content": sentence}
    ]

    response = ollama.chat(model=model_name, messages=messages)
    return response["message"]["content"]

from collections import deque

# Main processing function
def main(text):
    sentences = split_sentences(text)
    translated_sentences = []

    # Initialize context with system prompt
    instructs = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": instruct},
        {"role": "assistant", "content": "OK"}
    ]
    context = deque(maxlen=4)  # Maximum 4 elements (2 user, 2 assistant)

    for sentence in sentences:
        translation_context = instructs + list(context)
        translated_sentence = translate_sentence(sentence, translation_context)
        translated_sentences.append(translated_sentence)

        # Add new interactions to the context
        if sentence.strip() != '':
            context.append({"role": "user", "content": sentence})
            context.append({"role": "assistant", "content": translated_sentence})

    return translated_sentences

# Sample English text for translation (business context)
text = """Dear valued clients and partners,

I hope this email finds you well. I am writing to provide you with an important update regarding our company's recent developments and future plans.

Firstly, I am pleased to announce that our Q3 financial results have exceeded expectations, with a 15% increase in revenue compared to the same period last year. This success is largely attributed to the launch of our new product line and the expansion of our services into emerging markets.

In light of this growth, we are planning to implement several strategic initiatives in the coming months:

1. Expansion of our R&D department: We will be investing significantly in research and development to maintain our competitive edge in the market.

2. Sustainability efforts: We are committed to reducing our carbon footprint by 30% over the next five years. This includes transitioning to renewable energy sources and implementing eco-friendly practices across all our operations.

3. Digital transformation: We will be upgrading our IT infrastructure to enhance efficiency and provide better service to our clients.

Additionally, we are excited to announce our upcoming annual conference, which will be held virtually this year due to ongoing global health concerns. The conference will take place on November 15-16, 2024, and will feature keynote speeches from industry leaders, interactive workshops, and networking opportunities.

We value your continued support and partnership. If you have any questions or would like further information about any of these initiatives, please don't hesitate to reach out to your account manager or contact our customer support team.

Thank you for your trust in our company. We look forward to achieving new milestones together.

Best regards,
John Smith
CEO, XYZ Corporation"""

translated = main(text)
print(translated)

Result

['ご大切なお客様、そしてパートナーの皆様へ', 'このメールは、お元気でいることを願っています', '弊社が取り組む最近の動向や今後の計画について、重要なアップデートを伝えたいと思っております。', 'まず、弊社3Qの決算は予想を上回ったと発表させていただきます。去年同時期比で売上高が15%増加したのです。', 'この成功の大半は、弊社の新製品ラインと、新興市場へのサービス拡大 に起因していると言えるでしょう。', 'この成長を踏まえ、当社は今後数ヶ月の間に、いくつかの戦略的イニシアティブの実施について計画しております。', '1', '当社R&D部門の拡大: 競争力維持のため市場に投入する資金を大幅に拡充', '2', 'サステナビリティ取り組み: 5年間にわたって、当社は炭素フットプリントを30%削減することを目指します', 'これには再生可能エネルギー源への移行や、社内のすべての部門において環境に優しい取り組みを実施するということなども含まれます', '3', 'デジタルトランスフォーメーション:ITインフラを刷新して効率の向上や顧客満足度の高いサービスを提供すること', 'さらに、今年度は継続中の世界的な健康問題のためオンラインで開催されますが、来年開催される当社初のAnnual Conferenceを皆さんに発表します', '当社の初めて のAnnual Conferenceは、2024年11月15日~16日に開催され、業界のリーダーからのキーノート講演や、ワークショップを参加し相互 に交流できる機会が設けられます。', '引き続きのご支援とご協力をお待ちしております。', 'これらの取り組みについて何か質問がありましたら、または詳しい情報はご要望でしたら、まず担当マネジャーにお問い合わせください。また、当社のカスタマーサポートチームに連絡してください。', '弊社にご信任いただきありがとうございます。', '新しい目標を一緒に達成できることを楽しみにしています。', '引き続きよろしくお願い申し上げます。', 'ジョン・スミス', 'XYZ社のCEO']
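
The separate '1', '2' and '3' items in the output come from the splitter, which treats every period as a boundary and discards the delimiter. The sketch below reproduces that behaviour with the same pattern on a short sample. The second function, split_keep_punctuation, is a hypothetical alternative that is not part of the original script; it assumes sentences end with punctuation followed by whitespace.

```python
import re

# Same pattern as in the script above
pattern = r'(?:\r?\n|\.|\!|\?|(?:\s{2,}))'


def split_sentences(text):
    """Split as the script does: delimiters are dropped, empty pieces skipped."""
    return [s.strip() for s in re.split(pattern, text) if s.strip()]


def split_keep_punctuation(text):
    """Hypothetical variant: split after . ! ? followed by whitespace, or at newlines."""
    pieces = re.split(r'(?<=[.!?])\s+|\r?\n', text)
    return [p.strip() for p in pieces if p.strip()]


sample = "1. Expansion of our R&D department: We will invest.\n\nBest regards,\nJohn Smith"

print(split_sentences(sample))
# ['1', 'Expansion of our R&D department: We will invest', 'Best regards,', 'John Smith']

print(split_keep_punctuation(sample))
```

The variant keeps the final punctuation attached to each piece, which may give the model a clearer sentence boundary, but it still separates list numbers such as "1." from the text that follows.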

Modelfile

FROM gemma-2-2b-jpn-it-translate-Q8_0.gguf

SYSTEM "You are a highly skilled professional Japanese-English and English-Japanese translator. Translate the given text accurately, taking into account the context and specific instructions provided. Steps may include hints enclosed in square brackets [] with the key and value separated by a colon:. Only when the subject is specified in the Japanese sentence, the subject will be added when translating into English. If no additional instructions or context are provided, use your expertise to consider what the most appropriate context is and provide a natural translation that aligns with that context. When translating, strive to faithfully reflect the meaning and tone of the original text, pay attention to cultural nuances and differences in language usage, and ensure that the translation is grammatically correct and easy to read. After completing the translation, review it once more to check for errors or unnatural expressions. For technical terms and proper nouns, either leave them in the original language or use appropriate translations as necessary. Take a deep breath, calm down, and start translating."

TEMPLATE """<start_of_turn>user
{{ if .System }}{{ .System }}

{{ end }}{{ .Prompt }}<end_of_turn>
<start_of_turn>model
{{ .Response }}<end_of_turn>
"""

PARAMETER stop <start_of_turn>
PARAMETER stop <end_of_turn>
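
The TEMPLATE above places the SYSTEM text at the start of the user turn, followed by a blank line and the prompt. As a rough illustration, the function below builds the text the model would see for a single turn, up to the point where its reply begins. It is a hand-written approximation and not Ollama's template engine; render_prompt and the shortened system text are hypothetical.

```python
def render_prompt(prompt, system=""):
    """Approximate the TEMPLATE output for one turn, up to the model's reply."""
    # {{ if .System }}{{ .System }} + blank line {{ end }}
    system_part = f"{system}\n\n" if system else ""
    return (
        "<start_of_turn>user\n"
        f"{system_part}{prompt}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


print(render_prompt("Translate Japanese to English.", system="You are a translator."))
```

Generation then continues after the final "<start_of_turn>model" line until one of the stop strings from the PARAMETER lines is produced.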