27 3 weeks ago

TowerInstruct is an Open Multilingual Large Language Model for Translation-Related Tasks by Unbabel.

Models

View all →

Readme

TowerInstruct-13B-v0.1

Model Description

TowerInstruct-13B is a language model that results from fine-tuning TowerBase on the TowerBlocks supervised fine-tuning dataset. TowerInstruct-13B-v0.1 is the first model in the series. The model is trained to handle several translation-related tasks, such as general machine translation (e.g., sentence- and paragraph/document-level translation, terminology-aware translation, context-aware translation), automatic post edition, named-entity recognition, gramatical error correction, and paraphrase generation. We will release more details in the upcoming technical report. For now, you can check results obtained with the model here.

  • Developed by: Unbabel, Instituto Superior Técnico, CentraleSupélec University of Paris-Saclay
  • Model type: A 13B parameter model fine-tuned on a mix of publicly available, synthetic datasets on translation-related tasks, as well as conversational datasets and code instructions.
  • Language(s) (NLP): English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, Russian
  • License: CC-BY-NC-4.0, Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
  • Finetuned from model: TowerBase

Intended uses & limitations

The model was initially fine-tuned on a filtered and preprocessed supervised fine-tuning dataset (TowerBlocks-v0.2), which contains a diverse range of data sources:

  • Translation (sentence and paragraph-level)
  • Automatic Post Edition
  • Machine Translation Evaluation
  • Context-aware Translation
  • Terminology-aware Translation
  • Multi-reference Translation
  • Named-entity Recognition
  • Paraphrase Generation
  • Synthetic Chat data
  • Code instructions