MediaTek Research Breeze-7B (hereinafter referred to as Breeze-7B) is a language model family that builds on top of Mistral-7B, specifically intended for Traditional Chinese use.
233 Pulls Updated 4 weeks ago
Updated 4 weeks ago
4 weeks ago
878900a991f9 · 8.0GB
Readme
from https://huggingface.co/nold/Breeze-7B-Instruct-v1_0-GGUF
MediaTek Research Breeze-7B (hereinafter referred to as Breeze-7B) is a language model family that builds on top of Mistral-7B, specifically intended for Traditional Chinese use.
Breeze-7B-Base is the base model for the Breeze-7B series. It is suitable for use if you have substantial fine-tuning data to tune it for your specific use case.
Breeze-7B-Instruct derives from the base model Breeze-7B-Base, making the resulting model amenable to be used as-is for commonly seen tasks.
The current release version of Breeze-7B is v1.0, which has undergone a more refined training process compared to Breeze-7B-v0_1, resulting in significantly improved performance in both English and Traditional Chinese.
For details of this model please read our paper.
Practicality-wise:
Breeze-7B-Base expands the original vocabulary with an additional 30,000 Traditional Chinese tokens. With the expanded vocabulary, and everything else being equal, Breeze-7B operates at twice the inference speed for Traditional Chinese to Mistral-7B and Llama 7B. [See Inference Performance.] Breeze-7B-Instruct can be used as is for common tasks such as Q&A, RAG, multi-round chat, and summarization. Performance-wise:
Breeze-7B-Instruct demonstrates impressive performance in benchmarks for Traditional Chinese and English when compared to similar-sized open-source contemporaries such as Taiwan-LLM-7B/13B-chat, QWen(1.5)-7B-Chat, and Yi-6B-Chat. [See Chat Model Performance.] A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Chang-Le Liu 劉昶樂, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.