MediaTek Research Breeze-7B (hereinafter referred to as Breeze-7B) is a language model family that builds on top of Mistral-7B, specifically intended for Traditional Chinese use.

233 Pulls Updated 4 weeks ago

Readme

from https://huggingface.co/nold/Breeze-7B-Instruct-v1_0-GGUF

MediaTek Research Breeze-7B (hereinafter referred to as Breeze-7B) is a language model family that builds on top of Mistral-7B, specifically intended for Traditional Chinese use.

Breeze-7B-Base is the base model for the Breeze-7B series. It is suitable for use if you have substantial fine-tuning data to tune it for your specific use case.

Breeze-7B-Instruct derives from the base model Breeze-7B-Base, making the resulting model amenable to be used as-is for commonly seen tasks.

The current release version of Breeze-7B is v1.0, which has undergone a more refined training process compared to Breeze-7B-v0_1, resulting in significantly improved performance in both English and Traditional Chinese.

For details of this model please read our paper.

Practicality-wise:

Breeze-7B-Base expands the original vocabulary with an additional 30,000 Traditional Chinese tokens. With the expanded vocabulary, and everything else being equal, Breeze-7B operates at twice the inference speed for Traditional Chinese to Mistral-7B and Llama 7B. [See Inference Performance.] Breeze-7B-Instruct can be used as is for common tasks such as Q&A, RAG, multi-round chat, and summarization. Performance-wise:

Breeze-7B-Instruct demonstrates impressive performance in benchmarks for Traditional Chinese and English when compared to similar-sized open-source contemporaries such as Taiwan-LLM-7B/13B-chat, QWen(1.5)-7B-Chat, and Yi-6B-Chat. [See Chat Model Performance.] A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Chang-Le Liu 劉昶樂, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.