Typhoon2 T1 - 3B Thai / English Bilingual Reasoning LLM.

Typhoon T1 3B (Research Preview) - Q4_K_M

Typhoon T1 3B (Research Preview) is the first in a new family of open reasoning models, “Typhoon T”. A reasoning model is a novel type of model that thinks longer before giving a final answer.

Typhoon T1 3B (Research Preview) is built on top of Typhoon 2 3B Instruct. It improves on that base model on challenging benchmarks such as GPQA, MMLU Pro, and the AI Mathematics Olympiad validation set.

Run

ollama run scb10x/llama3.2-typhoon2-t1-3b-research-preview
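The same model can also be queried programmatically once it has been pulled. The following is a minimal sketch, assuming a local Ollama server listening on its default port (11434) and using Ollama's `/api/generate` REST endpoint with streaming disabled. The helper names `build_request` and `ask` are illustrative and not part of this model card.

```python
import json
import urllib.request

# Model tag taken from the run command above.
MODEL = "scb10x/llama3.2-typhoon2-t1-3b-research-preview"
# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text (thinking and answer)."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

Calling `ask("What is 17 * 23?")` requires the server to be running and the model to be downloaded. A generous timeout is used here because a reasoning model may generate a long trace before its final answer.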

Key Points

  • Typhoon T1 is a new family of open reasoning models developed by SCB 10X
  • Typhoon T1 3B (Research Preview), the first in the Typhoon T family, shows improved performance across challenging benchmarks compared to the original Typhoon 2 3B Instruct
  • Typhoon T1 3B (Research Preview) is a fast model with low compute requirements, yet it is capable across a variety of tasks because it scales test-time compute, thinking longer before giving a final answer. It can reason across domains, unlike many open reasoning models, which are limited to mathematics and coding
  • We openly release our recipe for the data pipeline and for training this model, which does not rely on distillation from other reasoning models
  • We introduce a new thinking paradigm for reasoning models, structured thinking, in which auxiliary tokens help structure the model's thinking process. In our experiments, this approach outperforms the common variant that separates only the thought and response parts
  • Typhoon T1 3B (Research Preview) v2025-02-01 is the first reasoning model that we intentionally equipped with the ability to generate Thai reasoning traces, improving the model's transparency and interpretability.
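Because the model emits its reasoning before the final answer, client code usually needs to separate the two. The sketch below shows one way to do that, assuming the output wraps each part in XML-style tags. This page does not list the actual auxiliary tokens, so the tag names `thoughts` and `response` are hypothetical placeholders; replace them with the tokens documented in the technical blog.

```python
import re


def split_sections(text: str, tags: tuple = ("thoughts", "response")) -> dict:
    """Return the text inside each <tag>...</tag> pair, keyed by tag name.

    The default tag names are hypothetical placeholders, not the model's
    documented auxiliary tokens.
    """
    sections = {}
    for tag in tags:
        pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
        # DOTALL lets a section span multiple lines.
        match = re.search(pattern, text, flags=re.DOTALL)
        if match:
            sections[tag] = match.group(1).strip()
    return sections


# Hand-written sample string, not real model output.
sample = "<thoughts>2 + 2 = 4</thoughts>\n<response>4</response>"
parts = split_sections(sample)
```

Tags that are missing from the text are simply left out of the result, so callers can check for `"response"` before showing an answer to the user.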

For more technical details, please visit our technical blog.

  • To acknowledge Meta’s effort in creating the foundation model and to comply with the license, we explicitly include llama-3.2 in the model name.

Performance

| Model name | GSM8K (↑), 8-shot | HumanEval+ (↑), Pass@10 | GPQA (↑), 0CoT | AIME (↑) |
|---|---|---|---|---|
| Typhoon 2 3B Instruct | 56.63 | 66 | 27.01 | 0 |
| Typhoon T1 3B (semi) | 59.59 | 68.99 | 25.89 | 0 |
| Typhoon T1 3B (Research Preview) v2025-01-23 | 62.40 | 69.87 | 31.7 | 2.22 |

MMLU Pro (↑), 5-shot

| Model name | Average | Math | Health | Physics | Business | Biology | Chemistry | Computer Science | Economics | Engineering | Philosophy | Other | History | Psychology | Law |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Typhoon 2 3B Instruct | 26.7 | 26.8 | 33.62 | 23.4 | 25.35 | 43.38 | 19.88 | 28.29 | 35.43 | 18.37 | 28.06 | 27.92 | 25.72 | 37.84 | 13.17 |
| Typhoon T1 3B (Research Preview) v2025-01-23 | 30.65 | 30.57 | 36.19 | 27.1 | 31.69 | 50.77 | 22.17 | 31.22 | 38.86 | 21.98 | 30.66 | 32.79 | 26.51 | 43.36 | 17.26 |

Link

Arxiv

Huggingface