759 Downloads Updated 1 month ago
ollama run youtu/youtu
Updated 1 month ago
1 month ago
a0fe102b0112 Β· 2.1GB Β·
This repository hosts the Ollama model package for Tencentβs Youtu-LLM-2B.
Youtu-LLM is a new, small, yet powerful LLM, contains only 1.96B parameters, supports 128k long context, and has native agentic talents. On general evaluations, Youtu-LLM significantly outperforms SOTA LLMs of similar size in terms of Commonsense, STEM, Coding and Long Context capabilities; in agent-related testing, Youtu-LLM surpasses larger-sized leaders and is truly capable of completing multiple end2end agent tasks.
Youtu-LLM has the following features:
| Model Name | Description | Download |
|---|---|---|
| Youtu-LLM-2B-Base | Base model of Youtu-LLM-2B | π€ Model |
| Youtu-LLM-2B | Instruct model of Youtu-LLM-2B | π€ Model |
| Youtu-LLM-2B-GGUF | Instruct model of Youtu-LLM-2B, in GGUF format | π€ Model |
| Benchmark | DeepSeek-R1-Distill-Qwen-1.5B | Qwen3-1.7B | SmolLM3-3B | Qwen3-4B | DeepSeek-R1-Distill-Llama-8B | Youtu-LLM-2B |
|---|---|---|---|---|---|---|
| Commonsense Knowledge Reasoning | ||||||
| MMLU-Redux | 53.0% | 74.1% | 75.6% | 83.8% | 78.1% | 75.8% |
| MMLU-Pro | 36.5% | 54.9% | 53.0% | 69.1% | 57.5% | 61.6% |
| Instruction Following & Text Reasoning | ||||||
| IFEval | 29.4% | 70.4% | 60.4% | 83.6% | 34.6% | 81.2% |
| DROP | 41.3% | 72.5% | 72.0% | 82.9% | 73.1% | 86.7% |
| MUSR | 43.8% | 56.6% | 54.1% | 60.5% | 59.7% | 57.4% |
| STEM | ||||||
| MATH-500 | 84.8% | 89.8% | 91.8% | 95.0% | 90.8% | 93.7% |
| AIME 24 | 30.2% | 44.2% | 46.7% | 73.3% | 52.5% | 65.4% |
| AIME 25 | 23.1% | 37.1% | 34.2% | 64.2% | 34.4% | 49.8% |
| GPQA-Diamond | 33.6% | 36.9% | 43.8% | 55.2% | 45.5% | 48.0% |
| BBH | 31.0% | 69.1% | 76.3% | 87.8% | 77.8% | 77.5% |
| Coding | ||||||
| HumanEval | 64.0% | 84.8% | 79.9% | 95.4% | 88.1% | 95.9% |
| HumanEval+ | 59.5% | 76.2% | 74.7% | 87.8% | 82.5% | 89.0% |
| MBPP | 51.5% | 80.5% | 66.7% | 92.3% | 73.9% | 85.0% |
| MBPP+ | 44.2% | 67.7% | 56.7% | 77.6% | 61.0% | 71.7% |
| LiveCodeBench v6 | 19.8% | 30.7% | 30.8% | 48.5% | 36.8% | 43.7% |
| Benchmark | Qwen3-1.7B | SmolLM3-3B | Qwen3-4B | Youtu-LLM-2B |
|---|---|---|---|---|
| Deep Research | ||||
| GAIA | 11.4% | 11.7% | 25.5% | 33.9% |
| xbench | 11.7% | 13.9% | 18.4% | 19.5% |
| Code | ||||
| SWE-Bench-Verified | 0.6% | 7.2% | 5.7% | 17.7% |
| EnConda-Bench | 10.8% | 3.5% | 16.1% | 21.5% |
| Tool | ||||
| BFCL V3 | 55.5% | 31.5% | 61.7% | 58.0% |
| ΟΒ²-Bench | 2.6% | 9.7% | 10.9% | 15.0% |