Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The models support a context length of up to 128K tokens and are multilingual.
6.3M Pulls 133 Tags Updated 6 months ago
The latest series of code-specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.
4.9M Pulls 196 Tags Updated 4 months ago
Qwen 1.5 is a series of large language models by Alibaba Cloud, ranging from 0.5B to 110B parameters.
4.5M Pulls 379 Tags Updated 11 months ago
Qwen2 is a new series of large language models from Alibaba Group.
4.2M Pulls 97 Tags Updated 7 months ago
Qwen2 Math is a series of specialized math language models built upon the Qwen2 LLMs, which significantly outperform open-source models and even closed-source models (e.g., GPT-4o) in mathematical capability.
123.8K Pulls 52 Tags Updated 7 months ago
QwQ is the reasoning model of the Qwen series.
1.2M Pulls 8 Tags Updated 3 weeks ago
CodeQwen1.5 is a large language model pretrained on a large amount of code data.
139.4K Pulls 30 Tags Updated 9 months ago
A fine-tuned version of Deepseek-R1-Distilled-Qwen-1.5B that surpasses the performance of OpenAI’s o1-preview with just 1.5B parameters on popular math evaluations.
72.9K Pulls 5 Tags Updated 8 weeks ago
A new small reasoning model fine-tuned from the Qwen 2.5 3B Instruct model.
52K Pulls 5 Tags Updated 3 months ago
MiniCPM-V surpasses proprietary models such as GPT-4V, Gemini Pro, Qwen-VL and Claude 3 in overall performance, and supports multimodal conversation in over 30 languages.
41K Pulls 8 Tags Updated 10 months ago
The Qwen2.5 coder tools model can work with Cline (previously Claude Dev). Updated with 0.5b, 1.5b, 3b, 7b, 14b, and 32b coder models.
29.7K Pulls 15 Tags Updated 4 months ago
Qwen2.5-1M is the long-context version of the Qwen2.5 series models, supporting a context length of up to 1M tokens.
19.5K Pulls 11 Tags Updated 2 months ago
18.9K Pulls 1 Tag Updated 2 months ago
15.6K Pulls 13 Tags Updated 9 months ago
12.9K Pulls 36 Tags Updated 4 months ago
12.8K Pulls 1 Tag Updated 2 months ago
8,713 Pulls 1 Tag Updated 2 months ago
The Qwen2.5 coder tools model can work with Cline (previously Claude Dev).
7,941 Pulls 13 Tags Updated 4 months ago
6,531 Pulls 1 Tag Updated 9 months ago