The most powerful vision-language model in the Qwen model family to date.
6,249 Pulls 1 Tag Updated 3 days ago
Alibaba's performant long context models for agentic and coding tasks.
512.6K Pulls 10 Tags Updated 3 weeks ago
Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.
10.3M Pulls 58 Tags Updated 1 week ago
Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The models support a context of up to 128K tokens and are multilingual.
15.3M Pulls 133 Tags Updated 1 year ago
The latest series of code-specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.
7.7M Pulls 199 Tags Updated 4 months ago
Qwen 1.5 is a series of large language models by Alibaba Cloud, ranging from 0.5B to 110B parameters.
5M Pulls 379 Tags Updated 1 year ago
Qwen2 is a new series of large language models from Alibaba Group.
4.4M Pulls 97 Tags Updated 1 year ago
Flagship vision-language model of Qwen and also a significant leap from the previous Qwen2-VL.
833K Pulls 17 Tags Updated 4 months ago
Qwen2 Math is a series of specialized math language models built upon the Qwen2 LLMs, which significantly outperform open-source models and even closed-source models (e.g., GPT-4o) in mathematical capabilities.
174.5K Pulls 52 Tags Updated 1 year ago
Building upon the foundational models of the Qwen3 series, Qwen3 Embedding provides a comprehensive range of text embedding models in various sizes.
67.7K Pulls 12 Tags Updated 3 weeks ago
QwQ is the reasoning model of the Qwen series.
1.7M Pulls 8 Tags Updated 7 months ago
A fine-tuned version of Deepseek-R1-Distilled-Qwen-1.5B that surpasses the performance of OpenAI’s o1-preview with just 1.5B parameters on popular math evaluations.
516.2K Pulls 5 Tags Updated 8 months ago
CodeQwen1.5 is a large language model pretrained on a large amount of code data.
180.2K Pulls 30 Tags Updated 1 year ago
A new small reasoning model fine-tuned from the Qwen 2.5 3B Instruct model.
84.3K Pulls 5 Tags Updated 9 months ago
Alibaba's text reranking model. Qwen3-Reranker-8B has the following features: Model Type: Text Reranking. Supported Languages: 100+ Languages. Number of Parameters: 8B. Context Length: 32k.
190.1K Pulls 5 Tags Updated 4 months ago
Qwen2.5-7B/14B-Instruct-1M
115.8K Pulls 11 Tags Updated 8 months ago
Qwen3, but Josiefied and uncensored.
68K Pulls 47 Tags Updated 3 months ago
66.8K Pulls 74 Tags Updated 2 months ago
MiniCPM-V surpasses proprietary models such as GPT-4V, Gemini Pro, Qwen-VL and Claude 3 in overall performance, and supports multimodal conversation in over 30 languages.
44.7K Pulls 8 Tags Updated 1 year ago
DeepSeek-R1-Distill models are fine-tuned from open-source models, using samples generated by DeepSeek-R1. We slightly changed their configs and tokenizers. Please use our settings to run these models.
41.7K Pulls 2 Tags Updated 9 months ago