DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
326.5K Pulls 3 Tags Updated 3 months ago
DeepSeek-V3.2 is a model that harmonizes high computational efficiency with superior reasoning and agent performance.
43.9K Pulls 1 Tag Updated 2 months ago
DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode.
426.7K Pulls 8 Tags Updated 5 months ago
DeepCoder is a fully open-source 14B coder model at o3-mini level, with a 1.5B version also available.
688.4K Pulls 9 Tags Updated 11 months ago
DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models, such as OpenAI o3 and Gemini 2.5 Pro.
79.4M Pulls 35 Tags Updated 8 months ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
3.6M Pulls 5 Tags Updated 1 year ago
A fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B that surpasses the performance of OpenAI’s o1-preview on popular math evaluations with just 1.5B parameters.
1.1M Pulls 5 Tags Updated 1 year ago
DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.
3.2M Pulls 102 Tags Updated 2 years ago
An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT-4 Turbo on code-specific tasks.
1.7M Pulls 64 Tags Updated 1 year ago
A strong, economical, and efficient Mixture-of-Experts language model.
547.1K Pulls 34 Tags Updated 1 year ago
An advanced language model crafted with 2 trillion bilingual tokens.
549.3K Pulls 64 Tags Updated 2 years ago
An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
167.9K Pulls 7 Tags Updated 1 year ago
Cogito v1 Preview is a family of hybrid reasoning models by Deep Cogito that outperform the best available open models of the same size, including counterparts from LLaMA, DeepSeek, and Qwen across most standard benchmarks.
1.5M Pulls 20 Tags Updated 11 months ago
EXAONE Deep, developed and released by LG AI Research, exhibits superior capabilities in various reasoning tasks, including math and coding benchmarks, in sizes ranging from 2.4B to 32B parameters.
525.7K Pulls 13 Tags Updated 11 months ago
Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind. Updated to version 1.1
6M Pulls 102 Tags Updated 1 year ago
A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.
832.9K Pulls 15 Tags Updated 11 months ago
A version of the DeepSeek-R1 model post-trained by Perplexity to provide unbiased, accurate, and factual information.
265.8K Pulls 9 Tags Updated 1 year ago
24B model that excels at using tools to explore codebases, edit multiple files, and power software engineering agents.
434.7K Pulls 6 Tags Updated 2 months ago
123B model that excels at using tools to explore codebases, edit multiple files, and power software engineering agents.
105.7K Pulls 6 Tags Updated 2 months ago
Qwen 3.5 is a family of open-source multimodal models that delivers exceptional utility and performance.
1.2M Pulls 30 Tags Updated 1 week ago