DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models such as OpenAI o3 and Gemini 2.5 Pro.
74.5M Pulls 35 Tags Updated 5 months ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
3M Pulls 5 Tags Updated 11 months ago
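The entry above highlights the defining property of MoE models: only a small fraction of the total parameters (37B of 671B) runs for any given token. The toy sketch below illustrates that idea with top-k gating; the expert count, dimensions, and gating scheme are hypothetical and much simpler than DeepSeek's actual architecture.

```python
# Minimal, illustrative sketch of Mixture-of-Experts top-k routing.
# Not DeepSeek's implementation: expert count, k, and dimensions are toy values.
# The point: only k of n_experts expert networks run per token, which is how a
# model can hold 671B total parameters yet activate only ~37B per token.

import random

def route_token(token_vec, gate_weights, k=2):
    """Score each expert with a linear gate and return the top-k expert indices."""
    scores = [sum(w * x for w, x in zip(row, token_vec)) for row in gate_weights]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def moe_layer(token_vec, experts, gate_weights, k=2):
    """Run only the selected experts and sum their outputs."""
    chosen = route_token(token_vec, gate_weights, k)
    out = [0.0] * len(token_vec)
    for i in chosen:
        expert_out = experts[i](token_vec)
        out = [a + b for a, b in zip(out, expert_out)]
    return out, chosen

random.seed(0)
n_experts, dim = 8, 4
gate_weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_experts)]
# Each "expert" is a toy function that just scales the input vector.
experts = [lambda x, s=i + 1: [s * v for v in x] for i in range(n_experts)]

out, chosen = moe_layer([0.1, -0.2, 0.3, 0.4], experts, gate_weights, k=2)
print(f"activated experts: {chosen}")  # only 2 of the 8 experts run
```

Because the unselected experts are never evaluated, per-token compute scales with k rather than with the total number of experts.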
DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.
2.3M Pulls 102 Tags Updated 1 year ago
An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT-4 Turbo on code-specific tasks.
1.3M Pulls 64 Tags Updated 1 year ago
A fine-tuned version of Deepseek-R1-Distilled-Qwen-1.5B that, with just 1.5B parameters, surpasses the performance of OpenAI's o1-preview on popular math evaluations.
867.3K Pulls 5 Tags Updated 10 months ago
DeepCoder is a fully open-source 14B coder model at the o3-mini level, with a 1.5B version also available.
455.6K Pulls 9 Tags Updated 8 months ago
An advanced language model trained on 2 trillion bilingual tokens.
237.4K Pulls 64 Tags Updated 2 years ago
A strong, economical, and efficient Mixture-of-Experts language model.
226.9K Pulls 34 Tags Updated 1 year ago
DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode.
207K Pulls 8 Tags Updated 2 months ago
An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
92.2K Pulls 7 Tags Updated 1 year ago
DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
64.7K Pulls 3 Tags Updated 4 weeks ago
DeepSeek-V3.2 is a model that harmonizes high computational efficiency with superior reasoning and agent performance.
4,664 Pulls 1 Tag Updated 1 week ago
Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind. Updated to version 1.1
5.6M Pulls 102 Tags Updated 1 year ago
Cogito v1 Preview is a family of hybrid reasoning models by Deep Cogito that outperform the best available open models of the same size, including counterparts from LLaMA, DeepSeek, and Qwen across most standard benchmarks.
944.8K Pulls 20 Tags Updated 8 months ago
A fully open-source family of reasoning models built using a dataset derived by distilling DeepSeek-R1.
632.7K Pulls 15 Tags Updated 8 months ago
EXAONE Deep, a family of models ranging from 2.4B to 32B parameters developed and released by LG AI Research, exhibits superior capabilities in various reasoning tasks, including math and coding benchmarks.
312.4K Pulls 13 Tags Updated 9 months ago
A version of the DeepSeek-R1 model post-trained by Perplexity to provide unbiased, accurate, and factual information.
156.3K Pulls 9 Tags Updated 10 months ago
A 24B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
39.5K Pulls 6 Tags Updated 5 days ago
A 123B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
9,175 Pulls 6 Tags Updated 6 days ago
Olmo is a series of open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.
5,425 Pulls 15 Tags Updated 2 days ago