DeepSeek-V3.2 is a model that combines high computational efficiency with strong reasoning and agent performance.
2.2M Pulls 1 Tag Updated 5 months ago
DeepSeek-V4-Flash is a preview of the DeepSeek-V4 series, a Mixture-of-Experts model with 284B total parameters and 13B activated, built for efficient reasoning across a 1M-token context window.
95.1K Pulls 1 Tag Updated 1 month ago
DeepSeek-V4-Pro is a frontier Mixture-of-Experts model with a 1M-token context window and three reasoning modes.
87.2K Pulls 1 Tag Updated 1 month ago
DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
453.2K Pulls 3 Tags Updated 6 months ago
DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode.
690K Pulls 8 Tags Updated 8 months ago
DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models such as o3 and Gemini 2.5 Pro.
86.5M Pulls 35 Tags Updated 11 months ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
3.8M Pulls 5 Tags Updated 1 year ago
DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.
4.2M Pulls 102 Tags Updated 2 years ago
An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT-4 Turbo on code-specific tasks.
2.6M Pulls 64 Tags Updated 1 year ago
A strong, economical, and efficient Mixture-of-Experts language model.
1.1M Pulls 34 Tags Updated 1 year ago
An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
276.6K Pulls 7 Tags Updated 1 year ago
A fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B that, with just 1.5B parameters, surpasses the performance of OpenAI’s o1-preview on popular math evaluations.
1.2M Pulls 5 Tags Updated 1 year ago
A version of the DeepSeek-R1 model that has been post-trained by Perplexity to provide unbiased, accurate, and factual information.
405.2K Pulls 9 Tags Updated 1 year ago
Jackrong/Qwen3.5-9B-DeepSeek-V4-Flash is an efficient reasoning model distilled using high-quality data from DeepSeek-V4, with vision added. For Ollama v0.30.0-rc20+.
1,342 Pulls 1 Tag Updated 1 week ago
Jackrong/Qwen3.5-9B-DeepSeek-V4-Flash is an efficient reasoning model distilled using high-quality data from DeepSeek-V4.
1,197 Pulls 1 Tag Updated 2 weeks ago
I have just enabled both tool calling and thinking for existing deepseek-r1 models.
1,358 Pulls 6 Tags Updated 1 month ago
145 Pulls 1 Tag Updated 1 month ago
DeepSeek-V2.5-1210 is an upgraded version of DeepSeek-V2.5, offering enhanced mathematical, coding, writing, and reasoning capabilities.
384 Pulls 3 Tags Updated 1 year ago
9 Pulls 1 Tag Updated 2 days ago
A custom Deepseek-coder:6.7b model with Deepseek-coder:1.3b as a speculative fill model to speed up inference. Primarily built for TabbyML usage.
74 Pulls 1 Tag Updated 2 weeks ago