DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance.
2.2M Pulls 1 Tag Updated 5 months ago
24B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
847.2K Pulls 6 Tags Updated 5 months ago
DeepSeek-V4-Flash is a preview of the DeepSeek-V4 series, a Mixture-of-Experts model with 284B total parameters and 13B activated, built for efficient reasoning across a 1M-token context window.
95.1K Pulls 1 Tag Updated 1 month ago
DeepSeek-V4-Pro is a frontier Mixture-of-Experts model with a 1M-token context window and three reasoning modes.
87.2K Pulls 1 Tag Updated 1 month ago
123B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
228.5K Pulls 6 Tags Updated 5 months ago
DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
453.2K Pulls 3 Tags Updated 6 months ago
DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode.
690K Pulls 8 Tags Updated 8 months ago
DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models such as o3 and Gemini 2.5 Pro.
86.5M Pulls 35 Tags Updated 11 months ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
3.8M Pulls 5 Tags Updated 1 year ago
Dolphin 3.0 Llama 3.1 8B 🐬 is the next generation of the Dolphin series of instruct-tuned models designed to be the ultimate general purpose local model, enabling coding, math, agentic, function calling, and general use cases.
DeepSeek Coder is a capable coding model trained on two trillion tokens of code and natural language.
4.2M Pulls 102 Tags Updated 2 years ago
An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT-4 Turbo in code-specific tasks.
2.6M Pulls 64 Tags Updated 1 year ago
Devstral: the best open source model for coding agents
949.1K Pulls 5 Tags Updated 10 months ago
A fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B that surpasses the performance of OpenAI’s o1-preview on popular math evaluations with just 1.5B parameters.
1.2M Pulls 5 Tags Updated 1 year ago
DeepCoder is a fully open-source 14B coder model at o3-mini level, with a 1.5B version also available.
872.4K Pulls 9 Tags Updated 1 year ago
Dolphin 2.9 is a new model by Eric Hartford, available in 8B and 70B sizes and based on Llama 3, with a variety of instruction-following, conversational, and coding skills.
1.9M Pulls 53 Tags Updated 2 years ago
Uncensored 8x7B and 8x22B fine-tuned models, based on the Mixtral mixture-of-experts models, that excel at coding tasks. Created by Eric Hartford.
1.8M Pulls 70 Tags Updated 1 year ago
2.7B uncensored Dolphin model by Eric Hartford, based on the Phi language model by Microsoft Research.
1.6M Pulls 15 Tags Updated 2 years ago
The uncensored Dolphin model based on Mistral that excels at coding tasks. Updated to version 2.8.
1.5M Pulls 120 Tags Updated 2 years ago
A strong, economical, and efficient Mixture-of-Experts language model.
1.1M Pulls 34 Tags Updated 1 year ago