
- deepseek-r1
  DeepSeek's first generation of reasoning models, with performance comparable to OpenAI-o1, including six dense models distilled from DeepSeek-R1 based on Llama and Qwen. A minimal usage sketch follows this entry.
  1.5b 7b 8b 14b 32b 70b 671b · 32.9M Pulls · 29 Tags · Updated 8 weeks ago
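
Below is a minimal sketch of how a model from this library might be pulled and queried with the Ollama Python client; it assumes a local Ollama server is running and uses the `deepseek-r1:7b` tag from the entry above as an example.

```python
# Minimal sketch using the Ollama Python client (pip install ollama).
# Assumes an Ollama server is running locally on its default port.
import ollama

# Download the 7B variant if it is not already present locally.
ollama.pull("deepseek-r1:7b")

# Send a single chat turn to the model and print its reply.
response = ollama.chat(
    model="deepseek-r1:7b",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```

Any tag listed on this page (for example `deepseek-v3:671b` or `deepseek-coder:6.7b`) can be substituted for the model name.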
- deepseek-v3
  A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token (see the routing sketch after this entry).
  671b · 942.5K Pulls · 5 Tags · Updated 2 months ago
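
To make the total-versus-activated distinction concrete, here is a toy top-k routing sketch. This is not DeepSeek's actual code; the expert count, hidden size, and router are illustrative assumptions. The point is that each token is sent to only a few experts, so only that small slice of the model's total parameters participates in the token's forward pass.

```python
# Toy MoE routing sketch in NumPy. Illustrative only: the real model's
# expert count, hidden size, and routing scheme differ.
import numpy as np

rng = np.random.default_rng(0)
n_experts, top_k, d_model = 8, 2, 16   # hypothetical sizes for the sketch

router = rng.normal(size=(d_model, n_experts))            # routing weights
experts = rng.normal(size=(n_experts, d_model, d_model))  # "total" parameters

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector through its top-k experts only."""
    logits = x @ router
    top = np.argsort(logits)[-top_k:]        # indices of the chosen experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the chosen experts
    # Only the selected experts' weights are touched for this token,
    # which is why "activated" parameters are far fewer than total ones.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

print(moe_forward(rng.normal(size=d_model)).shape)  # (16,)
```

Here 2 of 8 experts fire per token; DeepSeek-V3's 37B-of-671B ratio follows the same principle at a much larger scale.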
- deepseek-coder-v2
  An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT-4 Turbo on code-specific tasks.
  16b 236b · 762.8K Pulls · 64 Tags · Updated 7 months ago
- deepseek-coder
  DeepSeek Coder is a capable coding model trained on two trillion tokens of code and natural language.
  1.3b 6.7b 33b · 637.6K Pulls · 102 Tags · Updated 15 months ago
- deepseek-llm
  An advanced language model trained on 2 trillion bilingual tokens.
  7b 67b · 137.8K Pulls · 64 Tags · Updated 15 months ago
- deepseek-v2
  A strong, economical, and efficient Mixture-of-Experts language model.
  16b 236b · 135.5K Pulls · 34 Tags · Updated 9 months ago
- deepscaler
  A fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B that surpasses OpenAI's o1-preview on popular math evaluations with just 1.5B parameters.
  1.5b · 71.9K Pulls · 5 Tags · Updated 7 weeks ago
- deepseek-v2.5
  An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
  236b · 51.5K Pulls · 7 Tags · Updated 6 months ago
- exaone-deep
  A family of models ranging from 2.4B to 32B parameters, developed and released by LG AI Research, that exhibits strong capabilities on reasoning tasks including math and coding benchmarks.
  2.4b 7.8b 32b · 22.6K Pulls · 13 Tags · Updated 2 weeks ago