
deepseek-r1

DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models such as o3 and Gemini 2.5 Pro.

81.7M Pulls 35 Tags Updated 9 months ago

A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.

3.7M Pulls 5 Tags Updated 1 year ago
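
To put the parameter counts above in perspective, the sketch below computes the share of the model that is active for a single token. It uses only the 671B and 37B figures from the description; the function name `active_fraction` is a hypothetical helper introduced for illustration, not part of any library.

```python
# Figures taken from the model description above; nothing else is assumed.
TOTAL_PARAMS_B = 671   # total parameters, in billions
ACTIVE_PARAMS_B = 37   # parameters activated per token, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters used for a single token (hypothetical helper)."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"{fraction:.1%} of parameters are active per token")  # prints "5.5% ..."
```

Roughly one parameter in eighteen takes part in each token's forward pass, which is why an MoE model's per-token compute is much lower than its total size suggests. All parameters still need to be stored, so this ratio describes compute, not memory.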

DeepSeek Coder is a capable coding model trained on two trillion tokens of code and natural language.

3.7M Pulls 102 Tags Updated 2 years ago

An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT-4 Turbo on code-specific tasks.

2.1M Pulls 64 Tags Updated 1 year ago

An advanced language model trained on 2 trillion bilingual tokens.

866.9K Pulls 64 Tags Updated 2 years ago

A strong, economical, and efficient Mixture-of-Experts language model.

866.3K Pulls 34 Tags Updated 1 year ago

DeepSeek-V3.1-Terminus is a hybrid model that supports both thinking mode and non-thinking mode.

534.4K Pulls 8 Tags Updated 6 months ago

DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.

375.6K Pulls 3 Tags Updated 4 months ago

An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.

234.7K Pulls 7 Tags Updated 1 year ago

DeepSeek-V3.2 is a model that harmonizes high computational efficiency with superior reasoning and agent performance.

66.2K Pulls 1 Tag Updated 3 months ago