
-
deepseek-r1-uncensored
DeepSeek's first-generation reasoning models, with performance comparable to OpenAI o1, including six dense models distilled from DeepSeek-R1 based on Llama and Qwen.
742 Pulls 4 Tags Updated 6 months ago
-
qwen2.5-1m
The long-context version of Qwen2.5, supporting 1M-token context lengths
7b 14b 719 Pulls 2 Tags Updated 6 months ago
-
deepseek-v3-fast
Single-file version (Dynamic Quants). A strong Mixture-of-Experts (MoE) language model with 671B total parameters, 37B of which are activated per token.
84 Pulls 4 Tags Updated 6 months ago
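A quick sketch of what the MoE figures in the entry above imply: with 671B total parameters but only 37B activated per token, only a small fraction of the network participates in each forward step. The numbers below come straight from the listing; the calculation itself is just illustrative arithmetic.

```python
# Active-parameter fraction for the MoE model described above
# (figures from the listing: 671B total, 37B activated per token).
total_params_b = 671   # total parameters, in billions
active_params_b = 37   # parameters activated per token, in billions

active_fraction = active_params_b / total_params_b
print(f"{active_fraction:.1%} of parameters active per token")
```

So despite its 671B size, each token is processed by roughly 5.5% of the model's weights, which is the main reason MoE models of this scale remain tractable to serve.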