DeepSeek-V4-Flash is a preview release from the DeepSeek-V4 series: a Mixture-of-Experts model with 284B total parameters (13B activated), built for efficient reasoning across a 1M-token context window.
82.2K Pulls 1 Tag Updated 4 weeks ago
DeepSeek-V4-Pro is a frontier Mixture-of-Experts model with a 1M-token context window and three reasoning modes.
69.4K Pulls 1 Tag Updated 3 weeks ago
Kimi K2.6 is an open-source, native multimodal agentic model that advances practical capabilities in long-horizon coding, coding-driven design, proactive autonomous execution, and swarm-based task orchestration.
257.5K Pulls 1 Tag Updated 1 month ago
GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin.
2.1M Pulls 1 Tag Updated 1 month ago
MiniMax's M2-series model for coding, agentic workflows, and professional productivity.
2.1M Pulls 1 Tag Updated 2 months ago
Gemma 4 models are designed to deliver frontier-level performance at each size. They are well-suited for reasoning, agentic workflows, coding, and multimodal understanding.
9.8M Pulls 34 Tags Updated 22 hours ago
NVIDIA Nemotron 3 Super is a 120B open MoE model activating just 12B parameters to deliver maximum compute efficiency and accuracy for complex multi-agent applications.
2.3M Pulls 7 Tags Updated 2 months ago
Qwen 3.5 is a family of open-source multimodal models that delivers exceptional utility and performance.
12.2M Pulls 64 Tags Updated 22 hours ago
A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks.
2.2M Pulls 1 Tag Updated 3 months ago
MiniMax-M2.5 is a state-of-the-art large language model designed for real-world productivity and coding tasks.
2.1M Pulls 1 Tag Updated 3 months ago
Qwen3-Coder-Next is a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development.
1.3M Pulls 4 Tags Updated 3 months ago
Kimi K2.5 is an open-source, native multimodal agentic model that seamlessly integrates vision and language understanding with advanced agentic capabilities, instant and thinking modes, as well as conversational and agentic paradigms.
291.5K Pulls 1 Tag Updated 3 months ago
Advancing the Coding Capability
2.1M Pulls 1 Tag Updated 4 months ago
Exceptional multilingual capabilities to elevate code engineering
2M Pulls 1 Tag Updated 5 months ago
Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.
2.1M Pulls 2 Tags Updated 5 months ago
Nemotron-3-Nano is a new standard for efficient, open, and intelligent agentic models, now updated with a 4B-parameter model.
451K Pulls 9 Tags Updated 2 months ago
A 24B model that excels at using tools to explore codebases, edit multiple files, and power software engineering agents.
837.5K Pulls 6 Tags Updated 5 months ago
Rnj-1 is a family of 8B-parameter, open-weight dense models trained from scratch by Essential AI, optimized for code and STEM with capabilities on par with SOTA open-weight models.
465.6K Pulls 6 Tags Updated 5 months ago
DeepSeek-V3.2 is a model that harmonizes high computational efficiency with superior reasoning and agent performance.
A 123B model that excels at using tools to explore codebases, edit multiple files, and power software engineering agents.
222.7K Pulls 6 Tags Updated 5 months ago