MiniMax M3: Coding & Agentic Frontier. 1M context window. Native Multimodality.
5,273 Pulls 1 Tag Updated 15 hours ago
Gemma 4 models are designed to deliver frontier-level performance at each size. They are well-suited for reasoning, agentic workflows, coding, and multimodal understanding.
11.4M Pulls 34 Tags Updated 1 week ago
GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin.
2.2M Pulls 1 Tag Updated 1 month ago
MiniMax's M2-series model for coding, agentic workflows, and professional productivity.
2.2M Pulls 1 Tag Updated 2 months ago
MiniMax-M2.5 is a state-of-the-art large language model designed for real-world productivity and coding tasks.
2.2M Pulls 1 Tag Updated 3 months ago
Advancing the Coding Capability
2.2M Pulls 1 Tag Updated 5 months ago
Qwen3-Coder-Next is a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development.
1.4M Pulls 4 Tags Updated 3 months ago
Kimi K2.6 is an open-source, native multimodal agentic model that advances practical capabilities in long-horizon coding, coding-driven design, proactive autonomous execution, and swarm-based task orchestration.
277.6K Pulls 1 Tag Updated 1 month ago
Alibaba's performant long-context models for agentic and coding tasks.
5.8M Pulls 10 Tags Updated 8 months ago
MiniMax M2 is a high-efficiency large language model built for coding and agentic workflows.
2.2M Pulls 1 Tag Updated 7 months ago
Advanced agentic, reasoning and coding capabilities.
Exceptional multilingual capabilities to elevate code engineering.
2.1M Pulls 1 Tag Updated 5 months ago
24B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
850.7K Pulls 6 Tags Updated 5 months ago
123B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
230.2K Pulls 6 Tags Updated 5 months ago
DeepSeek-V4-Pro is a frontier Mixture-of-Experts model with a 1M-token context window and three reasoning modes.
91.5K Pulls 1 Tag Updated 1 month ago
The current, most capable model that runs on a single GPU.
37.3M Pulls 29 Tags Updated 5 months ago
Gemma 4 31B (Google DeepMind) with thinking mode enabled. Best for complex reasoning, math, coding, and multi-step analysis. Knowledge cutoff: January 2025. Sampling: temperature 1.0 / top_p 0.95 / top_k 64.
404 Pulls 1 Tag Updated 1 month ago
A reasoning-focused refinement of gemma4:31B, optimized for epistemic honesty, safety, and human-aligned decision-making. Designed to prioritize factual accuracy over creative embellishment and to maintain transparency in uncertain contexts.
74 Pulls 1 Tag Updated 1 month ago