FunctionGemma is a specialized version of Google's Gemma 3 270M model fine-tuned explicitly for function calling.
50.2K Pulls 4 Tags Updated 1 month ago
A 123B model that excels at using tools to explore codebases, edit multiple files, and power software engineering agents.
58.8K Pulls 6 Tags Updated 1 month ago
Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.
45.9K Pulls 2 Tags Updated 1 month ago
The Cogito v2.1 LLMs are instruction tuned generative models. All models are released under MIT license for commercial use.
53.1K Pulls 6 Tags Updated 2 months ago
gpt-oss-safeguard-20b and gpt-oss-safeguard-120b are safety reasoning models built upon gpt-oss.
54.6K Pulls 3 Tags Updated 3 months ago
Advancing the Coding Capability
20.8K Pulls 1 Tag Updated 1 month ago
nomic-embed-text-v2-moe is a multilingual MoE text embedding model that excels at multilingual retrieval.
24.8K Pulls 1 Tag Updated 1 month ago
Advanced agentic, reasoning and coding capabilities.
52.3K Pulls 1 Tag Updated 3 months ago
MiniMax M2 is a high-efficiency large language model built for coding and agentic workflows.
44K Pulls 1 Tag Updated 3 months ago
DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance.
18K Pulls 1 Tag Updated 1 month ago
Exceptional multilingual capabilities to elevate code engineering
11.4K Pulls 1 Tag Updated 1 month ago
Kimi K2 Thinking, Moonshot AI's best open-source thinking model.
24.2K Pulls 1 Tag Updated 2 months ago
A state-of-the-art mixture-of-experts (MoE) language model. Kimi K2-Instruct-0905 demonstrates significant improvements in performance on public benchmarks and real-world coding agent tasks.
33.1K Pulls 1 Tag Updated 4 months ago
A general-purpose multimodal mixture-of-experts model for production-grade tasks and enterprise workloads.
13.3K Pulls 1 Tag Updated 1 month ago
The current, most capable model that runs on a single GPU.
30.8M Pulls 29 Tags Updated 1 month ago
Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.
18.1M Pulls 58 Tags Updated 3 months ago
An update to Mistral Small that improves function calling and instruction following and produces fewer repetition errors.
1.2M Pulls 5 Tags Updated 7 months ago
Gemma 3n models are designed for efficient execution on everyday devices such as laptops, tablets or phones.
1.1M Pulls 9 Tags Updated 7 months ago
The flagship vision-language model of Qwen and a significant leap from the previous Qwen2-VL.
1.2M Pulls 17 Tags Updated 8 months ago
Magistral is a small, efficient reasoning model with 24B parameters.
1M Pulls 5 Tags Updated 7 months ago