NVIDIA Nemotron 3 Super is a 120B open MoE model that activates just 12B parameters per token, delivering high compute efficiency and accuracy for complex multi-agent applications.
180.3K Pulls 7 Tags Updated 3 weeks ago
A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA.
3.8M Pulls 17 Tags Updated 8 months ago
Stable LM 2 is a state-of-the-art language model available in 1.6B and 12B parameter sizes, trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch.
784.9K Pulls 84 Tags Updated 1 year ago
A 123B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
173K Pulls 6 Tags Updated 3 months ago
gpt-oss-safeguard-20b and gpt-oss-safeguard-120b are safety reasoning models built upon gpt-oss.
119.2K Pulls 3 Tags Updated 5 months ago
Sailor2 is a family of multilingual language models made for South-East Asia, available in 1B, 8B, and 20B parameter sizes.
323K Pulls 13 Tags Updated 1 year ago
An experimental 1.1B parameter model trained on the new Dolphin 2.8 dataset by Eric Hartford and based on TinyLlama.
580.5K Pulls 18 Tags Updated 2 years ago
An optimized version of Google's TranslateGemma-12B-it (Gemma 3) designed for high-fidelity translation. This build hard-codes the temperature to 0.1 and adds English Anchor support to reduce output redundancy and improve accuracy; a minimal configuration sketch follows this entry.
32.9K Pulls 1 Tag Updated 2 months ago
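Since the entry above describes a build-time configuration rather than a new architecture, here is a minimal Modelfile sketch of how such a build could pin its sampling settings in Ollama. The base tag gemma3:12b and the system prompt are illustrative assumptions, not the actual TranslateGemma build.

    # Minimal Modelfile sketch (base tag and prompt are illustrative assumptions)
    FROM gemma3:12b
    # Hard-code a low temperature so the model decodes near-deterministically
    PARAMETER temperature 0.1
    # Illustrative "English anchor" prompt: translate via an intermediate English rendering
    SYSTEM """You are a translation engine. First render the source text into English
    as an anchor, then translate that English anchor into the requested target
    language. Output only the final translation."""

A file like this is baked into a local tag with ollama create <name> -f Modelfile, after which ollama run <name> always uses the pinned temperature regardless of client defaults.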
GGUF build of LFM2.5-1.2B-MEGABRAIN2-Thinking-Kimi-V2-DISTILL. Source: https://huggingface.co/mradermacher/LFM2.5-1.2B-MEGABRAIN2-Thinking-Kimi-V2-DISTILL-GGUF
82 Pulls 1 Tag Updated 3 weeks ago
A raw, completely uncensored creative partner based on Gemma 3 (12B). Fine-tuned on a custom 300M-token dataset for SEO, explicit content generation, and structured songwriting (Suno.ai). Hallucinations are a feature, not a bug.
1,319 Pulls 1 Tag Updated 4 months ago
LFM2.5 is a new family of hybrid models designed for on-device deployment. It builds on the LFM2 architecture with extended pre-training and reinforcement learning.
931 Pulls 4 Tags Updated 2 months ago
LFM2.5 is a new family of hybrid models designed for on-device deployment.
285 Pulls 2 Tags Updated 2 weeks ago
A suite of weighted quantizations for the Magnum V4 models, available in 9B, 12B, 22B, and 27B parameter sizes. Made by Anthracite-org (Hugging Face).
625 Pulls 26 Tags Updated 3 months ago