Laguna XS.2 is a 33B-total-parameter Mixture-of-Experts model that activates 3B parameters per token, designed for agentic coding and long-horizon work on a local machine.
7,300 Pulls 7 Tags Updated 1 week ago
This model extends Llama 3 8B's context length from 8K to over 1M tokens.
935.2K Pulls 35 Tags Updated 2 years ago
LFM2 is a family of hybrid models designed for on-device deployment. LFM2-24B-A2B is the largest model in the family, scaling the architecture to 24 billion parameters while keeping inference efficient.
1.1M Pulls 6 Tags Updated 2 months ago
Llama 3.1 is a state-of-the-art model from Meta, available in 8B, 70B, and 405B parameter sizes.
114.1M Pulls 93 Tags Updated 1 year ago
A new state-of-the-art 70B model. Llama 3.3 70B offers performance similar to the Llama 3.1 405B model.
3.8M Pulls 14 Tags Updated 1 year ago
NVIDIA Nemotron 3 Nano Omni is a multimodal large language model that unifies video, audio, image, and text understanding to support enterprise-grade Q&A, summarization, transcription, and document intelligence workflows.
569.9K Pulls 4 Tags Updated 1 week ago
Qwen3-Coder-Next is a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development.
1.3M Pulls 4 Tags Updated 3 months ago
LFM2.5 is a new family of hybrid models designed for on-device deployment.
1.2M Pulls 5 Tags Updated 3 months ago
MiniMax-M2.5 is a state-of-the-art large language model designed for real-world productivity and coding tasks.
508.1K Pulls 1 Tag Updated 2 months ago
Olmo is a series of open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.
423.3K Pulls 15 Tags Updated 4 months ago
DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
439K Pulls 3 Tags Updated 5 months ago
A general-purpose multimodal mixture-of-experts model for production-grade tasks and enterprise workloads.
49.1K Pulls 1 Tag Updated 5 months ago
The most powerful vision-language model in the Qwen model family to date.
3.7M Pulls 59 Tags Updated 6 months ago
The flagship vision-language model of the Qwen family, and a significant leap over the previous Qwen2-VL.
1.9M Pulls 17 Tags Updated 11 months ago
MiniMax M2 is a high-efficiency large language model built for coding and agentic workflows.
441.4K Pulls 1 Tag Updated 6 months ago
A state-of-the-art mixture-of-experts (MoE) language model. Kimi K2-Instruct-0905 demonstrates significant improvements in performance on public benchmarks and real-world coding agent tasks.
65.4K Pulls 1 Tag Updated 7 months ago
Meta's Llama 3.2 goes small with 1B and 3B models.
68.4M Pulls 63 Tags Updated 1 year ago
Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.
28.5M Pulls 58 Tags Updated 7 months ago
Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The models support context lengths of up to 128K tokens and are multilingual.
29.5M Pulls 133 Tags Updated 1 year ago