This model extends Llama 3 8B's context length from 8K to over 1M tokens.
121.5K Pulls 35 Tags Updated 1 year ago
Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.
105.5M Pulls 93 Tags Updated 11 months ago
A new state-of-the-art 70B model. Llama 3.3 70B offers performance similar to the Llama 3.1 405B model.
2.7M Pulls 14 Tags Updated 11 months ago
The most powerful vision-language model in the Qwen model family to date.
112.1K Pulls 59 Tags Updated 1 week ago
Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.
12.8M Pulls 58 Tags Updated 4 weeks ago
Meta's Llama 3.2 goes small with 1B and 3B models.
44.3M Pulls 63 Tags Updated 1 year ago
Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The models support a context length of up to 128K tokens and offer multilingual support.
16.3M Pulls 133 Tags Updated 1 year ago
Meta Llama 3: the most capable openly available LLM to date.
11.7M Pulls 68 Tags Updated 1 year ago
🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.
11.4M Pulls 98 Tags Updated 1 year ago
The latest series of code-specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.
8.2M Pulls 199 Tags Updated 5 months ago
State-of-the-art large embedding model from mixedbread.ai.
5.3M Pulls 4 Tags Updated 1 year ago
Qwen 1.5 is a series of large language models by Alibaba Cloud, spanning from 0.5B to 110B parameters.
5M Pulls 379 Tags Updated 1 year ago
Llama 2 is a collection of foundation language models ranging from 7B to 70B parameters.
4.5M Pulls 102 Tags Updated 1 year ago
Qwen2 is a new series of large language models from the Alibaba Group.
4.4M Pulls 97 Tags Updated 1 year ago
A large language model that can use text prompts to generate and discuss code.
3.3M Pulls 199 Tags Updated 1 year ago
Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes.
2.9M Pulls 9 Tags Updated 5 months ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
2.7M Pulls 5 Tags Updated 9 months ago
Mistral Small 3 sets a new benchmark in the "small" large language model category, below 70B parameters.
2.1M Pulls 21 Tags Updated 9 months ago
SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters.
2.1M Pulls 49 Tags Updated 1 year ago
A LLaVA model fine-tuned from Llama 3 Instruct, achieving better scores on several benchmarks.
2M Pulls 4 Tags Updated 1 year ago