This model extends Llama-3 8B's context length from 8K to over 1M tokens.
141.6K Pulls 35 Tags Updated 1 year ago
Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.
107.7M Pulls 93 Tags Updated 1 year ago
New state-of-the-art 70B model. Llama 3.3 70B offers performance similar to the Llama 3.1 405B model.
2.8M Pulls 14 Tags Updated 1 year ago
Olmo is a series of Open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.
5,835 Pulls 15 Tags Updated 3 days ago
The most powerful vision-language model in the Qwen model family to date.
796.3K Pulls 59 Tags Updated 1 month ago
Meta's Llama 3.2 goes small with 1B and 3B models.
50M Pulls 63 Tags Updated 1 year ago
Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The models support context lengths of up to 128K tokens and are multilingual.
18.2M Pulls 133 Tags Updated 1 year ago
Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.
15.2M Pulls 58 Tags Updated 2 months ago
Meta Llama 3: The most capable openly available LLM to date.
13M Pulls 68 Tags Updated 1 year ago
🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.
11.9M Pulls 98 Tags Updated 1 year ago
The latest series of Code-Specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.
9.3M Pulls 199 Tags Updated 6 months ago
State-of-the-art large embedding model from mixedbread.ai.
5.7M Pulls 4 Tags Updated 1 year ago
Qwen 1.5 is a series of large language models by Alibaba Cloud, ranging from 0.5B to 110B parameters.
5.1M Pulls 379 Tags Updated 1 year ago
Llama 2 is a collection of foundation language models ranging from 7B to 70B parameters.
4.8M Pulls 102 Tags Updated 1 year ago
Qwen2 is a new series of large language models from Alibaba Group.
4.5M Pulls 97 Tags Updated 1 year ago
A large language model that can use text prompts to generate and discuss code.
3.7M Pulls 199 Tags Updated 1 year ago
Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes.
3.3M Pulls 9 Tags Updated 7 months ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
3M Pulls 5 Tags Updated 11 months ago
SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters.
2.2M Pulls 49 Tags Updated 1 year ago
Mistral Small 3 sets a new benchmark in the “small” Large Language Models category below 70B.
2.2M Pulls 21 Tags Updated 10 months ago