This model extends Llama 3 8B's context length from 8K to over 1M tokens.
114.1K Pulls 35 Tags Updated 1 year ago
Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.
101.9M Pulls 93 Tags Updated 9 months ago
A new state-of-the-art 70B model. Llama 3.3 70B offers performance similar to the Llama 3.1 405B model.
2.4M Pulls 14 Tags Updated 9 months ago
Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.
7.8M Pulls 56 Tags Updated 1 month ago
Meta's Llama 3.2 goes small with 1B and 3B models.
34M Pulls 63 Tags Updated 11 months ago
Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The models support context lengths of up to 128K tokens and are multilingual.
13.6M Pulls 133 Tags Updated 11 months ago
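Since this listing comes from the Ollama model library, a long context window like Qwen2.5's 128K is typically requested per call through the num_ctx option of Ollama's REST API. A minimal sketch, assuming a local Ollama server on the default port and that the model is pulled under the tag qwen2.5 (the tag is an assumption; the listing above omits model names):

```python
import requests

# Ask a locally running Ollama server for a completion with an enlarged
# context window. "qwen2.5" is an assumed tag; substitute whatever tag
# the model is actually published under.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen2.5",
        "prompt": "Summarize the following document: ...",
        "stream": False,
        "options": {"num_ctx": 32768},  # raise toward 128K as memory allows
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```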
Meta Llama 3: The most capable openly available LLM to date.
10.8M Pulls 68 Tags Updated 1 year ago
🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.
9.4M Pulls 98 Tags Updated 1 year ago
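Multimodal entries such as LLaVA accept images alongside the text prompt; in Ollama's API, images are passed as base64 strings in an images array. A minimal sketch, assuming the tag llava and a local file photo.jpg (both are assumptions for illustration):

```python
import base64
import requests

# Read a local image and send it with the prompt. The "llava" tag and
# the file name are placeholders, not confirmed by the listing above.
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("ascii")

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llava",
        "prompt": "What is in this picture?",
        "images": [image_b64],
        "stream": False,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```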
The latest series of code-specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing.
6.9M Pulls 199 Tags Updated 3 months ago
Qwen 1.5 is a series of large language models by Alibaba Cloud, spanning 0.5B to 110B parameters.
4.9M Pulls 379 Tags Updated 1 year ago
State-of-the-art large embedding model from mixedbread.ai
4.8M Pulls 4 Tags Updated 1 year ago
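Embedding models such as the mixedbread.ai one are served through Ollama's embeddings endpoint rather than /api/generate. A minimal sketch, assuming the model is pulled under the tag mxbai-embed-large (an assumption; the listing above omits tags):

```python
import requests

# Request an embedding vector for a piece of text from a local Ollama
# server. The tag "mxbai-embed-large" is an assumption for illustration.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "mxbai-embed-large", "prompt": "Ollama model library"},
    timeout=60,
)
resp.raise_for_status()
vec = resp.json()["embedding"]
print(len(vec))  # dimensionality of the returned embedding
```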
Qwen2 is a new series of large language models from Alibaba Group.
4.3M Pulls 97 Tags Updated 12 months ago
Llama 2 is a collection of foundation language models ranging from 7B to 70B parameters.
4M Pulls 102 Tags Updated 1 year ago
A large language model that can use text prompts to generate and discuss code.
2.9M Pulls 199 Tags Updated 1 year ago
Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes.
2.4M Pulls 9 Tags Updated 3 months ago
A strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.
2.2M Pulls 5 Tags Updated 7 months ago
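The "671B total, 37B activated" figure reflects how sparse MoE layers work: a gating network selects a few experts per token, so only a fraction of the weights participate in any single forward pass. A toy numpy sketch of top-k routing, illustrative only and not the model's actual architecture:

```python
import numpy as np

def moe_layer(x, experts, gate_w, k=2):
    """Route a token vector through the top-k of len(experts) experts.
    Only k experts run, mirroring how a fraction of total parameters
    is activated per token in a sparse MoE model."""
    logits = x @ gate_w                    # one gating score per expert
    top = np.argsort(logits)[-k:]          # indices of the k best experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                           # softmax over selected experts
    return sum(wi * experts[i](x) for wi, i in zip(w, top))

rng = np.random.default_rng(0)
d, n = 16, 8                               # hidden size, expert count
experts = [(lambda W: (lambda x: x @ W))(rng.normal(size=(d, d)))
           for _ in range(n)]
gate_w = rng.normal(size=(d, n))
print(moe_layer(rng.normal(size=d), experts, gate_w).shape)  # (16,)
```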
Mistral Small 3 sets a new benchmark in the “small” large language model category (below 70B parameters).
1.7M Pulls 21 Tags Updated 7 months ago
SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters.
1.7M Pulls 49 Tags Updated 10 months ago
A LLaVA model fine-tuned from Llama 3 Instruct, with better scores on several benchmarks.
1.6M Pulls 4 Tags Updated 1 year ago
Uncensored Llama 2 model by George Sung and Jarrad Hope.
1.3M Pulls 34 Tags Updated 1 year ago