🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.
14.1M Pulls 98 Tags Updated 2 years ago
A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks.
2.3M Pulls 4 Tags Updated 2 years ago
A new small LLaVA model fine-tuned from Phi 3 Mini.
283.6K Pulls 4 Tags Updated 2 years ago
Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.
114.8M Pulls 93 Tags Updated 1 year ago
Meta Llama 3: The most capable openly available LLM to date
24M Pulls 68 Tags Updated 2 years ago
BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.
849.4K Pulls 17 Tags Updated 2 years ago
Dolphin 2.9 is a new model with 8B and 70B sizes by Eric Hartford based on Llama 3 that has a variety of instruction, conversational, and coding skills.
1.9M Pulls 53 Tags Updated 2 years ago
A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use/function calling.
937.9K Pulls 33 Tags Updated 1 year ago
A model from NVIDIA based on Llama 3 that excels at conversational question answering (QA) and retrieval-augmented generation (RAG).
965.4K Pulls 35 Tags Updated 2 years ago
Conversational model based on Llama 2 that performs competitively on various benchmarks.
911.7K Pulls 80 Tags Updated 2 years ago
Laguna XS.2 is a 33B total parameter Mixture-of-Experts model with 3B activated parameters per token designed for agentic coding and long-horizon work on a local machine.
12.5K Pulls 7 Tags Updated 3 weeks ago
Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes.
4.5M Pulls 9 Tags Updated 12 months ago
SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters.
3.4M Pulls 49 Tags Updated 1 year ago
A Llama 2-based model fine-tuned to improve Chinese dialogue ability.
1M Pulls 35 Tags Updated 2 years ago
NVIDIA Nemotron 3 Nano Omni is a multimodal large language model that unifies video, audio, image, and text understanding to support enterprise-grade Q&A, summarization, transcription, and document intelligence workflows.
588.8K Pulls 4 Tags Updated 3 weeks ago
DeepCoder is a fully open-source 14B coder model at o3-mini level, with a 1.5B version also available.
870.3K Pulls 9 Tags Updated 1 year ago
An advanced language model crafted with 2 trillion bilingual tokens.
1.1M Pulls 64 Tags Updated 2 years ago
A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chatbots.
1.1M Pulls 38 Tags Updated 2 years ago
Sailor2 is a family of multilingual language models made for South-East Asia. Available in 1B, 8B, and 20B parameter sizes.
390.8K Pulls 13 Tags Updated 1 year ago
JoyCaption is an image captioning Visual Language Model (VLM) built from the ground up as a free, open, and uncensored model for the community to use in training diffusion models.
4,455 Pulls 4 Tags Updated 9 months ago