llava

🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.

vision 7b 13b 34b

14.1M Pulls 98 Tags Updated 2 years ago

llava-llama3

A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks.

vision 8b

2.3M Pulls 4 Tags Updated 2 years ago

llava-phi3

A new small LLaVA model fine-tuned from Phi 3 Mini.

vision 3.8b

286.2K Pulls 4 Tags Updated 2 years ago

llama3.1

Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.

tools 8b 70b 405b

115.3M Pulls 93 Tags Updated 1 year ago

llama3

Meta Llama 3: The most capable openly available LLM to date

8b 70b

24.2M Pulls 68 Tags Updated 2 years ago

dolphin-llama3

Dolphin 2.9 is a new model with 8B and 70B sizes by Eric Hartford based on Llama 3 that has a variety of instruction, conversational, and coding skills.

8b 70b

1.9M Pulls 53 Tags Updated 2 years ago

xwinlm

Conversational model based on Llama 2 that performs competitively on various benchmarks.

7b 13b

914.2K Pulls 80 Tags Updated 2 years ago

llama3-groq-tool-use

A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use/function calling.

tools 8b 70b

941.5K Pulls 33 Tags Updated 1 year ago

llama3-chatqa

A model from NVIDIA based on Llama 3 that excels at conversational question answering (QA) and retrieval-augmented generation (RAG).

8b 70b

969.1K Pulls 35 Tags Updated 2 years ago

bakllava

BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.

vision 7b

851.7K Pulls 17 Tags Updated 2 years ago

smollm2

SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters.

tools 135m 360m 1.7b

3.4M Pulls 49 Tags Updated 1 year ago

llama3.2-vision

Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes.

vision 11b 90b

4.6M Pulls 9 Tags Updated 1 year ago

llama2-chinese

Llama 2-based model fine-tuned to improve Chinese dialogue ability.

7b 13b

1M Pulls 35 Tags Updated 2 years ago

laguna-xs.2

Laguna XS.2 is a 33B total parameter Mixture-of-Experts model with 3B activated parameters per token designed for agentic coding and long-horizon work on a local machine.

tools thinking

14.2K Pulls 7 Tags Updated 1 month ago

nemotron3

NVIDIA Nemotron 3 Nano Omni is a multimodal large language model that unifies video, audio, image, and text understanding to support enterprise-grade Q&A, summarization, transcription, and document intelligence workflows.

vision tools thinking audio 33b

596.5K Pulls 4 Tags Updated 1 month ago

deepcoder

DeepCoder is a fully open-source 14B coder model at o3-mini level, with a 1.5B version also available.

1.5b 14b

873.4K Pulls 9 Tags Updated 1 year ago

deepseek-llm

An advanced language model crafted with 2 trillion bilingual tokens.

7b 67b

1.1M Pulls 64 Tags Updated 2 years ago

falcon

A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chatbots.

7b 40b 180b

1.1M Pulls 38 Tags Updated 2 years ago

pugmail/phiLlava

vision

6 Pulls 1 Tag Updated 1 year ago

aha2025/llama-joycaption-beta-one-hf-llava

JoyCaption is an image captioning Visual Language Model (VLM) built from the ground up as a free, open, and uncensored model for the community to use in training diffusion models.

vision

4,578 Pulls 4 Tags Updated 10 months ago