-
llava
🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.
vision 7b 13b 34b · 2.1M Pulls · 98 Tags · Updated 10 months ago
-
llava-llama3
A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks.
vision 8b · 231.7K Pulls · 4 Tags · Updated 7 months ago
-
bakllava
BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.
vision 7b · 100.2K Pulls · 17 Tags · Updated 12 months ago
-
llava-phi3
A new small LLaVA model fine-tuned from Phi 3 Mini.
vision 3.8b · 59.2K Pulls · 4 Tags · Updated 7 months ago