Ollama
Vision · Ollama
Search for models on Ollama.
  • llama3.2-vision

    Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes.

    vision 11b 90b

    4.1M Pulls · 9 Tags · Updated 10 months ago

  • qwen3-vl

    The most powerful vision-language model in the Qwen model family to date.

    vision tools thinking cloud 2b 4b 8b 30b 32b 235b

    2.5M Pulls · 59 Tags · Updated 4 months ago

  • kimi-k2.5

    Kimi K2.5 is an open-source, native multimodal agentic model that seamlessly integrates vision and language understanding with advanced agentic capabilities, supporting both instant and thinking modes as well as conversational and agentic paradigms.

    cloud

    187.9K Pulls · 1 Tag · Updated 1 month ago

  • deepseek-ocr

    DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.

    vision 3b

    362K Pulls · 3 Tags · Updated 4 months ago

  • qwen2.5vl

    Qwen's flagship vision-language model, and a significant leap over the previous Qwen2-VL.

    vision 3b 7b 32b 72b

    1.5M Pulls · 17 Tags · Updated 10 months ago

  • mistral-small3.1

    Building upon Mistral Small 3, Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance.

    vision tools 24b

    669.6K Pulls · 5 Tags · Updated 11 months ago

  • llava

    🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.

    vision 7b 13b 34b

    13.4M Pulls · 98 Tags · Updated 2 years ago

  • minicpm-v

    A series of multimodal LLMs (MLLMs) designed for vision-language understanding.

    vision 8b

    4.8M Pulls · 17 Tags · Updated 1 year ago

  • granite3.2-vision

    A compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.

    vision tools 2b

    838.6K Pulls · 5 Tags · Updated 1 year ago

  • moondream

    moondream2 is a small vision language model designed to run efficiently on edge devices.

    vision 1.8b

    858.2K Pulls · 18 Tags · Updated 1 year ago

  • nemotron-cascade-2

    An open 30B MoE model from NVIDIA with 3B activated parameters that delivers strong reasoning and agentic capabilities.

    tools thinking 30b

    24.6K Pulls · 3 Tags · Updated 6 days ago

  • VisionVTAI/Aria-sama

    tools

    20 Pulls · 1 Tag · Updated 6 months ago

  • vitali87/shell-commands-qwen2-1.5b

    A model fine-tuned on the Linux Command Library (https://linuxcommandlibrary.com/basic/oneliners).

    355 Pulls · 1 Tag · Updated 1 year ago

  • vickiovikthompson/uncensored-qwen

    An uncensored version of Qwen 2.5.

    tools

    60 Pulls · 1 Tag · Updated 5 days ago

  • deepseek-v3.2

    DeepSeek-V3.2 is a model that harmonizes high computational efficiency with superior reasoning and agent performance.

    cloud

    59.8K Pulls · 1 Tag · Updated 3 months ago

  • vishalraj/dark-champion-21b

    tools

    37 Pulls · 1 Tag · Updated 1 month ago

  • visharxd/coupon-generator

    tools

    2 Pulls · 1 Tag · Updated 1 year ago

  • ViperAI/viper-coder.v.01

    ViperCoder is an advanced developer-focused AI built on a modern code model and optimized for real-world software engineering.

    tools

    108 Pulls · 1 Tag · Updated 1 month ago

  • vanilj/reflection-70b-iq2_xxs

    Reflection Llama-3.1 70B is (currently) the world's top open-source LLM, trained with a new technique called Reflection-Tuning that teaches an LLM to detect mistakes in its reasoning and correct course.

    342 Pulls · 1 Tag · Updated 1 year ago

  • sorc/qwen3.5-instruct-uncensored

    Q8_0 · non-thinking · uncensored · non-vision

    tools 2b 4b 9b

    2,350 Pulls · 4 Tags · Updated 1 week ago

© 2026 Ollama