Ollama
Tools, Vision models on Ollama.
  • glm-ocr

    GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.

    vision tools

    22.3K Pulls · 3 Tags · Updated 1 week ago

  • qwen3-vl

    The most powerful vision-language model in the Qwen model family to date.

    vision tools thinking cloud 2b 4b 8b 30b 32b 235b

    1.4M Pulls · 59 Tags · Updated 3 months ago

  • ministral-3

    The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware.

    vision tools cloud 3b 8b 14b

    400.7K Pulls · 16 Tags · Updated 2 months ago

  • devstral-small-2

    A 24B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.

    vision tools cloud 24b

    138.3K Pulls · 6 Tags · Updated 1 month ago

  • mistral-small3.2

    An update to Mistral Small that improves function calling and instruction following and reduces repetition errors.

    vision tools 24b

    1.2M Pulls · 5 Tags · Updated 7 months ago

  • llama4

    Meta's latest collection of multimodal models.

    vision tools 16x17b 128x17b

    1.2M Pulls · 11 Tags · Updated 7 months ago

  • granite3.2-vision

    A compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.

    vision tools 2b

    729.1K Pulls · 5 Tags · Updated 11 months ago

  • mistral-small3.1

    Building on Mistral Small 3, Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and extends long-context capability to 128k tokens without compromising text performance.

    vision tools 24b

    583.3K Pulls · 5 Tags · Updated 10 months ago

© 2026 Ollama