granite3.2-vision:2b

927.5K Downloads Updated 1 year ago

A compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.

vision tools 2b

ollama run granite3.2-vision:2b

curl http://localhost:11434/api/chat \
  -d '{
    "model": "granite3.2-vision:2b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='granite3.2-vision:2b',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'granite3.2-vision:2b',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)

Details


3be41a661804 · 2.4GB

model

arch granite · parameters 2.53B · quantization Q4_K_M · 1.5GB

projector

arch clip · parameters 442M · quantization F16 · 893MB

template

{{- /* Tools */ -}} {{- if .Tools -}} <|start_of_role|>available_tools<|end_of_role|> {{- range $ind

1.3kB

system

A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful,

154B

license

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR US

11kB

params

{ "num_ctx": 16384, "temperature": 0 }
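The defaults above can be overridden per request through the `options` field of the chat endpoint. The following is a minimal sketch, assuming a local Ollama server at the address used in the curl example; the helper names `build_chat_payload` and `send_chat` are hypothetical and chosen only for illustration.

```python
import json
import urllib.request

MODEL = "granite3.2-vision:2b"


def build_chat_payload(prompt: str, num_ctx: int = 16384, temperature: float = 0) -> dict:
    """Build a non-streaming /api/chat request body with explicit options.

    The default values mirror the params shipped with the model.
    """
    return {
        "model": MODEL,
        "stream": False,  # ask for one JSON object instead of a stream of chunks
        "messages": [{"role": "user", "content": prompt}],
        "options": {"num_ctx": num_ctx, "temperature": temperature},
    }


def send_chat(payload: dict, url: str = "http://localhost:11434/api/chat") -> str:
    """POST the payload to a running Ollama server and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())["message"]["content"]
```

With a server running, `send_chat(build_chat_payload("Hello!", num_ctx=8192))` would send the same greeting as the examples above with a smaller context window.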

34B

Readme

Note: this model requires Ollama 0.5.13 or later.

A compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more. The model was trained on a curated instruction-following dataset that combines diverse public datasets with synthetic datasets tailored to a wide range of document understanding and general image tasks. It was produced by fine-tuning a Granite large language model on both image and text modalities.
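Since the model is aimed at document images, a request will normally attach an image rather than send text alone. The following is a minimal sketch using the `ollama` Python package, whose chat messages accept an `images` list; the helper names and the file name `invoice.png` are hypothetical, and the prompt is only an example.

```python
MODEL = "granite3.2-vision:2b"


def build_messages(prompt: str, image_path: str) -> list[dict]:
    """Build a single-turn chat message that attaches one image."""
    return [
        {
            "role": "user",
            "content": prompt,
            # The ollama client accepts file paths, raw bytes, or base64 strings here.
            "images": [image_path],
        }
    ]


def extract_from_document(image_path: str, prompt: str) -> str:
    """Send the image and prompt to a local Ollama server and return the reply."""
    # Imported lazily so build_messages can be used without the package installed.
    from ollama import chat

    response = chat(model=MODEL, messages=build_messages(prompt, image_path))
    return response.message.content
```

For example, `extract_from_document("invoice.png", "List every row of the table in this image.")` would return the model's text answer, assuming the file exists and the model has been pulled.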

References

Hugging Face