Qwen3-VL

October 14, 2025

An illustration of the Qwen capybara taking a picture of Ollama drinking water out of a fancy wine glass.

Qwen3-VL, the most powerful vision language model in the Qwen series, is now available on Ollama’s cloud. The models will be made available locally soon.

Model capabilities

Get started

  1. Download Ollama

  2. Run the model

ollama run qwen3-vl:235b-cloud   

Prompt the model with a message and one or more image paths. You can include multiple images, and dragging and dropping an image into the terminal inserts its file path automatically.

demo CLI
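For example, a session might look like the following; ./photo.jpg is an illustrative path:

ollama run qwen3-vl:235b-cloud
>>> What is this? ./photo.jpg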

Examples

Flower identification

flower

Prompt: What is this flower? Is it poisonous to cats?

flower identification example

Night market menu

Prompt: Show me the menu in English!

night market menu example

Basic linear algebra

math question

Prompt: what’s the answer?

math example

Using Qwen3-VL 235B

You can get started with the full model for free on Ollama’s cloud, using Ollama’s CLI, API, and JavaScript and Python libraries.

JavaScript Library

Install Ollama’s JavaScript library

npm i ollama 

Pull the model

ollama pull qwen3-vl:235b-cloud

Example non-streaming output with image

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'qwen3-vl:235b-cloud',
  messages: [{
    role: 'user',
    content: 'What is this?',
    // Local image paths are attached to the message
    images: ['./image.jpg'],
  }],
})
console.log(response.message.content)

Example streaming the output with image

import ollama from 'ollama'

const message = {
  role: 'user',
  content: 'What is this?',
  images: ['./image.jpg'],
}
const response = await ollama.chat({
  model: 'qwen3-vl:235b-cloud',
  messages: [message],
  stream: true,
})
// Print each part of the response as it streams in
for await (const part of response) {
  process.stdout.write(part.message.content)
}

Ollama’s JavaScript library page on GitHub has more examples and API documentation.

Python Library

Install Ollama’s Python library

pip install ollama

Pull the model

ollama pull qwen3-vl:235b-cloud

Example non-streaming output with image

from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(
    model='qwen3-vl:235b-cloud',
    messages=[
        {
            'role': 'user',
            'content': 'What is this?',
            # Local image paths are attached to the message
            'images': ['./image.jpg'],
        },
    ],
)
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)

Example streaming the output with image

from ollama import chat

stream = chat(
    model='qwen3-vl:235b-cloud',
    messages=[{
        'role': 'user',
        'content': 'What is this?',
        'images': ['./image.jpg'],
    }],
    stream=True,
)

# Print each chunk of the response as it streams in
for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)

Ollama’s Python library page on GitHub has more examples and API documentation.

API

The model can also be accessed directly through ollama.com’s API.

  1. Generate an API key from Ollama.

  2. Set the OLLAMA_API_KEY environment variable to your API key.

export OLLAMA_API_KEY=your_api_key

  3. Generate a response using the API, for example as sketched below.
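As a minimal sketch, assuming the cloud host exposes the same /api/chat endpoint as a local Ollama server and accepts the key as a Bearer token, a request from Python with the requests library might look like this (the image path is illustrative):

import base64
import os

import requests

# The /api/chat endpoint expects images as base64-encoded strings
with open('image.jpg', 'rb') as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    'https://ollama.com/api/chat',
    headers={'Authorization': f"Bearer {os.environ['OLLAMA_API_KEY']}"},
    json={
        'model': 'qwen3-vl:235b-cloud',
        'messages': [{
            'role': 'user',
            'content': 'What is this?',
            'images': [image_b64],
        }],
        # Request a single JSON response instead of a stream of chunks
        'stream': False,
    },
)
print(response.json()['message']['content'])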

OpenAI Compatible API

Ollama provides OpenAI-compatible API endpoints that support chat completions, completions, and embeddings.

  1. Generate an API key from Ollama.

  2. Set base_url to https://ollama.com/v1 and api_key to the key generated above, as in the sketch below.
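A minimal sketch using the official openai Python package, assuming the compatibility layer accepts images as base64 data URLs in the OpenAI image_url format, as Ollama’s local OpenAI-compatible endpoint does; the image path is illustrative:

import base64
import os

from openai import OpenAI

client = OpenAI(
    base_url='https://ollama.com/v1',
    api_key=os.environ['OLLAMA_API_KEY'],
)

# Images are passed the OpenAI way, as a base64 data URL
with open('image.jpg', 'rb') as f:
    image_b64 = base64.b64encode(f.read()).decode()

completion = client.chat.completions.create(
    model='qwen3-vl:235b-cloud',
    messages=[{
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'What is this?'},
            {'type': 'image_url',
             'image_url': {'url': f'data:image/jpeg;base64,{image_b64}'}},
        ],
    }],
)
print(completion.choices[0].message.content)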