146.6K 1 week ago

This is a highly specialized vision model with more than 2B parameters.

vision tools

Models


1 model

opan:latest

1.9GB · 256K context window · Text, Image · 1 week ago

Readme

Opan


Opan requires Ollama 0.12.7

Opan is a friendly, multimodal vision-language assistant built on Qwen3-VL:2b. It combines the powerful perception and reasoning abilities of Qwen3-VL with a warm, supportive conversational style designed by Shushank.

In this generation, Opan inherits major improvements in many areas: understanding and generating text, perceiving and reasoning about visual content, handling long contexts, understanding spatial relationships, interpreting documents, and assisting with everyday tasks — offering a smooth and helpful AI experience.

Models

Opan (based on Qwen3-VL:2b)

Run

ollama run aeline/opan

Key features

  • Friendly Conversational Intelligence.

Opan is tuned with a custom system prompt that makes it warm, polite, and easy to understand. It explains concepts clearly and provides thoughtful, supportive responses designed for all users.

  • Multimodal Vision-Language Understanding.

Since Opan runs on Qwen3-VL, it can understand images and text together, recognize objects, interpret documents, describe scenes, and answer questions about visual content.
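As a minimal sketch of how an image-plus-text request could be sent to this model, the snippet below builds a payload for Ollama's REST API (`POST /api/chat`), which accepts base64-encoded images alongside the prompt. The helper name `build_vision_request` and the image bytes are illustrative, not part of Opan itself:

```python
import base64
import json

# Build a chat request body for Ollama's /api/chat endpoint.
# The image bytes below are a stand-in; pass real image file contents.
def build_vision_request(prompt: str, image_bytes: bytes) -> dict:
    return {
        "model": "aeline/opan",
        "messages": [
            {
                "role": "user",
                "content": prompt,
                # Ollama expects images as base64-encoded strings.
                "images": [base64.b64encode(image_bytes).decode("ascii")],
            }
        ],
        "stream": False,
    }

payload = build_vision_request("Describe this scene.", b"\x89PNG...")
print(json.dumps(payload)[:60])
```

The resulting JSON can be POSTed to a running Ollama server at `http://localhost:11434/api/chat`.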

  • Improved Text Understanding & Generation.

Opan benefits from early-stage joint pretraining, giving it strong text reasoning skills. It handles general knowledge, guided explanations, learning assistance, and everyday conversation with ease.

  • Spatial & Visual Reasoning.

Opan can understand object relationships, positions, shapes, diagrams, and UI layouts, making it well suited to analyzing structured images, charts, and designs.

  • Enhanced OCR in 32 Languages.

Like Qwen3-VL, Opan can read text from images in 32 languages, even in challenging conditions such as blur, tilt, or low light. It can also interpret long documents while preserving their structure.

  • Long Context Support.

Opan supports a 256K-token context window, extendable toward 1M depending on configuration. This allows inputs such as:

    • Full textbooks

    • Long PDF documents

    • Extended conversations

    • Multi-image sequences

Opan can recall details across very long contexts more reliably.
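As a sketch, the context window can be raised per request through the `num_ctx` option in Ollama's chat API; the helper name below is illustrative, and whether 256K (or more) is actually usable depends on your hardware and model configuration:

```python
# Build a chat request that asks Ollama for a larger context window
# via the per-request "options" field. 262144 tokens = 256K.
def build_long_context_request(prompt: str, num_ctx: int = 262144) -> dict:
    return {
        "model": "aeline/opan",
        "messages": [{"role": "user", "content": prompt}],
        "options": {"num_ctx": num_ctx},
        "stream": False,
    }

req = build_long_context_request("Summarize the attached chapters.")
print(req["options"])
```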

Visual Coding Capabilities

Opan can generate code based on visual input — for example:

  • Convert UI mockups into HTML/CSS

  • Generate JavaScript from diagrams

  • Interpret flowcharts and transform them into code

This enables “what you see is what you get” style visual-to-code workflows.
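As an illustrative sketch (not part of the model itself), a visual-to-code workflow might send a mockup with a prompt like “Convert this UI mockup into HTML/CSS” and then pull the generated code out of the model's reply. The `extract_code` helper and the mocked reply below are assumptions for demonstration:

```python
import re

# Extract the first fenced code block from a model reply.
# The reply here is mocked; a real one would come from /api/chat.
def extract_code(reply: str) -> str:
    match = re.search(r"```[a-z]*\n(.*?)```", reply, re.DOTALL)
    return match.group(1) if match else ""

mock_reply = "Here is the markup:\n```html\n<button>OK</button>\n```"
print(extract_code(mock_reply))
```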

Stronger Reasoning (Inherited from Qwen Thinking Models)

While not fine-tuned as a dedicated thinking model, Opan still benefits from Qwen3-VL’s improved logical structure. It can break down problems, analyze steps, and give clear reasoning, especially in STEM-related questions.

How to run

ollama run aeline/opan