| Tag | Size | Context Window | Input | Updated |
|---|---|---|---|---|
| kimi-vl-a3b-thinking:latest | 11GB | 128K | Text | 1 week ago |
| kimi-vl-a3b-thinking:Q4_K_M | 11GB | 128K | Text | 1 week ago |
| kimi-vl-a3b-thinking:Q5_K_M | 12GB | 128K | Text | 3 days ago |
| kimi-vl-a3b-thinking:Q6_K | 14GB | 128K | Text | 3 days ago |
| kimi-vl-a3b-thinking:q8_0 | 17GB | 128K | Text | 3 days ago |
| kimi-vl-a3b-thinking:iq4_xs | 8.8GB | 128K | Text | 3 days ago |
| kimi-vl-a3b-thinking:f16 | 32GB | 128K | Text | 1 week ago |
Kimi-VL-A3B-Thinking is a powerful vision-language model from Moonshot AI featuring extended thinking capabilities for complex visual reasoning. Built on the DeepSeek2 architecture with Mixture of Experts (MoE), it excels at solving math problems from images, analyzing documents, and performing step-by-step visual reasoning with chain-of-thought explanations.
| Tag | Size | RAM Required | Description |
|---|---|---|---|
| q4_k_m | 9.8 GB | ~16GB | Recommended - best quality/size ratio |
| f16 | 30 GB | ~40GB | Full precision, maximum quality |
```shell
# Recommended version (Q4_K_M)
ollama run richardyoung/kimi-vl-a3b-thinking "Solve this math problem step by step"

# Full precision version
ollama run richardyoung/kimi-vl-a3b-thinking:f16 "Analyze this diagram in detail"
```
```shell
# Example prompts
ollama run richardyoung/kimi-vl-a3b-thinking "Solve this equation and show your work"
ollama run richardyoung/kimi-vl-a3b-thinking "Extract key information from this document"
ollama run richardyoung/kimi-vl-a3b-thinking "What can you infer about this scene?"
ollama run richardyoung/kimi-vl-a3b-thinking "Analyze the trends shown in this chart"
```
MIT License - Free for commercial and personal use.
Note: For vision tasks, use with an Ollama client that supports image input (e.g., Open WebUI, Ollama API with base64 images). The model performs best when asked to “think step by step”.
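For the base64 route mentioned above, a minimal sketch of building a request body for Ollama's `/api/generate` endpoint (the helper name and placeholder bytes are illustrative; in practice, read a real image file and POST the JSON to `http://localhost:11434/api/generate`):

```python
import base64
import json

def build_generate_payload(model: str, prompt: str, image_bytes: bytes) -> str:
    """Build a JSON body for Ollama's /api/generate endpoint.

    Ollama expects images as base64-encoded strings in the "images" list.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # return one complete response instead of a stream
    }
    return json.dumps(payload)

# Placeholder bytes stand in for a real image file's contents.
body = build_generate_payload(
    "richardyoung/kimi-vl-a3b-thinking",
    "Solve this math problem step by step",
    b"\x89PNG placeholder bytes",
)
print(json.loads(body)["model"])
```

Any HTTP client can then send this body; clients like Open WebUI handle the encoding for you.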