45 Downloads Updated 3 days ago
b882f5fc5198 · 8.8GB
Kimi-VL-A3B-Thinking is a vision-language model from Moonshot AI with extended thinking capabilities for complex visual reasoning. Built on the DeepSeek2 architecture with Mixture of Experts (MoE), it is suited to solving math problems from images, analyzing documents, detailed image analysis, and step-by-step visual reasoning with chain-of-thought explanations.
| Tag | Size | RAM Required | Description |
|---|---|---|---|
| q4_k_m | 9.8 GB | ~16 GB | Recommended - best quality/size ratio |
| f16 | 30 GB | ~40 GB | Full precision, maximum quality |
# Recommended version (Q4_K_M)
ollama run richardyoung/kimi-vl-a3b-thinking "Solve this math problem step by step"
# Full precision version
ollama run richardyoung/kimi-vl-a3b-thinking:f16 "Analyze this diagram in detail"
ollama run richardyoung/kimi-vl-a3b-thinking "Solve this equation and show your work"
ollama run richardyoung/kimi-vl-a3b-thinking "Extract key information from this document"
ollama run richardyoung/kimi-vl-a3b-thinking "What can you infer about this scene?"
ollama run richardyoung/kimi-vl-a3b-thinking "Analyze the trends shown in this chart"
MIT License - Free for commercial and personal use.
Note: For vision tasks, use with an Ollama client that supports image input (e.g., Open WebUI, Ollama API with base64 images). The model performs best when asked to "think step by step".
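As a rough illustration of the base64 route mentioned in the note, the sketch below sends one image to Ollama's `/api/generate` endpoint using only the Python standard library. It assumes a local Ollama server on the default port (`http://localhost:11434`) and that the model has already been pulled. The helper names `build_payload` and `ask_about_image` are hypothetical, chosen for this example only.

```python
import base64
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "richardyoung/kimi-vl-a3b-thinking"


def build_payload(image_bytes: bytes, prompt: str, model: str = MODEL) -> dict:
    """Build a non-streaming /api/generate request body with one base64 image."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def ask_about_image(image_path: str, prompt: str, url: str = OLLAMA_URL) -> str:
    """Send the image and prompt to Ollama and return the generated text."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), prompt)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        # With "stream": False the server returns a single JSON object.
        return json.loads(response.read())["response"]
```

A call might look like `ask_about_image("problem.png", "Solve this math problem. Think step by step.")`, where `problem.png` is a placeholder path. Streaming is turned off here to keep the response handling to a single JSON object.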