ahmadwaqar/smolvlm2-2.2b-instruct
63 Downloads · Updated 2 weeks ago
SmolVLM2-2.2B-Instruct is a compact multimodal model for image and video understanding. It is built on SmolLM2-1.7B with a SigLIP vision encoder and supports visual QA, OCR, and video analysis. It is available in Q8 and FP16 variants under the Apache 2.0 license.
vision
2 models

| Name | Digest | Size | Context | Input | Updated |
|---|---|---|---|---|---|
| smolvlm2-2.2b-instruct:latest | 80475fa44dcd | 2.5GB | 8K | Text, Image | 2 weeks ago |
| smolvlm2-2.2b-instruct:fp16 | 82d4cccc4c9b | 4.5GB | 8K | Text, Image | 2 weeks ago |
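Since both tags accept text and image input, a visual QA or OCR request could look roughly like the sketch below. It is not taken from this page. It assumes a local Ollama server on its default port (`http://localhost:11434`), that the model is pulled under the name `ahmadwaqar/smolvlm2-2.2b-instruct` with one of the tags listed above, and it uses the `/api/generate` endpoint with a base64-encoded image. The helper names `build_request` and `ask` are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: the pull name is namespace/model:tag, using a tag listed on this page.
MODEL = "ahmadwaqar/smolvlm2-2.2b-instruct:latest"


def build_request(prompt: str, image_bytes: bytes, model: str = MODEL) -> dict:
    """Build a non-streaming /api/generate payload carrying one base64 image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def ask(prompt: str, image_path: str, model: str = MODEL) -> str:
    """Send a prompt plus an image file to the server and return the reply text."""
    with open(image_path, "rb") as f:
        payload = build_request(prompt, f.read(), model)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, a call such as `ask("What text appears in this image?", "receipt.png")` would exercise the OCR use case (`receipt.png` is a placeholder file name). Passing `model="ahmadwaqar/smolvlm2-2.2b-instruct:fp16"` selects the larger 4.5GB tag instead. Video input is not covered here because the listing shows only text and image as inputs.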