Tags · ahmadwaqar/smolvlm2-2.2b-instruct

ahmadwaqar/smolvlm2-2.2b-instruct

803 downloads · Updated 7 months ago

SmolVLM2-2.2B-Instruct is a compact multimodal model for image and video understanding. It is built on SmolLM2-1.7B with a SigLIP vision encoder and supports visual question answering, OCR, and video analysis. Available as Q8-quantized and FP16 weights under the Apache 2.0 license.

vision

2 models

Name                            Size    Context  Input        Digest        Updated
smolvlm2-2.2b-instruct:latest   2.5GB   -        Text, Image  80475fa44dcd  7 months ago
smolvlm2-2.2b-instruct:fp16     4.5GB   -        Text, Image  82d4cccc4c9b  7 months ago
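
As a rough illustration of the visual QA and OCR use mentioned in the description, the sketch below sends one image and a text prompt to a locally running Ollama server through its `/api/generate` endpoint. Several things here are assumptions rather than anything stated on this page: that the server listens on the default `localhost:11434`, that the full model reference is the namespace plus the tag name (`ahmadwaqar/smolvlm2-2.2b-instruct:latest`), and that the model has already been pulled. The helper names `build_payload` and `ask` are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: a local Ollama server on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: namespace/model:tag form; swap ":latest" for ":fp16" to use the FP16 tag.
MODEL = "ahmadwaqar/smolvlm2-2.2b-instruct:latest"


def build_payload(image_bytes: bytes, prompt: str, model: str = MODEL) -> dict:
    """Build a non-streaming generate request carrying one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def ask(image_path: str, prompt: str, model: str = MODEL) -> str:
    """Send the image and prompt to the server and return the generated text."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), prompt, model)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

A call such as `ask("receipt.png", "Transcribe the text in this image.")` would then return the model's answer as a string, where `receipt.png` is a placeholder path. Both tags list Text and Image as inputs, so video analysis would presumably need frames extracted and passed as images.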