Models
GitHub
Discord
Docs
Cloud
Sign in
Download
Models
Download
GitHub
Discord
Docs
Cloud
Sign in
ahmadwaqar
/
smolvlm2-500m-video
:latest
53
Downloads
Updated
1 week ago
Compact 500M vision-language model for video/image understanding. Supports visual QA, captioning, OCR, video analysis. Only 1.8GB VRAM. Built on SigLIP + SmolLM2. Available in Q8 and FP16. Apache 2.0 license.
Compact 500M vision-language model for video/image understanding. Supports visual QA, captioning, OCR, video analysis. Only 1.8GB VRAM. Built on SigLIP + SmolLM2. Available in Q8 and FP16. Apache 2.0 license.
Cancel
vision
smolvlm2-500m-video:latest
...
/
template
e5ca433beddf · 280B
{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{- range .Messages }}{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}<|im_start|>assistant
{{ .Content }}<|im_end|>
{{ end }}{{- end }}<|im_start|>assistant