ahmadwaqar/smolvlm2-500m-video:latest

53 Downloads · Updated 1 week ago

Compact 500M vision-language model for video/image understanding. Supports visual QA, captioning, OCR, video analysis. Only 1.8GB VRAM. Built on SigLIP + SmolLM2. Available in Q8 and FP16. Apache 2.0 license.

vision

template

e5ca433beddf · 280B

{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{- range .Messages }}{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}<|im_start|>assistant
{{ .Content }}<|im_end|>
{{ end }}{{- end }}<|im_start|>assistant
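
To make the prompt layout concrete, here is a minimal Python sketch that mimics what the template above appears to produce. It is a hand-written approximation, not the Go `text/template` engine that actually evaluates the template, and the function name `render_chatml` and the dict-based message shape are hypothetical choices made for illustration. It assumes messages whose role is neither `user` nor `assistant` are skipped, as the template's `if`/`else if` chain suggests, and that the prompt ends exactly at the final `<|im_start|>assistant` marker shown above (the stored template may carry a trailing newline).

```python
def render_chatml(messages, system=""):
    """Approximate the ChatML-style prompt built by the template.

    messages: list of dicts with "role" and "content" keys (assumed shape).
    system:   optional system prompt; omitted from the output when empty.
    """
    parts = []

    # {{- if .System }} ... {{ end }}: the system block is emitted only when set.
    if system:
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")

    # {{- range .Messages }}: only user and assistant turns produce output.
    for message in messages:
        role = message["role"]
        if role in ("user", "assistant"):
            parts.append(f"<|im_start|>{role}\n{message['content']}<|im_end|>\n")

    # The template always ends by opening an assistant turn for the model to complete.
    parts.append("<|im_start|>assistant")
    return "".join(parts)


if __name__ == "__main__":
    prompt = render_chatml(
        [{"role": "user", "content": "Describe this video."}],
        system="You are a helpful assistant.",
    )
    print(prompt)
```

Ending the prompt with an open assistant turn is what cues the model to generate the next reply, and each earlier turn is closed with `<|im_end|>` so turn boundaries stay unambiguous.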