SmolVLM2-2.2B-Instruct is a lightweight yet powerful vision-language model that can understand images, read documents, and analyze video frames. At just 2.2B parameters, it runs efficiently on consumer hardware including laptops and smartphones, making

f6830037c822 · 75B
{
  "num_ctx": 8192,
  "stop": [
    "<|im_end|>",
    "<|endoftext|>"
  ]
}
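
As a rough illustration of what these two parameters mean, the sketch below parses the block and applies the stop strings to a piece of generated text. It assumes the usual reading of such a parameter file: `num_ctx` is the context window size in tokens, and `stop` lists strings at which generation ends. The helper names `load_params` and `truncate_at_stop` and the sample text are hypothetical, introduced only for this example; a real runtime applies stop sequences internally during decoding.

```python
import json

# Parameter block copied from the listing above.
PARAMS_JSON = """
{
  "num_ctx": 8192,
  "stop": ["<|im_end|>", "<|endoftext|>"]
}
"""


def load_params(text: str) -> dict:
    """Parse the parameter block and run basic sanity checks (hypothetical helper)."""
    params = json.loads(text)
    num_ctx = params.get("num_ctx")
    if not isinstance(num_ctx, int) or num_ctx <= 0:
        raise ValueError("num_ctx must be a positive integer")
    stops = params.get("stop", [])
    if not isinstance(stops, list) or not all(isinstance(s, str) and s for s in stops):
        raise ValueError("stop must be a list of non-empty strings")
    return params


def truncate_at_stop(text: str, stops: list[str]) -> str:
    """Cut text at the earliest occurrence of any stop string (hypothetical helper)."""
    cut = len(text)
    for s in stops:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


params = load_params(PARAMS_JSON)

# Made-up raw model output that still contains the end-of-turn markers.
raw = "A cat on a sofa.<|im_end|>\n<|endoftext|>"
print(truncate_at_stop(raw, params["stop"]))  # prints: A cat on a sofa.
```

Text that contains none of the stop strings passes through unchanged, so the helper is safe to apply to every response.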