
SmolVLM2-2.2B-Instruct is a lightweight yet powerful vision-language model that can understand images, read documents, and analyze video frames. At just 2.2B parameters, it runs efficiently on consumer hardware, including laptops and smartphones.

# SmolVLM2-2.2B-Instruct F16
Unquantized 16-bit floating-point (F16) version (3.4 GB).
## Links
- **Original Model:** HuggingFaceTB/SmolVLM2-2.2B-Instruct
## License
Apache 2.0