# SmolVLM2-2.2B-Instruct
A compact 2.2B-parameter vision-language model from Hugging Face with image and video understanding capabilities.
## Key Features
- **Compact & Fast** - Only 2.2B parameters, runs efficiently on consumer hardware
- **Vision & Video** - Understands images and video frames
- **Instruction-tuned** - Optimized for following user instructions
- **Apache 2.0** - Fully open source
## Quick Start
```bash
ollama run richardyoung/smolvlm2-2.2b-instruct
```
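Since this is a vision model, you can pass an image along with your prompt. A minimal sketch (the image path `./photo.jpg` and the prompt are placeholders, and Ollama's default local port 11434 is assumed):

```bash
# Include a local image path directly in the prompt
ollama run richardyoung/smolvlm2-2.2b-instruct "Describe this image: ./photo.jpg"

# Or call the local Ollama REST API with a base64-encoded image
IMG=$(base64 < ./photo.jpg | tr -d '\n')   # encode the image, strip line wrapping
curl http://localhost:11434/api/generate -d "{
  \"model\": \"richardyoung/smolvlm2-2.2b-instruct\",
  \"prompt\": \"Describe this image.\",
  \"images\": [\"$IMG\"],
  \"stream\": false
}"
```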
## Available Formats
| Format | Size | Description |
|--------|------|-------------|
| Q4_K_M (default) | 1.0 GB | Best balance of quality and speed |
| Q8_0 | 1.8 GB | Higher quality |
| F16 | 3.4 GB | Full precision |
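If the non-default quantizations are published as tags on the same model, they can be pulled by name. A sketch, assuming tag names that mirror the format labels above (the exact tags depend on how the repository is published):

```bash
# Higher-quality 8-bit build
ollama run richardyoung/smolvlm2-2.2b-instruct:q8_0

# Full-precision build
ollama run richardyoung/smolvlm2-2.2b-instruct:f16
```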
## Technical Requirements
- **Minimum:** 4 GB RAM, any modern CPU
- **Recommended:** 8 GB RAM or an Apple Silicon Mac
## Links
- **Original Model:** [HuggingFaceTB/SmolVLM2-2.2B-Instruct](https://huggingface.co/HuggingFaceTB/SmolVLM2-2.2B-Instruct)
- **GGUF Quantizations:** [richardyoung/SmolVLM2-2.2B-Instruct-GGUF](https://huggingface.co/richardyoung/SmolVLM2-2.2B-Instruct-GGUF)
## Credits
- **Original Model:** Hugging Face (HuggingFaceTB)
- **Quantization:** Richard Young (deepneuro.ai/richard)
## License
Apache 2.0