1,080 Downloads Updated 5 days ago
ollama run llmvision/glimpse-v1:q4_k_m
ae173ab7b0ff · 3.3GB
Glimpse-v1 is a lightweight vision-language model (VLM) built to summarize home security camera events. It natively produces structured JSON output, which makes it straightforward to integrate into automations.
llmvision/glimpse-v1 as the default model
Glimpse-v1 is available in several quantization options to accommodate different hardware profiles.
- latest (Q8_0): We strongly recommend this variant if your hardware is capable.
- q4_k_m: Reduced memory footprint with medium quality loss.
Glimpse-v1 is specifically trained to summarize footage from smart doorbells and other home security cameras.
Note: This is not a chat model. It takes one image and should be used with the prompt provided below (“Original Training Instructions”).
For best results and ease of use, use the official blueprint for LLM Vision.
We are currently aware of the following limitations:
While we recommend running Glimpse-v1 together with LLM Vision, you can run this model in custom setups. Below are the recommended parameters and the original instructions the model was trained on.
Recommended Parameters for inference
| Parameter | Value |
|---|---|
| Temperature | 0.3 |
| Top P | 0.95 |
| Top K | 64 |
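For custom setups, the parameters above can be passed through the `options` field of Ollama's `/api/generate` endpoint. The following is a minimal sketch, not an official client: it assumes a local Ollama server on the default port, uses the `q4_k_m` tag from the run command on this page, and the helper names `build_payload` and `summarize` are hypothetical. The `prompt` argument is expected to hold the text from "Original Training Instructions".

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Tag taken from the run command on this page; swap in another tag if needed.
MODEL = "llmvision/glimpse-v1:q4_k_m"


def build_payload(image_bytes: bytes, prompt: str) -> dict:
    """Build a non-streaming request body with the recommended sampling parameters."""
    return {
        "model": MODEL,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings; the model takes one image.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "format": "json",
        "stream": False,
        "options": {"temperature": 0.3, "top_p": 0.95, "top_k": 64},
    }


def summarize(image_path: str, prompt: str) -> dict:
    """Send one camera frame to the model and return the parsed JSON event."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), prompt)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # The generated text is returned in the "response" field as a JSON string.
    return json.loads(body["response"])
```

Calling `summarize("frame.jpg", prompt)` with a hypothetical image path would return a dictionary with the `title` and `description` fields, provided the server is reachable and the model has been pulled.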
Original Training Instructions
Task: Analyze the provided security camera image and generate a smart-home event notification.
Output:
Return a single valid JSON object with exactly two string fields:
- "title": a short summary (2-5 words)
- "description": a brief factual description of what is happening
Title Rules:
The "title" must:
- Be 2-5 words
- Be short and glanceable
- Avoid long phrases or full sentences
The title should summarize the event category and location.
All additional detail belongs in "description".
Delivery Inference Rules:
If a person is:
- Holding or placing a package or letters
- and wearing a delivery uniform
- or a delivery vehicle is visible
Then:
- the title must contain the word "delivery":
- Use a delivery-style title (2-5 words) (examples: "Package delivery", "Delivery at porch", "Courier delivery")
- Include the carrier name in the description if the carrier branding is visually identifiable (e.g. "Amazon delivery", "FedEx delivery")
Empty scene handling:
- If no clear activity or relevant objects (such as people, vehicles, or animals) are present, set:
- "title" to exactly: "No activity"
- "description" to a brief statement describing that nothing notable is seen
Description Rules:
- 1-2 short sentences
- Do not include explanations or reasoning
- Do not repeat the task or rules
- Use present tense
- Neutral and factual
- Describe what is happening
Do not mention camera angle, lighting quality, or image clarity.
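Automations that consume the model's output may want to check it against the rules above before sending a notification. The sketch below is one possible validator, written under the assumption that the raw response is a JSON string; the function names `validate_event`, `is_delivery`, and `is_empty_scene` are hypothetical and not part of LLM Vision.

```python
import json


def validate_event(raw: str) -> dict:
    """Parse a model response and check the structural rules from the prompt."""
    event = json.loads(raw)
    # The prompt asks for exactly two fields: "title" and "description".
    if not isinstance(event, dict) or set(event) != {"title", "description"}:
        raise ValueError("expected an object with exactly 'title' and 'description'")
    title, description = event["title"], event["description"]
    if not isinstance(title, str) or not isinstance(description, str):
        raise ValueError("both fields must be strings")
    # Title rule: 2-5 words.
    word_count = len(title.split())
    if not 2 <= word_count <= 5:
        raise ValueError(f"title has {word_count} words, expected 2-5")
    if not description.strip():
        raise ValueError("description must not be empty")
    return event


def is_delivery(event: dict) -> bool:
    """Delivery titles must contain the word 'delivery'."""
    return "delivery" in event["title"].lower()


def is_empty_scene(event: dict) -> bool:
    """Empty scenes use the exact title 'No activity'."""
    return event["title"] == "No activity"
```

A typical use is to call `validate_event` on each response and route the result, for example skipping the notification when `is_empty_scene` returns true.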
The following benchmark measures the semantic similarity (CLIP) between a validation response and the model’s response.
[Figure: Mean Score Overall (higher is better)]
[Figure: Mean Score by Category (higher is better)]
[Figure: Latency (lower is better)]
[Figure: Latency vs Score (lower is better)]
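This page does not state exactly how the similarity score is computed. A common approach, assumed here, is the cosine similarity between the CLIP embedding of the validation response and that of the model's response, averaged over all samples. The sketch below illustrates only that arithmetic; the toy vectors stand in for real embeddings, and `cosine_similarity` and `mean_score` are hypothetical helper names.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero-length vector has no direction")
    return dot / (norm_a * norm_b)


def mean_score(pairs: list[tuple[list[float], list[float]]]) -> float:
    """Average similarity over (reference_embedding, model_embedding) pairs."""
    return sum(cosine_similarity(ref, out) for ref, out in pairs) / len(pairs)


# Toy embeddings: one identical pair and one orthogonal pair.
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
score = mean_score(pairs)
```

In a real evaluation the vectors would come from a CLIP text encoder, and a higher mean indicates responses that are semantically closer to the reference.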