30 1 week ago

Lightweight 2.2B vision model for GUI automation - clicks, types, scrolls on screenshots. Fine-tuned on aguvis datasets for agentic reasoning. Available in Q8 and FP16 quantizations. Apache 2.0 license.

vision
53ed932be8fa · 57B
{
"num_ctx": 8192,
"stop": [
"<end_of_utterance>"
]
}