
Lightweight 2.2B-parameter vision model for GUI automation: given a screenshot, it outputs click, type, and scroll actions. Fine-tuned on the AGUVIS datasets for agentic reasoning. Available in Q8 (8-bit quantized) and FP16 (half-precision) variants. Licensed under Apache 2.0.

vision
c79ccfb2c250 · 121B
{
  "num_predict": 256,
  "stop": [
    "<|im_end|>",
    "<end_of_utterance>",
    "</code>"
  ],
  "temperature": 0.1
}
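The parameters above cap generation at 256 tokens, stop at any of three end markers, and keep sampling close to deterministic. As a minimal sketch, the same options can be sent explicitly with a request. This assumes the model is served by an Ollama-style HTTP server at `http://localhost:11434/api/generate`. The model name `gui-model`, the prompt, and the helper names `build_payload` and `generate` are hypothetical placeholders, not part of this page.

```python
import base64
import json
import urllib.request

# Sampling parameters copied from the params block above.
OPTIONS = {
    "num_predict": 256,
    "stop": ["<|im_end|>", "<end_of_utterance>", "</code>"],
    "temperature": 0.1,
}


def build_payload(model: str, prompt: str, screenshot: bytes) -> dict:
    """Build a non-streaming generate request carrying one screenshot."""
    return {
        "model": model,  # hypothetical name; use the tag you actually pulled
        "prompt": prompt,
        # The API expects images as base64-encoded strings.
        "images": [base64.b64encode(screenshot).decode("ascii")],
        "options": OPTIONS,
        "stream": False,
    }


def generate(payload: dict,
             url: str = "http://localhost:11434/api/generate") -> str:
    """POST the payload and return the generated text.

    Assumes a server is running at `url`; not called at import time.
    """
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

For example, `generate(build_payload("gui-model", "Click the OK button", png_bytes))` would return the model's text for that screenshot, where `png_bytes` holds the raw image file contents. Since these values are already stored with the model, passing `options` should only be necessary when overriding them.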