
Lightweight 2.2B-parameter vision model for GUI automation: it clicks, types, and scrolls based on screenshots. Fine-tuned for agentic reasoning, with coordinates output normalized to [0, 1]. Available in Q4_K_M and Q8_0 quantizations and in FP16. Apache 2.0 license.
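Because the coordinates are normalized to [0, 1], a caller has to scale them by the screenshot size before driving a real pointer. The sketch below shows one way to do that. The function name `to_pixel` and the assumption that the model returns a single `(x, y)` pair are illustrative only; the listing does not specify the exact output format.

```python
def to_pixel(x_norm: float, y_norm: float, width: int, height: int) -> tuple[int, int]:
    """Map a normalized (x, y) point in [0, 1] to integer pixel coordinates.

    Hypothetical helper: assumes the model emits one (x, y) pair per action,
    with (0, 0) at the top-left corner of the screenshot.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not (0.0 <= x_norm <= 1.0 and 0.0 <= y_norm <= 1.0):
        raise ValueError("normalized coordinates must lie in [0, 1]")

    # Scale by the screenshot size, then clamp so that 1.0 maps to the
    # last valid pixel index rather than one past the edge.
    px = min(int(x_norm * width), width - 1)
    py = min(int(y_norm * height), height - 1)
    return px, py


if __name__ == "__main__":
    # A click at the horizontal center, a quarter of the way down a 1920x1080 screenshot.
    print(to_pixel(0.5, 0.25, 1920, 1080))  # (960, 270)
```

Normalized output keeps the model's predictions independent of screen resolution, so the same response can be applied to a screenshot of any size by changing only `width` and `height`.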

vision
506ae592d539 · 64B
Apache License 2.0 - https://www.apache.org/licenses/LICENSE-2.0