A 4B GUI Agent for autonomous Android device control with zero-shot generalization.

GELab-Zero-4B-preview

GELab-Zero is the first fully open-source GUI Agent for Android device automation.

Built on Qwen3-VL-4B, it navigates and controls Android apps through visual understanding, with no app-specific APIs needed. It runs entirely locally, so screenshots and prompts do not need to leave your machine.

Capabilities

  • Operate mobile interfaces: recognize UI elements, understand button functions, and complete tasks
  • Zero-shot generalization across unseen applications
  • Multi-step workflows: food delivery, shopping, transit, social media, and more

Usage

ollama run ahmadwaqar/gelab-zero-4b-preview

With images (include the image file path in the prompt):

ollama run ahmadwaqar/gelab-zero-4b-preview "What should I click to open Settings? ./screenshot.png"
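The same kind of question can also be sent from a script through Ollama's local REST API. The following is a minimal sketch, assuming the Ollama server is running on its default port (11434) and the model has already been pulled. The helper names build_payload and ask are illustrative and are not part of Ollama or GELab-Zero.

```python
import base64
import json
import urllib.request

# Default address of a locally running Ollama server (assumption: not reconfigured).
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "ahmadwaqar/gelab-zero-4b-preview"


def build_payload(prompt: str, image_bytes: bytes, model: str = MODEL) -> dict:
    """Build a non-streaming /api/generate request carrying one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def ask(prompt: str, image_path: str) -> str:
    """Send a screenshot and a question to the local server and return the reply text."""
    with open(image_path, "rb") as f:
        payload = build_payload(prompt, f.read())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling ask("What should I click to open Settings?", "screenshot.png") returns the model's answer as a string. The format of that answer is up to the model, so any parsing on top of it should be treated as an assumption.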

Android Device Control

For full Android automation with ADB connection and task execution, see the GitHub repo:

https://github.com/stepfun-ai/gelab-zero
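
For orientation, the device side of such a setup comes down to capturing a screenshot and injecting input over ADB. The sketch below is not the gelab-zero repo's implementation. It assumes adb is on the PATH and a single device is connected, and the helper names are hypothetical. How the model's reply maps to pixel coordinates is not covered here and is defined by the repo.

```python
import subprocess


def screencap_command() -> list[str]:
    """adb command that writes a PNG screenshot of the device to stdout."""
    return ["adb", "exec-out", "screencap", "-p"]


def tap_command(x: int, y: int) -> list[str]:
    """adb command that injects a tap at pixel coordinates (x, y)."""
    return ["adb", "shell", "input", "tap", str(x), str(y)]


def capture_screenshot(path: str = "screenshot.png") -> str:
    """Save the current device screen to `path` and return the path."""
    png = subprocess.run(screencap_command(), check=True, capture_output=True).stdout
    with open(path, "wb") as f:
        f.write(png)
    return path


def tap(x: int, y: int) -> None:
    """Tap the device screen at (x, y); raises CalledProcessError if adb fails."""
    subprocess.run(tap_command(x, y), check=True)
```

A simple loop would call capture_screenshot(), pass the image to the model, and then call tap() with whatever coordinates are derived from the reply.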

License

Apache 2.0


Tags

vision, tools, 4b, gui-agent, android, automation