107 2 weeks ago

A 4B GUI Agent for autonomous Android device control with zero-shot generalization.

vision


2e36b21bf981 · 5.1GB · qwen3vl · 4.44B · Q8_0
{{- if .System }} <|im_start|>system {{ .System }}<|im_end|> {{ end }} {{- range .Messages }} {{- if
You are GELab-Zero, an advanced GUI Agent designed for navigating and controlling graphical user int
Apache License 2.0 Copyright 2025 StepFun AI / GELab Team Licensed under the Apache License, Version
{ "num_ctx": 8192, "repeat_penalty": 1.1, "stop": [ "<|im_end|>", "<|end

Readme

GELab-Zero-4B-preview

GELab-Zero is the first fully open-source GUI Agent for Android device automation.

Built on Qwen3-VL-4B, it navigates and controls Android apps through visual understanding of the screen, with no app-specific APIs needed. It runs entirely locally, so screenshots and prompts stay on your machine.

Capabilities

  • Operate mobile interfaces: recognize UI elements, understand button functions, and complete tasks
  • Zero-shot generalization across unseen applications
  • Multi-step workflows: food delivery, shopping, transit, social media, and more

Usage

ollama run ahmadwaqar/gelab-zero-4b-preview

With images (include the image file path in the prompt):

ollama run ahmadwaqar/gelab-zero-4b-preview "What should I click to open Settings? ./screenshot.png"
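The same question can also be sent programmatically. The following is a minimal sketch, assuming a local Ollama server on its default address (http://localhost:11434) and the standard /api/generate endpoint, which accepts base64-encoded images in an "images" list. The helper names build_payload and ask are hypothetical, and screenshot.png is a placeholder path.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "ahmadwaqar/gelab-zero-4b-preview"


def build_payload(prompt: str, image_bytes: bytes, model: str = MODEL) -> dict:
    """Build a non-streaming generate request with one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def ask(prompt: str, image_path: str, url: str = OLLAMA_URL) -> str:
    """Send a prompt plus a screenshot to the model and return its text reply."""
    with open(image_path, "rb") as f:
        payload = build_payload(prompt, f.read())
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    print(ask("What should I click to open Settings?", "screenshot.png"))
```

Keeping the payload construction separate from the network call makes it easy to inspect what is sent before involving the server.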

Android Device Control

For full Android automation with ADB connection and task execution, see the GitHub repo:

https://github.com/stepfun-ai/gelab-zero
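For orientation only, here is a rough sketch of the two ADB primitives such an automation loop typically relies on: capturing a screenshot and tapping a coordinate. It is not the repository's pipeline. It assumes adb is on the PATH and a device is connected, and the function names (screencap_cmd, tap_cmd, capture_screenshot, tap) are hypothetical. How the model's reply maps to tap coordinates is defined by the repository and is not shown here.

```python
import subprocess
from typing import List, Optional


def _adb(serial: Optional[str]) -> List[str]:
    """Base adb command, optionally targeting one device by serial."""
    return ["adb", "-s", serial] if serial else ["adb"]


def screencap_cmd(serial: Optional[str] = None) -> List[str]:
    """Command that writes the current screen as PNG bytes to stdout."""
    return _adb(serial) + ["exec-out", "screencap", "-p"]


def tap_cmd(x: int, y: int, serial: Optional[str] = None) -> List[str]:
    """Command that injects a tap at pixel coordinates (x, y)."""
    return _adb(serial) + ["shell", "input", "tap", str(x), str(y)]


def capture_screenshot(path: str, serial: Optional[str] = None) -> str:
    """Save a screenshot of the connected device to `path`."""
    result = subprocess.run(screencap_cmd(serial), check=True, capture_output=True)
    with open(path, "wb") as f:
        f.write(result.stdout)
    return path


def tap(x: int, y: int, serial: Optional[str] = None) -> None:
    """Tap the device screen at (x, y)."""
    subprocess.run(tap_cmd(x, y, serial), check=True)
```

A screenshot saved with capture_screenshot can then be passed to the model as an image, and a coordinate decided from its answer can be passed to tap.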

License

Apache 2.0


Tags

vision, tools, 4b, gui-agent, android, automation