ahmadwaqar/gui-owl
48 Downloads · Updated 2 weeks ago
GUI-Owl is a multimodal vision-language model from mPLUG/Alibaba for GUI understanding and automation. It is described as state-of-the-art on the ScreenSpot, OSWorld, and AndroidWorld benchmarks. It detects UI elements and automates tasks on desktop and mobile devices.
vision
2 models

| Name | ID | Size | Context | Input | Updated |
|---|---|---|---|---|---|
| gui-owl:7b-q8 | 765a6d6f83e0 | 8.8GB | 32K | Text, Image | 2 weeks ago |
| gui-owl:32b-q8 | ff07db05eacd | 36GB | 32K | Text, Image | 2 weeks ago |
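
Since both tags accept text and image input, a request to a locally running Ollama server could look roughly like the sketch below. It assumes the default local endpoint `http://localhost:11434/api/generate` and that the model is addressed as `ahmadwaqar/gui-owl:7b-q8`; the helper names `build_payload` and `ask_model`, the sample instruction, and the timeout are hypothetical, and the prompt format GUI-Owl expects is not stated on this page.

```python
import base64
import json
import urllib.request

# Assumptions: default local Ollama endpoint and namespace/model:tag naming.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "ahmadwaqar/gui-owl:7b-q8"


def build_payload(image_bytes: bytes, instruction: str, model: str = MODEL_TAG) -> dict:
    """Build a non-streaming generate request carrying one base64-encoded image."""
    return {
        "model": model,
        "prompt": instruction,
        # Ollama's generate endpoint takes images as base64 strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def ask_model(payload: dict, url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """POST the payload as JSON and return the model's text response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

To try it, read a screenshot as bytes, pass it to `build_payload` with an instruction such as "Locate the Submit button.", and hand the result to `ask_model`. Building the payload is kept separate from sending it so the request can be inspected without a running server.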