UGround

Developed by osunlp
UGround is a GUI visual positioning (grounding) model trained with a streamlined recipe. It was developed by the Ohio State University NLP Group in collaboration with Orby AI.
Release Time: 8/2/2024

Model Overview

UGround is a multimodal model focused on GUI visual positioning, capable of precisely locating various elements in user interfaces, such as text and icons.

Model Features

Powerful GUI Visual Positioning Capability
Achieves an average accuracy of 73.3% on the ScreenSpot benchmark
Multi-Platform Support
Supports GUI element positioning for mobile, desktop, and web platforms
Streamlined Training Recipe
Relies on efficient data synthesis and training methods rather than a complex architecture
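
To make the accuracy figures on this page concrete, here is a minimal sketch of how a positioning accuracy could be scored. It assumes the common point-in-box criterion (a prediction counts as correct when the predicted point falls inside the target element's bounding box) and an unweighted mean over categories. This page does not state how the reported average is computed, so both choices are assumptions, and the names Sample, is_hit, accuracy_by_category, and average_accuracy are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    category: str                            # hypothetical label, e.g. "mobile/text"
    point: tuple[float, float]               # predicted (x, y) in pixels
    box: tuple[float, float, float, float]   # target box (left, top, right, bottom)


def is_hit(point, box):
    """Return True when the predicted point lies inside the target box."""
    x, y = point
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom


def accuracy_by_category(samples):
    """Map each category to its hit rate in percent."""
    flags_by_category = {}
    for s in samples:
        flags_by_category.setdefault(s.category, []).append(is_hit(s.point, s.box))
    return {
        category: 100.0 * sum(flags) / len(flags)
        for category, flags in flags_by_category.items()
    }


def average_accuracy(samples):
    """Unweighted mean of the per-category accuracies (an assumption)."""
    return mean(list(accuracy_by_category(samples).values()))
```

With two "mobile/text" samples (one hit, one miss) and one "web/icon" hit, this yields 50% and 100% per category and an average of 75%.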

Model Capabilities

GUI Element Positioning
Multimodal Understanding
Cross-Platform Interface Analysis
Vision-Language Alignment

Use Cases

Automated Testing
Interface Element Detection
Automatically identifies and locates various elements in user interfaces
Achieved 82.8% mobile text positioning accuracy in ScreenSpot testing
Smart Assistants
Vision-Based Instruction Execution
Helps users complete operations through visual interfaces
Achieved an average accuracy of 81.4% in agent settings
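
For vision-based instruction execution, a predicted point usually has to be mapped to a pixel position before an automation tool can act on it. The sketch below assumes the model reports a point on a normalized 0 to 1000 grid. That output format is an assumption made for illustration, not something stated on this page, and to_pixels is a hypothetical helper.

```python
def to_pixels(point, screen_size, grid=1000):
    """Scale a point on an assumed 0..grid coordinate grid to screen pixels."""
    x, y = point
    width, height = screen_size
    # Scale each axis, then clamp so the result stays inside the screen.
    px = max(0, min(int(x / grid * width), width - 1))
    py = max(0, min(int(y / grid * height), height - 1))
    return px, py


# Example: the center-left area of a 1920x1080 screenshot.
print(to_pixels((500, 250), (1920, 1080)))  # (960, 270)
```

The clamp keeps a point on the far edge of the grid (for example 1000, 1000) inside the last valid pixel instead of one past it.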