
UI-TARS-2B-SFT

Developed by bytedance-research
UI-TARS is a next-generation native graphical user interface (GUI) agent model designed to seamlessly interact with GUIs through human-like perception, reasoning, and action capabilities.
Downloads: 5,792
Release date: 1/20/2025

Model Overview

UI-TARS integrates all key components—perception, reasoning, grounding, and memory—into a single vision-language model (VLM), enabling end-to-end task automation without predefined workflows or manual rules.
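As a rough illustration of what "end-to-end" means here: a GUI agent of this kind emits its next action as text, which a controller parses into a structured command before executing it. The action grammar below (a click with pixel coordinates) is a hypothetical sketch for illustration, not the model's documented output format.

```python
import re

def parse_action(action_text: str) -> dict:
    """Parse a textual GUI action into a structured command.

    Expects an illustrative format such as:
        click(start_box='(320,540)')
    NOTE: this grammar is an assumption for demonstration purposes,
    not the official UI-TARS output schema.
    """
    match = re.match(r"(\w+)\(start_box='\((\d+),(\d+)\)'\)", action_text.strip())
    if not match:
        raise ValueError(f"Unrecognized action: {action_text}")
    name, x, y = match.groups()
    return {"action": name, "x": int(x), "y": int(y)}
```

For example, `parse_action("click(start_box='(320,540)')")` yields `{"action": "click", "x": 320, "y": 540}`, which a controller could pass to a mouse-automation backend.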

Model Features

End-to-End Task Automation
Integrates perception, reasoning, grounding, and memory into a single model, eliminating the need for predefined workflows or manual rules.
Native GUI Interaction
Seamlessly interacts with graphical user interfaces through human-like perception, reasoning, and action capabilities.
Multimodal Capabilities
Combines visual and language understanding to handle complex GUI tasks.

Model Capabilities

Graphical User Interface Interaction
Vision-Language Understanding
End-to-End Task Automation
Multimodal Reasoning

Use Cases

Automated Testing
GUI Automated Testing
Automatically executes GUI testing tasks without human intervention.
Improves testing efficiency and coverage
Intelligent Assistant
GUI Operation Assistant
Assists users in completing complex GUI operation tasks.
Enhances user operation efficiency
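The use cases above share a common shape: a perceive → reason → act loop that runs until the task is done. The sketch below shows that loop with placeholder callables; in practice `get_screenshot` would capture the screen, `predict_action` would query the UI-TARS model, and `execute_action` would drive the mouse and keyboard. All names and the `"finished"` sentinel are assumptions for illustration.

```python
from typing import Callable, List

def run_gui_task(
    get_screenshot: Callable[[], bytes],
    predict_action: Callable[[bytes, str], str],
    execute_action: Callable[[str], None],
    task: str,
    max_steps: int = 10,
) -> List[str]:
    """Minimal perceive -> reason -> act loop for a GUI agent.

    The three callables are hypothetical stand-ins for screen capture,
    model inference, and input automation; this is a sketch of the
    control flow, not the project's actual runner.
    """
    trace = []
    for _ in range(max_steps):
        screen = get_screenshot()          # perceive the current UI state
        action = predict_action(screen, task)  # model proposes next action
        trace.append(action)
        if action == "finished":           # assumed completion signal
            break
        execute_action(action)             # act on the UI
    return trace
```

A test harness could drive this loop with stubbed callables to replay a scripted scenario, which is one way automated GUI testing without human intervention could be structured.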