
OpenVLA 7B

Developed by: openvla
OpenVLA 7B is an open-source vision-language-action model trained on the Open X-Embodiment dataset, capable of generating robot actions based on language instructions and camera images.
Downloads: 1.7M
Release Date: 6/10/2024

Model Overview

OpenVLA 7B is a multimodal model that takes a language instruction and a camera image of the robot workspace as input and predicts a 7-degree-of-freedom end-effector action (position and orientation deltas plus a gripper command). It controls multiple robots out of the box and can quickly adapt to new robot domains through fine-tuning.
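
To make the input/output contract concrete, here is a minimal inference sketch following the usage shown on the Hugging Face model card; the image path and instruction are placeholder values, and a CUDA GPU is assumed.

    # Minimal inference sketch for openvla/openvla-7b, adapted from the
    # Hugging Face model card. Requires transformers, timm, and tokenizers.
    import torch
    from PIL import Image
    from transformers import AutoModelForVision2Seq, AutoProcessor

    processor = AutoProcessor.from_pretrained("openvla/openvla-7b", trust_remote_code=True)
    vla = AutoModelForVision2Seq.from_pretrained(
        "openvla/openvla-7b",
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
        trust_remote_code=True,
    ).to("cuda:0")

    # Placeholder: in a real setup this image would come from your camera stack.
    image = Image.open("workspace.png")
    instruction = "pick up the remote"
    prompt = f"In: What action should the robot take to {instruction}?\nOut:"

    inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
    # predict_action de-tokenizes and un-normalizes the predicted action;
    # "bridge_orig" selects the BridgeData V2 normalization statistics.
    action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
    # action is a 7-element vector: position deltas (x, y, z), orientation
    # deltas (roll, pitch, yaw), and a gripper command.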

Model Features

Multi-robot Support
Out-of-the-box control of the robots covered by the pretraining data mixture
Parameter-efficient Fine-tuning
Adapts to new tasks and robot setups from a small number of demonstrations (see the LoRA sketch after this list)
Open-source Training Code
Complete training codebase released under MIT license, supporting custom training
Multimodal Input
Processes both language instructions and visual inputs to generate precise robot actions
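
As an illustration of the parameter-efficient fine-tuning feature above, the sketch below sets up LoRA adapters with the peft library. The released codebase supports LoRA fine-tuning, but the hyperparameters and the omitted training loop here are illustrative assumptions, not the official recipe.

    # LoRA fine-tuning setup sketch using the peft library.
    # Hyperparameter values are assumptions, not the official recipe.
    import torch
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForVision2Seq

    vla = AutoModelForVision2Seq.from_pretrained(
        "openvla/openvla-7b",
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
    )

    lora_config = LoraConfig(
        r=32,                          # low-rank adapter rank (assumed)
        lora_alpha=16,                 # adapter scaling factor (assumed)
        lora_dropout=0.0,
        target_modules="all-linear",   # wrap every linear layer with an adapter
        init_lora_weights="gaussian",
    )
    vla = get_peft_model(vla, lora_config)
    vla.print_trainable_parameters()   # only the adapter weights are trainable

    # From here, train on your demonstrations with a standard next-token
    # objective over the action tokens (training loop omitted).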

Model Capabilities

Robot action prediction
Vision-language understanding
Multimodal task processing
Robot control

Use Cases

Robot Control
WidowX Robot Control
Control a WidowX robot in the BridgeData V2 environment from language instructions
Zero-shot execution of tasks covered by the pretraining data mixture (see the control-loop sketch after this list)
New Robot Adaptation
Fine-tune with minimal demonstration data to adapt to new robot configurations
Quick adaptation to new tasks and robot environments
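
As a hedged sketch of what the zero-shot control loop might look like, the snippet below continues from the inference example in the Model Overview (reusing vla and processor); the camera and robot objects are hypothetical stand-ins for whatever WidowX control interface you use.

    # Hypothetical closed-loop control sketch for a WidowX robot in a
    # BridgeData V2-style setup; camera/robot calls are placeholders.
    import torch

    prompt = "In: What action should the robot take to put the carrot on the plate?\nOut:"
    for _ in range(100):                      # max episode length (assumed)
        image = camera.get_rgb()              # placeholder: current workspace image
        inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
        # 7-DoF action: position deltas, orientation deltas, gripper command
        action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
        robot.step(action)                    # placeholder: execute on the arm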