OpenVLA 7B Finetuned LIBERO-10
This vision-language-action model was obtained by fine-tuning OpenVLA 7B with LoRA on the LIBERO-10 dataset and is intended for robotics applications.
Downloads 1,779
Release Time: 9/3/2024
Model Overview
A multimodal model optimized for robotics, capable of handling image-text-to-text tasks, particularly suited for vision-language-action scenarios.
Model Features
LIBERO-10 Dataset Fine-Tuning
Fine-tuned specifically on LIBERO-10 (also called LIBERO-Long), the long-horizon task suite of the LIBERO simulation benchmark
LoRA Efficient Fine-Tuning
Uses LoRA (rank=32) for parameter-efficient fine-tuning, reducing compute and memory requirements while maintaining model performance
Multimodal Capabilities
Combines visual and language understanding, suitable for complex tasks in robotics
Large-Scale Pretraining Foundation
Built on the OpenVLA 7B model, inheriting its vision-language understanding capabilities
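To make the LoRA (rank=32) feature above concrete, the following minimal sketch counts how many trainable parameters a rank-32 adapter adds to a single weight matrix. The 4096 x 4096 layer size and the helper names are assumptions chosen for illustration; they are not taken from this model card, and the sketch does not describe the exact set of layers that were adapted.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA keeps the original weight W frozen and learns a low-rank update
    B @ A, where A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in the dense weight matrix that LoRA leaves frozen."""
    return d_in * d_out


# Assumed layer size, for illustration only (not from the model card).
D_IN = D_OUT = 4096
RANK = 32  # the rank stated for this fine-tune

lora = lora_param_count(D_IN, D_OUT, RANK)
full = full_param_count(D_IN, D_OUT)
fraction = lora / full
print(f"LoRA params: {lora:,} of {full:,} ({fraction:.2%})")
```

Under these assumptions the adapter trains 262,144 parameters against 16,777,216 frozen ones, roughly 1.6% of that layer, which is where the reduction in compute and memory comes from.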
Model Capabilities
Image Understanding
Text Generation
Robot Action Planning
Multimodal Task Processing
Use Cases
Robotics
Task Planning in Simulation Environments
Executing complex multi-step tasks in the LIBERO simulation environment
Optimized task completion rate and execution efficiency
Vision-Language Navigation
Making navigation decisions based on visual input and language instructions
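The task completion rate mentioned for the LIBERO use case is usually measured by running many closed-loop episodes and counting successes. The sketch below shows only that bookkeeping. `run_episode`, `step_fn`, and the toy finish steps are hypothetical stand-ins for querying the policy and stepping the simulator; they are not part of the LIBERO or OpenVLA APIs.

```python
from typing import Callable, Sequence


def success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of episodes that ended in success (0.0 for no episodes)."""
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


def run_episode(step_fn: Callable[[int], bool], max_steps: int) -> bool:
    """Run one closed-loop episode until success or the step budget runs out.

    step_fn(t) is a hypothetical stand-in for: query the policy with the
    current image and instruction, apply the action in the simulator, and
    report whether the task's success condition now holds.
    """
    for t in range(max_steps):
        if step_fn(t):
            return True
    return False  # step budget exhausted without completing the task


# Toy stand-ins: each "task" succeeds at a fixed step, or never (None).
finish_steps = [3, 10, None, 7]
outcomes = [
    run_episode(lambda t, f=f: f is not None and t >= f, max_steps=8)
    for f in finish_steps
]
rate = success_rate(outcomes)
```

With a step budget of 8, the toy tasks that finish at steps 3 and 7 succeed while the other two do not, so `rate` is 0.5. A real evaluation would replace the lambda with actual policy and simulator calls and average over many trials per task.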
© 2025 AIbase