OpenVLA 7B Fine-Tuned on LIBERO-Spatial
OpenVLA 7B vision-language-action model fine-tuned with LoRA on the LIBERO-Spatial dataset
Downloads: 4,009
Release Date: 9/3/2024
Model Overview
This is a multimodal vision-language-action model for robotics: it processes an image observation together with a text instruction and generates the corresponding robot action command.
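For reference, inference typically follows the pattern below, a minimal sketch based on the usage documented for OpenVLA checkpoints. The model ID and the unnorm_key value are assumptions inferred from the Hub naming convention, not confirmed by this page:

```python
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

# Load the processor and model (model ID assumed from HuggingFace Hub naming)
model_id = "openvla/openvla-7b-finetuned-libero-spatial"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
vla = AutoModelForVision2Seq.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
).to("cuda:0")

# Build the OpenVLA prompt around a free-form language instruction
instruction = "pick up the black bowl next to the plate and place it on the plate"
prompt = f"In: What action should the robot take to {instruction}?\nOut:"

# One forward pass: camera image + prompt in, low-level action out.
# unnorm_key selects the action de-normalization statistics; the exact
# key name for this checkpoint is an assumption.
image = Image.open("observation.png")  # frame from the robot or simulator
inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
action = vla.predict_action(**inputs, unnorm_key="libero_spatial", do_sample=False)
```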
Model Features
LIBERO-Spatial Dataset Fine-Tuning
Model performance optimized specifically for robotic manipulation tasks involving spatial relationships between objects, the focus of the LIBERO-Spatial suite
LoRA Efficient Fine-Tuning
Parameter-efficient fine-tuning using LoRA with rank=32, adapting the model to new tasks while preserving the base model's capabilities (see the configuration sketch after this list)
Multimodal Processing Capability
Capable of processing both visual and language inputs to output action commands
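As a rough illustration of what a rank-32 LoRA setup looks like with the peft library: the rank matches the value stated above, while the alpha, dropout, and target-module choices are assumptions rather than the exact training recipe:

```python
from peft import LoraConfig, get_peft_model

# Rank-32 LoRA adapters over the model's linear layers; only the small
# adapter weights are trained, so the base 7B parameters stay frozen.
lora_config = LoraConfig(
    r=32,                          # LoRA rank, as stated for this model
    lora_alpha=16,                 # scaling factor (assumed value)
    lora_dropout=0.0,              # dropout on the adapter path (assumed value)
    target_modules="all-linear",   # adapt all linear layers (assumption)
    init_lora_weights="gaussian",
)
vla = get_peft_model(vla, lora_config)
vla.print_trainable_parameters()   # prints the small trainable fraction
```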
Model Capabilities
Vision-Language Understanding
Robotic Action Generation
Multimodal Reasoning
Spatial Task Processing
Use Cases
Robotic Control
Spatial Reasoning Task
Generates robot actions from visual input and text instructions that require reasoning about object positions
Performs well on the LIBERO-Spatial benchmark
Object Manipulation Task
Completes object grasping and placement tasks by combining visual and language inputs
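In a closed-loop setting, each predicted action is executed and a new observation is fed back into the model. The loop below is a hypothetical sketch: `env`, `max_steps`, and the observation key "agentview_image" are illustrative stand-ins for a LIBERO-style simulator, not a specific API, and `processor`, `prompt`, and `vla` reuse the assumptions from the inference example above:

```python
# Hypothetical control loop around the model's predict_action call.
obs = env.reset()
for _ in range(max_steps):
    image = Image.fromarray(obs["agentview_image"])
    inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
    # 7-D action: [dx, dy, dz, droll, dpitch, dyaw, gripper]
    action = vla.predict_action(**inputs, unnorm_key="libero_spatial", do_sample=False)
    obs, reward, done, info = env.step(action.tolist())
    if done:
        break
```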