
OpenVLA 7B Fine-Tuned on LIBERO-Spatial

Developed by: openvla
OpenVLA 7B vision-language-action model fine-tuned with LoRA on the LIBERO-Spatial dataset
Downloads: 4,009
Release date: 9/3/2024

Model Overview

This is a multimodal vision-language-action model designed for robotics: it processes an image observation together with a text instruction and generates the corresponding robot action command.
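The image-and-text-in, action-out flow described above can be sketched with the published OpenVLA usage pattern (`AutoModelForVision2Seq` with `trust_remote_code` and a `predict_action` helper). The repository id and the `unnorm_key` value for this checkpoint are assumptions here, so verify them against the model card before use.

```python
def predict_libero_action(image, instruction):
    """Sketch: map one (image, instruction) observation to a robot action
    with the fine-tuned checkpoint. Heavy imports are deferred so the
    sketch can be read without a GPU or the model weights present."""
    import torch
    from transformers import AutoModelForVision2Seq, AutoProcessor

    repo = "openvla/openvla-7b-finetuned-libero-spatial"  # assumed repo id
    processor = AutoProcessor.from_pretrained(repo, trust_remote_code=True)
    vla = AutoModelForVision2Seq.from_pretrained(
        repo, torch_dtype=torch.bfloat16, trust_remote_code=True
    ).to("cuda:0")

    # OpenVLA expects its instruction wrapped in this prompt template
    prompt = f"In: What action should the robot take to {instruction}?\nOut:"
    inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)

    # unnorm_key selects the dataset statistics used to un-normalize the
    # predicted action; "libero_spatial" is an assumption for this checkpoint
    return vla.predict_action(**inputs, unnorm_key="libero_spatial", do_sample=False)
```

The returned action is a continuous control vector (e.g., end-effector deltas plus gripper state) that the robot executes for one timestep.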

Model Features

LIBERO-Spatial Dataset Fine-Tuning
Performance is optimized specifically for robotic spatial-reasoning tasks
LoRA Efficient Fine-Tuning
Parameter-efficient fine-tuning with LoRA (rank = 32) adapts the model to new tasks while preserving the original model's capabilities
Multimodal Processing Capability
Processes visual and language inputs jointly and outputs action commands
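To make the LoRA feature above concrete, here is a minimal NumPy sketch of the low-rank update LoRA applies to a frozen weight matrix, using the rank of 32 mentioned for this fine-tune. The dimensions and scaling factor are illustrative assumptions, not the actual OpenVLA training configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 32, 32  # rank 32, as in this fine-tune (dims illustrative)

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))               # trainable up-projection, zero-initialized

def lora_forward(x, W, A, B, alpha, r):
    """Frozen base projection plus the scaled low-rank update (alpha/r) * B @ A."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.normal(size=(2, d_in))
out = lora_forward(x, W, A, B, alpha, r)
# because B starts at zero, the adapter is initially a no-op and only the
# small A and B matrices are updated during fine-tuning
```

Only A and B (2 * r * d parameters per layer) are trained, which is why LoRA fine-tuning is far cheaper than updating the full 7B-parameter model.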

Model Capabilities

Vision-Language Understanding
Robotic Action Generation
Multimodal Reasoning
Spatial Task Processing

Use Cases

Robotic Control
Spatial Navigation Tasks
Generates robot navigation actions from visual input and text instructions
Performs well on the LIBERO-Spatial benchmark
Object Manipulation Tasks
Completes object grasping and placement tasks by combining visual and language inputs