Nora
Nora is an open-source vision-language-action model built on Qwen 2.5 VL-3B that generates robot actions from language instructions and camera images.
Downloads 7,063
Release Time : 4/28/2025
Model Overview
Nora is a vision-language-action model that takes a language instruction and a camera image as input and predicts a robot action expressed as a 7-DOF end-effector increment.
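To make the action format concrete, the sketch below shows one plausible layout of a 7-DOF end-effector increment: three translation deltas, three rotation deltas, and a gripper command. This ordering and these units are assumptions based on common conventions in Open X-Embodiment-style setups, not details taken from the Nora release itself.

```python
from dataclasses import dataclass

@dataclass
class ActionIncrement:
    """Hypothetical 7-DOF end-effector increment (layout is an assumption)."""
    dx: float       # translation delta along x (metres)
    dy: float       # translation delta along y (metres)
    dz: float       # translation delta along z (metres)
    droll: float    # rotation delta about x (radians)
    dpitch: float   # rotation delta about y (radians)
    dyaw: float     # rotation delta about z (radians)
    gripper: float  # gripper command, e.g. 0.0 = closed, 1.0 = open

    def as_vector(self) -> list[float]:
        """Flatten to the 7-element vector a policy would emit per step."""
        return [self.dx, self.dy, self.dz,
                self.droll, self.dpitch, self.dyaw, self.gripper]

a = ActionIncrement(0.01, 0.0, -0.005, 0.0, 0.0, 0.1, 1.0)
print(a.as_vector())
```

Each model forward pass would emit one such vector, which the robot controller then applies as a small relative motion of the end effector.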
Model Features
Vision-language-action integration
Processes visual inputs (camera images) and language instructions jointly to output robot actions
Open-source availability
All checkpoints and training codebases are publicly available under the MIT license
Trained on large-scale data
Trained using robot manipulation segments from the Open X-Embodiment dataset
7-DOF action prediction
Predicts 7-DOF robot actions covering end-effector position and orientation
Model Capabilities
Vision-language understanding
Robot action prediction
Instruction following
Zero-shot learning
Use Cases
Robot control
Instruction-based robot operation
Controls the robot to perform specific tasks from natural language instructions
Generates 7-DOF actions suitable for direct robot execution
Zero-shot instruction following
Performs tasks under unseen instructions and in unseen scenarios
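The use cases above hinge on turning each predicted 7-DOF increment into a motion command. The sketch below shows one way a control loop might apply an increment to the current end-effector state; the additive update, the [x, y, z, roll, pitch, yaw] pose layout, and the clipping limits are illustrative assumptions, since a real controller would compose rotations properly and enforce the robot's own action limits.

```python
def apply_increment(pose, action, pos_limit=0.05, rot_limit=0.25):
    """Apply one 7-DOF increment to a [x,y,z,roll,pitch,yaw] pose.

    pose:   current end-effector state, 6 floats (assumed layout).
    action: model output, 7 floats (3 translation, 3 rotation, 1 gripper).
    Returns the new 7-element state [x,y,z,roll,pitch,yaw,gripper].
    """
    def clip(v, lim):
        return max(-lim, min(lim, v))

    # Clip each delta to a per-step safety limit before applying it.
    new_pose = [pose[i] + clip(action[i], pos_limit) for i in range(3)]
    new_pose += [pose[i] + clip(action[i], rot_limit) for i in range(3, 6)]
    gripper = 1.0 if action[6] > 0.5 else 0.0  # binarise the gripper command
    return new_pose + [gripper]

state = [0.3, 0.0, 0.2, 0.0, 0.0, 0.0]
action = [0.01, 0.0, -0.005, 0.0, 0.0, 0.1, 0.9]
print(apply_increment(state, action))
```

In a closed-loop deployment this step would run once per inference: capture an image, query the model with the instruction, apply the returned increment, and repeat.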