Octo Base

Developed by rail-berkeley
Octo is a foundation model for robot control, trained as a diffusion policy. It accepts multimodal inputs and predicts future actions.
Downloads 215
Release Time: 12/13/2023

Model Overview

The Octo Base model is a Transformer-based robot control model that takes visual and language inputs and predicts future actions. It supports multi-camera input and language instructions, which makes it applicable to a range of robotic manipulation tasks.

Model Features

Multimodal Input Processing
Capable of processing visual inputs from both primary and wrist cameras, as well as language instructions
Diffusion Policy Training
Trained as a diffusion policy, so actions are generated by iterative denoising rather than by a single regression step
Large-Scale Dataset Training
Trained on a mixture of 26 robot datasets drawn from Open X-Embodiment
Flexible Input Support
At inference time, accepts any subset of the observation and task keys, with a history window of up to 2 timesteps
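
The flexible input behavior described above can be illustrated with a minimal sketch in plain Python. It is not the Octo API: the `HistoryWindow` class, the `pad_mask` field, and the key names `image_primary`, `image_wrist`, and `language_instruction` are hypothetical placeholders chosen for illustration. The sketch only shows the idea of keeping the last 2 observations, padding when fewer are available, and letting each observation carry any subset of keys.

```python
from collections import deque

WINDOW_SIZE = 2  # history length taken from the description above


class HistoryWindow:
    """Hypothetical helper: keeps the most recent observations and pads short histories."""

    def __init__(self, window_size=WINDOW_SIZE):
        self.window_size = window_size
        # A bounded deque silently drops the oldest frame once it is full.
        self.frames = deque(maxlen=window_size)

    def add(self, observation):
        # Each observation is a dict with any subset of keys,
        # e.g. only a primary image, or primary plus wrist images.
        self.frames.append(dict(observation))

    def as_model_input(self):
        if not self.frames:
            raise ValueError("add at least one observation first")
        frames = list(self.frames)
        n_pad = self.window_size - len(frames)
        # Pad on the left by repeating the oldest real frame.
        padded = [frames[0]] * n_pad + frames
        # The mask marks real timesteps with True and padding with False.
        pad_mask = [False] * n_pad + [True] * len(frames)
        return {"frames": padded, "pad_mask": pad_mask}


# Assumed task format: a single language instruction (placeholder key name).
task = {"language_instruction": "pick up the spoon"}

window = HistoryWindow()

# First step: only the primary camera is available, so the window is padded.
window.add({"image_primary": "frame_0"})
first = window.as_model_input()

# Later steps: both cameras are available and the window is full.
window.add({"image_primary": "frame_1", "image_wrist": "wrist_1"})
window.add({"image_primary": "frame_2", "image_wrist": "wrist_2"})
latest = window.as_model_input()

print(first["pad_mask"])   # [False, True]
print(latest["pad_mask"])  # [True, True]
```

In a real deployment the string placeholders would be image arrays, and the padded window together with the task would be passed to the model's own inference call, which this sketch does not reproduce.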

Model Capabilities

Visual Data Processing
Language Instruction Understanding
Multi-Step Action Prediction
Multi-Camera Input Processing
Robot Control

Use Cases

Industrial Robots
Assembly Line Operations
Controlling industrial robotic arms to complete product assembly tasks
Material Handling
Guiding robots to perform object grasping and placement operations
Service Robots
Home Assistant
Performing daily household tasks such as organizing items
Food Service
Completing food preparation and delivery tasks