RDT-170M

Developed by robotics-diffusion-transformer
RDT-170M is a 170-million-parameter imitation learning diffusion Transformer model designed for robot vision-language-action tasks.
Downloads: 278
Release Time: 10/23/2024

Model Overview

RDT-170M is a Transformer-based diffusion policy that predicts the next 64 robot actions from a language instruction and multi-view RGB images. It is compatible with a variety of mobile robotic-arm platforms.
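
The following is a minimal sketch of how such a policy might be used inside a control loop. The function name predict_action_chunk, the image resolution, and the action dimensionality are illustrative assumptions, not the repository's actual API.

```python
# Sketch of one control cycle with an RDT-style policy. All names and sizes
# here are placeholders; the real repository's interface may differ.
import numpy as np

ACTION_HORIZON = 64   # RDT-170M predicts the next 64 actions per inference call
ACTION_DIM = 128      # illustrative width of the unified action space

def predict_action_chunk(instruction: str, images: list[np.ndarray]) -> np.ndarray:
    """Stand-in for the real policy: a language instruction plus up to three
    RGB views in, a (64, ACTION_DIM) chunk of future actions out."""
    assert len(images) <= 3, "RDT-170M accepts at most three camera views"
    # A real implementation would encode the instruction and images and run
    # the diffusion Transformer; here we return a dummy chunk.
    return np.zeros((ACTION_HORIZON, ACTION_DIM), dtype=np.float32)

# One cycle: observe, predict a chunk, execute it action by action.
views = [np.zeros((384, 384, 3), dtype=np.uint8) for _ in range(3)]  # illustrative resolution
chunk = predict_action_chunk("pick up the red cup and place it on the tray", views)
for action in chunk:   # 64 consecutive actions
    pass               # send `action` to the robot controller here
```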

Model Features

Multimodal Input Support
Accepts language instructions and up to three RGB camera views as input.
Broad Compatibility
Compatible with single-arm and dual-arm robots, joint-space and end-effector-space control, position and velocity control, and a range of other robotic platforms.
Unified Action Space
Supports multiple robot control modes through a single unified action space (see the sketch after this list).
Large-scale Pretraining
Pretrained on 46 robot datasets.
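
One common way to realize a unified action space is to reserve fixed slots in a single padded vector for each physical quantity, with a mask marking which slots a given robot actually populates. The slot layout, vector width, and helper below are illustrative assumptions, not the model's actual schema.

```python
# Sketch: packing heterogeneous robot actions into one fixed-width vector.
import numpy as np

UNIFIED_DIM = 128  # illustrative width of the unified action space

# Hypothetical slot assignment: each quantity gets a reserved index range.
SLOTS = {
    "right_arm_joint_pos": slice(0, 7),    # 7-DoF joint positions
    "right_gripper":       slice(7, 8),
    "left_arm_joint_pos":  slice(8, 15),
    "left_gripper":        slice(15, 16),
    "base_velocity":       slice(16, 19),  # mobile base (vx, vy, wz)
}

def pack(raw: dict[str, np.ndarray]) -> tuple[np.ndarray, np.ndarray]:
    """Place whatever a given robot reports into the unified vector and
    return a mask marking which dimensions are actually populated."""
    vec = np.zeros(UNIFIED_DIM, dtype=np.float32)
    mask = np.zeros(UNIFIED_DIM, dtype=bool)
    for name, values in raw.items():
        vec[SLOTS[name]] = values
        mask[SLOTS[name]] = True
    return vec, mask

# A single-arm robot fills only its own slots; the rest stay zero and masked out.
vec, mask = pack({"right_arm_joint_pos": np.zeros(7), "right_gripper": np.array([1.0])})
```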

Model Capabilities

Vision-language understanding
Robot action prediction
Multimodal fusion
Diffusion model inference
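
For the diffusion-inference capability, an action chunk is sampled by starting from Gaussian noise and iteratively denoising it, conditioned on the encoded observation. The sketch below uses a stub denoiser and a simplified update rule; the step count, noise schedule, and conditioning shape are placeholders, not RDT-170M's actual sampler configuration.

```python
# Sketch of diffusion-style action sampling over a 64-step chunk.
import torch

HORIZON, ACTION_DIM, STEPS = 64, 128, 5      # illustrative sampling settings

def denoiser(noisy_chunk: torch.Tensor, t: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
    """Stand-in for the diffusion Transformer: predicts the denoised chunk."""
    return noisy_chunk * 0.0                  # dummy prediction

cond = torch.zeros(1, 512)                    # placeholder fused language+vision features
chunk = torch.randn(1, HORIZON, ACTION_DIM)   # start from pure noise
for step in reversed(range(STEPS)):
    t = torch.full((1,), step)
    pred = denoiser(chunk, t, cond)           # model's estimate of the clean actions
    # Simple blend toward the prediction; a real sampler follows its noise schedule.
    alpha = 1.0 - step / STEPS
    chunk = alpha * pred + (1.0 - alpha) * chunk

actions = chunk[0]                            # (64, ACTION_DIM) executable action chunk
```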

Use Cases

Robot Control
Mobile Robotic Arm Control
Controls a mobile robotic arm to perform tasks based on language instructions and visual inputs.
Can predict the next 64 robot actions.
Dual-arm Coordination
Controls a dual-arm robot to complete coordinated manipulation tasks.