Open-source Deep Reinforcement Learning Model for tqc-PandaPickAndPlace-v1 - Empowering Robot Manipulator Pick-and-Place Tasks

Tqc PandaPickAndPlace V1

Developed by sb3

This is a deep reinforcement learning agent trained with the TQC (Truncated Quantile Critics) algorithm on the PandaPickAndPlace-v1 environment, in which a simulated Panda robotic arm must grasp an object and place it at a target position.

Reinforcement Learning #Robotic Arm Grasping #Off-Policy Learning #Multi-Objective Optimization

Downloads 14

Release Time : 6/2/2022

Model Overview

This model is trained with TQC, an off-policy actor-critic algorithm, combined with Hindsight Experience Replay. It controls a robotic arm to grasp an object and place it at a goal position.

Model Features

HER-Based Sample Efficient Learning

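Uses Hindsight Experience Replay (HER) to improve sample efficiency under sparse rewards: transitions from failed episodes are relabeled with goals the arm actually reached, so they still provide a learning signal.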
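The relabeling idea behind HER can be sketched in a few lines of plain Python. This is an illustrative sketch, not the Stable Baselines3 implementation: the names `sparse_reward` and `relabel_future`, the 0.05 distance threshold, and the choice to store explicit relabeled copies are assumptions made for clarity. It follows the 'future' goal selection strategy with 4 sampled goals per transition, matching the hyperparameters listed for this model.

```python
import random


def sparse_reward(achieved, goal, threshold=0.05):
    """Assumed sparse reward: 0 within `threshold` of the goal, otherwise -1."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= threshold else -1.0


def relabel_future(episode, n_sampled_goal=4, rng=None):
    """Return the original transitions plus hindsight-relabeled copies.

    `episode` is a list of dicts with keys 'achieved_goal' (position reached
    after the step) and 'desired_goal'. Each relabeled copy replaces the
    desired goal with a goal achieved at the same or a later step.
    """
    rng = rng or random.Random(0)
    out = []
    for t, tr in enumerate(episode):
        # Keep the transition with its original goal.
        reward = sparse_reward(tr["achieved_goal"], tr["desired_goal"])
        out.append({**tr, "reward": reward})
        # Add copies whose goal is taken from the future of the same episode.
        for _ in range(n_sampled_goal):
            future = episode[rng.randrange(t, len(episode))]
            new_goal = future["achieved_goal"]
            out.append({
                **tr,
                "desired_goal": new_goal,
                "reward": sparse_reward(tr["achieved_goal"], new_goal),
            })
    return out


# A failed episode: the object never gets near the desired goal.
episode = [
    {"achieved_goal": (0.0, 0.0, 0.0), "desired_goal": (1.0, 1.0, 1.0)},
    {"achieved_goal": (0.1, 0.0, 0.0), "desired_goal": (1.0, 1.0, 1.0)},
    {"achieved_goal": (0.2, 0.0, 0.0), "desired_goal": (1.0, 1.0, 1.0)},
]
relabeled = relabel_future(episode)
```

Even though every original transition has reward -1, some relabeled copies receive reward 0, which is what gives the agent a learning signal in a sparse-reward task.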

Goal-Conditioned Policy

Uses a MultiInputPolicy over dictionary observations (robot state, achieved goal, desired goal), so a single policy can handle different object and target positions.

Stable Training

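Uses the TQC (Truncated Quantile Critics) algorithm, which models the return as a distribution of quantiles and discards the highest quantiles pooled across critics to reduce overestimation bias and stabilize training.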
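The truncation step that gives TQC its name can be illustrated as follows. This is a minimal sketch of that single step only, under the assumption that each critic outputs a list of quantile estimates; the function name `truncate_quantiles` and the toy numbers are hypothetical, and the full algorithm additionally adds the reward, the discount, and an entropy term when building the target.

```python
def truncate_quantiles(critic_quantiles, drop_per_critic):
    """Pool the quantiles of all critics, sort them, and drop the largest ones.

    `critic_quantiles` is a list with one list of quantile estimates per critic.
    `drop_per_critic` quantiles are removed for each critic, taken from the top
    of the pooled, sorted set.
    """
    pooled = sorted(q for critic in critic_quantiles for q in critic)
    n_drop = drop_per_critic * len(critic_quantiles)
    return pooled[: len(pooled) - n_drop]


# Two critics (as with n_critics=2), three quantiles each in this toy example.
quantiles = [[1.0, 2.0, 3.0], [0.0, 5.0, 10.0]]
kept = truncate_quantiles(quantiles, drop_per_critic=1)

full_mean = sum(q for c in quantiles for q in c) / 6
truncated_mean = sum(kept) / len(kept)
```

Dropping the largest pooled quantiles lowers the value estimate (here the truncated mean is smaller than the full mean), which counteracts the tendency of critics to overestimate returns.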

Model Capabilities

Robotic Arm Control

Object Grasping

Object Placing

Reinforcement Learning Task Solving

Use Cases

Industrial Automation

Production Line Item Sorting

Grasping items and placing them by category on automated production lines

Mean evaluation reward on PandaPickAndPlace-v1: -12.90 ± 8.87

Robotics Research

Robotic Arm Manipulation Research

Used to study the fine manipulation capabilities of robotic arms

🚀 TQC Agent for PandaPickAndPlace-v1

This is a trained TQC agent for the PandaPickAndPlace-v1 environment, leveraging the stable-baselines3 library and the RL Zoo.

🚀 Quick Start

The RL Zoo is a training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

✨ Features

Trained TQC agent for the PandaPickAndPlace-v1 environment.
Utilizes the stable-baselines3 library and RL Zoo for training and deployment.

📦 Installation

RL Zoo: https://github.com/DLR-RM/rl-baselines3-zoo
SB3: https://github.com/DLR-RM/stable-baselines3
SB3 Contrib: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib

💻 Usage Examples

Basic Usage

# Download the model and save it into the logs/ folder
python -m rl_zoo3.load_from_hub --algo tqc --env PandaPickAndPlace-v1 -orga sb3 -f logs/
# Run the downloaded agent (enjoy.py is a script in the RL Zoo repository)
python enjoy.py --algo tqc --env PandaPickAndPlace-v1 -f logs/

Advanced Usage

# Train the agent from scratch and save the results into the logs/ folder
python train.py --algo tqc --env PandaPickAndPlace-v1 -f logs/
# Upload the model and generate a video (when possible)
python -m rl_zoo3.push_to_hub --algo tqc --env PandaPickAndPlace-v1 -f logs/ -orga sb3

🔧 Technical Details

Hyperparameters

OrderedDict([('batch_size', 2048),
             ('buffer_size', 1000000),
             ('env_wrapper', 'sb3_contrib.common.wrappers.TimeFeatureWrapper'),
             ('gamma', 0.95),
             ('learning_rate', 0.001),
             ('n_timesteps', 1000000.0),
             ('policy', 'MultiInputPolicy'),
             ('policy_kwargs', 'dict(net_arch=[512, 512, 512], n_critics=2)'),
             ('replay_buffer_class', 'HerReplayBuffer'),
             ('replay_buffer_kwargs',
              "dict( online_sampling=True, goal_selection_strategy='future', "
              'n_sampled_goal=4, )'),
             ('tau', 0.05),
             ('normalize', False)])
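Several entries above are stored as strings containing Python expressions. The sketch below assumes they are evaluated into dictionaries before being handed to the algorithm; the helper `parse_kwargs` is hypothetical and only illustrates that step. It also derives two quantities implied by the settings: the share of relabeled samples, assuming one in every n_sampled_goal + 1 sampled transitions keeps its original goal, and the rule-of-thumb effective horizon 1 / (1 - gamma). Note that newer Stable Baselines3 releases may no longer accept the online_sampling argument.

```python
def parse_kwargs(text):
    """Evaluate a 'dict(...)' string, exposing only the dict constructor."""
    return eval(text, {"__builtins__": {}, "dict": dict})


# Strings copied from the hyperparameter listing.
policy_kwargs = parse_kwargs("dict(net_arch=[512, 512, 512], n_critics=2)")
replay_kwargs = parse_kwargs(
    "dict( online_sampling=True, goal_selection_strategy='future', "
    "n_sampled_goal=4, )"
)

# Assumed relation: n_sampled_goal relabeled samples per original one.
n_goals = replay_kwargs["n_sampled_goal"]
relabeled_fraction = n_goals / (n_goals + 1)

# Rule of thumb: rewards further than this many steps away are heavily discounted.
gamma = 0.95
effective_horizon = 1.0 / (1.0 - gamma)

print(policy_kwargs)
print(relabeled_fraction, effective_horizon)
```

With these settings, roughly four out of five sampled transitions carry a hindsight goal, and the discount factor corresponds to a horizon of about 20 steps.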

Related Research

Panda Gym environments: arxiv.org/abs/2106.13687

📄 License

No license information provided in the original document.

📊 Model Information

Property	Details
Model Type	TQC
Training Data	PandaPickAndPlace-v1
Mean Reward	-12.90 +/- 8.87
Task	Reinforcement Learning
Dataset	PandaPickAndPlace-v1
