PPO-HalfCheetah-v3: Open-Source Reinforcement Learning Model for the HalfCheetah-v3 Environment

PPO HalfCheetah-v3

Developed by sb3

This is a reinforcement learning model based on the PPO algorithm, specifically designed for the HalfCheetah-v3 environment and trained using the stable-baselines3 library.

Tags: Physics Model, Reinforcement Learning Control, Robot Motion Training, High-Reward Strategy

Downloads 51

Release date: 6/2/2022

Model Overview

The model was trained with the PPO (Proximal Policy Optimization) algorithm in the HalfCheetah-v3 environment. It controls a simulated half-cheetah robot in locomotion tasks.

Model Features

High-Performance Motion Control

Achieves a mean reward of 5836.27 +/- 171.68 in the HalfCheetah-v3 environment.

Optimized Hyperparameters

Uses a tuned hyperparameter configuration, including a learning rate of 2.0633e-05 and a batch size of 64 (the full list appears under Technical Details).

Stable Training

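Uses PPO, whose clipped policy objective (clip range 0.1 in this configuration) limits how far a single update can move the policy, which helps keep training stable.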
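To make the stability claim concrete, here is a minimal sketch of PPO's clipped surrogate objective for a single sample, using the clip range of 0.1 listed in this model's hyperparameters. The function name and the sample numbers are illustrative only; this is not the stable-baselines3 implementation, which works on batched tensors.

```python
def clipped_surrogate(ratio: float, advantage: float, clip_range: float = 0.1) -> float:
    """PPO clipped surrogate objective for one sample (to be maximized).

    ratio is pi_new(a|s) / pi_old(a|s); advantage is the estimated advantage.
    """
    clipped_ratio = max(1.0 - clip_range, min(ratio, 1.0 + clip_range))
    # Taking the minimum removes any incentive to push the ratio outside
    # [1 - clip_range, 1 + clip_range] in the direction the advantage favors.
    return min(ratio * advantage, clipped_ratio * advantage)


# Illustrative values (not taken from the trained model).
inside = clipped_surrogate(ratio=1.05, advantage=2.0)  # ratio within the clip range
above = clipped_surrogate(ratio=1.50, advantage=2.0)   # gain capped at 1.1 * advantage
below = clipped_surrogate(ratio=0.50, advantage=-2.0)  # capped at 0.9 * advantage

print(inside, above, below)
```

Once the probability ratio leaves the clip range in the favorable direction, the objective stops changing with the ratio, so the gradient no longer rewards moving the policy further away from the one that collected the data.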

Model Capabilities

Robot Motion Control

Reinforcement Learning Task Execution

Continuous Action Space Handling

Use Cases

Robot Simulation

Half-Cheetah Robot Motion Control

Controls a simulated half-cheetah robot to perform motion tasks such as running.

Mean reward: 5836.27 +/- 171.68.

Algorithm Research

Reinforcement Learning Algorithm Comparison

Serves as a benchmark model for comparing the performance of different reinforcement learning algorithms.

🚀 PPO Agent for HalfCheetah-v3

This is a trained PPO agent that plays HalfCheetah-v3, built with the stable-baselines3 library and the RL Zoo. The RL Zoo is a training framework for Stable Baselines3 reinforcement learning agents that includes hyperparameter optimization and pre-trained agents.

🚀 Quick Start

✨ Features

Trained PPO agent for the HalfCheetah-v3 environment.
Utilizes the stable-baselines3 library and RL Zoo for training and deployment.

📦 Installation

No specific installation steps are provided. To use the code below, make sure that stable-baselines3, rl_zoo3, and their dependencies are installed, including the MuJoCo simulator that HalfCheetah-v3 runs on.

💻 Usage Examples

Basic Usage

# Download the model and save it into the logs/ folder
python -m rl_zoo3.load_from_hub --algo ppo --env HalfCheetah-v3 -orga sb3 -f logs/
# Run the downloaded agent (enjoy.py is a script in the RL Zoo repository)
python enjoy.py --algo ppo --env HalfCheetah-v3 -f logs/
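For readers who prefer Python to the command line, the download step can be sketched with the huggingface_sb3 helper package. This is a hedged sketch, not part of the original instructions: the repository id is inferred from the `-orga sb3` flag, and the checkpoint filename `ppo-HalfCheetah-v3.zip` is an assumption about how the file is named on the Hub. Imports are deferred into the function so the snippet can be read and imported without the RL stack installed.

```python
# Assumed identifiers: the repo id is inferred from "-orga sb3"; the filename
# is a guess at the Hub naming convention and may need adjusting.
REPO_ID = "sb3/ppo-HalfCheetah-v3"
FILENAME = "ppo-HalfCheetah-v3.zip"


def load_agent(repo_id: str = REPO_ID, filename: str = FILENAME):
    """Download the checkpoint from the Hugging Face Hub and load it with SB3."""
    # Deferred imports: both packages must be installed for this call to work.
    from huggingface_sb3 import load_from_hub
    from stable_baselines3 import PPO

    # load_from_hub returns the local path of the downloaded file.
    checkpoint_path = load_from_hub(repo_id=repo_id, filename=filename)
    return PPO.load(checkpoint_path)


# Example call (needs network access and the packages above):
# model = load_agent()
```

Because this agent was trained with observation normalization (`normalize: True` in the hyperparameters), running it outside the RL Zoo also requires the saved normalization statistics. The `enjoy.py` command takes care of that, which is why it remains the simpler route.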

Advanced Usage

# Train the agent (train.py is a script in the RL Zoo repository)
python train.py --algo ppo --env HalfCheetah-v3 -f logs/
# Upload the model and generate a video (when possible)
python -m rl_zoo3.push_to_hub --algo ppo --env HalfCheetah-v3 -f logs/ -orga sb3

🔧 Technical Details

Hyperparameters

OrderedDict([('batch_size', 64),
             ('clip_range', 0.1),
             ('ent_coef', 0.000401762),
             ('gae_lambda', 0.92),
             ('gamma', 0.98),
             ('learning_rate', 2.0633e-05),
             ('max_grad_norm', 0.8),
             ('n_envs', 1),
             ('n_epochs', 20),
             ('n_steps', 512),
             ('n_timesteps', 1000000.0),
             ('normalize', True),
             ('policy', 'MlpPolicy'),
             ('policy_kwargs',
              'dict( log_std_init=-2, ortho_init=False, activation_fn=nn.ReLU, '
              'net_arch=[dict(pi=[256, 256], vf=[256, 256])] )'),
             ('vf_coef', 0.58096),
             ('normalize_kwargs', {'norm_obs': True, 'norm_reward': False})])
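The dictionary above mixes two kinds of settings: arguments for the stable-baselines3 `PPO` constructor, and keys that the RL Zoo consumes itself (`n_envs`, `n_timesteps`, `normalize`, `normalize_kwargs`). The sketch below separates them and works out the training budget the numbers imply. Variable names are illustrative, the split reflects my understanding of the Zoo and should be checked against the installed version, and `policy_kwargs` is left out because the Zoo stores it as a string to be evaluated.

```python
import math

# Values copied from the hyperparameter listing (policy_kwargs omitted).
hyperparams = {
    "batch_size": 64,
    "clip_range": 0.1,
    "ent_coef": 0.000401762,
    "gae_lambda": 0.92,
    "gamma": 0.98,
    "learning_rate": 2.0633e-05,
    "max_grad_norm": 0.8,
    "n_envs": 1,
    "n_epochs": 20,
    "n_steps": 512,
    "n_timesteps": 1000000.0,
    "normalize": True,
    "policy": "MlpPolicy",
    "vf_coef": 0.58096,
    "normalize_kwargs": {"norm_obs": True, "norm_reward": False},
}

# Keys assumed to be handled by the RL Zoo rather than passed to PPO.
ZOO_KEYS = {"n_envs", "n_timesteps", "normalize", "normalize_kwargs"}

ppo_kwargs = {k: v for k, v in hyperparams.items() if k not in ZOO_KEYS}
zoo_settings = {k: v for k, v in hyperparams.items() if k in ZOO_KEYS}

# Training budget implied by the numbers above.
rollout_size = hyperparams["n_steps"] * hyperparams["n_envs"]      # samples per rollout
minibatches_per_epoch = rollout_size // hyperparams["batch_size"]  # 512 / 64
updates_per_rollout = minibatches_per_epoch * hyperparams["n_epochs"]
# Rounded up, assuming rollouts are always collected whole.
num_rollouts = math.ceil(hyperparams["n_timesteps"] / rollout_size)

print(f"samples per rollout:  {rollout_size}")
print(f"gradient steps each:  {updates_per_rollout}")
print(f"rollouts in budget:   {num_rollouts}")
```

With a real installation, `ppo_kwargs` plus a `policy_kwargs` dictionary would be passed along the lines of `PPO(env=..., **ppo_kwargs)`. Recent stable-baselines3 releases appear to expect `net_arch=dict(pi=[256, 256], vf=[256, 256])` rather than the list-wrapped form shown in the listing, so treat that detail as version-dependent.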

📚 Documentation

Model Information

Property	Details
Model Type	PPO
Training Environment	HalfCheetah-v3
Mean Reward	5836.27 +/- 171.68
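
A figure such as 5836.27 +/- 171.68 is a mean and standard deviation over the returns of individual evaluation episodes. The sketch below shows that computation on made-up returns. The numbers are hypothetical placeholders, not the evaluation data behind this model, and the number of evaluation episodes used for the reported score is not stated here.

```python
from statistics import fmean, pstdev

# Hypothetical per-episode returns, for illustration only.
episode_returns = [5700.0, 5900.0, 6000.0, 5800.0]

mean_reward = fmean(episode_returns)
# Population standard deviation, matching what numpy.std computes by default.
std_reward = pstdev(episode_returns)

print(f"Mean reward: {mean_reward:.2f} +/- {std_reward:.2f}")
```

The stable-baselines3 helper `evaluate_policy` returns the same kind of mean and standard deviation pair when run against a loaded model and environment.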
