PPO-LunarLanderContinuous-v2 Open-source Agent: Helping Control the Lunar Lander to Land Smoothly

Developed by sb3

This is a reinforcement learning agent based on the PPO algorithm, specifically trained for the LunarLanderContinuous-v2 environment, capable of controlling the lunar lander for smooth landing.

Tags: Physics Model, Continuous Control Landing, Multi-environment Parallel Training, High Reward Stability

Release Time: 6/2/2022

Model Overview

This model is trained using the PPO algorithm from the stable-baselines3 library, suitable for continuous action space lunar lander control tasks.

Model Features

High-performance Continuous Control

Optimized for the LunarLanderContinuous-v2 environment, capable of handling continuous action space control problems.

Stable Training

Uses the PPO algorithm, whose clipped policy updates help keep training stable.

Parallel Training

Trained with 16 parallel environments (n_envs = 16), which speeds up experience collection.

Model Capabilities

Continuous action space control

Reinforcement learning decision-making

Autonomous landing control

Use Cases

Space Simulation

Lunar Lander Control

Simulates controlling a lunar lander for smooth landing on the moon's surface.

Average reward 274.47 ± 24.37

Education and Research

Reinforcement Learning Teaching

Serves as a teaching example for the PPO algorithm.

🚀 PPO Agent for LunarLanderContinuous-v2

This project provides a trained PPO agent for the LunarLanderContinuous-v2 environment. It was trained and packaged with the stable-baselines3 library and the RL Zoo. The RL Zoo is a training framework for Stable Baselines3 reinforcement learning agents that includes hyperparameter optimization and pre-trained agents.

✨ Features

Trained PPO agent for the LunarLanderContinuous-v2 environment.
Uses the Stable-Baselines3 library and the RL Zoo for training and deployment.
Hyperparameter optimization is included in the RL Zoo framework.

📦 Installation

Ensure you have the necessary libraries installed. You can find the relevant repositories here:

RL Zoo: https://github.com/DLR-RM/rl-baselines3-zoo
SB3: https://github.com/DLR-RM/stable-baselines3
SB3 Contrib: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib

💻 Usage Examples

Basic Usage

# Download model and save it into the logs/ folder
python -m rl_zoo3.load_from_hub --algo ppo --env LunarLanderContinuous-v2 -orga sb3 -f logs/
# Run the downloaded agent (enjoy.py is in the RL Zoo repository)
python enjoy.py --algo ppo --env LunarLanderContinuous-v2 -f logs/
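
For use outside the RL Zoo scripts, the checkpoint can also be loaded from Python. The following is a minimal sketch, assuming the `huggingface_sb3` and `stable-baselines3` packages are installed and that the repository follows the naming pattern `sb3/ppo-LunarLanderContinuous-v2` (inferred from the `-orga sb3` flag, not stated in this card). The helper names `hub_coordinates` and `load_agent` are hypothetical.

```python
from typing import Tuple


def hub_coordinates(algo: str, env_id: str, orga: str) -> Tuple[str, str]:
    """Return (repo_id, filename) under the assumed naming scheme."""
    model_name = f"{algo}-{env_id}"
    return f"{orga}/{model_name}", f"{model_name}.zip"


def load_agent(algo: str = "ppo",
               env_id: str = "LunarLanderContinuous-v2",
               orga: str = "sb3"):
    """Download the checkpoint and load it as a PPO model (needs network access)."""
    # Imported lazily so the naming helper works without these packages.
    from huggingface_sb3 import load_from_hub
    from stable_baselines3 import PPO

    repo_id, filename = hub_coordinates(algo, env_id, orga)
    checkpoint_path = load_from_hub(repo_id=repo_id, filename=filename)
    return PPO.load(checkpoint_path)
```

Calling `load_agent()` would return a model whose `predict(observation, deterministic=True)` method yields the two continuous engine commands for a given observation.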

Advanced Usage

# Train a new agent and save it into the logs/ folder
python train.py --algo ppo --env LunarLanderContinuous-v2 -f logs/
# Upload the model and generate video (when possible)
python -m rl_zoo3.push_to_hub --algo ppo --env LunarLanderContinuous-v2 -f logs/ -orga sb3
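
The training command can be approximated directly with the Stable-Baselines3 API. This is a rough sketch using the hyperparameters listed under Technical Details, not the RL Zoo's own training code. The split between constructor arguments and run-level settings (`RUN_LEVEL_KEYS`) and the function names are assumptions made for illustration, and the environment ID must be registered by the installed Gym version with Box2D support.

```python
HYPERPARAMS = {
    "batch_size": 64,
    "ent_coef": 0.01,
    "gae_lambda": 0.98,
    "gamma": 0.999,
    "n_envs": 16,
    "n_epochs": 4,
    "n_steps": 1024,
    "n_timesteps": 1_000_000,
    "policy": "MlpPolicy",
    "normalize": False,
}

# Settings that describe the run rather than PPO keyword arguments (assumed split).
RUN_LEVEL_KEYS = ("n_envs", "n_timesteps", "policy", "normalize")


def ppo_kwargs(hyperparams: dict) -> dict:
    """Keep only the entries passed to the PPO constructor as keywords."""
    return {k: v for k, v in hyperparams.items() if k not in RUN_LEVEL_KEYS}


def train(env_id: str = "LunarLanderContinuous-v2"):
    """Train a PPO agent with the settings above (requires stable-baselines3)."""
    from stable_baselines3 import PPO
    from stable_baselines3.common.env_util import make_vec_env

    # 'normalize' is False, so no observation/reward normalization wrapper is added.
    env = make_vec_env(env_id, n_envs=HYPERPARAMS["n_envs"])
    model = PPO(HYPERPARAMS["policy"], env, **ppo_kwargs(HYPERPARAMS))
    model.learn(total_timesteps=int(HYPERPARAMS["n_timesteps"]))
    return model
```

Unlike the RL Zoo command, this sketch does not write logs or checkpoints to `logs/`; calling `model.save(...)` on the returned model would be needed for that.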

🔧 Technical Details

Hyperparameters

OrderedDict([('batch_size', 64),
             ('ent_coef', 0.01),
             ('gae_lambda', 0.98),
             ('gamma', 0.999),
             ('n_envs', 16),
             ('n_epochs', 4),
             ('n_steps', 1024),
             ('n_timesteps', 1000000.0),
             ('policy', 'MlpPolicy'),
             ('normalize', False)])
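
These values determine how much data each PPO update sees. As a worked sketch, assuming the usual on-policy loop in SB3 (each rollout collects `n_steps` transitions from every environment, the update then runs `n_epochs` passes over minibatches of `batch_size`, and collection continues until at least `n_timesteps` steps have been gathered):

```python
import math

n_envs, n_steps, batch_size, n_epochs = 16, 1024, 64, 4
n_timesteps = 1_000_000

rollout_size = n_envs * n_steps                       # transitions per update
minibatches_per_epoch = rollout_size // batch_size    # minibatches in one pass
gradient_steps_per_update = minibatches_per_epoch * n_epochs
n_updates = math.ceil(n_timesteps / rollout_size)     # rollouts needed to reach the budget
total_steps = n_updates * rollout_size                # steps actually collected

print(rollout_size, minibatches_per_epoch, gradient_steps_per_update,
      n_updates, total_steps)
```

Under these assumptions each update uses 16,384 transitions split into 256 minibatches, and the run collects slightly more than the nominal one million steps because the last rollout is always completed.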

📄 License

No license information provided in the original document.

Property	Details
Library Name	stable-baselines3
Tags	LunarLanderContinuous-v2, deep-reinforcement-learning, reinforcement-learning, stable-baselines3
Model Name	PPO
Mean Reward	274.47 +/- 24.37
Task Type	reinforcement-learning
Dataset Name	LunarLanderContinuous-v2
Dataset Type	LunarLanderContinuous-v2
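
The mean reward is reported as a mean plus or minus a standard deviation over evaluation episodes. A minimal sketch of that summary follows; `summarize_returns` and `evaluate` are hypothetical helpers, `evaluate_policy` is the SB3 evaluation utility, and the number of evaluation episodes is an assumption because this card does not state it.

```python
import statistics
from typing import Sequence, Tuple


def summarize_returns(episode_returns: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, population standard deviation) of per-episode returns."""
    mean = statistics.fmean(episode_returns)
    std = statistics.pstdev(episode_returns)
    return mean, std


def evaluate(model, env, n_episodes: int = 10) -> Tuple[float, float]:
    """Run evaluation episodes and summarize them (requires stable-baselines3)."""
    from stable_baselines3.common.evaluation import evaluate_policy

    # With return_episode_rewards=True the utility returns per-episode
    # rewards and lengths instead of the aggregated mean and std.
    returns, _lengths = evaluate_policy(
        model, env, n_eval_episodes=n_episodes, return_episode_rewards=True
    )
    return summarize_returns(returns)
```

Passing a loaded PPO model and a matching environment to `evaluate` would produce a pair comparable in form to the figure in the table, though the exact numbers vary with the random seed and the number of episodes.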
