🚀 Stable-Baselines3 PPO Agent for BreakoutNoFrameskip-v4
This project showcases a trained PPO agent that plays Atari Breakout (BreakoutNoFrameskip-v4). It is built on the stable-baselines3 library and the RL Zoo, a training framework for Stable Baselines3 reinforcement learning agents that includes hyperparameter optimization and pre-trained agents.
🚀 Quick Start
Usage (with SB3 RL Zoo)
You can use the pre-trained model with the steps below. Relevant repositories:
- RL Zoo: https://github.com/DLR-RM/rl-baselines3-zoo
- SB3: https://github.com/DLR-RM/stable-baselines3
- SB3 Contrib: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
```bash
# Download the model and save it into the logs/ folder
python -m rl_zoo3.load_from_hub --algo ppo --env BreakoutNoFrameskip-v4 -orga sb3 -f logs/

# Watch the trained agent play
python enjoy.py --algo ppo --env BreakoutNoFrameskip-v4 -f logs/
```
Training (with the RL Zoo)
If you want to train the model from scratch, use the following commands:
```bash
python train.py --algo ppo --env BreakoutNoFrameskip-v4 -f logs/

# Upload the model and generate a video (when possible)
python -m rl_zoo3.push_to_hub --algo ppo --env BreakoutNoFrameskip-v4 -f logs/ -orga sb3
```
✨ Features
- Trained PPO Agent: A PPO agent trained to play BreakoutNoFrameskip-v4 effectively (mean reward 398.00 +/- 16.30).
- Utilizes RL Zoo: The RL Zoo provides a convenient framework for training and hyperparameter optimization.
📦 Installation
This README does not include explicit installation steps; refer to the official repositories linked above for installation instructions.
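As a rough sketch (package names and extras reflect current PyPI releases and may change; check the repositories above), installation typically looks like:

```shell
# rl_zoo3 pulls in stable-baselines3; the [extra] and AutoROM steps
# add Atari support and ROMs needed for BreakoutNoFrameskip-v4.
pip install rl_zoo3
pip install "stable-baselines3[extra]"
pip install "autorom[accept-rom-license]"
```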
💻 Usage Examples
Basic Usage
The basic usage is to download the pre-trained model and run it:

```bash
# Download the model and save it into the logs/ folder
python -m rl_zoo3.load_from_hub --algo ppo --env BreakoutNoFrameskip-v4 -orga sb3 -f logs/

# Watch the trained agent play
python enjoy.py --algo ppo --env BreakoutNoFrameskip-v4 -f logs/
```
Advanced Usage
For advanced usage, you can train the model from scratch and upload it to the hub:
```bash
python train.py --algo ppo --env BreakoutNoFrameskip-v4 -f logs/

# Upload the model and generate a video (when possible)
python -m rl_zoo3.push_to_hub --algo ppo --env BreakoutNoFrameskip-v4 -f logs/ -orga sb3
```
🔧 Technical Details
Hyperparameters
The following hyperparameters were used for training the PPO agent:
```python
OrderedDict([('batch_size', 256),
             ('clip_range', 'lin_0.1'),
             ('ent_coef', 0.01),
             ('env_wrapper', ['stable_baselines3.common.atari_wrappers.AtariWrapper']),
             ('frame_stack', 4),
             ('learning_rate', 'lin_2.5e-4'),
             ('n_envs', 8),
             ('n_epochs', 4),
             ('n_steps', 128),
             ('n_timesteps', 10000000.0),
             ('policy', 'CnnPolicy'),
             ('vf_coef', 0.5),
             ('normalize', False)])
```
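As an illustration of how these RL Zoo entries map onto SB3's `PPO` constructor: the `'lin_X'` notation is RL Zoo shorthand for a schedule that decays linearly from X down to 0, and SB3 accepts such schedules as a callable of the remaining-progress fraction. The `linear_schedule` helper below is a sketch (RL Zoo has its own schedule parsing); the environment setup (8 `AtariWrapper`'d envs with 4-frame stacking) is omitted.

```python
def linear_schedule(initial_value):
    """Return a schedule that decays linearly from initial_value to 0."""
    def schedule(progress_remaining):
        # progress_remaining goes from 1.0 (start) to 0.0 (end of training)
        return progress_remaining * initial_value
    return schedule

# Keyword arguments corresponding to the hyperparameters listed above
ppo_kwargs = dict(
    policy="CnnPolicy",
    learning_rate=linear_schedule(2.5e-4),  # 'lin_2.5e-4'
    clip_range=linear_schedule(0.1),        # 'lin_0.1'
    n_steps=128,
    batch_size=256,
    n_epochs=4,
    ent_coef=0.01,
    vf_coef=0.5,
)
# model = PPO(env=vec_env, **ppo_kwargs)  # vec_env: 8 Atari-wrapped envs
```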
📄 License
This README does not provide license information; refer to the official repositories for license details.
📚 Documentation
Model Performance
| Property | Details |
| --- | --- |
| Model Type | PPO |
| Mean Reward | 398.00 +/- 16.30 |
| Task | Reinforcement Learning |
| Dataset | BreakoutNoFrameskip-v4 |