PPO Pendulum-v1

Developed by ernestumorga
This is a reinforcement learning model based on the PPO algorithm, designed to solve control problems in the Pendulum-v1 environment.
Downloads 16
Release Time: 6/7/2022

Model Overview

The model is trained with the PPO (Proximal Policy Optimization) algorithm in the Pendulum-v1 environment. The goal is to swing the pendulum up and hold it stably in the upright (inverted) position.

Model Features

Based on PPO Algorithm
Trained with PPO, a policy-gradient method that clips each policy update so the new policy stays close to the old one. This keeps training stable while still allowing several optimization passes over each batch of collected experience.
Multi-environment Parallel Training
Supports training in 4 parallel environments (n_envs=4), which collect experience simultaneously and reduce wall-clock training time.
State-Dependent Exploration
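Uses State-Dependent Exploration (use_sde=True), in which the exploration noise depends on the current state instead of being drawn independently at every step. This gives smoother exploration in continuous control tasks.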
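
To make the stability claim concrete, here is a minimal sketch of PPO's per-sample clipped surrogate objective in plain Python. It is an illustration only, not this model's training code: the function name `clipped_surrogate` is hypothetical, and `clip_range=0.2` is an assumed, commonly used value (the clip range actually used for this model is not stated here).

```python
import math


def clipped_surrogate(new_logp, old_logp, advantage, clip_range=0.2):
    """Per-sample PPO clipped objective (a quantity to be maximized).

    new_logp / old_logp: log-probability of the taken action under the
    updated policy and under the policy that collected the data.
    clip_range: assumed value for illustration, not taken from this model.
    """
    # Probability ratio between the new and the old policy.
    ratio = math.exp(new_logp - old_logp)
    # Restrict the ratio to [1 - clip_range, 1 + clip_range].
    clipped_ratio = max(1.0 - clip_range, min(ratio, 1.0 + clip_range))
    # Taking the minimum makes the objective a pessimistic bound, so a
    # large policy change cannot earn extra credit.
    return min(ratio * advantage, clipped_ratio * advantage)


# Ratio 1.5 with a positive advantage: the gain is capped at 1.2 * advantage.
capped_gain = clipped_surrogate(math.log(1.5), 0.0, advantage=2.0)

# Ratio 0.5 with a negative advantage: the clipped term (0.8 * advantage)
# is the smaller one, so it is selected.
capped_loss = clipped_surrogate(math.log(0.5), 0.0, advantage=-1.0)

# Unchanged policy (ratio 1.0): the objective equals the advantage.
unchanged = clipped_surrogate(0.0, 0.0, advantage=3.0)
```

Because the objective stops improving once the ratio leaves the clip interval, the optimizer has no incentive to push the policy far from the one that collected the data.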

Model Capabilities

Inverted Pendulum Control
Continuous Action Space Handling
Reinforcement Learning Policy Optimization

Use Cases

Control Problems
Inverted Pendulum Balance Control
Control the inverted pendulum to maintain an upright position
Average reward: -227.99 +/- 144.65
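
To help interpret the reported figure, the sketch below shows the per-step reward documented for Pendulum-v1 and how a "mean +/- standard deviation" summary is typically computed from episode returns. The helper names and the sample returns are hypothetical and are not this model's evaluation data. The use of the population standard deviation is an assumption as well.

```python
import math
import statistics


def pendulum_step_reward(theta, theta_dot, torque):
    """Per-step reward as documented for Pendulum-v1.

    theta is the angle from upright in radians, theta_dot the angular
    velocity, torque the applied action. The best possible value is 0.
    """
    # Normalize the angle to [-pi, pi) before squaring it.
    theta = ((theta + math.pi) % (2 * math.pi)) - math.pi
    return -(theta ** 2 + 0.1 * theta_dot ** 2 + 0.001 * torque ** 2)


def summarize_returns(episode_returns):
    """Return (mean, population std) of per-episode total rewards."""
    mean = statistics.fmean(episode_returns)
    std = statistics.pstdev(episode_returns)  # assumption: ddof = 0
    return mean, std


# Upright, motionless, no torque: the reward is zero.
best_step = pendulum_step_reward(0.0, 0.0, 0.0)

# Hanging straight down at the maximum speed (8) and torque (2) limits.
worst_step = pendulum_step_reward(math.pi, 8.0, 2.0)

# Made-up episode returns, used only to show the summary format.
mean, std = summarize_returns([-100.0, -300.0])
summary = f"{mean:.2f} +/- {std:.2f}"
```

Since every step's reward is at most 0, episode returns are always non-positive, and values closer to 0 indicate that the pendulum spent more of the episode near upright.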