PPO Pendulum-v1
Developed by ernestumorga
This is a reinforcement learning model based on the PPO algorithm, designed to solve control problems in the Pendulum-v1 environment.
Downloads 16
Release Time: 6/7/2022
Model Overview
The model was trained with the PPO (Proximal Policy Optimization) algorithm on the Pendulum-v1 environment, where the agent applies torque to swing a pendulum upright and hold it there.
Model Features
Based on PPO Algorithm
Trained with PPO, a policy-gradient method that clips each policy update so the new policy cannot move too far from the old one, which keeps training stable while still reusing each batch of experience for several gradient steps.
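To make the stability claim concrete, here is a minimal sketch of PPO's per-sample clipped surrogate objective. It is illustrative only: the function name is hypothetical, and the clip range of 0.2 is a commonly used default that this model card does not confirm.

```python
def clipped_surrogate(ratio: float, advantage: float, clip_range: float = 0.2) -> float:
    """Per-sample PPO clipped objective (the quantity PPO maximizes).

    ratio      -- new_policy_prob / old_policy_prob for the sampled action
    advantage  -- estimated advantage of that action
    clip_range -- assumed value; not stated in the model card
    """
    # Restrict the probability ratio to [1 - clip_range, 1 + clip_range].
    clipped_ratio = max(1.0 - clip_range, min(ratio, 1.0 + clip_range))
    # Taking the minimum removes any incentive to push the ratio outside the clip range.
    return min(ratio * advantage, clipped_ratio * advantage)


# Positive advantage: the gain from raising the action's probability is capped.
capped_gain = clipped_surrogate(ratio=1.5, advantage=2.0)
# Negative advantage: the objective is capped once the ratio drops below the range.
capped_loss = clipped_surrogate(ratio=0.5, advantage=-1.0)
print(capped_gain, capped_loss)
```

Inside the clip range the objective equals the ordinary policy-gradient term `ratio * advantage`; outside it the gradient with respect to the ratio vanishes, which is what limits the size of each update.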
Multi-environment Parallel Training
Trained with 4 parallel environments (n_envs=4), so each policy update draws on experience collected from four environment copies at once.
State-Dependent Exploration
Uses State-Dependent Exploration (use_sde=True), in which the exploration noise is a function of the current state rather than independent noise added at every step.
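The two settings named above, n_envs=4 and use_sde=True, match parameter names used by the Stable-Baselines3 library, so a training setup along these lines is plausible. The sketch below is an assumption, not the author's actual script: the `train` function, the `HYPERPARAMS` dictionary, and the `total_timesteps` value are hypothetical, and every hyperparameter not listed on this page is left at the library default.

```python
# Only these two values come from the model card.
HYPERPARAMS = {"n_envs": 4, "use_sde": True}


def train(total_timesteps: int = 100_000):
    """Train PPO on Pendulum-v1 (total_timesteps is an assumed, illustrative value)."""
    # Imported here so the module can be inspected without Stable-Baselines3 installed.
    from stable_baselines3 import PPO
    from stable_baselines3.common.env_util import make_vec_env

    # Four copies of the environment, stepped together as one vectorized environment.
    env = make_vec_env("Pendulum-v1", n_envs=HYPERPARAMS["n_envs"])
    # use_sde=True switches the policy to generalized State-Dependent Exploration.
    model = PPO("MlpPolicy", env, use_sde=HYPERPARAMS["use_sde"], verbose=1)
    model.learn(total_timesteps=total_timesteps)
    return model
```

Calling `train()` requires Stable-Baselines3 and a Gym-compatible installation that provides Pendulum-v1. The resulting agent will not necessarily reproduce the reward reported on this page, since the remaining hyperparameters are unknown.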
Model Capabilities
Inverted Pendulum Control
Continuous Action Space Handling
Reinforcement Learning Policy Optimization
Use Cases
Control Problems
Inverted Pendulum Balance Control
Control the inverted pendulum to maintain an upright position
Average reward: -227.99 +/- 144.65
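To put the reported figure in context, the sketch below reproduces the standard Pendulum-v1 per-step reward and converts the reported episode average into a per-step value. It assumes the evaluation used the environment's default 200-step episodes, which this page does not state. The helper names are illustrative.

```python
import math

MAX_SPEED = 8.0      # angular velocity limit in Pendulum-v1 (rad/s)
MAX_TORQUE = 2.0     # torque limit in Pendulum-v1
EPISODE_STEPS = 200  # default episode length; assumed for the reported evaluation


def angle_normalize(theta: float) -> float:
    """Wrap an angle into [-pi, pi), with 0 meaning upright."""
    return ((theta + math.pi) % (2 * math.pi)) - math.pi


def step_reward(theta: float, theta_dot: float, torque: float) -> float:
    """Pendulum-v1 reward: penalizes angle from upright, angular speed, and torque."""
    theta = angle_normalize(theta)
    return -(theta ** 2 + 0.1 * theta_dot ** 2 + 0.001 * torque ** 2)


# Worst case: hanging straight down at maximum speed with maximum torque.
worst_step = step_reward(math.pi, MAX_SPEED, MAX_TORQUE)
worst_episode = worst_step * EPISODE_STEPS

# Reported average episode reward from this page, spread over one episode.
reported_mean = -227.99
mean_per_step = reported_mean / EPISODE_STEPS

print(f"worst possible per-step reward: {worst_step:.4f}")
print(f"worst possible episode reward:  {worst_episode:.1f}")
print(f"reported mean per-step reward:  {mean_per_step:.4f}")
```

The best possible reward is 0 (upright, motionless, no torque), so values closer to 0 are better. Because every episode starts from a random angle and must spend steps swinging up, even a well-behaved policy accumulates some negative reward.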
© 2025 AIbase