Reinforcement Learning

Practical Tips for Reliable Reinforcement Learning

Talk at the Industrial RL Workshop in Saclay about the lessons learned while developing Stable-Baselines3 to achieve reliable implementations and reproducible experiments.

A Simple Open-Loop Baseline for Reinforcement Learning Locomotion Tasks

In search of the simplest baseline capable of competing with Deep Reinforcement Learning on locomotion tasks, we propose a biologically inspired model-free open-loop strategy. Drawing upon prior knowledge and harnessing the elegance of simple …
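As a rough illustration of what such an open-loop strategy can look like (a sketch under my own assumptions, not the paper's exact controller), each joint can track a time-dependent sine wave whose amplitude, frequency, and phase are parameters tuned offline, with no feedback from observations:

```python
import numpy as np

# Illustrative open-loop oscillator policy: desired joint positions depend
# only on time, never on observations (class and parameter names are mine).
class OpenLoopOscillator:
    def __init__(self, amplitudes, frequencies, phases):
        self.amplitudes = np.asarray(amplitudes)
        self.frequencies = np.asarray(frequencies)
        self.phases = np.asarray(phases)

    def act(self, t: float) -> np.ndarray:
        # One sine wave per joint; parameters could be tuned offline,
        # e.g. with a black-box optimizer such as CMA-ES.
        return self.amplitudes * np.sin(2 * np.pi * self.frequencies * t + self.phases)

# Example: three joints sharing one gait frequency, out of phase.
policy = OpenLoopOscillator(
    amplitudes=[0.4, 0.4, 0.4],
    frequencies=[1.5, 1.5, 1.5],
    phases=[0.0, np.pi / 2, np.pi],
)
action = policy.act(t=0.02)
```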

Knowledge Guided Reinforcement Learning for Robotics

DQN Tutorial

From tabular Q-learning to Deep Q-Network (DQN)
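As a taste of the tutorial's starting point, here is a minimal sketch of the tabular Q-learning update rule (state/action sizes and the `q_update` helper are illustrative):

```python
import numpy as np

# Tabular Q-learning update, the building block DQN generalizes
# with a neural network (illustrative sketch).
n_states, n_actions = 16, 4
q_table = np.zeros((n_states, n_actions))
learning_rate, gamma = 0.1, 0.99

def q_update(state, action, reward, next_state, done):
    # TD target: reward plus discounted value of the best next action.
    target = reward + (1.0 - done) * gamma * q_table[next_state].max()
    # Move the current estimate toward the target.
    q_table[state, action] += learning_rate * (target - q_table[state, action])
```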

Learning to Exploit Elastic Actuators for Quadruped Locomotion

Spring-based actuators in legged locomotion provide energy-efficiency and improved performance, but increase the difficulty of controller design. Whereas previous works have focused on extensive modeling and simulation to find optimal controllers for …

Training RL agents directly on real robots

Presentation on applying Reinforcement Learning directly on real robots

Tutorial: Tools for Robotic Reinforcement Learning

Hands-on RL for Robotics with EAGERx and Stable-Baselines3

The 37 Implementation Details of Proximal Policy Optimization

Proximal policy optimization (PPO) has become one of the most popular deep reinforcement learning (DRL) algorithms. Yet, reproducing PPO's results has been challenging in the community. While recent works conducted ablation studies to provide …
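For context, one piece at the heart of those implementation details is PPO's clipped surrogate objective; a minimal sketch (function and argument names are illustrative):

```python
import torch

# Clipped surrogate loss from the PPO paper (sketch, not a full training loop).
def ppo_clip_loss(log_probs, old_log_probs, advantages, clip_range=0.2):
    # Probability ratio pi_theta(a|s) / pi_theta_old(a|s).
    ratio = torch.exp(log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_range, 1 + clip_range) * advantages
    # Pessimistic bound: take the element-wise minimum, negate to minimize.
    return -torch.min(unclipped, clipped).mean()
```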

Stable-Baselines3: Reliable Reinforcement Learning Implementations

Stable-Baselines3 provides open-source implementations of deep reinforcement learning (RL) algorithms in Python. The implementations have been benchmarked against reference codebases, and automated unit tests cover 95% of the code. The algorithms …
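For instance, training and evaluating an agent takes only a few lines with the library's high-level API:

```python
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

# Train a PPO agent on CartPole; the env id string is resolved internally.
model = PPO("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=10_000)

# Evaluate the trained policy over a few episodes.
mean_reward, std_reward = evaluate_policy(model, model.get_env(), n_eval_episodes=10)
print(f"Mean reward: {mean_reward:.1f} +/- {std_reward:.1f}")
```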

Smooth Exploration for Robotic Reinforcement Learning

We extend the original state-dependent exploration (SDE) to apply deep reinforcement learning algorithms directly on real robots. The resulting method, gSDE, yields competitive results in simulation but outperforms the unstructured exploration on the real robot.
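In Stable-Baselines3, gSDE can be enabled through the `use_sde` flag; a minimal sketch (environment choice and hyperparameters are illustrative):

```python
from stable_baselines3 import SAC

# With use_sde=True, exploration noise is a function of the state features
# and is resampled only every sde_sample_freq steps, giving smoother
# action trajectories than independent per-step Gaussian noise.
model = SAC("MlpPolicy", "Pendulum-v1", use_sde=True, sde_sample_freq=4)
model.learn(total_timesteps=5_000)
```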