Papers
A curated collection of papers focusing on Reinforcement Learning.
-
Proximal Policy Optimization Algorithms
Authors: John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov
Published: July 2017
We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a 'surrogate' objective function using stochastic gradient ascent.
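As a rough illustration of what a 'surrogate' objective can look like, here is a minimal sketch of the clipped variant described in the paper. The function name `clipped_surrogate`, the sample arrays, and the default `epsilon=0.2` are assumptions made for this example, not taken from the summary above.

```python
import numpy as np

def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    ratio:     pi_new(a|s) / pi_old(a|s) for each sampled action
    advantage: advantage estimate for each sampled action
    """
    unclipped = ratio * advantage
    # Clipping removes the incentive to move the ratio outside [1-eps, 1+eps].
    clipped = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantage
    # The elementwise minimum gives a pessimistic bound on the unclipped term.
    return np.mean(np.minimum(unclipped, clipped))

# Hypothetical batch of three samples.
ratio = np.array([1.5, 0.5, 1.0])
advantage = np.array([1.0, -1.0, 2.0])
objective = clipped_surrogate(ratio, advantage)
```

In practice the ratio would come from a differentiable policy network, and the objective would be maximized with a stochastic gradient optimizer over several epochs per batch of sampled data.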
-
Trust Region Policy Optimization
Authors: John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel
Published: April 2017
We describe an iterative procedure for optimizing policies, with guaranteed monotonic improvement. By making several approximations to the theoretically-justified procedure, we develop a practical algorithm, called Trust Region Policy Optimization (TRPO).
-
Deep Reinforcement Learning with Double Q-Learning
Authors: Hado van Hasselt, Arthur Guez, David Silver
Published: April 2015
Adapts Double Q-learning to deep Q-networks (Double DQN) to reduce the overestimation bias of standard Q-learning targets.
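The core idea can be sketched in a few lines: the next action is selected with one set of value estimates and evaluated with another, so a single noisy estimate is not both chosen and trusted. The function name `double_q_targets`, the array names, and the sample values below are assumptions made for this illustration.

```python
import numpy as np

def double_q_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Bootstrapped targets with decoupled action selection and evaluation.

    q_online_next, q_target_next: arrays of shape (batch, num_actions)
    holding next-state action values from the online and target estimators.
    """
    # Select the greedy next action using the online estimates...
    best_actions = np.argmax(q_online_next, axis=1)
    # ...but read its value from the target estimates.
    evaluated = q_target_next[np.arange(len(best_actions)), best_actions]
    # Terminal transitions (dones == 1) do not bootstrap.
    return rewards + gamma * (1.0 - dones) * evaluated

# Hypothetical batch of two transitions with two actions each.
rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])
q_online_next = np.array([[0.5, 2.0], [1.0, 0.2]])
q_target_next = np.array([[1.5, 1.0], [0.3, 0.9]])
targets = double_q_targets(rewards, dones, q_online_next, q_target_next, gamma=0.9)
```

For the first transition, a plain max over `q_target_next` would bootstrap from 1.5, whereas the decoupled version bootstraps from 1.0, the target value of the action the online estimates prefer.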
-
Policy Gradient Methods for Reinforcement Learning with Function Approximation
Authors: Richard S. Sutton, David McAllester, Satinder Singh, Yishay Mansour
Published: June 2000
Seminal work on policy gradient methods for reinforcement learning.