DQN vs REINFORCE

Implemented REINFORCE as an intro to policy optimization methods.

November, 2024

This project was my introduction to policy gradient methods. For the final project in my intro to reinforcement learning class, I chose to compare the performance of deep Q-learning (DQN) with a basic policy gradient method: the REINFORCE algorithm. In my setup (PyTorch, Lunar Lander), DQN consistently converged with fewer samples. In a future project I would like to implement more modern policy gradient methods like PPO, and also a distributional variant of DQN (QR-DQN).
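
For readers new to REINFORCE, here is a minimal sketch of its core computation. It is not the project's code. The function names and the default discount factor are assumptions chosen for illustration, and it uses plain Python floats so it runs without PyTorch. In a PyTorch version the log-probabilities would be tensors produced by the policy network, and the loss would be backpropagated once per episode.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * r_{t+1} + ... for each step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """Surrogate loss -sum_t log pi(a_t | s_t) * G_t.

    Minimizing it with gradient descent follows the REINFORCE gradient estimate.
    """
    return -sum(lp * g for lp, g in zip(log_probs, returns))
```

Because the returns are only known once an episode ends, each update needs a complete rollout and the rollout is typically not reused afterwards. DQN, by contrast, usually replays stored transitions many times, which is one plausible reason for the sample-efficiency gap noted above.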

A full write-up of this project, along with the code, will be available in the future.