Making some FlapJax
In this post, we will implement a FlappyBird-like game, implement the Proximal-Policy-Optimization (PPO) algorithm, and use PPO to train an agent to play the game.
-
FlapJax Part 1 - Making a Flappy Bird game with PyGame
The flappy bird game is quite simple. Each step, the user can take one of two options: do nothing or flap. There are pipes that slide by which have a gap which the player is supposed to go through.
-
FlapJax Part 2 - Reinforcement Learning, Policy Gradients, and Proximal Policy Optimization
In this section, we will be implementing an Actor-Critic reinforcement algorithm. Reinforcement Learning Terminology and Deep-Q Learning First, let us set up terminology and pose the problem a reinforcement algorithm aims to solve.
-
FlapJax Part 3 - Implementation of the proximal policy optimization algorithm using Jax and Flax
In this part of the post, we will actually write all the code to implement the Proximal-Policy-Optimization algorithm. We will be using jax, flax and optax to implement the algorithm, network and optimization.