Making some FlapJax

In this post, we will implement a FlappyBird-like game, implement the Proximal-Policy-Optimization (PPO) algorithm, and use PPO to train an agent to play the game.

All the code is available in this repo.
Trained network playing Flappy Bird
Trained network playing Flappy Bird