⚠️ 🛑 ML Mentors: Please connect with Kai (via email) to discuss whether jumping into the reinforcement learning content makes sense for your student.
Learning Objective: Learn the foundations of reinforcement learning (RL) and understand when and how to implement RL algorithms in your projects
Learning RL can be difficult, as most content is not accessible to younger students. That's why we built our own RL curriculum to enable our students to learn the foundations of RL and leverage RL algorithms in their projects! The curriculum is hands-on and interactive, just like all the other Breakout Mentors machine learning curriculums! It's designed to be accessible for students who have completed sections 1-4 of our deep learning foundations course. Students interested in game development, robotics, recommendation systems (e.g., Spotify, Netflix), dynamic decision-making (e.g., finance, economics), and/or machine learning with human feedback may benefit from our RL curriculum. Talk to your ML Mentor to see if learning RL makes sense for your goals and timeline (ML Mentors, see note at the beginning of this section).
About: Reinforcement Learning (RL) is a machine learning paradigm designed to enable agents (e.g., computers) to learn optimal decision-making strategies by interacting with an environment. The primary goal of RL is to train agents to make a sequence of actions that achieve the desired outcome over time. RL is invaluable in situations where explicit instructions or labeled data are scarce or impractical, as it allows agents to autonomously discover optimal behaviors through trial and error. It operates on the principle of feedback: an agent takes actions, observes the environment's response, and uses this feedback to adjust its decision-making process over successive interactions. Through this iterative process of exploration and exploitation, RL algorithms learn to make intelligent decisions, making them particularly suitable for applications in robotics, game playing, recommendation systems, and other domains where dynamic decision-making is essential.
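The feedback loop described above can be sketched in a few lines of plain Python. This is a minimal, hypothetical "walk to the goal" environment and a random (untrained) agent, invented just for illustration; it is not from any library.

```python
import random

# Hypothetical toy environment: the agent starts at position 0
# and must reach position 5 by moving left (-1) or right (+1).
class WalkEnv:
    def reset(self):
        self.position = 0
        return self.position              # initial observation (state)

    def step(self, action):
        self.position = max(0, self.position + action)
        done = self.position >= 5         # episode ends at the goal
        reward = 1.0 if done else -0.1    # small penalty per step taken
        return self.position, reward, done

# The RL feedback loop: act, observe the environment's response, repeat.
env = WalkEnv()
state = env.reset()
total_reward, done = 0.0, False
while not done:
    action = random.choice([-1, 1])       # a random agent, no learning yet
    state, reward, done = env.step(action)
    total_reward += reward                # the feedback a learner would use
```

A learning agent would use the reward signal to prefer actions that lead to the goal faster, instead of choosing randomly.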
Learning Objective: In these lessons, you will embark on your journey of Reinforcement Learning! You will learn basic RL terminology and how it relates to a Markov Decision Process.
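The core ingredients of a Markov Decision Process (states, actions, transition probabilities, rewards, and a discount factor) can be written down explicitly for a small problem. The example below is a made-up two-state MDP, invented purely to show how the terminology maps to concrete data:

```python
# A toy Markov Decision Process written out explicitly (hypothetical example).
# S: states, A: actions, P[s][a]: list of (probability, next_state) pairs,
# R[s][a]: immediate reward, gamma: discount factor.
mdp = {
    "S": ["cool", "hot"],
    "A": ["run", "rest"],
    "P": {
        "cool": {"run": [(0.7, "hot"), (0.3, "cool")], "rest": [(1.0, "cool")]},
        "hot":  {"run": [(1.0, "hot")],                "rest": [(0.8, "cool"), (0.2, "hot")]},
    },
    "R": {
        "cool": {"run": 2.0,  "rest": 0.0},
        "hot":  {"run": -1.0, "rest": 0.5},
    },
    "gamma": 0.9,
}

# Sanity check: every transition distribution must sum to 1.
for s in mdp["S"]:
    for a in mdp["A"]:
        assert abs(sum(p for p, _ in mdp["P"][s][a]) - 1.0) < 1e-9
```

Solving an MDP means finding a policy (a rule mapping states to actions) that maximizes the expected discounted sum of rewards.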
Learning Objective: In this lesson, we will talk about:
- Monte Carlo Method (learning by sampling)
- Temporal Difference Method (a different value-update scheme than Monte Carlo)
- Q-Learning (using a table to store action values)
- Deep Q-Learning (Q-Learning with a neural network)
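To give a feel for what these lessons build toward, here is a minimal tabular Q-learning sketch on a hypothetical 6-state chain (states 0-5, goal at 5). The environment and hyperparameters are invented for illustration; the update rule is the standard temporal-difference target r + γ·max Q(s', a'):

```python
import random

# Hypothetical chain environment: actions 0 (left) / 1 (right), goal = state 5.
def step(state, action):
    next_state = max(0, min(5, state + (1 if action == 1 else -1)))
    done = next_state == 5
    reward = 1.0 if done else -0.1
    return next_state, reward, done

Q = {s: [0.0, 0.0] for s in range(6)}   # the Q-table: Q[state][action]
alpha, gamma, epsilon = 0.5, 0.9, 0.1   # learning rate, discount, exploration

for episode in range(500):
    state, done = 0, False
    while not done:
        # epsilon-greedy: explore sometimes, otherwise exploit the table
        if random.random() < epsilon:
            action = random.randint(0, 1)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        next_state, reward, done = step(state, action)
        # temporal-difference update toward r + gamma * max_a' Q(s', a')
        target = reward + gamma * max(Q[next_state])
        Q[state][action] += alpha * (target - Q[state][action])
        state = next_state

# After training, "move right" should be preferred in every non-goal state.
policy = [0 if Q[s][0] > Q[s][1] else 1 for s in range(5)]
```

Deep Q-Learning replaces the table `Q` with a neural network that estimates Q-values, which is what makes the method scale to large state spaces.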
Learning Objective: In this lesson, we will compare value-based and policy-based methods. We will also discuss the high-level process of policy gradients and introduce one policy gradient algorithm: REINFORCE.
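As a preview of the policy-based idea: instead of learning values, REINFORCE adjusts policy parameters in the direction of the return-weighted gradient of log π(a|θ). The sketch below applies that update to a hypothetical 2-armed bandit (arm 0 pays 0, arm 1 pays 1) with a softmax policy; it is a hand-rolled illustration, not a library implementation:

```python
import math
import random

def softmax(theta):
    exps = [math.exp(t) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

theta = [0.0, 0.0]   # policy parameters, one per arm
lr = 0.1             # learning rate

for episode in range(2000):
    probs = softmax(theta)
    # sample an action from the current policy
    action = 0 if random.random() < probs[0] else 1
    G = float(action)              # return: arm 1 pays 1, arm 0 pays 0
    # REINFORCE update: grad of log pi(a|theta) for softmax = onehot(a) - probs
    for a in range(2):
        grad = (1.0 if a == action else 0.0) - probs[a]
        theta[a] += lr * G * grad  # push probability toward rewarded actions

probs = softmax(theta)   # the policy should now strongly prefer arm 1
```

The same scheme extends to full episodes by weighting each action's gradient by the discounted return that followed it.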
Learning Objective: In these lessons, we will discuss some additional topics. First, we will talk about two advanced reinforcement learning algorithms, A2C and PPO, and how to use them in practice. Next, we will discuss how to train more than one agent to play a game.
Challenges:
-
Learning Resources
-
RL Tools
- Stable-Baselines3
The PyTorch version of Stable Baselines, providing reliable implementations of reinforcement learning algorithms.
- RL Baselines3 Zoo
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
- CleanRL
High-quality single-file implementations of deep reinforcement learning algorithms.
- Gymnasium
An API standard for reinforcement learning with a diverse collection of reference environments.
- PettingZoo
An API standard for multi-agent reinforcement learning.
- Optuna
An open source hyperparameter optimization framework to automate hyperparameter search.
