Reinforcement Learning Specialization
Master the Concepts of Reinforcement Learning. Implement a complete RL solution and understand how to apply AI tools to solve real-world problems.
Instructors: Adam White +1 more
Specialization - 4 course series
Understand how to formalize your task as an RL problem, and how to begin implementing a solution.
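As a purely illustrative example of what such a formalization looks like, the sketch below frames a tiny corridor task in terms of states, actions, and rewards; the GridWorld class, its dynamics, and the random policy are assumptions made for this page, not course code.

```python
# Illustrative sketch only: formalizing a tiny corridor task as an RL problem
# (states, actions, rewards). GridWorld and its dynamics are hypothetical.
import random

class GridWorld:
    """States 0..4 along a corridor; reaching state 4 ends the episode with reward +1."""
    ACTIONS = (-1, +1)  # move left or move right

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Clip to the corridor, then check for the goal state.
        self.state = min(max(self.state + action, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done

# A uniformly random policy interacting with the environment for one episode.
env = GridWorld()
state, done, undiscounted_return = env.reset(), False, 0.0
while not done:
    state, reward, done = env.step(random.choice(GridWorld.ACTIONS))
    undiscounted_return += reward
print("episode return:", undiscounted_return)
```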
By the end of this course you will be able to:
- Understand Temporal-Difference (TD) learning and Monte Carlo as two strategies for estimating value functions from sampled experience
- Understand the importance of exploration when using sampled experience rather than dynamic programming sweeps within a model
- Understand the connections between Monte Carlo, Dynamic Programming, and TD
- Implement and apply the TD algorithm for estimating value functions
- Implement and apply Expected Sarsa and Q-learning, two TD methods for control (a minimal Q-learning sketch follows this list)
- Understand the difference between on-policy and off-policy control
- Understand planning with simulated experience (as opposed to classic planning strategies)
- Implement a model-based approach to RL, called Dyna, which uses simulated experience
- Conduct an empirical study to see the improvements in sample efficiency when using Dyna
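For context on the control methods named above, here is a minimal, purely illustrative sketch of tabular Q-learning with epsilon-greedy exploration on a toy corridor task; the environment, names, and hyperparameters are assumptions for this page, not course materials.

```python
# Illustrative sketch only: tabular Q-learning on a toy corridor MDP.
import random
from collections import defaultdict

N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)           # move left / move right

def step(state, action):
    """Deterministic corridor dynamics: reward +1 only on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def q_learning(episodes=200, alpha=0.1, gamma=0.99, epsilon=0.1):
    q = defaultdict(float)                     # (state, action) -> value estimate
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy behaviour policy over the current estimates.
            if random.random() < epsilon:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda a_: q[(s, a_)])
            s2, r, done = step(s, a)
            # Off-policy TD target: bootstrap from the greedy action in s2.
            target = r if done else r + gamma * max(q[(s2, a_)] for a_ in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

q = q_learning()
print(round(q[(GOAL - 1, +1)], 2))   # estimate for stepping right onto the goal (~1.0)
```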
Prerequisites: This course strongly builds on the fundamentals of Courses 1 and 2, and learners should have completed these before starting this course. Learners should also be comfortable with probabilities and expectations, basic linear algebra, basic calculus, Python 3 (at least one year of experience), and implementing algorithms from pseudocode.
By the end of this course, you will be able to:
- Understand how to use supervised learning approaches to approximate value functions
- Understand objectives for prediction (value estimation) under function approximation
- Implement TD with function approximation (state aggregation) on an environment with an infinite (continuous) state space; a minimal sketch follows this list
- Understand fixed basis and neural network approaches to feature construction
- Implement TD with neural network function approximation in a continuous state environment
- Understand new difficulties in exploration when moving to function approximation
- Contrast discounted problem formulations for control versus an average reward problem formulation
- Implement Expected Sarsa and Q-learning with function approximation on a continuous state control task
- Understand objectives for directly estimating policies (policy gradient objectives)
- Implement a policy gradient method (called Actor-Critic) on a discrete state environment
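To make prediction under function approximation concrete, the sketch below applies semi-gradient TD(0) with state aggregation to a toy continuous-state random walk; the environment, number of groups, and step size are assumptions made for this page rather than course materials.

```python
# Illustrative sketch only: semi-gradient TD(0) with state aggregation on a
# continuous-state random walk with terminal rewards -1 (left) and +1 (right).
import random

N_GROUPS = 10                      # aggregate the interval [0, 1) into 10 groups
w = [0.0] * N_GROUPS               # one weight (value estimate) per group
alpha, gamma = 0.05, 1.0

def group(s):
    """Feature mapping: a continuous state in [0, 1) goes to its aggregation group."""
    return min(int(s * N_GROUPS), N_GROUPS - 1)

for _ in range(2000):              # episodes under a fixed (random-drift) policy
    s, done = 0.5, False           # start in the middle of the interval
    while not done:
        s2 = s + random.uniform(-0.1, 0.1)
        if s2 <= 0.0:
            r, done = -1.0, True   # terminate on the left
        elif s2 >= 1.0:
            r, done = +1.0, True   # terminate on the right
        else:
            r = 0.0
        # Semi-gradient TD(0): only the active group's weight is updated.
        target = r if done else r + gamma * w[group(s2)]
        w[group(s)] += alpha * (target - w[group(s)])
        s = s2

print([round(v, 2) for v in w])    # approximate values, increasing from left to right
```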
To be successful in this course, you will need to have completed Courses 1, 2, and 3 of this Specialization, or the equivalent. By the end of this course, you will be able to complete an RL solution to a problem, starting from problem formulation, through appropriate algorithm selection and implementation, to an empirical study of the solution's effectiveness.
Fundamentals of Reinforcement Learning
Sample-based Learning Methods
Prediction and Control with Function Approximation
A Complete Reinforcement Learning System (Capstone)