Fundamentals of Reinforcement Learning

Summer semester
5 Credit Points
Lecturers: Prof. Dr.-Ing. Anja Klein; Dr. rer. nat. Sabrina Klos; Dr.-Ing. Andrea Ortiz
Language: English
Lecture times and locations are available in TUCaN.
Components: Lecture, Exercise (theory and programming)
The lecture slides and exercises are available in Moodle.

Review of Probability Theory
Markov Property and Markov Decision Processes
The Multi-Armed Bandit Problem vs. the Full Reinforcement Learning Problem
Taxonomy of Multi-Armed Bandit Problems (e.g., Stochastic vs. Adversarial Rewards, Contextual MAB)
Algorithms for Multi-Armed Bandit Problems (e.g., Upper Confidence Bound (UCB), Epsilon-Greedy, Softmax, LinUCB) and their Application to Cyber-Physical Networking
Fundamentals of Dynamic Programming and Bellman Equations
Taxonomy of Approaches for the Full Reinforcement Learning Problem (e.g., Temporal-Difference Learning, Policy Gradient and Actor-Critic)
Algorithms for the Full Reinforcement Learning Problem (e.g., Q-Learning, SARSA, Policy Gradient, Actor-Critic) and their Application to Cyber-Physical Networking
Linear Function Approximation
Non-linear Function Approximation