Fundamentals of Reinforcement Learning

General information

  • Summer semester
  • 4 Credit Points
  • Lecturers: Dr. rer. nat. Sabrina Klos; Dr.-Ing. Andrea Ortiz
  • Language: English
  • Lecture times and locations are available in TUCaN.
  • Components: Lecture, exercise (theory + programming)
  • The lecture slides and exercises are available in Moodle.

Content

  • Review of Probability Theory
  • Markov Property and Markov Decision Processes
  • The Multi-Armed Bandit Problem vs. the Full Reinforcement Learning Problem
  • Taxonomy of Multi-Armed Bandit Problems (e.g., Stochastic vs. Adversarial Rewards, Contextual MAB)
  • Algorithms for Multi-Armed Bandit Problems (e.g., Upper Confidence Bound (UCB), Epsilon-Greedy, SoftMax, LinUCB) and their Application to Cyber-Physical Networking
  • Fundamentals of Dynamic Programming and Bellman Equations
  • Taxonomy of Approaches for the Full Reinforcement Learning Problem (e.g., Temporal-Difference Learning, Policy Gradient, and Actor-Critic)
  • Algorithms for the Full Reinforcement Learning Problem (e.g., Q-Learning, SARSA, Policy Gradient, Actor-Critic) and their Application to Cyber-Physical Networking
  • Linear Function Approximation
  • Non-linear Function Approximation