Reinforcement Learning Through the Lens of Optimization
Adrian Müller (Ph.D. Student)
Reinforcement learning offers a solution to learning problems that require planning and has led to several breakthroughs in recent years. However, many of these breakthroughs were achieved in controlled setups. In such setups, it is common that (a) a theoretical understanding of the algorithms is not required and (b) only the final trained policy matters, not the performance during learning. This PhD project aims to develop reinforcement learning algorithms that come with the desired mathematical guarantees. Crucially, these algorithms should also provably scale to large Markov decision processes. The key idea is to study these learning problems through the lens of online optimization theory.
Primary Host: Volkan Cevher (EPFL)
Exchange Host: Gergely Neu (Universitat Pompeu Fabra)
PhD Duration: 01 September 2023 - 01 September 2027
Exchange Duration: 01 September 2025 - 01 March 2026 - Ongoing