Exploring Inductive Biases in Reinforcement Learning Through Action Spaces

Jan Schneider (Ph.D. Student)

When applying reinforcement learning to robotics, there are many possible choices of action representation. Our experiments demonstrate that this choice generally has a significant impact on learning performance. In this project, we conduct an in-depth analysis of the causes of these effects for policy gradient algorithms. In particular, we study the effects on the optimization landscape and on the variance of the gradient estimator using visualization and analysis techniques. Our experiments reveal significant structure in the learning process and identify effects that impact optimization performance. We will use the insights gained from these analyses to find optimized action representations that improve learning performance. Furthermore, we will investigate to what extent our findings generalize across similar tasks and to learning on a real robotic platform.

Primary Host: Bernhard Schölkopf (ELLIS Institute Tübingen & Max Planck Institute for Intelligent Systems)
Exchange Host: Marc Deisenroth (University College London)
PhD Duration: 01 July 2022 - 30 June 2026
Exchange Duration: 01 April 2025 - 30 September 2025 - Ongoing