Relaxed gradient estimators for structured probabilistic models
Max Paulus (Ph.D. Student)
Gradient computation is the methodological backbone of deep learning, but computing gradients can be challenging, particularly for some structured probabilistic models. Such models are of interest for several reasons, including improved interpretability, the ability to incorporate problem-specific constraints, and better generalization. Relaxed gradient estimators are a method for learning such models by optimizing a relaxed (surrogate) objective function during training. They trade bias for reduced variance, are easy to implement, and often work well in practice. In my research, I develop relaxed gradient estimators and demonstrate their use for learning structured probabilistic models.
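To illustrate the general idea of a relaxed gradient estimator, the sketch below uses the well-known Gumbel-Softmax (Concrete) relaxation of a categorical variable: hard sampling is replaced by a temperature-controlled softmax so that gradients of a surrogate objective can flow through the sample. This is a minimal, assumed example of the technique in general, not the specific estimators developed in this research; the toy objective, names, and temperature value are illustrative choices.

```python
import torch

def gumbel_softmax_sample(logits: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Draw a relaxed one-hot sample from a categorical distribution.

    Hard categorical sampling (argmax over logits plus Gumbel noise) is not
    differentiable; the softmax with temperature `tau` is a biased but
    low-variance, differentiable surrogate.
    """
    u = torch.rand_like(logits).clamp_min(1e-20)        # avoid log(0)
    gumbel_noise = -torch.log(-torch.log(u))            # standard Gumbel noise
    return torch.softmax((logits + gumbel_noise) / tau, dim=-1)

# Toy training step: learn logits so that the relaxed sample matches a target.
logits = torch.zeros(3, requires_grad=True)
target = torch.tensor([0.0, 1.0, 0.0])
optimizer = torch.optim.SGD([logits], lr=0.1)

for _ in range(100):
    optimizer.zero_grad()
    sample = gumbel_softmax_sample(logits)   # differentiable surrogate sample
    loss = ((sample - target) ** 2).sum()    # relaxed (surrogate) objective
    loss.backward()                          # gradients flow through the relaxation
    optimizer.step()
```

Lowering the temperature `tau` makes the relaxed sample closer to a discrete one-hot vector (less bias) at the cost of higher-variance gradients, which is the bias-variance trade-off mentioned above.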
Primary Host: Andreas Krause (ETH Zürich)
Exchange Host: Chris J. Maddison (University of Toronto & DeepMind)
PhD Duration: 01 April 2018 - Ongoing
Exchange Duration: 01 June 2019 - 31 December 2019