Towards Bayesian Neural Model Selection

Alexander Immer (Ph.D. Student)

Model selection and comparison, the task of choosing the optimal model for a given data set, is a central problem in machine learning. It is particularly challenging for deep neural networks because of their ability to represent arbitrary functions and their enormous size and overparameterization. For these and other reasons, neural model selection is almost exclusively done with held-out validation data, likely also because the field has focused on computer vision and natural language processing, where data is available in abundance. Bayesian model selection, an alternative approach that trades off data fit against model complexity using only the training data, is promising for neural networks: because it requires no validation data, it can enable deep learning in new settings and for new applications. However, while effective for small and simple probabilistic models, there are as yet no scalable and effective methods for Bayesian model selection in neural networks. The first goal of this thesis is therefore to bring scalable Bayesian model selection to complex neural networks and thereby enable Bayesian neural model selection. Subsequently, we aim to demonstrate further benefits of the Bayesian approach: selecting invariances, which is possible thanks to the complexity penalty, and model selection for unsupervised or online learning, both settings in which validation data is unavailable. We further plan to apply the resulting methods to challenging biomedical applications, in particular to unsupervised models for single-cell and omics data.
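To make the fit-versus-complexity trade-off concrete, here is a minimal sketch in a setting where the Bayesian criterion, the log marginal likelihood (evidence), is available in closed form: Bayesian linear regression used to choose a polynomial degree on training data alone. This is an illustrative toy example, not the thesis method; the data, and the hyperparameters `alpha` and `beta`, are hypothetical choices.

```python
# Minimal sketch: evidence-based model selection for Bayesian linear
# regression (toy example; data and hyperparameters are hypothetical).
import numpy as np

def log_marginal_likelihood(Phi, y, alpha=1.0, beta=25.0):
    """log p(y | Phi) for y = Phi w + noise, with prior w ~ N(0, alpha^{-1} I)
    and noise ~ N(0, beta^{-1}). Marginalizing out w gives
    y ~ N(0, alpha^{-1} Phi Phi^T + beta^{-1} I)."""
    n = len(y)
    K = Phi @ Phi.T / alpha + np.eye(n) / beta  # marginal covariance of y
    _, logdet = np.linalg.slogdet(K)            # complexity (Occam) term
    quad = y @ np.linalg.solve(K, y)            # data-fit term
    return -0.5 * (quad + logdet + n * np.log(2 * np.pi))

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
y = np.sin(2 * x) + 0.2 * rng.standard_normal(30)  # noisy toy data

# Compare candidate models (polynomial degrees) on training data only.
for degree in range(1, 8):
    Phi = np.vander(x, degree + 1, increasing=True)  # polynomial features
    print(degree, log_marginal_likelihood(Phi, y))
```

Higher degrees fit the training data better, but the log-determinant term penalizes the added flexibility, so the evidence peaks at a moderate degree without any held-out data. For neural networks this quantity has no closed form, which is precisely why scalable approximations are needed.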

Primary Host: Gunnar Rätsch (ETH Zürich)
Exchange Host: Bernhard Schölkopf (ELLIS Institute Tübingen & Max Planck Institute for Intelligent Systems)
PhD Duration: 01 July 2020 - 31 July 2024
Exchange Duration: 01 February 2022 - 31 January 2023