Mathematical Foundations for Deep Learning

Zhenyu Zhu (Ph.D. Student)

Despite the extraordinary success of deep learning, its emerging weaknesses in robustness, generalization, and bias demand ever closer attention. Unfortunately, many existing theories were developed for low-capacity models and therefore do not account for the impressive scaling properties of deep learning. For example, enforcing low bias in computer vision tasks using tools developed for linear models has been shown to simply degrade the performance of neural networks. In contrast, our work already shows how dedicated theories can shed light on how architectural choices impact robustness and generalization. In this PhD project, we will first investigate how common practices theoretically affect the robustness, generalization, and bias of neural networks. Second, we will propose new, well-grounded approaches that improve over existing methods. The overall goal is to improve the reliability of neural networks, from their theoretical foundations and optimization to their behaviour under principled forms of distribution shift, such as covariate and label shift.

Primary Advisor: Volkan Cevher (EPFL)
Industry Advisor: Francesco Locatello (IST Austria)
PhD Duration: 01 September 2022 - 31 August 2026