
Dynamically Evolving Deep Architectures

Rupert Mitchell (Ph.D. Student)

My research focuses on modifying deep architectures live during training. In contrast to the relatively well-explored area of purely removing parts of an architecture (i.e. pruning), I am particularly interested in combining this with the addition of parts as well, for example increasing the width or depth of a neural network. I believe that progress in this area will allow researchers to focus on defining the fundamental type of architecture they wish to employ, without worrying about as many hyper-parameters. I also hope that this will enable greater creativity in architecture design, since it is no longer required that training start from a random initialisation at full size while simultaneously providing enough capacity in every area where it may eventually be needed. I am also excited by the possibility of extending explainability and interpretability methods to cover not just parameter values, but also architecture choices. For example, I would like to be able to ask "why is this neuron/layer even here in the first place?" and expect a reasonable answer, even before I then ask "what does it in fact do in the final network?"
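To make the idea of adding parts during training concrete, here is a minimal sketch (my illustration, not code from this project) of one way to grow the hidden width of a two-layer MLP without changing its function, in the spirit of Net2Net-style widening: new hidden units are copies of existing ones, and the outgoing weights of each duplicated unit are split between the original and its copies. It assumes PyTorch; the function name `widen_hidden` and the layer sizes are hypothetical.

```python
import torch
import torch.nn as nn


def widen_hidden(fc1: nn.Linear, fc2: nn.Linear, extra: int):
    """Return widened copies of (fc1, fc2) with `extra` additional hidden units."""
    old_width = fc1.out_features
    idx = torch.randint(0, old_width, (extra,))          # hidden units to duplicate

    new_fc1 = nn.Linear(fc1.in_features, old_width + extra)
    new_fc2 = nn.Linear(old_width + extra, fc2.out_features)

    with torch.no_grad():
        # Copy the old incoming weights, then append duplicated rows for the new units.
        new_fc1.weight[:old_width] = fc1.weight
        new_fc1.bias[:old_width] = fc1.bias
        new_fc1.weight[old_width:] = fc1.weight[idx]
        new_fc1.bias[old_width:] = fc1.bias[idx]

        # Count how many copies of each original unit now exist (original + duplicates),
        # and divide its outgoing weights among them so the layer's output is unchanged.
        counts = torch.ones(old_width)
        counts.scatter_add_(0, idx, torch.ones(extra))
        new_fc2.weight[:, :old_width] = fc2.weight / counts
        new_fc2.weight[:, old_width:] = fc2.weight[:, idx] / counts[idx]
        new_fc2.bias.copy_(fc2.bias)

    return new_fc1, new_fc2


# Usage: grow a small MLP from 16 to 24 hidden units mid-training.
fc1, fc2 = nn.Linear(8, 16), nn.Linear(16, 4)
x = torch.randn(5, 8)
y_before = fc2(torch.relu(fc1(x)))
fc1, fc2 = widen_hidden(fc1, fc2, extra=8)
y_after = fc2(torch.relu(fc1(x)))
print(torch.allclose(y_before, y_after, atol=1e-6))      # True: function preserved
```

Because the widened network computes the same function as before, training can simply continue from the new, larger parameterisation; how and when to trigger such growth is exactly the kind of question this research direction asks.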

Primary Host: Kristian Kersting (Technical University of Darmstadt)
Exchange Host: Ole Winther (University of Copenhagen & Technical University of Denmark)
PhD Duration: 08 September 2021 - 07 September 2024
Exchange Duration: Ongoing