Vera Milovanovic
The PhD project aims to investigate how different parametrizations of deep neural models and their training methods influence the computations these models can, in theory, perform, their adaptiveness, and the inductive biases they impose. It seeks to characterize when and how these choices should be aligned with data statistics, including the available data scale and underlying structure. A key aspect is translating these insights into building principled computational models of unknown processes from available data, particularly in computational biology.

In more detail, towards these objectives, I plan to first explore some aspects of parametrizations of sequence-to-sequence mappings, focusing on characterizing signal propagation, expressiveness, and training dynamics as a function of architectural and optimization choices. Within this line of research, I aim to develop more efficient sequence modeling blocks capable of capturing patterns along very long sequential inputs. Beyond long-range sequence modeling, I aim to investigate how signal propagation and trainability manifest within the frameworks of iterative computation and adaptive methods. Instead of treating networks as static mappings, I aim to study models where computation or parameter updates depend on intermediate states and/or inputs, potentially enabling forms of reasoning and algorithm-like behavior. In this context, parametrization and optimization will be analyzed as determinants of the stability and convergence of these iterative mechanisms, as well as of their capacity to represent learning procedures within the model itself.

Finally, I intend to apply this line of research, focused on fundamental questions in deep learning, to developing computational methods for learning complex structures and mechanisms in molecular biology and biomedicine. I am particularly interested in learning long-range dependencies and structures, for example in studying the genome, especially its non-coding regions. This task poses a significant challenge for architectural design and optimization, since it requires efficient sequence modeling blocks capable of compressing and maintaining relevant information in internal representations over long horizons. Furthermore, in addition to the genome's unique structure and hierarchies, the amount of available data is typically orders of magnitude smaller than in conventional applications such as natural language processing. This task therefore serves as a valuable test bed for understanding and benchmarking different architectural and training choices. Moreover, this line of research connects to learning other cellular mechanisms, such as gene regulatory networks and single-cell differentiation. By grounding model design in a principled understanding of fundamental aspects of parametrization, optimization, and signal propagation, I hope to create more efficient and goal-aligned models and pave the way to more rapid adoption of such methods in other fields of science.