Towards Efficient and Reliable Deep Learning Systems using Geometric methods

Santanu Rathod (Ph.D. Student)

This PhD project explores Geometric Deep Learning with the ultimate goal of developing more efficient and reliable deep learning systems. Its outcomes could be applied to a range of domains, from biotechnology and drug discovery to program induction and graph/network-based problems. Deep learning has achieved remarkable breakthroughs in tasks such as image recognition, language synthesis, and image segmentation, owing to factors such as abundant digital training data and powerful computing resources. However, a critical factor in the success of neural architectures is their capability as function approximators that incorporate inductive biases from known heuristics. Geometric Deep Learning (GDL) aims to establish a principled approach for developing these inductive biases by analyzing domain characteristics and recognizing underlying symmetries. More recently, these principles have proved helpful in Neural Algorithmic Reasoning (NAR), which seeks to create neural representations of common algorithms, such as sorting and dynamic programming. The aim is to overcome the algorithmic bottleneck that arises when raw data, for example noisy real-time traffic data, must first be compressed into scalar values (e.g., edge weights in a path-finding problem). GDL principles are also being applied to drug-discovery problems such as rigid-body protein docking and learning embeddings for protein graphs. Lately, models built on GDL principles have also been used as noise-prediction models in diffusion-based drug-discovery tasks such as inverse protein folding, molecular linker design, and structure-based drug design. The work done during the PhD will draw on recent advances in GDL to study GNN-based (Graph Neural Network) and similar architectures for challenges in drug discovery, program induction, and NAR.
The research will primarily involve analyzing domain symmetries and structures to guide the development of GNN-based architectures tailored to drug discovery problems. It will also explore how recent NAR advances can be combined with progress in program synthesis, such as AlphaCode or LLMs like GPT-4, with a particular focus on ensuring that the predicted (synthesized) code is syntactically and semantically correct. Furthermore, the project will investigate the practicality of neural versions of classical algorithms in real-world scenarios, especially those involving graphs and networks, such as traffic congestion problems, with an emphasis on improving robustness. The insights gained from these investigations will also be applied to challenges in biotechnology and drug discovery, in particular protein docking with flexible-body transformations, where GDL-inspired equivariant/invariant architectures have shown promising results.
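
As a small illustration of the kind of symmetry GDL builds into an architecture, the sketch below checks that a sum-aggregation message-passing step is permutation equivariant: relabeling the nodes of a graph and then applying the layer gives the same result as applying the layer and then relabeling. This is only a toy example under stated assumptions, not an architecture from the project; the function names, the scalar node features, and the fixed weights `w_self` and `w_neigh` are all hypothetical choices made for illustration.

```python
def message_passing_layer(features, adjacency, w_self=0.5, w_neigh=0.25):
    """One sum-aggregation step on scalar node features (illustrative weights)."""
    n = len(features)
    out = []
    for i in range(n):
        # Sum the features of node i's neighbours; the sum ignores node order.
        neigh_sum = sum(adjacency[i][j] * features[j] for j in range(n))
        out.append(w_self * features[i] + w_neigh * neigh_sum)
    return out


def permute_graph(features, adjacency, perm):
    """Relabel nodes so that new node k is old node perm[k]."""
    n = len(features)
    new_features = [features[perm[k]] for k in range(n)]
    new_adjacency = [
        [adjacency[perm[a]][perm[b]] for b in range(n)] for a in range(n)
    ]
    return new_features, new_adjacency


def is_permutation_equivariant(features, adjacency, perm, tol=1e-9):
    """Check layer(permute(graph)) == permute(layer(graph)) up to tol."""
    out = message_passing_layer(features, adjacency)
    permuted_out = [out[perm[k]] for k in range(len(out))]
    p_features, p_adjacency = permute_graph(features, adjacency, perm)
    out_of_permuted = message_passing_layer(p_features, p_adjacency)
    return all(abs(a - b) <= tol for a, b in zip(permuted_out, out_of_permuted))


# Hypothetical toy input: a path graph 0 - 1 - 2 - 3 with scalar features.
features = [1.0, 2.0, 3.0, 4.0]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
perm = [2, 0, 3, 1]

equivariant = is_permutation_equivariant(features, adjacency, perm)
```

Because the neighbour aggregation is a sum, the layer cannot depend on how the nodes happen to be numbered. Summing the layer's output over all nodes would likewise give a permutation-invariant graph-level readout.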

Primary Host: Xiao Zhang (CISPA Helmholtz Center for Information Security)
Exchange Host: Pietro Liò (University of Cambridge)
PhD Duration: 01 October 2023 - 30 September 2027
Exchange Duration: 01 January 2025 - 30 April 2025; 01 January 2026 - 31 August 2026