Parham Yazdkhasti
This PhD project explores new optimization techniques to make deep learning more efficient, scalable, and accessible. A key focus is distributed and federated learning, in which large models are trained across multiple devices or institutions without centralizing data. The research will also address challenges such as reducing the memory footprint of optimizers, designing methods that scale well in parallel and distributed settings, and adapting algorithms to advanced architectures such as transformers. Beyond large-scale clusters, the project emphasizes making deep learning training feasible on consumer-grade hardware. By combining theory and practice, the work aims to deliver optimization strategies that reduce resource consumption, improve scalability, and broaden access to state-of-the-art machine learning methods.
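
To make the optimizer-memory challenge concrete, the sketch below (an illustration, not the project's method) contrasts Adam, which stores two full-size moment buffers per parameter, with an Adafactor-style factored second moment that keeps only one row vector and one column vector per weight matrix. All sizes, hyperparameter values, and the function name factored_update are illustrative assumptions.

import numpy as np

# Optimizer-state memory for one 4096 x 4096 weight matrix (sizes are
# illustrative). Adam keeps two full-size moment buffers (m and v); an
# Adafactor-style factorization keeps one row and one column accumulator.
rows, cols = 4096, 4096
adam_state = 2 * rows * cols      # one m and one v entry per parameter
factored_state = rows + cols      # row and column second-moment accumulators
print(f"Adam state:     {adam_state:,} floats")
print(f"Factored state: {factored_state:,} floats")
print(f"Reduction:      {adam_state / factored_state:,.0f}x")

# One update step using the factored second moment, reconstructed as a
# rank-1 outer product as in Adafactor; beta2 and eps are illustrative.
def factored_update(W, grad, row_acc, col_acc, lr=1e-3, beta2=0.999, eps=1e-30):
    sq = grad ** 2 + eps
    row_acc = beta2 * row_acc + (1 - beta2) * sq.mean(axis=1)  # per-row mean
    col_acc = beta2 * col_acc + (1 - beta2) * sq.mean(axis=0)  # per-column mean
    v_hat = np.outer(row_acc, col_acc) / row_acc.mean()        # rank-1 estimate of v
    W = W - lr * grad / np.sqrt(v_hat)
    return W, row_acc, col_acc

# Tiny demo on a small matrix.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
r, c = np.zeros(8), np.zeros(8)
W, r, c = factored_update(W, rng.normal(size=(8, 8)), r, c)

At transformer scale, this kind of factorization trades a modest approximation of the second moment for a reduction in optimizer state proportional to the matrix size, which is one route toward training on memory-constrained, consumer-grade hardware.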