Generalizable Video Representation Learning
Michael Dorkenwald (Ph.D. Student)
Video data is a treasure trove for AI models, offering a window into the intricate dynamics and mechanisms that shape our world. The key to unlocking the vast amount of video available on the web is to bypass the laborious and expensive process of annotating each video. Yet extracting knowledge from unlabeled videos remains a significant challenge. This research project addresses the problem by developing self-supervised methods that leverage multiple modalities (e.g., audio) to achieve a more comprehensive and causally informed understanding of video data and the world it reflects. The primary objective is to learn robust representations that can be applied to a variety of video tasks, such as video scene understanding and long-term video understanding.
|Primary Advisor:||Cees Snoek (University of Amsterdam)|
|Industry Advisor:||Yuki M. Asano (University of Amsterdam)|
|PhD Duration:||01 June 2022 - 01 June 2027|