Oliver Hahn
Unsupervised Visual Scene Understanding aims to develop methods for interpreting the dynamic 3D world from images and videos without any form of human supervision.
The goal is to achieve semantic understanding, object detection, and geometric reasoning, all of which are key components for autonomous driving and robotics.
While supervised learning has significantly advanced scene understanding, it relies heavily on extensive pixel-level annotations, which are costly, time-consuming, and prone to bias.
To overcome these limitations, we leverage recent advances in representation learning, motion analysis, and depth estimation to develop scalable, annotation-free approaches.
Ultimately, we aim to push the boundaries of unsupervised scene understanding, enabling machines to perceive and interpret complex real-world environments autonomously.