Jan Frederik Meier

PhD
University of Göttingen
Video-based scene understanding using deep learning

This project investigates video-based scene understanding, a domain that, despite rapid progress in computer vision, still lags behind image-based scene understanding in maturity and performance. The project explores two complementary research directions. The first is to design new video understanding systems that leverage foundation models, and to critically assess the capabilities and limitations of existing large multimodal models when applied to video-based tasks such as tracking, segmentation, action understanding, and behavior analysis. The second examines whether it is beneficial to lift monocular 2D videos into 3D representations using foundation models and then solve downstream tasks in this higher-dimensional space, where geometric cues may facilitate more robust reasoning. Through systematic evaluation of generalization, adaptability, and failure modes, the project aims to provide fundamental insights into how foundation models can be effectively adapted or extended for video analysis. Finally, the project includes an application-oriented component in animal conservation, demonstrating how advances in video-based scene understanding can support ecological monitoring, wildlife behavior analysis, and biodiversity research, ultimately bridging methodological innovation with real-world impact.

Track: Academic Track
PhD Duration: June 1st, 2025 - May 30th, 2028