Insights from the ELLIS Program Workshop on Machine Learning and Computer Vision

From April 1st to 4th, 2025, a focused ELLIS workshop on Machine Learning and Computer Vision was held in the scenic town of Bad Teinach, Germany. Organized by the ELLIS program of the same name, the event brought together leading researchers, postdocs, and students for several days of in-depth discussion on the state and future of the field. The workshop was coordinated by ELLIS Program Directors Bernt Schiele, Cordelia Schmid, and Yair Weiss, with local organization led by ELLIS Fellow Daniel Cremers.
The workshop emphasized recent progress in Vision-Language Models (VLMs), particularly their growing proficiency in visual understanding and multimodal reasoning. Presentations explored applications such as automated scene description for accessibility, spatial reasoning in robotics, and object-centric representations for open-vocabulary tasks. Broader themes included the integration of neuroscience and machine learning, advances in 3D reconstruction from sparse inputs, and novel methods for semantic segmentation using pre-trained models. Several talks also addressed the use of generative models for structured reasoning and data representation, highlighting the expanding scope and real-world applicability of VLMs across disciplines.
Attendees generally agreed that the workshop provided an excellent overview of current research trends and achievements, reaffirming a growing consensus: while Computer Vision has become deeply integrated with the broader field of Machine Learning, it remains a distinct and vital discipline. Many of its traditional challenges are still being actively researched and are far from solved, even as machine learning techniques play an increasingly central role in addressing them.

(ELLIS Program Director Yair Weiss giving a presentation.)
Moreover, vision continues to serve as a uniquely effective testing ground: its strong ties to human perception and the physical world, combined with its inherent visualizability, make it particularly well-suited for exploring and explaining complex model behavior.
The workshop concluded with a feedback session that drew overwhelmingly positive responses. Senior researchers valued the opportunity for candid, high-quality discussion of their work, while students and postdocs emphasized how inspiring it was to engage closely with leading experts in the field. A guest from the U.S. expressed a wish for similar workshop formats back home, praising the focus and depth of the meeting.


(Attendees of the workshop hiking through the beautiful countryside surrounding Bad Teinach.)
As workshop attendee and ELLIS Fellow Thomas Brox (University of Freiburg) stated:
"These workshops fully replace going to a large conference, while spending less time and carbon emissions. They are much more efficient, much more fun, and you meet effortlessly with the right people, whereas the conferences are swamped with a lot of noisy work that clutters the few relevant ones. Also the student/postdoc presentations are very good, because the PIs select their very best student/postdoc."
Reflecting the spirit of the event, ELLIS PhD student Shrisha Bharadwaj (Max Planck Institute for Intelligent Systems) shared a personal takeaway that resonated with many junior researchers:
"The field has changed rapidly in the past few years due to large Diffusion and Vision-Language models. It is important to be open to change and learn to be adaptable."