The Integration of Visual Context in Human and Artificial Language Use

Anna Bavaresco (Ph.D. Student)

Anna's PhD project sits at the intersection of Natural Language Processing, Computer Vision, and Cognitive Science. The project investigates language use grounded in the visual context, i.e., the visual information that people describe, interact with, or refer to when using natural language. A core aspect of the project is evaluating whether features of human language use can be traced in the outputs of, and/or the computations carried out within, large Vision-and-Language Models (VLMs). When possible, cognitive and neuroscientific methods will be leveraged to gain further insight into which aspects of human processing of visuo-linguistic stimuli are reproduced by VLMs. More concretely, some key aspects that will be investigated in this PhD project are:

1. Evaluating VLMs' ability to integrate information from the language and visual modalities and use it to solve complex tasks.
2. Comparing how multimodal integration is implemented in VLMs with how it happens in the human brain.
3. Investigating VLMs' ability to form abstract representations and update them as the context provides new information.

Primary Host: Raquel Fernández (University of Amsterdam)
Exchange Host: Marie-Francine Moens (KU Leuven)
PhD Duration: 01 January 2023 - 01 January 2027
Exchange Duration: Ongoing