AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Document Understanding

2025

openreview.net

Ahmed Masry, Juan A. Rodriguez, Tianyu Zhang, Suyuchen Wang, Chao Wang, Aarash Feizi, Akshay Kalkunte Suresh, Abhay Puri, Xiangru Jian, Pierre-Andre Noel, Sathwik Tejaswi Madhusudhan, Marco Pedersoli, Bang Liu, Nicolas Chapados, Yoshua Bengio, Enamul Hoque, Christopher Pal, Issam H. Laradji, David Vazquez, Perouz Taslakian, Spandana Gella, Sai Rajeswar