ToFu: Visual Tokens Reduction via Fusion for Multi-modal, Multi-patch, Multi-image Task

2025

Vittorio Pippi, Matthieu Guillaumin, Silvia Cascianelli, Rita Cucchiara, Maximilian Jaritz, Loris Bazzani

