José Maria Pombal

PhD
University of Lisbon
Towards Fine-Grained and Unbiased Automatic Evaluation of Multilingual Large Language Models

Despite the widespread adoption of Large Language Models (LLMs), their automatic evaluation still depends mostly on shallow numerical scores. Such metrics are no longer suited to fulfil the crucial role that automatic evaluation plays in improving models and in providing insights into their limitations, especially in multilingual settings. This thesis aims to bridge this gap by developing automatic, fine-grained, and unbiased metrics for multilingual language models. The proposed evaluation techniques will align closely with human judgments, provide multi-dimensional feedback, enable customisation, prioritise efficiency, and address bias. The work developed in this thesis aims to make significant contributions to the open-source community working on language model evaluation, and to establish Unbabel as a leader in this space. We anticipate that the project will be another successful partnership between Unbabel and the Portuguese scientific community, yielding significant academic and commercial impact.

Track:
Industry Track