LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks

2025

aclanthology.org

Anna Bavaresco, Raffaella Bernardi, Leonardo Bertolazzi, Desmond Elliott, Raquel Fernández, Albert Gatt, Esam Ghaleb, Mario Giulianelli, Michael Hanna, Alexander Koller, Andre Martins, Philipp Mondorf, Vera Neplenbroek, Sandro Pezzelle, Barbara Plank, David Schlangen, Alessandro Suglia, Aditya K Surikuchi, Ece Takmaz, Alberto Testoni