If deep learning systems are inherently brittle, does this doom robust value alignment in large language models to inevitable failure through jailbreaking attacks? Or should our concern be tempered, given that current jailbreaking ends may not justify the computational means they require? This project aims to sharpen our understanding of LLM safety in adversarial scenarios. We focus on rigorous and fair evaluation of existing attacks, assessing the true harmful potential of these models. By incorporating insights from the adversaries' perspective, we aim to identify critical vulnerabilities in current LLMs and pave the way for more effective safety measures in the next generation of more capable models.
Track:
Academic Track
Primary Advisor
PhD Duration:
May 1st, 2024 - May 1st, 2028
First Exchange:
October 1st, 2025 - May 1st, 2026