Christoph Landolt
This PhD project investigates the development of agentic and self-improving systems for cybersecurity. The research examines the dual-use nature of recent advances in machine learning, particularly Large Language Models (LLMs), which are increasingly capable of generating secure, high-quality code and identifying vulnerabilities. While these capabilities can enhance defensive cybersecurity, they also pose risks, as the same techniques could be leveraged by malicious actors to automate attacks or exploit flaws. To study these risks systematically, this project benchmarks offensive AI capabilities through Capture the Flag (CTF) exercises, which provide a controlled environment for analyzing AI-driven adversarial behavior.
Building on these insights, the project aims to design autonomous agents that are robust, adaptive, and resilient in adversarial environments. By integrating generative AI, multi-agent systems, and game-theoretic approaches, the research develops frameworks that enable agents to anticipate, withstand, and counter cyber threats. Generative models are employed both to simulate adversarial strategies and to derive effective defensive measures, while the multi-agent perspective enables the modeling of cooperative and competitive interactions that reflect real-world cyber conflicts.
Ultimately, this research seeks to advance AI safety in cybersecurity by developing methods that strengthen the resilience of intelligent systems against evolving threats, providing both theoretical contributions and practical frameworks for secure autonomous operation in adversarial domains.