Nikhil Chandak

PhD
Max Planck Institute for Intelligent Systems (MPI-IS)
Training and Evaluation of Language Models for the Open-Ended World

As AI systems move from narrow benchmarks to real-world use, they increasingly need to act in open-ended environments with evolving goals, tools, and humans in the loop. This thesis will develop principles and data-centric methods to build language model agents that can plan, seek the right information, and work over long horizons. On the training side, it will study methods for generating synthetic environments and problems, as well as dense-reward and curriculum-learning strategies that push a model's capabilities on hard exploration problems. On the evaluation side, the work will explore the design of open-ended and long-horizon benchmarks and critically examine scalable evaluation methods such as LLM-as-a-judge. Overall, the thesis aims to better measure realistic use cases of language model agents and to make progress on them.

Track:
Academic Track
PhD Duration:
September 1st, 2024 - August 31st, 2027