Jan Kompatscher
Deploying ML models in safety-critical settings requires robust policies. Reinforcement learning (RL) can discover such policies, but often demands millions of training steps, making training in the real world infeasible. Supervised learning, in contrast, is constrained by the availability and quality of human-generated data, which may be suboptimal. A practical alternative is to design realistic virtual environments, termed Environment Modeling (EM) in this abstract, that allow RL agents to explore and test different policies. This raises key questions: which aspects of the real world must be modeled, and which can be abstracted away? My first work addresses these questions in the domain of anesthesiology through a manually designed EM process. My subsequent work investigates general principles for EM and compares different strategies, including supervised learning from data, models based on differential equations, and models generated by large language models (LLMs).
In parallel, my work with my co-supervisor explores how EM can be used to improve Explainable AI (XAI). XAI is necessary for control, trust, and accountability in safety-critical settings. However, different XAI methods can yield different explanations for the same model output, raising the question of which explanation is correct. EM offers a principled approach to this problem. While XAI typically relates inputs to outputs, EM enables counterfactual reasoning: explanations can be tested empirically by perturbing inputs and observing how the outputs change. We apply EM for systematic evaluation of XAI methods. To this end, we design RL environments in which the input features relevant to a given policy are known by construction. These environments are collected into an "XAI Arena," a benchmark suite for testing and comparing explanation methods. This work aims to provide a standardized evaluation framework for XAI and to support more effective human-AI interaction through clearer insights into model behavior.
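The counterfactual test described above can be sketched in a few lines. The toy environment, function names, and scoring below are illustrative assumptions, not the actual XAI Arena implementation: the environment's output depends only on feature 0 by construction, so a correct explanation method should assign zero importance to every other feature.

```python
import numpy as np

# Toy "environment" whose output depends only on feature 0 by construction,
# mirroring an XAI Arena environment with known relevant features.
# All names here are illustrative assumptions, not from the actual work.
def env_output(x):
    return 2.0 * x[0]  # features 1..n-1 are irrelevant by design

def counterfactual_importance(x, feature, delta=1.0):
    """Empirically test a feature's relevance: perturb it and
    measure the resulting change in the environment's output."""
    x_cf = x.copy()
    x_cf[feature] += delta
    return abs(env_output(x_cf) - env_output(x))

x = np.array([1.0, 3.0, -2.0])
scores = [counterfactual_importance(x, i) for i in range(len(x))]
print(scores)  # only feature 0 has a nonzero effect
```

An attribution method that assigns importance to features 1 or 2 on this environment would be empirically falsified, which is the evaluation principle the benchmark generalizes.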