Yasir Barlas
Optimal experimental design is the field dedicated to designing maximally informative experiments under limited, and typically expensive, resources. Recent developments in sequential experimental design seek to construct a policy that can efficiently navigate the design space in a way that maximises the expected information gain. Dynamic programming and reinforcement learning naturally arise as methods for obtaining such a policy. Whilst there is work on achieving tractable solutions to experimental design problems, there is significantly less work on obtaining generalisable solutions, which is itself a great challenge in reinforcement learning. In this project, we investigate approaches to enhance out-of-distribution generalisation, and seek to do so by leveraging human-in-the-loop feedback. In particular, we aim to employ inverse reinforcement learning to learn from expert demonstrations, address confounders through invariant learning and other methods, and ultimately develop policies that balance exploration and exploitation while incorporating domain knowledge provided by humans. Our goal is to develop robust and statistically sound methodologies that adapt to diverse real-world scenarios, and to improve the decision-making abilities of agents tasked with optimally designing informative experiments.
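As an illustrative sketch only (not part of the project itself), the expected information gain of a candidate design is commonly estimated with a nested Monte Carlo estimator. The toy conjugate Gaussian model below, with prior theta ~ N(0, 1) and likelihood y | theta ~ N(design * theta, sigma^2), is our own assumption chosen because the estimate can be checked against the closed form 0.5 * log(1 + design^2 / sigma^2); all function and parameter names are hypothetical.

```python
import numpy as np

def nmc_eig(design, n_outer=2000, n_inner=2000, sigma=1.0, seed=0):
    """Nested Monte Carlo estimate of expected information gain for the
    toy model theta ~ N(0, 1), y | theta ~ N(design * theta, sigma^2)."""
    rng = np.random.default_rng(seed)
    # Outer samples: draw (theta, y) pairs from the prior predictive.
    theta = rng.standard_normal(n_outer)
    y = design * theta + sigma * rng.standard_normal(n_outer)
    # Log-likelihood of each y under the theta that generated it.
    log_lik = (-0.5 * np.log(2 * np.pi * sigma**2)
               - (y - design * theta) ** 2 / (2 * sigma**2))
    # Inner samples: fresh prior draws to approximate the marginal log p(y | design).
    theta_in = rng.standard_normal(n_inner)
    diff = y[:, None] - design * theta_in[None, :]
    log_inner = (-0.5 * np.log(2 * np.pi * sigma**2)
                 - diff ** 2 / (2 * sigma**2))
    log_marg = np.logaddexp.reduce(log_inner, axis=1) - np.log(n_inner)
    # EIG(design) ~ mean of log p(y | theta, design) - log p(y | design).
    return float(np.mean(log_lik - log_marg))

# Closed form for this conjugate model: 0.5 * log(1 + design^2 / sigma^2),
# so larger designs are more informative here.
for d in (0.5, 1.0, 2.0):
    print(d, nmc_eig(d), 0.5 * np.log(1 + d**2))
```

In a sequential setting, a policy of the kind described above would score candidate designs by such an estimate (or an amortised surrogate of it) at each step, rather than recomputing it exhaustively.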