
Independent causal mechanisms in machine learning

Julius von Kügelgen (PhD Student)

Machine learning (ML) approaches often operate under the assumption of independent and identically distributed (i.i.d.) random variables, and many of the impressive recent achievements can be phrased as supervised learning problems in such an i.i.d. setting. In practice, however, this i.i.d. assumption is often violated due to changes in the environment, different measurement devices, varying experimental conditions, or sample selection bias. This becomes particularly relevant when we move beyond a single dataset or task and instead aim to fuse the many different data sources available in the age of big data [1].

Causal modelling offers a principled, mathematical way of reasoning about similarities and differences between distributions arising, for example, from the aforementioned violations of the i.i.d. assumption. In particular, it views a system of variables as composed of independent modules, or mechanisms, which remain robust, or invariant, across different conditions even when other parts of the system change [2], as illustrated in the sketch below. This view suggests independent causal mechanisms as the objects to study for learning to perform a variety of tasks under different conditions without forgetting them over time.

In my PhD studies, I explore whether and how switching from the more traditional prediction-based paradigm to learning independent causal mechanisms can be beneficial for non-i.i.d. ML tasks such as transfer, meta-, and continual learning. Another focus is causal representation learning, i.e., learning causal generative models over a small number of meaningful causal variables from high-dimensional observations. I am also interested in using counterfactual reasoning to better understand and interpret ML models (explainable AI), and in learning causal relations from heterogeneous data (causal discovery).

[1] Bareinboim, E., & Pearl, J. (2016). Causal inference and the data-fusion problem. Proceedings of the National Academy of Sciences, 113(27), 7345-7352.

[2] Peters, J., Janzing, D., & Schölkopf, B. (2017). Elements of Causal Inference: Foundations and Learning Algorithms. MIT Press.
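To make the principle of independent causal mechanisms concrete, here is a minimal, self-contained sketch; the toy linear structural causal model, the variable names, and the two simulated "environments" are illustrative assumptions, not taken from any particular publication. In a cause-effect system C → E, regressing in the causal direction recovers the same mechanism in every environment, whereas the anti-causal fit changes whenever the distribution of the cause is shifted.

```python
# Minimal sketch of the ICM principle on a toy linear SCM  C -> E:
#   C := mean + std * N_C        (cause; its distribution differs per environment)
#   E := 2*C + 1 + N_E           (mechanism for E; fixed across environments)
# The regression in the causal direction (E given C) recovers the same
# coefficients in every environment, while the anti-causal fit (C given E)
# changes when the distribution of the cause is shifted.
import numpy as np

rng = np.random.default_rng(0)

def sample_environment(c_mean, c_std, n=10_000):
    """Draw (C, E) from the SCM under a given cause distribution."""
    c = c_mean + c_std * rng.standard_normal(n)
    e = 2.0 * c + 1.0 + rng.standard_normal(n)  # invariant causal mechanism
    return c, e

def fit_line(x, y):
    """Least-squares fit y ~ slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

# Two "environments" = two different (interventional) distributions of the cause C.
for c_mean, c_std in [(0.0, 1.0), (3.0, 0.5)]:
    c, e = sample_environment(c_mean, c_std)
    slope_ce, icpt_ce = fit_line(c, e)  # causal direction: stable (~2.0, ~1.0)
    slope_ec, icpt_ec = fit_line(e, c)  # anti-causal direction: drifts
    print(
        f"cause mean={c_mean}, std={c_std}:  "
        f"E|C slope={slope_ce:.2f}, intercept={icpt_ce:.2f};  "
        f"C|E slope={slope_ec:.2f}, intercept={icpt_ec:.2f}"
    )
```

This invariance of the causal mechanism under shifts in the cause distribution is what makes independent mechanisms attractive building blocks for transfer, meta-, and continual learning.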

Primary Host: Bernhard Schölkopf (Max Planck Institute for Intelligent Systems)
Exchange Host: Adrian Weller (University of Cambridge & The Alan Turing Institute)
PhD Duration: 01 September 2018 - 28 February 2023
Exchange Duration: 01 September 2018 - 31 August 2019