Alek Fröhlich
Conditional independence testing (CIT) is an important problem in statistics and machine learning, with applications ranging from causal discovery to feature selection. Given (possibly sets of) random variables X, Y, and Z, the problem consists in testing whether X and Y are statistically independent given Z. While significant progress has been made in the past decade, particularly with kernel methods, current solutions still suffer from various limitations, such as requiring large sample sizes and/or small conditioning sets, or being unstable with respect to hyperparameter selection.
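To fix ideas, here is a minimal sketch (not part of the project proposal) of a classical CIT baseline: a partial-correlation test, which is valid under linear-Gaussian assumptions. The helper name `partial_corr_test` and the synthetic data are invented for illustration.

```python
import numpy as np
from scipy import stats

def partial_corr_test(x, y, z):
    # Regress Z (plus an intercept) out of both X and Y, then test
    # whether the residuals are correlated. Under linear-Gaussian
    # data, zero partial correlation is equivalent to X ⟂ Y | Z.
    Z = np.column_stack([z, np.ones_like(z)])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return stats.pearsonr(rx, ry)

rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
x = z + 0.3 * rng.normal(size=n)        # X driven by Z
y = 2.0 * z + 0.3 * rng.normal(size=n)  # Y driven by Z only

# Marginally, X and Y are strongly dependent (common cause Z)...
print(stats.pearsonr(x, y).pvalue)
# ...but they are conditionally independent given Z.
print(partial_corr_test(x, y, z).pvalue)
```

The example also shows why CIT is harder than unconditional testing: the marginal test rejects decisively, and only after accounting for Z does the dependence vanish. Tests of this kind break down when the relationships are nonlinear or non-Gaussian, which motivates the kernel and neural approaches discussed here.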
A promising direction involves leveraging the representational power of modern neural networks to learn relevant statistical operators, such as the conditional expectation operator between L2 spaces of observables, through the careful design of loss functions. With these learned objects, the CIT problem can be approached from multiple angles. For instance, one can measure the distance between the conditional densities p(X | Y, Z) and p(X | Z), or discretize Z and perform multiple unconditional tests between X and Y, where each test could be implemented by checking whether the truncated learned operator is zero.
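The discretize-and-test route can be made concrete with a simple (hypothetical) baseline: split Z into quantile bins, run an unconditional test of X vs Y inside each bin, and pool the per-bin p-values with Fisher's method. The function `binned_cit`, the choice of Pearson correlation as the per-bin test, and the synthetic data are all assumptions made for illustration, not the project's method.

```python
import numpy as np
from scipy import stats

def binned_cit(x, y, z, n_bins=5, min_count=10):
    # Discretize Z into quantile bins, test X vs Y unconditionally
    # within each bin, and pool the p-values with Fisher's method.
    edges = np.quantile(z, np.linspace(0.0, 1.0, n_bins + 1))
    bins = np.digitize(z, edges[1:-1])  # bin index in {0, ..., n_bins-1}
    pvals = []
    for b in range(n_bins):
        mask = bins == b
        if mask.sum() >= min_count:
            pvals.append(stats.pearsonr(x[mask], y[mask]).pvalue)
    # Under H0, each p-value is ~Uniform(0,1), so -2*sum(log p) ~ chi2(2k).
    fisher_stat = -2.0 * np.sum(np.log(pvals))
    return stats.chi2.sf(fisher_stat, df=2 * len(pvals))

rng = np.random.default_rng(1)
n = 2000
z = rng.normal(size=n)
x = z + 0.3 * rng.normal(size=n)
y = z + 0.5 * x + 0.3 * rng.normal(size=n)  # Y depends on X given Z

p = binned_cit(x, y, z)
print(p)  # small p-value: conditional dependence is detected
```

A known caveat of this scheme is that Z still varies within each bin, so coarse binning leaks residual dependence under the null and requires sample sizes that grow quickly with the dimension of Z; this is one reason the learned-operator formulations above are attractive.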
This PhD project aims to advance the operator learning framework, which has been extensively studied for transfer operator and infinitesimal generator learning at Prof. Pontil's lab at IIT. The focus is on addressing some of the challenges faced by current conditional independence tests, and on exploring causal discovery in the context of dynamical systems, with broad applications across science and engineering.