Omar Swelam
Across language and vision, learned foundation representations have turned static models into general-purpose interfaces: LLMs support zero- and few-shot prompting, tool use, and retrieval-augmented agents, while vision backbones enable open-vocabulary recognition, text-image alignment, and visual question answering. This PhD project aims to bring that shift to tabular data by learning representations that genuinely "understand" tables and can be repurposed across domains and tasks. Research on tabular foundation models is accelerating, yet most work remains task-bound, typically classification or regression, leaving open the challenge of general-purpose, reusable table representations. The project seeks tabular embeddings that capture not only schema, column semantics, units, and constraints, but also the underlying data-generating processes, distributional and dependency structures, uncertainty, and other statistical properties. At its core are Tabular Foundation Models (TFMs) whose embeddings align with causal generative factors, yielding identifiable, disentangled abstractions that support robust reasoning, transfer, and out-of-distribution generalization. In turn, in-context learning in TFMs will serve as a mechanism both for per-dataset specialization, allowing prompts and function definitions that respect domain conventions while remaining globally reusable, and for injecting domain expertise into table reasoning. When aligned with LLMs, these tabular representations will let agents query, retrieve, and plan over tables with faithful attributions rather than brittle pattern matching. A flagship direction is causal discovery from TFM representations, exposing structural relations that enable counterfactual analysis, intervention planning, and fairness-aware auditing.
This also enables mechanistic analyses that uncover failure modes in TFMs, turning opaque features into legible concepts that agents and practitioners can inspect. The anticipated outcome is interpretable, auditable TFMs that act as reliable, controllable modules within multimodal agents operating on tabular data. Ultimately, the project seeks to transform TFMs from black boxes into causal, transparent, and repurposable components that people can understand and safely use.
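To make the causal-discovery direction concrete, the sketch below illustrates the simplest version of the idea on toy data: recovering conditional-independence (dependency) structure among variables from samples via partial correlations. This is a classical stand-in, not the project's method; the variable names and the chain-shaped generative process are invented for illustration, and real TFM representations would replace the raw features here.

```python
import numpy as np

def partial_correlations(X):
    """Estimate pairwise partial correlations from samples X of shape (n, d)
    via the inverse covariance (precision) matrix -- a classical way to read
    off undirected conditional-dependency structure under Gaussian assumptions."""
    prec = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)   # normalize precision into partial correlations
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

rng = np.random.default_rng(0)
n = 5000
# Toy linear chain of "generative factors": z1 -> z2 -> z3 (hypothetical example)
z1 = rng.normal(size=n)
z2 = 0.8 * z1 + 0.2 * rng.normal(size=n)
z3 = 0.8 * z2 + 0.2 * rng.normal(size=n)
X = np.stack([z1, z2, z3], axis=1)

P = partial_correlations(X)
# In a chain, z1 and z3 are conditionally independent given z2, so their
# partial correlation should be near zero, while adjacent pairs stay strong.
print(abs(P[0, 2]) < 0.1)  # → True
```

The point of the toy is the contrast: marginal correlation between z1 and z3 is high (they are connected through z2), but the partial correlation vanishes, exposing the chain structure. The project's aim is analogous structure recovery from learned TFM embeddings rather than raw columns.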