Beyond the Thesis with Zhijing Jin
Interview Series with ELLIS PhD Alumni

Zhijing Jin is an incoming Assistant Professor at the University of Toronto, a CIFAR AI Chair, and a faculty member at the Vector Institute, where she works on LLMs, multi-agent AI, and responsible AI for social good, with a focus on AI safety and scientific discovery. She earned her PhD under the co-supervision of Bernhard Schölkopf (Max Planck Institute for Intelligent Systems) and Mrinmaya Sachan (ETH Zurich), combining NLP and causal inference to develop methods that make large language models more robust, trustworthy, and capable of reasoning about cause and effect.
Name: Zhijing Jin
Affiliation: University of Toronto, Max Planck Institute for Intelligent Systems, Tübingen
Role: Incoming Assistant Professor in NLP at the University of Toronto, CIFAR AI Chair, and faculty member at the Vector Institute. Currently a postdoc at the Max Planck Institute for Intelligent Systems, Tübingen.
Areas of research: Natural Language Processing (NLP), Large Language Models (LLMs), Causal Inference, Multi-Agent LLMs, Responsible AI for Social Good
Further info: https://zhijing-jin.com
PhD Primary Advisor: Bernhard Schölkopf (ELLIS co-founder and ELLIS Fellow), Secondary Advisor: Mrinmaya Sachan (ELLIS Fellow)
ELLIS Experience
Can you briefly describe your PhD research and how the ELLIS network or its resources supported it?
I'm very grateful for the support and incredible research experience I received during my PhD. Studying in Europe has been one of the best experiences of my life. My work focused on natural language processing (NLP), the field that builds computational models of language—ChatGPT being a well-known example. Together with my supervisor, Bernhard Schölkopf, I explored how causal reasoning can strengthen NLP models. My goal was to develop methods that help language models better understand cause-and-effect, making AI more robust and reliable for real-world use.
Through ELLIS, I also connected with my co-supervisor, Mrinmaya Sachan at ETH, who guided the NLP side of my work. Together, we combined causality and NLP to shape a forward-looking research agenda. With the support of Bernhard and Mrinmaya, I had a truly resourceful and fulfilling PhD experience across Germany and Switzerland. I’m deeply appreciative of this journey.
MPI and Bernhard fully supported my curiosity-driven research, trusting me to develop my own direction even when it was not yet widely recognized.
What was the most valuable aspect of being part of the ELLIS community during your PhD?
To me, the most valuable aspect of being part of the ELLIS community during my PhD was the joint supervision. It was a major enabler of cross-topic research, which is crucial for advancing knowledge and building a foundation for meaningful research. Without ELLIS connecting me with my two exceptional co-supervisors, I wouldn't have been able to develop my research agenda at the intersection of the two fields.
Additionally, the exchange opportunities within the ELLIS network expanded my professional connections, and the training activities enriched my skills and knowledge. The program significantly contributed to my personal and professional growth as an academic.
Can you share some key experiences you gained through the program?
The unique academic training promoted by ELLIS in Europe emphasizes math, physics, and philosophy. This opened a new world for me, as my prior experience was in communities and universities that prioritize computer science. My time in Europe equipped me with a broader view of foundational knowledge, culminating in my attendance at the 2024 Lindau Nobel Laureate Meeting, where I represented the Max Planck Institute for Intelligent Systems and met nearly 40 physics laureates. All of these experiences will inspire me for the rest of my academic and personal life.
How did the international exposure and collaborations within ELLIS contribute to your professional development and network?
The international exposure and collaborations within ELLIS motivated me to learn from many researchers' diverse perspectives, enabling me to look at research problems through various lenses. In research, it is incredibly important to appreciate the breadth of the research ecosystem, and with today's technology, researchers around the world can actively collaborate toward the common goal of improving society. As a result, I hope to give back to the community that nurtured me. I look forward to collaborating with the talented faculty and students at MPI-IS to expand the latest large language model (LLM) research. I will continue my work on building more trustworthy LLMs with causality, and I look forward to co-supervising students within the ELLIS-CIFAR network, bringing together state-of-the-art research and talented students from both Europe and Canada.
Current Role & Career Path
Can you briefly describe your career trajectory since graduating from the ELLIS PhD program?
I accepted the assistant professor position at the University of Toronto at the age of 26, making me one of the youngest professors at the university. I am also a faculty member at the Vector Institute, an ELLIS Advisor, and a CIFAR AI Chair. I pursued a career in academia because I have always aspired to be a professor, passing knowledge and inspiration on to future generations.
I have continued my research at the intersection of causality and NLP. To push this work further, we recently developed a Causal AI Scientist agent, driven by LLMs, that can work through research questions on provided data much as a human causal scientist would. We hope our AI agent will assist future scientific discovery across subjects, from the natural sciences to the social sciences.
Impact of Your Work in AI
Do you have any recent or relevant publications you would like to highlight?
I received Best Paper Awards at the NeurIPS 2024 Workshop on Pluralistic Alignment for our work “Language Model Alignment in Multilingual Trolley Problems”, and at the NeurIPS 2024 Workshop on Causality and Language Models (CaLM) for our work “Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias”.
Other papers I can highlight include:
“Causal AI Scientist: Facilitating Causal Data Science with Large Language Models”: We developed the Causal AI Scientist (CAIS), an end-to-end causal estimation tool that automatically selects appropriate causal inference methods, implements them, and validates the results (a sketch of one such estimation step appears after this list).
“Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models”: We built the first systematic evaluation of whether LLMs can identify each other and show why understanding this capability is crucial for AI alignment, safety, and efficient multi-agent cooperation.
“Democratic or Authoritarian? Probing a New Dimension of Political Biases in Large Language Models”: We studied the political bias of LLMs, finding that they generally favor democratic values but exhibit favorability toward authoritarian figures when prompted in Mandarin. We proposed a novel methodology to assess such sociopolitical alignment on the democratic-authoritarian spectrum, combining (1) the F-scale, a psychometric tool for measuring authoritarian tendencies, (2) FavScore, a newly introduced metric for evaluating model favorability toward world leaders, and (3) role-model probing to assess which figures are cited as general role models by LLMs.
“Accidental Misalignment: Fine-tuning Language Models Induces Unexpected Vulnerability”: We conducted a comprehensive study of the effects of domain-specific fine-tuning on adversarial robustness, identifying causal links between dataset-specific factors, such as lexical diversity and response length, and LLM safety.
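To make the CAIS item above concrete, here is a minimal sketch of the kind of estimate-and-validate step such a tool automates, written with the open-source DoWhy library. This is not the CAIS implementation; the dataset and variable names are hypothetical, and the true effect is planted in the synthetic data so the output can be checked.

```python
import numpy as np
import pandas as pd
from dowhy import CausalModel

# Hypothetical data: a confounder z drives both treatment assignment and outcome.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)
treatment = (z + rng.normal(size=n) > 0).astype(int)
outcome = 2.0 * treatment + 1.5 * z + rng.normal(size=n)  # true effect = 2.0
df = pd.DataFrame({"treatment": treatment, "outcome": outcome, "z": z})

# State the causal assumptions: z is a common cause of treatment and outcome.
model = CausalModel(
    data=df,
    treatment="treatment",
    outcome="outcome",
    common_causes=["z"],
)

# Identify which estimand the assumptions license (here: backdoor adjustment on z).
estimand = model.identify_effect(proceed_when_unidentifiable=True)

# Estimate the effect; a naive difference in means would be biased by z.
estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")
print("Estimated ATE:", estimate.value)  # should be close to 2.0

# Stress-test the estimate with a placebo-style refutation.
refutation = model.refute_estimate(estimand, estimate, method_name="random_common_cause")
print(refutation)
```

The refutation step reflects the "validates the results" part of the pipeline: an automated causal scientist should not just output a number, but also check that the number is stable under perturbations of its own assumptions.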
We are also actively working on a line of projects on democracy defense. One example of this effort is our paper “Revealing Hidden Mechanisms of Cross-Country Content Moderation with Natural Language Processing”, where we show how large language models can reverse-engineer content moderation decisions across five countries over a decade, revealing distinct national censorship patterns. By combining Shapley value analyses with LLM-generated explanations, we uncover the key factors behind these moderation rules, and human evaluations confirm the interpretability of our findings.
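As a rough illustration of the Shapley-value side of that kind of analysis (not the paper's actual pipeline), the sketch below fits a proxy classifier to hypothetical moderation decisions and ranks features by mean absolute SHAP attribution. All feature names and the hidden "rule" are invented placeholders.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical moderation data; features and labels are illustrative only.
rng = np.random.default_rng(0)
n = 2000
X = pd.DataFrame({
    "mentions_politics": rng.integers(0, 2, size=n),
    "toxicity_score": rng.random(size=n),
    "post_length": rng.integers(10, 500, size=n),
})
# A hidden moderation rule that the classifier will learn to imitate.
y = ((X["toxicity_score"] > 0.7) | (X["mentions_politics"] == 1)).astype(int)

# Fit a proxy model of the moderation decisions.
clf = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes exact Shapley values for tree ensembles;
# for a binary GBM they are attributions on the log-odds scale.
explainer = shap.TreeExplainer(clf)
shap_values = explainer.shap_values(X)  # shape: (n_samples, n_features)

# Rank features by mean absolute attribution (global importance).
importance = np.abs(shap_values).mean(axis=0)
for name, score in sorted(zip(X.columns, importance), key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")
```

In this toy setup, `toxicity_score` and `mentions_politics` should dominate the ranking, mirroring how such attributions can surface which factors a moderation regime actually responds to.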
In what ways does your work help drive progress in AI?
My research areas include Large Language Models (LLMs), Natural Language Processing (NLP), and Causal Inference. I am particularly interested in understanding how causal inference can improve people's daily lives by enhancing our ability to reason about evidence. This capability can help us distinguish between truth and falsehood in news articles, enable scientists to draw accurate conclusions from data and existing theories, and assist policymakers in choosing interventions that positively impact society.
Additionally, my research involves multi-agent LLM simulations. I have developed a series of works in which multiple LLM agents form a simulated society, role-playing and interacting with each other. I am particularly excited about gaining insights into AI behavioral tendencies through multi-agent simulations (see our blogpost introducing our research series GovSim, SanctSim and MoralSim), and I hope to extend our findings to benefit human society.
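For readers curious what such a simulation looks like mechanically, here is a minimal, self-contained sketch of a shared-resource game loosely inspired by the GovSim setting. The `query_llm` function is a hypothetical placeholder for whichever chat-model API one uses, and all prompts, rules, and numbers are illustrative rather than taken from the papers.

```python
def query_llm(prompt: str) -> str:
    # Placeholder policy so the sketch runs end-to-end; swap in a real
    # chat-completion call to study actual model behavior.
    return "10"

def parse_harvest(reply: str, maximum: int) -> int:
    """Extract an integer harvest amount from the model's free-text reply."""
    digits = "".join(ch for ch in reply if ch.isdigit())
    return min(int(digits or "0"), maximum)

def run_simulation(agents: list[str], pool: float = 100.0, rounds: int = 10) -> None:
    for t in range(rounds):
        cap = max(int(pool // len(agents)), 0)
        for name in agents:
            prompt = (
                f"You are {name}, one of {len(agents)} fishers sharing a lake "
                f"with {pool:.0f} fish. Fish you catch are your income, but "
                f"only the fish left in the lake regrow. How many fish "
                f"(0 to {cap}) do you take this round? Answer with a number."
            )
            pool -= parse_harvest(query_llm(prompt), cap)
        pool = min(pool * 1.5, 100.0)  # remaining stock regrows, up to capacity
        print(f"round {t + 1}: pool = {pool:.1f}")

run_simulation(["Alice", "Bob", "Charlie"])
```

The interesting behavioral questions arise once real models sit behind `query_llm`: whether agents over-harvest and collapse the shared pool, and whether communication or sanctions change that, which is the kind of tendency these simulation series are designed to measure.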
What are you most passionate about in your current role?
I’m excited to build a team of enthusiastic students who keep pushing the research frontier of AI for science and AI safety. My deepest drives are a constant curiosity about science and a sense of responsibility to make our society a better place. On the social-impact side, we have also run a yearly Workshop on NLP for Positive Impact since ACL 2020, gathering a community of committed researchers who care about social good and want to connect technology to positive social impact. Our joint efforts can be seen in our community white paper “NLP for Social Good: A Survey of Challenges, Opportunities and Responsible Deployment”.
What are the most exciting or promising areas of AI research that you see on the horizon?
Scientifically, we can anticipate a major leap as AI accelerates progress across research domains. Societally, we must prepare for a new era in which AI becomes deeply integrated into everyday life and shapes public discourse. This shift presents both significant opportunities and notable challenges, so it is essential to proactively address and mitigate potential risks while embracing the transformative benefits of AI with openness and thoughtful preparation.