Beyond the Thesis with Zifeng Ding
Interview Series with ELLIS PhD Alumni

Zifeng Ding is a Research Associate at the University of Cambridge working on NLP and graph representation learning. He completed his PhD as an ELLIS student under ELLIS Fellows Volker Tresp and Michael Bronstein, focusing on temporal knowledge graphs and inductive learning. He now also works at Eigent AI on Project Loong, which uses multi-agent systems to generate high-quality synthetic data.
Name: Zifeng Ding
Affiliation: University of Cambridge
Role: Research Associate
Area of research: Natural Language Processing, Graph Representation Learning
Further info: zifengding.github.io
PhD Primary Advisor: Volker Tresp (ELLIS Fellow), Secondary Advisor: Michael Bronstein (ELLIS Fellow)
ELLIS Experience
Can you briefly describe your PhD research and how the ELLIS network or its resources supported it?
During my PhD, my research focused on inductive representation learning and natural language question answering over temporal knowledge graphs, an area that was relatively underexplored at the time. Under the guidance of my primary supervisor, Prof. Volker Tresp, I had the freedom to pursue new directions, which led to the publication of several high-quality papers in this field. After being nominated as an ELLIS PhD student, I had the opportunity to visit Prof. Michael Bronstein’s group at the University of Oxford. There, I connected with many brilliant researchers and formed strong collaborations, including work recently published in Transactions on Machine Learning Research (TMLR). Since starting my postdoc at the University of Cambridge, I have remained actively engaged in these collaborations, with regular research discussions and social meetings in Cambridge, Oxford, and London.
What was the most valuable aspect of being part of the ELLIS community during your PhD?
One of the most valuable aspects of my experience has been the opportunity to be mentored by Prof. Michael Bronstein. He is also a genuinely cool person, and I wish I could bring his mindset to my own research and daily life. Of course, being part of the ELLIS community also gives me the chance to collaborate with many brilliant and supportive colleagues, and I genuinely enjoy working with them.
I would say ELLIS provides a platform for us to showcase our research achievements. It is also an endorsement of our ability to conduct high-quality research. For example, when I was reaching out to professors across Europe, I found that many were already familiar with ELLIS and regarded it as a highly prestigious institution. One reason for this strong reputation is the consistently impressive track record of ELLIS students publishing high-quality papers at top-tier venues each year. Another factor is the rigorous selection criteria for becoming an ELLIS PhD student. As a nominated ELLIS PhD student myself, I know that being accepted typically requires having a first-author paper at a leading conference, which is a clear indicator of the high caliber of ELLIS students. Moreover, ELLIS actively promotes the academic achievements of its students and provides substantial support to help them succeed.
What's one memorable moment from your ELLIS PhD that you'd like to share?
I still remember the first time I heard about ELLIS. It was during a casual conversation with my senior colleague, Dr. Yunpu Ma, at LMU Munich. He told me, “Zifeng, you know what? Volker has been selected as an ELLIS Fellow.” I was thinking, “Okay, what is ELLIS? Does it really matter?” At the time, I didn’t think much of it. Later on, when I was looking for an internship or exchange opportunity, I discovered that ELLIS supports PhD students in exchange programs. I applied without much expectation, as it felt like a random attempt, but fortunately I was given the opportunity to join the ELLIS community. My experience during my visit to Oxford made me fall in love with doing research in the UK, probably helped by the fact that my German wasn’t good enough to live stress-free in Germany, haha. That experience ultimately led me to my current position as a postdoc at Cambridge. I would say ELLIS has played a major role in shaping who I am today and where I am living now.
How did the international exposure and collaborations within ELLIS contribute to your professional development and network?
When I was searching for a postdoc position, mentioning that I was an ELLIS PhD candidate often implicitly boosted how potential employers perceived me. During my visit to the University of Oxford, I had the opportunity to meet many brilliant collaborators. Since a large part of my PhD journey took place during the COVID-19 pandemic, being able to connect with more researchers and exchange ideas was especially valuable. I am still collaborating with many of the people I met through ELLIS, and I am very grateful for these ongoing relationships.
What didn’t go as you expected during your ELLIS PhD? What could be improved in the program?
I wish I had had the opportunity to pursue an industry internship. The current job market is quite challenging for many excellent PhD students. Most companies are primarily hiring candidates working on specific, trending topics, such as large language models. However, this narrow focus should not define the entirety of what our society values in AI research. Even for ELLIS PhDs, opportunities to be hired as a student researcher are extremely limited due to the intense competition. I hope that ELLIS can build stronger collaborations with a wider range of enterprises and perhaps offer a fast-track program to give ELLIS PhDs a better chance of securing internships.
Current Role & Career Path
Can you briefly describe your career trajectory since graduating from the ELLIS PhD program?
I am currently a Research Associate (Postdoctoral Researcher) at the University of Cambridge, working with Prof. Andreas Vlachos in the Cambridge Natural Language and Information Processing Research Group within the Department of Computer Science and Technology. In addition, I am a visiting research scientist at Eigent AI, the organization behind the popular multi-agent framework CAMEL-AI.
My career currently spans both academia and the startup world. At heart, what excites me most is the freedom to pursue independent research. I’m very fortunate to work with Prof. Andreas Vlachos at the University of Cambridge, who is a supportive mentor — he encourages me to explore my own research directions, while always being available when I need guidance.
At the same time, I decided to join Eigent AI because I believe agentic AI represents the future of large foundation models, and Eigent AI is a pioneer in this space. I also wanted to gain more practical experience and learn how to develop real-world products — something that is often difficult to achieve within academia. Compared to large enterprises, startups are generally more transparent, making it easier for me to observe and understand the logic behind how tech-driven companies operate.
Impact of Your Work in AI
Do you have any recent or relevant publications you would like to highlight?
Since I just completed my PhD at the end of February 2025, I don’t have a large number of publications yet. But there are a few I’m excited about!
First, my paper DyGMamba: Efficiently Modeling Long-Term Temporal Dependency on Continuous-Time Dynamic Graphs with State Space Models was recently accepted to Transactions on Machine Learning Research. This work, done during my visit to the University of Oxford with Prof. Michael Bronstein (link to OpenReview), proposes an effective and efficient method for learning graph representations by capturing long-term historical information using state space models.
Another project I’m proud of is our recent preprint, Supposedly Equivalent Facts That Aren’t? Entity Frequency in Pre-training Induces Asymmetry in LLMs, a collaboration with friends in the UK and Germany (link to arXiv). This work shows that LLM hallucinations are linked to entity frequency biases in pre-training data: facts with high-frequency subjects are recognized more reliably than their inverses. By analyzing the open-source OLMo dataset, we highlight how pre-training data shapes model predictions and offer insights for probing closed models.
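For intuition, here is a minimal sketch (not the paper’s actual code) of how such directional asymmetry can be probed: score the same fact with a causal LM in both directions and compare the model’s confidence. The model name and example fact are illustrative assumptions; the paper’s analysis used OLMo.

```python
# A minimal sketch of probing directional asymmetry: compare the total
# log-probability a causal LM assigns to a fact stated in each direction.
# Assumption: "gpt2" stands in for any open causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_log_prob(text: str) -> float:
    """Sum of the log-probabilities the model assigns to each token of `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)  # predict token t+1 from t
    targets = ids[0, 1:]
    return log_probs.gather(1, targets.unsqueeze(1)).sum().item()

# The same fact stated in both directions; a large, consistent gap across
# many such pairs would suggest the frequency-driven asymmetry.
forward = sequence_log_prob("The capital of France is Paris.")
inverse = sequence_log_prob("Paris is the capital of France.")
print(f"forward: {forward:.2f}  inverse: {inverse:.2f}")
```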
Both of these projects grew out of my visit to Oxford, which was supported by ELLIS — and I’m very grateful for that opportunity.
You also work for a startup. What is your startup focused on, and how does your expertise support it?
At Eigent AI, I’ve been deeply involved in Project Loong, which aims to scale the generation of high-quality synthetic data using AI agents. Our goal is to use our multi-agent framework to create synthetic data across a wide range of domains that have been underexplored in previous LLM reasoning research. We hope to help LLMs learn to reason more effectively beyond the most commonly studied areas, like math.
I believe Project Loong is already making a meaningful impact. Since its launch at the beginning of April, it has gained over 300 stars on GitHub. Loong is an ongoing project in which we are continuously collecting high-quality seed data and optimizing our verifiable data generation pipeline to build a more advanced system for synthetic data generation across various domains. As many in the community have recognized, the high-quality data available on the web has been largely exhausted in recent years. To continue advancing the capabilities of foundation models, we need new, high-quality data, and this is exactly where Project Loong aims to make a significant difference. For a more detailed description of Loong, see the blog post, which I also drafted.
As a visiting research scientist, I serve as a lead member of the Loong project. Drawing on my research experience, I work closely with junior interns and community contributors to design pipelines, develop code, and run experiments. I am also responsible for the graph domain, where I contribute high-quality seed datasets focused on graph theory.
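To illustrate the generate-then-verify idea behind a verifiable data pipeline like the one described above, here is a minimal, self-contained sketch. It is illustrative only, not the Loong codebase: a generator agent would propose a question plus executable solution code, and a verifier runs the code and keeps the sample only if the program’s result matches the claimed answer.

```python
# A minimal sketch of verifiable synthetic-data generation (illustrative
# only): keep a candidate sample only if its solution program, when
# executed, reproduces the claimed answer.
from dataclasses import dataclass

@dataclass
class Candidate:
    question: str
    solution_code: str  # program whose `result` variable should equal `answer`
    answer: str

def verify(c: Candidate) -> bool:
    """Execute the solution program and compare its result to the answer."""
    scope: dict = {}
    try:
        exec(c.solution_code, scope)  # in practice: a sandboxed interpreter
    except Exception:
        return False
    return str(scope.get("result")) == c.answer

# In a real pipeline the candidate would come from an LLM agent; here it
# is hard-coded (a graph-theory example) to keep the sketch self-contained.
candidate = Candidate(
    question="How many edges does a complete graph on 6 nodes have?",
    solution_code="n = 6\nresult = n * (n - 1) // 2",
    answer="15",
)
seed_dataset = [candidate] if verify(candidate) else []
print(f"kept {len(seed_dataset)} verified sample(s)")
```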
In what ways does your work help drive progress in AI?
My PhD research began with a focus on knowledge graphs (KGs) and graph representation learning. I proposed the first methods for inductive representation learning on temporal KGs and was the first to highlight the task of forecasting question answering over temporal KGs, laying important groundwork for the KG community. As my interests gradually shifted toward natural language processing, I became more focused on large foundation models.
My recent work explores how knowledge is stored within LLMs and how external knowledge sources, such as KGs, can be integrated to enhance their reasoning. This research helps us better understand why and when LLMs hallucinate, and how to mitigate it. More recently, I have also devoted myself to agentic AI, which I believe will define the next generation of foundation models. I am particularly interested in enabling LLMs to use tools to solve complex problems more effectively and in generating high-quality synthetic data through multi-agent systems, and I believe my work in these areas can make a strong contribution.
What are you most passionate about in your current role?
I am most passionate about the opportunity to freely pursue the research I believe is important for the community and society. As I drew closer to the end of my PhD, I found myself increasingly drawn to work that is more practical and closely aligned with real-world needs. The rise of large foundation models has created new opportunities to apply research in ways that are more human-centered and impactful.
Thanks to the support of my mentor, Prof. Andreas Vlachos, my current role allows me the freedom to explore ideas I find meaningful and valuable. I am not focused on simply pushing the boundaries of AI/ML for its own sake; instead, I am exploring how we can conduct research that is genuinely useful — research that can benefit everyone.
What are the most exciting or promising areas of AI research that you see on the horizon?
As an NLP researcher, I can only comment from my own field. I believe the next generation of LLMs will become much more powerful through their ability to use a wide range of tools. Many complex problems cannot be easily solved by LLMs relying solely on their internal capabilities. Numerous studies have shown that effectively leveraging tools significantly improves performance on highly complex and challenging benchmarks. In my view, the most efficient way to enhance foundation models in the near future is not by simply scaling up training data and model size, but by teaching models how to use tools correctly. Current LLMs are still not very proficient at tool use, making this an exciting and highly promising area for future research.
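To make the tool-use idea concrete, here is a minimal, self-contained sketch of the loop described above: the model either returns a final answer or requests a tool call, and the runtime executes the tool and feeds the result back. The stubbed model and tool names are illustrative assumptions; a real system would call an actual LLM API with a registry of real tools.

```python
# A minimal sketch of a tool-use loop. The "model" is a hard-coded stub
# standing in for an LLM that can emit either a tool call or an answer.

def calculator(expression: str) -> str:
    """A toy tool: evaluate a basic arithmetic expression."""
    return str(eval(expression, {"__builtins__": {}}))  # sandbox in practice

TOOLS = {"calculator": calculator}

def stub_model(messages: list[dict]) -> dict:
    """Stand-in for an LLM: first requests the calculator, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "calculator", "args": {"expression": "17 * 24"}}
    result = next(m for m in messages if m["role"] == "tool")["content"]
    return {"answer": f"17 * 24 = {result}"}

messages = [{"role": "user", "content": "What is 17 * 24?"}]
while True:
    reply = stub_model(messages)
    if "answer" in reply:               # the model is done
        print(reply["answer"])
        break
    tool_fn = TOOLS[reply["tool"]]      # dispatch the requested tool
    result = tool_fn(**reply["args"])
    messages.append({"role": "tool", "content": result})
```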