Shun Shao
PhD
University of Cambridge
Debiasing Multilingual Representations by Discovering Shared Bias Components in Multiple Languages

Modern NLP models trained on very large datasets are extremely powerful. However, they also tend to associate attributes such as gender, race, or religion with the prediction label, and consequently make unfair and biased predictions. Multilingual debiasing is a promising research direction that aims to learn a debiasing model from multiple languages and then use it to debias an unseen language. Unfortunately, existing methods have failed to deliver satisfactory results. One main reason is that they do not actually learn the biases from multiple languages. Recent studies have shown that shared bias components might exist across multiple languages. I therefore propose to discover those shared bias components, which can then be used to debias multilingual representations. Specifically, I will explore iterative spectral attribute removal and a multi-way decomposition debiasing method to mitigate the bias. I will also extend these two methods to zero-shot multilingual concept removal. Furthermore, I plan to collect a new multilingual, multi-bias dataset from BBC News to facilitate the development of multilingual debiasing algorithms for different concepts.

Track:
Academic Track
PhD Duration:
October 1st, 2023 - September 30th, 2027