Debiasing Multilingual Representations by Discovering Shared Bias Components in Multiple Languages
Shun Shao (Ph.D. Student)
Modern NLP models trained on vast amounts of data are extremely powerful. However, they also tend to associate particular genders, races, or religions with prediction labels, leading to unfair and biased predictions. Multilingual debiasing is a promising research direction that aims to learn a debiasing model from multiple languages and apply it to debias an unseen language. Unfortunately, existing methods have failed to deliver satisfying results, in large part because they do not actually learn the biases from multiple languages. Recent studies have shown that shared bias components may exist across multiple languages. I therefore propose to discover those shared bias components, which can then be used to debias multilingual representations. Specifically, I will explore iterative spectral attribute removal and a multi-way decomposition debiasing method to mitigate bias, and I will extend both methods to zero-shot multilingual concept removal. Furthermore, I plan to collect a new multilingual multi-bias dataset from BBC News to facilitate the development of multilingual debiasing algorithms for different concepts.
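To make the planned approach more concrete, below is a minimal sketch of the core idea behind spectral attribute removal: the directions of the representation space most correlated with a protected attribute are found via an SVD of a cross-covariance matrix and then projected out, repeating the step iteratively. The function name, hyperparameters (`n_components`, `n_iterations`), and the NumPy-based setup are illustrative assumptions for exposition, not the proposal's actual implementation.

```python
# A minimal sketch of iterative spectral attribute removal, assuming
# mean-centred representations X (n x d) and one-hot protected-attribute
# labels Z (n x k). All names and settings here are illustrative.
import numpy as np

def spectral_attribute_removal(X, Z, n_components=2, n_iterations=5):
    """Iteratively project out the directions of X most correlated
    with the protected attribute encoded in Z."""
    X = X - X.mean(axis=0, keepdims=True)
    Z = Z - Z.mean(axis=0, keepdims=True)
    for _ in range(n_iterations):
        # Cross-covariance between representations and the protected attribute.
        omega = X.T @ Z                      # shape: (d, k)
        # Left singular vectors give candidate bias directions in representation space.
        U, _, _ = np.linalg.svd(omega, full_matrices=False)
        U_k = U[:, :n_components]            # top bias directions
        # Remove those directions with an orthogonal projection.
        X = X - (X @ U_k) @ U_k.T
    return X

# Toy usage: 1000 random "embeddings" of dimension 32, binary protected attribute.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32))
Z = np.eye(2)[rng.integers(0, 2, size=1000)]
X_debiased = spectral_attribute_removal(X, Z)
```

In the multilingual setting envisioned here, the same kind of projection would target bias components shared across languages rather than components estimated from a single language.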
Primary Host: | Anna Korhonen (University of Cambridge) |
Exchange Host: | Shay Cohen (University of Edinburgh) |
PhD Duration: | 01 October 2023 - 30 September 2027 |
Exchange Duration: | Ongoing |