Beiduo Chen

On Learning Universal Multilingual Representations

Beiduo Chen (Ph.D. Student)

Endowing machines with the ability to distinguish and understand many languages in any setting has long been a central goal of natural language processing. Because almost all current deep learning approaches are heavily data-driven, sufficient corpora across languages play a key role in multilingual NLP. However, although a growing number of models and corpora are emerging, the scarcity of labeled data in low-resource languages still hampers progress in the field. The primary goal of my PhD project is a preliminary exploration of this data scarcity problem, which consists of designing model architectures, improving algorithms, and collecting resources that facilitate universal representation learning and cross-lingual transfer. The aim is a language-agnostic model that performs well in any setting; in other words, one that learns universal representations across languages. Ultimately, these efforts promise to narrow the current digital divide and to provide vital services to communities that speak endangered, under-documented, and minority languages.

Primary Host: Barbara Plank (LMU Munich & IT University of Copenhagen)
Exchange Host: Anna Korhonen (University of Cambridge)
PhD Duration: 01 October 2023 - Ongoing
Exchange Duration: Ongoing