Renhao Pei
The recent success of large language models (LLMs) has marked an exciting era in the fields of both Artificial Intelligence (AI) and Natural Language Processing (NLP). However, the impressive abilities of LLMs remain largely limited to a handful of high-resource languages with abundant data, whereas low-resource languages with scarce data pose many methodological challenges. The central goal of my doctoral research is therefore to advance the multilingual abilities of LLMs, especially for low-resource languages. To this end, I plan to pursue three main research directions: (1) bridging the gap between high- and low-resource languages through in-context machine translation (MT) and reinforcement learning (RL), (2) constructing multilingual datasets and benchmarks for low-resource languages from existing linguistic resources, and (3) generating synthetic data for low-resource languages to use in multilingual model pre-training.