Reinforcement Learning for Education

Skander Moalla (Ph.D. Student)

Sequential student-teacher interactions are a central component of education (ED). Effective interactions significantly accelerate learning. However, they cannot be scaled massively when they require a human teacher in the loop, and they have been hard to provide when the teaching is done exclusively by a computer program. This difficulty comes from the need for the program to accurately model a student’s knowledge and personality in order to provide relevant feedback, and to consider the long-term impact of that feedback on the student’s performance. Recently, reinforcement learning (RL) has been identified as a promising framework to address these needs, as it models sequential decision-making under uncertainty and provides methods that can learn from interactions. It has been used in multiple ways to improve student-teacher interactions: first, from the teacher’s perspective, as a way to learn instructional policies; and second, from the student’s perspective, in an inverse reinforcement learning fashion, to learn computational models of students and use them to diagnose learning difficulties and to train the data-hungry RL teachers. However, these methods still have to overcome several challenges characteristic of the ED setting, such as the lack of simulation-based environments, the limited observability of the environment’s state (i.e., the student’s knowledge), and significantly delayed and noisy progress measures. In this project, we aim to develop novel RL-based methods that learn student models exhibiting the capabilities of human students, such as few-shot learning and deductive reasoning. In addition, we hope to leverage those computational models of students to build simulation-based RL environments for educational tasks, use them to boost research on instructional policy learning, and form challenges that attract not only researchers in the field of ED but also the broader community of RL researchers.

Primary Host: Tanja Käser (EPFL)
Exchange Host: Adish Singla (Max Planck Institute for Software Systems)
PhD Duration: 20 September 2022 - 31 August 2026
Exchange Duration: 01 September 2024 - 01 March 2025 - Ongoing