Bridging Theory and Algorithms for Lifelong Learning
Marco Ciccone (PostDoc)
One of the most ambitious goals of Artificial Intelligence is to provide an agent with the ability to learn continually in an open-ended scenario. Lifelong Learning, or Continual Learning (CL), explicitly targets non-stationary or changing environments, where an agent must solve a series of partially related problems by integrating knowledge from a stream of data. This learning paradigm introduces several fundamental challenges that generally do not arise in a single-task, batch-learning setting and requires balancing competing objectives. CL methods should keep learning effectively as new tasks are observed while minimizing catastrophic forgetting and interference across tasks (the stability-plasticity dilemma). At the same time, learning a new task should improve related tasks, both past (backward transfer) and future (forward transfer), in terms of efficiency and performance; a standard way of quantifying these notions is sketched below.

Moreover, CL imposes constraints on the scalability of these approaches. While models should not necessarily grow in capacity with each new task, given an arbitrarily long sequence of tasks it is impossible to maintain perfect recall in a fixed-capacity model. This dilemma motivates methods for fast adaptation to novel tasks or domain shifts, and fast recovery strategies, which allow forgetting only if previous performance levels on past tasks can be recovered with a minimal amount of new or retained experience.

In the past few years, Continual Learning has steadily gained the interest of the Machine Learning community, resulting in significant advances in the field; however, most research remains empirical, and the theoretical understanding of many aspects of Continual Learning is still relatively unexplored. Although partially successful, existing CL algorithms are mainly guided by intuition or heuristics, and their theoretical underpinnings are often disregarded. This lack of theoretically grounded work on Continual Learning motivates the adoption of a new, principled perspective and opens exciting research opportunities. For instance, a principled analysis can help identify the causes underlying phenomena such as catastrophic forgetting and provide insights for the development of novel solutions with desirable guarantees on a broad class of problems.

The primary focus of this research endeavour is to establish a theoretical framework for foundational research in Continual Learning and to define a clear protocol for the evaluation of CL algorithms. We propose to analyze Continual Learning through the lens of Statistical Learning Theory and to draw connections with related learning paradigms such as Online Learning (OL) and Meta-Learning. In particular, we build upon the consolidated Online Learning literature and augment its multitask formulation with the typical constraints of the Continual Learning setting, verifying that this formulation can express the desiderata of CL and guide the development of novel methods with provable guarantees; a minimal regret-based sketch of this formulation is given below.
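As a concrete example of how transfer could be measured in such an evaluation protocol, consider the widely used metrics of Lopez-Paz and Ranzato (2017); they are offered here as an illustration, not necessarily as the metrics this project will adopt. Let $R_{i,j}$ denote the test performance on task $j$ after training on the first $i$ tasks of a sequence of length $T$, and let $\bar{b}_j$ be the performance of a randomly initialized model on task $j$:

\[
\mathrm{BWT} \;=\; \frac{1}{T-1}\sum_{i=1}^{T-1}\bigl(R_{T,i}-R_{i,i}\bigr),
\qquad
\mathrm{FWT} \;=\; \frac{1}{T-1}\sum_{j=2}^{T}\bigl(R_{j-1,j}-\bar{b}_j\bigr).
\]

Negative BWT captures catastrophic forgetting, while positive FWT indicates that knowledge from earlier tasks helps the model before it has seen any data from a new one.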
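To make the Online Learning connection concrete, here is a minimal sketch, under standard assumptions, of the regret quantities such a multitask formulation would build on; the notation is illustrative and not the project's final framework. In classical OL, a learner plays $w_t$ against losses $\ell_t$ over $T$ rounds; in a multitask stream of $K$ tasks, each observed for $n$ rounds with losses $\ell_{k,t}$, regret can be measured within each task against per-task comparators:

\[
\mathcal{R}_T \;=\; \sum_{t=1}^{T}\ell_t(w_t)\;-\;\min_{w\in\mathcal{W}}\sum_{t=1}^{T}\ell_t(w),
\qquad
\mathcal{R}_{K,n} \;=\; \sum_{k=1}^{K}\Bigl(\sum_{t=1}^{n}\ell_{k,t}(w_{k,t})\;-\;\min_{w_k\in\mathcal{W}}\sum_{t=1}^{n}\ell_{k,t}(w_k)\Bigr).
\]

Under this reading, the typical CL constraints (bounded memory, no revisiting of past streams, fixed model capacity) become restrictions on the strategies available to the online learner, and desiderata such as forward transfer correspond to regret bounds that improve as more related tasks are observed.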
Primary Host: Barbara Caputo (Politecnico di Torino & Italian Institute of Technology)
Exchange Host: Carlo Ciliberto (University College London)
PostDoc Duration: 01 June 2021 - 31 May 2023
Exchange Duration: 01 February 2022 - 01 August 2022 (ongoing)