Federica Spinola

PhD
Inria Paris Centre
Scalable data strategies for robust robot learning

This PhD research addresses the data challenge in robotic policy learning. It focuses on two research directions: (1) understanding the structure and utility of existing robotic datasets, and (2) scaling robot learning via human demonstration videos.

The first line of research is a comprehensive analysis of robotics datasets to identify which samples contribute most to policy performance and which are redundant. It uses techniques such as influence functions and diversity metrics to support data curation strategies that improve learning efficiency. Beyond per-sample importance, this work will investigate which types of data features (semantic labels, 2D vs. 3D structure, or specific modalities like touch, language, or speech) most effectively support generalizable policy learning. In turn, this analysis aims to guide the construction of more comprehensive datasets for robot learning, spanning different embodiments and skill domains.

The second line of work aims to unlock large-scale learning from human task demonstrations. Rather than relying on manual data collection or simulation, this research will develop methods to extract structured, task-relevant information (end-effector state changes, contact dynamics, or long-term intent) directly from real-world videos of humans performing tasks. Key challenges such as embodiment mismatch, contact ambiguity, and occlusions will be addressed through morphology-aware embeddings and policy distillation strategies that adapt human-derived policies to robotic embodiments via reinforcement learning or embodiment-agnostic models. The research will further investigate how to use examples of human error to teach robots what not to do, and how to extract rich physical attributes from video using motion cues and learned dynamics priors.

Together, these efforts seek to lay the groundwork for data-efficient, scalable learning pipelines, paving the way for generalist robot policies. The research aims to combine insights from dataset analysis with supervisory signals from human behavior, bridging the gap between data availability and robotic capability.

Track: Academic Track
PhD Duration: October 1st, 2025 - September 30th, 2028