Imen Mahdi
PhD
Polytechnic Institute of Paris (Télécom Paris)
Long-Horizon Planning and Reasoning for Intelligent Robotic Systems

Vision-Language-Action (VLA) models have been proposed to enable robots to execute a variety of tasks by integrating perception, language understanding, and action planning. Large VLAs trained on a diverse set of tasks have been shown to generalize well to new tasks and environments, achieving high precision and efficiency. However, the tasks they handle, such as opening a drawer, grasping an object, or reaching a target, are typically simple and do not require long-term planning. Long-term goals can be decomposed into sub-tasks that are more manageable and easier to solve. For example, grabbing a soda from the fridge can be decomposed into reaching the fridge, opening the door, and grabbing the soda. Such a goal can be addressed as a sequence of primitive skills executed in a hierarchical manner: given a language instruction and a visual input, the robot plans a sequence of sub-tasks that lead to the final goal. This PhD aims to explore methodologies for hierarchical planning and reasoning that enable robots to perform complex tasks while leveraging recent advances in VLAs.

Track:
Academic Track
PhD Duration:
October 1st, 2024 - September 30th, 2028
First Exchange:
September 1st, 2025 - March 31st, 2026