European Lab for Learning & Intelligent Systems

Fast feed-forward video avatars with diffusion prior

Photorealistic human avatars are a rapidly developing area of research. However, to the best of my knowledge, all current methods share two shortcomings: they lack a prior on human appearance, and they take a long time to train. These shortcomings limit the applicability of photorealistic human avatars. Both can be addressed by a method that builds personalized avatars with a diffusion prior and operates in feed-forward mode. In my PhD project, I plan to obtain a single-view avatar from each video frame and then combine these avatars using information from several frames. To do this, I will use a pre-trained model to extract features from each view. I then plan to train a neural network that combines these features into a multi-view avatar, using the diffusion prior as the loss function. My research will reduce the time needed to obtain a personalized animated avatar from a video to several seconds. This will expand the applicability of avatars in telepresence and in the entertainment industry, for example in the metaverse or AR/VR applications. I believe that avatar generation time and the difficulty of capturing the input video are the major barriers to the broad use of this technology.
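As a rough illustration of the feature-combination step described above, the sketch below fuses per-frame feature vectors into a single vector with a softmax-weighted average. It is only a toy under stated assumptions, not the project's method: the function name `fuse_frame_features`, the per-frame `scores`, and the use of weighted averaging are all hypothetical. In the planned system the combination would be learned by a neural network, and the diffusion-prior loss is not shown here.

```python
import math


def fuse_frame_features(features, scores):
    """Combine per-frame feature vectors into one vector.

    features: list of equal-length lists of floats, one per video frame
    scores:   list of floats, one per frame (hypothetical relevance scores;
              a trained network would predict these or replace this step)
    """
    if not features or len(features) != len(scores):
        raise ValueError("need at least one frame and one score per frame")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")

    # Softmax over frames, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum over frames, computed per feature dimension.
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


# Three frames with 2-dimensional features and equal scores,
# so every frame contributes equally to the fused vector.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = fuse_frame_features(frames, [0.0, 0.0, 0.0])
print(fused)
```

Because the weights are normalized over frames, the fused vector does not depend on frame order or frame count, which is one reason pooling of this kind is a common baseline for combining a variable number of views.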

Primary Host:	Alessio Del Bue (Istituto Italiano di Tecnologia)
Exchange Host:	Lourdes Agapito (University College London)
PhD Duration:	01 November 2023 - 01 June 2026
Exchange Duration:	01 January 2025 - 01 June 2025 - Ongoing
