David Svitov
PhD
Italian Institute of Technology (IIT)
Fast feed-forward video avatars with diffusion prior

Photo-realistic human avatars are a rapidly developing area of research. However, to the best of my knowledge, all current methods share two shortcomings: the lack of a human appearance prior and long training times. These shortcomings limit the applicability of photo-realistic human avatars. Both drawbacks can be addressed by developing a method that produces personalized avatars using a diffusion prior and operates in a feed-forward mode. In my PhD project, I plan to use video frames to obtain single-view avatars and then combine them using information from several frames. For this, I will use a pre-trained model to extract features from each view. I then plan to train a neural network that combines these features into a multi-view avatar, using the diffusion prior as the loss function. My research will reduce the time needed to obtain a personalized animated avatar from a video to several seconds. This will expand the applicability of avatars in telepresence tasks as well as in the entertainment industry, for example, in the metaverse or in AR/VR applications. I believe that avatar generation time and the difficulty of capturing the input video are the major barriers to the broad use of this technology.
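The sketch below illustrates the kind of pipeline the abstract describes: a frozen pre-trained encoder extracts features from each video frame, a trainable fusion network merges them into a single avatar representation, and a diffusion-prior term supervises the result. All module and function names (FrameEncoder, FeatureFusion, diffusion_prior_loss) are hypothetical placeholders, not the project's actual models; in particular, the loss shown is only a stand-in for a score-distillation-style objective that would query a real pre-trained diffusion model.

```python
import torch
import torch.nn as nn

class FrameEncoder(nn.Module):
    """Placeholder for a frozen pre-trained encoder mapping a frame to features."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, frame):                    # frame: (B, 3, H, W)
        return self.backbone(frame).flatten(1)   # (B, feat_dim)

class FeatureFusion(nn.Module):
    """Trainable network that merges per-frame features into one avatar code."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(feat_dim, feat_dim)

    def forward(self, frame_feats):              # frame_feats: (B, N_frames, feat_dim)
        fused, _ = self.attn(frame_feats, frame_feats, frame_feats)
        return self.head(fused.mean(dim=1))      # (B, feat_dim) avatar code

def diffusion_prior_loss(rendered_view, avatar_code):
    """Stand-in for a diffusion-prior loss; a real version would score the
    rendered view with a pre-trained image diffusion model."""
    return rendered_view.pow(2).mean() + 0.0 * avatar_code.sum()

# Feed-forward pass: encode each frame, fuse, and supervise with the prior.
encoder, fusion = FrameEncoder(), FeatureFusion()
encoder.requires_grad_(False)                    # the appearance prior stays frozen
frames = torch.rand(1, 4, 3, 128, 128)           # (B, N_frames, 3, H, W) toy input
feats = torch.stack([encoder(frames[:, i]) for i in range(frames.shape[1])], dim=1)
avatar_code = fusion(feats)
loss = diffusion_prior_loss(torch.rand(1, 3, 128, 128), avatar_code)
loss.backward()                                  # gradients update only the fusion net
```

Because only the lightweight fusion network is trained while the encoder and diffusion prior stay frozen, inference for a new subject reduces to a single feed-forward pass, which is what enables avatar creation in seconds rather than a lengthy per-subject optimization.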

Track:
Academic Track
PhD Duration:
November 1st, 2023 - June 1st, 2026
First Exchange:
October 1st, 2025 - June 1st, 2025