Improving Image Synthesis and Manipulation Control with GANs

Edgar Schönfeld (Ph.D. Student)

Two major challenges remain for generative adversarial networks (GANs): creating images that are indistinguishable from real images, and depicting precisely the content specified by the user. To this end, this work proposes methods to improve synthesis quality and maximize control over the image content. To increase the quality of synthetic images, we first propose segmentation-based adversarial losses. In particular, we redesign the GAN discriminator as a segmentation network that classifies each image pixel as real or fake, which offers new possibilities for regularization. The proposed method improves image quality in both unconditional and conditional GANs. Next, we show that segmentation-based adversarial losses are naturally well-suited to generating images from semantic layouts, a task also called semantic image synthesis, which offers precise control over the content of the image. In addition to adapting our approach to semantic image synthesis GANs, we introduce a noise injection method that makes these GANs more sensitive to the input noise. The proposed techniques yield improved image quality, new ways to edit images locally, better modelling of long-tailed datasets, and a strong increase in the multi-modality of the synthesized images. Finally, we show that this improved multi-modality opens the door to controlling the image content via the latent space of the GAN generator. Building on this, we introduce the first method for finding interpretable directions in the latent space of semantic image synthesis GANs, which enables additional control of the image content via latent controls.
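
To make the idea of a discriminator that classifies pixels as real or fake more concrete, the following is a minimal sketch of a per-pixel adversarial loss. It is an illustration under stated assumptions, not the implementation used in this work: the function names (`pixelwise_bce`, `discriminator_loss`, `generator_loss`) are hypothetical, the discriminator output is assumed to be a 2D map of per-pixel "real" probabilities, and plain binary cross-entropy stands in for whatever loss and regularization the actual method uses.

```python
import math


def pixelwise_bce(prob_map, target):
    """Mean binary cross-entropy over a 2D map of per-pixel 'real' probabilities.

    prob_map: list of rows, each value in [0, 1] (assumed discriminator output).
    target:   1.0 if every pixel should be judged real, 0.0 if fake.
    """
    eps = 1e-12  # clamp to avoid log(0)
    total = 0.0
    count = 0
    for row in prob_map:
        for p in row:
            p = min(max(p, eps), 1.0 - eps)
            total += -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
            count += 1
    return total / count


def discriminator_loss(real_map, fake_map):
    # The discriminator should label every pixel of a real image as real
    # and every pixel of a generated image as fake.
    return pixelwise_bce(real_map, 1.0) + pixelwise_bce(fake_map, 0.0)


def generator_loss(fake_map):
    # The generator is rewarded when its pixels are labelled real.
    return pixelwise_bce(fake_map, 1.0)


if __name__ == "__main__":
    # Hypothetical 2x2 per-pixel outputs, for illustration only.
    real_map = [[0.9, 0.8], [0.95, 0.7]]
    fake_map = [[0.1, 0.6], [0.2, 0.3]]
    print("discriminator loss:", discriminator_loss(real_map, fake_map))
    print("generator loss:", generator_loss(fake_map))
```

Because the loss is computed per pixel, the generator receives spatially localized feedback: in the hypothetical `fake_map` above, the pixel scored 0.6 contributes less to the generator loss than the pixel scored 0.1, indicating which regions of the image already look convincing.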

Primary Advisor: Bernt Schiele (Max Planck Institute for Informatics & Saarland University)
Industry Advisor: Anna Khoreva (Bosch Center for AI)
PhD Duration: 01 April 2019 - 30 September 2022