The document titled "DragGAN: User-Interactive and Generic Control of GANs for Image Manipulation" proposes a user-interactive method for manipulating Generative Adversarial Networks (GANs) to achieve flexible, precise, and generic control over synthesized visual content, including the position, pose, shape, expression, and layout of diverse categories such as animals, cars, humans, and landscapes.
The proposed method, DragGAN, enables point-based manipulation: users click handle points and target points on the image to control diverse spatial attributes. The approach consists of two components, sketched below: feature-based motion supervision, which optimizes the latent code so that the image content around each handle point moves a small step toward its target, and a new point tracking approach that relocates the handle points after each step. Together, these give users precise control over where pixels go, enabling manipulation of pose, shape, expression, and layout.
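The optimization loop alternates these two steps. The following is a minimal PyTorch sketch of both, assuming access to an intermediate generator feature map `feat` that is differentiable with respect to the latent code being optimized; the function names, patch radius, and search radius are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def sample_features(feat, xs, ys):
    """Bilinearly sample a feature map feat (1, C, H, W) at float pixel
    coordinates xs, ys; returns an (N, C) tensor."""
    _, _, H, W = feat.shape
    gx = xs.flatten() / (W - 1) * 2 - 1  # normalize to [-1, 1] for grid_sample
    gy = ys.flatten() / (H - 1) * 2 - 1
    grid = torch.stack([gx, gy], dim=-1).view(1, -1, 1, 2)
    out = F.grid_sample(feat, grid, align_corners=True)  # (1, C, N, 1)
    return out.squeeze(3).squeeze(0).t()

def motion_supervision_loss(feat, handles, targets, patch_radius=3):
    """Loss that nudges the content around each handle point one small step
    toward its target: features sampled at positions shifted by the unit
    vector d should match the current (detached) features at the handle."""
    loss = 0.0
    for (hx, hy), (tx, ty) in zip(handles, targets):
        d = torch.tensor([tx - hx, ty - hy], dtype=torch.float32)
        d = d / (d.norm() + 1e-8)
        ys, xs = torch.meshgrid(
            torch.arange(hy - patch_radius, hy + patch_radius + 1).float(),
            torch.arange(hx - patch_radius, hx + patch_radius + 1).float(),
            indexing="ij")
        cur = sample_features(feat, xs, ys).detach()  # no gradient: acts as target
        moved = sample_features(feat, xs + d[0], ys + d[1])
        loss = loss + F.l1_loss(moved, cur)
    return loss

def track_point(feat, init_feat, point, search_radius=6):
    """Relocate a handle point after an optimization step by nearest-neighbour
    search: pick the pixel in a window around the old point whose current
    feature is closest (L1) to the handle's initial feature init_feat (C,)."""
    px, py = point
    ys, xs = torch.meshgrid(
        torch.arange(py - search_radius, py + search_radius + 1).float(),
        torch.arange(px - search_radius, px + search_radius + 1).float(),
        indexing="ij")
    cand = sample_features(feat, xs, ys)  # (N, C) candidate features
    best = (cand - init_feat).abs().sum(dim=1).argmin()
    return int(xs.flatten()[best]), int(ys.flatten()[best])
```

In DragGAN, this loss is backpropagated into the StyleGAN latent code rather than the pixels, and the tracking step compensates for the handle points drifting as the image changes between iterations.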
Manipulation is efficient, typically taking only a few seconds per edit on a single RTX 3090 GPU, which makes live, interactive editing sessions practical and lets users quickly iterate on different layouts.
The paper demonstrates DragGAN's capabilities on diverse datasets, and the proposed GAN-based point tracking algorithm outperforms existing point tracking approaches on GAN-generated frames. Moreover, when combined with GAN inversion techniques, DragGAN also serves as a powerful tool for real image editing.
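For real images, inversion first recovers a latent code that reproduces the photo, which can then be edited with the point-based machinery above. The sketch below shows the generic optimization-based approach to GAN inversion, not the specific inversion method used in the paper; `G` and its `mean_latent()` helper are assumed interfaces, not a particular library's API.

```python
import torch
import torch.nn.functional as F

def invert_image(G, target, num_steps=500, lr=0.1):
    """Generic optimization-based GAN inversion: fit a latent code w so that
    G(w) reconstructs a real image `target` (1, 3, H, W, values in [-1, 1]).
    G.mean_latent() is a hypothetical helper returning an average latent
    to initialize from; G(w) is assumed to decode a latent to an image."""
    w = G.mean_latent().clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(num_steps):
        opt.zero_grad()
        loss = F.mse_loss(G(w), target)  # in practice add a perceptual (LPIPS) term
        loss.backward()
        opt.step()
    return w.detach()  # edit this latent with DragGAN, then decode back to pixels
```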