Publications

Adversarial Adaptive Interpolation in Autoencoders for Dually Regularizing Representation Learning

Published in IEEE MultiMedia, 2022

A new interpolation method that follows the data manifold.

Data interpolation is typically used to explore and understand the latent representation learnt by a deep network. Naive linear interpolation may induce a mismatch between the interpolated data and the underlying manifold of the original data. In this paper, we propose an Adversarial Adaptive Interpolation (AdvAI) approach for facilitating representation learning and image synthesis in autoencoders. To determine an interpolation path that stays on the manifold, we incorporate an interpolation correction module, which learns to offset the deviation from the manifold. Further, we perform matching with a prior distribution to control the characteristics of the representation. The data synthesized from random codes, along with interpolation-based regularization, are in turn used to constrain the representation learning process. In the experiments, the superior performance of the proposed approach demonstrates the effectiveness of AdvAI and the associated regularizers on a variety of downstream tasks.
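The core idea, correcting a linear interpolant so it stays on the data manifold, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `toy_correction` is a hand-written stand-in for the learned correction module, using the unit sphere as a mock "manifold", whereas the actual module is a network trained adversarially.

```python
def linear_interpolate(z1, z2, alpha):
    """Naive linear interpolation between two latent codes."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(z1, z2)]

def adaptive_interpolate(z1, z2, alpha, correction):
    """Adaptive interpolation: a correction term offsets the deviation
    of the linear interpolant from the data manifold."""
    z = linear_interpolate(z1, z2, alpha)
    offset = correction(z)
    return [zi + oi for zi, oi in zip(z, offset)]

def toy_correction(z):
    """Hypothetical stand-in for the learned correction module: treats
    the unit sphere as a mock 'manifold' and returns the offset that
    projects z back onto it. The real module is learned from data."""
    norm = sum(zi * zi for zi in z) ** 0.5 or 1.0
    return [zi / norm - zi for zi in z]
```

With this toy correction, the midpoint of two unit-norm codes, which a naive linear path would pull off the sphere, is pushed back onto it.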

Discovering Density-Preserving Latent Space Walks in GANs for Semantic Image Transformations

Published in ACM Multimedia, 2021

A new traversal method that preserves the latent probability density and extracts useful directions from a pretrained generator.

Generative adversarial network (GAN)-based models possess superior capability of high-fidelity image synthesis. There are a wide range of semantically meaningful directions in the latent representation space of well-trained GANs, and the corresponding latent space walks are meaningful for semantic controllability in the synthesized images. To explore the underlying organization of a latent space, we propose an unsupervised Density-Preserving Latent Semantics Exploration model (DP-LaSE). The important latent directions are determined by maximizing the variations in intermediate features, while the correlation between the directions is minimized. Considering that latent codes are sampled from a prior distribution, we adopt a density-preserving regularization approach to ensure latent space walks are maintained in iso-density regions, since moving to a higher/lower density region tends to cause unexpected transformations. To further refine semantics-specific transformations, we perform subspace learning over intermediate feature channels, such that the transformations are limited to the most relevant subspaces. Extensive experiments on a variety of benchmark datasets demonstrate that DP-LaSE is able to discover interpretable latent space walks, and specific properties of synthesized images can thus be precisely controlled.
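The density-preserving constraint has a simple geometric reading when the prior is a standard Gaussian: iso-density regions are spheres of constant norm, so a walk step can be rescaled to keep the code's norm unchanged. The sketch below illustrates only this idea; `density_preserving_step` is an illustrative name, not the model's actual regularizer, and assumes latent codes are plain Python lists.

```python
def density_preserving_step(z, direction, step):
    """Take one latent-space walk step along `direction`, then rescale
    so the code stays on the iso-density shell of a standard Gaussian
    prior (i.e. its Euclidean norm is preserved)."""
    moved = [zi + step * di for zi, di in zip(z, direction)]
    norm_z = sum(zi * zi for zi in z) ** 0.5
    norm_m = sum(mi * mi for mi in moved) ** 0.5 or 1.0
    return [mi * norm_z / norm_m for mi in moved]
```

Without the rescaling, repeated steps would drift into higher- or lower-density regions, which the paper identifies as a source of unexpected transformations.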

Adversarial Adaptive Interpolation for Regularizing Representation Learning and Image Synthesis in Autoencoders

Published in ICME, 2021

A new interpolation method that follows the data manifold.

Data interpolation is typically used to explore and understand the latent representation learnt by a deep network. Naive linear interpolation may induce a mismatch between the interpolated data and the underlying manifold of the original data. In this paper, we propose an Adversarial Adaptive Interpolation (AdvAI) approach for facilitating representation learning and image synthesis in autoencoders. To determine an interpolation path that stays on the manifold, we incorporate an interpolation correction module, which learns to offset the deviation from the manifold. Further, we perform matching with a prior distribution to control the characteristics of the representation. The data synthesized from random codes, along with interpolation-based regularization, are in turn used to constrain the representation learning process. In the experiments, the superior performance of the proposed approach demonstrates the effectiveness of AdvAI and the associated regularizers on a variety of downstream tasks.

High Fidelity GAN Inversion via Prior Multi-Subspace Feature Composition

Published in AAAI, 2021

Check this paper out if you are interested in the following questions:

  • How to analyze a pre-trained generator's knowledge?
  • How to find a target image in a pre-trained generator?
  • How to manipulate your images?

Generative Adversarial Networks (GANs) have shown impressive gains in image synthesis. GAN inversion was recently studied to understand and utilize the knowledge a GAN learns, where a real image is inverted back to a latent code and can thus be reconstructed by the generator. Although increasing the number of latent codes can improve inversion quality to a certain extent, we find that important details may still be neglected when performing feature composition over all the intermediate feature channels. To address this issue, we propose a Prior multi-Subspace Feature Composition (PmSFC) approach for high-fidelity inversion. Considering that the intermediate features are highly correlated with each other, we incorporate a self-expressive layer in the generator to discover meaningful subspaces. In this case, the features of a channel can be expressed as a linear combination of those of other channels in the same subspace. We perform feature composition separately in the subspaces. The semantic differences between them benefit the inversion quality, since the inversion process is regularized based on different aspects of semantics. In the experiments, the superior performance of PmSFC demonstrates the effectiveness of prior subspaces in facilitating GAN inversion, together with extended applications in visual manipulation.
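The self-expression idea, each channel's features written as a linear combination of the other channels' features with a zero diagonal, can be made concrete with a small sketch. `self_express` and the toy matrices below are illustrative; the paper implements this as a trainable layer inside the generator.

```python
def self_express(features, coeffs):
    """Self-expressive reconstruction: channel i's feature vector is
    rebuilt as a linear combination of the other channels' features.
    coeffs[i][j] weights channel j when reconstructing channel i; the
    diagonal is forced to zero so no channel explains itself."""
    n, d = len(features), len(features[0])
    out = []
    for i in range(n):
        row = [0.0] * d
        for j in range(n):
            if i == j:
                continue  # zero diagonal
            for k in range(d):
                row[k] += coeffs[i][j] * features[j][k]
        out.append(row)
    return out
```

Channels that reconstruct each other well (large mutual coefficients) end up grouped in the same subspace, which is what makes per-subspace feature composition possible.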

Improving Representation Learning in Autoencoders via Multidimensional Interpolation and Dual Regularizations

Published in IJCAI, 2019

Autoencoders enjoy a remarkable ability to learn data representations. Research on autoencoders shows that the effectiveness of data interpolation can reflect the performance of representation learning. However, existing interpolation methods in autoencoders cannot sufficiently traverse the region between datapoints on the data manifold, and the distribution of the interpolated latent representations is not considered. To address these issues, we aim to fully exert the potential of data interpolation and further improve representation learning in autoencoders. Specifically, we propose a multidimensional interpolation approach that increases the capability of data interpolation by setting a random interpolation coefficient for each dimension of the latent representations. In addition, we regularize autoencoders in both the latent and data spaces, by imposing a prior on the latent representations in the Maximum Mean Discrepancy (MMD) framework and encouraging generated datapoints to be realistic in the Generative Adversarial Network (GAN) framework. Compared to representative models, our proposed approach empirically achieves better representation learning performance on downstream tasks across multiple benchmarks.
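The difference from ordinary interpolation is easy to see in code: instead of one shared scalar coefficient, each latent dimension draws its own. This is a minimal sketch under the assumption that latent codes are plain Python lists; `multidim_interpolate` is an illustrative name, not the paper's implementation.

```python
import random

def multidim_interpolate(z1, z2, rng=random):
    """Multidimensional interpolation: sample an independent coefficient
    per latent dimension, so interpolants cover a whole region between
    the two codes rather than a single line segment."""
    coeffs = [rng.random() for _ in z1]
    return [a * x + (1.0 - a) * y for a, x, y in zip(coeffs, z1, z2)]
```

Each output coordinate still lies between the corresponding endpoint coordinates, but the interpolant as a whole can leave the straight line connecting the two codes.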