Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space
Anh Nguyen, Jason Yosinski, Yoshua Bengio, Alexey Dosovitskiy, Jeff Clune
[GitHub] [Arxiv]
Slides by Víctor Garcia, UPC Computer Vision Reading Group (27/01/2017)
Probabilistic Interpretation of the method
Metropolis-adjusted Langevin algorithm (MALA): an MCMC algorithm for iteratively producing random samples from a distribution p(x):

x_(t+1) = x_t + ε₁₂ · ∂log p(x_t)/∂x_t + N(0, ε₃²)

where x_(t+1) is the future state, x_t is the current state, the gradient term steps towards the natural manifold of p(x), and the last term is noise.
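The update rule above can be sketched with a toy analytic prior (a minimal sketch, not the paper's code: the standard-Gaussian `grad_log_p` and the step sizes are assumptions for the demo, and the accept/reject step of full MALA is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_log_p(x, mu=0.0, sigma=1.0):
    # Analytic gradient of log N(mu, sigma^2); stands in for the learned
    # prior gradient "towards the natural manifold of p(x)" in the slides.
    return -(x - mu) / sigma**2

def mala_step(x, eps=0.1, noise_scale=0.1):
    # x_(t+1) = x_t + eps * d log p(x_t)/dx_t + N(0, noise_scale^2)
    return x + eps * grad_log_p(x) + rng.normal(0.0, noise_scale, size=x.shape)

x = np.array([5.0])          # current state x_t, started far from the mode
for _ in range(200):         # iterating drifts samples towards the manifold of p(x)
    x = mala_step(x)
```

With these toy settings, the chain drifts from the initial state x = 5 towards the mode of p(x) at 0 and then wanders around it with small noise.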
Probabilistic Interpretation of the method
For class-conditional sampling, the gradient splits into two terms:

x_(t+1) = x_t + ε₁ · ∂log p(x_t)/∂x_t + ε₂ · ∂log p(y=c | x_t)/∂x_t + N(0, ε₃²)

where the ε₁ term is a step towards a more generic image, the ε₂ term is a step towards an image that causes the classifier to produce a higher score for class c, and the last term is noise.
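The two-term conditional update can be sketched the same way (a toy demo, not the paper's code: the N(0, 1) prior and the constant "classifier" gradient that favours larger x are assumptions standing in for the learned networks):

```python
import numpy as np

rng = np.random.default_rng(1)

def grad_log_prior(x):
    # Step towards a more generic image (toy prior: log N(0, 1)).
    return -x

def grad_log_classifier(x):
    # Step towards a higher score for class c (toy stand-in: constant pull
    # towards larger x, as if the classifier preferred positive values).
    return np.ones_like(x)

def conditional_step(x, eps1=0.05, eps2=0.05, eps3=0.05):
    # x_(t+1) = x_t + eps1*dlogp(x)/dx + eps2*dlogp(y=c|x)/dx + N(0, eps3^2)
    return (x + eps1 * grad_log_prior(x)
              + eps2 * grad_log_classifier(x)
              + rng.normal(0.0, eps3, size=x.shape))

x = np.zeros(3)              # current state x_t
for _ in range(300):         # the chain settles where prior and classifier balance
    x = conditional_step(x)
```

With these toy gradients the fixed point of the deterministic part is x = 1, so the chain hovers there: the prior keeps samples generic while the classifier term pulls them towards class c.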
Probabilistic Interpretation of the method
Rough example (figure): starting from an image x_t, with y_co = content activations and y_st = style activations, iterative updates lead to x_(t+i).
Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
Method | Joint PPGN-h: joint Generator and DAE
Noise sweeps: for the last model we test the reconstruction of different h/fc6 vectors when adding different noise levels:

fc6 + N(0, σ²)
Even when mapping with a lot of noise, much of the image information can still be recovered: many noisy codes map to one reconstruction (many → one).
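The many → one behaviour can be illustrated with a toy denoiser (a sketch under assumptions: the small `codebook` and nearest-neighbour projection stand in for the trained DAE pulling noisy fc6 codes back onto the manifold):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical codebook of "clean" codes; the real model's manifold of fc6
# activations plays this role.
codebook = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])

def denoise(h_noisy):
    # Project a noisy code onto the nearest clean code: many noisy inputs
    # collapse to one output (many -> one).
    dists = np.linalg.norm(codebook - h_noisy, axis=1)
    return codebook[np.argmin(dists)]

h = np.array([1.0, 0.0])                 # clean fc6-like code
for scale in [0.1, 0.3, 0.5]:            # noise sweep: increasing noise levels
    h_noisy = h + rng.normal(0.0, scale, size=2)
    recovered = denoise(h_noisy)         # usually recovers the original code
```

For moderate noise the projection almost always lands back on the original code, mirroring how reconstructions in the noise sweeps stay recognisable.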
Method | Joint PPGN-h: joint Generator and DAE
Combination of Losses
Comparison of losses (figure): real images vs. reconstructions obtained when training with different combinations of losses.
Method | Joint PPGN-h: joint Generator and DAE Evaluating: Qualitatively
Index
● Introduction
● Probabilistic Interpretation of the method
● Methods and Experiments
○ PPGN-x: DAE model of p(x)
○ DGN-AM: sampling without a learned prior
○ PPGN-h: Generator and DAE model of p(h)
○ Joint PPGN-h: joint Generator and DAE
Further Experiments | Inpainting
Further Experiments | Multifaceted Feature Visualization
Conclusions
● Using only a GAN loss for reconstruction, the generator collapses into a few modes, far from the original p(x).
● Using extra losses, it is possible to reconstruct the images better, even for 1000 classes and at higher resolution. The one-to-one mapping helps prevent the typical missing-modes problem in the latent space.
● It would be great to also learn the embedding space for these high-resolution, multi-class images, instead of using a feature space learned with supervision.