
Deep Generative Image Models using a Laplacian Pyramid ...duvenaud/courses/csc2541/slides/gan... · Motivation Current deep ... Learn feature representations of images & text ...

Apr 21, 2018

vanhuong

Applications of GANs

● Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

● Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks

● Generative Adversarial Text to Image Synthesis


Using GANs for Single Image Super-Resolution

Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, Wenzhe Shi


Problem

How do we get a high-resolution (HR) image from just one low-resolution (LR) image?

Answer: We use super-resolution (SR) techniques.

http://www.extremetech.com/wp-content/uploads/2012/07/super-resolution-freckles.jpg

Previous Attempts


SRGAN


SRGAN - Generator

● G: generator that takes a low-res image $I^{LR}$ and outputs its high-res counterpart $I^{SR}$

● $\theta_G$: parameters of G, $\{W_{1:L}, b_{1:L}\}$

● $l^{SR}$: loss function that measures the difference between the two high-res images, $I^{SR}$ and $I^{HR}$
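
For reference, the training objective that these symbols describe is usually written as below. This is a reconstruction from the definitions on this slide; the number of training pairs $N$ and the index $n$ are assumptions introduced here, not symbols from the slide.

```latex
\hat{\theta}_G = \arg\min_{\theta_G} \frac{1}{N} \sum_{n=1}^{N}
    l^{SR}\!\left( G_{\theta_G}\!\left(I_n^{LR}\right),\; I_n^{HR} \right)
```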


SRGAN - Discriminator

● D: discriminator that classifies whether a high-res image is $I^{HR}$ (real) or $I^{SR}$ (generated)

● $\theta_D$: parameters of D
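
The generator and discriminator are trained against each other. A hedged way to write this, adapting the standard GAN minimax game to the symbols on these slides (the distribution names $p_{train}$ and $p_G$ are assumptions introduced here), is:

```latex
\min_{\theta_G} \max_{\theta_D} \;
  \mathbb{E}_{I^{HR} \sim p_{train}(I^{HR})}\!\left[ \log D_{\theta_D}\!\left(I^{HR}\right) \right]
  + \mathbb{E}_{I^{LR} \sim p_G(I^{LR})}\!\left[ \log\!\left( 1 - D_{\theta_D}\!\left( G_{\theta_G}\!\left(I^{LR}\right) \right) \right) \right]
```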


SRGAN - Perceptual Loss Function

The loss is calculated as a weighted combination of:

➔ Content loss

➔ Adversarial loss

➔ Regularization loss
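
A minimal sketch of the weighted combination listed above. The function name and the default weights `w_adv` and `w_reg` are illustrative placeholders chosen for this example, not values taken from the slides.

```python
def perceptual_loss(content, adversarial, regularization,
                    w_adv=1e-3, w_reg=2e-8):
    """Weighted sum of the three loss terms.

    The default weights are hypothetical; they only illustrate that the
    adversarial and regularization terms are scaled relative to the
    content term.
    """
    return content + w_adv * adversarial + w_reg * regularization


# Example with easy-to-check weights.
total = perceptual_loss(1.0, 2.0, 3.0, w_adv=0.5, w_reg=0.1)
```

In practice each of the three inputs would itself be computed from a batch of images, as described on the following slides.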


SRGAN - Content Loss

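Instead of pixel-wise MSE, use a loss function based on the ReLU activation layers of a pre-trained VGG network. This encourages similarity of content between the super-resolved image and the reference image.

● $\phi_{i,j}$: feature map of the jth convolution before the ith max-pooling layer

● $W_{i,j}$ and $H_{i,j}$: dimensions of the feature maps in the VGG network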
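
The idea of comparing images in feature space rather than pixel space can be sketched as follows. This is only an illustration: `toy_phi` is a hypothetical stand-in for a VGG feature map (a real implementation would use a pre-trained network), and images are plain nested lists.

```python
def feature_mse(feat_hr, feat_sr):
    """Mean squared difference between two H x W feature maps."""
    h = len(feat_hr)
    w = len(feat_hr[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            diff = feat_hr[y][x] - feat_sr[y][x]
            total += diff * diff
    return total / (w * h)


def content_loss(phi, img_hr, img_sr):
    """Compare the two images after mapping both through phi."""
    return feature_mse(phi(img_hr), phi(img_sr))


def toy_phi(img):
    """Hypothetical feature extractor: ReLU of horizontal differences.

    This is NOT VGG; it only mimics 'a filter followed by ReLU'.
    """
    return [[max(0.0, row[x + 1] - row[x]) for x in range(len(row) - 1)]
            for row in img]


img_hr = [[0, 1, 3], [2, 2, 5]]
img_sr = [[0, 1, 1], [2, 2, 2]]
loss = content_loss(toy_phi, img_hr, img_sr)
```

Because the loss is computed on features, two images that differ pixel by pixel but produce the same features incur no penalty.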


SRGAN - Adversarial Loss

Encourages the network to favour images that reside on the manifold of natural images.
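
A minimal sketch of a generator-side adversarial loss, assuming the common form that sums $-\log D(G(I^{LR}))$ over a batch. The function name and the `eps` clamp are choices made for this example.

```python
import math


def generator_adversarial_loss(d_scores, eps=1e-12):
    """Sum of -log D(G(I_LR)) over a batch.

    d_scores: discriminator probabilities that each generated image is real.
    eps: small clamp (an assumption here) to avoid log(0).
    """
    return sum(-math.log(max(s, eps)) for s in d_scores)


# The loss shrinks as the discriminator is fooled more often.
fooled = generator_adversarial_loss([0.9, 0.8])
caught = generator_adversarial_loss([0.1, 0.2])
```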


SRGAN - Regularization Loss

Encourages spatially coherent solutions through a total variation penalty.
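
As an illustration, a simple anisotropic total variation can be computed as below. This is a sketch of the general idea; the exact norm and normalisation used in SRGAN may differ.

```python
def total_variation(img):
    """Sum of absolute differences between horizontal and vertical neighbours.

    img: 2-D image as a list of equal-length rows.
    """
    h, w = len(img), len(img[0])
    tv = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                tv += abs(img[y][x + 1] - img[y][x])
            if y + 1 < h:
                tv += abs(img[y + 1][x] - img[y][x])
    return tv


flat = total_variation([[1, 1], [1, 1]])      # smooth image
checker = total_variation([[0, 1], [1, 0]])   # noisy image
```

A flat image gives zero penalty, while a checkerboard pattern is penalised on every neighbouring pair.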


SRGAN - Examples


SRGAN - Examples


Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks

Work by Emily Denton, Soumith Chintala, Arthur Szlam, Rob Fergus


Short Background


Conditional Generative Adversarial Nets (CGAN)

Mirza and Osindero (2014)

GAN

CGAN
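
The two objectives labelled above are commonly written as follows. The notation is an assumption made for this sketch: $x$ is a real image, $z$ is noise, and $y$ is the conditioning information passed to both networks in the conditional case.

```latex
% GAN
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{data}(x)}\left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_z(z)}\left[ \log\left( 1 - D(G(z)) \right) \right]

% CGAN: both G and D also receive the conditioning variable y
\min_G \max_D \;
  \mathbb{E}_{x, y \sim p_{data}(x, y)}\left[ \log D(x, y) \right]
  + \mathbb{E}_{z \sim p_z(z),\, y \sim p_y(y)}\left[ \log\left( 1 - D(G(z, y), y) \right) \right]
```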


Laplacian pyramid

Burt and Adelson (1983)

Laplacian pyramid

Burt and Adelson (1983)
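
A minimal sketch of building and inverting a Laplacian pyramid. It makes simplifying assumptions: downsampling averages 2x2 blocks and upsampling repeats pixels, whereas the original construction uses Gaussian-like filtering, and image sides are assumed to be divisible by 2 at every level. All function names are chosen for this example.

```python
def downsample(img):
    """Halve each dimension by averaging 2x2 blocks."""
    return [[(img[y][x] + img[y][x + 1]
              + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]


def upsample(img):
    """Double each dimension by repeating every pixel."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def combine(a, b, sign):
    """Elementwise a + sign * b for two images of equal shape."""
    return [[p + sign * q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def build_laplacian_pyramid(img, levels):
    """Return [h_0, ..., h_{levels-1}, coarsest image]."""
    pyramid = []
    current = img
    for _ in range(levels):
        smaller = downsample(current)
        # Residual: detail lost when going down and back up one level.
        pyramid.append(combine(current, upsample(smaller), -1))
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert the pyramid: upsample and add residuals, coarse to fine."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = combine(upsample(current), residual, +1)
    return current


image = [[float(4 * y + x) for x in range(4)] for y in range(4)]
pyramid = build_laplacian_pyramid(image, 2)
restored = reconstruct(pyramid)
```

The residual levels store only the detail that the coarser level cannot represent, which is what makes the decomposition invertible.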


Laplacian Pyramid Generative Adversarial Network (LAPGAN)


Image Generation


Training


Generation: Coarse to fine
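
Coarse-to-fine generation can be sketched as the loop below: start from a small generated image, then repeatedly upsample it and add a residual produced by a generator that is conditioned on the upsampled image. The functions `toy_coarse` and `toy_residual` are hypothetical stand-ins for trained networks and only produce random numbers.

```python
import random


def upsample(img):
    """Double each dimension by repeating every pixel."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def sample_lapgan(coarse_generator, residual_generators, rng):
    """Generate an image coarse to fine.

    coarse_generator(rng) -> smallest image.
    Each residual generator takes (upsampled image, rng) and returns a
    residual of the same shape, which is added to the upsampled image.
    """
    image = coarse_generator(rng)
    for generate_residual in residual_generators:
        low_pass = upsample(image)
        residual = generate_residual(low_pass, rng)
        image = [[l + r for l, r in zip(lrow, rrow)]
                 for lrow, rrow in zip(low_pass, residual)]
    return image


def toy_coarse(rng, size=4):
    """Hypothetical unconditional generator for the coarsest level."""
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def toy_residual(low_pass, rng):
    """Hypothetical conditional generator: small random detail."""
    return [[rng.gauss(0.0, 0.05) for _ in row] for row in low_pass]


rng = random.Random(0)
sample = sample_lapgan(toy_coarse, [toy_residual] * 3, rng)  # 4 -> 8 -> 16 -> 32
```

Because noise is drawn at every level, rerunning the finer stages on the same coarse image yields different final samples.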


Different draws, starting from the same initial 4x4 image


Some thoughts on the method

● The Laplacian Pyramid Framework is independent of the Generative Model

Possible to use a completely different model like Pixel RNN


Some thoughts on the method

● The Generative Models at each step can be totally different!

These can also be different models!


Some thoughts on the method

● The Generative Models at each step can be totally different!

Low resolution architecture

High resolution architecture


Generative Adversarial Text to Image Synthesis

Authors’ code available at: https://github.com/reedscot/icml2016

Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee


Motivation

Current deep learning models enable us to...

➢ Learn feature representations of images & text

➢ Generate realistic images & text

pull out images based on captions

generate descriptions based on images

answer questions about image content


Problem - Multimodal distribution

• Many plausible images can be associated with a single text description

• A previous attempt used Variational Recurrent Autoencoders to generate images from text captions, but the images were not realistic enough (Mansimov et al. 2016)


What GANs can do

• CGAN: Use side information (e.g. classes) to guide the learning process

• Minimax game: Adaptive loss function

➢ Multi-modality is a property that GANs are well suited to learn.


The Model - Basic CGAN

Pre-trained char-CNN-RNN

Learns a compatibility function of images and text -> joint embedding


The Model - Variations

GAN-CLS

In order to distinguish different error sources:

Present three different types of input to the discriminator network (instead of two): {real image, right text}, {real image, wrong text} and {fake image, right text}.

Algorithm
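
A minimal sketch of how the three discriminator scores could be combined in a matching-aware objective. The function name, the argument names and the `eps` clamp are choices made for this example; the scores are assumed to be probabilities output by the discriminator.

```python
import math


def gan_cls_objectives(s_real, s_wrong, s_fake, eps=1e-12):
    """Return (discriminator objective, generator objective), both maximised.

    s_real:  D(real image, matching text)
    s_wrong: D(real image, mismatched text)
    s_fake:  D(generated image, matching text)
    """
    def log(p):
        return math.log(max(p, eps))  # clamp to avoid log(0)

    # Real image with wrong text and fake image with right text
    # are both treated as "fake" and share the penalty equally.
    d_objective = log(s_real) + 0.5 * (log(1.0 - s_wrong) + log(1.0 - s_fake))
    g_objective = log(s_fake)
    return d_objective, g_objective


d_obj, g_obj = gan_cls_objectives(s_real=0.9, s_wrong=0.2, s_fake=0.1)
```

The extra {real image, wrong text} term gives the discriminator a signal about whether the image matches the text, separate from whether the image looks realistic.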


The Model - Variations cont.

GAN-INT

In order to generalize the output of G:

Interpolate between training set text embeddings to generate new text embeddings and hence fill the gaps on the image data manifold.

Updated Equation

GAN-INT-CLS: Combination of both previous variations

{fake image, fake text}
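
The interpolation step of GAN-INT can be sketched as a convex combination of two text embeddings. The function name and the default `beta=0.5` are assumptions for this example; embeddings are plain lists of floats.

```python
def interpolate_embeddings(t1, t2, beta=0.5):
    """Return beta * t1 + (1 - beta) * t2, elementwise."""
    if len(t1) != len(t2):
        raise ValueError("embeddings must have the same length")
    return [beta * a + (1.0 - beta) * b for a, b in zip(t1, t2)]


# The interpolated embedding has no corresponding real image, so the
# discriminator only ever sees it paired with a generated image.
mixed = interpolate_embeddings([0.0, 2.0], [2.0, 0.0])
```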


Disentangling

❖ Style is background, position & orientation of the object, etc.

❖ Content is shape, size & colour of the object, etc.

● Introduce S(x), a style encoder with a squared loss function:

● Useful in generalization: encoding style and content separately allows for different new combinations
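
The squared loss for the style encoder mentioned above is commonly written as below. The symbols are assumptions for this sketch, since the equation did not survive in the transcript: $z$ is the noise vector, $t$ a text description, and $\varphi(t)$ its embedding.

```latex
\mathcal{L}_{style} = \mathbb{E}_{t,\; z \sim \mathcal{N}(0, 1)}
    \left\| z - S\!\left( G\!\left(z, \varphi(t)\right) \right) \right\|_2^2
```

Under this reading, $S$ learns to recover the noise vector from a generated image, so that $z$ carries the style while the text embedding carries the content.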


Training - Data (separated into class-disjoint train and test sets)

Caltech-UCSD Birds

MS COCO

Oxford Flowers


Training – Results: Flower & Bird


Training – Results: MS COCO

Mansimov et al.


Training – Results: Style disentangling


Thoughts on the paper

• Image quality

• Generalization

• Future work
