Disentangling Disentanglement in Variational Autoencoders
ICML 2019
Emile Mathieu*, Tom Rainforth*, N. Siddharth*, Yee Whye Teh
June 12, 2019
Departments of Statistics and Engineering Science, University of Oxford


Variational Autoencoders

[Diagram: observations x (x1 to x5) and latents z (z1 to z4), linked by a generative model and an inference model. Factors: zl (gender), zm (beard), zn (makeup).]


Disentanglement = Independence

[Diagram: observations x (x1 to x5, with xi and xj marked) and latents z (z1 to z4), linked by a generative model and an inference model. Meaningful factors: zl (gender), zm (beard), zn (makeup).]


Disentanglement = Independence

[Diagram: observations x (x1 to x5) and latents z (z1 to z4), linked by a generative model and an inference model. Independent factors: zl (shape), zm (angle), zn (scale).]


Decomposition ∈ {Independence, Clustering, Sparsity, …}

[Diagram: observations x (x1 to x5) and latents z (z1 to z4), linked by a generative model and an inference model. Co-related factors: zl (gender), zm (beard), zn (makeup).]


Decomposition: A Generalization of Disentanglement

We characterise decomposition as the fulfilment of two factors:

(a) the level of overlap between encodings in the latent space,
(b) the match between the marginal posterior qφ(z) and a structured prior p(z), which constrains the latents to the required decomposition.
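To make factor (b) concrete: the marginal posterior qφ(z) is the average of the per-datapoint encodings qφ(z|x) over the data. The sketch below draws samples from it, assuming diagonal-Gaussian encodings; the function name and the toy means and standard deviations are hypothetical and not taken from the talk.

```python
import numpy as np

def sample_aggregate_posterior(mu, sigma, n_samples, rng):
    """Sample from q(z) = (1/N) * sum_n N(z; mu[n], diag(sigma[n]^2))."""
    n_points, latent_dim = mu.shape
    # Pick a datapoint uniformly at random, then sample from its encoding.
    idx = rng.integers(0, n_points, size=n_samples)
    eps = rng.standard_normal((n_samples, latent_dim))
    return mu[idx] + sigma[idx] * eps

rng = np.random.default_rng(0)
mu = np.array([[-2.0, 0.0], [2.0, 0.0]])  # hypothetical encoder means
sigma = np.full_like(mu, 0.5)             # hypothetical encoder std devs
z = sample_aggregate_posterior(mu, sigma, 10_000, rng)
```

Samples such as `z` can then be compared against samples from p(z); shrinking or growing `sigma` changes how much the individual encodings overlap, which is factor (a).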


Decomposition: An Analysis

Desired Structure

[Figure: target structure p(z).]


Decomposition: An Analysis

Insufficient Overlap

[Figure labels: pD(x), qφ(z|x), qφ(z), p(z), pθ(x|z), pθ(x); annotated "Insufficient Overlap".]


Decomposition: An Analysis

Too Much Overlap

[Figure labels: pD(x), qφ(z|x), qφ(z), p(z), pθ(x|z), pθ(x); annotated "Too Much Overlap".]


Decomposition: An Analysis

Appropriate Overlap

[Figure labels: pD(x), qφ(z|x), qφ(z), p(z), pθ(x|z), pθ(x); annotated "Appropriate Overlap".]


Overlap — Deconstructing the β-VAE

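\[
\begin{aligned}
\mathcal{L}_\beta(x) &= \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] - \beta \cdot \mathrm{KL}(q_\phi(z|x) \,\|\, p(z)) \\
&= \underbrace{\mathcal{L}(x)\,(\pi_{\theta,\beta}, q_\phi)}_{\text{ELBO with } \beta\text{-annealed prior}}
 + \underbrace{(\beta - 1) \cdot H_{q_\phi}}_{\text{maxent}}
 + \underbrace{\log F_\beta}_{\text{constant}}
\end{aligned}
\]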

Implications
• The β-VAE disentangles largely by controlling the level of overlap.
• It places no direct pressure on the latents to be independent!
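As a rough illustration of the β-VAE objective, here is a minimal NumPy sketch for one common special case. It assumes a diagonal-Gaussian encoder and a standard normal prior p(z) = N(0, I), so the KL term has a closed form; the reconstruction term is passed in as a number. The function names and example values are hypothetical.

```python
import numpy as np

def gaussian_kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

def beta_vae_objective(recon_log_lik, mu, log_var, beta):
    """L_beta(x) = E_q[log p(x|z)] - beta * KL(q(z|x) || p(z)).

    recon_log_lik is an estimate of the reconstruction term (assumed given).
    """
    return recon_log_lik - beta * gaussian_kl_to_standard_normal(mu, log_var)

# Hypothetical encoder output for one datapoint with a 2-D latent.
mu = np.array([1.0, 0.0])
log_var = np.array([0.0, 0.0])
value = beta_vae_objective(recon_log_lik=-10.0, mu=mu, log_var=log_var, beta=4.0)
```

In this example the KL term equals 0.5, so `value` is -10 - 4 × 0.5 = -12. Raising `beta` penalises encodings that move away from the prior, which is how the level of overlap is controlled.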


Decomposition: Objective

Lα,β(x) = Eqφ(z|x)[log pθ(x | z)]      (reconstruct observations)
          − β · KL(qφ(z | x) ‖ p(z))   (control level of overlap)
          − α · D(qφ(z), p(z))         (impose desired structure)
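The slide leaves the divergence D unspecified. The sketch below shows one possible minibatch estimate of the objective, assuming D is a squared maximum mean discrepancy (MMD) with an RBF kernel, computed between samples of qφ(z) and samples of p(z). The kernel, its bandwidth, and all function names are illustrative assumptions rather than the authors' stated choice.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    """Pairwise RBF kernel matrix between rows of a and rows of b."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return np.exp(-sq / (2.0 * bandwidth**2))

def mmd_squared(zq, zp, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two sample sets."""
    return (rbf_kernel(zq, zq, bandwidth).mean()
            + rbf_kernel(zp, zp, bandwidth).mean()
            - 2.0 * rbf_kernel(zq, zp, bandwidth).mean())

def decomposition_objective(recon_log_lik, kl, zq, zp, alpha, beta):
    """Batch average of (reconstruction - beta * KL), minus alpha * D.

    recon_log_lik, kl: per-datapoint terms, shape (batch,)
    zq: samples from the aggregate posterior; zp: samples from the prior.
    """
    return np.mean(recon_log_lik - beta * kl) - alpha * mmd_squared(zq, zp)

rng = np.random.default_rng(0)
zp = rng.standard_normal((200, 2))   # samples from a N(0, I) prior (assumed)
zq = zp + 3.0                        # deliberately mismatched "posterior" samples
penalty = mmd_squared(zq, zp)
```

Note that the third term depends on the whole dataset through qφ(z), so in practice it is estimated per minibatch, as here, rather than per datapoint.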


Decomposition: Generalising Disentanglement

Independence: p(z) = N(0, σ*)

Figure 1: β-VAE trained on 2D Shapes¹, computing disentanglement².

¹ Matthey et al., dSprites: Disentanglement testing Sprites dataset, p. 1.
² Kim and Mnih, “Disentangling by Factorising”, p. 2.


Decomposition: Generalising Disentanglement

Clustering: p(z) = ∑k ρk · N (µk,σk)
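For reference, the log-density of such a mixture-of-Gaussians prior can be evaluated as in the sketch below. It assumes each σk is a vector of per-dimension standard deviations (a diagonal covariance), which the slide does not state; the function name is hypothetical.

```python
import numpy as np

def mog_log_prob(z, weights, means, stds):
    """log p(z) for p(z) = sum_k weights[k] * N(z; means[k], diag(stds[k]^2)).

    z: (n, d); weights: (k,); means, stds: (k, d).
    """
    diff = (z[:, None, :] - means[None, :, :]) / stds[None, :, :]
    log_comp = -0.5 * np.sum(
        diff**2 + np.log(2.0 * np.pi * stds[None, :, :] ** 2), axis=-1
    )
    log_joint = np.log(weights)[None, :] + log_comp
    # Numerically stable log-sum-exp over the mixture components.
    m = log_joint.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_joint - m).sum(axis=1, keepdims=True)))[:, 0]

# Two hypothetical clusters in a 2-D latent space.
weights = np.array([0.5, 0.5])
means = np.array([[-2.0, 0.0], [2.0, 0.0]])
stds = np.ones((2, 2))
log_p = mog_log_prob(np.array([[2.0, 0.0], [0.0, 0.0]]), weights, means, stds)
```

Here the first point sits on a cluster centre and the second lies between the two clusters, so the first entry of `log_p` is the larger one.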

[Figure panels: top row α = 0 with β = 0.01, 0.5, 1.0, 1.2; bottom row β = 0 with α = 1, 3, 5, 8.]

Figure 2: Density of aggregate posterior qφ(z) with different α, β for the pinwheel dataset³.

³ http://hips.seas.harvard.edu/content/synthetic-pinwheel-data-matlab


Decomposition: Generalising Disentanglement

Sparsity: p(z) = ∏d [(1 − γ) · N(zd; 0, 1) + γ · N(zd; 0, σ0²)]
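A minimal sketch of how this sparse prior's log-density could be evaluated follows. It assumes 0 < γ < 1, and the value used for σ0 in the example is a hypothetical small standard deviation, not one given on the slide.

```python
import numpy as np

def normal_log_pdf(z, std):
    """Log-density of N(z; 0, std^2), elementwise."""
    return -0.5 * (z / std) ** 2 - np.log(std) - 0.5 * np.log(2.0 * np.pi)

def sparse_prior_log_prob(z, gamma, sigma0):
    """log p(z) = sum_d log[(1 - gamma) N(z_d; 0, 1) + gamma N(z_d; 0, sigma0^2)].

    Assumes 0 < gamma < 1 so that both log-weights are finite.
    """
    wide = np.log1p(-gamma) + normal_log_pdf(z, 1.0)
    narrow = np.log(gamma) + normal_log_pdf(z, sigma0)
    # logaddexp combines the two mixture components stably, per dimension.
    return np.sum(np.logaddexp(wide, narrow), axis=-1)

z = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 3.0]])
log_p = sparse_prior_log_prob(z, gamma=0.8, sigma0=0.1)  # sigma0 is hypothetical
```

With a small `sigma0`, most of the prior mass in each dimension concentrates near zero, while the unit-variance component still allows a few dimensions to take large values.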

[Plot: average latent magnitude (0.0 to 0.6) against latent dimension (0 to 45), for the classes Trouser, Dress and Shirt.]

Figure 3: Sparsity of learnt representations for the Fashion-MNIST⁴ dataset.

⁴ Xiao, Rasul, and Vollgraf, Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms.


Decomposition: Generalising Disentanglement

Sparsity: p(z) = ∏d [(1 − γ) · N(zd; 0, 1) + γ · N(zd; 0, σ0²)]

(a) d = 49: leg separation
(b) d = 30: dress width
(c) d = 19: shirt fit
(d) d = 40: sleeve style

Figure 3: Latent space traversals for “active” dimensions⁴.

⁴ Xiao, Rasul, and Vollgraf, Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms.


Decomposition: Generalising Disentanglement

Sparsity: p(z) = ∏d [(1 − γ) · N(zd; 0, 1) + γ · N(zd; 0, σ0²)]

[Plot: average normalised sparsity (0.2 to 0.5) against alpha (0 to 1000), with one curve for each combination of γ = 0 or 0.8 and β = 0.1, 1 or 5.]

Figure 3: Sparsity vs regularisation strength α (higher is better)⁴.

⁴ Xiao, Rasul, and Vollgraf, Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms.


Recap

We propose and develop:

• Decomposition: a generalisation of disentanglement involving:
  (a) overlap of latent encodings
  (b) match between qφ(z) and p(z)
• A theoretical analysis of the β-VAE objective, showing that it primarily contributes only to overlap.
• An objective that incorporates both factors (a) and (b).
• Experiments that showcase efficacy at different decompositions: independence, clustering and sparsity.


Emile Mathieu Tom Rainforth N. Siddharth Yee Whye Teh

Code: iffsid/disentangling-disentanglement
Paper: arXiv:1812.02833

Come talk to us at our poster: #59