
Disentangling Disentanglement in Variational Autoencoders
ICML 2019

Emile Mathieu*, Tom Rainforth*, N. Siddharth*, Yee Whye Teh
June 12, 2019

Departments of Statistics and Engineering Science, University of Oxford

Variational Autoencoders

[Diagram: a generative model and an inference model linked through latent factors z_l (gender), z_m (beard), z_n (makeup).]

Disentanglement = Independence

Meaningful factors: z_l (gender), z_m (beard), z_n (makeup)

Independent factors: z_l (shape), z_m (angle), z_n (scale)

Decomposition ∈ {Independence, Clustering, Sparsity, …}

Co-related factors: z_l (gender), z_m (beard), z_n (makeup)

Decomposition: A Generalisation of Disentanglement

Characterise decomposition as the fulfilment of two factors:

(a) the level of overlap between encodings in the latent space;
(b) the match between the marginal posterior qφ(z) and a structured prior p(z), which constrains the encodings to the required decomposition.
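Both factors can be read off the marginal (aggregate) posterior, which averages the per-datapoint encodings: qφ(z) = (1/N) Σ_i qφ(z|x_i). The sketch below is only an illustration. It assumes scalar Gaussian encodings, and the means and standard deviations are hypothetical values rather than outputs of a trained encoder.

```python
import math

def gaussian_logpdf(z, mean, std):
    """Log density of a scalar Gaussian N(mean, std^2) at z."""
    return -0.5 * math.log(2 * math.pi * std**2) - (z - mean)**2 / (2 * std**2)

def aggregate_posterior_logpdf(z, means, stds):
    """log q(z) for q(z) = (1/N) * sum_i N(z; means[i], stds[i]^2).

    Uses log-sum-exp so that tiny component densities do not underflow.
    """
    logs = [gaussian_logpdf(z, m, s) for m, s in zip(means, stds)]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs)) - math.log(len(logs))

# Hypothetical encodings of three datapoints (assumed values for illustration).
means = [-2.0, 0.0, 2.0]
narrow = [0.1, 0.1, 0.1]  # little overlap: q(z) has gaps between encodings
wide = [1.0, 1.0, 1.0]    # more overlap: q(z) covers the space between them

# Log density of q(z) halfway between two encodings, at z = 1.0.
gap_narrow = aggregate_posterior_logpdf(1.0, means, narrow)
gap_wide = aggregate_posterior_logpdf(1.0, means, wide)
```

With narrow encodings, the density between datapoints collapses, so qφ(z) cannot match a smooth prior there. Widening the encodings fills the gaps at the cost of blurring which datapoint a latent came from.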

Decomposition: An Analysis

[Figure: three panels titled Insufficient Overlap, Too Much Overlap, and Appropriate Overlap, labelled with Desired Structure and Target Structure. Each panel shows p_D(x), qφ(z|x), qφ(z), p(z), pθ(x|z), and pθ(x).]

Overlap — Deconstructing the β-VAE

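$$
\begin{aligned}
\mathcal{L}_\beta(x) &= \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] - \beta \cdot \mathrm{KL}(q_\phi(z|x) \,\|\, p(z)) \\
&= \underbrace{\mathcal{L}(x;\, \pi_{\theta,\beta}, q_\phi)}_{\text{ELBO with } \beta\text{-annealed prior}} + \underbrace{(\beta - 1) \cdot H_{q_\phi}}_{\text{maxent}} + \underbrace{\log F_\beta}_{\text{constant}}
\end{aligned}
$$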

Implications
• β-VAE disentangles largely by controlling the level of overlap.
• It places no direct pressure on the latents to be independent!
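The decomposition can be checked numerically on the KL term alone, since the reconstruction term is shared by both sides. The sketch below assumes the annealed prior is f_β(z) = p(z)^β / F_β with F_β = ∫ p(z)^β dz, a scalar standard normal prior p(z) = N(0, 1), and a Gaussian encoding q = N(m, s²). Under these assumptions f_β = N(0, 1/β), and the identity to verify is −β·KL(q‖p) = −KL(q‖f_β) + (β − 1)·H_q + log F_β. All function names are illustrative.

```python
import math

def gaussian_kl(m_q, s_q, m_p, s_p):
    """KL(N(m_q, s_q^2) || N(m_p, s_p^2)) for scalar Gaussians."""
    return math.log(s_p / s_q) + (s_q**2 + (m_q - m_p)**2) / (2 * s_p**2) - 0.5

def gaussian_entropy(s):
    """Differential entropy of a scalar Gaussian with standard deviation s."""
    return 0.5 * math.log(2 * math.pi * math.e * s**2)

def log_F_beta(beta):
    """log of F_beta = integral of N(z; 0, 1)^beta dz = (2*pi)^((1-beta)/2) / sqrt(beta)."""
    return 0.5 * (1 - beta) * math.log(2 * math.pi) - 0.5 * math.log(beta)

def beta_kl_two_ways(beta, m, s):
    """Return (-beta * KL(q||p), decomposed form) for q = N(m, s^2), p = N(0, 1)."""
    lhs = -beta * gaussian_kl(m, s, 0.0, 1.0)
    annealed_std = 1.0 / math.sqrt(beta)  # f_beta = N(0, 1/beta) under the assumptions above
    rhs = (-gaussian_kl(m, s, 0.0, annealed_std)
           + (beta - 1) * gaussian_entropy(s)
           + log_F_beta(beta))
    return lhs, rhs

lhs, rhs = beta_kl_two_ways(beta=4.0, m=0.5, s=0.3)
```

For β > 1 the entropy term rewards wider encodings, which is one way to see why β mainly controls overlap. Nothing in the three terms asks for independent latents beyond what p(z) already encodes.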

Decomposition: Objective

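$$
\begin{aligned}
\mathcal{L}_{\alpha,\beta}(x) ={}& \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] && \text{(reconstruct observations)} \\
& - \beta \cdot \mathrm{KL}(q_\phi(z|x) \,\|\, p(z)) && \text{(control level of overlap)} \\
& - \alpha \cdot \mathbb{D}(q_\phi(z), p(z)) && \text{(impose desired structure)}
\end{aligned}
$$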
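The three terms can be combined as in the following sketch. The slides leave the divergence D unspecified, so the example swaps in a sample-based maximum mean discrepancy (MMD) with an RBF kernel as one possible choice. The function names, the bandwidth, and the numbers passed in are hypothetical, and this is not the authors' implementation.

```python
import math

def rbf_kernel(a, b, bandwidth=1.0):
    """RBF kernel between two equal-length vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2 * bandwidth**2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of MMD^2 between two sample sets."""
    def mean_kernel(us, vs):
        total = sum(rbf_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2 * mean_kernel(xs, ys)

def decomposition_objective(recon_log_lik, kl, divergence, alpha, beta):
    """Reconstruction term minus beta-weighted KL minus alpha-weighted divergence."""
    return recon_log_lik - beta * kl - alpha * divergence

# Assumed toy samples: latents drawn from q(z) and from the prior p(z).
q_samples = [[0.1, 0.2], [1.5, -0.3], [-0.7, 0.9]]
p_samples = [[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0]]

divergence = mmd_squared(q_samples, p_samples)
value = decomposition_objective(recon_log_lik=-10.0, kl=2.0,
                                divergence=divergence, alpha=3.0, beta=0.5)
```

In training, `recon_log_lik` and `kl` would be per-batch estimates from the encoder and decoder, and the objective would be maximised. Setting α = 0 recovers the β-VAE objective.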

Decomposition: Generalising Disentanglement

Independence: p(z) = N(0, σ⋆)

Figure 1: β-VAE trained on 2D Shapes¹ computing disentanglement².

¹ Matthey et al., dSprites: Disentanglement testing Sprites dataset, p. 1.
² Kim and Mnih, “Disentangling by Factorising”, p. 2.

Clustering: p(z) = ∑_k ρ_k · N(µ_k, σ_k)
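A mixture prior of this form can be evaluated as in the sketch below. It assumes scalar latents and treats σ_k as a standard deviation. The weights, means, and function names are hypothetical.

```python
import math

def gaussian_logpdf(z, mean, std):
    """Log density of a scalar Gaussian N(mean, std^2) at z."""
    return -0.5 * math.log(2 * math.pi * std**2) - (z - mean)**2 / (2 * std**2)

def mixture_component_logs(z, weights, means, stds):
    """log(rho_k) + log N(z; mu_k, sigma_k^2) for every component k."""
    return [math.log(w) + gaussian_logpdf(z, m, s)
            for w, m, s in zip(weights, means, stds)]

def mixture_prior_logpdf(z, weights, means, stds):
    """log p(z) for p(z) = sum_k rho_k * N(z; mu_k, sigma_k^2), via log-sum-exp."""
    logs = mixture_component_logs(z, weights, means, stds)
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))

def responsibilities(z, weights, means, stds):
    """Posterior probability of each cluster given z."""
    logs = mixture_component_logs(z, weights, means, stds)
    log_norm = mixture_prior_logpdf(z, weights, means, stds)
    return [math.exp(l - log_norm) for l in logs]

# Assumed two-cluster prior for illustration.
weights, means, stds = [0.5, 0.5], [-3.0, 3.0], [1.0, 1.0]
resp = responsibilities(2.5, weights, means, stds)
```

The responsibilities give a soft cluster assignment for a latent, which is what makes a clustered prior useful as a decomposition target.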

[Panels: β = 0.01, 0.5, 1.0, 1.2; α = 1, 3, 5, 8]

Figure 2: Density of aggregate posterior qφ(z) with different α, β for the pinwheel dataset.³

³ http://hips.seas.harvard.edu/content/synthetic-pinwheel-data-matlab

Sparsity: p(z) = ∏_d [(1 − γ) · N(z_d; 0, 1) + γ · N(z_d; 0, σ₀²)]
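The sparse prior is a per-dimension mixture of a standard normal "slab" and a narrow "spike" with variance σ₀². A minimal sketch of its log density and a sampler follows. The values of γ and σ₀ are assumed for illustration, and the function names are hypothetical.

```python
import math
import random

def gaussian_pdf(z, std):
    """Density of a zero-mean scalar Gaussian with standard deviation std."""
    return math.exp(-z * z / (2 * std * std)) / (std * math.sqrt(2 * math.pi))

def sparse_prior_logpdf(z, gamma, sigma0):
    """log p(z) = sum_d log[(1 - gamma) N(z_d; 0, 1) + gamma N(z_d; 0, sigma0^2)].

    Direct evaluation; fine for moderate |z_d|, but it can underflow far in the tails.
    """
    return sum(
        math.log((1 - gamma) * gaussian_pdf(zd, 1.0) + gamma * gaussian_pdf(zd, sigma0))
        for zd in z
    )

def sample_sparse_prior(dim, gamma, sigma0, rng):
    """Draw each dimension from the spike with probability gamma, else from the slab."""
    return [rng.gauss(0.0, sigma0 if rng.random() < gamma else 1.0) for _ in range(dim)]

rng = random.Random(0)
sample = sample_sparse_prior(dim=50, gamma=0.8, sigma0=0.2, rng=rng)

# A vector with most entries near zero scores higher under the sparse prior
# than under the standard normal prior (gamma = 0).
mostly_zero = [0.0] * 9 + [1.5]
sparse_score = sparse_prior_logpdf(mostly_zero, gamma=0.8, sigma0=0.2)
dense_score = sparse_prior_logpdf(mostly_zero, gamma=0.0, sigma0=0.2)
```

Setting γ = 0 recovers the isotropic Gaussian prior, so γ interpolates between no sparsity pressure and a strong preference for near-zero latents.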

[Plot: x-axis is latent dimension (0 to 45); classes shown are Trouser, Dress, and Shirt.]

Figure 3: Sparsity of learnt representations for the Fashion-MNIST⁴ dataset.

⁴ Xiao, Rasul, and Vollgraf, Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms.

(a) d = 49: leg separation
(b) d = 30: dress width
(c) d = 19: shirt fit
(d) d = 40: sleeve style

Figure 3: Latent space traversals for “active” dimensions⁴.

[Plot: x-axis is α (0 to 1000); one curve per setting: γ = 0 and γ = 0.8, each with β = 0.1, β = 1, and β = 5.]

Figure 3: Sparsity vs regularisation strength α (higher is better)⁴.

We propose and develop:

• Decomposition: a generalisation of disentanglement involving:
  (a) overlap of latent encodings
  (b) match between qφ(z) and p(z)

• A theoretical analysis of the β-VAE objective showing that it primarily contributes to overlap.

• An objective that incorporates both factors (a) and (b).

• Experiments that showcase efficacy at different decompositions:
  • independence
  • clustering
  • sparsity

Emile Mathieu Tom Rainforth N. Siddharth Yee Whye Teh

Code: iffsid/disentangling-disentanglement
Paper: arXiv:1812.02833

Come talk to us at our poster: #59
