A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
Tomonari Masada
Posted Apr 13, 2017

Transcript
Page 1: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

A Simple SGVB (Stochastic Gradient Variational Bayes) for the CTM (Correlated Topic Model)

Tomonari MASADA (正田備也), Nagasaki University (長崎大学)

[email protected]

APWeb 2016 @ Suzhou

Page 2: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Aim
• Make an informative summary of large document sets by
• extracting word lists, each relating to a different and particular topic.

Topic modeling

Page 3: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
Page 4: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Contribution
• We propose a new posterior estimation method for the correlated topic model (CTM) [Blei+ 07],
• an extension of LDA [Blei+ 03] for modeling topic correlations,
• with stochastic gradient variational Bayes (SGVB) [Kingma+ 14].

Page 5: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

LDA [Blei+ 03]

• Clustering word tokens by assigning each word token to one among the $K$ topics.
• $z_{d,i}$: to which topic is the $i$-th word token in document $d$ assigned? (discrete variables)
• $\theta_{d,k}$: how often is the topic $k$ talked about in document $d$?
  • a multinomial distribution $\theta_d$ for each document $d$ (continuous variables)
• $\phi_{k,w}$: how often is the word $w$ used to talk about the topic $k$?
  • a multinomial distribution $\phi_k$ for each topic $k$ (continuous variables)
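The generative process described above can be sketched as follows. This is a minimal illustration, not the paper's code; the variable names (`alpha`, `beta`, `K`, `V`) and the symmetric Dirichlet priors are standard LDA conventions assumed here, not taken from the slides.

```python
import numpy as np

def lda_generate(n_docs=3, doc_len=20, K=4, V=50, alpha=0.1, beta=0.01, seed=0):
    """Sample documents from the LDA generative model."""
    rng = np.random.default_rng(seed)
    # phi[k]: a multinomial over the V word types for topic k
    phi = rng.dirichlet(beta * np.ones(V), size=K)
    docs = []
    for _ in range(n_docs):
        # theta: a multinomial over the K topics for this document
        theta = rng.dirichlet(alpha * np.ones(K))
        # z[i]: topic assignment of the i-th token (discrete variable)
        z = rng.choice(K, size=doc_len, p=theta)
        # w[i]: the i-th word token, drawn from its assigned topic
        w = np.array([rng.choice(V, p=phi[k]) for k in z])
        docs.append((z, w))
    return docs

docs = lda_generate()
```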

Page 6: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

CTM [Blei+ 05]

• Clustering word tokens by assigning each word token to one among the $K$ topics.
• $z_{d,i}$: to which topic is the $i$-th word token in document $d$ assigned? (discrete variables)
• $\theta_{d,k}$: how often is the topic $k$ talked about in document $d$?
  • $\theta_{d,k} = \exp(\eta_{d,k}) / \sum_{k'} \exp(\eta_{d,k'})$ where $\eta_d \sim \mathcal{N}(\mu, \Sigma)$ (logistic normal distribution)
• $\phi_{k,w}$: how often is the word $w$ used to talk about the topic $k$?
  • a multinomial distribution $\phi_k$ for each topic $k$ (continuous variables)
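A short sketch of the logistic normal draw above: a Gaussian vector is mapped onto the probability simplex with a softmax, and the off-diagonal entries of the covariance matrix let topic proportions correlate, which LDA's Dirichlet prior cannot express. The particular values of `mu` and `Sigma` below are illustrative assumptions.

```python
import numpy as np

def sample_topic_proportions(mu, Sigma, rng):
    """CTM: draw eta ~ N(mu, Sigma), then map eta to the simplex via softmax."""
    eta = rng.multivariate_normal(mu, Sigma)
    e = np.exp(eta - eta.max())   # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
K = 3
mu = np.zeros(K)
# The 0.8 off-diagonal entry makes topics 0 and 1 tend to co-occur.
Sigma = np.array([[1.0, 0.8, 0.0],
                  [0.8, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
theta = sample_topic_proportions(mu, Sigma, rng)
```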

Page 7: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Variational Bayes

• Maximization of the ELBO (evidence lower bound).
• VB (variational Bayes) approximates the true posterior.
• An approximate posterior $q$ is introduced when the ELBO is obtained by Jensen's inequality:

  $\log p(\mathbf{x}) \geq \mathbb{E}_{q(\mathbf{z}, \boldsymbol{\theta})} \left[ \log \frac{p(\mathbf{x}, \mathbf{z}, \boldsymbol{\theta})}{q(\mathbf{z}, \boldsymbol{\theta})} \right]$

  (left-hand side: log evidence; $q$: approximate posterior)
• $\mathbf{z}$: discrete hidden variables (topic assignments)
• $\boldsymbol{\theta}$: continuous hidden variables (multinomial parameters)

Page 8: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Factorization assumption

• We assume the approximate posterior factorizes as $q(\mathbf{z}, \boldsymbol{\theta}) = q(\mathbf{z}) \times q(\boldsymbol{\theta})$ (discrete × continuous).
• Then the ELBO can be written as

  $\mathcal{L} = \mathbb{E}_{q(\mathbf{z}) q(\boldsymbol{\theta})} [\log p(\mathbf{x}, \mathbf{z}, \boldsymbol{\theta})] - \mathbb{E}_{q(\mathbf{z})} [\log q(\mathbf{z})] - \mathbb{E}_{q(\boldsymbol{\theta})} [\log q(\boldsymbol{\theta})]$

Page 9: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

SGVB [Kingma+ 14]

• SGVB (stochastic gradient variational Bayes) is a general framework for estimating the ELBO in VB.
• SGVB is only applicable to continuous distributions $q(\boldsymbol{\theta})$.
• Monte Carlo integration for the expectation:

  $\mathbb{E}_{q(\boldsymbol{\theta})}[f(\boldsymbol{\theta})] \approx \frac{1}{L} \sum_{l=1}^{L} f(\boldsymbol{\theta}^{(l)}), \quad \boldsymbol{\theta}^{(l)} \sim q(\boldsymbol{\theta})$
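The Monte Carlo estimate above can be checked on a toy case where the expectation is known in closed form. Here $q$ is $\mathcal{N}(2, 1)$ and $f(\theta) = \theta^2$, so $\mathbb{E}_q[f(\theta)] = \mu^2 + \sigma^2 = 5$; the choice of $q$ and $f$ is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# theta^(l) ~ q(theta) = N(2, 1): draw L = 100,000 samples
q_samples = rng.normal(loc=2.0, scale=1.0, size=100_000)

# E_q[f(theta)] ~= (1/L) * sum_l f(theta^(l))
f = lambda t: t ** 2
mc_estimate = f(q_samples).mean()
# Closed form for comparison: E[theta^2] = mu^2 + sigma^2 = 5
```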

Page 10: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Reparameterization

• We use the diagonal logistic normal for approximating the true posterior of $\boldsymbol{\theta}$.
• We can efficiently sample from the logistic normal with reparameterization:

  $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, I), \quad \boldsymbol{\eta} = \boldsymbol{\mu} + \boldsymbol{\sigma} \odot \boldsymbol{\epsilon}, \quad \theta_k = \exp(\eta_k) / \sum_{k'} \exp(\eta_{k'})$
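A minimal sketch of the reparameterized draw above. The point of the trick is that the randomness is isolated in $\boldsymbol{\epsilon}$, so $\boldsymbol{\eta}$ is a deterministic function of the variational parameters $(\boldsymbol{\mu}, \boldsymbol{\sigma})$ and gradients can flow through the sample. The concrete values of `mu` and `log_sigma` are illustrative assumptions.

```python
import numpy as np

def reparam_sample(mu, log_sigma, rng):
    """Diagonal logistic normal via reparameterization:
    eta = mu + sigma * eps with eps ~ N(0, I), then theta = softmax(eta)."""
    eps = rng.standard_normal(mu.shape)   # all randomness lives here
    eta = mu + np.exp(log_sigma) * eps    # deterministic in (mu, log_sigma)
    e = np.exp(eta - eta.max())           # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(0)
mu = np.array([0.5, -0.2, 0.0])
log_sigma = np.array([-1.0, -1.0, -1.0])
theta = reparam_sample(mu, log_sigma, rng)
```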

Page 11: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Monte Carlo integration

• The ELBO is estimated with a sample from the approximate posterior.
• The discrete part is estimated as in the original VB.

Page 12: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Parameter updates

• No explicit inversion (only Cholesky factorization).
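The slide only states that the covariance matrix is never inverted explicitly; the updates themselves are not reproduced in this transcript. The sketch below illustrates the underlying numerical idea under that assumption: to apply $\Sigma^{-1}$ to a vector, factor $\Sigma = L L^\top$ once and solve two triangular systems instead of forming $\Sigma^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4
A = rng.standard_normal((K, K))
Sigma = A @ A.T + K * np.eye(K)   # a symmetric positive-definite covariance
b = rng.standard_normal(K)

# Factor Sigma = L L^T (L lower triangular), then solve
# L y = b (forward substitution) and L^T x = y (back substitution).
L = np.linalg.cholesky(Sigma)
y = np.linalg.solve(L, b)
x = np.linalg.solve(L.T, y)

# x now equals Sigma^{-1} b, without ever forming Sigma^{-1}.
```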

Page 13: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

"Stochastic" gradient

• The expectation integrals are estimated by the Monte Carlo method.
• The derivatives of the ELBO therefore depend on samples.
• Randomness is thus incorporated into the maximization of the ELBO.
• Does this make it easier to avoid poor local optima?

Page 14: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Data sets

  dataset   # docs    # word types
  NYT       149,890   46,650
  MOVIE      27,859   62,408
  NSF       128,818   21,471
  MED       125,490   42,830

Page 15: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
Page 16: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
Page 17: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
Page 18: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
Page 19: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Conclusion

• We incorporate randomness into the posterior inference for the CTM by using SGVB.
• The proposed method gives perplexities comparable to those achieved by LDA.

Page 20: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Pro/Con

• No explicit inversion of the covariance matrix is required.
• Careful tuning of the gradient descent seems to be required.
• Only Adam was tested.

Page 21: A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Future work

• Online learning for topic models with neural networks (NNs).
• An NN may achieve a better approximate posterior.
• SGVB can be used to estimate the ELBO in a similar manner.
• Document batches can be fed to VB indefinitely.
• Topic word lists are then updated indefinitely.