[Kim+ ICML2012] Dirichlet Process with Mixed Random Measures: A Nonparametric Topic Model for Labeled Data


DESCRIPTION

Supervised Nonparametric Topic Model

Transcript

[Kim+ ICML2012] Dirichlet Process with Mixed Random Measures: A Nonparametric Topic Model for Labeled Data

2012/07/28

Nakatani Shuyo @ Cybozu Labs, Inc

twitter: @shuyo

LDA (Latent Dirichlet Allocation) [Blei+ 03]

• Unsupervised Topic Model

– Each word has an unobserved topic

• Parametric

– The topic size K is given in advance

via Wikipedia
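As a concrete point of comparison for the supervised models below, here is a minimal sketch of the LDA generative process in Python/NumPy; the toy dimensions (K, V, N_d) and hyperparameter values are my own illustrative choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
K, V, N_d = 4, 20, 50          # topic size K fixed in advance, vocabulary size, words in document d
alpha, eta = 0.1, 0.01         # symmetric Dirichlet hyperparameters (illustrative values)

beta = rng.dirichlet(np.full(V, eta), size=K)    # topic-word distributions beta_k ~ Dir(eta)
theta_d = rng.dirichlet(np.full(K, alpha))       # per-document topic mixture theta_d ~ Dir(alpha)
z_d = rng.choice(K, size=N_d, p=theta_d)         # unobserved topic of each word
w_d = np.array([rng.choice(V, p=beta[z]) for z in z_d])  # observed words
```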

Labeled LDA [Ramage+ 09]

• Supervised Topic Model

– Each document has an observed label

• Parametric

via [Ramage+ 09]

Generative Process for L-LDA

• $\beta_k \sim \mathrm{Dir}(\eta)$
• $\Lambda_{kd} \sim \mathrm{Bernoulli}(\Phi_k)$
• $\theta_d \sim \mathrm{Dir}(\alpha_d)$, where $\alpha_d = (\alpha_k)_{k:\, \Lambda_{kd}=1}$ (restricted to the labeled parameters, i.e. the topics corresponding to observed labels)
• $z_{id} \sim \mathrm{Multi}(\theta_d)$
• $w_{id} \sim \mathrm{Multi}(\beta_{z_{id}})$

via [Ramage+ 09]
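A minimal sketch of this generative process, assuming toy dimensions and a fixed, hypothetical label vector $\Lambda_d$ (the $\mathrm{Bernoulli}(\Phi_k)$ draw is skipped for brevity); it only illustrates how the observed labels restrict the per-document topic distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
K, V, N_d = 4, 20, 50           # labels/topics, vocabulary size, words in document d (toy values)
alpha, eta = 0.5, 0.01

beta = rng.dirichlet(np.full(V, eta), size=K)    # beta_k ~ Dir(eta)

Lambda_d = np.array([1, 0, 1, 0])                # hypothetical observed label vector of document d
labeled = np.flatnonzero(Lambda_d)               # topics corresponding to observed labels

theta_d = np.zeros(K)                            # theta_d ~ Dir(alpha_d), restricted to labeled topics
theta_d[labeled] = rng.dirichlet(np.full(len(labeled), alpha))

z_d = rng.choice(K, size=N_d, p=theta_d)         # z_id ~ Multi(theta_d): only labeled topics can occur
w_d = np.array([rng.choice(V, p=beta[z]) for z in z_d])  # w_id ~ Multi(beta_{z_id})
```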

Pros/Cons of L-LDA

• Pros

– Easy to implement

• Cons

– It is necessary to specify the label-topic correspondence manually
• Its performance depends on that correspondence
※) My implementation is here: https://github.com/shuyo/iir/blob/master/lda/llda.py

via [Ramage+ 09]

DP-MRM [Kim+ 12]

– Dirichlet Process with Mixed Random Measures

• Supervised Topic Model

• Nonparametric

– K is not the topic size, but the label size

[Graphical model of DP-MRM: variables $x_{ji}$, $\theta_{ji}$, $G_j$, $G_0^k$, $H$ and hyperparameters $\beta$, $\gamma_k$, $r_j$, $\lambda_j$, $\eta$, $\alpha$; plates over $K$ labels, $D$ documents, and $N_j$ words]

Generative Process for DP-MRM

• $H = \mathrm{Dir}(\beta)$
• $G_0^k \sim \mathrm{DP}(\gamma_k, H)$ (each label has its own random measure as a topic space)
• $\lambda_j \sim \mathrm{Dir}(r_j \eta)$, where $r_j = \bigl(I[k \in \mathrm{label}(j)]\bigr)_k$
• $G_j \sim \mathrm{DP}\bigl(\alpha, \sum_{k \in \mathrm{label}(j)} \lambda_{jk} G_0^k\bigr)$ (mixed random measures)
• $\theta_{ji} \sim G_j$, $x_{ji} \sim F(\theta_{ji}) = \mathrm{Multi}(\theta_{ji})$

[Graphical model of DP-MRM, repeated from the earlier slide]
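A minimal sketch of the "mixed random measures" step, assuming a toy label count and hypothetical labels for document $j$; it only shows how $\lambda_j$ puts Dirichlet weights on the label-specific measures $G_0^k$ that form the base measure of $G_j$.

```python
import numpy as np

rng = np.random.default_rng(0)
K, eta = 5, 1.0                                   # toy label count and hyperparameter (my assumption)
labels_j = [1, 3]                                 # hypothetical observed labels of document j

r_j = np.zeros(K)
r_j[labels_j] = 1.0                               # r_j = indicator vector of the labels of document j
lambda_j = np.zeros(K)
lambda_j[labels_j] = rng.dirichlet(np.full(len(labels_j), eta))  # lambda_j ~ Dir(r_j * eta)

# G_j ~ DP(alpha, sum_k lambda_jk * G_0^k): the base measure of the document-level DP
# is a mixture of the label-level random measures G_0^k, weighted by lambda_j.
print(lambda_j)   # nonzero weights only on the observed labels of document j
```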

Stick Breaking Process

• $v_l^k \sim \mathrm{Beta}(1, \gamma_k)$, $\pi_l^k = v_l^k \prod_{d=0}^{l-1} (1 - v_d^k)$
• $\phi_l^k \sim H$, $G_0^k = \sum_{l=0}^{\infty} \pi_l^k \delta_{\phi_l^k}$
• $\lambda_j \sim \mathrm{Dir}(r_j \eta)$, $w_{jt} \sim \mathrm{Beta}(1, \alpha)$, $\pi_{jt} = w_{jt} \prod_{d=0}^{t-1} (1 - w_{jd})$
• $k_{jt} \sim \mathrm{Multi}(\lambda_j)$, $\psi_{jt} \sim G_0^{k_{jt}}$, $G_j = \sum_{t=0}^{\infty} \pi_{jt} \delta_{\psi_{jt}}$
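A minimal sketch of this construction with finite truncation levels L and T; the truncation, the toy dimensions, and giving the last stick the remaining mass are my assumptions (the slides use infinite sums).

```python
import numpy as np

rng = np.random.default_rng(0)
V, K = 20, 5                      # vocabulary size, number of labels (toy values)
L, T = 30, 30                     # truncation levels for G_0^k and G_j (my assumption)
beta_H, gamma, alpha, eta = 0.01, 1.0, 1.0, 1.0
labels_j = [1, 3]                 # hypothetical observed labels of document j

def stick_breaking(v):
    """Truncated stick-breaking: pi_l = v_l * prod_{d<l}(1 - v_d); the last stick takes the remainder."""
    v = v.copy()
    v[-1] = 1.0
    return v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))

# G_0^k = sum_l pi_l^k delta_{phi_l^k}, with v_l^k ~ Beta(1, gamma_k) and phi_l^k ~ H = Dir(beta)
pi0 = np.array([stick_breaking(rng.beta(1.0, gamma, size=L)) for _ in range(K)])
phi = rng.dirichlet(np.full(V, beta_H), size=(K, L))

# G_j = sum_t pi_jt delta_{psi_jt}, with w_jt ~ Beta(1, alpha), k_jt ~ Multi(lambda_j),
# and psi_jt drawn from the label-specific measure G_0^{k_jt}
lambda_j = np.zeros(K)
lambda_j[labels_j] = rng.dirichlet(np.full(len(labels_j), eta))
pi_j = stick_breaking(rng.beta(1.0, alpha, size=T))
k_j = rng.choice(K, size=T, p=lambda_j)
psi_j = np.array([phi[k, rng.choice(L, p=pi0[k])] for k in k_j])

# One word: theta_ji ~ G_j (pick an atom of G_j), x_ji ~ Multi(theta_ji)
t = rng.choice(T, p=pi_j)
x_ji = rng.choice(V, p=psi_j[t])
```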

Chinese Restaurant Franchise

• $t_{ji}$: table index of the $i$-th term in the $j$-th document
• $k_{jt}, l_{jt}$: dish indexes of the $t$-th table in the $j$-th document
(In a normal HDP, this layer consists of only a single DP $G_0$.)
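A rough sketch of the bookkeeping these indexes imply, using plain Python dictionaries; this is my own illustration of the data structures, not the paper's inference code.

```python
from collections import defaultdict

# Chinese-restaurant-franchise bookkeeping: customers = words, tables = within-document
# clusters, dishes = atoms of the label-level measures G_0^k.
t_ji = defaultdict(dict)   # t_ji[j][i]: table index of the i-th term in document j
k_jt = defaultdict(dict)   # k_jt[j][t]: which label measure G_0^k serves table t of document j
l_jt = defaultdict(dict)   # l_jt[j][t]: which atom (dish) of G_0^{k_jt} is served at that table

# In a plain HDP there is a single top-level G_0, so only l_jt would be needed;
# DP-MRM additionally tracks k_jt because each label has its own measure G_0^k.
```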

Inference (1)

• Sampling $t$

Inference (2)

• Sampling $k$ and $l$

Experiments

via [Kim+ 12]

• DP-MRM learns the label-topic correspondence automatically and probabilistically.
• L-LDA can also be applied to single-labeled documents by assigning a common second label to every document.

via [Kim+ 12]

References

• [Kim+ ICML2012] Dirichlet Process with Mixed Random Measures: A Nonparametric Topic Model for Labeled Data
• [Ramage+ EMNLP2009] Labeled LDA: A Supervised Topic Model for Credit Attribution in Multi-labeled Corpora
• [Blei+ 2003] Latent Dirichlet Allocation
