Online Learning of Semantic Relations Nir Grinberg and William M. Pottenger, Ph.D. Rutgers University 03/30/2012 1

Online Learning of Semantic Relations

Nir Grinberg and William M. Pottenger, Ph.D.

Rutgers University

03/30/2012


Introduction

What are semantic relations?

“Barack H. Obama is the 44th President of the United States”

“Barack Obama takes the oath of office as President of the United States”

“Barack Obama, in full Barack Hussein Obama II (born August 4, 1961, Honolulu, Hawaii, U.S.), 44th president of the United States (2009– ) and the first African…”

“X was born in Y” or “X is from Y”, etc.
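A minimal sketch of how templates such as “X was born in Y” can be matched against text. The pattern list, the `born_in` label and the function name are illustrative assumptions; the talk itself works with dependency paths rather than surface regular expressions.

```python
import re

# Hypothetical surface patterns; both map to one illustrative relation label.
PATTERNS = [
    (re.compile(r"^(?P<x>.+?) was born in (?P<y>.+?)\.?$"), "born_in"),
    (re.compile(r"^(?P<x>.+?) is from (?P<y>.+?)\.?$"), "born_in"),
]

def extract_relation(sentence):
    """Return (X, relation, Y) for the first matching pattern, else None."""
    text = sentence.strip()
    for pattern, label in PATTERNS:
        match = pattern.match(text)
        if match:
            return (match.group("x"), label, match.group("y"))
    return None

print(extract_relation("Barack Obama was born in Honolulu."))
```

Different surface forms collapse to the same relation label here by hand; learning that grouping from data is what the models in this talk aim at.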


Introduction

Why are we interested in Semantic Relations?

Information Extraction, Information Retrieval and Question Answering

Building blocks for IDEAs

Interpretability and Generalization of Topic Models


Related Work

Early works: DIPRE (Brin ’98), Snowball (Agichtein et al. 2000)

The ACE and MUC-7 datasets appear => supervised methods appear.

Using features like extracted entities, POS, parse tree…

Kernel functions

Unsupervised: DIRT (Lin et al. ’01) and USP (Poon et al. ’09)


Related Work

Topic Modeling:

Nubbi (Chang et al. 2009)

Rel-LDA and Type-LDA (Yao et al. 2011)

[Figure panels: Rel-LDA, Type-LDA]


What is missing?

Interpretability?

Parallelizable but not O(N)

Interaction with other features?

Higher-Order learning?


One more Related Work

Pachinko Allocation Model (PAM) by Li et al. 2007

Captures arbitrary:

Topic-Topic correlations

Topic-Word correlations

Better than LDA and CTM

[Figure: PAM]
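As a rough illustration of how PAM captures topic-topic correlations, the sketch below samples a document from a four-level PAM (root, super-topics, sub-topics, words). The function names, Dirichlet parameters and word distributions are placeholders chosen for the example, not values from the talk.

```python
import random

def dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_document(n_words, alpha_root, alpha_super, phi, rng):
    """Sample (super_topic, sub_topic, word) triples from a four-level PAM.

    alpha_root: Dirichlet prior over super-topics.
    alpha_super: one Dirichlet prior over sub-topics per super-topic.
    phi: one word distribution per sub-topic.
    """
    theta_root = dirichlet(alpha_root, rng)
    theta_super = [dirichlet(a, rng) for a in alpha_super]
    words = []
    for _ in range(n_words):
        s = rng.choices(range(len(theta_root)), weights=theta_root)[0]
        k = rng.choices(range(len(theta_super[s])), weights=theta_super[s])[0]
        w = rng.choices(range(len(phi[k])), weights=phi[k])[0]
        words.append((s, k, w))
    return words

rng = random.Random(0)
phi = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.5, 0.5, 0.0], [0.0, 0.0, 0.5, 0.5]]
doc = sample_document(5, [1.0, 1.0], [[1.0] * 3, [1.0] * 3], phi, rng)
```

Because every super-topic has its own distribution over sub-topics, sub-topics that tend to co-occur can share a super-topic, which is where the topic-topic correlation comes from.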


Our Approach

SemRel: based on Type-LDA and PAM.

Adds a layer of abstraction:

Improve interpretability

Allow feature interactions

Variational Inference: stochastic natural gradient

[Figure: SemRel]


Preprocessing

Tokenization, Lemmatization, POS tagging, NER

Using StanfordNLP toolbox

Dependency Path Parsing

Using MaltParser

Filtering out long paths and syntactically irrelevant ones

Filtering out infrequent features and entities
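The two filtering steps above could look roughly like the following sketch. It assumes each extracted relation is a `(source_entity, dependency_path, dest_entity)` tuple with the path stored as a tuple of edge labels; the thresholds `max_path_len` and `min_count` are hypothetical, and the syntactic-relevance filter is not modeled.

```python
from collections import Counter

def filter_tuples(tuples, max_path_len=4, min_count=2):
    """Drop long dependency paths, then infrequent paths and entities."""
    # Step 1: discard tuples whose dependency path is too long.
    short = [t for t in tuples if len(t[1]) <= max_path_len]
    # Step 2: count paths and entities among the remaining tuples.
    path_counts = Counter(t[1] for t in short)
    entity_counts = Counter(e for t in short for e in (t[0], t[2]))
    # Step 3: keep tuples whose path and both entities are frequent enough.
    return [
        t for t in short
        if path_counts[t[1]] >= min_count
        and entity_counts[t[0]] >= min_count
        and entity_counts[t[2]] >= min_count
    ]
```

Counting after the length filter means that a long path cannot make a feature or entity look more frequent than it is in the data that is actually kept.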


Example

“Gamma Knife, made by the Swedish medical technology firm Elekta, focuses low dosage gamma radiation ...”


The Algorithm


We derived similar online learning algorithms for Rel-LDA, Type-LDA and PAM.
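The slide with the algorithm itself did not survive in this transcript. As a hedged sketch only, the following shows the general shape of a stochastic natural-gradient step as commonly used in online variational inference for LDA-style models: a decaying step size and a convex combination of the current global parameter with a minibatch estimate. The names `tau0`, `kappa`, `lam` and `lam_hat` are assumptions for illustration and are not taken from the talk.

```python
def step_size(t, tau0=1.0, kappa=0.7):
    """Decaying step size rho_t = (tau0 + t) ** (-kappa)."""
    return (tau0 + t) ** (-kappa)

def natural_gradient_step(lam, lam_hat, rho):
    """Blend the current variational parameter with a minibatch estimate.

    lam: current global variational parameter (list of floats).
    lam_hat: estimate computed from the current minibatch alone.
    rho: step size in (0, 1].
    """
    return [(1.0 - rho) * old + rho * new for old, new in zip(lam, lam_hat)]

# One update at iteration t = 3 with placeholder parameter values.
lam = natural_gradient_step([1.0, 2.0], [3.0, 4.0], step_size(3))
```

Each step touches only one minibatch, so the cost per update does not grow with the size of the corpus.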



Results

SemRel outperforms Type-LDA:

Two-tailed paired t-test across numbers of topics: t(4) = -6.01, p < 0.002

Two-tailed paired t-test across folds: p < 0.001

Preprocessing is more of a bottleneck than the learning algorithm!
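For readers unfamiliar with the notation, t(4) means four degrees of freedom, i.e. five paired observations. The sketch below computes a paired t statistic from two lists of matched scores; the function name is hypothetical and the numbers in the usage lines are placeholders, not results from the talk.

```python
import math

def paired_t(scores_a, scores_b):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the paired differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n), n - 1

# Placeholder scores for five matched settings (illustrative only).
model_a = [1.0, 2.0, 3.0, 4.0, 5.0]
model_b = [2.0, 4.0, 5.0, 4.0, 6.0]
t_stat, dof = paired_t(model_a, model_b)
```

Pairing by setting (same number of topics, or same fold) removes the variation between settings from the comparison, so only the per-setting differences between the two models are tested.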


Future Work

We’re currently investigating convergence

Complementary qualitative evaluation

Other datasets

Extensions with more features: words, entities, higher-order features, etc.


Conclusions

Yet another topic model, but:

Moved away from the bag-of-words assumption without breaking the framework

Devised an online learning algorithm

Hopefully, improved on interpretability


Q&A

Thank you!
