Dependence Language Model for Information Retrieval. Jianfeng Gao, Jian-Yun Nie, Guangyuan Wu, Guihong Cao. SIGIR 2004


Jan 19, 2016

Transcript
Page 1:

Dependence Language Model for Information Retrieval

Jianfeng Gao, Jian-Yun Nie, Guangyuan Wu, Guihong Cao, Dependence Language Model for Information Retrieval, SIGIR 2004

Page 2:

Reference

• Ciprian Chelba, David Engle, et al. Structure and performance of a dependency language model. Eurospeech 1997.

• Daniel D. K. Sleator and Davy Temperley. Parsing English with a Link Grammar. Technical Report CMU-CS-91-196, 1991.

Page 3:

Why do we use the independence assumption?

• The independence assumption is widely adopted in probabilistic retrieval theory.

• Why?
– It makes retrieval models simpler.

– It makes retrieval operations tractable.

• The shortcoming of the independence assumption:
– The assumption does not hold in textual data.

Page 4:

Recent ideas for modeling dependence

• Bigram
– Some language modeling approaches try to incorporate word dependence by using bigrams.

– Shortcomings:

• Some word dependencies exist not only between adjacent words but also between more distant ones.

• Some adjacent words are not actually related.

– The bigram language model showed only marginally better effectiveness than the unigram model.

• Bi-term
– The bi-term language model is similar to the bigram model except that the constraint on word order is relaxed.

– "information retrieval" and "retrieval of information" will be assigned the same probability of generating the query.

Page 5:

Structure and performance of a dependency language model

Page 6:

Introduction

• This paper presents a maximum entropy language model that incorporates both syntax and semantics via a dependency grammar.

• Dependency grammar: expresses the relations between words by a directed graph, which can incorporate the predictive power of words that lie outside the bigram or trigram range.

Page 7:

Introduction

• Why we use N-grams
– Assume $S = w_0, w_1, w_2, \ldots, w_n$, so that

$P(S) = P(w_0)\, P(w_1 \mid w_0) \cdots P(w_n \mid w_0 \ldots w_{n-1})$

– If we want to record $P(w_n \mid w_0 \ldots w_{n-1})$ for every history, we need to store on the order of $\sum_i V^i (V-1)$ independent parameters, where V is the vocabulary size.

• The drawback of N-grams
– An N-gram blindly discards relevant words that lie N or more positions in the past.
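The parameter blow-up can be checked with a quick script; the vocabulary size `V` and sentence length below are toy assumptions for illustration, not figures from the paper.

```python
# Toy parameter counts: full-history model vs. a fixed-order N-gram.
# V (vocabulary size) and n are illustrative assumptions only.

def full_history_params(V: int, n: int) -> int:
    """Parameters needed to store P(w_i | w_0 ... w_{i-1}) for i = 1..n:
    each of the V**i possible histories needs V - 1 free probabilities."""
    return sum(V**i * (V - 1) for i in range(1, n + 1))

def ngram_params(V: int, N: int) -> int:
    """An N-gram model keeps only V**(N-1) histories, V - 1 parameters each."""
    return V**(N - 1) * (V - 1)

print(ngram_params(1000, 2))         # bigram: 999,000 parameters
print(full_history_params(1000, 3))  # explodes even for short histories
```

Even at n = 3 the full-history model needs roughly a million times more parameters than the bigram, which is why fixed-order N-grams are used in practice.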

Page 8:

Structure of the model

Page 9:

Structure of the model

• Develop an expression for the joint probability $P(S, K)$, where K is the set of linkages in the sentence.

• Then we get

$P(S) = \sum_K P(S, K)$

• Assume that the sum is dominated by a single term; then

$P(S) \approx P(S, K^*)$, where $K^* = \arg\max_K P(S, K)$

Page 10:

A dependency language model of IR

• For a query $Q = q_1 \ldots q_m$, we want to rank documents by $P(Q \mid D)$.
– Previous work:

• Assume independence between query terms:

$P(Q \mid D) = \prod_{i=1 \ldots m} P(q_i \mid D)$

– New work:

• Assume that term dependencies in a query form a linkage L:

$P(Q \mid D) = \sum_L P(Q, L \mid D) = \sum_L P(L \mid D)\, P(Q \mid L, D)$
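For contrast, the independence-assumption baseline can be sketched as a unigram query-likelihood scorer. This is a minimal sketch: the Jelinek-Mercer mixing weight `lam` is an assumed value, not the paper's setting, and it assumes every query term occurs in the collection.

```python
from collections import Counter
from math import log

def unigram_loglik(query, doc, collection, lam=0.5):
    """log P(Q|D) under term independence: sum of per-term log probabilities,
    smoothed against the collection model (lam is an assumed mixing weight).
    Assumes every query term has nonzero collection frequency."""
    d, c = Counter(doc), Counter(collection)
    dn, cn = sum(d.values()), sum(c.values())
    score = 0.0
    for q in query:
        p = (1 - lam) * d[q] / dn + lam * c[q] / cn
        score += log(p)
    return score

# Documents are ranked by this score; higher (less negative) is better.
```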

Page 11:

A dependency language model of IR

• Assume that the sum over all possible linkages L is dominated by a single term:

$P(Q \mid D) \approx P(L \mid D)\, P(Q \mid L, D)$, such that $L = \arg\max_L P(L \mid D)$

• Assume that each term is dependent on exactly one related query term generated previously (e.g. a linkage over terms $q_h$, $q_i$, $q_j$).

Page 12:

A dependency language model of IR

Expanding $P(Q \mid L, D)$ over the links, with $q_h$ the head term of the linkage:

$P(Q \mid L, D) = P(q_h \mid D) \prod_{(i,j) \in L} P(q_j \mid q_i, L, D)$

$= P(q_h \mid D) \prod_{(i,j) \in L} \frac{P(q_i, q_j \mid L, D)}{P(q_i \mid L, D)}$

$= P(q_h \mid D) \prod_{(i,j) \in L} \frac{P(q_i, q_j \mid L, D)}{P(q_i \mid L, D)\, P(q_j \mid L, D)}\, P(q_j \mid L, D)$

where, as before, $P(Q \mid D) \approx P(L \mid D)\, P(Q \mid L, D)$ with $L = \arg\max_L P(L \mid D)$.

Page 13:

A dependency language model of IR

• Assume
– The generation of a single term is independent of L:

$P(q_j \mid L, D) = P(q_j \mid D)$

• Under this assumption we would arrive at the same result starting from any term, so L can be represented as an undirected graph. Then

$P(Q \mid L, D) = P(q_h \mid D) \prod_{(i,j) \in L} \frac{P(q_i, q_j \mid L, D)}{P(q_i \mid D)\, P(q_j \mid D)}\, P(q_j \mid D) = \prod_{i=1}^{m} P(q_i \mid D) \prod_{(i,j) \in L} \frac{P(q_i, q_j \mid L, D)}{P(q_i \mid D)\, P(q_j \mid D)}$
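The factored form translates directly into code. This is a small sketch in which the probabilities are assumed to be given already; in the paper they come from the estimators on the later slides.

```python
from math import prod  # Python 3.8+

def p_q_given_l_d(p_terms, links):
    """P(Q|L,D) = prod_i P(q_i|D) * prod over links (i,j) in L of
    P(q_i,q_j|L,D) / (P(q_i|D) * P(q_j|D)).
    p_terms: list of P(q_i|D), one per query term (assumed given).
    links:   list of (p_joint, p_i, p_j) triples, one per link in L."""
    dep = prod(pj / (pi * pq) for pj, pi, pq in links)
    return prod(p_terms) * dep
```

A link whose joint probability exceeds the product of its marginals (positive association) multiplies the score up; an independent pair contributes a factor of 1, recovering the unigram model.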

Page 14:

A dependency language model of IR

Taking the log of the previous expression:

$\log P(Q \mid D) = \log P(L \mid D) + \sum_{i=1}^{m} \log P(q_i \mid D) + \sum_{(i,j) \in L} MI(q_i, q_j \mid L, D)$

where

$MI(q_i, q_j \mid L, D) = \log \frac{P(q_i, q_j \mid L, D)}{P(q_i \mid D)\, P(q_j \mid D)}$
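In log space the ranking score is just a sum of three parts. A minimal sketch, assuming the component values have already been estimated:

```python
from math import log

def dependence_score(p_l_given_d, unigram_probs, link_mis):
    """log P(Q|D) = log P(L|D) + sum_i log P(q_i|D) + sum over links of MI.
    p_l_given_d:   P(L|D) for the chosen linkage.
    unigram_probs: P(q_i|D) for each query term.
    link_mis:      MI(q_i, q_j | L, D) for each link (i, j) in L."""
    return (log(p_l_given_d)
            + sum(log(p) for p in unigram_probs)
            + sum(link_mis))
```

The first two terms are the familiar smoothed unigram score; the MI sum is the dependence model's correction for linked term pairs.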

Page 15:

Parameter Estimation

• Estimating $P(L \mid D)$
– Assume that the links are independent:

$P(L \mid D) = \prod_{l \in L} P(l \mid D)$

– Then count the relative frequency of a link R between $q_i$ and $q_j$ in the training data, given that they appear in the same sentence:

$RF(R \mid q_i, q_j) = \frac{C(q_i, q_j, R)}{C(q_i, q_j)}$

where $C(q_i, q_j, R)$ counts sentences in which $q_i$ and $q_j$ have a link and $C(q_i, q_j)$ counts sentences in which they co-occur. This link-frequency score is then normalized over the links of the query:

$P(l \mid Q) = \frac{RF(R \mid q_i, q_j)}{\sum_{l} RF(R \mid q_i, q_j)}$

Page 16:

Parameter Estimation

),|()|( ji qqRFQlP

Lji

jiLL

qqRFQLPL),(

),|(maxarg)|(maxarg

Ll Lji

ji qqRFDlPDQLPDLP),(

),|()|(),|()|(

)|(),|(

),|(

),(

QlPqqRF

qqRF

ljiji

ji

assumption

),|(),|()1(),|( jiCjiDji qqRFqqRFqqRF

Assumption: )|()|( DLPQLP
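The link-frequency estimate and its smoothing reduce to a few count ratios. A sketch, in which the smoothing weight `lam` is an assumed constant standing in for the slide's interpolation parameter:

```python
def rf(c_link, c_cooccur):
    """RF(R|qi,qj): fraction of qi-qj sentence co-occurrences
    that actually carry a link R."""
    return c_link / c_cooccur if c_cooccur else 0.0

def rf_smoothed(c_link_d, c_co_d, c_link_c, c_co_c, lam=0.1):
    """RF = (1 - lam) * RF_D + lam * RF_C: the document-side link
    frequency backed off to the collection-side estimate
    (lam is an assumed constant, not the paper's tuned value)."""
    return (1 - lam) * rf(c_link_d, c_co_d) + lam * rf(c_link_c, c_co_c)
```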

Page 17:

Parameter Estimation

• Estimating $P(q_i \mid D)$
– The document language model is smoothed with a Dirichlet prior and interpolated with the collection model C:

$P'(q_i \mid D) = (1 - \lambda)\, P(q_i \mid D) + \lambda\, P(q_i \mid C)$

$P(q_i \mid D) = (1 - \alpha_D) \frac{C_D(q_i)}{\sum_q C_D(q)} + \alpha_D \frac{C_C(q_i)}{\sum_q C_C(q)}$

where $C_D$ and $C_C$ are term counts in the document and the collection, $\alpha_D$ is the discount given by the Dirichlet prior, and $\lambda$ is a constant discount.
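The smoothed unigram estimate is a two-level mixture. In this sketch `alpha` and `lam` are assumed constants standing in for the Dirichlet discount and the constant discount:

```python
def p_unigram(q, doc_counts, coll_counts, alpha=0.5, lam=0.5):
    """P'(q|D) = (1 - lam) * P(q|D) + lam * P(q|C), where P(q|D) is itself
    a mixture of document and collection relative frequencies
    (alpha and lam are illustrative constants)."""
    dn = sum(doc_counts.values())
    cn = sum(coll_counts.values())
    p_c = coll_counts.get(q, 0) / cn                           # P(q|C)
    p_d = (1 - alpha) * doc_counts.get(q, 0) / dn + alpha * p_c  # P(q|D)
    return (1 - lam) * p_d + lam * p_c
```

Terms absent from the document still get nonzero probability through the collection model, which keeps the log in the ranking formula finite.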

Page 18:

Parameter Estimation

• Estimating $MI(q_i, q_j \mid L, D)$

$MI(q_i, q_j \mid L, D) = \log \frac{P(q_i, q_j \mid L, D)}{P(q_i \mid L, D)\, P(q_j \mid L, D)} \approx \log \frac{C_D(q_i, q_j, R)/N}{(C_D(q_i, *, R)/N)\,(C_D(*, q_j, R)/N)} = \log \frac{C_D(q_i, q_j, R) \cdot N}{C_D(q_i, *, R)\, C_D(*, q_j, R)}$

where $N = C_D(*, *, R)$ and * matches any term.
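The MI estimator is a pointwise mutual information over link counts. A direct transcription, assuming the counts are supplied and nonzero:

```python
from math import log

def mi(c_ij, c_i_any, c_any_j, n):
    """MI(qi,qj|L,D) ~ log( C(qi,qj,R) * N / (C(qi,*,R) * C(*,qj,R)) ),
    where N = C(*,*,R) is the total number of linked pairs and
    c_i_any / c_any_j are the marginal link counts for qi and qj."""
    return log(c_ij * n / (c_i_any * c_any_j))
```

Positive MI (the pair is linked more often than its marginals predict) raises the document's score; independence gives MI = 0.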

Page 19:

Experimental Setting

• Documents were stemmed and stop words were removed.

• Queries are TREC topics 202 to 250, run on TREC disks 2 and 3.

Page 20:

The flow of the experiment

• Given a query, find its linkage: take the L that maximizes $P(l \mid Q)$.

• From the document and the training data, count the frequencies for $RF_D(R \mid q_i, q_j)$ and $RF_C(R \mid q_i, q_j)$, combine them into $RF(R \mid q_i, q_j)$, and get $P(L \mid D)$.

• Count the frequencies $C_C(q_i)$ and $C_D(q_i)$ to get $P(q_i \mid D)$.

• Count the frequencies $C_D(q_i, q_j, R)$, $C_D(q_i, *, R)$, $C_D(*, q_j, R)$, and $C_D(*, *, R)$ to get $MI(q_i, q_j \mid L, D)$.

• Combine the three quantities to rank the documents.

Page 21:

Result-BM & UG

• BM: binary independence retrieval model

• UG: unigram language model approach

• UG achieves performance similar to, or worse than, that of BM.

Page 22:

Result- DM

• DM: dependency model

• The improvement of DM over UG is statistically significant.

Page 23:

Result- BG

• BG: bigram language model

• BG is slightly worse than DM in five of the six TREC collections but substantially outperforms UG in all collections.

Page 24:

Result- BT1 & BT2

• BT: bi-term language model

$P_{BT1}(q_i \mid q_{i-1}, D) = \frac{1}{2}\left( P_{BG}(q_i \mid q_{i-1}, D) + P_{BG}(q_{i-1} \mid q_i, D) \right)$

$P_{BT2}(q_i \mid q_{i-1}, D) = \frac{C_D(q_{i-1}, q_i) + C_D(q_i, q_{i-1})}{2 \cdot \min\{ C_D(q_{i-1}),\, C_D(q_i) \}}$

Page 25:

Conclusion

• This paper introduces the linkage of a query as a hidden variable.

• Each term is generated in turn, depending on other related terms according to the linkage.
– This approach covers several language modeling approaches as special cases.

• The experimental results show that the proposed model substantially outperforms the unigram, bigram, and classical probabilistic retrieval models.