A Markov Random Field Model for Term Dependencies. Donald Metzler, W. Bruce Croft. Presented by Chia-Hao Lee.

Dec 30, 2015

Ashlyn Richards
Page 1

A Markov Random Field Model for Term Dependencies

Donald Metzler W. Bruce Croft

Presented by Chia-Hao Lee

Page 2

Outline

• Introduction
• Model
– Overview
– Variants
– Potential Functions
– Training
• Experimental Results
• Conclusions

Page 3

Introduction

• There is a rich history of statistical models for information retrieval, including the binary independence model (BIM), language modeling, the inference network model, and so on.

• It is well known that dependencies exist between terms in a collection of text.

• For example, within a SIGIR proceedings, occurrences of certain pairs of terms are correlated, such as information and retrieval.

Page 4

Introduction

• Unfortunately, estimating statistical models for general term dependencies is infeasible, due to data sparsity.

• For this reason, most retrieval models assume some form of independence exists between terms.

• Most work on modeling term dependencies in the past has focused on phrases/proximity or term co-occurrences. Most of these models only consider dependencies between pairs of terms.

• Several recent studies have examined term dependence models for the language modeling framework.

Page 5

Model

• Markov random fields (MRF), also called undirected graphical models, are commonly used in the statistical machine learning domain to succinctly model joint distributions.

• We use MRFs to model the joint distribution $P_\Lambda(Q, D)$ over queries Q and documents D, parameterized by Λ.

Page 6

Model

• A Markov random field is constructed from a graph G.

• The nodes in the graph represent random variables, and the edges define the independence semantics between the random variables.

• In this model, we assume G consists of query nodes $q_i$ and a document node D, such as the graphs in the figure.

$$P_\Lambda(Q, D) = \frac{1}{Z_\Lambda} \prod_{c \in C(G)} \phi(c; \Lambda), \qquad Q = q_1, \ldots, q_n$$

$C(G)$ : the set of cliques in G

$\phi(\cdot; \Lambda)$ : a non-negative potential function over clique configurations, parameterized by Λ

$Z_\Lambda = \sum_{Q, D} \prod_{c \in C(G)} \phi(c; \Lambda)$ : normalizes the distribution

Page 7

Model

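• For ranking purposes we compute the posterior:

$$P_\Lambda(D \mid Q) = \frac{P_\Lambda(Q, D)}{P_\Lambda(Q)} \stackrel{rank}{=} \log P_\Lambda(Q, D) - \log P_\Lambda(Q) \stackrel{rank}{=} \sum_{c \in C(G)} \log \phi(c; \Lambda)$$

• As noted above, all potential functions must be non-negative, and are most commonly parameterized as:

$$\phi(c; \Lambda) = \exp[\lambda_c f(c)]$$

$f(c)$ : real-valued feature function over clique values

$\lambda_c$ : the weight given to that particular feature function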
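
The rank equivalences in the posterior above rely on dropping terms that do not depend on the document. A short worked step, written here with φ for the potential function as an assumed notation, may make this explicit:

```latex
\begin{align*}
\log P_\Lambda(D \mid Q)
  &= \log P_\Lambda(Q, D) - \log P_\Lambda(Q) \\
  &= \sum_{c \in C(G)} \log \phi(c; \Lambda) \;-\; \log Z_\Lambda \;-\; \log P_\Lambda(Q)
\end{align*}
```

Both $\log Z_\Lambda$ and $\log P_\Lambda(Q)$ take the same value for every document D, so removing them does not change the ordering of documents for a fixed query.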

Page 8

Model

• Substituting this back into the ranking function, we end up with the following ranking function:

$$P_\Lambda(D \mid Q) \stackrel{rank}{=} \sum_{c \in C(G)} \lambda_c f(c) \qquad (1)$$

• To utilize the model, the following steps must be taken for each query Q:
– Construct a graph representing the query term dependencies to model
– Define a set of potential functions over the cliques of this graph
– Rank documents in descending order of $P_\Lambda(D \mid Q)$
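
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the clique representation (a tuple of query terms with the document node left implicit), the names `mrf_score`, `rank`, and `toy_feature`, and the toy feature itself are all hypothetical and not taken from the slides.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: a clique is a tuple of query terms;
# the document node D is implicit and passed in separately.
Clique = Tuple[str, ...]
Feature = Callable[[Clique, List[str]], float]


def mrf_score(cliques: List[Tuple[Clique, float, Feature]], doc: List[str]) -> float:
    """Rank-equivalent score: sum over cliques of lambda_c * f(c)."""
    return sum(weight * feature(clique, doc) for clique, weight, feature in cliques)


def rank(cliques: List[Tuple[Clique, float, Feature]],
         docs: Dict[str, List[str]]) -> List[str]:
    """Return document ids in descending order of the score."""
    return sorted(docs, key=lambda d: mrf_score(cliques, docs[d]), reverse=True)


def toy_feature(clique: Clique, doc: List[str]) -> float:
    # Placeholder feature (an assumption): how many clique terms occur in the document.
    return float(sum(term in doc for term in clique))


docs = {
    "d1": "information retrieval models".split(),
    "d2": "information assurance".split(),
}
cliques = [
    (("information",), 1.0, toy_feature),
    (("retrieval",), 1.0, toy_feature),
]
ranking = rank(cliques, docs)
```

Any real-valued feature function can be substituted for `toy_feature`; only the weighted sum and the sort reflect the ranking function itself.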

Page 9

Model

• We now describe and analyze three variants of the MRF model, each with different underlying dependence assumptions.
– Full independence (FI)
– Sequential dependence (SD)
– Full dependence (FD)

Page 10

Model

• The full independence variant makes the assumption that query terms $q_i$ are independent given some document D.

• The likelihood of query term $q_i$ occurring is not affected by the occurrence of any other query term, or more succinctly, $P(q_i \mid D, q_{j \neq i}) = P(q_i \mid D)$.

• The sequential dependence variant assumes a dependence between neighboring query terms.

• Formally, this assumption states that $P(q_i \mid D, q_j) = P(q_i \mid D)$ only for nodes $q_j$ that are not adjacent to $q_i$.

Page 11

Model

• In the full dependence variant, all query terms are assumed to be in some way dependent on each other.

• Graphically, a query of length n translates into the complete graph $K_{n+1}$, which includes edges from each query node to the document node D.
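
To make the three variants concrete, the sketch below lists, for each variant, the sets of query terms that form a clique together with the document node D. The function name `query_cliques` and the tuple representation are assumptions made for illustration; cliques that do not contain D are left out because they play no role in ranking documents.

```python
from itertools import combinations
from typing import List, Tuple


def query_cliques(terms: List[str], variant: str) -> List[Tuple[str, ...]]:
    """Query-term sets that form a clique together with the document node D."""
    n = len(terms)
    singles = [(t,) for t in terms]
    if variant == "FI":
        # Only the edges q_i - D exist.
        return singles
    if variant == "SD":
        # Neighboring query terms are also connected to each other.
        return singles + [(terms[i], terms[i + 1]) for i in range(n - 1)]
    if variant == "FD":
        # Complete graph K_{n+1}: every non-empty subset of query terms.
        return [c for k in range(1, n + 1) for c in combinations(terms, k)]
    raise ValueError(f"unknown variant: {variant}")


query = ["train", "station", "security"]
fi = query_cliques(query, "FI")
sd = query_cliques(query, "SD")
fd = query_cliques(query, "FD")
```

For a three-term query this yields 3, 5, and 7 cliques respectively. The count under full dependence grows exponentially with query length.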

Page 12

Model

• The potential functions φ play a very important role in how accurate our approximation of the true joint distribution is.

• For example, consider a document D on the topic of information retrieval. Using the sequential dependence variant, we would expect $\phi(\text{information}, \text{retrieval}, D) > \phi(\text{information}, \text{assurance}, D)$, as the terms information and retrieval are much more “compatible” with the topicality of document D than the terms information and assurance.

Page 13

Model

• Since documents are ranked by Equation 1, it is also important that the potential functions can be computed efficiently.

• Based on these criteria and previous research on phrases and term dependence, we focus on three types of potential functions.

• These potential functions attempt to abstract the idea of term co-occurrence.

Page 14

Model

• Since potentials are over cliques in the graph, we now proceed to enumerate all of the possible ways graph cliques are formed in our model and how potential functions are defined for each.

• The simplest type of clique that can appear in our graph is a 2-clique consisting of an edge between a query term $q_i$ and the document D.

Page 15

Model

• In keeping with simple-to-compute measures, we define this potential as:

$$\phi_T(c) = \lambda_T \log P(q_i \mid D) = \lambda_T \log\left[(1 - \alpha_D)\frac{tf_{q_i, D}}{|D|} + \alpha_D \frac{cf_{q_i}}{|C|}\right]$$

$P(q_i \mid D)$ : a smoothed language modeling estimate

$tf_{w, D}$ : the number of times term w occurs in document D

$cf_w$ : the number of times term w occurs in the entire collection

$|D|$ : total number of terms in the document D

$|C|$ : the length of the collection

$\alpha_D$ : smoothing parameter
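
A minimal sketch of the single-term feature, assuming documents are plain token lists and the smoothing weight is a fixed constant `alpha` (the value 0.5 is arbitrary and chosen only for the example). The name `f_T` is hypothetical, and the function assumes the term occurs somewhere in the collection, since the logarithm is undefined otherwise.

```python
import math
from typing import List


def f_T(term: str, doc: List[str], collection: List[List[str]],
        alpha: float = 0.5) -> float:
    """log[(1 - alpha) * tf / |D| + alpha * cf / |C|] for one query term."""
    tf = doc.count(term)                          # occurrences in this document
    cf = sum(d.count(term) for d in collection)   # occurrences in the collection
    doc_len = len(doc)                            # |D|
    coll_len = sum(len(d) for d in collection)    # |C|
    return math.log((1 - alpha) * tf / doc_len + alpha * cf / coll_len)


collection = [
    "information retrieval".split(),
    "information assurance".split(),
]
score_match = f_T("retrieval", collection[0], collection)     # term is in the document
score_no_match = f_T("retrieval", collection[1], collection)  # term is absent
```

The collection term keeps the estimate positive for documents that do not contain the query term, so they receive a low score instead of an undefined one.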

Page 16

Model

• Next, we consider cliques that contain two or more query terms.

• For example, in the query train station security measures, if any of the sub-phrases train station, train station security, station security measures, or security measures appears in a document, then there is strong evidence in favor of relevance.

Page 17

Model

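• Therefore, for every clique that contains a contiguous set of two or more terms $q_i, \ldots, q_{i+k}$ and the document node D, we apply the following “ordered” potential function:

$$\phi_O(c) = \lambda_O \log P(\#1(q_i, \ldots, q_{i+k}) \mid D) = \lambda_O \log\left[(1 - \alpha_D)\frac{tf_{\#1(q_i \ldots q_{i+k}), D}}{|D|} + \alpha_D \frac{cf_{\#1(q_i \ldots q_{i+k})}}{|C|}\right]$$

$tf_{\#1(q_i \ldots q_{i+k}), D}$ : the number of times the exact phrase $q_i, \ldots, q_{i+k}$ occurs in document D

$cf_{\#1(q_i \ldots q_{i+k})}$ : the number of times the exact phrase $q_i, \ldots, q_{i+k}$ occurs in the entire collection

$|D|$ : total number of terms in the document D

$|C|$ : the length of the collection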
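
The exact-phrase count behind the ordered potential can be sketched as follows. The helper names `count_exact_phrase` and `f_O`, the token-list representation, and the fixed smoothing weight `alpha` are assumptions for illustration; the feature assumes the phrase occurs at least once in the collection.

```python
import math
from typing import List, Sequence


def count_exact_phrase(phrase: Sequence[str], tokens: Sequence[str]) -> int:
    """Number of positions at which `phrase` occurs contiguously and in order."""
    k = len(phrase)
    return sum(1 for i in range(len(tokens) - k + 1)
               if list(tokens[i:i + k]) == list(phrase))


def f_O(phrase: Sequence[str], doc: List[str], collection: List[List[str]],
        alpha: float = 0.5) -> float:
    """Smoothed log estimate built from exact-phrase counts."""
    tf = count_exact_phrase(phrase, doc)
    cf = sum(count_exact_phrase(phrase, d) for d in collection)
    coll_len = sum(len(d) for d in collection)
    return math.log((1 - alpha) * tf / len(doc) + alpha * cf / coll_len)


doc = "train station security measures at the train station".split()
other = "security measures were reviewed".split()
n_train_station = count_exact_phrase(["train", "station"], doc)
n_reversed = count_exact_phrase(["station", "train"], doc)
phrase_feature = f_O(["train", "station"], doc, [doc, other])
```

Word order matters here: the reversed phrase is not counted, which is what distinguishes this feature from the unordered one.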

Page 18

Model

• Although the occurrence of contiguous sets of query terms provides strong evidence of relevance, it is also the case that the occurrence of non-contiguous sets of query terms can provide valuable evidence.

• In the previous example, documents containing the terms train and security within some short proximity of one another also provide additional evidence towards relevance.

Page 19

Model

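• For our purposes, we construct an “unordered” potential function over cliques that consist of sets of two or more query terms $q_i, \ldots, q_j$ and the document node D. Such potential functions have the following form:

$$\phi_U(c) = \lambda_U \log P(\#uwN(q_i, \ldots, q_j) \mid D) = \lambda_U \log\left[(1 - \alpha_D)\frac{tf_{\#uwN(q_i \ldots q_j), D}}{|D|} + \alpha_D \frac{cf_{\#uwN(q_i \ldots q_j)}}{|C|}\right]$$

$tf_{\#uwN(q_i \ldots q_j), D}$ : the number of times the terms $q_i, \ldots, q_j$ appear, ordered or unordered, within a window of N terms in document D

$cf_{\#uwN(q_i \ldots q_j)}$ : the number of times the terms $q_i, \ldots, q_j$ appear, ordered or unordered, within a window of N terms in the entire collection

$|D|$ : total number of terms in the document D

$|C|$ : the length of the collection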
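
One simple way to count unordered window matches is sketched below. The counting convention is an assumption made for this example: a match is counted at each position that holds one of the terms and whose next N tokens contain every term. Retrieval systems may count overlapping windows differently. The name `count_unordered_window` is hypothetical.

```python
from typing import Sequence


def count_unordered_window(terms: Sequence[str], tokens: Sequence[str],
                           window: int) -> int:
    """Count windows anchored on a query term that contain every term, in any order."""
    needed = set(terms)
    count = 0
    for i, tok in enumerate(tokens):
        # Anchor the window on an occurrence of one of the terms.
        if tok in needed and needed.issubset(tokens[i:i + window]):
            count += 1
    return count


doc = "security at the train station and train security staff".split()
within_4 = count_unordered_window(("train", "security"), doc, window=4)
within_2 = count_unordered_window(("train", "security"), doc, window=2)
```

With a window of 4 the match "security at the train" is counted even though the terms are reversed and separated. Shrinking the window to 2 leaves only the adjacent pair.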

Page 20

Model

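• Using these potential functions, we derive the following specific ranking function:

$$P_\Lambda(D \mid Q) \stackrel{rank}{=} \sum_{c \in C(G)} \lambda_c f(c) = \sum_{c \in T} \lambda_T f_T(c) + \sum_{c \in O} \lambda_O f_O(c) + \sum_{c \in O \cup U} \lambda_U f_U(c)$$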
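
Putting the three feature types together for the sequential dependence variant might look like the sketch below. Everything specific here is an assumption for illustration: the weights (0.8, 0.1, 0.1), the window size of 8, the constant smoothing weight, and the simplified window-counting convention. Cliques whose collection count is zero are skipped. Since that count is the same for every document, skipping them does not change the ranking.

```python
import math
from typing import List, Sequence, Tuple


def count_ordered(terms: Sequence[str], tokens: Sequence[str]) -> int:
    """Occurrences of `terms` as an exact contiguous phrase."""
    k = len(terms)
    return sum(1 for i in range(len(tokens) - k + 1)
               if list(tokens[i:i + k]) == list(terms))


def count_unordered(terms: Sequence[str], tokens: Sequence[str], window: int) -> int:
    """Windows anchored on a query term that contain every term, in any order."""
    needed = set(terms)
    return sum(1 for i, tok in enumerate(tokens)
               if tok in needed and needed.issubset(tokens[i:i + window]))


def log_estimate(tf: int, cf: int, doc_len: int, coll_len: int, alpha: float) -> float:
    """log[(1 - alpha) * tf / |D| + alpha * cf / |C|]; requires cf > 0."""
    return math.log((1 - alpha) * tf / doc_len + alpha * cf / coll_len)


def sd_score(query: List[str], doc: List[str], collection: List[List[str]],
             weights: Tuple[float, float, float] = (0.8, 0.1, 0.1),
             window: int = 8, alpha: float = 0.5) -> float:
    """Sum of weighted term, ordered, and unordered features for one document."""
    lam_t, lam_o, lam_u = weights
    coll_len = sum(len(d) for d in collection)
    score = 0.0
    # Cliques in T: a single query term plus D.
    for q in query:
        cf = sum(d.count(q) for d in collection)
        if cf > 0:
            score += lam_t * log_estimate(doc.count(q), cf, len(doc), coll_len, alpha)
    # Cliques in O: adjacent query terms plus D; both ordered and unordered features apply.
    for pair in zip(query, query[1:]):
        cf_o = sum(count_ordered(pair, d) for d in collection)
        if cf_o > 0:
            score += lam_o * log_estimate(count_ordered(pair, doc), cf_o,
                                          len(doc), coll_len, alpha)
        cf_u = sum(count_unordered(pair, d, window) for d in collection)
        if cf_u > 0:
            score += lam_u * log_estimate(count_unordered(pair, doc, window), cf_u,
                                          len(doc), coll_len, alpha)
    return score


collection = {
    "d1": "train station security measures were increased".split(),
    "d2": "security staff train at the station".split(),
}
query = ["train", "station"]
all_docs = list(collection.values())
ranking = sorted(collection, key=lambda d: sd_score(query, collection[d], all_docs),
                 reverse=True)
```

Both documents contain each query term once and have the same length, so the term features tie. The exact phrase train station appears only in d1, which is what places it first.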

Page 21

Experimental Results

• We make use of the Associated Press and Wall Street Journal sub-collections of TREC, which are small homogeneous collections, and two web collections, WT10g and GOV2, which are considerably larger and less homogeneous.

Page 22

Experimental Results

• Full independence

Page 23

Experimental Results

• Sequential dependence

Page 24

Experimental Results

• Full dependence

Page 25

Conclusions

• In this paper, we develop a general term dependence model that can make use of arbitrary text features.

• Three variants of the model are described, each of which captures different dependencies between query terms.

Page 26

Markov Random Fields

• Let $X_1, \ldots, X_n$ be random variables taking values in some finite set S, and let $G = (N, E)$ be a finite graph such that $N = \{1, \ldots, n\}$, whose elements will sometimes be called sites.

• For a set $A \subset N$, let $\partial A$ denote its neighbor (or boundary) set: all elements in $N \setminus A$ that have a neighbor in A. For $i \in N$, let $\partial i = \partial\{i\}$.

• The random variables are said to define a Markov random field if, for any vector $x \in S^N$:

$$\Pr(X_i = x_i \mid X_j = x_j,\ j \in N \setminus i) = \Pr(X_i = x_i \mid X_j = x_j,\ j \in \partial i)$$

Page 27

Potentials

• A potential is a function indexed by subsets of N on the space $S^N$. We will write potentials as $V_A(w)$ for $A \subset N$, $w \in S^N$.

• Given a full set of potentials, the energy of a configuration w will be defined as:

$$U(w) = \sum_{A \subset N} V_A(w)$$

• Using the energy, we can define a probability measure, P, from a set of potentials by:

$$P(w) = \frac{\exp(-U(w))}{Z}, \qquad Z = \sum_{w \in S^N} \exp(-U(w))$$
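
A small numerical sketch may help connect potentials, energy, and the resulting probability measure. It assumes the common sign convention in which the energy is the sum of the potentials and the probability is proportional to exp(−U(w)). The three-site chain, the binary state space, and the disagreement potential are all made up for the example.

```python
import math
from itertools import product

S = (0, 1)                  # finite state space
sites = (0, 1, 2)           # N = {0, 1, 2}, arranged as a chain 0 - 1 - 2
edges = ((0, 1), (1, 2))    # potentials are non-zero only on these pairs


def energy(w):
    """U(w): sum of edge potentials, where V = 1 if the endpoints disagree, else 0."""
    return sum(1.0 for i, j in edges if w[i] != w[j])


configs = list(product(S, repeat=len(sites)))
Z = sum(math.exp(-energy(w)) for w in configs)           # normalizing constant
P = {w: math.exp(-energy(w)) / Z for w in configs}       # probability of each configuration


def conditional_x0(x0, x1, x2):
    """Pr(X_0 = x0 | X_1 = x1, X_2 = x2), computed from the joint P."""
    return P[(x0, x1, x2)] / sum(P[(s, x1, x2)] for s in S)
```

Because site 0 has only site 1 as a neighbor, `conditional_x0(0, 0, 0)` and `conditional_x0(0, 0, 1)` agree, which is the Markov property for this chain.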