Learning to Rank Methods Hang Li Microsoft Research Asia IBIS 2009 Oct. 21, 2009 Fukuoka Japan 1

Learning to Rank - IBISMLibisml.org/ibis2009/pdf-invited/hangli1.pdf · •Learning to Rank Methods –Ranking SVM –IR SVM –ListMLE –Ada Rank •Learning to Rank Theory •Learning


Learning to Rank Methods

Hang Li

Microsoft Research Asia

IBIS 2009, Oct. 21, 2009

Fukuoka Japan


Talk Outline

• What is Learning to Rank

• Learning to Rank Methods

– Ranking SVM

– IR SVM

– ListMLE

– AdaRank

• Learning to Rank Theory

• Learning to Rank Applications

• Future Directions of Learning to Rank Research


What is Learning to Rank?


Ranking Problem: Example = Document Retrieval

• Query: $q$

• Documents: $D = \{d_1, d_2, \ldots, d_N\}$

• Ranking function: $f(q, d)$

• Ranking based on relevance, importance, preference

• Ranking of documents:

$$
\begin{array}{ll}
d_{m+1,1} & f(q_{m+1}, d_{m+1,1}) \\
d_{m+1,2} & f(q_{m+1}, d_{m+1,2}) \\
\vdots & \vdots \\
d_{m+1,n_{m+1}} & f(q_{m+1}, d_{m+1,n_{m+1}})
\end{array}
$$

Traditional Approach to Search Ranking


Probabilistic Model

• Query: $q$

• Documents: $d_1, d_2, \ldots, d_N$

• Model: $P(r \mid q, d)$, $r \in R = \{1, 0\}$

• Ranking of documents:

$$
\begin{array}{l}
d_1 \sim P(r \mid q, d_1) \\
d_2 \sim P(r \mid q, d_2) \\
\vdots \\
d_n \sim P(r \mid q, d_n)
\end{array}
$$

BM25

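• Query: $q$

• Documents: $d_1, d_2, \ldots, d_N$

• Ranking function:

$$
\sum_{w \in q \cap d} \frac{(k + 1)\, tf(w)}{k \left( (1 - b) + b \, \frac{dl}{avgdl} \right) + tf(w)}
$$

• Ranking of documents:

$$
\begin{array}{l}
d_1 \sim P(r \mid q, d_1) \\
d_2 \sim P(r \mid q, d_2) \\
\vdots \\
d_n \sim P(r \mid q, d_n)
\end{array}
$$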
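To make the BM25 ranking function concrete, here is a minimal sketch that scores documents with the form $\sum_{w \in q \cap d} (k+1)\,tf(w) / (k((1-b) + b \cdot dl/avgdl) + tf(w))$, where $dl$ is the document length and $avgdl$ the average document length. The parameter values `k=1.2` and `b=0.75` are conventional defaults assumed here, not values from the talk, and the toy documents are made up. Many BM25 variants also multiply each term by an IDF weight, which this form omits.

```python
from collections import Counter

def bm25_score(query_terms, doc_terms, avgdl, k=1.2, b=0.75):
    """Score one document for one query with the IDF-free BM25 form."""
    tf = Counter(doc_terms)                 # term frequency tf(w) in the document
    dl = len(doc_terms)                     # document length
    norm = k * ((1 - b) + b * dl / avgdl)   # length-normalised part of the denominator
    score = 0.0
    for w in set(query_terms):              # only terms in both q and d contribute
        if tf[w] > 0:
            score += (k + 1) * tf[w] / (norm + tf[w])
    return score

# Hypothetical toy collection
docs = {
    "d1": "learning to rank for information retrieval".split(),
    "d2": "support vector machines for classification".split(),
    "d3": "rank rank rank".split(),
}
avgdl = sum(len(d) for d in docs.values()) / len(docs)
query = "learning rank".split()

# Rank documents by descending score
ranking = sorted(docs, key=lambda name: bm25_score(query, docs[name], avgdl), reverse=True)
print(ranking)
```

Note that the function has no learned parameters: `k` and `b` are tuned by hand, which is the contrast with the learning approach introduced next.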


Learning to Rank: New Approach to Search Ranking


Learning to Rank

[Figure: training data consisting of queries $q_1, \ldots, q_m$, where query $q_i$ is associated with documents $d_{i,1}, d_{i,2}, \ldots, d_{i,n_i}$ from $D = \{d_1, d_2, \ldots, d_N\}$, is fed to the Learning System. The Ranking System receives a new query $q_{m+1}$ and ranks its documents $d_{m+1,1}, d_{m+1,2}, \ldots, d_{m+1,n_{m+1}}$.]

Learning to Rank

[Figure: training queries $q_1, \ldots, q_m$, where query $q_i$ is associated with documents $d_{i,1}, d_{i,2}, \ldots, d_{i,n_i}$ from $D = \{d_1, d_2, \ldots, d_N\}$, are fed to the Learning System, which produces the ranking model $f(q, d)$. The Ranking System applies the model to a new query $q_{m+1}$ and outputs its documents with their scores:]

$$
\begin{array}{ll}
d_{m+1,1} & f(q_{m+1}, d_{m+1,1}) \\
d_{m+1,2} & f(q_{m+1}, d_{m+1,2}) \\
\vdots & \vdots \\
d_{m+1,n_{m+1}} & f(q_{m+1}, d_{m+1,n_{m+1}})
\end{array}
$$

Training Process

1. Data Labeling (rank): for each query $q_i$, $i = 1, \ldots, m$, the documents are labeled, giving $(d_{i,1}, y_{i,1}), (d_{i,2}, y_{i,2}), \ldots, (d_{i,n_i}, y_{i,n_i})$

2. Feature Extraction: each query-document pair is turned into a feature vector, giving $(x_{i,1}, y_{i,1}), (x_{i,2}, y_{i,2}), \ldots, (x_{i,n_i}, y_{i,n_i})$

3. Learning: the ranking model $f(x)$ is learned from the labeled feature vectors

Testing Process

1. Data Labeling (rank): for the test query $q_{m+1}$, giving $(d_{m+1,1}, y_{m+1,1}), (d_{m+1,2}, y_{m+1,2}), \ldots, (d_{m+1,n_{m+1}}, y_{m+1,n_{m+1}})$

2. Feature Extraction: $(x_{m+1,1}, y_{m+1,1}), (x_{m+1,2}, y_{m+1,2}), \ldots, (x_{m+1,n_{m+1}}, y_{m+1,n_{m+1}})$

3. Ranking with $f(x)$: $(x_{m+1,1}, f(x_{m+1,1}), y_{m+1,1}), (x_{m+1,2}, f(x_{m+1,2}), y_{m+1,2}), \ldots, (x_{m+1,n_{m+1}}, f(x_{m+1,n_{m+1}}), y_{m+1,n_{m+1}})$

4. Evaluation: the ranking by $f(x)$ is compared with the labels, giving the Evaluation Result

Notes

• Features are functions of query and document

• Query and associated documents form a group

• Groups are i.i.d. data

• Feature vectors within group are not i.i.d. data

• Ranking model is function of features

• Several data labeling methods (here labeling of rank as example)


Recent Trends on Learning to Rank

• Successfully applied to search

• Hot topic in Information Retrieval and Machine Learning

– Over 100 publications at SIGIR, ICML, NIPS, etc

– 2 sessions at SIGIR every year

– 3 SIGIR workshops

– Special issue at Information Retrieval Journal

– LETOR benchmark dataset, over 1,000 downloads

http://research.microsoft.com/en-us/um/beijing/projects/letor/index.html


Issues in Learning to Rank

• Data Labeling

• Feature Extraction

• Evaluation Measure

• Learning Method (Model, Loss Function, Algorithm)


Data Labeling Problem

• E.g., relevance of documents w.r.t. query

[Figure: a query and its candidate documents Doc A, Doc B, Doc C]

Data Labeling Methods

• Labeling of Ranks

– Multiple levels (e.g., relevant, partially relevant, irrelevant)

– Widely used in IR

• Labeling of Ordered Pairs

– Ordered pairs between documents (e.g. A>B, B>C)

– Implicit relevance judgment: derived from click-through data

• Creation of List

– List (or permutation) of documents is given

– Ideal but difficult to implement
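The first two labeling methods are related: graded rank labels can be converted into ordered pairs, which is what pairwise methods consume. The sketch below shows one way to do this; the numeric grade encoding (2 = relevant, 1 = partially relevant, 0 = irrelevant) and the function name are assumptions made for illustration.

```python
from itertools import combinations

def labels_to_pairs(labels):
    """Return (i, j) index pairs meaning 'document i should rank above document j'."""
    pairs = []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] > labels[j]:
            pairs.append((i, j))
        elif labels[j] > labels[i]:
            pairs.append((j, i))
        # equal labels express no preference, so no pair is emitted
    return pairs

# Hypothetical grades for Doc A, Doc B, Doc C
grades = {"A": 2, "B": 1, "C": 0}
names = list(grades)
pairs = [(names[i], names[j]) for i, j in labels_to_pairs(list(grades.values()))]
print(pairs)  # ordered pairs A>B, A>C, B>C
```

The reverse direction does not work in general: a set of ordered pairs derived from clicks need not correspond to any consistent grading.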


Feature Extraction

[Figure: for a query, each of Doc A, Doc B, and Doc C is mapped to a feature vector (BM25, PageRank, ...). BM25 is a query-document feature; PageRank is a document feature.]

Example Features

• Relevance: BM25

• Relevance: proximity

• Relevance: query exactly occurs in document

• Importance: PageRank


Evaluation Measures

• Important to rank top results correctly

• Measures

– NDCG (Normalized Discounted Cumulative Gain)

– MAP (Mean Average Precision)

– MRR (Mean Reciprocal Rank)

– WTA (Winner Takes All)

– Kendall’s Tau


NDCG

• Evaluating ranking using labeled ranks

• NDCG at position j

$$
n_j \sum_{i=1}^{j} \frac{2^{r(i)} - 1}{\log(1 + i)}
$$

where $r(i)$ is the rank label of the document at position $i$ and $n_j$ is the normalizing factor

NDCG (cont’)

• Example: perfect ranking

– (3, 3, 2, 2, 1, 1, 1) rank r = 3, 2, 1

– (7, 7, 3, 3, 1, 1, 1) gain $2^{r(j)} - 1$

– (1, 0.63, 0.5, 0.43, 0.39, 0.36, 0.33) position discount $1 / \log_2(1 + j)$

– (7, 11.42, 12.92, …) DCG $\sum_{i=1}^{j} (2^{r(i)} - 1) / \log_2(1 + i)$

– (1/7, 1/11.42, 1/12.92, …) normalizing factor $n_j$

– (1, 1, 1, 1, 1, 1, 1) NDCG for perfect ranking

NDCG (cont’)

• Example: imperfect ranking

– (2, 3, 2, 3, 1, 1, 1)

– (3, 7, 3, 7, 1, 1, 1) Gain

– (1, 0.63, 0.5, 0.43, 0.39, 0.36, 0.33) Position discount

– (3, 7.42, 8.92, …) DCG

– (0.43, 0.65, 0.69, …) NDCG

• Imperfect ranking decreases NDCG
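The NDCG computation can be checked with a few lines of code. The sketch below applies the definition literally (gain $2^{r} - 1$, discount $1/\log_2(1+i)$, normalization by the DCG of the ideal ordering) to the two example rankings. The base-2 logarithm is inferred from the listed discount values (0.63 at position 2, 0.5 at position 3); the function names are illustrative.

```python
import math

def dcg_at(ranks, j):
    """Discounted cumulative gain of the top-j positions."""
    return sum((2 ** r - 1) / math.log2(1 + i)
               for i, r in enumerate(ranks[:j], start=1))

def ndcg_at(ranks, j):
    """DCG normalised by the DCG of the ideal (sorted) ordering."""
    best = dcg_at(sorted(ranks, reverse=True), j)
    return dcg_at(ranks, j) / best if best > 0 else 0.0

perfect = [3, 3, 2, 2, 1, 1, 1]
imperfect = [2, 3, 2, 3, 1, 1, 1]

for j in (1, 2, 3):
    print(j, round(dcg_at(imperfect, j), 2), round(ndcg_at(imperfect, j), 2))
```

Because the normalizing factor depends on the cutoff $j$, NDCG is usually reported at a fixed position such as NDCG@1, NDCG@3, or NDCG@10.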


Relations with Other Learning Tasks

• No need to predict category (vs. classification)

• No need to predict value of $f(q, d)$ (vs. regression)

• Relative ranking order is more important (vs. ordinal regression)

• Learning to rank can be approximated by classification, regression, or ordinal regression

Ordinal Regression (Ordinal Classification)

• Categories are ordered

– 5, 4, 3, 2, 1

– e.g., rating restaurants

• Prediction

– Map to ordered categories


Learning to Rank Methods


Learning to Rank Methods

• Pointwise Approach

– Subset Ranking [Cossock and Zhang, 2006]: Regression

– SVM [Nallapati, 2004]: Binary Classification Using SVM

– McRank [Li et al 2007]: Multi-Class Classification Using Boosting Tree

– Prank [Crammer and Singer 2002]: Ordinal Regression Using Perceptron

– Large Margin [Shashua & Levin 2002]: Ordinal Regression Using SVM


Learning to Rank Methods

• Pairwise Approach

– Ranking SVM: Pairwise Classification Using SVM

– RankBoost [Freund et al 2003]: Pairwise Classification Using Boosting

– RankNet [Burges et al 2005]: Pairwise Classification Using Neural Net

– FRank [Tsai et al 2007]: Pairwise Classification Using Fidelity Loss and Neural Net

– GBRank [Zheng et al 2007]: Pairwise Regression Using Boosting Tree

– IR SVM [Cao et al 2006]: Cost-sensitive Pairwise Classification Using SVM

– Multiple SVMs [Qin et al 2007]: Multiple SVMs

Learning to Rank Methods

• Listwise Approach

– ListNet [Cao et al 2007]: Probabilistic Ranking Model

– ListMLE [Xia et al 2008]: Probabilistic Ranking Model

– AdaRank [Xu and Li 2007]: Direct Optimization of Evaluation Measure

– SVM Map [Yue et al 2007]: Direct Optimization of Evaluation Measure

– PermuRank [Xu et al 2008]: Direct Optimization of Evaluation Measure

– Soft Rank [Taylor et al 2008]: Approximation of Evaluation Measure

– Lambda Rank [Burges et al 2007]: Using Implicit Loss Function

Learning to Rank Methods

• Other Methods

– K-Nearest Neighbor Ranker [Geng et al 2008]

– Semi-Supervised Learning [Jin et al 2008]


Evaluation Results

• Pairwise approach and listwise approach perform better than pointwise approach

• Listwise approach performs better than pairwise approach in most cases

• Listwise approach

– ListMLE, ListNet, AdaRank, PermuRank, SVM-MAP

• Pairwise approach

– Ranking SVM, RankNet, RankBoost

• Pointwise approach

– Linear Regression

Ranking SVM


Transforming Ranking to Pairwise Classification

• Input space: $X$

• Ranking function: $f: X \to \mathbb{R}$

• Ranking: $x_i \succ x_j \Leftrightarrow f(x_i; w) > f(x_j; w)$

• Linear ranking function: $f(x; w) = \langle w, x \rangle$, so that $\langle w, x_i - x_j \rangle > 0 \Leftrightarrow f(x_i; w) > f(x_j; w)$

• Transforming to pairwise classification: $(x_i - x_j, z)$, where $z = +1$ if $x_i \succ x_j$ and $z = -1$ if $x_j \succ x_i$

Ranking Problem

[Figure: three objects in feature space, $x_1$ at rank 1, $x_2$ at rank 2, $x_3$ at rank 3]

Transformed Pairwise Classification Problem

[Figure: difference vectors in feature space, separated by the ranking function $f(x; w)$]

• Positive Examples (+1): $x_1 - x_2$, $x_1 - x_3$, $x_2 - x_3$

• Negative Examples (-1): $x_2 - x_1$, $x_3 - x_1$, $x_3 - x_2$

Ranking SVM

• Pairwise classification on differences of feature vectors

• Corresponding positive and negative examples

• Negative examples are redundant and can be discarded

• Hyperplane passes through the origin

• Soft Margin and Kernel can be used

• Ranking SVM = pairwise classification SVM
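The transformation from a ranked list to a pairwise classification problem can be sketched as follows. Each ordered pair of objects with different ranks yields a difference vector $x_i - x_j$ labeled $+1$ if $x_i$ is ranked higher and $-1$ otherwise. The feature values below are hypothetical; the check at the end illustrates why the negative examples are redundant, since each one is the negation of a positive example.

```python
def pairwise_examples(xs, ranks):
    """Build (x_i - x_j, z) examples; a smaller rank number means ranked higher."""
    examples = []
    for i in range(len(xs)):
        for j in range(len(xs)):
            if ranks[i] == ranks[j]:
                continue  # ties (including i == j) express no preference
            diff = [a - b for a, b in zip(xs[i], xs[j])]
            z = 1 if ranks[i] < ranks[j] else -1
            examples.append((diff, z))
    return examples

# Hypothetical feature vectors x1, x2, x3 with ranks 1, 2, 3
xs = [[3.0, 1.0], [2.0, 0.5], [0.5, 0.2]]
ranks = [1, 2, 3]

examples = pairwise_examples(xs, ranks)
positives = [d for d, z in examples if z == 1]
negatives = [d for d, z in examples if z == -1]

# Every negative example mirrors a positive one through the origin
mirrored = sorted([-v for v in d] for d in negatives)
print(mirrored == sorted(positives))
```

This symmetry is also why the separating hyperplane has no bias term: a constant offset would cancel in $f(x_i; w) - f(x_j; w)$.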


Learning of Ranking SVM

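$$
\min_{w} \sum_{i=1}^{l} \left[ 1 - z_i \langle w, x_i^{(1)} - x_i^{(2)} \rangle \right]_+ + \lambda \|w\|^2
$$

where $[s]_+ = \max(0, s)$ and $\lambda = \frac{1}{2C}$. Equivalently:

$$
\begin{aligned}
\min_{w, \xi} \quad & \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{l} \xi_i \\
\text{s.t.} \quad & z_i \langle w, x_i^{(1)} - x_i^{(2)} \rangle \ge 1 - \xi_i, \\
& \xi_i \ge 0, \quad i = 1, \ldots, l
\end{aligned}
$$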
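As an illustration of the regularized hinge-loss form of Ranking SVM, $\min_w \sum_i [1 - z_i \langle w, x_i^{(1)} - x_i^{(2)} \rangle]_+ + \lambda \|w\|^2$, the sketch below minimizes it with plain subgradient descent. This is a swapped-in optimizer chosen for brevity, not the quadratic-programming solver normally used for SVMs, and the step size, epoch count, and toy data are assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_ranking_svm(pairs, dim, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on sum_i [1 - z_i <w, d_i>]_+ + lam * ||w||^2.

    pairs: list of (d_i, z_i) with d_i = x_i^(1) - x_i^(2) and z_i in {+1, -1}.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [2 * lam * wk for wk in w]      # gradient of lam * ||w||^2
        for d, z in pairs:
            if z * dot(w, d) < 1:              # hinge term is active for this pair
                for k in range(dim):
                    grad[k] -= z * d[k]
        # step size is scaled by the number of pairs
        w = [wk - lr * gk / len(pairs) for wk, gk in zip(w, grad)]
    return w

# Hypothetical objects x1 > x2 > x3 and the positive difference vectors
xs = [[3.0, 1.0], [2.0, 0.5], [0.5, 0.2]]
pairs = [([a - b for a, b in zip(xs[i], xs[j])], 1)
         for i, j in [(0, 1), (0, 2), (1, 2)]]

w = train_ranking_svm(pairs, dim=2)
order = sorted(range(len(xs)), key=lambda i: -dot(w, xs[i]))
print(w, order)
```

At test time only the learned $w$ is needed: documents are sorted by $\langle w, x \rangle$, so ranking costs one dot product per document even though training used pairs.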


IR SVM


Problems with Ranking SVM

• Not sufficient emphasis on correct ranking on top

– r: relevant, p: partially relevant, i: irrelevant

– ranking 1: p r p i i i i

– ranking 2: r p i p i i i

– ranking 2 should be better than ranking 1

– Ranking SVM views them as the same

• Numbers of pairs vary according to queries

– q1: r p p i i i i

– q2: r r p p p i i i i i

– q1 pairs: 2*(r, p) + 4*(r, i) + 8*(p, i) = 14

– q2 pairs: 6*(r, p) + 10*(r, i) + 15*(p, i) = 31

– Ranking SVM is biased toward q2
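Both problems can be reproduced by counting pairs. In the sketch below, `misordered_pairs` counts the pairs a ranking gets wrong, which is the quantity a pairwise loss sees, and `count_pairs` counts the training pairs a query contributes. The helper names and the numeric grade encoding are illustrative assumptions.

```python
from itertools import combinations

GRADE = {"r": 2, "p": 1, "i": 0}   # relevant > partially relevant > irrelevant

def count_pairs(labels):
    """Number of document pairs with different grades (training pairs of a query)."""
    return sum(1 for a, b in combinations(labels, 2) if GRADE[a] != GRADE[b])

def misordered_pairs(ranking):
    """Pairs where a lower-graded document is placed above a higher-graded one."""
    return sum(1 for a, b in combinations(ranking, 2) if GRADE[a] < GRADE[b])

ranking1 = "p r p i i i i".split()
ranking2 = "r p i p i i i".split()
q1 = "r p p i i i i".split()
q2 = "r r p p p i i i i i".split()

# Problem 1: one misordered pair each, although ranking 1 errs at the very top
print(misordered_pairs(ranking1), misordered_pairs(ranking2))
# Problem 2: q2 contributes more than twice as many pairs as q1
print(count_pairs(q1), count_pairs(q2))
```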


IR SVM

• Solving the two problems of Ranking SVM

• Higher weight on important rank pairs: $\tau_{k(i)}$

• Normalization weight on pairs in query: $\mu_{q(i)}$

• IR SVM = Ranking SVM using modified hinge loss

Modified Hinge Loss function

$$
\min_{w} \sum_{i=1}^{l} \tau_{k(i)} \mu_{q(i)} \left[ 1 - z_i \langle w, x_i^{(1)} - x_i^{(2)} \rangle \right]_+ + \lambda \|w\|^2
$$

[Figure: modified hinge loss; horizontal axis $z f(x^{(1)} - x^{(2)})$, vertical axis Loss]

Learning of IR SVM

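$$
\min_{w} \sum_{i=1}^{l} \tau_{k(i)} \mu_{q(i)} \left[ 1 - z_i \langle w, x_i^{(1)} - x_i^{(2)} \rangle \right]_+ + \lambda \|w\|^2
$$

Equivalently:

$$
\begin{aligned}
\min_{w, \xi} \quad & \frac{1}{2} \|w\|^2 + \sum_{i=1}^{l} C_i \xi_i \\
\text{s.t.} \quad & z_i \langle w, x_i^{(1)} - x_i^{(2)} \rangle \ge 1 - \xi_i, \\
& \xi_i \ge 0, \quad i = 1, \ldots, l
\end{aligned}
$$

where $C_i = \frac{\tau_{k(i)} \mu_{q(i)}}{2 \lambda}$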
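A small sketch of the modified hinge loss may help show what the two weights do. Each pair carries a rank-pair type $k(i)$ and a query id $q(i)$; the loss multiplies its hinge term by a type weight and a query weight. The specific choices below are assumptions for illustration only: the query weight is taken as the reciprocal of the query's pair count, and the type weights are made-up numbers, not the values used in [Cao et al 2006].

```python
from collections import Counter

def ir_svm_loss(w, pairs, tau, mu, lam):
    """sum_i tau[k_i] * mu[q_i] * [1 - z_i <w, d_i>]_+ + lam * ||w||^2.

    Each pair is (d, z, k, q): difference vector, label, rank-pair type, query id.
    """
    total = lam * sum(wk * wk for wk in w)
    for d, z, k, q in pairs:
        hinge = max(0.0, 1.0 - z * sum(wk * dk for wk, dk in zip(w, d)))
        total += tau[k] * mu[q] * hinge
    return total

# Hypothetical pairs: query q1 has one pair, query q2 has three
pairs = [
    ([1.0, 0.0], 1, "r-i", "q1"),
    ([0.5, 0.5], 1, "r-p", "q2"),
    ([0.2, 0.1], 1, "p-i", "q2"),
    ([1.0, 1.0], 1, "r-i", "q2"),
]
pairs_per_query = Counter(q for _, _, _, q in pairs)
mu = {q: 1.0 / n for q, n in pairs_per_query.items()}   # assumed normalisation
tau_uniform = {"r-p": 1.0, "r-i": 1.0, "p-i": 1.0}
tau_top = {"r-p": 1.5, "r-i": 2.0, "p-i": 1.0}          # hypothetical emphasis on top grades

w0 = [0.0, 0.0]
loss_uniform = ir_svm_loss(w0, pairs, tau_uniform, mu, lam=0.01)
loss_top = ir_svm_loss(w0, pairs, tau_top, mu, lam=0.01)
print(loss_uniform, loss_top)
```

With uniform type weights and this normalization, each query contributes the same total weight at $w = 0$ regardless of how many pairs it has, which is the effect the normalization weight is meant to achieve.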


ListMLE


Plackett-Luce Model (Permutation Probability)

• Probability of permutation $\pi$ is defined as

$$
P_s(\pi) = \prod_{i=1}^{n} \frac{s_{\pi(i)}}{\sum_{j=i}^{n} s_{\pi(j)}}
$$

• Example:

$$
P_s(ABC) = \frac{s_A}{s_A + s_B + s_C} \cdot \frac{s_B}{s_B + s_C} \cdot \frac{s_C}{s_C}
$$

– First factor: P(A ranked No.1)

– Second factor: P(B ranked No.2 | A ranked No.1)

– Third factor: P(C ranked No.3 | A ranked No.1, B ranked No.2)

Properties of Plackett-Luce Model

• Objects: A, B, C

• Scores: $s_A = 5$, $s_B = 3$, $s_C = 1$

• Property 1: P(ABC) is largest, P(CBA) is smallest

• Property 2: swap B and C in ABC, P(ABC) > P(ACB)
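The two properties can be verified numerically for scores 5, 3, and 1 on objects A, B, and C. The sketch below computes the Plackett-Luce probability of every permutation, where each position's factor is the chosen object's score divided by the total score of the objects not yet placed. The function name is illustrative.

```python
from itertools import permutations

def permutation_probability(perm, scores):
    """Plackett-Luce probability of a full permutation."""
    prob = 1.0
    for i in range(len(perm)):
        remaining = sum(scores[o] for o in perm[i:])   # objects not yet placed
        prob *= scores[perm[i]] / remaining
    return prob

scores = {"A": 5.0, "B": 3.0, "C": 1.0}
probs = {"".join(p): permutation_probability(p, scores) for p in permutations("ABC")}

for name, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(name, round(p, 4))
```

The six probabilities sum to one, so the model defines a proper distribution over permutations.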


Plackett-Luce Model (Top-k Probability)

• Computation of permutation probabilities is intractable

• Top-k probability

– Defining Top-k subgroup $G(o_1 \ldots o_k)$ containing all permutations whose top-k objects are $o_1, \ldots, o_k$

$$
P_s(G(o_1 \ldots o_k)) = \prod_{i=1}^{k} \frac{s_{o_i}}{\sum_{j=i}^{n} s_{o_j}}
$$

– Time complexity of computation: from $n!$ to $n! / (n - k)!$

• Example:

$$
P_s(G(A)) = \frac{s_A}{s_A + s_B + s_C}
$$
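The top-k probability only needs the first k factors of the permutation probability. The sketch below computes it directly and checks it against a brute-force sum over all permutations in the subgroup, using hypothetical scores 5, 3, and 1 for A, B, and C.

```python
from itertools import permutations

def permutation_probability(perm, scores):
    """Plackett-Luce probability of a full permutation."""
    prob = 1.0
    for i in range(len(perm)):
        prob *= scores[perm[i]] / sum(scores[o] for o in perm[i:])
    return prob

def top_k_probability(top, scores):
    """P(G(o_1 ... o_k)): the first k positions are exactly `top`, in that order."""
    remaining = sum(scores.values())
    prob = 1.0
    for o in top:
        prob *= scores[o] / remaining
        remaining -= scores[o]          # object o is now placed
    return prob

scores = {"A": 5.0, "B": 3.0, "C": 1.0}   # hypothetical scores
p_a = top_k_probability(("A",), scores)

# Brute force: add up all permutations that start with A
brute = sum(permutation_probability(p, scores)
            for p in permutations(scores) if p[0] == "A")
print(p_a, brute)
```

The direct computation touches k factors, whereas the brute-force sum enumerates every permutation in the subgroup.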


ListMLE

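• Parameterized Plackett-Luce Model

$$
s = \exp(f(x; w)), \qquad P_s(G(x_1 \ldots x_k)) = \prod_{i=1}^{k} \frac{s_{x_i}}{\sum_{j=i}^{n} s_{x_j}}
$$

• Maximum Likelihood Estimation (minimizing the negative log-likelihood)

$$
L(w) = -\sum_{q \in Q} \log \prod_{i=1}^{k} \frac{\exp(f(x_i; w))}{\sum_{j=i}^{n} \exp(f(x_j; w))}
$$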
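For a single query, the ListMLE objective is the negative log of the top-k Plackett-Luce probability of the ground-truth order, with scores $s = \exp(f(x; w))$. The sketch below computes that quantity from the model outputs $f(x_i; w)$ listed in ground-truth order. The function name is an assumption, and the log-sum-exp shift is added only for numerical stability.

```python
import math

def listmle_loss(scores_in_true_order, k=None):
    """Negative log top-k Plackett-Luce likelihood with s = exp(f(x; w)).

    scores_in_true_order: model scores f(x_i; w), listed in ground-truth order.
    """
    n = len(scores_in_true_order)
    k = n if k is None else k
    loss = 0.0
    for i in range(k):
        tail = scores_in_true_order[i:]
        m = max(tail)                                   # shift for stability
        log_denom = m + math.log(sum(math.exp(s - m) for s in tail))
        loss -= scores_in_true_order[i] - log_denom     # -log of the i-th factor
    return loss

agree = listmle_loss([3.0, 2.0, 1.0])      # model scores follow the true order
disagree = listmle_loss([1.0, 2.0, 3.0])   # model scores reverse the true order
uniform = listmle_loss([0.0, 0.0, 0.0])    # every permutation equally likely
print(agree, disagree, uniform)
```

Summing this quantity over all training queries gives the full objective. Because it is differentiable in the scores, it can be minimized with gradient-based methods for any differentiable $f$.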


AdaRank


Listwise Loss

[Figure: training data grouped by query; for each query $q_i$, $i = 1, \ldots, m$, the feature vectors $x_{i,1}, x_{i,2}, \ldots, x_{i,n_i}$ and labels $y_{i,1}, y_{i,2}, \ldots, y_{i,n_i}$ form one list, and the loss is defined on the whole list]

AdaRank

• Optimizing exponential loss function

• Algorithm: AdaBoost-like algorithm for ranking


Loss Function of AdaRank

Any evaluation measure taking value between [-1, +1]

AdaRank Algorithm


Theoretical Results on AdaRank

• Training error will be continuously reduced during learning phase.
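Since the algorithm itself is not reproduced in this transcript, the following is a rough sketch of an AdaBoost-like ranking loop in the spirit of AdaRank as published in [Xu and Li 2007]. The assumptions are: NDCG over the full list is the evaluation measure E, each weak ranker is a single feature, the coefficient is $\alpha_t = \frac{1}{2} \ln \frac{\sum_i P_t(i)(1 + E_i)}{\sum_i P_t(i)(1 - E_i)}$, and queries are re-weighted by $\exp(-E)$ of the combined model. The toy data and the small `eps` guard are made up for illustration.

```python
import math

def ndcg(labels_in_ranked_order):
    """NDCG over the whole list, used here as the evaluation measure E."""
    def dcg(ls):
        return sum((2 ** r - 1) / math.log2(1 + i) for i, r in enumerate(ls, start=1))
    best = dcg(sorted(labels_in_ranked_order, reverse=True))
    return dcg(labels_in_ranked_order) / best if best > 0 else 0.0

def evaluate(score_fn, query):
    """E for one query: NDCG of its documents sorted by score_fn, highest first."""
    ranked = sorted(query, key=lambda xy: score_fn(xy[0]), reverse=True)
    return ndcg([y for _, y in ranked])

def adarank(queries, n_features, rounds=5, eps=1e-9):
    m = len(queries)
    weights = [1.0 / m] * m          # P_1: uniform distribution over queries
    alphas = [0.0] * n_features      # coefficient of each single-feature weak ranker

    def model(x):
        return sum(a * v for a, v in zip(alphas, x))

    for _ in range(rounds):
        # Weak ranker: the single feature with the best weighted performance
        def perf(k):
            return sum(p * evaluate(lambda x: x[k], q) for p, q in zip(weights, queries))
        k_best = max(range(n_features), key=perf)
        e = [evaluate(lambda x: x[k_best], q) for q in queries]
        num = sum(p * (1 + ei) for p, ei in zip(weights, e))
        den = sum(p * (1 - ei) for p, ei in zip(weights, e))
        alphas[k_best] += 0.5 * math.log((num + eps) / (den + eps))
        # Queries that the combined model ranks badly receive more weight
        raw = [math.exp(-evaluate(model, q)) for q in queries]
        total = sum(raw)
        weights = [r / total for r in raw]
    return alphas

# Hypothetical data: each query is a list of (feature_vector, label)
queries = [
    [([0.9, 0.1], 2), ([0.5, 0.8], 1), ([0.1, 0.5], 0)],   # feature 0 ranks this query perfectly
    [([0.2, 0.9], 2), ([0.6, 0.4], 1), ([0.8, 0.1], 0)],   # feature 1 ranks this query perfectly
]
alphas = adarank(queries, n_features=2)
print(alphas)
```

The re-weighting works at the level of queries rather than individual documents or pairs, which is what makes the loss listwise.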


Learning to Rank Theory


Learning to Rank Theory

• Pairwise Approach

– Generalization Analysis [Lan et al 2008]

• Listwise Approach

– Generalization Analysis [Lan et al 2009]

– Consistency Analysis [Xia et al 2008]


Learning to Rank Applications


Learning to Rank Applications

• Search [Burges et al 2005]

• Collaborative Filtering [Freund et al 2003]

• Key Phrase Extraction [Jiang et al 2009]


Collaborative Filtering


Item1 Item2 Item3 ...

User1 5 4

User2 1 2 2

... ? ? ?

UserM 4 3


Future Directions of Learning to Rank Research


New Issues to be Further Studied

• Learning from implicit data

– Automatically generate labeled data from implicit feedback

• Model (feature) learning

– Automatically learn features such as BM25

• Global ranking

– Using features of current document as well as relations with other documents


New Issues to be Further Studied (cont’)

• Query-dependent ranking

– Creating different ranking models for different queries (in search)

• New applications

– Machine Translation


Takeaway Message

• Learning to Rank = Machine Learning Task

• Different from classification, regression, ordinal regression

• Learning to Rank has been successfully applied to search

• Existing approaches: pointwise, pairwise, listwise

• Many open problems


Contact: [email protected]
