Rank Aggregation Methods for the Web CS728 Lecture 11.


Web Page Ranking Methods Reviewed

• PageRank – global link analysis

• Indegree – local link analysis

• HITS – topic-based link analysis

• Voting – NNN and Correlation

• Graph distance from seed

• URL length and depth

• Text-based methods (e.g., tf*idf)

Rank Aggregation

Input rankings:

B D C A

A B D C F E

B C D A F E

“Consensus” ranking of all: B D C A F E

Notations for Ranking

• Given a universe U, an ordered list τ of a subset S of U:

τ = [x1 ≥ x2 ≥ … ≥ xd], xi in S

τ(i): position or rank of i

|τ|: number of elements

• full list: τ contains all the elements in U

• partial list: τ ranks only some of the elements in U

• top d list: all d ranked elements are above all unranked elements

• Question: when are two orderings similar? Can you give a distance measure?

Measuring Distance Between Orderings

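• Spearman’s Footrule Distance

– σ, τ: two full lists

– σ(i): rank of candidate i in σ

– F(σ, τ) = Σ_i |σ(i) − τ(i)|, the total displacement of the candidates between the two lists

• Kendall tau distance

– K(σ, τ) = the number of pairs (i, j) that σ and τ order in opposite ways, i.e., the number of pairwise disagreements between the two lists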

Example of Ordered-List Distance

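• Example

– S = {A,B,C,D,E}

– σ, τ: two full lists

Position: 1 2 3 4 5

One list: A C E D B

Other list: C A B D E

• Spearman’s Footrule Distance (displacements of A, B, C, D, E in turn)

– F(σ, τ) = 1 + 2 + 1 + 0 + 2 = 6

• Kendall tau distance (pairs ordered differently by the two lists)

– K(σ, τ) = |{(A,C), (B,D), (B,E), (D,E)}| = 4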
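The two distances can be checked mechanically. The following is a minimal Python sketch, not taken from the slides; the function names `positions`, `footrule` and `kendall_tau` are made up for illustration, and each list is written as a string with the top-ranked candidate first.

```python
from itertools import combinations


def positions(ranking):
    """Map each candidate to its 1-based position in a full list."""
    return {item: pos for pos, item in enumerate(ranking, start=1)}


def footrule(sigma, tau):
    """Spearman's footrule: sum of |sigma(i) - tau(i)| over all candidates."""
    ps, pt = positions(sigma), positions(tau)
    return sum(abs(ps[x] - pt[x]) for x in ps)


def kendall_tau(sigma, tau):
    """Kendall tau: number of candidate pairs the two lists order differently."""
    ps, pt = positions(sigma), positions(tau)
    return sum(
        1
        for x, y in combinations(sorted(ps), 2)
        if (ps[x] - ps[y]) * (pt[x] - pt[y]) < 0
    )


# The two full lists from the example on S = {A, B, C, D, E}.
list_1 = "ACEDB"
list_2 = "CABDE"
print(footrule(list_1, list_2))     # 6
print(kendall_tau(list_1, list_2))  # 4
```

Both functions are symmetric in their arguments, so it does not matter which of the two lists is called σ and which τ.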

Optimal ranking aggregation

• Optimality depends on the distance measure we use.

• Optimizing with Kendall tau distance, we obtain Kemeny optimal aggregation

• Can show it satisfies neutrality and consistency – important properties of rank aggregation functions.

• Useful but computationally hard. Kemeny optimal aggregation is NP-hard.

• Will show that footrule-optimal is in P.

Two properties relate K and F

• For any full lists σ, τ:

K(σ,τ) ≤ F(σ,τ) ≤ 2 K(σ,τ)

• So footrule-optimal aggregation gives a 2-approximation to Kemeny optimality.

• Reason: if σ is the Kemeny optimal aggregation of full lists τ1,…,τk and σ’ optimizes the footrule aggregation, then

K(σ’, τ1,…,τk) ≤ F(σ’, τ1,…,τk) ≤ F(σ, τ1,…,τk) ≤ 2 K(σ, τ1,…,τk)

Condorcet Criteria and SPAM Filters

• Condorcet Criterion

– An element of S which beats every other element in pairwise simple majority voting should be ranked first.

• Extended Condorcet Criterion (XCC):

– If most voters prefer candidate a to candidate b (i.e., the number of voters i such that τi(a) < τi(b) is at least n/2), then the aggregate ranking σ should also prefer a to b (i.e., σ(a) < σ(b)).

• XCC is effective in ‘spam-fighting’ and thus good to use in meta-search.

XCC: Not always realizable

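Example 1 (three voters, one column per voter, best candidate on top):

a b c

b a a

c c b

Pairwise majorities: a beats b, b beats c, a beats c. Realizable: σ(a) < σ(b) < σ(c)

Example 2:

a b c

b c a

c a b

Pairwise majorities: a beats b, b beats c, c beats a (a cycle). Not realizable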
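The two examples can be checked by brute force. This is a small Python sketch under stated assumptions: each voter's list is a string with the best candidate first, "most voters" is read as a strict majority, and the helper names `majority_prefers` and `xcc_order` are hypothetical. Trying every permutation is only feasible for a handful of candidates.

```python
from itertools import permutations


def majority_prefers(votes, a, b):
    """True if a strict majority of voters rank a above b."""
    wins = sum(1 for v in votes if v.index(a) < v.index(b))
    return 2 * wins > len(votes)


def xcc_order(votes):
    """Return an ordering that respects every pairwise majority, or None."""
    candidates = sorted(votes[0])
    # Collect every pair (a, b) where a majority wants a above b.
    constraints = [
        (a, b)
        for a in candidates
        for b in candidates
        if a != b and majority_prefers(votes, a, b)
    ]
    for order in permutations(candidates):
        pos = {c: i for i, c in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in constraints):
            return order
    return None


# One string per voter (the columns of the two examples).
print(xcc_order(["abc", "bac", "cab"]))  # ('a', 'b', 'c')
print(xcc_order(["abc", "bca", "cab"]))  # None: a > b > c > a is a cycle
```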

Voting Theory: Desired Properties

• Given set of candidates and voter preferences: seek an algorithm that ranks candidates which satisfies a set of desired properties

• Which combinations of properties are realizable?

• 1) Independence from Irrelevant Alternatives: the relative order of a and b in the aggregate ranking σ should depend only on the relative order of a and b in the voters' rankings τ1,…,τn.

– Ex: if τi = (a b c) changes to (a c b), the relative order of a, b in σ should not change.

Desired Properties:

• 2) Neutrality: no candidate should be favored over others.

– If two candidates switch positions in the voters' rankings τ1,…,τn, they should also switch positions in the aggregate ranking σ.

• 3) Anonymity: no voter should be favored over others.

– If two voters switch their orderings, the aggregate ranking σ should remain the same.

Desired Properties:

• 4) Monotonicity: if a voter improves the ranking of a candidate, that candidate's ranking in the aggregate σ can only improve.

• 5) Consistency: if voters are split into two disjoint sets, S and T, and both the aggregation of voters in S and the aggregation of voters in T prefer a to b, then the aggregation of all voters should also prefer a to b.

Desired Properties

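• 6) No Dictatorship: f(τ1,…,τn) ≠ τi, i.e., the aggregation function f does not simply return the ranking of one fixed voter i.

• 7) Unanimity (a.k.a. Pareto optimality):

If all voters prefer candidate a to candidate b (i.e., τi(a) < τi(b) for all i), then the aggregate ranking σ should also prefer a to b (i.e., σ(a) < σ(b)).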

Desired Properties

• 8) Democracy: satisfies the extended Condorcet criterion XCC (m = number of candidates).

– Always works for m = 2.

– Not always realizable for m ≥ 3.

• Theorem [May, 1952]: For m = 2, Democracy is the only rank aggregation function which is monotone, neutral, and anonymous.

Arrow’s Impossibility Theorem [Arrow, 1951]

• Theorem: If m ≥ 3, then the only rank aggregation function that is unanimous and independent from irrelevant alternatives is dictatorship.

– Arrow won the Nobel prize (1972)

Borda’s method

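• Easy and intuitive; several “score-based” variants (Borda, 1781)

• Violates independence from irrelevant alternatives

• Bi(c) = the number of candidates ranked below c in voter i's list

• B(c) = Σi Bi(c); candidates are sorted in decreasing order of B(c)

[Figure: the ranked lists of four voters, numbered 1 to 4, as far as they survive extraction]

1: C3, C1, ..., C7, C8, C10

2: C7, C1, ..., C8, C3, C10

3: C3, C2, ..., C7, C10, C9

4: C3, C8, ..., C1, C15, C10

Bi(C8) = 1, 2, 0, 13 for i = 1, 2, 3, 4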
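The scoring rule above, Bi(c) = number of candidates ranked below c and B(c) = Σi Bi(c), can be sketched in a few lines of Python. This is an illustration only, assuming every voter supplies a full list over the same candidates; the function names and the three sample votes are made up and do not come from the figure.

```python
def borda_scores(votes):
    """B(c): total number of candidates ranked below c, summed over voters."""
    scores = {}
    for vote in votes:
        m = len(vote)
        for pos, c in enumerate(vote):
            # A candidate at 0-based position pos has m - 1 - pos below it.
            scores[c] = scores.get(c, 0) + (m - 1 - pos)
    return scores


def borda_ranking(votes):
    """Candidates sorted by decreasing B(c); ties broken alphabetically."""
    scores = borda_scores(votes)
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical votes: one string per voter, best candidate first.
votes = ["ABCD", "BACD", "ACBD"]
print(borda_scores(votes))   # {'A': 8, 'B': 6, 'C': 4, 'D': 0}
print(borda_ranking(votes))  # ['A', 'B', 'C', 'D']
```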

Partial lists

• Handle partial lists by dividing all the excess scores equally among all unranked candidates.

Example: number of candidates = 100, number of ranked candidates = 70 (scores 100 down to 31)

=> The 30 unranked candidates share the leftover scores 1 to 30, which sum to 465, so each gets 465/30 = 15.5
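One reading of this example, sketched in Python: the ranked candidates receive scores from the number of candidates downwards (100 down to 31 here), and the leftover scores 1 to 30 are pooled and divided equally among the unranked candidates. The function name `partial_borda` and the candidate labels are hypothetical, and the pooling rule is an assumption about what "excess scores" means.

```python
def partial_borda(ranked, num_candidates):
    """Scores for one partial list under the 'share the excess' rule.

    Returns (scores of ranked candidates, score given to each unranked one).
    """
    # Top candidate gets num_candidates, the next one less, and so on.
    scores = {c: num_candidates - pos for pos, c in enumerate(ranked)}
    unranked = num_candidates - len(ranked)
    if unranked == 0:
        return scores, 0.0
    # Leftover scores are 1, 2, ..., unranked; split their sum equally.
    leftover = unranked * (unranked + 1) / 2
    return scores, leftover / unranked


ranked = [f"c{i}" for i in range(1, 71)]  # 70 ranked out of 100 candidates
scores, share = partial_borda(ranked, 100)
print(scores["c1"], scores["c70"])  # 100 31
print(share)                        # 15.5
```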

Footrule optimal aggregation

• Footrule optimal aggregation can be computed in polynomial time. It is a good approximation of Kemeny optimal aggregation (within a factor of 2, since K ≤ F ≤ 2K).

• Proof : Via minimum cost perfect matching
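The matching argument can be sketched as follows: placing candidate c at position p costs Σi |τi(c) − p|, and a minimum-cost perfect matching between candidates and positions is a footrule-optimal full list. The Python below is an illustrative sketch for full lists only; it uses a standard O(n³) Hungarian-style assignment routine written out by hand so the block has no dependencies, and all function names are made up.

```python
def min_cost_assignment(cost):
    """Hungarian algorithm on a square matrix; returns the column per row."""
    n = len(cost)
    INF = float("inf")
    u = [0] * (n + 1)      # row potentials
    v = [0] * (n + 1)      # column potentials
    match = [0] * (n + 1)  # match[j] = row (1-based) assigned to column j
    way = [0] * (n + 1)    # back-pointers along the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    col_of_row = [0] * n
    for j in range(1, n + 1):
        col_of_row[match[j] - 1] = j - 1
    return col_of_row


def footrule_optimal(votes):
    """Footrule-optimal full list for full-list votes, plus its total cost."""
    candidates = sorted(votes[0])
    n = len(candidates)
    # cost[c][p]: summed displacement if candidate c sits at position p.
    cost = [[sum(abs(v.index(c) - p) for v in votes) for p in range(n)]
            for c in candidates]
    pos_of = min_cost_assignment(cost)
    ranking = [None] * n
    for ci, p in enumerate(pos_of):
        ranking[p] = candidates[ci]
    total = sum(cost[ci][p] for ci, p in enumerate(pos_of))
    return ranking, total


# Hypothetical input: five voters, each a full list over A..F.
votes = ["ABFECD", "BCAEFD", "ACFDEB", "BFDCAE", "CABFED"]
ranking, total = footrule_optimal(votes)
print("".join(ranking), total)
```

Several orderings can tie for the minimum total footrule distance, so the particular list returned depends on how the matching routine breaks ties.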

Markov Chain method for rank aggregation.

• States = candidates

• Transitions depend on the preference orders given by voters

• Basic idea: probabilistically switch to a “better candidate”

• Rank candidates based on stationary probabilities!

Markov chain advantages

• Handling partial list and top d list by using available comparisons to infer new ones

• Handling uneven comparison and list length

• Computation efficiency

– O(NK) preprocessing, O(K) per step for about O(N) steps

Four ways to build the transition matrix

• Current state is candidate a.

• MC1: Choose uniformly from multiset of all candidates that were ranked at least as high as a by some voter.

– Probability to stay at a: ~ average rank of a.

• MC2: Choose a voter i uniformly at random and pick uniformly at random from among the candidates that the i-th voter ranked at least as high as a.

• MC3: Choose a voter i uniformly at random and pick uniformly at random a candidate b. If i-th voter ranked b higher than a, go to b. Otherwise, stay in a.

• MC4: Choose a candidate b uniformly at random. If most voters ranked b higher than a, go to b. Otherwise, stay in a.

– Rank of a ~ # of “pairwise contests” a wins.
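As an illustration of the last rule, here is a minimal Python sketch of MC4 for full lists. Two things in it are assumptions rather than part of the slides: "most voters" is read as a strict majority, and a small uniform `teleport` probability is mixed in so that the chain has a unique stationary distribution even when one candidate beats all others. The function names are made up.

```python
def mc4_transition_matrix(votes):
    """Row-stochastic MC4 matrix: from a, move to b if a majority prefers b."""
    candidates = sorted(votes[0])
    n = len(candidates)
    P = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(candidates):
        for j, b in enumerate(candidates):
            if i == j:
                continue
            b_wins = sum(1 for v in votes if v.index(b) < v.index(a))
            if 2 * b_wins > len(votes):
                P[i][j] = 1.0 / n  # b is picked with probability 1/n
        P[i][i] = 1.0 - sum(P[i])  # otherwise stay at a
    return candidates, P


def stationary(P, teleport=0.05, iters=1000):
    """Power iteration on the chain mixed with uniform jumps (assumption)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [teleport / n] * n
        for i in range(n):
            for j in range(n):
                nxt[j] += (1.0 - teleport) * pi[i] * P[i][j]
        pi = nxt
    return pi


def mc4_rank(votes):
    """Candidates by decreasing stationary probability."""
    candidates, P = mc4_transition_matrix(votes)
    pi = stationary(P)
    return [c for _, c in sorted(zip(pi, candidates), key=lambda t: (-t[0], t[1]))]


# Hypothetical votes: A wins every pairwise contest, B beats C.
print(mc4_rank(["ABC", "ACB", "BAC"]))  # ['A', 'B', 'C']
```

The other three chains differ only in how the row of the transition matrix for state a is filled in; the stationary-distribution step stays the same.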

A locally Kemeny optimal aggregation is a relaxation of Kemeny Optimality

• A locally Kemeny optimal aggregation satisfies the extended Condorcet criterion and can be computed in “k·O(n log n)” time; worst case O(n²).

• Many existing aggregation methods do not satisfy XCC.

=> Given τ1, …, τk, use your favorite aggregation method to obtain a full list μ, then apply local Kemenization to μ with respect to τ1, …, τk.

• A local Kemenization of a full list μ with respect to τ1, …, τk: compute a locally Kemeny optimal aggregation of τ1, …, τk that is maximally consistent with μ.

This approach:

(1) preserves the strengths of the initial aggregation μ.

(2) ranks non-spam above spam.

(3) gives a result that disagrees with μ on any pair (i, j) only if a majority of the τ’s endorse this disagreement.

(4) for every d, 1 ≤ d ≤ |μ|, the restriction of the output to its top d elements is a local Kemenization of the top d elements of μ.

Local Kemenization is a procedure for obtaining a locally Kemeny optimal aggregation.


How do we perform local kemenization?

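• Local Kemenization Example!

Input lists τ1, …, τ5:

A B F E C D

B C A E F D

A C F D E B

B F D C A E

C A B F E D

Initial aggregation μ: B A D C E F

Insert the elements of μ one at a time at the bottom, and move each one up while a majority of the τ’s disagree with its current order:

B

B A → A B (A>B: 3, A<B: 2; the majority disagrees with μ, so swap)

A B D (B>D: 4, B<D: 1; no swap)

A B D C → A B C D

Result: A B C F E D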
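The insertion procedure in this example can be sketched in Python. This is a minimal illustration assuming full input lists; the function names are hypothetical, μ is taken to be B A D C E F, and the five remaining six-letter lists are taken as τ1, …, τ5, which is the reading that reproduces the vote counts A>B: 3, A<B: 2 and B>D: 4, B<D: 1.

```python
def majority_prefers(votes, a, b):
    """True if strictly more voters rank a above b than b above a."""
    wins = sum(1 for v in votes if v.index(a) < v.index(b))
    return wins > len(votes) - wins


def local_kemenize(mu, votes):
    """Insert elements of mu in order, bubbling each up past its predecessor
    for as long as a majority of the voters prefer it to that predecessor."""
    result = []
    for x in mu:
        result.append(x)
        i = len(result) - 1
        while i > 0 and majority_prefers(votes, x, result[i - 1]):
            result[i - 1], result[i] = result[i], result[i - 1]
            i -= 1
    return result


taus = ["ABFECD", "BCAEFD", "ACFDEB", "BFDCAE", "CABFED"]
mu = "BADCEF"
print("".join(local_kemenize(mu, taus)))  # ABCFED
```

Each inserted element stops at the first predecessor that a majority still prefers, so neighbors in the output are never ordered against a majority.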

Experiments: meta-search

K = Kendall distance

SF = scaled footrule distance

IF = induced footrule distance

LK = Local Kemenization
