Carnegie Mellon Jonathan Huang Carlos Guestrin Carnegie Mellon University ICML 2010 Haifa, Israel Learning Hierarchical Riffle Independent Groupings from Rankings

Dec 18, 2015


Learning Hierarchical Riffle Independent Groupings from Rankings

Jonathan Huang, Carlos Guestrin

Carnegie Mellon University

ICML 2010, Haifa, Israel


American Psychological Association Elections

Each ballot in election data is a ranked list of candidates [Diaconis, 89]

5738 full ballots (1980 election)

5 candidates: William Bevan, Ira Iscoe, Charles Kiesler, Max Siegle, Logan Wright

[Figure: probability of each of the 120 rankings (x-axis: rankings; y-axis: probability, 0 to 0.1)]

[Figure: first-order matrix, Prob(candidate i was ranked j) (x-axis: candidates; y-axis: ranks)]

Candidate 3 has the most 1st place votes, and many last place votes
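
The first-order matrix can be computed directly from ballots. The following is a minimal sketch, assuming each ballot is stored as a tuple whose i-th entry is the 0-indexed rank given to candidate i; the function name `first_order_matrix` and the toy ballots are illustrative and are not the APA data.

```python
import numpy as np

def first_order_matrix(ballots, n):
    """Return M with M[j, i] = fraction of ballots that give candidate i rank j."""
    M = np.zeros((n, n))
    for ranks in ballots:            # ranks[i] is the (0-indexed) rank of candidate i
        for i, j in enumerate(ranks):
            M[j, i] += 1
    return M / len(ballots)

# Hypothetical toy ballots over 3 candidates (not the APA data).
toy_ballots = [(0, 1, 2), (0, 2, 1), (2, 1, 0), (1, 0, 2)]
M = first_order_matrix(toy_ballots, 3)
print(M)  # every row and every column sums to 1
```

Because every ballot is a permutation, the matrix is doubly stochastic: each candidate gets exactly one rank and each rank goes to exactly one candidate.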


Factorial Possibilities

n items/candidates means n! rankings:

n     n!             Memory required to store n! doubles
5     120            <4 kilobytes
15    1.31x10^12     1729 petabytes (!!)
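
The growth can be checked with a few lines of arithmetic. This sketch assumes 8 bytes per double (an assumption, not stated on the slide); the helper name `bytes_for_full_distribution` is illustrative. Under that assumption n = 5 needs 960 bytes and n = 15 needs about 1.05x10^13 bytes (roughly 10 terabytes), so the petabyte figure in the table is best read as an order-of-magnitude warning, or as belonging to a larger n.

```python
import math

def bytes_for_full_distribution(n, bytes_per_double=8):
    """Bytes needed to store one double per ranking of n items (8-byte doubles assumed)."""
    return math.factorial(n) * bytes_per_double

for n in (5, 10, 15, 20):
    print(n, math.factorial(n), bytes_for_full_distribution(n))
```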

Possible learning biases for taming complexity:

Parametric? (e.g., Mallows, Plackett Luce,…)

Sparsity? [Reid79, Jagabathula08]

Independence/Graphical models? [Huang09]

(Not to mention sample complexity issues…)

[Figure: probability of each ranking (x-axis: rankings; y-axis: probability, 0 to 0.1)]


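Full independence on rankings

Graphical model for joint ranking of 6 items (one node per item, e.g., "Rank of item C")

[Figure: graphical model over items A, B, C, D, E, F]

Mutual exclusivity leads to fully connected model!

[Figure: graphical model over items A, B, C, D, E, F, separated into two groups]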

{A,B,C}, {D,E,F} independent

Ranks of {A,B,C} a permutation of {1,2,3}

Ranks of {D,E,F} a permutation of {4,5,6}


First-order independence condition

[Figure: two graphical models over items A, B, C, D, E, F, "Not independent" vs. "Independent", each with its first-order matrix Prob(candidate i was ranked j) (x-axis: candidates; y-axis: ranks)]

Sparsity: any ranking putting A, B, or C in ranks 4, 5, or 6 has zero probability!

But… such sparsity unlikely to exist in real data
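
The sparsity pattern is easy to reproduce by simulation. A minimal sketch, assuming items 0-2 stand for A, B, C and items 3-5 for D, E, F, with uniform rankings inside each group (the uniform choice and the name `sample_fully_independent` are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_fully_independent(rng):
    """Ranks of items 0-2 permute {0,1,2}; ranks of items 3-5 permute {3,4,5}."""
    return np.concatenate([rng.permutation(3), 3 + rng.permutation(3)])

samples = np.array([sample_fully_independent(rng) for _ in range(2000)])

# First-order matrix: M[j, i] = fraction of samples in which item i has rank j.
M = np.zeros((6, 6))
for ranks in samples:
    M[ranks, np.arange(6)] += 1
M /= len(samples)

print(np.round(M, 2))  # the two off-diagonal 3x3 blocks are exactly zero
```

The zero blocks are the first-order signature of full independence: no sample ever places A, B, or C in the bottom three ranks.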


Drawing independent fruits/veggies

Veggies in ranks {1,2}, Fruits in ranks {3,4} (veggies always better than fruits)

Draw veggie rankings, fruit rankings independently:

Veggies: Artichoke > Broccoli

Fruits: Cherry > Dates

Form joint ranking of veggies and fruits:

[Figure: joint rankings with the veggie and fruit slots fixed, e.g., Broccoli > Artichoke > Cherry > Dates and Artichoke > Broccoli > Dates > Cherry]

Full independence: fruit/veggie positions fixed ahead of time!


Riffled independence

Riffled independence model [Huang, Guestrin, 2009]:

Draw interleaving of veggie/fruit positions (according to a distribution)

Veggies: Artichoke > Broccoli

Fruits: Cherry > Dates

[Figure: candidate interleavings of Veggie and Fruit slots, and the joint ranking of Artichoke, Broccoli, Cherry, Dates obtained by filling the drawn slots]
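
The generative step can be sketched in a few lines. This is an illustration only: it takes the relative rankings of each group as already drawn, and it assumes a uniform interleaving distribution, whereas the model allows any distribution over interleavings. The names `sample_riffle_independent` and `uniform_interleaving` are hypothetical.

```python
import random

def uniform_interleaving(p=2, q=2, rng=random):
    """Draw a slot pattern with p 'A' slots and q 'B' slots (uniform: an assumption)."""
    slots = ["A"] * p + ["B"] * q
    rng.shuffle(slots)
    return slots

def sample_riffle_independent(rank_a, rank_b, sample_interleaving):
    """Fill the drawn slots with the two relative rankings (each listed best first)."""
    slots = sample_interleaving()
    it_a, it_b = iter(rank_a), iter(rank_b)
    return [next(it_a) if s == "A" else next(it_b) for s in slots]

random.seed(0)
veggies = ["Artichoke", "Broccoli"]   # relative ranking of veggies
fruits = ["Cherry", "Dates"]          # relative ranking of fruits
joint = sample_riffle_independent(veggies, fruits, uniform_interleaving)
print(joint)
```

Whatever interleaving is drawn, the relative order inside each group is preserved; only the positions of veggie and fruit slots change.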


Riffle Shuffles

Riffle shuffle (dovetail shuffle): Cut deck into two piles. Interleave piles.

Riffle shuffles correspond to distributions over interleavings (the interleaving distribution)


American Psych. Assoc. Election (1980)

[Figure: probability of each ranking (x-axis: permutations; y-axis: probability, 0 to 0.1)]

Candidate 3 fully independent vs. Candidate 2 riffle independent

Minimize: KL(empirical || riffle indep. approx.)

Best KL split: {12345} -> {1345}, {2}


Irish House of Parliament election 2002

64,081 votes, 14 candidates

Two main parties: Fianna Fail (FF), Fine Gael (FG)

Minor parties: Independents (I), Green Party (GP), Christian Solidarity (CS), Labour (L), Sinn Fein (SF)

[Gormley, Murphy, 2006]

[Figure: map of Ireland]


Approximating the Irish Election

n=14 candidates

Major parties riffle independent of minor parties?

[Figure: first-order marginals Prob(candidate i was ranked j), “True” first order marginals vs. riffle independent approx. (x-axis: candidates FF, FF, FF, FG, FG, FG, I, I, I, I, GP, CS, SF, L)]

Sinn Fein, Christian Solidarity marginals not well captured by a single riffle independent split!


Back to Fruits and Vegetables

banana, apple, orange, broccoli, carrot, lettuce

banana, apple, orange | broccoli, carrot, lettuce

candy, cookies: on which side of the split??

Need to factor out {candy, cookies} first! (cf. Sinn Fein, Christian Solidarity)


Hierarchical Decompositions

All food items: banana, apple, orange, broccoli, carrot, lettuce, candy, cookies
  Healthy food: banana, apple, orange, broccoli, carrot, lettuce
    Fruits: banana, apple, orange
    Vegetables: broccoli, carrot, lettuce
  Junk food: candy, cookies

Fruits, Vegetables marginally riffle independent


Generative process for hierarchy

Interleave Healthy foods with Junk food
  Interleave fruits/vegetables
    Rank fruits
    Rank vegetables
  Rank Junk food

Hierarchical Riffle Independent Models:

- Encode intuitive independence constraints

- Can be learned with lower sample complexity

- Have interpretable parameters/structure
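
The generative process for a hierarchy is naturally recursive. The sketch below assumes a hierarchy encoded as nested pairs with lists of items at the leaves, and it uses uniform rankings at the leaves and uniform interleavings at the internal nodes purely as placeholders for learned distributions; `sample_hierarchy` is a hypothetical name.

```python
import random

def sample_hierarchy(node, rng):
    """node is a list of items (leaf) or a pair (left, right) of sub-hierarchies."""
    if isinstance(node, list):
        order = node[:]
        rng.shuffle(order)            # assumption: uniform ranking inside a leaf
        return order
    left, right = (sample_hierarchy(child, rng) for child in node)
    slots = ["L"] * len(left) + ["R"] * len(right)
    rng.shuffle(slots)                # assumption: uniform interleaving at this node
    it_l, it_r = iter(left), iter(right)
    return [next(it_l) if s == "L" else next(it_r) for s in slots]

fruits = ["banana", "apple", "orange"]
vegetables = ["broccoli", "carrot", "lettuce"]
junk = ["candy", "cookies"]
hierarchy = ((fruits, vegetables), junk)   # (healthy food, junk food)

ranking = sample_hierarchy(hierarchy, random.Random(0))
print(ranking)
```

Each internal node only needs an interleaving distribution over its two children, which is where the reduction in parameters comes from.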


Contributions

Structured Representation: Introduction of a hierarchical model based on riffled independence factorizations

Structure Learning Objectives: An objective function for learning model structure from ranking data

Structure Learning Algorithms: Efficient algorithms for optimizing the proposed objective


Learning a Hierarchy

Exponentially many hierarchies, each encoding a distinct set of independence assumptions:

{12345} -> {1345}, {2}; {1345} -> {13}, {45}

{12345} -> {1234}, {5}; {1234} -> {234}, {1}; {234} -> {2}, {34}

{12345} -> {245}, {13}; {245} -> {24}, {5}

{12345} -> {2345}, {1}; {2345} -> {34}, {25}

{12345} -> {1245}, {3}; {1245} -> {1}, {452}

{12345} -> {1234}, {5}; {1234} -> {234}, {1}; {234} -> {24}, {3}

{12345} -> {243}, {15}; {243} -> {34}, {2}

Problem statement: given i.i.d. ranked data, learn the hierarchy that generated the data


Learning a Hierarchy

Our Approach: top-down partitioning of item set X={1,…,n}

Binary splits at each stage

{12345} -> {1234}, {5}

{1234} -> {234}, {1}


Hierarchical Decomposition

[Figure: probability of each ranking (x-axis: permutations; y-axis: probability, 0 to 0.1)]

Minimize: KL(empirical || hierarchical model)

KL(true, best hierarchy) = 6.76e-02, TV(true, best hierarchy) = 1.44e-01

Best KL hierarchy:

{12345} -> {1345}, {2} (community psychologists)

{1345} -> {13} (research psychologists), {45} (clinical psychologists)

[Figure: “True” first order vs. learned first order matrices (x-axis: candidates 1 to 5; y-axis: ranks 1 to 5; color scale 0 to 0.25)]
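
For reference, the two discrepancy measures quoted on this slide can be computed as follows. This is a generic sketch on toy distributions: the slide does not say whether KL is in nats or bits (natural log is assumed here), and the vectors `p` and `q` are made up for illustration.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

p = [0.5, 0.5]      # toy "true" distribution (illustrative)
q = [0.25, 0.75]    # toy approximation (illustrative)
print(kl_divergence(p, q), total_variation(p, q))
```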


KL-based objective function

KL Objective: Minimize KL(true distribution || riffle independent approx)

Algorithm:

For each binary partitioning of the item set, [A,B]:
  Estimate parameters (ranking probabilities for A and B, and interleaving probabilities)
  Compute log likelihood of data

Return maximum likelihood partition

Need to search over exponentially many subsets!

If hierarchical structure of A or B is unknown, might not have enough samples to learn parameters!
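
The brute-force procedure can be sketched as below. Assumptions: each data point is an ordering (tuple of items, best first), every factor is fit by empirical frequencies, and the score is the training log likelihood, which tends to favor splits with more parameters when samples are scarce. The names `factor_loglik`, `split_loglik`, and `best_split` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def factor_loglik(keys):
    """Maximized log likelihood of a multinomial fit to the observed keys."""
    counts = Counter(keys)
    n = len(keys)
    return sum(c * math.log(c / n) for c in counts.values())

def split_loglik(data, A):
    """Log likelihood of the data under a riffle independent model with split (A, rest)."""
    rel_a = [tuple(x for x in s if x in A) for s in data]        # relative ranking of A
    rel_b = [tuple(x for x in s if x not in A) for s in data]    # relative ranking of B
    interleave = [tuple(x in A for x in s) for s in data]        # which slots hold A items
    return factor_loglik(rel_a) + factor_loglik(rel_b) + factor_loglik(interleave)

def best_split(data, items):
    """Enumerate all binary partitions (exponentially many) and keep the best-scoring one."""
    items = sorted(items)
    first, rest = items[0], items[1:]
    best = None
    # Keep `first` in A so that each unordered split {A, B} is visited once.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            A = frozenset((first,) + extra)
            score = split_loglik(data, A)
            if best is None or score > best[0]:
                best = (score, A)
    return best

toy = [(0, 1, 2), (1, 0, 2), (0, 1, 2), (2, 0, 1)]   # illustrative orderings
score, A = best_split(toy, {0, 1, 2})
print(score, sorted(A))
```

The double loop visits 2^(n-1) - 1 partitions, which is the exponential search the slide warns about.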


Finding fully independent sets by clustering

Compute pairwise mutual informations

Partition resulting graph (into sets A and B)

Why this won’t work (for riffled independence):

Pairs of candidates on opposite sides of the split can be strongly correlated in general (If I vote up Democrat, I am likely to vote down Republican)

Pairwise measures unlikely to be effective for detecting riffled independence!


Higher order measure of riffled independence

Key insight: Riffled independence means absolute rankings in A are not informative about relative rankings in B

Idea: measure mutual information between singleton rankings and pairwise rankings, e.g., between the preference over Democrat i and the relative preference over Republicans j & k

If i and (j,k) lie on opposite sides, mutual information = 0
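
A plug-in estimate of this quantity from samples might look like the following. It is a sketch under stated assumptions: each ranking is a sequence `r` with `r[x]` the rank of item x, "j above k" means `r[j] < r[k]`, and the names `mutual_information` and `triplet_mi` are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in nats from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(c / n * math.log(c * n / (px[x] * py[y])) for (x, y), c in joint.items())

def triplet_mi(rankings, i, j, k):
    """MI between the rank of item i and the event 'j is ranked above k'."""
    return mutual_information([(r[i], r[j] < r[k]) for r in rankings])

# Under the uniform distribution over 3 items, the rank of item 0 carries
# no information about the relative order of items 1 and 2.
uniform = list(permutations(range(3)))
print(triplet_mi(uniform, 0, 1, 2))
```

With finite samples the plug-in estimate is biased upward, so values near zero rather than exactly zero are what to expect on real data.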


Tripletwise objective function

Objective function (want to minimize):

[Figure: split of the items into sets A and B; triplets with all items in set A play no role in the objective]

Tripletwise measure: no longer obviously a graph cut problem…
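
The formula itself did not survive on this slide. The sketch below therefore assumes a form suggested by the surrounding slides: sum the mutual information between the rank of item i and the relative order of a pair {j, k} over all triplets with i on one side of the split and {j, k} on the other. The function names and the four-ranking toy dataset are illustrative.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in nats from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(c / n * math.log(c * n / (px[x] * py[y])) for (x, y), c in joint.items())

def triplet_objective(rankings, A, B):
    """Assumed objective: cross-split triplets only; same-side triplets are ignored."""
    total = 0.0
    for side, other in ((A, B), (B, A)):
        for i in side:
            for j, k in combinations(sorted(other), 2):
                total += mutual_information([(r[i], r[j] < r[k]) for r in rankings])
    return total

# Toy data, r[x] = rank of item x. Items {0,1} always appear as a block with 0
# above 1; items {2,3} appear as a block in either order; blocks come in either order.
rankings = [(0, 1, 2, 3), (0, 1, 3, 2), (2, 3, 0, 1), (2, 3, 1, 0)]
good = triplet_objective(rankings, {0, 1}, {2, 3})
bad = triplet_objective(rankings, {0, 2}, {1, 3})
print(good, bad)
```

On this toy data the riffle independent split {0,1} vs. {2,3} scores zero, while the mismatched split scores strictly higher, which is the behavior a minimizer would exploit.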


Efficient Splitting: Anchors heuristic

Given two elements of A, we can decide whether rest of elements are in A or B:

[Figure: anchor elements in A; each remaining element is marked Large or Small]
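
One way to read the heuristic, offered as an assumption rather than the exact procedure: for each remaining item x, estimate the mutual information between the rank of x and the relative order of the two anchors; a small value suggests x lies in B, a large value suggests x lies in A. The threshold, the function names, and the toy data below are all hypothetical.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in nats from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(c / n * math.log(c * n / (px[x] * py[y])) for (x, y), c in joint.items())

def anchors_split(rankings, items, a1, a2, threshold):
    """Assign every non-anchor item to A (large score) or B (small score)."""
    A, B = {a1, a2}, set()
    for x in items:
        if x in (a1, a2):
            continue
        score = mutual_information([(r[x], r[a1] < r[a2]) for r in rankings])
        (A if score > threshold else B).add(x)
    return A, B

# Toy data, r[x] = rank of item x: items {0,1,2} fill the top three ranks as
# 0>1>2 or 2>1>0; items {3,4} fill the bottom two ranks in either order.
rankings = [(0, 1, 2, 3, 4), (0, 1, 2, 4, 3), (2, 1, 0, 3, 4), (2, 1, 0, 4, 3)]
A, B = anchors_split(rankings, range(5), a1=0, a2=1, threshold=0.1)
print(sorted(A), sorted(B))
```

Each call costs one pass over the remaining items, so trying every candidate anchor pair stays polynomial in the number of items.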


Efficient Splitting: Anchors heuristic

In practice, anchor elements a1, a2, unknown!

Theorem: Anchors heuristic recovers riffle independent split with high probability given polynomial samples (under certain strong connectivity assumptions)


Structure learning with synthetic data

16 items, 4 items in each leaf

[Figure: log-likelihood (y-axis, -6300 to -5400) vs. log10(# samples) (x-axis, 1 to 5) for: true structure known; learned structure; random 1-chain (with learned parameters)]


Anchors on Irish election data

Irish Election hierarchy (first four splits):

{1,2,3,4,5,6,7,8,9,10,11,12,13,14} -> {1,2,3,4,5,6,7,8,9,10,11,13,14}, {12} (Sinn Fein)

{1,2,3,4,5,6,7,8,9,10,11,13,14} -> {11} (Christian Solidarity), {1,2,3,4,5,6,7,8,9,10,13,14}

{1,2,3,4,5,6,7,8,9,10,13,14} -> {2,3,5,6,7,8,9,10,14}, {1,4,13} (Fianna Fail)

{2,3,5,6,7,8,9,10,14} -> {2,5,6} (Fine Gael), {3,7,8,9,10,14} (Independents, Labour, Green)

Running time: brute force optimization 70.2s, Anchors method 2.3s

[Figure: “True” first order vs. learned first order matrices (x-axis: candidates 1 to 14; y-axis: ranks 1 to 14; color scale 0 to 0.25)]


Irish log likelihood comparison

[Figure: log-likelihood (y-axis) vs. # samples (x-axis, logarithmic scale, 50 to 2000) for: optimized 1-chain; optimized hierarchy]


Bootstrapped substructures: Irish data

[Figure: success rate (y-axis, 0 to 1) vs. # samples (x-axis, log scale, 76 to 2001) for: major party {FF, FG} leaves recovered; top partition recovered; all leaves recovered; full tree recovered]


Riffled Independence…

- is a natural notion of independence for rankings

- can be exploited for efficient inference, low sample complexity

- approximately holds in many real datasets

Hierarchical riffled independence

- captures more structure in data

- structure can be learned efficiently:
  - related to clustering and to graphical model structure learning
  - efficient algorithm & polynomial sample complexity result

Acknowledgements: Brendan Murphy and Claire Gormley provided the Irish voting datasets. Discussions with Marina Meila provided important initial ideas upon which this work is based.


Sushi ranking

Dataset: 5000 preference rankings of 10 types of sushi

Types:
1. Ebi (shrimp)
2. Anago (sea eel)
3. Maguro (tuna)
4. Ika (squid)
5. Uni (sea urchin)
6. Sake (salmon roe)
7. Tamago (egg)
8. Toro (fatty tuna)
9. Tekka-maki (tuna roll)
10. Kappa-maki (cucumber roll)

[Figure: first-order matrix Prob(sushi i was ranked j) (x-axis: sushi; y-axis: ranks)]

Fatty tuna (Toro) is a favorite!

No one likes cucumber roll!


Sushi hierarchy

{1,2,3,4,5,6,7,8,9,10} -> {2} (sea eel), {1,3,4,5,6,7,8,9,10}

{1,3,4,5,6,7,8,9,10} -> {1,3,5,6,7,8,9,10}, {4} (squid)

{1,3,5,6,7,8,9,10} -> {1,3,7,8,9,10}, {5,6} (sea urchin, salmon roe)

{1,3,7,8,9,10} -> {3,7,8,9,10}, {1} (shrimp)

{3,7,8,9,10} -> {3,8,9} (tuna, fatty tuna, tuna roll), {7,10} (egg, cucumber roll)