WEBIST 2009

Faceted Ranking In Collaborative Tagging Systems

J. I. Orlicki (1,2), P. Fierens (2), J. I. Alvarez-Hamelin (2,3)

(1) Core Security Technologies

(2) ITBA

(3) CONICET

WEBIST 2009, Lisbon, Portugal

The Problem (Faceted Reputation)

- Which Flickr photographers are the best regarding a facet, i.e., a tag set such as { sea, portugal }?

- Nodes are users/channels, edges are favorites, and tags are associated with the favorited content.

Single Ranking

- Basic approach: a single ranking followed by filtering. Scales well.

- Everything is biased towards the richer nodes; tags don't influence the ranking.

- G is filtered out, but why is D ranked worse than A regarding {sea, portugal}? Is D better than C?
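A minimal sketch of this baseline, assuming PageRank is the ranking function and that filtering keeps the users whose received favorites cover every tag of the facet. The edge format `(source, target, tags)`, the function names, and the toy data are hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {n: (1 - damping + damping * dangling) / len(nodes) for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def single_ranking(tagged_edges, facet):
    """Rank once on the full graph, then keep users that received every facet tag."""
    scores = pagerank([(src, dst) for src, dst, _ in tagged_edges])
    tags_received = {}
    for _, dst, tags in tagged_edges:
        tags_received.setdefault(dst, set()).update(tags)
    kept = [n for n in scores if facet <= tags_received.get(n, set())]
    return sorted(kept, key=lambda n: scores[n], reverse=True)


# Hypothetical toy graph: u4 is popular mostly because of "cats".
edges = [
    ("u1", "u2", {"sea"}),
    ("u3", "u2", {"portugal"}),
    ("u1", "u4", {"cats"}),
    ("u2", "u4", {"cats"}),
    ("u3", "u4", {"cats"}),
    ("u5", "u4", {"sea", "portugal"}),
]
print(single_ranking(edges, {"sea", "portugal"}))
```

In this toy graph u4 passes the filter with a single {sea, portugal} favorite and still outranks u2, because its score comes from the whole graph. That is the bias towards richer nodes described above.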



Edge-intersection, 1st gold standard

- Keep only the edges that carry the conjunction of the facet's tags, then rank.

- Adequate tag bias, slightly restrictive.
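The edge filter can be sketched as follows. The `(source, target, tags)` edge format and all names are hypothetical, and the in-degree scorer is only a stand-in (an assumption) for whatever ranking algorithm is run on the filtered graph.

```python
from collections import Counter


def edge_intersection_subgraph(tagged_edges, facet):
    """Keep only the edges whose tag set contains every tag of the facet."""
    return [(src, dst) for src, dst, tags in tagged_edges if facet <= tags]


def in_degree_rank(edges):
    """Stand-in scorer (an assumption): counts incoming edges per node."""
    return Counter(dst for _, dst in edges)


def edge_intersection_ranking(tagged_edges, facet, rank=in_degree_rank):
    """Rank the facet subgraph; ties are broken by node name."""
    scores = rank(edge_intersection_subgraph(tagged_edges, facet))
    return sorted(scores, key=lambda n: (-scores[n], n))


# Hypothetical toy data: "c" receives both tags, but never on the same edge.
edges = [
    ("u1", "a", {"sea", "portugal"}),
    ("u2", "a", {"sea", "portugal", "sunset"}),
    ("u3", "b", {"sea", "portugal"}),
    ("u1", "c", {"sea"}),
    ("u2", "c", {"portugal"}),
]
print(edge_intersection_ranking(edges, {"sea", "portugal"}))
```

Node "c" drops out because no single edge carries both tags, which illustrates why this filter is slightly restrictive.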







Node-intersection, 2nd gold standard

- Rank on the edges that carry the disjunction of the facet's tags.

- After ranking, keep only the conjunction: nodes involved in edges of every tag.

- Adequate tag bias, slightly too permissive; one tag may prevail over the other.
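A sketch of the two steps, under stated assumptions: edges are hypothetical `(source, target, tags)` triples, a node counts as "involved" in a tag when it is the target of an edge with that tag, and in-degree stands in for the real ranking algorithm.

```python
from collections import Counter


def node_intersection_ranking(tagged_edges, facet):
    # Step 1: rank on the edges that carry at least one facet tag (disjunction).
    union_edges = [(s, d, tags & facet) for s, d, tags in tagged_edges if tags & facet]
    scores = Counter(d for _, d, _ in union_edges)  # stand-in scorer (assumption)

    # Step 2: after ranking, keep the nodes that received every facet tag (conjunction).
    seen = {}
    for _, d, tags in union_edges:
        seen.setdefault(d, set()).update(tags)
    kept = [n for n in scores if seen[n] == facet]
    return sorted(kept, key=lambda n: (-scores[n], n))


# Hypothetical toy data: "c" receives each tag on a different edge.
edges = [
    ("u1", "a", {"sea", "portugal"}),
    ("u2", "a", {"sea", "portugal", "sunset"}),
    ("u3", "b", {"sea", "portugal"}),
    ("u1", "c", {"sea"}),
    ("u2", "c", {"portugal"}),
    ("u3", "d", {"sea"}),
]
print(node_intersection_ranking(edges, {"sea", "portugal"}))
```

Here "c" is kept even though its two tags never share an edge, while "d" is dropped because it only ever received "sea".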


The Scalability Problem

- The previous two algorithms don't scale for online queries.

- Another possibility is to compute singleton facets offline and merge the results online later.

- Offline time and space complexity grow linearly with #edges × #tags per edge, which scales nicely.
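The linear growth can be seen in a small sketch that splits the graph into one edge list per tag. The `(source, target, tags)` edge format and the names are hypothetical; the point is only that each (edge, tag) pair is visited and stored once.

```python
from collections import defaultdict


def edges_by_tag(tagged_edges):
    """Group edges by tag; every (edge, tag) pair is handled exactly once."""
    index = defaultdict(list)
    for src, dst, tags in tagged_edges:
        for tag in tags:
            index[tag].append((src, dst))
    return index


# Hypothetical toy data.
edges = [
    ("u1", "a", {"sea", "portugal"}),
    ("u2", "a", {"sea"}),
    ("u3", "b", {"portugal", "sunset", "sea"}),
]
index = edges_by_tag(edges)
stored = sum(len(tag_edges) for tag_edges in index.values())
print(stored, sum(len(tags) for _, _, tags in edges))
```

Both printed numbers are the same: the total size of the per-tag subgraphs equals the sum of the tags per edge.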

[Figure: # edges versus # tags on log-log axes, for YouTube and Flickr.]

Singleton facets, computed offline

- Each singleton-facet subgraph is ranked separately; after that, only the best K users are stored, where K is small.
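A minimal sketch of the storage step, assuming the per-tag scores have already been computed by the ranking algorithm. The function name, the score values, and K = 2 are hypothetical.

```python
import heapq


def top_k_per_tag(scores_by_tag, k):
    """For each tag, keep only the k best (user, score) pairs, best first."""
    return {
        tag: heapq.nlargest(k, scores.items(), key=lambda item: item[1])
        for tag, scores in scores_by_tag.items()
    }


# Hypothetical per-tag scores for three users.
scores_by_tag = {
    "sea": {"u1": 0.5, "u2": 0.3, "u3": 0.2},
    "portugal": {"u1": 0.1, "u2": 0.6, "u3": 0.3},
}
stored = top_k_per_tag(scores_by_tag, k=2)
print(stored)
```

Only K entries per tag survive, so the online merge works on short lists instead of full rankings.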


Probability-product

- Inspired by the probability independence rule: multiply the PageRank probabilities of the single tags.

            A       B       C       D       E       F
sea         0.09    0.14    0.14    0.38    0.14    0.09
  ×
portugal    0.02    0.04    0.40    0.39    0.07    0.05
  =
product     0.0018  0.0056  0.0560  0.1482  0.0098  0.0045
rank        #6      #4      #2      #1      #3      #5

- Possible bias towards the heaviest tag, eclipsing the others.
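The example can be reproduced with a few lines. The per-tag values are the ones from the example; the function name and the choice to drop users that are missing from any tag list are assumptions.

```python
import math

# Per-tag PageRank values from the example.
sea = {"A": 0.09, "B": 0.14, "C": 0.14, "D": 0.38, "E": 0.14, "F": 0.09}
portugal = {"A": 0.02, "B": 0.04, "C": 0.40, "D": 0.39, "E": 0.07, "F": 0.05}


def probability_product(*per_tag_scores):
    """Multiply each user's per-tag scores; users missing from any tag are dropped."""
    users = set.intersection(*(set(scores) for scores in per_tag_scores))
    return {u: math.prod(scores[u] for scores in per_tag_scores) for u in users}


product = probability_product(sea, portugal)
ranking = sorted(product, key=product.get, reverse=True)
print(ranking)
```

The resulting order is D, C, E, B, F, A, matching the rank row of the example.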

Rank-sum

- The lowest accumulated sum of ordinal positions gets the best rank.

            A    B    C    D    E    F
sea         #3   #2   #2   #1   #2   #3
  +
portugal    #6   #5   #2   #1   #3   #4
  =
sum         9    7    4    2    5    7
rank        #5   #4   #2   #1   #3   #4

- Avoids this kind of topic drift towards one of the tags.
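A short sketch of the merge, taking the per-tag positions of the example as input. The function names are hypothetical, and sharing a rank between equal sums is an assumption read off the example, where B and F both end at #4.

```python
# Per-tag positions from the example.
sea_pos = {"A": 3, "B": 2, "C": 2, "D": 1, "E": 2, "F": 3}
portugal_pos = {"A": 6, "B": 5, "C": 2, "D": 1, "E": 3, "F": 4}


def rank_sum(*per_tag_positions):
    """Add up each user's positions over all tags of the facet."""
    users = set.intersection(*(set(p) for p in per_tag_positions))
    return {u: sum(p[u] for p in per_tag_positions) for u in users}


def dense_rank(totals):
    """Lowest total gets rank 1; equal totals share a rank."""
    distinct = sorted(set(totals.values()))
    position = {total: i + 1 for i, total in enumerate(distinct)}
    return {u: position[t] for u, t in totals.items()}


totals = rank_sum(sea_pos, portugal_pos)
final = dense_rank(totals)
print(sorted(final.items()))
```

Because only positions are added, a tag with much larger scores cannot dominate the merged order.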

Winners-intersection

- The top W nodes (W small) of each singleton facet are used to build a new, small graph.

- W = 500 in the experiments (W = 3 in the example).

            A    B    C    D    E    F
sea         #3   #2   #2   #1   #2   #3
  ∩
portugal              #2   #1   #3
  =                   C    D    E
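The selection step of the example can be sketched as below. The per-tag positions come from the Rank-sum example; the function names, the toy edges, and the way the small graph is built (edges with both endpoints among the winners) are assumptions, since only the intersection itself is shown here.

```python
# Per-tag positions, taken from the Rank-sum example.
sea_pos = {"A": 3, "B": 2, "C": 2, "D": 1, "E": 2, "F": 3}
portugal_pos = {"A": 6, "B": 5, "C": 2, "D": 1, "E": 3, "F": 4}


def winners(positions, w):
    """Users whose position in a singleton-facet ranking is at most w."""
    return {user for user, pos in positions.items() if pos <= w}


def winners_intersection(per_tag_positions, w):
    """Users that are winners for every tag of the facet."""
    return set.intersection(*(winners(p, w) for p in per_tag_positions))


def induced_edges(edges, nodes):
    """Edges with both endpoints among the selected nodes (assumed construction)."""
    return [(src, dst) for src, dst in edges if src in nodes and dst in nodes]


selected = winners_intersection([sea_pos, portugal_pos], w=3)
# Hypothetical edges, used only to show the reduced graph.
edges = [("C", "D"), ("E", "D"), ("A", "D"), ("D", "C")]
small_graph = induced_edges(edges, selected)
print(sorted(selected), small_graph)
```

With W = 3 the selected nodes are C, D and E, as in the example; a ranking algorithm would then run on the much smaller graph.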

Experiments: comparison with Edge-intersection, OSim

(darker means better results)

More experiments (Flickr)

Conclusions

- Approximate and scalable methods exist for faceted ranking in collaborative tagging systems.

- Functional web prototype: Egg-O-Matic

http://egg-o-matic.itba.edu.ar

- Loose ends:

  - Using weighted graphs.
  - Scientific citations dataset (real egos!).
  - Industrial-sized dataset (10^7 instead of 10^5 edges).

Prototype (1/2)

Prototype (2/2, last slide, thanks!)
