Algorithms and Incentives for Robust Ranking: Rajat Bhattacharjee, Ashish Goel (Stanford University)

Post on 22-Dec-2015


Algorithms and Incentives for Robust Ranking

Rajat Bhattacharjee

Ashish Goel

Stanford University

Algorithms and incentives for robust ranking. ACM-SIAM Symposium on Discrete Algorithms (SODA), 2007.

Incentive based ranking mechanisms. EC Workshop, Economics of Networked Systems, 2006.

Outline

• Motivation
• Model
• Incentive Structure
• Ranking Algorithm

Content: then and now

Traditional
• Content generation was centralized (book publishers, movie production companies, newspapers)
• Content distribution was subject to editorial control (paid professionals: reviewers, editors)

Internet
• Content generation is mostly decentralized (individuals create webpages, blogs)
• No central editorial control on content distribution (instead there are ranking and recommendation systems like Google, Yahoo)

Heuristics Race

• PageRank (uses link structure of the web)
• Spammers try to game the system by creating fraudulent link structures
• Heuristics race: search engines and spammers have implemented increasingly sophisticated heuristics to counteract each other
• New strategies to counter the heuristics [Gyongyi, Garcia-Molina]
• Detecting PageRank-amplifying structures: sparsest cut problem (NP-hard) [Zhang et al.]

Amplification Ratio [Zhang, Goel, …]

Consider a set S, which is a subset of V
• In(S): total weight of edges from V-S to S
• Local(S): total weight of edges from S to S
• w(S) = Local(S) + In(S)
• Amp(S) = w(S)/In(S)

High Amp(S) → S is dishonest
Low Amp(S) → S is honest

Collusion-free graph: a graph where all sets are honest
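The definitions of In(S), Local(S) and Amp(S) can be illustrated with a small sketch. The edge-list representation, the function name `amplification` and the toy graph are assumptions made for illustration only; they do not come from the slides.

```python
def amplification(edges, S):
    """Return (In(S), Local(S), Amp(S)) for a list of (source, target, weight) edges."""
    S = set(S)
    in_w = 0.0     # In(S): total weight of edges from V-S to S
    local_w = 0.0  # Local(S): total weight of edges from S to S
    for u, v, w in edges:
        if v in S:
            if u in S:
                local_w += w
            else:
                in_w += w
    # Amp(S) = w(S) / In(S), with w(S) = Local(S) + In(S)
    amp = float("inf") if in_w == 0 else (local_w + in_w) / in_w
    return in_w, local_w, amp


# Hypothetical toy graph: x and y link heavily to each other,
# while only one light edge enters the pair from outside.
edges = [
    ("a", "x", 1.0),
    ("x", "y", 10.0),
    ("y", "x", 10.0),
    ("b", "c", 1.0),
]
print(amplification(edges, {"x", "y"}))  # (1.0, 20.0, 21.0)
print(amplification(edges, {"c"}))       # (1.0, 0.0, 1.0)
```

In this toy graph the tightly interlinked pair has a much higher amplification ratio than the single page that only receives an outside link.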

Heuristics Race

• Then why do search engines work so well?
  • Our belief: because heuristics are not in public domain
• Is this “the solution”?
• Feedback/click analysis [Anupam et al.] [Metwally et al.]
  • Suffers from click spam
  • Problem of entities with little feedback
  • Too many web pages, can’t put them on top slots to gather feedback


Ranking reversal

Entity A is better than entity B, but B is ranked higher than A

Keyword: Search Engine

Our result

Theorem we would have liked to prove
• Here is a reputation system and it is robust, i.e., has no ranking reversals even in the presence of malicious behavior

Theorem we prove
• Here is a ranking algorithm and incentive structure, which when applied together imply an arbitrage opportunity for the users of the system whenever there is a ranking reversal (even in the presence of malicious behavior)

Where is the money?

Examples
• Amazon.com: better recommendations → more purchases → more revenue
• Netflix: better recommendations → increased customer satisfaction → increased registration → more revenue
• Google/Yahoo: better ranking → more eyeballs → more revenue through ads

Revenue per entity
• Simple for Amazon.com and Netflix
• For Google/Yahoo, we can distribute the revenue from a user on the web pages he looks at (other approaches possible)


Why share?

Because they will take it anyway!!!

My precious

Less compelling reasons

• Difficulty of eliciting honest feedback is well known [Resnick et al.] [Dellarocas]
• Search engine rankings are self-reinforcing [Cho, Roy]
• Strong incentive for players to game the system
  • Ballot stuffing and bad mouthing in reputation systems [Bhattacharjee, Goel] [Dellarocas]
  • Click spam in web rankings based on clicks [Anupam et al.]
  • Web structures have been devised to game PageRank [Gyongyi, Garcia-Molina]
• Problem of new entities
  • How should the system discover high quality, new entities in the system?
  • How should the system discover a web page whose relevance has suddenly changed (perhaps due to some current event)?

Outline

• Motivation
• Model
• Incentive Structure
• Ranking Algorithm

I-U Model

Inspect (I)
• User reads a snippet attached to a search result (Google/Yahoo)
• Looks at a recommendation for a book (Amazon.com)

Utilize (U)
• User goes to the actual web page (Google/Yahoo)
• Buys the book (Amazon.com)

I-U Model

Entities
• Web pages (Google/Yahoo), Books (Amazon.com)
• Each entity $i$ has an inherent quality $q_i$ (think of it as the probability that a user would utilize entity $i$, conditioned on the fact that the entity was inspected by the user)
• The qualities $q_i$ are unknown, but we wish to rank entities according to their qualities

Feedback
• Tokens (positive and negative) placed on an entity by users
• Ranking is a function of the relative number of tokens received by entities

Slots
• Placeholders for the results of a query


Sheep and Connoisseurs

• Sheep can appreciate a high quality entity when shown

• But wouldn’t go looking for a high quality entity

• Most users are sheep

• Connoisseurs will dig for a high quality entity which is not ranked high enough

• The goal of this scheme is to aggregate the information that the connoisseurs have

User response

I-U Model

User response to a typical query
• Chooses to inspect the top $j$ positions
• User chooses $j$ at random from an unknown but fixed distribution
• Utility generation event for $e_i$ occurs if the user utilizes an entity $e_i$ (assuming $e_i$ is placed among the top $j$ slots)

Formally
• Utility generation event is captured by the random variable $G_i = I_{r(i)} U_i$
• $r(i)$: rank of entity $e_i$
• $I_{r(i)}, U_i$: independent Bernoulli random variables
• $E[U_i] = q_i$ (unknown)
• $E[I_1] \ge E[I_2] \ge \dots \ge E[I_k]$ (known)
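Because the inspection and utilization variables are independent Bernoulli variables, the expected utility of an entity at rank r(i) is E[I_{r(i)}] times q_i. The sketch below checks this by simulation; the inspection probabilities, the quality value and the function names are made up for illustration.

```python
import random


def expected_utility(inspect_prob, quality, rank):
    """E[G_i] = E[I_{r(i)}] * q_i for independent Bernoulli I and U (rank is 1-based)."""
    return inspect_prob[rank - 1] * quality


def simulate_utility(inspect_prob, quality, rank, trials, rng):
    """Estimate E[G_i] by sampling the inspect and utilize events independently."""
    hits = 0
    for _ in range(trials):
        inspected = rng.random() < inspect_prob[rank - 1]
        utilized = rng.random() < quality
        hits += inspected and utilized
    return hits / trials


# Hypothetical values: E[I_1] >= E[I_2] >= E[I_3], and an entity of quality 0.4 at rank 2.
inspect_prob = [0.9, 0.5, 0.2]
print(expected_utility(inspect_prob, 0.4, rank=2))
print(simulate_utility(inspect_prob, 0.4, rank=2, trials=100_000, rng=random.Random(0)))
```

The two printed values should be close to each other, which is the product structure that the ranking algorithm later relies on.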

Outline

• Motivation
• Model
• Incentive Structure
• Ranking Algorithm

Information Markets

• View the problem as an information aggregation problem
• Float shares of entities and let the market decide their value (ranking) [Hanson] [Pennock]
• Rank according to the price set by the market
• Work best for predicting outcomes which are objective
  • Elections (Iowa electronic market)

Distinguishing features of the ranking problem
• Fundamental problem: outcome is not objective
• Revenue: because of more eyeballs or better quality?
• Eyeballs in turn depend on the price set by the market
• However, an additional lever: the ranking algorithm

Game theoretic approaches

Example: [Miller et al.]
• Framework to incentivize honest feedback
• Counter lack of objective outcomes by comparing a user’s reviews to that of his peers
• Selfish interests of a user should be in line with the desirable properties of the system

Doesn’t address malicious users
• Benefits from the system may come from outside the system as well
• Revenue from outcome of these systems might overwhelm the revenue from the system itself

Ranking mechanism: overview

Overview
• Users place tokens (positive and negative) on the entities
• Ranking is computed based on the number of tokens on the entities
• Whenever a revenue generation event takes place, the revenue is shared among the users

Ranking algorithm
• Input: feedback scores of entities
• Output: probability distribution over rankings of the entities
• Ensures that the number of inspections an entity gets is proportional to the fraction of tokens on it

Incentive structure

A token is a three tuple: $(p, u, e)$
• $p$: +1 or -1 depending on whether a token is a positive token or a negative token
• $u$: user who placed the token
• $e$: entity on which the token was placed

Net weight of the tokens a user can place is bounded, that is, $|\sum_i p_i|$ is bounded
• User cannot keep placing positive tokens without placing a negative token and vice versa
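One way to picture the token tuple and the bounded net weight is the sketch below. The `Token` class, the `can_place` helper and the numeric bound are hypothetical; the slides only state that the net weight of a user's tokens is bounded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    p: int  # +1 for a positive token, -1 for a negative token
    u: str  # user who placed the token
    e: str  # entity on which the token was placed


def can_place(existing, new_token, bound):
    """True if the user's net token weight stays within the assumed bound after placing new_token."""
    net = sum(t.p for t in existing if t.u == new_token.u) + new_token.p
    return abs(net) <= bound


placed = [Token(+1, "alice", "A"), Token(+1, "alice", "B")]
print(can_place(placed, Token(+1, "alice", "C"), bound=2))  # False: net weight would be 3
print(can_place(placed, Token(-1, "alice", "C"), bound=2))  # True: net weight would be 1
```

With a bound like this, a user who has used up the positive allowance has to place a negative token somewhere before promoting another entity.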

User account

Each user has an account
• Revenue shares are added or deducted from a user’s account
• Withdrawal is permitted but deposits are not
  • Users can make profits from the system but not gain control by paying
• If a user’s share goes negative: remove the user from the system for some pre-defined time

Two pre-defined system parameters: a fraction smaller than 1 and a parameter $s > 1$
• The fraction is the share of revenue that the system distributes as incentives to the users
• Parameter $s$ will be set later

Revenue share

Suppose a revenue generation event takes place for an entity $e$ at time $t$
• $R$: revenue generated

For each token $i$ placed on entity $e$
• $a_i$ is the net weight (positive - negative) of tokens placed on entity $e$ before token $i$ was placed on $e$
• The revenue shared by the system with the user who placed token $i$ is proportional to $p_i R / a_i^{s}$
• Adds up to at most $R$
• Negative token: the revenue share is negative, deduct from the user’s account
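The revenue-share rule can be sketched as follows, under two assumptions that are not in the slides. First, a_i is taken as the net weight before token i plus one, so that the first token's share stays finite. Second, the proportionality constant is passed in as `scale`; for s = 2 the example uses 6/π², because the sum of 1/a² over all positive integers a is π²/6, which keeps the positive shares below R.

```python
import math


def revenue_shares(token_signs, R, s, scale):
    """Shares proportional to p_i * R / a_i**s for tokens on one entity, in placement order.

    token_signs: list of +1 / -1 values.
    a_i = (net weight before token i) + 1; the +1 is an assumption of this sketch.
    """
    shares = []
    net = 0
    for p in token_signs:
        a = net + 1
        if a < 1:
            # This sketch only covers entities whose net weight never goes negative.
            raise ValueError("net token weight went negative")
        shares.append(scale * p * R / a ** s)
        net += p
    return shares


# Three positive tokens followed by one negative token, revenue R = 100, s = 2.
shares = revenue_shares([+1, +1, +1, -1], R=100.0, s=2, scale=6 / math.pi ** 2)
print(shares)
```

Earlier tokens earn more than later ones, the negative token produces a negative share, and a token's share does not depend on tokens placed after it.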


Revenue share

Some features
• Parameter $s$ controls relative importance of tokens placed earlier
• Tokens placed after token $i$ have no bearing on the revenue share of the user who placed token $i$
• Hence $s$ is strictly greater than 1
• Incentive for discovery of high quality entities
• Hence the choice of diminishing rewards
• Emphasis is on making the process as implicit as possible

Resistance to racing
• The system shouldn’t allow a repeated cycle of actions which pushes A above B and then B above A and so on
• We can add a more explicit feature by multiplying any negative revenue by $(1+\epsilon)$, where $\epsilon$ is an arbitrarily small positive number

Ranking by quality

• Either the entities are ranked by quality, or there exists a profitable arbitrage opportunity for the users in correcting the ranking

Ranking reversal: a pair of entities $(i, k)$ such that $q_i < q_k$ and $\tau_i > \tau_k$
• $q_i, q_k$: quality of entity $i$ and $k$ resp.
• $\tau_i, \tau_k$: number of tokens on entity $i$ and $k$ resp.

Revenue/utility generated by the entity: $f(r, q)$
• $r$: relative number of tokens placed on an entity
• $q$: quality of the entity
• For the I-U Model, our ranking algorithm ensures $f(r, q)$ is proportional to $qr$

Objective: A ranking reversal should present a profitable arbitrage opportunity
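The definition of a ranking reversal can be written as a simple pairwise check. The qualities are unknown to the real system, so the sketch below is only a way to read the definition; the entity names and numbers are invented.

```python
def ranking_reversals(quality, tokens):
    """Return pairs (i, k) where i has lower quality than k but more tokens than k."""
    names = list(quality)
    return [
        (i, k)
        for i in names
        for k in names
        if quality[i] < quality[k] and tokens[i] > tokens[k]
    ]


# Hypothetical qualities and token counts.
quality = {"A": 0.2, "B": 0.5, "C": 0.9}
tokens = {"A": 30, "B": 10, "C": 20}
print(ranking_reversals(quality, tokens))  # [('A', 'B'), ('A', 'C')]
```

Here entity A holds the most tokens despite having the lowest quality, so it is involved in both reversals.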


Arbitrage

There exists a pair of entities A and B

Placing a positive token on A and placing a negative token on B

The expected profit from A is more than the expected loss from B


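Proof (for separable revenue functions)

Notation: $\tau_i$ is the number of tokens on entity $i$, and $r_i = \tau_i (\sum_l \tau_l)^{-1}$, $r_k = \tau_k (\sum_l \tau_l)^{-1}$ are the relative numbers of tokens

Suppose $f(r_i, q_i)\,\tau_i^{-s} < f(r_k, q_k)\,\tau_k^{-s}$
• Then it is profitable to put a negative token on entity $i$ and a positive token on entity $k$

Assumption: $f$ is separable, that is, $f(r, q) = q\,r^{\alpha}$ for some exponent $\alpha$
• Choose parameter $s$ greater than $\alpha$

Consider a ranking reversal: $q_i < q_k$ and $\tau_i > \tau_k$
• $f(r_i, q_i)\,\tau_i^{-s} < f(r_i, q_k)\,\tau_i^{-s}$ ($f$ is increasing in $q$)
• $f(r_i, q_k)\,\tau_i^{-s} = q_k r_i^{\alpha} \tau_i^{-s} = q_k \tau_i^{\alpha - s} (\sum_l \tau_l)^{-\alpha}$ (definition of separable function)
• Similarly, $f(r_k, q_k)\,\tau_k^{-s} = q_k r_k^{\alpha} \tau_k^{-s} = q_k \tau_k^{\alpha - s} (\sum_l \tau_l)^{-\alpha}$
• However, $q_k \tau_i^{\alpha - s} (\sum_l \tau_l)^{-\alpha} < q_k \tau_k^{\alpha - s} (\sum_l \tau_l)^{-\alpha}$ ($\tau_i > \tau_k$ and $s > \alpha$)

Hence, $f(r_i, q_i)\,\tau_i^{-s} < f(r_k, q_k)\,\tau_k^{-s}$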

Proof (I-U Model)

Here $\tau_i$ denotes the number of tokens on entity $i$

• The rate at which revenue is generated by entity $i$ ($k$) is proportional to $q_i \tau_i$ ($q_k \tau_k$); this is ensured by our ranking algorithm
• Rate at which incentives are generated by placing a positive token on entity $k$ is $q_k \tau_k / \tau_k^{s}$
• Loss due to placing a negative token on entity $i$ is $q_i \tau_i / \tau_i^{s}$
• If $s > 1$, then $q_k \tau_k^{1-s} > q_i \tau_i^{1-s}$, because
  • $q_i < q_k$ (ranking reversal)
  • $\tau_i > \tau_k$ (ranking reversal)

Thus a profitable arbitrage opportunity exists in correcting the system
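A quick numeric check of the inequality in the I-U Model proof, writing tau for the number of tokens on an entity. The qualities and token counts below are invented; the second call deliberately uses s below 1 to show why the proof needs s > 1.

```python
def arbitrage_margin(q_i, tau_i, q_k, tau_k, s):
    """Gain rate from a positive token on k minus loss rate from a negative token on i.

    Uses the I-U Model rates q * tau / tau**s = q * tau**(1 - s).
    """
    gain = q_k * tau_k ** (1 - s)
    loss = q_i * tau_i ** (1 - s)
    return gain - loss


# Ranking reversal: entity i has lower quality but more tokens than entity k.
print(arbitrage_margin(q_i=0.2, tau_i=1000, q_k=0.6, tau_k=100, s=2) > 0)    # True
# With s < 1 the same reversal need not be profitable to correct.
print(arbitrage_margin(q_i=0.2, tau_i=1000, q_k=0.6, tau_k=100, s=0.5) > 0)  # False
```

With s = 2 the margin is positive, so correcting the reversal pays; with s = 0.5 the heavily tokened entity still yields more per token, and the incentive points the wrong way.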

Outline

• Motivation
• Model
• Incentive Structure
• Ranking Algorithm

Naive approach

• Order the entities by the net number of tokens they have
• Problem? Incentive for manipulation

Example:
• Slot 1: 1,000,000 inspections
• Slot 2: 500,000 inspections
• Entity 1: 1000 tokens
• Entity 2: 999 tokens
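The numbers in this example can be worked through directly. The sketch compares the naive ordering with an allocation in which inspections are proportional to token counts, which is the property the slides ask of the ranking algorithm; the variable names are illustrative.

```python
slot_inspections = [1_000_000, 500_000]
tokens = {"Entity 1": 1000, "Entity 2": 999}

# Naive approach: sort by token count and hand out slots in that order.
ranked = sorted(tokens, key=tokens.get, reverse=True)
naive = dict(zip(ranked, slot_inspections))

# Proportional allocation: each entity's inspections follow its fraction of tokens.
total_tokens = sum(tokens.values())
total_inspections = sum(slot_inspections)
proportional = {e: total_inspections * t / total_tokens for e, t in tokens.items()}

print(naive)         # one extra token doubles the inspections
print(proportional)  # nearly equal inspections for nearly equal token counts
```

Under the naive ordering a single token is worth 500,000 inspections, which is the incentive for manipulation; under the proportional allocation it is worth only a few hundred.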

Ranking Algorithm

Proper ranking
• If entity $e_1$ has more positive feedback than entity $e_2$, then if the user chooses to inspect the top $t$ slots (for any $t$), the probability that $e_1$ shows up among the top $t$ slots should be higher than the probability that $e_2$ shows up among the top $t$ slots
• Random variable $X_e$ gives the position of entity $e$
• Entity $e_1$ dominates $e_2$ if for all $t$, $\Pr[X_{e_1} \le t] \ge \Pr[X_{e_2} \le t]$
• Proper ranking: if the feedback score of $e_1$ is more than the feedback score of $e_2$, then $e_1$ dominates $e_2$

Distribution returned by the algorithm is a proper ranking
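The dominance condition compares cumulative position probabilities. A minimal sketch, assuming each entity's position distribution is given as a list whose entry j is the probability of landing in slot j+1 (the function name and the numbers are illustrative):

```python
from itertools import accumulate


def dominates(pos_dist_1, pos_dist_2, tol=1e-12):
    """True if Pr[X_e1 <= t] >= Pr[X_e2 <= t] for every prefix length t."""
    cdf_1 = accumulate(pos_dist_1)
    cdf_2 = accumulate(pos_dist_2)
    return all(a >= b - tol for a, b in zip(cdf_1, cdf_2))


# Hypothetical position distributions over three slots.
e1 = [0.6, 0.3, 0.1]
e2 = [0.3, 0.4, 0.3]
print(dominates(e1, e2))  # True
print(dominates(e2, e1))  # False
```

A proper ranking requires this relation to hold for every pair of entities ordered by feedback score.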

Majorized case

• $p$: vector giving the normalized expected inspections of slots
  • $S = E[I_1] + E[I_2] + \dots + E[I_k]$
  • $p = (E[I_1]/S, E[I_2]/S, \dots, E[I_k]/S)$
• $\tau$: vector giving the normalized number of tokens on entities
• Special case: $p$ majorizes $\tau$
  • For all $i$, the sum of the $i$ largest components of $p$ is at least the sum of the $i$ largest components of $\tau$
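The majorization condition can be tested with prefix sums of the sorted vectors. In the sketch below the shorter vector is padded with zeros, since there are typically more entities than slots; that padding, the tolerance and the example numbers are assumptions of this sketch, and `tau` stands for the normalized token vector.

```python
def majorizes(p, tau, tol=1e-12):
    """True if p majorizes tau: every top-i prefix sum of p is >= that of tau, with equal totals."""
    n = max(len(p), len(tau))
    a = sorted(p, reverse=True) + [0.0] * (n - len(p))
    b = sorted(tau, reverse=True) + [0.0] * (n - len(tau))
    sum_a = sum_b = 0.0
    for x, y in zip(a, b):
        sum_a += x
        sum_b += y
        if sum_a < sum_b - tol:
            return False
    return abs(sum_a - sum_b) <= tol


# Hypothetical slot inspection expectations and token counts.
expected_inspections = [0.9, 0.5, 0.1]
S = sum(expected_inspections)
p = [x / S for x in expected_inspections]

token_counts = [40, 30, 20, 10]
tau = [t / sum(token_counts) for t in token_counts]

print(majorizes(p, tau))  # True: slot weights are more concentrated than token shares
print(majorizes(tau, p))  # False
```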

Majorized case

• Typically, the importance of top slots in a ranking system is far higher than the lower slots
  • Rapidly decaying tail
• The number of entities is an order of magnitude more than the number of significant slots
  • Heavy tail
• Hence for web ranking $p$ majorizes $\tau$ (the normalized token vector)
• We believe for most applications $p$ majorizes $\tau$
• Restrict to the majorized case here
• The details of the general case are in the paper

Hardy, Littlewood, Pólya

• Theorem [Hardy, Littlewood, Pólya]

• The following two statements are equivalent: (1) The vector $x$ is majorized by the vector $y$, (2) There exists a doubly stochastic matrix $D$ such that $x = Dy$

• Interpret $D_{ij}$ as the probability that entity $i$ shows up at position $j$

• This ensures that the number of inspections that an entity gets is directly proportional to its feedback score

Doubly stochastic matrix: $D_{ij} \ge 0$, $\sum_j D_{ij} = 1$, $\sum_i D_{ij} = 1$
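To make the interpretation of D concrete, the sketch below checks that a small matrix is doubly stochastic and computes x = Dy, where y holds the normalized slot inspections. The matrix, the vector and the helper names are invented for illustration.

```python
def is_doubly_stochastic(D, tol=1e-9):
    """Check non-negativity, unit row sums and unit column sums of a square matrix."""
    n = len(D)
    nonneg = all(x >= 0 for row in D for x in row)
    rows_ok = all(abs(sum(row) - 1) <= tol for row in D)
    cols_ok = all(abs(sum(D[i][j] for i in range(n)) - 1) <= tol for j in range(n))
    return nonneg and rows_ok and cols_ok


def apply_matrix(D, y):
    """x_i = sum_j D[i][j] * y[j]: the normalized inspections entity i receives."""
    return [sum(d * y_j for d, y_j in zip(row, y)) for row in D]


# D[i][j]: probability that entity i shows up at position j (hypothetical values).
D = [
    [0.5, 0.5, 0.0],
    [0.5, 0.25, 0.25],
    [0.0, 0.25, 0.75],
]
y = [0.7, 0.2, 0.1]  # normalized expected inspections of the three slots

print(is_doubly_stochastic(D))  # True
print(apply_matrix(D, y))       # approximately [0.45, 0.425, 0.125]
```

The resulting vector is flatter than y, as the theorem predicts: it is majorized by y.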

Birkhoff von Neumann Theorem

• The Hardy, Littlewood, Pólya theorem on majorization doesn’t guarantee that the ranking we obtain is proper
  • We present a version of the theorem which takes care of this
• Theorem [Birkhoff, von Neumann]
  • An $n \times n$ matrix is doubly stochastic if and only if it is a convex combination of permutation matrices
  • Convex combination of permutation matrices → distribution over rankings
• Algorithms for computing the Birkhoff von Neumann distribution
  • $O(m^2)$ [Gonzalez, Sahni]
  • $O(mn \log K)$ [Gabow, Kariv]
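The decomposition into permutation matrices can be illustrated with a simple greedy procedure. This is not the Gonzalez-Sahni or Gabow-Kariv algorithm cited above: it is a brute-force sketch that searches all permutations, so it only suits very small matrices, and it uses exact fractions to avoid rounding residue. The function name and the example matrix are assumptions.

```python
from fractions import Fraction
from itertools import permutations


def birkhoff_decomposition(D):
    """Greedily split a doubly stochastic matrix into [(weight, permutation)].

    permutation[i] is the position assigned to entity i. Brute force, small n only.
    """
    n = len(D)
    M = [[Fraction(x) for x in row] for row in D]
    parts = []
    while any(x > 0 for row in M for x in row):
        # Birkhoff's theorem guarantees a permutation supported on positive entries.
        perm = next(
            pi for pi in permutations(range(n))
            if all(M[i][pi[i]] > 0 for i in range(n))
        )
        weight = min(M[i][perm[i]] for i in range(n))
        for i in range(n):
            M[i][perm[i]] -= weight
        parts.append((weight, perm))
    return parts


F = Fraction
D = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 2), F(1, 4), F(1, 4)],
    [F(0), F(1, 4), F(3, 4)],
]
parts = birkhoff_decomposition(D)
for weight, perm in parts:
    print(weight, perm)
```

Each pair is a ranking together with the probability of showing it, so drawing a ranking from these weights reproduces the position probabilities in D.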

Conclusion

Theorem
• Here is a ranking algorithm and incentive structure, which when applied together imply an arbitrage opportunity for the users of the system whenever there is a ranking reversal

Resistance to gaming
• We don’t make any assumptions about the source of the error in ranking (benign or malicious)
• So by the same argument the system is resistant to gaming as well

Resistance to racing


Thank You
