EK Ch 17: Power laws and rich-get-richer phenomena (with an application of Web Spam detection: Spam, Damn Spam and Statistics)

Dec 30, 2015


jamalia-leblanc

Transcript
Page 1

EK Ch 17: Power laws and rich-get-richer phenomena

(with an application of Web Spam detection: Spam, Damn Spam and Statistics)

Page 2

Numbers

Your grades so far in this class. The weight of an apple.

The temperature in Chicago on July 4th. The height of a Dutch man. The speed of a car on I-90.

Most instances are typical. Seeing a rare number is very surprising.

These numbers are well-characterized by the average and the standard deviation.
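A minimal sketch of what "well-characterized by the average and the standard deviation" means in practice. The mean of 183 cm and standard deviation of 7 cm are illustrative assumptions, not figures from the slides, and the normal distribution is assumed only for the sake of the example.

```python
import random
import statistics


def summarize(samples):
    """Return (mean, standard deviation, fraction of samples beyond 3 std devs)."""
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    far = sum(1 for x in samples if abs(x - mean) > 3 * stdev)
    return mean, stdev, far / len(samples)


# Hypothetical heights in cm, drawn from an assumed normal distribution.
rng = random.Random(0)
heights = [rng.gauss(183.0, 7.0) for _ in range(100_000)]

mean, stdev, frac_far = summarize(heights)
print(f"mean={mean:.1f} stdev={stdev:.1f} beyond 3 stdev: {frac_far:.4%}")
```

Under these assumptions almost every sample lies within a few standard deviations of the mean, which is why a rare value is so surprising.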

Page 3

City populations

1. New York 8,310,212

2. Los Angeles 3,834,340

3. Chicago 2,836,658

230. Cambridge, MA 101,335

240. Gainesville, FL 95,447

250. McKinney, TX 54,369

A few cities with high population

Many cities with low population

Page 4

City populations

Page 5

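Power Law: The fraction f(k) of items with popularity k is proportional to $k^{-c}$ for some constant exponent c.

$f(k) \propto k^{-c}$, that is, $f(k) = a\,k^{-c}$ for some constant $a > 0$

$\log f(k) = \log\left(a\,k^{-c}\right)$

$\log f(k) = \log a - c \log k$

So on a log-log plot, a power law appears as a straight line with slope $-c$.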
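Taking logarithms of f(k) = a·k^(-c) gives a straight line in (log k, log f) with slope -c, so one simple way to estimate c is a least-squares line fit on the log-log values. The sketch below uses synthetic data with an assumed exponent of 2.1 and constant 0.5; the function name is hypothetical. On noisy empirical data this simple fit can be unreliable, especially in the sparse tail.

```python
import math


def fit_power_law_exponent(ks, fs):
    """Fit log f = log a - c log k by least squares; return (c, a)."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(f) for f in fs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


# Synthetic, noise-free data with an assumed exponent c = 2.1 and a = 0.5.
ks = list(range(1, 101))
fs = [0.5 * k ** -2.1 for k in ks]

c, a = fit_power_law_exponent(ks, fs)
print(f"estimated c={c:.3f}, a={a:.3f}")
```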

Page 6

City populations

Page 7

Number of Web page in-links (Broder+)

Page 8

Other examples

Page 9

Length of the URL’s host

Page 10

Number of host name resolutions to a single IP

Page 11

Web page out-degrees

Page 12

Web page in-degrees

Page 13

Word count variance

Page 14

Content evolution

Page 15

Cluster size

Page 16

… because they care to know ;-)

Page 17

Why does data exhibit power laws?

Imitation → Power law

Page 18

Constructing the web

1. Pages are created in order, named 1, 2, …, N.

2. When created, page j links to a page as follows:

a) With probability p, pick a page i uniformly at random from 1, …, j-1 and link to i.

b) With probability (1-p), pick a page i uniformly at random from 1, …, j-1 and link to the page that i links to.

Imitation
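The construction above can be sketched as a short simulation. This is a minimal sketch under stated assumptions: each page creates exactly one link, and because page 1 has no link to copy, the imitation step falls back to linking to page 1 directly when i = 1 (that fallback is an assumption, not something the slides specify). The function names, N = 20,000, and p = 0.3 are illustrative choices.

```python
import random
from collections import Counter


def build_web(n_pages, p, seed=0):
    """Return {j: page that j links to} for pages 2..n_pages."""
    rng = random.Random(seed)
    target = {}
    for j in range(2, n_pages + 1):
        i = rng.randint(1, j - 1)          # uniform over earlier pages
        if rng.random() < p or i not in target:
            target[j] = i                  # step (a), or assumed fallback when i is page 1
        else:
            target[j] = target[i]          # step (b): imitate the link that i made
    return target


def in_degree_fractions(target, n_pages):
    """Return {k: fraction of pages whose in-degree is exactly k}."""
    in_degree = Counter(target.values())
    histogram = Counter(in_degree[page] for page in range(1, n_pages + 1))
    return {k: histogram[k] / n_pages for k in sorted(histogram)}


N_PAGES = 20_000
web = build_web(N_PAGES, p=0.3)
fractions = in_degree_fractions(web, N_PAGES)

# Show the fraction of pages for the eight smallest in-degrees.
for k in list(fractions)[:8]:
    print(k, fractions[k])
print("largest in-degree:", max(fractions))
```

Plotting the fractions on log-log axes is one way to check whether the simulated in-degrees fall roughly on a straight line.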

Page 19

The rich get richer

2 b) With prob. (1-p), pick page i uniformly at random and link to the page that i links to

[Figure: link probabilities 1/4 and 3/4]

Page 20

The rich get richer

2 b) With prob. (1-p), pick page i uniformly at random and link to the page that i links to

Equivalently,

2 b) With prob. (1-p), choose a page with probability proportional to its current in-degree and link to it
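The equivalence can be checked on a small example with exact fractions. The sketch assumes every page considered in the copy step has exactly one outgoing link, so the uniform pick is over pages that already link somewhere; the five-page link structure and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def copy_step_distribution(target):
    """Pick a linking page i uniformly, then follow its link: P(landing on each page)."""
    n = len(target)
    dist = Counter()
    for linked_page in target.values():
        dist[linked_page] += Fraction(1, n)
    return dict(dist)


def in_degree_distribution(target):
    """Pick a page with probability proportional to its in-degree."""
    in_degree = Counter(target.values())
    total = sum(in_degree.values())
    return {page: Fraction(d, total) for page, d in in_degree.items()}


# Hypothetical web: pages 2, 3 and 5 link to page 1; page 4 links to page 2.
links = {2: 1, 3: 1, 4: 2, 5: 1}

by_copying = copy_step_distribution(links)
by_in_degree = in_degree_distribution(links)
print(by_copying)     # page 1 gets 3/4, page 2 gets 1/4
print(by_in_degree)   # the same distribution
```

Each existing link is equally likely to be the one copied, so a page's chance of receiving the new link is its share of all existing links, which is exactly its in-degree divided by the total.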

Page 21

Food for thought

Why is Harry Potter popular?

If we could re-play history, would we still read Harry Potter, or would it be some other book?

Page 22

Information cascades and the rich

Information cascade = some people get a little bit richer by chance

Rich-get-richer dynamics = those randomly rich people then get a lot richer very fast
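One way to get a feel for "re-playing history" is to run the same rich-get-richer process several times with different random seeds and compare which item ends up on top. This is an illustrative sketch, not a model taken from the slides: 10 items, 2,000 choices, p = 0.1, and eight runs are all assumed values, and the function name is hypothetical.

```python
import random


def replay_world(n_items, n_choices, p, seed):
    """Simulate one history; return how often each item was chosen."""
    rng = random.Random(seed)
    counts = [0] * n_items
    history = []
    for _ in range(n_choices):
        if not history or rng.random() < p:
            item = rng.randrange(n_items)   # independent choice, uniform over items
        else:
            item = rng.choice(history)      # imitate an earlier choice, so popular items are favored
        counts[item] += 1
        history.append(item)
    return counts


# Eight independent runs with identical items and rules, differing only in the seed.
worlds = [replay_world(10, 2000, 0.1, seed) for seed in range(8)]
winners = [max(range(10), key=counts.__getitem__) for counts in worlds]

for counts, winner in zip(worlds, winners):
    print("winner:", winner, "counts:", counts)
```

Because early random choices are amplified by imitation, the most popular item typically differs from run to run even though all items start out identical.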

Page 23

Music download site – 8 worlds

1. “Let’s go driving,” Barzin

2. “Silence is sexy,” Einstürzende Neubauten

3. “Go it alone,” Noonday Underground

10. “Picadilly Lilly,” Tiger Lillies

1. “Let’s go driving,” Barzin

2. “Silence is sexy,” Einstürzende Neubauten

3. “Go it alone,” Noonday Underground

10. “Picadilly Lilly,” Tiger Lillies

Numbers shown on the slide: 18, 3, 47, 2