ITEC, Klagenfurt University, Austria (http://www.uni-klu.ac.at)
May 19, 2015
Department for Information Technology, Klagenfurt University, Austria
What is a Hype and Where Can I Get One?
Mathias Lux
What is this about …
● Power Laws & Pareto Distributions
● Just a Theory?
● Conclusions
Photo by betta_design, http://www.flickr.com/photos/betta_design/2200198472/
The Long Tail
● Common for certain distributions
  • Zipf's Law
  • Power Law
  • Pareto Distribution
● In Web 2 Context
  • Chris Anderson …
maitland 82 - http://www.flickr.com/photos/maitland82/346065497/
Zipf's Law
● Few events occur often, many occur rarely
  o P_n ~ 1/n^a, where P_n is the frequency of the n-th ranked item and a is close to 1
● Prominent examples
  o Ranking of words in documents
  o Ranking of cities by size
  o Ranking of movies by cinema tickets sold
  o … and many more
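The relation P_n ~ 1/n^a can be made concrete with a minimal sketch. The function name `zipf_frequencies`, the collection size of 1000 items, and the choice a = 1 are illustrative assumptions, not values from the talk.

```python
def zipf_frequencies(n_items, a=1.0):
    """Return normalized frequencies P_n ~ 1/n^a for ranks 1..n_items."""
    weights = [1.0 / (rank ** a) for rank in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical collection of 1000 ranked items with exponent a = 1.
freqs = zipf_frequencies(1000, a=1.0)

# With a = 1, the top-ranked item occurs twice as often as the second one.
ratio_first_to_second = freqs[0] / freqs[1]

# Share of all occurrences covered by the ten top-ranked items.
top10_share = sum(freqs[:10])
print(f"rank1/rank2 = {ratio_first_to_second:.1f}, top-10 share = {top10_share:.2f}")
```

Under these assumptions, ten out of a thousand items already account for well over a third of all occurrences, which is the "few events occur often" part of the statement.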
Zipf's Law
● Plot of the word frequency in Wikipedia
  o Most popular: the, of, and
From http://en.wikipedia.org/wiki/Zipf's_law
Pareto Distribution
● 80:20 Rule
● Economics
● Continuous (Zipf is discrete)
● Practical issues
  o Time Management, …
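The 80:20 rule can be tied to a specific Pareto shape parameter. The sketch below relies on the standard Lorenz-curve result that, for a Pareto distribution with shape alpha > 1, the top fraction p of the population holds the share p^(1 - 1/alpha) of the total. The names `top_share` and `alpha_80_20` are illustrative and not taken from the talk.

```python
import math


def top_share(p, alpha):
    """Share of the total held by the top fraction p of a Pareto(alpha) population."""
    if alpha <= 1:
        raise ValueError("the mean is infinite for alpha <= 1, so shares are undefined")
    return p ** (1.0 - 1.0 / alpha)


# Shape parameter for which the top 20% hold exactly 80% of the total.
alpha_80_20 = math.log(5) / math.log(4)

print(f"alpha = {alpha_80_20:.3f}")
print(f"share of top 20% = {top_share(0.2, alpha_80_20):.2f}")
```

Larger values of alpha give a less skewed distribution, so the top 20% hold a smaller share than 80%.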
Power Law
● Made famous by Albert-László Barabási
  o Scale-free networks (web, power supply, …)
  o In-degree of web sites, etc.
● Actually defines a class of distributions
  o f(x) = a*x^b + e
● Pareto and Zipf are part of the group
How to detect a power law?
● Simple empirical tests
  o Draw points on a log-log plot
  o Is it a "straight line"?
[Figure: "Some Power Law", y plotted against x, shown once on log-log axes and once on linear axes]
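The "straight line" check can be sketched as an ordinary least-squares fit on the logarithms of both axes. The data below are synthetic, the exponent -0.85 is an arbitrary choice for illustration, and `loglog_fit` is a hypothetical helper, not something shown in the talk.

```python
import math


def loglog_fit(xs, ys):
    """Least-squares line through (log x, log y).

    Returns (slope, intercept, r_squared). For y = a * x^b the slope
    estimates b and exp(intercept) estimates a.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    syy = sum((v - mean_y) ** 2 for v in ly)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic power law y = x^-0.85 on x = 1..100 (assumed values).
xs = list(range(1, 101))
ys = [x ** -0.85 for x in xs]

slope, intercept, r_squared = loglog_fit(xs, ys)
print(f"slope = {slope:.2f}, r^2 = {r_squared:.3f}")
```

A straight line on the log-log plot is only a first indication. Other heavy-tailed distributions can also look roughly linear over a limited range, which is why statistical tests are worth adding.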
How to detect a power law?
● Statistical Means
  o E.g. KS-Test, Chi-Square Test
  o Open research issue …
    • See e.g. Clauset, A., Shalizi, C.R., Newman, M.E.J.: Power-law distributions in empirical data. arXiv:0706.1062v1 (2007)
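A minimal sketch of the statistical route, loosely following the idea in Clauset et al.: estimate the exponent by maximum likelihood and measure the Kolmogorov-Smirnov distance between the data and the fitted model. The sketch assumes a continuous power law with a known lower bound `xmin`. The sample is synthetic, and the helper names are hypothetical.

```python
import math
import random


def fit_alpha(xs, xmin):
    """Continuous maximum-likelihood estimate of alpha for p(x) ~ x^-alpha, x >= xmin."""
    tail = [x for x in xs if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)


def ks_distance(xs, xmin, alpha):
    """Largest gap between the empirical CDF of the tail and the fitted power-law CDF."""
    tail = sorted(x for x in xs if x >= xmin)
    n = len(tail)
    worst = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        worst = max(worst, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return worst


# Synthetic sample drawn by inverse-transform sampling (assumed parameters).
rng = random.Random(42)
true_alpha, xmin = 2.5, 1.0
sample = [xmin * (1.0 - rng.random()) ** (-1.0 / (true_alpha - 1.0))
          for _ in range(5000)]

alpha_hat = fit_alpha(sample, xmin)
distance = ks_distance(sample, xmin, alpha_hat)
print(f"alpha_hat = {alpha_hat:.2f}, KS distance = {distance:.3f}")
```

The cited paper goes further than this sketch: it also estimates `xmin` from the data and derives a p-value by resampling, both of which are omitted here.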
A note on plots …
Taken from phun.org, thanks to Enzo Nadrag
A note on statistical means …
Taken from http://www.phun.org/newspics/funny_friday/2538.jpg, thanks to Enzo Nadrag
Zipf, Pareto & Power Law: Conclusions
● They emerge when people are involved
● They have interesting characteristics
  o The mean carries virtually no information
  o Area under the curve (cf. Amazon's long tail strategy)
● Power laws emerge somehow …
  o Multiple generative models (preferential attachment, memory kernels, etc.)
  o No one knows for sure
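Preferential attachment, one of the generative models named above, can be sketched in a few lines. This is a simplified variant in which every new node creates exactly one link. The network size, the seed, and the function name are assumptions made for illustration.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes, seed=0):
    """Grow a network where each new node links to one existing node,
    chosen with probability proportional to that node's current degree.

    Returns a Counter mapping node id to degree.
    """
    rng = random.Random(seed)
    # Each edge adds both of its endpoints to this list, so a uniform
    # draw from the list is a degree-proportional draw over nodes.
    endpoints = [0, 1]  # start with a single edge between nodes 0 and 1
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)
        endpoints.extend((new_node, target))
    return Counter(endpoints)


degrees = preferential_attachment(2000)
values = sorted(degrees.values())
mean_degree = sum(values) / len(values)
median_degree = values[len(values) // 2]
max_degree = values[-1]
print(f"mean = {mean_degree:.2f}, median = {median_degree}, max = {max_degree}")
```

Most nodes end up with a single link while a few hubs collect many, so the mean of about 2 describes neither the typical node nor the hubs.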
Is this just theory?
● Basically: YES!
● But there are related practical questions
  o Are you using Flickr?
    • How many "interesting" photos did you publish?
    • How many views do your photos have?
  o Imagine you publish a video on YouTube
    • What are the chances that your video is a big hit?
    • How to "help out" the process of getting a big hit?
    • Can one distinguish between hit and flop?
Is this just theory? (2)
● More related practical questions
  o Do you have a website?
    • How to "flatten" resource popularity?
    • How to select popular resources (e.g. for caching, adaptation, preprocessing)?
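The caching question can be illustrated with a minimal sketch: if resource popularity follows Zipf's law, a small cache holding only the most popular resources already serves a large share of requests. The catalogue size of 10,000, the exponent a = 1, and the name `zipf_hit_rate` are assumptions made for illustration.

```python
def zipf_hit_rate(n_resources, cache_size, a=1.0):
    """Fraction of requests served by a cache holding the cache_size most
    popular resources, assuming request popularity follows 1/rank^a."""
    weights = [1.0 / (rank ** a) for rank in range(1, n_resources + 1)]
    return sum(weights[:cache_size]) / sum(weights)


# Hypothetical site with 10,000 resources.
for size in (10, 100, 1000):
    rate = zipf_hit_rate(10_000, size)
    print(f"cache of {size:>4} resources serves {rate:.0%} of requests")
```

Under these assumptions, caching 1% of the resources covers roughly half of all requests, and each further tenfold increase in cache size adds less than the previous one.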
Big hits on YouTube
© 2007 by Aigner Thomas and Oraze Manuel
Getting popular …
1. Starting with the first view (user)
2. Some other users find the same resource
3. They point others to it
  o Blogging, Digging, word of mouth
  o Multiplier of information (cf. Metcalfe's law)
4. Number of views (users) "explodes"
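The steps above can be sketched as a deliberately simple toy model: every viewer brings in a fixed average number of new viewers in the next round. The model, the multiplier values, and the function name are assumptions for illustration and are not derived from the data in the talk.

```python
def cumulative_views(multiplier, rounds, first_viewers=1.0):
    """Cumulative expected views per round when every viewer brings in
    `multiplier` new viewers on average in the following round."""
    total = 0.0
    current = first_viewers
    history = []
    for _ in range(rounds):
        total += current
        history.append(total)
        current *= multiplier
    return history


flop = cumulative_views(0.8, 20)  # referrals die out
hit = cumulative_views(1.5, 20)   # referrals compound

print(f"after 20 rounds: flop = {flop[-1]:.1f} views, hit = {hit[-1]:.0f} views")
```

In this toy model the view count only "explodes" when the multiplier exceeds 1. Below that threshold the total levels off no matter how long one waits.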
Some graphs …
Data from del.icio.us
Shows
  • bookmarks / day
  • relative user count
Observations
● There is an initial bend in the curve
● The mean number of users at the bend is rather small
  o Around 50
● There are outliers
  o Google Video was doomed to be a success
Conclusions
● If there is a bend …
  o Chances are better for a big hit.
● Time is still an issue
  o Slow start, long vs. short hype, etc.
● Resources without this bend:
  o More likely to be shelf warmers
  o Decision support for portfolio adaptation
The Flickr way
● Flickr defined "Interestingness"
  o Patented
  o Combines views, comments, age, etc.
● Interesting photos are presented
  o Users see new photos
  o Not all photos (2,000 new per minute, checked Feb. 1, 2008, ~11:00 UTC)
  o They have no "big hit"
Kudos given to Horst Gutmann and Marian Kogler
The YouTube way
● Smaller resource database than Flickr
  o Around 45 videos a minute (65,000 a day)
● But a lot more views (data from Feb. 1, 2008)
  o 73,245,607 for "Evolution of Dance"
  o The 20 most viewed have > 30M views
● No obvious counter-strategy
  o Might not (yet) be necessary
Digg
● Assumption: Diggs also follow a power law
  o Quite reasonable …
● How to avoid the Digg effect?
  o Digg has a mirror …
Thanks ...
... for your attention
Interested? Then talk to me …
Photo by Gexydaf, http://www.flickr.com/photos/gexydaf/2208215419/
Mathias Lux
Affiliation
  o Klagenfurt University, ITEC
Contact
  o mathias @ juggle.at
  o http://www.semanticmetadata.net