Local computation of PageRank: the ranking side
Introduction
Motivations
Local ranking in theory
Local ranking in practice
Conclusions
psort, yet another fast stable external sorting software
Introduction
Making sorting a complicated task
Inside psort
Conclusions
Conclusions
Two unrelated talks
MARCO BRESSAN
January 30, 2012
Outline
1 Local computation of PageRank: the ranking side
  Introduction
  Motivations
  Local ranking in theory
  Local ranking in practice
  Conclusions
2 psort, yet another fast stable external sorting software
  Introduction
  Making sorting a complicated task
  Inside psort
  Conclusions
3 Conclusions
Local computation of PageRank: the ranking side
Ranking robustly
Rank a graph’s nodes
1. the graph
2. external factors
   • (varying) parameters
   • graph availability
   • . . .
Is ranking robust?
How is ranking influenced by external factors?
PageRank
PageRank of node v:
$$P(v) = \alpha \sum_{u \to v} \frac{P(u)}{o(u)} + \frac{1 - \alpha}{n}$$
where n = |G| is the number of nodes, α is the damping factor, and o(u) is the out-degree of u.
Applications
web search, web crawling, web spam detection, personalized web search, social network mining, ranking in databases, structural re-ranking, opinion mining, word sense disambiguation, credit and reputation systems, bibliometrics, gene ranking, . . .
Among top data mining algorithms
Wu et al. Top 10 algorithms in data mining. Knowl. and Inform. Systems, 2007.
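The PageRank recurrence can be solved by fixed-point (power) iteration. The following is a minimal sketch in Python, not code from the talk: the function name `pagerank`, the adjacency-dict input format, and the uniform redistribution of the score of dangling nodes (nodes with o(u) = 0, a case the slide does not address) are assumptions made for illustration.

```python
def pagerank(out_links, alpha=0.85, tol=1e-12, max_iter=1000):
    """Iterate P(v) = alpha * sum_{u->v} P(u)/o(u) + (1 - alpha)/n.

    out_links maps every node to the list of nodes it links to
    (every node must appear as a key, possibly with an empty list).
    """
    nodes = sorted(out_links)
    n = len(nodes)
    p = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        new = {v: (1.0 - alpha) / n for v in nodes}
        for u in nodes:
            targets = out_links[u]
            if targets:
                share = alpha * p[u] / len(targets)
                for v in targets:
                    new[v] += share
            else:
                # Assumption: a dangling node spreads its score uniformly.
                share = alpha * p[u] / n
                for v in nodes:
                    new[v] += share
        delta = sum(abs(new[v] - p[v]) for v in nodes)
        p = new
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    cycle = {"a": ["b"], "b": ["c"], "c": ["a"]}
    print(pagerank(cycle))
```

On the three-node cycle a → b → c → a every node ends up with a score of 1/3, as symmetry requires.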
Choose the damping, choose the ranking?
$$P(v) = \alpha \sum_{u \to v} \frac{P(u)}{o(u)} + \frac{1 - \alpha}{n}$$
Is PageRank’s ranking robust to small variations in α?
Results
1. not robust in theory (permutation theorem, reversal theorem)
2. novel tools for checking robustness (lineage analysis)
3. somewhat robust in real-world graphs (experiments)
Marco Bressan, Enoch Peserico. Choose the damping, choose the ranking?
J. Discrete Algorithms 8(2): 199-213 (2010)
Marco Bressan, Enoch Peserico. Choose the damping, choose the ranking?
Proc. of WAW 2009: 76-89
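To make the question concrete, the sketch below computes PageRank for two damping factors and reports the node pairs whose relative order flips. It is an illustration only, not the lineage analysis of the cited papers: the helper names `pagerank` and `flipped_pairs`, the toy graph, and the uniform handling of dangling nodes are all assumptions.

```python
def pagerank(out_links, alpha, tol=1e-13, max_iter=2000):
    """Power iteration; dangling nodes spread their score uniformly (assumption)."""
    nodes = sorted(out_links)
    n = len(nodes)
    p = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        dangling = sum(p[u] for u in nodes if not out_links[u])
        new = {v: (1.0 - alpha) / n + alpha * dangling / n for v in nodes}
        for u in nodes:
            for v in out_links[u]:
                new[v] += alpha * p[u] / len(out_links[u])
        delta = sum(abs(new[v] - p[v]) for v in nodes)
        p = new
        if delta < tol:
            break
    return p


def flipped_pairs(out_links, alpha_1, alpha_2, margin=1e-9):
    """Return the pairs (u, v) ranked in opposite order under the two dampings."""
    p1 = pagerank(out_links, alpha_1)
    p2 = pagerank(out_links, alpha_2)
    nodes = sorted(out_links)
    flips = set()
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            d1 = p1[u] - p1[v]
            d2 = p2[u] - p2[v]
            # Ignore (near-)ties; count only strict order reversals.
            if abs(d1) > margin and abs(d2) > margin and (d1 > 0) != (d2 > 0):
                flips.add((u, v))
    return flips


# Hypothetical toy graph: x has two direct parents, y ends a chain of three.
TOY = {
    "a": ["x"], "b": ["x"], "x": [],
    "s1": ["s2"], "s2": ["s3"], "s3": ["y"], "y": [],
}

if __name__ == "__main__":
    print(flipped_pairs(TOY, 0.5, 0.85))
```

In this toy graph the score of x is proportional to 1 + 2α and that of y to 1 + α + α² + α³, so x outranks y for small α and y outranks x for large α; the pair (x, y) is the only one that flips between α = 0.5 and α = 0.85.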
Is it possible to compute the rank locally?
[Figure: two panels. "Local computation": a graph neighborhood around nodes u and v. "Ranking": the scores 0.15, 0.2, 0.1, 0.3, 0.25 with their ranks, 1st to 5th.]
In many applications only the rank matters!
Is it possible to compute the rank locally?
• stated by Chen et al. (CIKM 2004)
• restated by Bar-Yossef and Mashiach (CIKM 2008)
Motivating examples (I): crawling
The visited graph expands starting from seed nodes.
Which red nodes should be visited now? And in what order?
Order the nodes with PageRank!
Cho et al. Efficient crawling through URL ordering. Computer Networks, 1998.
Is it possible to rank the red frontier at low cost, without visiting the whole crawled graph?
Motivating examples (II): ranking with competitors
Retrieve the graph structure using, e.g., Google’s link: operator.
Bar-Yossef and Mashiach. Local approximation of PageRank and reverse PageRank. Proc. ACM CIKM, 2008.
Is it possible to compute this rank efficiently, using few queries?
Motivating examples (III): social network mining
Rank key users in social networks
Heidemann et al. Identifying key users in online social networks: A PageRank based approach. Proc. ICIS, 2010.
Full graph not available (privacy settings).
Is it still possible to guarantee the correctness of the output ranking?
Formal definition of the problem
Input
• graph G of size n
• target nodes v1, . . . , vk
• score separation ε > 0
Output
• ranking of v1, v2, . . . , vk
If $(1 - \varepsilon) < \frac{P(v_i)}{P(v_j)} < (1 + \varepsilon)$, any ranking of $v_i$, $v_j$ is valid.
Cost model
• computation is free
• but visiting G costs (query to link server)
cost of ranking = |queries| = |nodes visited|
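The definition above can be sketched as two small pieces of Python: a link server that charges one unit per distinct node visited, and a check that a ranking is valid under ε-separation. The class `LinkServer`, its `query` method, and the function `is_valid_ranking` are hypothetical names introduced for illustration; what a query returns (here, the in-links and the out-degree of a node) is likewise an assumption, since the slide only fixes the cost.

```python
class LinkServer:
    """Hypothetical link server: computation is free, visiting nodes is not."""

    def __init__(self, in_links, out_degree):
        self._in_links = in_links      # node -> list of parents
        self._out_degree = out_degree  # node -> number of outgoing arcs
        self._visited = set()

    def query(self, node):
        """Return (parents, out-degree) of node and record the visit."""
        self._visited.add(node)
        return list(self._in_links.get(node, [])), self._out_degree[node]

    @property
    def cost(self):
        # cost of ranking = |nodes visited|
        return len(self._visited)


def is_valid_ranking(ranking, scores, eps):
    """Check a ranking (best node first) against the true scores.

    A pair whose score ratio lies strictly between 1 - eps and 1 + eps
    may appear in either order; any other pair must be ordered by score.
    """
    for i, higher in enumerate(ranking):
        for lower in ranking[i + 1:]:
            ratio = scores[higher] / scores[lower]
            tied = (1 - eps) < ratio < (1 + eps)
            if not tied and scores[higher] < scores[lower]:
                return False
    return True


if __name__ == "__main__":
    scores = {"a": 0.30, "b": 0.29, "c": 0.10}
    print(is_valid_ranking(["b", "a", "c"], scores, eps=0.05))
    print(is_valid_ranking(["b", "a", "c"], scores, eps=0.01))
```

With scores 0.30 and 0.29 the ratio is within 5% of 1, so either order of a and b is accepted for ε = 0.05, while for ε = 0.01 only the order a, b is valid.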
Is it possible to compute the rank locally?
Our contribution: NO!
NO in theory: lower bounds
1. Every deterministic local ranking algorithm has an adversarial graph forcing Ω(n) queries (and this bound can be tightened)
2. Every randomized local ranking algorithm has an adversarial graph forcing Ω(n) queries
even to rank the top k nodes, even if their scores are highly separated!
⟹ a general low-cost local ranking algorithm does not exist
NO in practice: experimental results
1. real web/social graphs behave like worst-case input instances for local ranking
2. approximating is not trivial: state-of-the-art local score approximation algorithms do not turn into low-cost local rank approximation algorithms
Lower bounds (I): deterministic algorithms
Every deterministic algorithm has an adversarial graph forcing cost Ω(n), tightened to $n(1 - O(\varepsilon k))$.
Theorem 1 (paper Thm. 4)
Choose integers $k > 1$ and $n_0 \ge k^2$, a damping factor $\alpha \in (0, 1)$, and $\varepsilon \le \frac{\alpha^2}{20k}$. For any deterministic local algorithm A there exists a graph of size $n \in \Theta(n_0)$ where the top k nodes $v_0, \ldots, v_{k-1}$ are ε-separated and, to compute their relative ranking according to $P_\alpha(\cdot)$, algorithm A performs Ω(n) queries; more precisely, $n(1 - O(\varepsilon k))$ queries.
Lower bounds (II): randomized algorithms
Every randomized (Las Vegas or Monte Carlo) algorithm has an adversarial graph forcing cost $\Omega\left(\alpha\sqrt{n/\varepsilon}\right)$, up to Ω(n).
[Figure: a randomized algorithm A queries a link server holding a graph G of 10^9 nodes and outputs a ranking such as [v3 v10 ... v7] of the nodes v1, ..., v20; query counts shown: ~10^4.5 and 10^8.]
Theorem 2 (paper Thm. 3)
Choose $k > 1$, $n_0 \ge 6k^3$, a damping factor $\alpha \in (0, 1)$, and $\varepsilon \in \left[\frac{\alpha^2 k^2}{4 n_0}, \frac{\alpha^2}{24k}\right]$. Then
1. for any Las Vegas local algorithm A
2. for any Monte Carlo local algorithm A with constant confidence
there exists a graph of size $n \in \Theta(n_0)$ where the top k nodes $v_0, \ldots, v_{k-1}$ are ε-separated and, to compute their relative ranking, A performs in expectation $\Omega\left(\alpha\sqrt{n/\varepsilon}\right)$ queries.
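As a worked check of Theorem 2, the bound can be evaluated at the two endpoints of the admissible range for ε. This assumes the bound reads Ω(α√(n/ε)); the formula is garbled in the extracted slide, so that reading is itself an assumption, chosen because it makes the cost decrease as the separation ε grows.

```latex
\begin{align*}
\varepsilon = \frac{\alpha^2}{24k}
  &\;\Longrightarrow\;
  \alpha\sqrt{\frac{n}{\varepsilon}} = \sqrt{24\,k\,n}, \\
\varepsilon = \frac{\alpha^2 k^2}{4 n_0}
  &\;\Longrightarrow\;
  \alpha\sqrt{\frac{n}{\varepsilon}} = \frac{2\sqrt{n\,n_0}}{k}
  \in \Theta\!\left(\frac{n}{k}\right)
  \quad\text{since } n \in \Theta(n_0).
\end{align*}
```

Under this reading, the largest admissible separation still forces on the order of √(kn) queries, and the smallest forces a number of queries linear in n for fixed k. For n = 10^9 and constant k, √n ≈ 10^4.5, which agrees with the order of magnitude shown in the figure.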
What happens in practice?
Two experiments
1. Hardness of real-world graphs
Compute the minimal number of nodes that an algorithm must visit to always guarantee a correct ranking.
2. Performance of approximation algorithms
Evaluate cost and accuracy of local ranking algorithms derived from state-of-the-art local score approximation algorithms.
Datasets
graph        nodes  arcs   crawled
.it          40M    1150M  2004
LiveJournal  5M     79M    2008
Publicly available from LAW, Univ. Milan: http://law.dsi.unimi.it
Exp. 1: hardness of real-world graphs (1/2)
Breakdown of a local ranking algorithm
1. Visit ancestors
Thm.: must visit at least |minset(G, u, v)| ancestors
2. Compute ranking
Thm.: must agree with natural PageRank score approximation
|minset(G, u, v)| ≤ cost of ranking u, v in graph G
Exp. 1: hardness of real-world graphs (2/2)
[Plot: average number of visited nodes (log scale, 10^3 to 10^7) versus ε (0.01 to 2.56), for the .it web graph and the LiveJournal graph.]
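Exp. 2: performance of approximation algorithms
Improved variant of the pruned brute-force algorithm: limit the PageRank computation to ancestors giving a high contribution.
[Figure: ancestors of v labelled with their contributions; with a pruning threshold of 10%, the ancestors contributing 35%, 24%, 17% and 10% are kept, and those contributing <10% are pruned.]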
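The idea of limiting the computation to high-contribution ancestors can be sketched as a backward exploration from the target node that stops expanding ancestors whose contribution falls below a threshold. This is a generic illustration in the spirit of pruned local approximation, not the improved variant evaluated in the talk: the function name `pruned_local_score`, the layer-by-layer pruning rule, the absolute (rather than relative) threshold, and the assumption that the graph has no dangling nodes are all choices made here.

```python
def pruned_local_score(target, in_links, out_degree, n,
                       alpha=0.85, threshold=0.10, max_depth=50):
    """Approximate P(target) by exploring only high-contribution ancestors.

    in_links:   node -> list of parents
    out_degree: node -> number of outgoing arcs (assumed > 0 for every node)
    Returns (approximate score, number of distinct nodes queried).
    """
    layer = {target: 1.0}  # contribution of each ancestor at the current depth
    total = 0.0
    queried = set()
    for _ in range(max_depth + 1):
        # Prune: ancestors below the threshold are neither counted nor expanded.
        kept = {w: x for w, x in layer.items() if x >= threshold}
        if not kept:
            break
        total += sum(kept.values())
        queried.update(kept)
        nxt = {}
        for w, x in kept.items():
            for u in in_links.get(w, ()):
                # A parent u passes a fraction alpha / o(u) of its score to w.
                nxt[u] = nxt.get(u, 0.0) + alpha * x / out_degree[u]
        layer = nxt
    return (1.0 - alpha) / n * total, len(queried)


if __name__ == "__main__":
    in_links = {"a": ["c"], "b": ["a"], "c": ["b"]}  # cycle a -> b -> c -> a
    out_degree = {"a": 1, "b": 1, "c": 1}
    print(pruned_local_score("a", in_links, out_degree, n=3, threshold=0.5))
```

With the threshold set to 0 and a large `max_depth` the sketch converges to the exact score on graphs without dangling nodes; raising the threshold lowers the number of queried nodes at the price of underestimating the score.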
Exp. 2: performance of approximation algorithms
[Plot, .it web graph: average cost (log scale, 10^3 to 10^6) versus pruning threshold (10^-7 to 10^-1), one curve per interval (0.01,0.02), (0.02,0.04), ..., (2.56,5.12).]
Exp. 2: performance of approximation algorithms
[Plot, LiveJournal graph: average cost (log scale, 10^3 to 10^6) versus pruning threshold (10^-7 to 10^-1), one curve per interval (0.01,0.02), (0.02,0.04), ..., (2.56,5.12).]
Exp. 2: performance of approximation algorithms
[Plot, .it web graph: fraction of correctly ranked node pairs (axis from -0.2 to 1) versus pruning threshold (10^-7 to 10^-1), one curve per interval (0.01,0.02), (0.02,0.04), ..., (2.56,5.12).]
Exp. 2: performance of approximation algorithms
[Plot, LiveJournal graph: fraction of correctly ranked node pairs (axis from -0.2 to 1) versus pruning threshold (10^-7 to 10^-1), one curve per interval (0.01,0.02), (0.02,0.04), ..., (2.56,5.12).]
Conclusions
1. Local computation of PageRank ranking is infeasible
2. Cost of exact local ranking algorithms bounded by minsets
3. Tested real web/social graphs are near worst-case
4. And approximation is not trivial
Marco Bressan, Luca Pretto. Local computation of PageRank: the ranking side. Proc. of CIKM 2011: 631-640
psort, yet another fast stable external sorting software
In a nutshell
the psort sorting library
• written in C++
• handles large datasets (> TB)
• stable sorting
• fast
• designed for PC-class machines
ideal applications of psort
• sorting large databases
• sorting large log files
• sorting on commodity machines
• . . .
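To illustrate what "stable external sorting" means, here is a minimal sketch of a textbook external merge sort in Python. It is not psort's algorithm or API (psort is a C++ library and its internals are not described on this slide): the function name `external_stable_sort`, the line-based record format, and the `run_size` parameter are assumptions. Sorted runs are written to temporary files and then merged; stability follows because each run is sorted with a stable sort and ties between runs are resolved in favour of the earlier run.

```python
import heapq
import itertools
import os
import tempfile


def external_stable_sort(records, key, run_size):
    """Yield text records (no embedded newlines) in stable sorted order.

    At most run_size records are held in memory while runs are created.
    """
    run_paths = []
    source = iter(records)
    try:
        # Phase 1: cut the input into sorted runs stored on disk.
        while True:
            run = list(itertools.islice(source, run_size))
            if not run:
                break
            run.sort(key=key)  # list.sort is stable
            fd, path = tempfile.mkstemp(suffix=".run")
            with os.fdopen(fd, "w") as handle:
                handle.writelines(record + "\n" for record in run)
            run_paths.append(path)
        # Phase 2: k-way merge of the runs, in creation order.
        files = [open(path) for path in run_paths]
        try:
            streams = [(line.rstrip("\n") for line in f) for f in files]
            # heapq.merge resolves equal keys in favour of earlier inputs,
            # so records with equal keys keep their original relative order.
            yield from heapq.merge(*streams, key=key)
        finally:
            for f in files:
                f.close()
    finally:
        for path in run_paths:
            os.remove(path)


if __name__ == "__main__":
    lines = ["b 1", "a 2", "b 3", "a 4", "c 5", "a 6"]
    first_field = lambda record: record.split()[0]
    print(list(external_stable_sort(lines, first_field, run_size=2)))
```

A production sorter would also bound the merge fan-in and tune I/O buffering to the disks; the sketch leaves those concerns out.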
psort and the Sort Benchmark (1/2)
The PennySort Benchmark
Sort what you can in 0.01$ of computing time.
[Plot: yearly PennySort record (Sort Benchmark) for 1998, 1999, 2000, 2002, 2003, 2007, 2008, 2009 and 2011, on an axis from 0 GB to 400 GB, with psort marked.]
Source: http://sortbenchmark.org
Paolo Bertasi, Marco Bressan, Enoch Peserico. psort, yet another fast stable sorting software.
ACM Journal of Experimental Algorithmics 16: (2011)
psort and the Sort Benchmark (2/2)
The Datamation Benchmark
Sort 100 MB disk-to-disk as fast as you can.
• 980 s: thunder (1987)
• 440 ms: NOW-sort (2001)
• psort (2011)
Paolo Bertasi, Michele Bonazza, Marco Bressan, Enoch Peserico: Datamation. A Quarter of a
Century and Four Orders of Magnitude Later. CLUSTER 2011: 605-609
psort and the STXXL library
[Figure: sort speed (in MB/s, 0 to 200) versus sort size (in MB, logarithmic scale from 10^1 to 10^4). Curves: stxxl on disks (8,8), (8,32), (8,128); stxxl on RAID (8,8), (8,32), (8,128); psort on RAID (8,8), (8,32), (8,128).]
Machine budget for Sort Benchmark 2011
• Power Supply Unit: 15 EUR
• Case: 22 EUR
• CPU: 38 EUR
• RAM: 47 EUR
• Motherboard: 60 EUR
• Hard Disks: 215 EUR
• Assembly fee: 35 EUR
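To see how a machine budget turns into a PennySort time budget, here is a minimal sketch. It assumes the usual Sort Benchmark convention that the system cost is depreciated linearly over three years, which this talk does not spell out; the function names are hypothetical. The listed components add up to 432 EUR. Converting that to dollars needs an exchange rate, which is left to the caller because none is given here.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Seconds of machine time that one US cent buys, ASSUMING the system cost is
// depreciated linearly over three 365-day years (not stated in the talk).
double penny_seconds(double system_cost_usd) {
    assert(system_cost_usd > 0.0);
    const double three_years_s = 3.0 * 365.0 * 24.0 * 3600.0;  // 94,608,000 s
    return 0.01 * three_years_s / system_cost_usd;
}

// Sum of the component prices listed on the budget slide, in EUR.
int machine_cost_eur() {
    const std::vector<int> parts = {15, 22, 38, 47, 60, 215, 35};
    return std::accumulate(parts.begin(), parts.end(), 0);
}
```

Under this assumption a 1000 USD machine gets about 946 seconds, so a cheaper machine buys proportionally more sorting time.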
The big picture
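psort execution diagram
[Diagram: phases along a time axis (mergesort, heap merge, heap merge), grouped into a 1st disk pass and a 2nd disk pass.]
Memory levels in the diagram:
• CPU/cache: 1 MB, 10 GB/s
• main memory: 1 GB, 3 GB/s
• external memory: 1 TB, 0.7 GB/s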
The big picture - now complicated
Hardware/software details you must deal with:
I/O
• hdd quality
• file system
• scheduling
• buffer size
• direct transfer
• data placement
memory
• size
• bandwidth
• latency
• page size
• access pattern
• conflicts
cache
• size
• speed
• line size
• associativity
Hard disks
The speed curve of 13 “identical” WD1600JS disks
[Figure: bandwidth (MB/s, 0 to 150) versus distance from the outer rim (in GB, 0 to 150), one curve per disk.]
Memory
Why main memory is not really a RAM
[Figure: bandwidth (GB/s, 0.5 to 4.5) versus struct size (bytes, from 2^0 to 2^18). Curves: sequential read, random read, sequential write, random write. The L2 cache line size is marked on the horizontal axis.]
CPU
Is a dual-core always worth its price?
[Figure: bandwidth (axis labelled MB/s, ticks from 0 to 3e+10) versus log2(bytes visited), from 16 to 30. Curves: Intel dual core read, Intel dual core write, AMD single core read, AMD single core write.]
A list of psort’s tricks
general
• fast polling
• payload detachment
• key pre/postprocessing
• ...
disk access
• O_DIRECT
• independent disks
• uniform fetching
• ...
mergesort
• smart merging
• quasi-in-place
• special base case
• ...
heapsort
• key caching
• key offsetting
• payload interleaving
• ...
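The list only names "payload detachment" without details. One plausible reading, sketched below as an assumption rather than as psort's actual code, is to sort compact (key, position) entries and move each bulky payload only once at the end. The `Record` layout and the function name `detached_sort` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record layout: a small key and a bulky payload.
struct Record {
    std::uint32_t key;
    std::string payload;
};

// Sort records stably by key while moving each payload only once.
std::vector<Record> detached_sort(std::vector<Record> records) {
    // 1. Detach: build compact (key, position) entries.
    struct Entry { std::uint32_t key; std::size_t pos; };
    std::vector<Entry> entries;
    entries.reserve(records.size());
    for (std::size_t i = 0; i < records.size(); ++i)
        entries.push_back({records[i].key, i});

    // 2. Sort only the compact entries; stable_sort keeps equal keys in input order.
    std::stable_sort(entries.begin(), entries.end(),
                     [](const Entry& a, const Entry& b) { return a.key < b.key; });

    // 3. Reattach: gather the payloads in sorted order.
    std::vector<Record> sorted;
    sorted.reserve(records.size());
    for (const Entry& e : entries) {
        assert(e.pos < records.size());
        sorted.push_back(std::move(records[e.pos]));
    }
    return sorted;
}
```

The sort itself then touches only small entries, which fit the cache far better than full records.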
Smart merging (1/3)
Naive merging
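// Merges two sorted runs of `size` elements each into `out`.
template <typename T>
void merge(T *s1, T *s2, T *out, int size) {
  int i = 0, j = 0, k = 0;
  bool bit;
  while ((i < size) & (j < size)) {
    if (s1[i] > s2[j]) {   // READ + READ
      out[k] = s2[j];      // READ
      j++;
    } else {
      out[k] = s1[i];      // (READ)
      i++;
    }
    k++;
  }
  // ...
}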
total mem READs per iteration: 3
Smart merging (2/3)
Smart merging
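// Merges two sorted runs of `size` elements each into `out`,
// keeping the head of each run in a small local cache.
template <typename T>
void merge(T* s1, T* s2, T* out, int size) {
  int i = 0, j = 0, k = 0;
  bool bit;
  T cache[2];
  cache[0] = s1[0];
  cache[1] = s2[0];
  while ((i < size) & (j < size)) {
    if (cache[0] > cache[1]) {
      out[k] = cache[1];
      j++;                             // advance first, then load the new head
      if (j < size) cache[1] = s2[j];  // READ
    } else {
      out[k] = cache[0];
      i++;
      if (i < size) cache[0] = s1[i];  // (READ)
    }
    k++;
  }
  // ...
}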
total mem READs per iteration: 1
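For reference, a complete version of the cached-head merge might look as follows. This is a sketch under stated assumptions, not psort's actual code: the name `smart_merge` is hypothetical, both runs are assumed to hold exactly `size` elements, and the tail copy that the slide elides is filled in the obvious way.

```cpp
#include <cassert>
#include <vector>

// Stable merge of two sorted runs of `size` elements each into `out`.
// The head of each run lives in a local variable, so every loop iteration
// loads at most one new element from memory.
template <typename T>
void smart_merge(const T* s1, const T* s2, T* out, int size) {
    assert(size > 0);
    int i = 0, j = 0, k = 0;
    T head1 = s1[0];
    T head2 = s2[0];
    while (true) {
        if (head1 > head2) {         // compares cached values only
            out[k++] = head2;
            if (++j == size) break;  // second run exhausted
            head2 = s2[j];           // READ
        } else {                     // ties are taken from s1, so the merge is stable
            out[k++] = head1;
            if (++i == size) break;  // first run exhausted
            head1 = s1[i];           // READ
        }
    }
    // Copy what is left of the run that was not exhausted.
    while (i < size) out[k++] = s1[i++];
    while (j < size) out[k++] = s2[j++];
}
```

Whether the compiler keeps the two heads in registers depends on `T` and on the optimization level, so the saving should be measured rather than assumed.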
Smart merging (3/3)
Time required to merge two sequences
[Figure: time in microseconds (0 to 800000) versus log2(merge size), from 10 to 24. Curves: smart merge, naive merge.]
Quasi-in-place mergesort (1/3)
traditional mergesort
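// Bottom-up mergesort; `size` is assumed to be a power of two and merge()
// is assumed to merge two complete runs of `subsize` elements.
template <typename T>
void mergesort(T* input, T* output, int size) {
  for (int i = 0; (1 << i) < size; i++) {
    int subsize = 1 << i;  // length of each sorted run
    for (int j = 0; j < size / (2 * subsize); j++) {
      merge(&input[2 * j * subsize],
            &input[(2 * j + 1) * subsize],
            &output[2 * j * subsize],
            subsize);
    }
    T* tmp = input;  // swap input and output
    input = output;
    output = tmp;
  }
  // the sorted data is now in the buffer that `input` points to
}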
extra space = N
Quasi-in-place mergesort (2/3)
“quasi-in-place” mergesort
// `size` is assumed to be a power of two, with size/2 free slots
// located immediately before `input`.
template <typename T>
void mergesort(T* input, T* output, int size) {
  for (int i = 0; (1 << i) < size / 2; i++) {
    int subsize = 1 << i;  // length of each sorted run
    for (int j = 0; j < size / (2 * subsize); j++) {
      /* merge, overwriting the input vector */
      merge(&input[2 * j * subsize],
            &input[(2 * j + 1) * subsize],
            &input[(2 * j - 1) * subsize],
            subsize);
    }
    input -= subsize;  // shift input left
  }
  // finally merge into the output vector
  merge(input, &input[size / 2], output, size / 2);
}
extra space = N/2
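The left-shifting scheme can be tried out with the self-contained sketch below. It rests on assumptions the slide does not state explicitly: the buffer holds `size/2` free slots followed by the `size` input elements, `size` is a power of two, and the names `merge_runs` and `quasi_in_place_mergesort` are hypothetical. Each pass writes a merged pair one run length to the left of where it was read, so the write position never overtakes an unread element.

```cpp
#include <cassert>
#include <vector>

// Plain stable merge of two sorted runs of `n` elements each.
// Safe when `out` starts up to `n` slots to the left of `a`.
template <typename T>
void merge_runs(const T* a, const T* b, T* out, int n) {
    int i = 0, j = 0, k = 0;
    while (i < n && j < n) {
        if (a[i] > b[j]) out[k++] = b[j++];
        else             out[k++] = a[i++];
    }
    while (i < n) out[k++] = a[i++];
    while (j < n) out[k++] = b[j++];
}

// Sorts the `size` elements stored at buf[size/2 ...], using the size/2 free
// slots in front of them as scratch space, and writes the result to `output`.
template <typename T>
void quasi_in_place_mergesort(T* buf, T* output, int size) {
    assert(size >= 2 && (size & (size - 1)) == 0);  // power of two
    T* input = buf + size / 2;                      // data starts after the free slots
    for (int subsize = 1; subsize < size / 2; subsize *= 2) {
        for (int j = 0; j < size / (2 * subsize); j++) {
            T* left = input + 2 * j * subsize;
            merge_runs(left, left + subsize, left - subsize, subsize);
        }
        input -= subsize;  // the whole vector has moved left by one run length
    }
    // Two sorted halves remain; merge them into the output vector.
    merge_runs(input, input + size / 2, output, size / 2);
}
```

The shifts add up to 1 + 2 + ... + size/4 = size/2 - 1 slots, which is why size/2 spare slots in front of the data are enough.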
Quasi-in-place mergesort (3/3)
Average time required to compare two keys
[Figure: average time to compare two keys (relative units, 0 to 4) versus log2(input size in bytes), from 10 to 24. Curve: quasi-in-place.]
Conclusions
1. Solving old problems really fast is still tricky
2. To do it, you must match today’s hardware
3. Solution: software engineering and tuning
Paolo Bertasi, Marco Bressan, Enoch Peserico. psort, yet another fast stable sorting software.
ACM Journal of Experimental Algorithmics 16: (2011)
Conclusions
Ranking
1. Local computation of PageRank ranking infeasible in theory
2. On tested web/social graphs, infeasible also in practice
3. Rank analysis requires novel tools!
Sorting
1. Solving old problems really fast is still tricky
2. To do it, you must match today’s hardware
3. Software engineering and tuning are the way
And of course now you should pay me twice! :-)