Page 1
Engineering Java 7’s Dual Pivot Quicksort Using MaLiJAn
Sebastian Wild, Markus E. Nebel, Raphael Reitzig, Ulrich Laube
[wild, nebel, r_reitzi, laube] @cs.uni-kl.de
Computer Science Department, University of Kaiserslautern
January 7, 2013, Meeting on Algorithm Engineering & Experiments 2013
Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 1 / 23
Page 2
Background
Since Java 7: new dual pivot Quicksort in JRE library
Basic algorithm by Vladimir Yaroslavskiy
Optimizations by Jon Bentley, Joshua Bloch and others (see java.core-libs.devel mailing list)
Motivated by experience with classic Quicksort
Validated by running time benchmark
In this talk:
Can we exploit special properties of dual pivot Quicksort?
Can we get more insight than running time measurements?
. . . stay tuned
Page 3
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 5 1 8 4 7 2 9 6
Pivots: p = 3 (first element), q = 6 (last element)
Select two elements as pivots.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 4
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 5 1 8 4 7 2 9 6
Pivots: p = 3 (first element), q = 6 (last element)
Only value relative to pivot counts.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 5
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 5 1 8 4 7 2 9 6
Pointers shown: k
A[k] is medium ⇒ go on
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 6
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 5 1 8 4 7 2 9 6
Pointers shown: ℓ, k
A[k] is small ⇒ swap to left
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 7
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 5 1 8 4 7 2 9 6
Pointers shown: ℓ, k
Swap small element to left end.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 8
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 1 5 8 4 7 2 9 6
Pointers shown: ℓ, k
Swap small element to left end.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 9
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 1 5 8 4 7 2 9 6
Pointers shown: ℓ, k
A[k] is large ⇒ find swap partner.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 10
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 1 5 8 4 7 2 9 6
Pointers shown: ℓ, k, g
A[k] is large ⇒ find swap partner: g skips over large elements.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 11
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 1 5 8 4 7 2 9 6
Pointers shown: ℓ, k, g
A[k] is large ⇒ swap
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 12
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 1 5 2 4 7 8 9 6
Pointers shown: ℓ, k, g
A[k] is large ⇒ swap
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 13
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 1 5 2 4 7 8 9 6
Pointers shown: ℓ, k, g
A[k] is old A[g], small ⇒ swap to left
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 14
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 1 2 5 4 7 8 9 6
Pointers shown: ℓ, k, g
A[k] is old A[g], small ⇒ swap to left
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 15
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 1 2 5 4 7 8 9 6
Pointers shown: ℓ, k, g
A[k] is medium ⇒ go on
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 16
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 1 2 5 4 7 8 9 6
Pointers shown: ℓ, k, g
A[k] is large ⇒ find swap partner.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 17
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 1 2 5 4 7 8 9 6
Pointers shown: ℓ, k, g
A[k] is large ⇒ find swap partner: g skips over large elements.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 18
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
3 1 2 5 4 7 8 9 6
Pointers shown: ℓ, k, g
g and k have crossed! Swap pivots in place.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 19
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
2 1 3 5 4 6 8 9 7
Pointers shown: ℓ, k, g
g and k have crossed! Swap pivots in place.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 20
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
2 1 3 5 4 6 8 9 7
Partitioning done!
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 21
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
2 1 3 5 4 6 8 9 7
Recursively sort three sublists.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
Page 22
Java 7’s Dual Pivot Quicksort – Example
Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[]))
1 2 3 4 5 6 7 8 9
Done.
Invariant: | < p | ℓ → | p ≤ ◦ ≤ q | k → | ? | ← g | > q |
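The example run can be replayed with a small Java sketch. This is an illustration of the basic partitioning scheme walked through above, not the JRE7 source: the class and method names are made up here, and the pivots are simply the first and last element, as in the example (no pivot sampling, no insertion sort for small sublists).

```java
import java.util.Arrays;

class DualPivotSketch {

    /** Partitions a[left..right] around two pivots; returns their final positions {posP, posQ}. */
    static int[] partition(int[] a, int left, int right) {
        // Select two elements as pivots (here: the outermost ones), ensuring p <= q.
        if (a[left] > a[right]) swap(a, left, right);
        int p = a[left], q = a[right];

        int l = left + 1;   // a[left+1 .. l-1] holds elements < p
        int g = right - 1;  // a[g+1 .. right-1] holds elements > q
        int k = l;          // a[l .. k-1] holds elements with p <= x <= q

        while (k <= g) {
            if (a[k] < p) {                      // small: swap to left end
                swap(a, k, l);
                l++;
            } else if (a[k] > q) {               // large: find swap partner
                while (a[g] > q && k < g) g--;   // g skips over large elements
                swap(a, k, g);
                g--;
                if (a[k] < p) {                  // old a[g] was small: swap to left
                    swap(a, k, l);
                    l++;
                }
            }
            k++;                                 // advance k (medium elements stay in place)
        }
        // g and k have crossed: swap pivots in place.
        l--;
        g++;
        swap(a, left, l);
        swap(a, right, g);
        return new int[] {l, g};
    }

    /** Sorts a[left..right] (both inclusive) by recursing on the three sublists. */
    static void sort(int[] a, int left, int right) {
        if (left >= right) return;
        int[] pos = partition(a, left, right);
        sort(a, left, pos[0] - 1);
        sort(a, pos[0] + 1, pos[1] - 1);
        sort(a, pos[1] + 1, right);
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {3, 5, 1, 8, 4, 7, 2, 9, 6};
        sort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a));
    }
}
```

Calling partition on the example list 3 5 1 8 4 7 2 9 6 yields 2 1 3 5 4 6 8 9 7, the state shown when partitioning is done.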
Page 23
Control Flow Graph of Partitioning Loop
Basic blocks (bc = number of Bytecode instructions in the block):
 1  bc: 3   k ≤ g
 2  bc: 7   t := A[k]; t < p
 3  bc: 12  A[k] := A[ℓ]; A[ℓ] := t; ℓ := ℓ + 1
 4  bc: 3   t > q
 5  bc: 5   A[g] > q
 6  bc: 3   k < g
 7  bc: 2   g := g − 1
 8  bc: 5   A[g] < p
 9  bc: 14  A[k] := A[ℓ]; A[ℓ] := A[g]; ℓ := ℓ + 1
10  bc: 6   A[k] := A[g]
11  bc: 5   A[g] := t; g := g − 1
12  bc: 2   k := k + 1
Edges:
 1: yes → 2, no → exit
 2: yes → 3, no → 4
 3: → 12
 4: yes → 5, no → 12
 5: yes → 6, no → 8
 6: yes → 7, no → 8
 7: → 5
 8: yes → 9, no → 10
 9: → 11
10: → 11
11: → 12
12: → 1
Page 24
Control Flow Graph of Partitioning Loop
Cycle 1 (path: blocks 1 → 2 → 3 → 12)
A[k]: small
A[g]: n/a
∆(g − k): 1
Bytecode instructions: 24
Page 25
Control Flow Graph of Partitioning Loop
Cycle 2 (path: blocks 1 → 2 → 4 → 12)
A[k]: medium
A[g]: n/a
∆(g − k): 1
Bytecode instructions: 15
Page 26
Control Flow Graph of Partitioning Loop
Cycle 3 (path: blocks 5 → 6 → 7)
A[k]: large
A[g]: large
∆(g − k): 1
Bytecode instructions: 10
Page 27
Control Flow Graph of Partitioning Loop
Cycle 4 (path: blocks 1 → 2 → 4 → 5 → 8 → 9 → 11 → 12)
A[k]: large
A[g]: small
∆(g − k): 2
Bytecode instructions: 44
Page 28
Control Flow Graph of Partitioning Loop
Cycle 5 (path: blocks 1 → 2 → 4 → 5 → 8 → 10 → 11 → 12)
A[k]: large
A[g]: medium
∆(g − k): 2
Bytecode instructions: 36
Page 29
Asymmetry
Algorithm is asymmetric:
Cycles have different cost ⇒ would rather execute cheap ones often
Cycles are chosen by the classes small, medium or large
Probability of the classes depends on pivot values
⇒ Maybe we can “influence pivot values accordingly”?
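The cycle costs can be checked by summing the block costs bc(i) along each cycle. In the sketch below, the block paths are inferred from the bytecode totals on the slides and from the loop structure, and dividing by ∆(g − k) is an assumption about how cycles might be compared per unit of progress; the class name is made up.

```java
class CycleCosts {
    // bc(i) for blocks 1..12 as listed in the control flow graph (index 0 unused).
    static final int[] BC = {0, 3, 7, 12, 3, 5, 3, 2, 5, 14, 6, 5, 2};

    // Block paths of Cycles 1..5 (inferred, see lead-in).
    static final int[][] CYCLES = {
        {1, 2, 3, 12},                 // Cycle 1: A[k] small
        {1, 2, 4, 12},                 // Cycle 2: A[k] medium
        {5, 6, 7},                     // Cycle 3: A[g] large, g skips
        {1, 2, 4, 5, 8, 9, 11, 12},    // Cycle 4: A[k] large, A[g] small
        {1, 2, 4, 5, 8, 10, 11, 12},   // Cycle 5: A[k] large, A[g] medium
    };

    // Change of (g - k) per cycle, as given on the slides.
    static final int[] DELTA = {1, 1, 1, 2, 2};

    /** Bytecodes executed by one pass through the cycle (cycle numbers start at 1). */
    static int cost(int cycle) {
        int sum = 0;
        for (int block : CYCLES[cycle - 1]) sum += BC[block];
        return sum;
    }

    /** Bytecodes per unit of progress in (g - k); an assumed normalization. */
    static double costPerStep(int cycle) {
        return (double) cost(cycle) / DELTA[cycle - 1];
    }

    public static void main(String[] args) {
        for (int c = 1; c <= 5; c++) {
            System.out.println("Cycle " + c + ": " + cost(c) + " bytecodes, "
                    + costPerStep(c) + " per step");
        }
    }
}
```

Under this normalization, Cycle 3 is the cheapest and Cycle 1 the most expensive way to make one step of progress.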
Page 30
Pivot Sampling
Well-known optimization for classic Quicksort: median-of-three ⇒ pivot closer to median of whole list
In JRE7 Quicksort implementation: natural extension for 2 pivots:
tertiles-of-five ⇒ pivots closer to tertiles of whole list
9 other possibilities to pick p and q out of 5 elements:
Page 32
Pivot Sampling
Well-known optimization for classic Quicksort: median-of-three ⇒ pivot closer to median of whole list
In JRE7 Quicksort implementation: natural extension for 2 pivots:
[Figure: five sample elements with p and q marked]
tertiles-of-five ⇒ pivots closer to tertiles of whole list
9 other possibilities to pick p and q out of 5 elements:
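A pivot selection scheme on five sample elements can be sketched as picking two ranks out of the sorted sample. The sketch assumes that a scheme (i, j) means the i-th and j-th smallest sample element (1-based), so that tertiles-of-five is (2, 4); reading JRE7(1,3) this way is an assumption, and the names below are made up.

```java
import java.util.Arrays;

class PivotSampling {

    /** Returns {p, q}: the i-th and j-th smallest (1-based, i < j) of five sample elements. */
    static int[] choosePivots(int[] sample, int i, int j) {
        if (sample.length != 5 || i < 1 || j > 5 || i >= j) {
            throw new IllegalArgumentException("need 5 elements and ranks 1 <= i < j <= 5");
        }
        int[] sorted = sample.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        return new int[] {sorted[i - 1], sorted[j - 1]};
    }

    /** Number of ways to pick two ranks i < j out of five sample elements. */
    static int numberOfSchemes() {
        int count = 0;
        for (int i = 1; i <= 5; i++) {
            for (int j = i + 1; j <= 5; j++) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] sample = {8, 3, 5, 1, 9};
        System.out.println(Arrays.toString(choosePivots(sample, 2, 4)));  // tertiles-of-five
        System.out.println(Arrays.toString(choosePivots(sample, 1, 3)));  // assumed reading of (1,3)
        System.out.println(numberOfSchemes() - 1 + " other schemes");
    }
}
```

Smaller ranks shift both pivots towards small values, which makes small elements rarer and large elements more frequent during partitioning.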
Page 34
Optimizing Pivot Sampling
Which are “good” pivot selection schemes?
Is the symmetric choice best possible?
Need objective function to optimize
Typical approaches to judge efficiency:
A Count number of basic operations. (Here: number of executed Java Bytecode instructions.)
B Measure total running time.
Page 35
Optimizing Pivot Sampling
Relative performance of pivot sampling compared to tertiles-of-five:
Pivot Selection Scheme                A¹                      B²
JRE7 (tertiles-of-five, reference)
(pictogram lost)                      +5.14%                  +0.80%
JRE7(1,3)                             −1.85%                  −0.44%
(pictogram lost)                      +3.34%                  −0.42%
(pictogram lost)                      n/a (stack overflow!)   +10.6%
(pictogram lost)                      +2.48%                  +2.73%
(pictogram lost)                      +11.3%                  +3.31%
(pictogram lost)                      +12.7%                  +3.29%
(pictogram lost)                      +16.4%                  +2.48%
(pictogram lost)                      +39.0%                  +5.87%
¹ Average number of executed bytecodes on almost sorted lists of length 10^5.
² Average running time on random permutations of length 10^6.
Page 36
Methods
Page 37
Model and Method
What made JRE7(1,3) faster than JRE7?
. . . hard to tell from total time/bytecodes.
Need a more detailed model of the program.
Idea: Decompose along control flow graph!
[Figure: control flow graph of the partitioning loop, blocks 1 to 12]
View program as Markov chain over blocks
Termination via absorbing state
Transition i → j has probability $p^{(n)}_{i \to j}$ depending on input size n
Visiting block i incurs constant cost c(i)
Total cost is sum of block costs
Expected cost of program = expected cost of run of Markov chain
Latter is easy to compute
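A minimal sketch of the last step, assuming the chain is given as a transition matrix over the non-absorbing blocks. The expected remaining cost h(i) satisfies h(i) = c(i) + Σ_j p(i→j) · h(j), which is solved here by plain fixed-point iteration; this is one simple way to do it and not necessarily what MaLiJAn does. The two-block loop in main is a toy example with made-up numbers.

```java
class MarkovCost {

    /**
     * Expected total cost of a run started in block `start`.
     * p[i][j] is the probability of going from block i to block j; whatever
     * probability mass is missing in row i leads to the absorbing state.
     * c[i] is the constant cost of visiting block i.
     */
    static double expectedCost(double[][] p, double[] c, int start) {
        int n = c.length;
        double[] h = new double[n];   // h[i] = expected remaining cost when entering block i
        for (int iter = 0; iter < 100_000; iter++) {
            double[] next = new double[n];
            double change = 0;
            for (int i = 0; i < n; i++) {
                double sum = c[i];
                for (int j = 0; j < n; j++) sum += p[i][j] * h[j];
                next[i] = sum;
                change = Math.max(change, Math.abs(sum - h[i]));
            }
            h = next;
            if (change < 1e-12) break;   // values no longer move: converged
        }
        return h[start];
    }

    public static void main(String[] args) {
        // Toy loop: block 0 is the loop test (cost 3), block 1 the body (cost 2).
        // The test succeeds with probability 0.9, otherwise the loop terminates.
        double[][] p = {{0.0, 0.9}, {1.0, 0.0}};
        double[] c = {3.0, 2.0};
        System.out.println(expectedCost(p, c, 0));
    }
}
```

For the toy loop, the test is visited 10 times and the body 9 times in expectation, so the expected cost is 10 · 3 + 9 · 2 = 48.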
Page 39
Maximum Likelihood Analysis
How to determine block costs and transition probabilities?
Transition Probabilities
Count transitions in executions on sample data ⇒ allows arbitrary input distributions!
Take relative frequency as estimate for $p^{(n)}_{i \to j}$
Extrapolate $p^{(n)}_{i \to j}$ to a function $p_{i \to j}(n)$ in n
Block Costs
We consider two cost measures:
A bc(i) = number of Bytecode instructions in block i
B t(i) = running time of block i
All steps are automated in our tool MaLiJAn³
³ http://wwwagak.cs.uni-kl.de/malijan.html
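The relative-frequency estimate can be sketched as a row-wise normalization of transition counts. The count matrix layout and the names are assumptions for illustration, not MaLiJAn's interface; extrapolation in n is not shown.

```java
class TransitionEstimate {

    /**
     * counts[i][j] = how often a transition from block i to block j was observed.
     * Returns the relative frequencies counts[i][j] / (sum of row i), which are
     * the maximum likelihood estimates of the transition probabilities.
     */
    static double[][] relativeFrequencies(long[][] counts) {
        int n = counts.length;
        double[][] p = new double[n][n];
        for (int i = 0; i < n; i++) {
            long total = 0;
            for (int j = 0; j < n; j++) total += counts[i][j];
            if (total == 0) continue;   // block i was never left: keep an all-zero row
            for (int j = 0; j < n; j++) p[i][j] = (double) counts[i][j] / total;
        }
        return p;
    }

    public static void main(String[] args) {
        // Made-up counts for three blocks.
        long[][] counts = {{0, 30, 10}, {20, 0, 0}, {0, 0, 0}};
        double[][] p = relativeFrequencies(counts);
        System.out.println(p[0][1] + " " + p[0][2] + " " + p[1][0]);
    }
}
```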
Page 40
Block Sampling
Running times t(i) in B are typically a few nanoseconds ⇒ direct measurement not possible.
Idea: Sampling Based Approach
[Figure: timeline of executed basic blocks (ns scale) with samples of the current block taken at regular intervals (µs scale)]
In regular intervals, store current basic block (concurrently)
We observe only ≈ 1‰ of all blocks ⇒ repeat execution
Relative frequencies of observed samples approach relative running time contribution of blocks.
Count in separate run how often block i gets executed in total
Together, this allows us to compute t(i)
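One way to combine the two measurements, sketched under the assumption that the total running time T of the sampled runs is known: block i accounts for the fraction s(i) / S of the samples, hence for about T · s(i) / S of the time, spread over N(i) executions. The formula t(i) = T · s(i) / (S · N(i)) and all names are a reading of the slide, not taken from the tool.

```java
class BlockSampling {

    /**
     * samples[i]    = number of samples in which block i was the current block
     * executions[i] = total number of executions of block i (counted in a separate run)
     * totalTimeNanos = total running time covered by the samples
     * Returns the estimated running time per execution of each block in nanoseconds.
     */
    static double[] blockTimes(long[] samples, long[] executions, double totalTimeNanos) {
        long sampleTotal = 0;
        for (long s : samples) sampleTotal += s;
        double[] t = new double[samples.length];
        for (int i = 0; i < samples.length; i++) {
            if (executions[i] == 0 || sampleTotal == 0) continue;   // nothing to estimate
            t[i] = totalTimeNanos * samples[i] / ((double) sampleTotal * executions[i]);
        }
        return t;
    }

    public static void main(String[] args) {
        // Made-up numbers: two blocks, 100 samples, 10 µs total running time.
        double[] t = blockTimes(new long[] {60, 40}, new long[] {1000, 500}, 10_000.0);
        System.out.println(t[0] + " ns, " + t[1] + " ns");
    }
}
```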
Page 41
A Decent Word of Caution
1 Determining current block adds a small systematic error.
2 Java Specialty: Just-in-time Compilation
Running time heavily influenced by HotSpot JIT compiler
JIT collects profiling information at beginning
⇒ First input determines which optimizations are found
. . . more details in the paper
Page 42
Input Distributions
We consider 2 different input distributions:
1 Random Permutations
well-studied in literature
2 Almost Sorted Lists
Random model by Brodal et al.⁴: A[i] chosen i. i. d. uniform in [i − d, i + d] for constant d (here d = 100)
⁴ G. Brodal, R. Fagerberg, G. Moruz: On the Adaptiveness of Quicksort, J. Exp. Algorithmics 12 (2008), pp. 3.2:1–3.2:20
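A sketch of a generator for the almost sorted model, assuming integer keys drawn uniformly from {i − d, ..., i + d}; the slide does not say whether keys are integers, and the names are made up.

```java
import java.util.Random;

class AlmostSorted {

    /** Returns a list of length n where a[i] is uniform in [i - d, i + d], independently for each i. */
    static int[] generate(int n, int d, Random rnd) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i - d + rnd.nextInt(2 * d + 1);   // nextInt yields 0 .. 2d
        }
        return a;
    }

    public static void main(String[] args) {
        int[] a = generate(20, 3, new Random());
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

Every element ends up at most 2d positions away from its place in the sorted list, so for d = 100 and long lists the input is sorted up to local disorder.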
Page 43
Results
Page 44
Asymptotic Expected Costs
Measure          Algorithm   Random Permutations    Almost Sorted Lists
Bytecodes (A)    JRE7        19.40 n ln n + 51 n    15.10 n ln n + 68 n
                 JRE7(1,3)   18.73 n ln n + 62 n    13.52 n ln n + 85 n
time -Xcomp (B)  JRE7        20.10 n ln n + 26 n    11.95 n ln n + 54 n
                 JRE7(1,3)   19.95 n ln n + 32 n    11.09 n ln n + 64 n
time warmup (B)  JRE7        10.02 n ln n +  9 n     5.52 n ln n + 13 n
                 JRE7(1,3)   11.39 n ln n + 15 n     5.38 n ln n + 19 n
[Log. plot, normalized by n ln n: JRE7 and JRE7(1,3); x-axis n from 10^5 to 10^8, y-axis from 22 to 24]
model fits data well!
Page 45
Asymptotic Expected Costs
[Log. plot, normalized by n ln n: bc / (n ln n) over n from 10^5 to 10^8, y-axis from 22 to 24; fitted curves 19.40 n ln n + 51 n (JRE7) and 18.73 n ln n + 62 n (JRE7(1,3)) together with the data for JRE7 and JRE7(1,3)]
model fits data well!
Page 46
Asymptotic Expected Costs
[Log. plot, normalized by n ln n: JRE7 and JRE7(1,3); x-axis n from 10^5 to 10^8, y-axis from 18 to 21]
model fits data well!
Page 47
Asymptotic Expected Costs
Measure          Algorithm   Random Permutations    Almost Sorted Lists
Bytecodes (A)    JRE7        19.40 n ln n + 51 n    15.10 n ln n + 68 n
                 JRE7(1,3)   18.73 n ln n + 62 n    13.52 n ln n + 85 n
⇒ asymptotically, JRE7(1,3) executes fewer Bytecodes!
Can we explain why?
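Since JRE7(1,3) has the smaller leading coefficient but the larger linear term, the fitted formulas cross at some input size. A short sketch that evaluates costs of the form a · n ln n + b · n and solves for the break-even point; setting the two formulas equal gives ln n = (b2 − b1) / (a1 − a2). The coefficients are the fitted bytecode counts for random permutations from the table; class and method names are made up.

```java
class AsymptoticCosts {

    /** Evaluates a * n ln n + b * n. */
    static double cost(double a, double b, double n) {
        return a * n * Math.log(n) + b * n;
    }

    /** Input size n at which a1 * n ln n + b1 * n equals a2 * n ln n + b2 * n (requires a1 != a2). */
    static double breakEven(double a1, double b1, double a2, double b2) {
        return Math.exp((b2 - b1) / (a1 - a2));
    }

    public static void main(String[] args) {
        // Bytecodes on random permutations: JRE7 vs. JRE7(1,3).
        double n0 = breakEven(19.40, 51, 18.73, 62);
        System.out.println("break-even at n = " + n0);
        System.out.println("n = 1e5: " + cost(19.40, 51, 1e5) + " vs. " + cost(18.73, 62, 1e5));
        System.out.println("n = 1e8: " + cost(19.40, 51, 1e8) + " vs. " + cost(18.73, 62, 1e8));
    }
}
```

With these coefficients the break-even point lies at roughly 10^7 elements: below it the fitted formula for JRE7 is smaller, above it the one for JRE7(1,3).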
Page 48
Cycle Costs
[Bar chart: cost of Cycles 1 to 5 in units of cost(Cycle 5), y-axis ticks 0, 0.5, 1; bar groups: bc, -Xcomp, with warmup. Each cycle is pictured as its path through blocks 1 to 12 of the control flow graph.]
In #Bytecodes: Cycle 3 is cheapest and Cycle 1 most expensive of all cycles.
Page 49
Asymptotic Cycle Frequencies
[Bar chart: asymptotic frequencies of Cycles 1 to 5 in units of n ln n + O(n), y-axis ticks 0, 0.2, 0.4; bar groups: JRE7 and JRE7(1,3) on random permutations, JRE7 and JRE7(1,3) on almost sorted lists. Each cycle is pictured as its path through blocks 1 to 12 of the control flow graph.]
⇒ JRE7(1,3) executes Cycle 3 more often and Cycle 1 less often than JRE7
Page 51
Asymptotic Cycle Frequencies
[Bar chart: asymptotic frequencies of Cycles 1 to 5 in units of n ln n + O(n), y-axis ticks 0, 0.2, 0.4; bar groups: JRE7 and JRE7(1,3) on random permutations, JRE7 and JRE7(1,3) on almost sorted lists.]
JRE7(1,3) executes cheap Cycle 3 more often and expensive Cycle 1 less often than JRE7.
⇒ Asymptotically, fewer executed Bytecodes!
Page 52
Running Time Results
How about running time?
HotSpot JIT compiler has two modes:
-Xcomp: JIT compiler without profiling information
warmup: profiling JIT with warmup on fixed input ⇒ trigger JIT compilation
⇒ Do Block Sampling for both modes
Should we expect the same block running times?
. . . stay tuned
Page 53
Cycle Costs
[Bar chart: cost of Cycles 1 to 5 in units of cost(Cycle 5), y-axis ticks 0, 0.5, 1; bar groups: bc, -Xcomp, with warmup.]
Measures agree qualitatively, but: smaller difference
Page 54
Cycle Costs
[Bar chart: cost of Cycles 1 to 5 in units of cost(Cycle 5), y-axis ticks 0, 0.5, 1; bars: bc, t for JRE7 and JRE7(1,3) under -Xcomp, t for JRE7 with warmup.]
Measures agree qualitatively, but: smaller difference
Page 55
Asymptotic Expected Costs
Measure          Algorithm   Random Permutations    Almost Sorted Lists
Bytecodes (A)    JRE7        19.40 n ln n + 51 n    15.10 n ln n + 68 n
                 JRE7(1,3)   18.73 n ln n + 62 n    13.52 n ln n + 85 n
time -Xcomp (B)  JRE7        20.10 n ln n + 26 n    11.95 n ln n + 54 n
                 JRE7(1,3)   19.95 n ln n + 32 n    11.09 n ln n + 64 n
time warmup (B)  JRE7        10.02 n ln n +  9 n     5.52 n ln n + 13 n
                 JRE7(1,3)   11.39 n ln n + 15 n     5.38 n ln n + 19 n
[Two log. plots over n from 10^5 to 10^8: y-axis from 20 to 24 and y-axis from 14 to 18]
Page 56
Asymptotic Expected Costs
Measure          Algorithm   Random Permutations    Almost Sorted Lists
time -Xcomp (B)  JRE7        20.10 n ln n + 26 n    11.95 n ln n + 54 n
                 JRE7(1,3)   19.95 n ln n + 32 n    11.09 n ln n + 64 n
[Two log. plots over n from 10^5 to 10^8: y-axis from 20 to 24 and y-axis from 14 to 18]
JIT without profiling
⇒ asymptotically, JRE7(1,3) faster!
Page 57
Asymptotic Expected Costs
Measure          Algorithm   Random Permutations    Almost Sorted Lists
time warmup (B)  JRE7        10.02 n ln n +  9 n     5.52 n ln n + 13 n
                 JRE7(1,3)   11.39 n ln n + 15 n     5.38 n ln n + 19 n
[Two log. plots over n from 10^5 to 10^8: y-axis from 10 to 12 and y-axis from 4 to 8]
Page 58
Asymptotic Expected Costs
Measure          Algorithm   Random Permutations    Almost Sorted Lists
time warmup (B)  JRE7        10.02 n ln n +  9 n     5.52 n ln n + 13 n
                 JRE7(1,3)   11.39 n ln n + 15 n     5.38 n ln n + 19 n
[Two log. plots over n from 10^5 to 10^8: y-axis from 10 to 12 and y-axis from 4 to 8]
JIT with profiling and warmup
⇒ asymptotically, JRE7(1,3) slower!
What changes with profiling enabled?
Page 60
Cycle Costs
[Bar chart: cost of Cycles 1 to 5 in units of cost(Cycle 5), y-axis ticks 0, 0.5, 1; bars: bc, t for JRE7 and JRE7(1,3) under -Xcomp, t for JRE7 with warmup.]
Measures agree qualitatively, except for JRE7(1,3) with profiling JIT!
Page 61
Cycle Costs
[Bar chart: cost of Cycles 1 to 5 in units of cost(Cycle 5), y-axis ticks 0, 0.5, 1; bars: bc, t for JRE7 and JRE7(1,3) under -Xcomp, t for JRE7 and JRE7(1,3) with warmup.]
Measures agree qualitatively, except for JRE7(1,3) with profiling JIT!
Page 62
Cycle Costs
[Bar chart: cost of Cycles 1 to 5 in units of cost(Cycle 5), y-axis ticks 0, 0.5, 1; bars: bc, t for JRE7 and JRE7(1,3) under -Xcomp, t for JRE7 and JRE7(1,3) with warmup.]
Measures agree qualitatively, except for JRE7(1,3) with profiling JIT!
⇒ For JRE7(1,3), the code created by the profiling JIT for Cycle 3 is much slower than for JRE7!
⇒ That’s the place to focus future research on.
Page 64
Conclusion
Summary
Java 7’s dual pivot Quicksort is highly asymmetric.
JRE7(1,3) executes fewer Bytecodes than JRE7.
Almost sorted inputs amplify impact of pivot sampling.
Oracle’s profiling JIT compiler creates different code for JRE7(1,3), which potentially overcompensates gains.
Control flow graph decomposition supported by MaLiJAn makes differences in code efficiency directly visible.
Open Problems
? What causes different costs for Cycle 3?
? Are the differences idiosyncrasies of Java / Oracle’s JRE?
? Performance of JRE7(1,3) on other inputs, especially with equal keys?