September 20, 2004 (c) Piotr Indyk & Charles Leiserson L4.14
Quicksort
• Proposed by C.A.R. Hoare in 1962.
• Divide-and-conquer algorithm.
• Sorts “in place” (like insertion sort, but not like merge sort).
• Very practical (with tuning).
• Can be viewed as a randomized Las Vegas algorithm.
Quicksort an n-element array:
1. Divide: Partition the array into two subarrays around a pivot x such that elements in the lower subarray ≤ x ≤ elements in the upper subarray.
2. Conquer: Recursively sort the two subarrays.
3. Combine: Trivial.
Divide and conquer
[figure: array arranged as  ≤ x | x | ≥ x]
Key: linear-time partitioning subroutine.
Pseudocode for quicksort
QUICKSORT(A, p, r)
  if p < r
    then q ← PARTITION(A, p, r)
         QUICKSORT(A, p, q–1)
         QUICKSORT(A, q+1, r)
Initial call: QUICKSORT(A, 1, n)
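The pseudocode translates almost line for line into runnable Python. This is a sketch, not the slides’ exact artifact: it uses 0-indexed arrays instead of the 1-indexed convention above, and inlines PARTITION so the block is self-contained.

```python
# A minimal Python sketch of QUICKSORT (0-indexed, sorts in place).

def partition(A, p, r):
    x = A[p]                      # pivot = A[p]
    i = p
    for j in range(p + 1, r + 1):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[p], A[i] = A[i], A[p]       # place pivot between the two regions
    return i

def quicksort(A, p, r):
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)    # sort lower subarray
        quicksort(A, q + 1, r)    # sort upper subarray

A = [6, 10, 13, 5, 8, 3, 2, 11]
quicksort(A, 0, len(A) - 1)
print(A)  # [2, 3, 5, 6, 8, 10, 11, 13]
```

The initial call becomes `quicksort(A, 0, len(A) - 1)`, mirroring QUICKSORT(A, 1, n).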
Partitioning subroutine
PARTITION(A, p, r)        ⊳ A[p . . r]
  x ← A[p]                ⊳ pivot = A[p]
  i ← p
  for j ← p + 1 to r
    do if A[j] ≤ x
         then i ← i + 1
              exchange A[i] ↔ A[j]
  exchange A[p] ↔ A[i]
  return i
Invariant: within A[p . . r] we have A[p] = x, A[p+1 . . i] ≤ x, A[i+1 . . j–1] ≥ x, and A[j . . r] not yet examined.
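A runnable 0-indexed Python version of PARTITION, with the loop invariant checked by assertions after every iteration (the assertions are for illustration only; they add linear work per step):

```python
# PARTITION (0-indexed), asserting the invariant:
#   A[p+1..i] <= x   and   A[i+1..j] >= x   after processing index j.

def partition(A, p, r):
    x = A[p]                                  # pivot = A[p]
    i = p
    for j in range(p + 1, r + 1):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
        assert all(v <= x for v in A[p + 1:i + 1])  # lower region
        assert all(v >= x for v in A[i + 1:j + 1])  # upper region
    A[p], A[i] = A[i], A[p]                   # pivot lands at index i
    return i

A = [6, 10, 13, 5, 8, 3, 2, 11]
q = partition(A, 0, len(A) - 1)
print(q, A)  # 3 [2, 5, 3, 6, 8, 13, 10, 11]
```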
Example of partitioning (pivot x = 6)
6 10 13  5  8  3  2 11     initial: i at the pivot, j scans rightward
6  5 13 10  8  3  2 11     5 ≤ 6: increment i, exchange A[i] ↔ A[j]
6  5  3 10  8 13  2 11     3 ≤ 6: increment i, exchange
6  5  3  2  8 13 10 11     2 ≤ 6: increment i, exchange
2  5  3  6  8 13 10 11     done: exchange A[p] ↔ A[i]; pivot lands at position i
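The successive array states above can be reproduced by instrumenting PARTITION to record the array after each exchange (a sketch in 0-indexed Python; `partition_states` is a hypothetical name):

```python
# Record the array state after each exchange made by PARTITION.

def partition_states(A, p, r):
    x, i = A[p], p
    states = [A.copy()]               # initial state
    for j in range(p + 1, r + 1):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
            states.append(A.copy())   # after moving A[j] into the <= x region
    A[p], A[i] = A[i], A[p]
    states.append(A.copy())           # after placing the pivot
    return i, states

q, states = partition_states([6, 10, 13, 5, 8, 3, 2, 11], 0, 7)
for s in states:
    print(s)
```

The five printed rows match the five rows of the example above.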
Analysis of quicksort
• Assume all input elements are distinct.
• In practice, there are better partitioning algorithms for when duplicate input elements may exist.
• What is the worst-case running time of quicksort?
Worst-case of quicksort
• Input sorted or reverse sorted.
• Partition around the min or max element.
• One side of the partition always has no elements.
T(n) = T(0) + T(n–1) + Θ(n)
     = Θ(1) + T(n–1) + Θ(n)
     = T(n–1) + Θ(n)
     = Θ(n²)   (arithmetic series)
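The Θ(n²) behavior is easy to observe by counting comparisons on an already-sorted input, where every partition is a 0 : n–1 split. This is a sketch; `quicksort_comparisons` is a hypothetical instrumented variant (0-indexed):

```python
# Count comparisons made by quicksort (pivot = first element) on a
# sorted input: every split is 0 : n-1, so the total is
# (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons, i.e. Theta(n^2).

def quicksort_comparisons(A, p, r):
    if p >= r:
        return 0
    x, i, comps = A[p], p, 0
    for j in range(p + 1, r + 1):
        comps += 1                    # one comparison A[j] <= x
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[p], A[i] = A[i], A[p]
    return (comps + quicksort_comparisons(A, p, i - 1)
                  + quicksort_comparisons(A, i + 1, r))

n = 100
print(quicksort_comparisons(list(range(n)), 0, n - 1))  # 4950 = n(n-1)/2
```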
Worst-case recursion tree
T(n) = T(0) + T(n–1) + cn
Expanding: the root costs cn, with children T(0) = Θ(1) and T(n–1); T(n–1) expands to c(n–1) with children T(0) and T(n–2); and so on, down to T(0). The tree has height h = n.
Total: Θ( ∑_{k=1}^{n} k ) = Θ(n²)
Nice-case analysis
If we’re lucky, PARTITION splits the array evenly:
T(n) = 2T(n/2) + Θ(n)
     = Θ(n lg n)   (same as merge sort)
What if the split is always 1/10 : 9/10?
T(n) = T(n/10) + T(9n/10) + Θ(n)
Analysis of nice case
Recursion tree for T(n) = T(n/10) + T(9n/10) + cn:
  level 0:  cn
  level 1:  (1/10)cn + (9/10)cn = cn
  level 2:  (1/100)cn + (9/100)cn + (9/100)cn + (81/100)cn = cn
  …
Each level contributes at most cn. The shallowest Θ(1) leaf is at depth log_{10} n and the deepest at depth log_{10/9} n, so
  cn log_{10} n ≤ T(n) ≤ cn log_{10/9} n + O(n),
hence T(n) = Θ(n lg n).
Randomized quicksort
• Partition around a random element, i.e., around A[t], where t is chosen uniformly at random from {p, …, r}.
• We will show that the expected time is O(n log n).
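A sketch of randomized quicksort in Python: swap a uniformly random element into position p, then partition exactly as before (0-indexed; note `random.randint` is inclusive on both ends):

```python
import random

# Randomized quicksort: a uniformly random A[t] becomes the pivot.

def partition(A, p, r):
    x, i = A[p], p
    for j in range(p + 1, r + 1):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[p], A[i] = A[i], A[p]
    return i

def randomized_quicksort(A, p, r):
    if p < r:
        t = random.randint(p, r)       # t uniform over {p, ..., r}
        A[p], A[t] = A[t], A[p]        # move random element into pivot slot
        q = partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)

A = [6, 10, 13, 5, 8, 3, 2, 11]
randomized_quicksort(A, 0, len(A) - 1)
print(A)  # [2, 3, 5, 6, 8, 10, 11, 13]
```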
“Paranoid” quicksort
• Modify the algorithm to make it easier to analyze:
• Repeat:
  • Choose the pivot to be a random element of the array.
  • Perform PARTITION.
• Until the resulting split is “lucky”, i.e., no worse than 1/10 : 9/10.
• Recurse on both subarrays.
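A sketch of the “paranoid” variant in 0-indexed Python. The luckiness test below, requiring each side to receive at least ⌊n/10⌋ elements, is one way to read “no worse than 1/10 : 9/10”. A failed attempt needs no undo, since PARTITION only permutes the subarray before we retry:

```python
import random

# "Paranoid" quicksort sketch: retry random pivots until the split
# is lucky (each side at least floor(n/10) of the n elements).

def partition(A, p, r):
    x, i = A[p], p
    for j in range(p + 1, r + 1):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[p], A[i] = A[i], A[p]
    return i

def paranoid_quicksort(A, p, r):
    if p >= r:
        return
    n = r - p + 1
    while True:
        t = random.randint(p, r)          # random pivot index
        A[p], A[t] = A[t], A[p]
        q = partition(A, p, r)
        if min(q - p, r - q) >= n // 10:  # lucky split: stop retrying
            break
        # unlucky: just retry; A[p..r] is merely permuted
    paranoid_quicksort(A, p, q - 1)
    paranoid_quicksort(A, q + 1, r)

A = [6, 10, 13, 5, 8, 3, 2, 11]
paranoid_quicksort(A, 0, len(A) - 1)
print(A)  # [2, 3, 5, 6, 8, 10, 11, 13]
```

Since at least 8/10 of the pivot choices are lucky, the expected number of retries per level is constant.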
Quicksort in practice
• Quicksort is a great general-purpose sorting algorithm.
• Quicksort is typically over twice as fast as merge sort.
• Quicksort can benefit substantially from code tuning.
• Quicksort behaves well even with caching and virtual memory.
• Quicksort is great!
More intuition
Suppose we alternate lucky, unlucky, lucky, unlucky, lucky, ….
L(n) = 2U(n/2) + Θ(n)    lucky
U(n) = L(n – 1) + Θ(n)   unlucky
Solving:
L(n) = 2(L(n/2 – 1) + Θ(n/2)) + Θ(n)
     = 2L(n/2 – 1) + Θ(n)
     = Θ(n lg n)   Lucky!
How can we make sure we are usually lucky?
Randomized quicksort analysis
Let T(n) = the random variable for the running time of randomized quicksort on an input of size n, assuming random numbers are independent.
For k = 0, 1, …, n–1, define the indicator random variable
  X_k = 1 if PARTITION generates a k : n–k–1 split,
        0 otherwise.
E[X_k] = Pr{X_k = 1} = 1/n, since all splits are equally likely, assuming elements are distinct.
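To see why all splits are equally likely when elements are distinct: each of the n pivot choices has a different rank, and a pivot of rank k produces exactly the k : n–k–1 split. An exhaustive check on the example array (`split_sizes` is a hypothetical helper):

```python
# Each of the n pivot choices yields a distinct split k : n-k-1,
# so a uniformly random pivot hits each split with probability 1/n.

def split_sizes(A, t):
    x = A[t]
    k = sum(1 for v in A if v < x)    # rank of pivot = size of lower side
    return k, len(A) - k - 1

A = [6, 10, 13, 5, 8, 3, 2, 11]
n = len(A)
splits = sorted(split_sizes(A, t) for t in range(n))
print(splits)
# [(0, 7), (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1), (7, 0)]
```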
Expected value
Expected value = ∑ outcome • probability
E[X_k] = 1•Pr{X_k = 1} + 0•Pr{X_k = 0}
       = 1•(1/n) + 0•((n–1)/n)
       = 1/n
Dice example:
X_k : indicator random variable for a die roll
X_k = 1 if the die’s value is k, 0 otherwise
E[X_5] = 1•Pr{X_5 = 1} + 0•Pr{X_5 = 0}
       = 1•(1/6) + 0•(5/6) = 1/6
Analysis (continued)
T(n) = T(0) + T(n–1) + Θ(n)      if 0 : n–1 split,
       T(1) + T(n–2) + Θ(n)      if 1 : n–2 split,
         ⋮
       T(n–1) + T(0) + Θ(n)      if n–1 : 0 split
     = ∑_{k=0}^{n–1} X_k ( T(k) + T(n–k–1) + Θ(n) ).
Calculating expectation
E[T(n)] = E[ ∑_{k=0}^{n–1} X_k ( T(k) + T(n–k–1) + Θ(n) ) ]
            (take expectations of both sides)
        = ∑_{k=0}^{n–1} E[ X_k ( T(k) + T(n–k–1) + Θ(n) ) ]
            (linearity of expectation)
        = ∑_{k=0}^{n–1} E[X_k] • E[ T(k) + T(n–k–1) + Θ(n) ]
            (independence of X_k from other random choices)
        = ∑_{k=0}^{n–1} (1/n) E[T(k)] + ∑_{k=0}^{n–1} (1/n) E[T(n–k–1)] + ∑_{k=0}^{n–1} (1/n) Θ(n)
            (linearity of expectation; E[X_k] = 1/n)
        = (2/n) ∑_{k=1}^{n–1} E[T(k)] + Θ(n)
            (the two summations have identical terms)
Hairy recurrence
E[T(n)] = (2/n) ∑_{k=2}^{n–1} E[T(k)] + Θ(n)
(The k = 0, 1 terms can be absorbed in the Θ(n).)
Prove: E[T(n)] ≤ an lg n for constant a > 0.
Use fact: ∑_{k=2}^{n–1} k lg k ≤ (1/2) n² lg n – (1/8) n²   (exercise).
• Choose a large enough so that an lg n dominates E[T(n)] for sufficiently small n ≥ 2.
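The fact can be sanity-checked numerically for a range of n (this is no substitute for the exercise’s proof, which splits or integrates the sum):

```python
import math

# Numerically check:  sum_{k=2}^{n-1} k lg k <= (1/2) n^2 lg n - (1/8) n^2
# where lg is log base 2.

def lhs(n):
    return sum(k * math.log2(k) for k in range(2, n))

def rhs(n):
    return 0.5 * n * n * math.log2(n) - n * n / 8

for n in (4, 10, 100, 1000):
    print(n, lhs(n) <= rhs(n))   # True for each n
```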
Substitution method
E[T(n)] ≤ (2/n) ∑_{k=2}^{n–1} a k lg k + Θ(n)
            (substitute inductive hypothesis)
        ≤ (2a/n) ( (1/2) n² lg n – (1/8) n² ) + Θ(n)
            (use fact)
        = an lg n – ( an/4 – Θ(n) )
            (express as desired – residual)
        ≤ an lg n,
if a is chosen large enough so that an/4 dominates the Θ(n).
Randomized Algorithms
• Algorithms that make decisions based on random coin flips.
• Can “fool” the adversary.
• The running time (or even correctness) is a random variable; we measure the expected running time.
• We assume all random choices are independent.
• This is not the average case!