CSC 2300 Data Structures & Algorithms March 23, 2007 Chapter 7. Sorting

Dec 18, 2015

CSC 2300 Data Structures & Algorithms

March 23, 2007

Chapter 7. Sorting

Today – Sorting

Quicksort
  Algorithm
  Pivot
  Analysis
    Worst Case
    Best Case
    Average Case

Quicksort – Algorithm

1. If the number of elements in S is 0 or 1, then return.

2. Pick any element v in S. This is called the pivot.

3. Partition S – {v} into two disjoint groups:

S1 = { x ∈ S – {v} | x ≤ v }

and

S2 = { x ∈ S – {v} | x ≥ v }.

4. Return { quicksort(S1) followed by v followed by quicksort(S2)}.
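The four steps above can be sketched directly in Python. This is a minimal illustration, not an efficient in-place version. Two choices are assumptions made here, not part of the slides: the middle element is used as the pivot (step 2 allows any element), and elements equal to the pivot are all sent to S2 (the definitions allow either group).

```python
def quicksort(s):
    """Return a new sorted list built by the four steps of the algorithm."""
    # Step 1: zero or one element is already sorted.
    if len(s) <= 1:
        return list(s)
    # Step 2: pick a pivot v (assumption: the middle element).
    mid = len(s) // 2
    v = s[mid]
    rest = s[:mid] + s[mid + 1:]           # S - {v}
    # Step 3: partition S - {v} (assumption: ties go to S2).
    s1 = [x for x in rest if x < v]
    s2 = [x for x in rest if x >= v]
    # Step 4: quicksort(S1), then v, then quicksort(S2).
    return quicksort(s1) + [v] + quicksort(s2)


print(quicksort([8, 1, 4, 9, 6, 3, 5, 2, 7, 0]))
```

Building new lists at every level costs extra memory; the partition strategy used in practice rearranges the array in place.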

Quicksort – Example

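Quicksort – Partition Strategy Example

Input: 8, 1, 4, 9, 6, 3, 5, 2, 7, 0. Say 6 is chosen as pivot. Swap the pivot with the last element, then move i right and j left:

8 1 4 9 0 3 5 2 7 6   (start: i at 8, j at 7, pivot 6 at the end)

8 1 4 9 0 3 5 2 7 6   (i stops at 8, j stops at 2)

2 1 4 9 0 3 5 8 7 6   (after swapping 8 and 2)

2 1 4 9 0 3 5 8 7 6   (i stops at 9, j stops at 5)

2 1 4 5 0 3 9 8 7 6   (after swapping 9 and 5)

2 1 4 5 0 3 9 8 7 6   (i stops at 9, j stops at 3: i and j have crossed)

2 1 4 5 0 3 6 8 7 9   (after swapping the pivot 6 with the element at i)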
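The partition strategy in this example can be sketched in Python as follows. The function name `partition` and its signature are assumptions made for illustration. Both scans stop on elements equal to the pivot, which is one common convention and is not spelled out on the slide.

```python
def partition(a, pivot_index):
    """Partition list a in place around a[pivot_index]; return the pivot's final index."""
    last = len(a) - 1
    # Move the pivot out of the way by swapping it with the last element.
    a[pivot_index], a[last] = a[last], a[pivot_index]
    pivot = a[last]
    i, j = 0, last - 1
    while True:
        # i moves right past elements smaller than the pivot.
        while i < last and a[i] < pivot:
            i += 1
        # j moves left past elements larger than the pivot.
        while j > 0 and a[j] > pivot:
            j -= 1
        if i >= j:                      # i and j have crossed
            break
        a[i], a[j] = a[j], a[i]         # both stopped: swap and continue
        i += 1
        j -= 1
    # Put the pivot between the two groups.
    a[i], a[last] = a[last], a[i]
    return i


data = [8, 1, 4, 9, 6, 3, 5, 2, 7, 0]
print(partition(data, 4), data)   # pivot 6 is at index 4 of the input
```

On the input from the slide this performs the same two swaps (8 with 2, then 9 with 5) before the pivot is placed at index 6.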

Choices of Pivot

Four suggestions:
  First element of array;
  Larger of first two distinct elements of array;
  Middle element of array;
  Randomly.

What do you think about these choices? All bad choices. Why?

Good Choice of Pivot

Best choice: median of array. Disadvantage?

Practical choice: Median of Three. What is it? Median of left, right, and center elements.

Example: 8, 1, 4, 9, 6, 3, 5, 2, 7, 0. The left, center, and right elements are 8, 6, and 0, so the pivot is the median of 8, 6, and 0, which is 6.
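As a small illustration of the median-of-three rule, the sketch below only reports which value would be chosen and does not rearrange the array. The function name is hypothetical, and taking the center index as (left + right) // 2 is an assumption that is consistent with the example (index 4, value 6).

```python
def median_of_three_value(a):
    """Return the median of the left, center, and right elements of a non-empty list."""
    left = a[0]
    center = a[(len(a) - 1) // 2]   # assumption: center index = (left + right) // 2
    right = a[-1]
    return sorted([left, center, right])[1]


print(median_of_three_value([8, 1, 4, 9, 6, 3, 5, 2, 7, 0]))
```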

Example

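Example: 8, 1, 4, 9, 6, 3, 5, 2, 7, 0. Pivot = Median of 8, 6, and 0. What should new array look like?

Recall what we have done:

8 1 4 9 0 3 5 2 7 6   (i at the left end, j next to the pivot, pivot 6 at the right end)

Can we do better?

0 1 4 9 6 3 5 2 7 8   (pivot 6 in the center, between i and j)

Where should we move pivot?

0 1 4 9 7 3 5 2 6 8   (pivot 6 in the next-to-last position, with i and j to its left)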

Median-of-Three Code
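A minimal sketch of what a median-of-three routine could look like, written in Python. This is an illustration under stated assumptions rather than the code from the lecture: the name `median3` and the convention of parking the pivot at position right – 1 are assumed here, and the range is assumed to hold at least three elements.

```python
def median3(a, left, right):
    """Sort a[left], a[center], a[right] in place, park the median at a[right - 1], return it."""
    center = (left + right) // 2
    # Order the three sampled elements so that a[left] <= a[center] <= a[right].
    if a[center] < a[left]:
        a[left], a[center] = a[center], a[left]
    if a[right] < a[left]:
        a[left], a[right] = a[right], a[left]
    if a[right] < a[center]:
        a[center], a[right] = a[right], a[center]
    # a[right] is already >= pivot, so the pivot can be parked just before it.
    a[center], a[right - 1] = a[right - 1], a[center]
    return a[right - 1]


data = [8, 1, 4, 9, 6, 3, 5, 2, 7, 0]
print(median3(data, 0, len(data) - 1), data)
```

After the call, the smallest of the three samples sits at the left end and the largest at the right end, so the partitioning scan only has to cover the elements strictly between them and the parked pivot.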

Quicksort – Analysis

Quicksort is recursive. We thus get a recurrence formula:

T(0) = T(1) = 1,

T(N) = T(i) + T(N – i – 1) + cN,

where i denotes the number of elements in S1. What value of i gives worst case? What value of i gives best case?

Worst Case Analysis

We have i = 0, always. What does that say about the pivot? Always the smallest element. Recurrence becomes

T(N) = T(0) + T(N – 1) + cN. Ignore T(0), and get

T(N) = T(N – 1) + cN. Hence

T(N – 1) = T(N – 2) + c(N – 1),
T(N – 2) = T(N – 3) + c(N – 2),
…
T(2) = T(1) + c(2).

We get

T(N) = T(1) + c ∑_{i=2}^{N} i = 1 + c [ N(N+1)/2 – 1 ] = O(N^2).
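The worst-case closed form can be checked numerically. The sketch below iterates the simplified recurrence T(N) = T(N – 1) + cN from T(1) = 1 and compares it with 1 + c[N(N+1)/2 – 1]; the function names are hypothetical and c is assumed to be a positive integer so that the arithmetic stays exact.

```python
def worst_case_T(n, c=1):
    """Iterate T(k) = T(k - 1) + c*k starting from T(1) = 1."""
    t = 1
    for k in range(2, n + 1):
        t += c * k
    return t


def worst_case_closed_form(n, c=1):
    """Closed form 1 + c * (N(N+1)/2 - 1)."""
    return 1 + c * (n * (n + 1) // 2 - 1)


for n in (1, 2, 10, 100):
    print(n, worst_case_T(n), worst_case_closed_form(n))
```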

Best Case Analysis

We have i = N/2, always. What does that say about the pivot? Always the median. Recurrence becomes

T(N) = T(N/2) + T(N/2) + cN = 2 T(N/2) + cN. Do you remember how to solve this recurrence? Divide by N to get

T(N)/N = T(N/2)/(N/2) + c. Thus,

T(N/2)/(N/2) = T(N/4)/(N/4) + c,
T(N/4)/(N/4) = T(N/8)/(N/8) + c,
…
T(2)/2 = T(1)/1 + c.

We get

T(N)/N = T(1)/1 + c log N,

and so

T(N) = N + c N log N = O(N log N).
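For N a power of 2, the best-case solution can be checked exactly. The sketch below evaluates T(N) = 2 T(N/2) + cN with T(1) = 1 and compares it with N + cN log N, taking the logarithm base 2; the function names are hypothetical and c is assumed to be an integer.

```python
def best_case_T(n, c=1):
    """Evaluate T(n) = 2*T(n/2) + c*n with T(1) = 1, for n a power of 2."""
    if n == 1:
        return 1
    return 2 * best_case_T(n // 2, c) + c * n


def best_case_closed_form(n, c=1):
    """Closed form n + c*n*log2(n), for n a power of 2."""
    log2_n = n.bit_length() - 1      # exact log base 2 when n is a power of 2
    return n + c * n * log2_n


for k in range(0, 6):
    n = 2 ** k
    print(n, best_case_T(n), best_case_closed_form(n))
```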

Average Case Analysis

Always much harder than worst and best cases. What can we assume about the pivot? Assume that each of the sizes for S1 is equally likely and thus has probability 1/N.

The average value of T(i) is thus (1/N) ∑_{j=0}^{N–1} T(j). What can we say about the value of T(N – i – 1)? It has the same average. Recurrence becomes

T(N) = (2/N) ∑_{j=0}^{N–1} T(j) + cN.

Does this recurrence look familiar? We saw it when we did an internal path length analysis in Chapter 4 (Trees).

Average Case Analysis

Recurrence:

T(N) = (2/N) ∑_{j=0}^{N–1} T(j) + cN.

How can we solve this recurrence? Divide by N? No, multiply by N! We get this recurrence:

N T(N) = 2 ∑_{j=0}^{N–1} T(j) + cN^2.

How do we get rid of the ∑ T(j)? We use this recurrence:

(N – 1) T(N – 1) = 2 ∑_{j=0}^{N–2} T(j) + c(N – 1)^2.

Subtracting one recurrence from the other, we get

N T(N) – (N – 1) T(N – 1) = 2 T(N – 1) + c(2N – 1).

Simplifying and dropping the insignificant –c term, we get

N T(N) = (N+1) T(N – 1) + 2cN.
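The subtraction step can be checked with exact rational arithmetic. The sketch below tabulates T(N) from the averaged recurrence T(N) = (2/N) ∑ T(j) + cN (sum over j = 0 to N – 1, with T(0) = T(1) = 1) and confirms that N T(N) – (N – 1) T(N – 1) = 2 T(N – 1) + c(2N – 1). The function name is hypothetical. The identity is checked for N ≥ 3 only, because it needs T(N – 1) itself to satisfy the recurrence, and T(1) is a base case.

```python
from fractions import Fraction


def average_case_table(n_max, c=1):
    """Return [T(0), ..., T(n_max)] for T(N) = (2/N) * sum_{j<N} T(j) + c*N."""
    t = [Fraction(1), Fraction(1)]          # T(0) = T(1) = 1
    running_sum = t[0] + t[1]               # sum of T(j) for j < 2
    for n in range(2, n_max + 1):
        t.append(Fraction(2, n) * running_sum + c * n)
        running_sum += t[n]
    return t


c = 1
t = average_case_table(20, c)
for n in range(3, 21):
    lhs = n * t[n] - (n - 1) * t[n - 1]
    rhs = 2 * t[n - 1] + c * (2 * n - 1)
    assert lhs == rhs
print(t[2], t[3])
```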

Recurrence

Recurrence: N T(N) = (N+1) T(N – 1) + 2cN.

How can we solve this recurrence? Divide by N? Divide by N+1? No, divide by N(N+1)! We get this recurrence:

T(N)/(N+1) = T(N – 1)/N + 2c/(N+1). What to do now? We can telescope:

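T(N – 1)/N = T(N – 2)/(N – 1) + 2c/N,
T(N – 2)/(N – 1) = T(N – 3)/(N – 2) + 2c/(N – 1),
…
T(2)/3 = T(1)/2 + 2c/3.

We get this solution:

T(N)/(N+1) = T(1)/2 + 2c ∑_{i=3}^{N+1} (1/i).

What does ∑ (1/i) equal? It is about ln(N+1) + γ – 3/2, where γ ≈ 0.577 is Euler's constant, so T(N)/(N+1) = O(log N). We get T(N) = O(N log N).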
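The telescoped solution can be verified exactly for small N. The sketch below iterates the simplified recurrence N T(N) = (N+1) T(N – 1) + 2cN from T(1) = 1 and compares it with (N+1) [ T(1)/2 + 2c ∑ 1/i ], where the sum runs over i = 3 to N+1. The function names are hypothetical.

```python
from fractions import Fraction


def simplified_T(n, c=1):
    """Iterate k*T(k) = (k+1)*T(k-1) + 2*c*k starting from T(1) = 1."""
    t = Fraction(1)
    for k in range(2, n + 1):
        t = ((k + 1) * t + 2 * c * k) / k
    return t


def telescoped_T(n, c=1):
    """Closed form (N+1) * (T(1)/2 + 2c * sum_{i=3}^{N+1} 1/i)."""
    harmonic_tail = sum(Fraction(1, i) for i in range(3, n + 2))
    return (n + 1) * (Fraction(1, 2) + 2 * c * harmonic_tail)


for n in (1, 2, 5, 10):
    print(n, simplified_T(n), telescoped_T(n))
```

Because the sum of 1/i grows like a logarithm, the factor multiplying N+1 grows like log N, which is where the O(N log N) bound comes from.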