Copyright (C) Gal Kaminka 2003. Data Structures and Algorithms. Sorting II: Divide and Conquer Sorting. Gal A. Kaminka, Computer Science Department

Dec 18, 2015

Osborn Sharp
Copyright (C) Gal Kaminka 2003

Data Structures and Algorithms

Sorting II:

Divide and Conquer Sorting

Gal A. Kaminka

Computer Science Department

Last week: in-place sorting

Bubble Sort: O(n^2) comparisons, O(n) best-case comparisons, O(n^2) exchanges

Selection Sort: O(n^2) comparisons, O(n^2) best-case comparisons, O(n) exchanges (always)

Insertion Sort: O(n^2) comparisons, O(n) best-case comparisons; fewer exchanges than bubble sort; best in practice for small lists (<30)

This week

Mergesort: O(n log n) always, O(n) storage

Quick sort: O(n log n) average, O(n^2) worst; good in practice (>30), O(log n) storage

MergeSort

A divide-and-conquer technique.

Each unsorted collection is split into 2, then again, then again, ... until we have collections of size 1.

Now we merge sorted collections, then again, then again, ... until we merge the two halves.

MergeSort(array a, indexes low, high)

  if (low < high)
    middle ← (low + high)/2          // integer division
    MergeSort(a, low, middle)        // split 1
    MergeSort(a, middle+1, high)     // split 2
    Merge(a, low, middle, high)      // merge 1+2

Merge(array a, index low, mid, high)

  b ← empty array, t ← mid+1, i ← low, tl ← low
  while (tl <= mid AND t <= high)
    if (a[tl] <= a[t])
      b[i] ← a[tl]
      i ← i+1, tl ← tl+1
    else
      b[i] ← a[t]
      i ← i+1, t ← t+1
  if tl <= mid copy a[tl…mid] into b[i…]
  else if t <= high copy a[t…high] into b[i…]
  copy b[low…high] onto a[low…high]
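The MergeSort and Merge pseudocode can be sketched in Python roughly as follows. This is a minimal illustration, not the author's implementation: the names merge and merge_sort, the 0-based inclusive indexes, and the default arguments are assumptions made for the example.

```python
def merge(a, low, mid, high):
    """Merge the sorted runs a[low..mid] and a[mid+1..high] (inclusive bounds)."""
    b = []                      # temporary storage for the merged run
    tl, t = low, mid + 1        # read positions in the left and right runs
    while tl <= mid and t <= high:
        if a[tl] <= a[t]:       # <= takes equal items from the left run first
            b.append(a[tl])
            tl += 1
        else:
            b.append(a[t])
            t += 1
    b.extend(a[tl:mid + 1])     # leftover items of the left run, if any
    b.extend(a[t:high + 1])     # leftover items of the right run, if any
    a[low:high + 1] = b         # copy the merged run back onto a


def merge_sort(a, low=0, high=None):
    """Sort a[low..high], using the temporary list built inside merge."""
    if high is None:
        high = len(a) - 1
    if low < high:
        middle = (low + high) // 2
        merge_sort(a, low, middle)        # split 1
        merge_sort(a, middle + 1, high)   # split 2
        merge(a, low, middle, high)       # merge 1+2


data = [25, 57, 48, 37, 12, 92, 86, 33]
merge_sort(data)
print(data)  # [12, 25, 33, 37, 48, 57, 86, 92]
```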

An example

Initial: [25 57 48 37 12 92 86 33]

Split: [25 57 48 37] [12 92 86 33]

Split: [25 57] [48 37] [12 92] [86 33]

Split: [25] [57] [48] [37] [12] [92] [86] [33]

Merge: [25 57] [37 48] [12 92] [33 86]

Merge: [25 37 48 57] [12 33 86 92]

Merge: [12 25 33 37 48 57 86 92]

The complexity of MergeSort

Every split, we halve the collection. How many times can this be done?

We are looking for x, where 2^x = n

x = log_2 n

So there are a total of log n levels of splitting

The complexity of MergeSort

Each merge is of what run-time?

First merge step: n/2 merges of 2 items each, n steps in total
Second merge step: n/4 merges of 4 items each, n steps in total
Third merge step: n/8 merges of 8 items each, n steps in total
...
How many merge steps? Same as the number of split levels: log n

Total: n log n steps
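The n log n total can be checked numerically with a small sketch. The function merge_sort_count below is a hypothetical helper written for this illustration; it counts one step for every item written during a merge and assumes n is a power of two so that the halves split evenly.

```python
import math


def merge_sort_count(items):
    """Return (sorted copy, number of items written during all merges)."""
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, left_writes = merge_sort_count(items[:mid])
    right, right_writes = merge_sort_count(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    # Every item of this sub-collection is written once at this level.
    return merged, left_writes + right_writes + len(merged)


for n in (8, 64, 1024):
    _, writes = merge_sort_count(list(range(n, 0, -1)))
    print(n, writes, n * int(math.log2(n)))
# 8 24 24
# 64 384 384
# 1024 10240 10240
```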

Storage complexity of MergeSort

Every merge, we need to hold the merged array:

[Figure: arrays with cells numbered 1 to 6, illustrating the extra array that holds the merged result]

Storage complexity of MergeSort

So we need temporary storage for merging, which is the same size as the two collections together.

To merge the last two sub-arrays (each of size n/2), we need n/2 + n/2 = n temporary storage.

Total: O(n) storage

MergeSort summary

O(n log n) runtime (best and worst)
O(n) storage (not in-place)
Very naturally done using recursion, but note that it can be done without recursion!

In practice: can be improved by combining with insertion sort. Split down to arrays of size 20-30, sort those with insertion sort, then merge.
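The combination with insertion sort might look like the following sketch. The names hybrid_merge_sort and insertion_sort and the CUTOFF value of 24 are assumptions chosen for illustration (24 simply lies in the 20-30 range mentioned above); a real threshold would be tuned by measurement.

```python
CUTOFF = 24  # assumed threshold, within the 20-30 range mentioned above


def insertion_sort(a, low, high):
    """Sort a[low..high] in place by insertion."""
    for i in range(low + 1, high + 1):
        item = a[i]
        j = i - 1
        while j >= low and a[j] > item:
            a[j + 1] = a[j]     # shift larger items one place to the right
            j -= 1
        a[j + 1] = item


def hybrid_merge_sort(a, low=0, high=None):
    """Merge sort that stops splitting once a range has at most CUTOFF items."""
    if high is None:
        high = len(a) - 1
    if high - low + 1 <= CUTOFF:
        insertion_sort(a, low, high)   # small range: no further splitting
        return
    middle = (low + high) // 2
    hybrid_merge_sort(a, low, middle)
    hybrid_merge_sort(a, middle + 1, high)
    # Merge the two sorted halves through a temporary list.
    merged, i, j = [], low, middle + 1
    while i <= middle and j <= high:
        if a[i] <= a[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(a[j])
            j += 1
    merged.extend(a[i:middle + 1])
    merged.extend(a[j:high + 1])
    a[low:high + 1] = merged


data = [(i * 37) % 101 for i in range(100)]   # 100 items in scrambled order
hybrid_merge_sort(data)
print(data == sorted(data))  # True
```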

QuickSort

Key idea: select an item (called the pivot) and put it into its proper FINAL position. Make sure:

All greater items are on one side (side 1)
All smaller items are on the other side (side 2)

Repeat for side 1, then repeat for side 2.
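The key idea can be shown with a short sketch that builds new lists instead of working in place. This is only an illustration of the idea, not the in-place method the slides develop; the name quick_sort_copy and the choice of the first item as pivot are assumptions.

```python
def quick_sort_copy(items):
    """Return a sorted copy of items (illustrative, not in-place)."""
    if len(items) <= 1:
        return list(items)
    pivot = items[0]                                 # first item as the pivot
    smaller = [x for x in items[1:] if x <= pivot]   # goes to one side
    greater = [x for x in items[1:] if x > pivot]    # goes to the other side
    # The pivot sits between the two sides, which is its final position.
    return quick_sort_copy(smaller) + [pivot] + quick_sort_copy(greater)


print(quick_sort_copy([25, 57, 48, 37, 12, 92, 86, 33]))
# [12, 25, 33, 37, 48, 57, 86, 92]
```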

Short example

25 57 48 37 12 92 86 33

Let’s select 25 as our initial pivot. We move items such that:

All items left of 25 are smaller
All items right of 25 are larger

As a result, 25 is now in its final position:

12 25 57 48 37 92 86 33

Now, repeat (recursively) for left and right sides

12 25 57 48 37 92 86 33

Sort 12
Sort 57 48 37 92 86 33

12 needs no sorting. For the other side, we repeat the process:

Select a pivot item (let’s take 57)
Move items around such that left items are smaller, etc.

12 25 57 48 37 92 86 33

Changes into

12 25 48 37 33 57 92 86

And now we repeat the process for left

12 25 37 33 48 57 92 86

12 25 33 37 48 57 92 86

12 25 33 37 48 57 92 86

And for the right

12 25 33 37 48 57 86 92

12 25 33 37 48 57 86 92

QuickSort(array a; index low, hi)

  if (low >= hi)
    return                                   // a[low..hi] is sorted
  pivot ← FindPivot(a, low, hi)
  p_index ← partition(a, low, hi, pivot)
  QuickSort(a, low, p_index-1)
  QuickSort(a, p_index+1, hi)

Key questions

How do we select an item (FindPivot())?

If we always select the largest item as the pivot, then this process becomes Selection Sort, which is O(n^2).

So this works well only if we select items “in the middle”, since then we will have log n divisions.

How do we move items around efficiently (Partition())?

If moving items is expensive, it offsets the benefit of partitioning.
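The effect of a bad pivot can be seen by counting comparisons. The sketch below is a hypothetical experiment, not part of the slides: it always takes the first item as the pivot, so on already sorted input the pivot is always the smallest item, which unbalances the split in the same way as always picking the largest.

```python
import random


def count_comparisons(items):
    """Count item-vs-pivot comparisons for a first-item-pivot quicksort."""
    if len(items) <= 1:
        return 0
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]
    # Each remaining item is compared with the pivot once at this level.
    return len(rest) + count_comparisons(smaller) + count_comparisons(greater)


n = 200
already_sorted = list(range(n))
shuffled = already_sorted[:]
random.Random(0).shuffle(shuffled)   # fixed seed, so runs are repeatable

print(count_comparisons(already_sorted))  # 19900, which is n(n-1)/2
print(count_comparisons(shuffled))        # far fewer, on the order of n log n
```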

FindPivot

To find a real median (middle item) takes O(n). In practice, however, we want this to be O(1), so we approximate. Two options:

Take the first item (a[low]) as the pivot
Take the median of {a[low], a[hi], a[(low+hi)/2]}

FindPivot(array a; index low, high)

  return a[low]
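The median-of-three option could be sketched as below. The name find_pivot_median3 and the decision to return the pivot's index together with its value are assumptions made for this example.

```python
def find_pivot_median3(a, low, hi):
    """Return (pivot, pivot_index): the median of a[low], a[mid], a[hi]."""
    mid = (low + hi) // 2
    candidates = sorted([(a[low], low), (a[mid], mid), (a[hi], hi)])
    return candidates[1]        # middle of the three (value, index) pairs


a = [25, 57, 48, 37, 12, 92, 86, 33]
print(find_pivot_median3(a, 0, len(a) - 1))  # (33, 7)
```

Here the three candidates are 25, 37 and 33, so the median is 33. Sorting three pairs takes constant time, so the selection stays O(1).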

Partition (in O(n))

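Key idea: keep two indexes into the array

down starts at low and moves toward higher indexes, stopping at an item > pivot
up starts at hi and moves toward lower indexes, stopping at an item <= pivot

We move down and up through the array. Whenever they point inconsistently, interchange the two items.

At end: up and down have met or crossed, and up is the final location of the pivot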

partition(array a; index low, hi; pivot; index pivot_i)

1. down ← low, up ← hi
2. while (down < up)
3.   while (a[down] <= pivot && down < hi)
4.     down ← down + 1
5.   while (a[up] > pivot)
6.     up ← up - 1
7.   if (down < up)
8.     swap(a[down], a[up])
9. a[pivot_i] ← a[up]
10. a[up] ← pivot
11. return up
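A runnable Python sketch of partition() and QuickSort() follows. It assumes the pivot is always a[low] (so pivot_i equals low), uses 0-based indexes, and the function names are chosen for this example only.

```python
def partition(a, low, hi):
    """Partition a[low..hi] around pivot a[low]; return the pivot's final index."""
    pivot = a[low]
    down, up = low, hi
    while down < up:
        while a[down] <= pivot and down < hi:
            down += 1                    # skip items that belong on the left
        while a[up] > pivot:
            up -= 1                      # skip items that belong on the right
        if down < up:
            a[down], a[up] = a[up], a[down]
    a[low] = a[up]                       # put the pivot in its final place
    a[up] = pivot
    return up


def quick_sort(a, low=0, hi=None):
    """Sort a[low..hi] in place."""
    if hi is None:
        hi = len(a) - 1
    if low >= hi:
        return                           # a[low..hi] is already sorted
    p_index = partition(a, low, hi)
    quick_sort(a, low, p_index - 1)
    quick_sort(a, p_index + 1, hi)


data = [25, 57, 48, 37, 12, 92, 86, 33]
print(partition(data, 0, len(data) - 1))  # 1 (0-based index of the pivot)
print(data)                               # [12, 25, 48, 37, 57, 92, 86, 33]
quick_sort(data)
print(data)                               # [12, 25, 33, 37, 48, 57, 86, 92]
```

The scan on up needs no bounds check because a[low] holds the pivot itself, which stops the loop at low at the latest.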

Example: partition() with pivot=25

Positions are counted from 1, so low=1 and hi=8.

First pass through loop on line 2:

25 57 48 37 12 92 86 33      down=1 (25), up=8 (33)

We go into loop in line 3 (while a[down]<=pivot): down moves to 2 (57)

We go into loop in line 5 (while a[up]>pivot): up moves past 33, 86 and 92 and stops at 5 (12)

Now we found an inconsistency! a[down]=57 is larger than the pivot, a[up]=12 is not.

So we swap a[down] with a[up]:

25 12 48 37 57 92 86 33      down=2 (12), up=5 (57)

Second pass through loop on line 2:

Move down again (increasing), loop on line 3: down moves to 3 (48)

Now we begin to move up again, loop on line 5: up moves past 57, 37 and 48 and stops at 2 (12)

down < up? No (3 is not less than 2). So we don’t swap.

Instead, we are done. Just put pivot in place (swap it with a[up]; for us a[low] was the pivot):

12 25 48 37 57 92 86 33

Now we return 2 as the new pivot index

Notes

We need the initial pivot_index in partition(). For instance, change FindPivot() to return the pivot (a[low]) as well as the initial pivot_index (low), then use pivot_index in the final swap.

QuickSort: average O(n log n), worst case O(n^2)

Works very well in practice (collections >30)
Space requirements: O(log n) on average, for the recursion
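The recursion space depends on how deep the calls nest. One common way to keep the depth at O(log n) even for unlucky pivots, which the slides do not describe, is to recurse only on the smaller side and loop on the larger one. The sketch below assumes the a[low] pivot; quick_sort_bounded is a hypothetical name, and its return value (the deepest call level reached) exists only to make the effect visible.

```python
def partition(a, low, hi):
    """Partition a[low..hi] around pivot a[low]; return the pivot's final index."""
    pivot = a[low]
    down, up = low, hi
    while down < up:
        while a[down] <= pivot and down < hi:
            down += 1
        while a[up] > pivot:
            up -= 1
        if down < up:
            a[down], a[up] = a[up], a[down]
    a[low] = a[up]
    a[up] = pivot
    return up


def quick_sort_bounded(a, low=0, hi=None, depth=1):
    """Sort a[low..hi] in place; return the deepest recursion level reached."""
    if hi is None:
        hi = len(a) - 1
    deepest = depth
    while low < hi:
        p = partition(a, low, hi)
        if p - low < hi - p:
            # Left side is smaller: recurse on it, then loop on the right side.
            deepest = max(deepest, quick_sort_bounded(a, low, p - 1, depth + 1))
            low = p + 1
        else:
            # Right side is smaller (or equal): recurse on it, loop on the left.
            deepest = max(deepest, quick_sort_bounded(a, p + 1, hi, depth + 1))
            hi = p - 1
    return deepest


data = list(range(500))            # sorted input: worst case for an a[low] pivot
print(quick_sort_bounded(data))    # 2, since every recursive call gets an empty side
```

The running time on such input is still O(n^2); only the stack depth is bounded.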