Adapted from instructor resource slides: Nyhoff, ADTs, Data Structures and Problem Solving with C++, Second Edition, © 2005 Pearson Education, Inc. All rights reserved. 0-13-140909-3

Jan 02, 2016


Daisy Arnold

Mergesort and Review

Chapter 13

6/15/15

Today

• Any questions on project?
• Exams
  – Review questions
  – Easy question/fix (I added your points wrong): come see me in class. If you need a re-grade, follow the instructions in the syllabus. Come see me in office hours, or make an appointment.
• Review Thursday
• Break
• More (new) Sorting

Categories of Sorting Algorithms

• Selection sort
  – Make passes through a list
  – On each pass, reposition correctly some element (largest or smallest)

Array Based Selection Sort Pseudo-Code

// x[0] is reserved; the list occupies x[1]…x[n]
For i = 1 to n-1 do the following:
    // Find the smallest element in the sublist x[i]…x[n]
    Set smallPos = i and smallest = x[smallPos]
    For j = i + 1 to n do the following:
        If x[j] < smallest:    // smaller element found
            Set smallPos = j and smallest = x[smallPos]
    End for
    // Now interchange smallest with x[i], the first element of this sublist
    Set x[smallPos] = x[i] and x[i] = smallest
End for
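The pseudo-code above can be sketched in C++ roughly as follows. This is a minimal illustration, not the textbook's code: it assumes a `std::vector<int>` with 0-based indexing (no reserved slot 0), and the function name `selectionSort` is chosen here for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts v into ascending order by repeatedly selecting the smallest
// remaining element. Uses 0-based indexing, unlike the slide's x[1]..x[n].
void selectionSort(std::vector<int>& v) {
    const std::size_t n = v.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        std::size_t smallPos = i;              // position of smallest seen so far
        for (std::size_t j = i + 1; j < n; ++j) {
            if (v[j] < v[smallPos]) {
                smallPos = j;                  // smaller element found
            }
        }
        std::swap(v[i], v[smallPos]);          // move it to the front of the sublist
    }
}
```

Calling `selectionSort` on a vector holding 90, 10, 80, 70, 20, 30, 50, 40, 60 leaves it in ascending order.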


In-Class Exercise #1: Selection Sort

• List of 9 elements:

90, 10, 80, 70, 20, 30, 50, 40, 60

Illustrate each pass…

Selection Sort Solution

Pass   List
0      90 10 80 70 20 30 50 40 60
1      10 90 80 70 20 30 50 40 60
2      10 20 80 70 90 30 50 40 60
3      10 20 30 70 90 80 50 40 60
4      10 20 30 40 90 80 50 70 60
5      10 20 30 40 50 80 90 70 60
6      10 20 30 40 50 60 90 70 80
7      10 20 30 40 50 60 70 90 80
8      10 20 30 40 50 60 70 80 90

Categories of Sorting Algorithms

• Exchange sort
  – Systematically interchange pairs of elements which are out of order
  – Bubble sort does this

[Figure: two adjacent pairs, labeled "Out of order, exchange" and "In order, do not exchange"]

Bubble Sort Algorithm

1. Initialize numCompares to n - 1
2. While numCompares != 0, do the following:
   a. Set last = 1    // location of last element involved in a swap
   b. For i = 1 to numCompares:
          If x[i] > x[i+1]:
              Swap x[i] and x[i+1] and set last = i
   c. Set numCompares = last - 1
   End while
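As a rough C++ sketch of the algorithm above, assuming a 0-based `std::vector<int>` (so `last` starts at 0 and the update becomes `numCompares = last`); the function name `bubbleSort` is illustrative only.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort that remembers where the last swap happened, so each
// pass only examines the part of the list that may still be unsorted.
void bubbleSort(std::vector<int>& v) {
    if (v.size() < 2) return;                  // nothing to compare
    std::size_t numCompares = v.size() - 1;    // adjacent pairs to examine
    while (numCompares != 0) {
        std::size_t last = 0;                  // index of the last pair swapped
        for (std::size_t i = 0; i < numCompares; ++i) {
            if (v[i] > v[i + 1]) {
                std::swap(v[i], v[i + 1]);
                last = i;
            }
        }
        numCompares = last;   // everything past the last swap is already in place
    }
}
```

If a pass makes no swap, `last` stays 0 and the loop ends, which is why an already sorted list needs only one pass.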


In-Class Exercise #2: Bubble Sort

• List of 9 elements:

90, 10, 80, 70, 20, 30, 50, 40, 60

Illustrate each pass…

Bubble Sort Solution

Pass   List
0      90 10 80 70 20 30 50 40 60
1      10 80 70 20 30 50 40 60 90
2      10 70 20 30 50 40 60 80 90
3      10 20 30 50 40 60 70 80 90
4      10 20 30 40 50 60 70 80 90
5      10 20 30 40 50 60 70 80 90

Categories of Sorting Algorithms

• Insertion sort
  – Repeatedly insert a new element into an already sorted list
  – Note this works well with a linked list implementation

All these have worst-case computing time O(n^2)

Insertion Sort Pseudo-Code (Instructor’s Recommendation)

for j = 2 to A.length
    key = A[j]
    // Insert A[j] into the sorted sequence A[1..j-1]
    i = j-1
    while i > 0 and A[i] > key
        A[i+1] = A[i]
        i = i-1
    A[i+1] = key
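The pseudo-code above translates fairly directly to C++. The sketch below assumes a 0-based `std::vector<int>`, so `i` names the slot being considered for `key` rather than the element to its left; the name `insertionSort` is illustrative.

```cpp
#include <cstddef>
#include <vector>

// Grows a sorted prefix a[0..j-1] one element at a time by shifting
// larger elements right and dropping the key into the gap.
void insertionSort(std::vector<int>& a) {
    for (std::size_t j = 1; j < a.size(); ++j) {
        const int key = a[j];
        std::size_t i = j;                     // candidate slot for key
        while (i > 0 && a[i - 1] > key) {
            a[i] = a[i - 1];                   // shift larger element right
            --i;
        }
        a[i] = key;
    }
}
```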

Insertion Sort Example

Pass   List
0      5 2 4 6 1 3
1      2 5 4 6 1 3
2      2 4 5 6 1 3
3      2 4 5 6 1 3
4      1 2 4 5 6 3
5      1 2 3 4 5 6


In-Class Exercise #3: Insertion Sort

• List of 5 elements:

9, 3, 1, 5, 2

Illustrate each pass, along with algorithm values of key, j and i…

Insertion Sort Solution

Pass   List         key   j   i
0      9 3 1 5 2
1      3 9 1 5 2    3     2   1, 0
2      1 3 9 5 2    1     3   2, 1, 0
3      1 3 5 9 2    5     4   3, 2
4      1 2 3 5 9    2     5   4, 3, 2, 1

Quicksort

• A more efficient exchange sorting scheme than bubble sort
  – A typical exchange involves elements that are far apart
  – Fewer interchanges are required to correctly position an element
• Quicksort uses a divide-and-conquer strategy
  – A recursive approach
  – The original problem is partitioned into simpler sub-problems
  – Each sub-problem is considered independently
• Subdivision continues until the sub-problems obtained are simple enough to be solved directly

Quicksort

• Choose some element called a pivot
• Perform a sequence of exchanges so that
  – All elements that are less than this pivot are to its left, and
  – All elements that are greater than the pivot are to its right
• This divides the (sub)list into two smaller sublists, each of which may then be sorted independently in the same way

Quicksort

If the list has 0 or 1 elements, return.    // the list is sorted
Else do:
    Pick an element in the list to use as the pivot.
    Split the remaining elements into two disjoint groups:
        SmallerThanPivot = {all elements < pivot}
        LargerThanPivot = {all elements > pivot}
    Return the list rearranged as:
        Quicksort(SmallerThanPivot), pivot, Quicksort(LargerThanPivot).
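The recursive description above can be sketched in C++ as follows. This is only an illustration of the idea: it copies elements into new vectors instead of exchanging them in place, it assumes the first element is the pivot, and it assumes that elements equal to the pivot go into the "larger" group (the slide does not say where ties belong). The name `quicksort` is illustrative.

```cpp
#include <cstddef>
#include <vector>

// Returns a sorted copy of list: Quicksort(smaller), pivot, Quicksort(larger).
std::vector<int> quicksort(const std::vector<int>& list) {
    if (list.size() <= 1) return list;         // 0 or 1 elements: already sorted

    const int pivot = list.front();            // assumption: first element is the pivot
    std::vector<int> smaller, larger;
    for (std::size_t i = 1; i < list.size(); ++i) {
        // Assumption: ties with the pivot go to the "larger" group.
        (list[i] < pivot ? smaller : larger).push_back(list[i]);
    }

    std::vector<int> result = quicksort(smaller);
    result.push_back(pivot);
    const std::vector<int> right = quicksort(larger);
    result.insert(result.end(), right.begin(), right.end());
    return result;
}
```

An in-place version that swaps elements around the pivot avoids the extra vectors; the copying form is used here only because it mirrors the pseudo-code line for line.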

In-Class Exercise #4: Quicksort

• List of 9 elements
  – 30, 10, 80, 70, 20, 90, 50, 40, 60
• Pivot is the first element
• Illustrate each pass
• Clearly denote each sublist

Quicksort Solution

Pass   List
0      30 10 80 70 20 90 50 40 60
1      20 10 30 70 80 90 50 40 60
2      10 20 30 50 60 40 70 90 80
3      10 20 30 40 50 60 70 80 90

TO DO: How does this change if you choose the pivot as the median?

Heaps

A heap is a binary tree with these properties:

1. It is complete
   • Each level of the tree is completely filled
   • Except possibly the bottom level (nodes in leftmost positions)
2. The key in any node dominates the keys of its children
   – Min-heap: a node dominates by containing a smaller key than its children
   – Max-heap: a node dominates by containing a larger key than its children

Implementing a Heap

• Use an array or vector
• Number the nodes from top to bottom
  – Number nodes on each row from left to right
• Store data in ith node in ith location of array (vector)

Implementing a Heap

• In an array implementation with nodes numbered from 1, the children of the ith node are at myArray[2*i] and myArray[2*i+1]
• The parent of the ith node is at myArray[i/2] (integer division)
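The index arithmetic above can be written as three small helpers. The function names are hypothetical, and the sketch assumes nodes are numbered from 1, as in the slides.

```cpp
#include <cstddef>

// Index arithmetic for a heap stored in an array with the root at index 1.
constexpr std::size_t leftChild(std::size_t i)  { return 2 * i; }
constexpr std::size_t rightChild(std::size_t i) { return 2 * i + 1; }
constexpr std::size_t parentOf(std::size_t i)   { return i / 2; }  // integer division

static_assert(leftChild(3) == 6 && rightChild(3) == 7, "children of node 3");
static_assert(parentOf(6) == 3 && parentOf(7) == 3, "parent of nodes 6 and 7");
```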

Basic Heap Operations

• Construct an empty heap
• Check if the heap is empty
• Insert an item
• Retrieve the largest/smallest element
• Remove the largest/smallest element

Basic Heap Operations

• Insert an item
  – Place new item at end of array
  – “Bubble” it up to the correct place
  – Interchange with parent so long as it is greater than its parent (max-heap) or less than its parent (min-heap)
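A minimal sketch of the insert step for a max-heap, assuming the heap lives in a `std::vector<int>` whose slot 0 is an unused placeholder so that node i sits at index i. The name `heapInsert` and the placeholder convention are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Inserts item into a max-heap stored in heap[1..]; heap[0] is unused.
void heapInsert(std::vector<int>& heap, int item) {
    if (heap.empty()) heap.push_back(0);       // create the unused slot 0
    heap.push_back(item);                      // place new item at end of array
    std::size_t i = heap.size() - 1;
    while (i > 1 && heap[i] > heap[i / 2]) {   // greater than its parent?
        std::swap(heap[i], heap[i / 2]);       // bubble up one level
        i /= 2;
    }
}
```

For a min-heap the comparison would be reversed.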

Basic Heap Operations

• Delete max/min item
  – Max/min item is the root; swap it with the last node in the tree
  – Delete the last element
  – Bubble the top element down until the heap property is satisfied
    • Interchange with the larger of the two children (the smaller, for a min-heap)

Percolate Down Algorithm

1. Set c = 2 * r
2. While c <= n do the following:
   a. If c < n and myArray[c] < myArray[c + 1]
          Increment c by 1
   b. If myArray[r] < myArray[c]
          i. Swap myArray[r] and myArray[c]
          ii. Set r = c
          iii. Set c = 2 * c
      else
          Terminate repetition
   End while
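The percolate-down steps can be sketched in C++ as below, together with a delete-max helper built on them. This assumes a max-heap in a `std::vector<int>` with slot 0 unused and at least one real element when `removeMax` is called; both function names are illustrative.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Restores the max-heap property in a[1..n], assuming only the subtree
// root a[r] may be out of place. a[0] is an unused placeholder.
void percolateDown(std::vector<int>& a, std::size_t r, std::size_t n) {
    std::size_t c = 2 * r;                     // left child of r
    while (c <= n) {
        if (c < n && a[c] < a[c + 1]) ++c;     // c now indexes the larger child
        if (a[r] < a[c]) {
            std::swap(a[r], a[c]);
            r = c;
            c = 2 * c;
        } else {
            break;                             // heap property holds
        }
    }
}

// Removes and returns the root. Precondition: a.size() >= 2.
int removeMax(std::vector<int>& a) {
    const int top = a[1];
    a[1] = a.back();                           // last node takes the root's place
    a.pop_back();                              // delete last element
    percolateDown(a, 1, a.size() - 1);
    return top;
}
```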

Heapsort

• Given a list of numbers in an array
  – Stored in a complete binary tree
• Convert to a heap
  – Begin at the last node that is not a leaf
  – Apply percolate down to this subtree
  – Continue with each preceding node, up to the root

Heapsort Algorithm

1. Consider x as a complete binary tree; use heapify to convert this tree to a heap
2. for i = n down to 2:
   a. Interchange x[1] and x[i] (puts largest element at end)
   b. Apply percolate_down to convert the binary tree corresponding to the sublist x[1] .. x[i-1] back into a heap
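Putting the two steps together, a heapsort sketch in C++ might look like this. It assumes the data sits in `x[1..n]` of a `std::vector<int>` with `x[0]` as an unused placeholder, and it repeats a compact percolate-down helper so that the block stands on its own. The function names follow the slide where it gives one (`percolate_down`, `heapify`) and are otherwise illustrative.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sinks x[r] within x[1..n] until the subtree rooted at r is a max-heap.
void percolate_down(std::vector<int>& x, std::size_t r, std::size_t n) {
    for (std::size_t c = 2 * r; c <= n; c = 2 * r) {
        if (c < n && x[c] < x[c + 1]) ++c;     // pick the larger child
        if (!(x[r] < x[c])) break;             // subtree is already a heap
        std::swap(x[r], x[c]);
        r = c;
    }
}

// Step 1: work from the last non-leaf node back up to the root.
void heapify(std::vector<int>& x, std::size_t n) {
    for (std::size_t r = n / 2; r >= 1; --r) {
        percolate_down(x, r, n);
    }
}

// Step 2: repeatedly move the root to the end and re-heap the rest.
void heapsort(std::vector<int>& x) {
    if (x.size() < 3) return;                  // placeholder plus 0 or 1 items
    const std::size_t n = x.size() - 1;
    heapify(x, n);
    for (std::size_t i = n; i >= 2; --i) {
        std::swap(x[1], x[i]);                 // largest element goes to the end
        percolate_down(x, 1, i - 1);           // re-heap x[1..i-1]
    }
}
```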

Heapsort

• Now swap element 1 (root of tree) with the last element
  – This puts the largest element in its correct location
• Use percolate down on the remaining sublist
  – Converts from semi-heap to heap


In-Class Exercise #5: Heapsort

• For each step, draw the heap and the array
• 30, 10, 80, 70, 20, 90, 40

Array:

Index:  1  2  3  4  5  6  7
Value: 30 10 80 70 20 90 40

Tree:

        30
    10      80
  70  20  90  40

Step 1: Convert to a heap

• Begin at the last node that is not a leaf and apply the percolate down procedure to convert the subtree rooted at this node to a heap. Then move to the preceding node and percolate down in that subtree, and so on, working up the tree until the root of the given tree is reached. (HEAPIFY)

Step 1 (ctd)

• What is the last node that is not a leaf? 80
• Apply percolate down

Subtree before:      Subtree after:
      80                   90
    90  40               80  40

Index:  1  2  3  4  5  6  7
Value: 30 10 90 70 20 80 40

Step 1 (ctd)

Subtree before:      Subtree after:
      10                   70
    70  20               10  20

Index:  1  2  3  4  5  6  7
Value: 30 70 90 10 20 80 40

Step 1 (ctd)

Tree before:

        30
    70      90
  10  20  80  40

Tree after:

        90
    70      80
  10  20  30  40

Index:  1  2  3  4  5  6  7
Value: 90 70 80 10 20 30 40

We now have a heap!

Step 2: Sort and Swap

• The largest element is now at the root
• Correctly position the largest element by swapping it with the element at the end of the list, then go back and sort the remaining 6 elements

Before the swap:

Index:  1  2  3  4  5  6  7
Value: 90 70 80 10 20 30 40

After the swap:

Index:  1  2  3  4  5  6  7
Value: 40 70 80 10 20 30 90

Step 2 (ctd)

• This is not a heap. However, since only the root changed, it is a semi-heap
• Use percolate down to convert to a heap

        40
    70      80
  10  20  30

Step 2 (ctd)

After percolate down:

        80
    70      40
  10  20  30

Index:  1  2  3  4  5  6  7
Value: 80 70 40 10 20 30 90

1. Swap

        30
    70      40
  10  20  80

2. Prune

Index:  1  2  3  4  5  6  7
Value: 30 70 40 10 20 80 90

Continue the pattern

After percolate down:

        70
    30      40
  10  20

Index:  1  2  3  4  5  6  7
Value: 70 30 40 10 20 80 90

After swap:

        20
    30      40
  10  70

Index:  1  2  3  4  5  6  7
Value: 20 30 40 10 70 80 90

Continue the pattern

After percolate down:

        40
    30      20
  10

Index:  1  2  3  4  5  6  7
Value: 40 30 20 10 70 80 90

After swap:

        10
    30      20
  40

Index:  1  2  3  4  5  6  7
Value: 10 30 20 40 70 80 90

Continue the pattern

After percolate down:

      30
    10  20

Index:  1  2  3  4  5  6  7
Value: 30 10 20 40 70 80 90

After swap:

      20
    10  30

Index:  1  2  3  4  5  6  7
Value: 20 10 30 40 70 80 90

Complete!

Remaining heap:

      20
    10

Index:  1  2  3  4  5  6  7
Value: 20 10 30 40 70 80 90

After swap:

      10
    20

Index:  1  2  3  4  5  6  7
Value: 10 20 30 40 70 80 90

Sorting Facts

• Sorting schemes are either …
  – internal: designed for data items stored in main memory
  – external: designed for data items stored in secondary memory (disk drive)
• Previous sorting schemes were all internal sorting algorithms:
  – required direct access to list elements
    • not possible for sequential files
  – made many passes through the list
    • not practical for files

Mergesort

• Mergesort can be used both as an internal and an external sort
• A divide-and-conquer algorithm
• The basic operation in mergesort is merging:
  – combining two lists that have previously been sorted
  – the resulting list is also sorted

Merge Algorithm

1. Open File1 and File2 for input, File3 for output
2. Read first element x from File1 and first element y from File2
3. While neither eof File1 nor eof File2:
       If x < y then
           a. Write x to File3
           b. Read a new x value from File1
       Otherwise
           a. Write y to File3
           b. Read a new y value from File2
   End while
4. If eof File1 is encountered, copy the rest of File2 into File3.
   If eof File2 is encountered, copy the rest of File1 into File3.


Mergesort Algorithm

In-Class Exercise #6

• Take File1 and File2 and produce a sorted File3

File 1:  7  9  19  33  47  51  82  99
File 2:  11  18  24  49  61
File 3:

Mergesort Solution

File 1:  7  9  19  33  47  51  82  99
File 2:  11  18  24  49  61
File 3:  7  9  11  18  19  24  33  47  49  51  61  82  99

Fun Facts

• Most of the time is spent in merging
  – Combining two sorted lists of size n/2
  – What is the runtime of merge()? O(n)
• Does not sort in place
  – Requires extra memory to do the merging
  – The result is then copied back into the original memory
• Good for external sorting
  – Disks are slow
  – Writing in long streams is more efficient
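To make the extra memory and the copy-back step concrete, here is a sketch of an internal (in-memory) recursive mergesort in C++. It assumes a `std::vector<int>` and one scratch buffer of the same size; the function names are illustrative.

```cpp
#include <cstddef>
#include <vector>

// Sorts v[lo, hi) using tmp (same size as v) as the extra memory.
void mergeSortRange(std::vector<int>& v, std::vector<int>& tmp,
                    std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return;                        // 0 or 1 elements
    const std::size_t mid = lo + (hi - lo) / 2;
    mergeSortRange(v, tmp, lo, mid);                // sort each half
    mergeSortRange(v, tmp, mid, hi);

    std::size_t i = lo, j = mid, k = lo;            // O(n) merge into tmp
    while (i < mid && j < hi) tmp[k++] = (v[j] < v[i]) ? v[j++] : v[i++];
    while (i < mid) tmp[k++] = v[i++];
    while (j < hi)  tmp[k++] = v[j++];

    for (k = lo; k < hi; ++k) v[k] = tmp[k];        // copy back into the original
}

void mergeSort(std::vector<int>& v) {
    std::vector<int> tmp(v.size());                 // the extra memory
    mergeSortRange(v, tmp, 0, v.size());
}
```

Each level of recursion does O(n) merging work and there are about log2(n) levels, which gives the familiar O(n log n) total.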


Binary Merge Sort

• Given a single file

• Split into two files

Binary Merge Sort

• Merge first one-element "subfile" of F1 with first one-element subfile of F2
  – Gives a sorted two-element subfile of F
• Continue with rest of one-element subfiles

Binary Merge Sort

• Split again
• Merge again as before
• Each time, the size of the sorted subgroups doubles

Binary Merge Sort

• Last splitting gives two files, each in order
• Last merging yields a single file, entirely in order

Note: we are always limited to subfiles whose size is some power of 2
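The doubling pattern described above can be sketched in memory as a bottom-up merge sort. This is an analogy rather than the file-based procedure: instead of splitting F into F1 and F2, it merges adjacent sorted groups of a `std::vector<int>` directly, using the standard `std::merge`. The name `binaryMergeSort` is illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Bottom-up merge sort: sorted groups of size 1, 2, 4, ... are merged pairwise.
void binaryMergeSort(std::vector<int>& f) {
    const std::size_t n = f.size();
    std::vector<int> out(n);
    for (std::size_t width = 1; width < n; width *= 2) {   // group size doubles
        for (std::size_t lo = 0; lo < n; lo += 2 * width) {
            const std::size_t mid = std::min(lo + width, n);
            const std::size_t hi  = std::min(lo + 2 * width, n);
            // Merge groups [lo, mid) and [mid, hi) into out starting at lo.
            std::merge(f.begin() + lo, f.begin() + mid,
                       f.begin() + mid, f.begin() + hi,
                       out.begin() + lo);
        }
        f.swap(out);                                       // merged data becomes the input
    }
}
```

The last group on each pass may be shorter than `width`, which is how a list whose length is not a power of 2 is still handled.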

Natural Merge Sort

• Allows sorted subfiles of other sizes
  – Number of phases can be reduced when the file contains longer "runs" of ordered elements
• Consider the file to be sorted; note the in-order groups

Natural Merge Sort

• Copy alternate groupings into two files
  – Use the sub-groupings, not a power of 2
• Look for possible larger groupings

Natural Merge Sort

• Merge the corresponding subfiles

EOF for F2: copy remaining groups from F1


Natural Merge Sort

• Split again, alternating groups

• Merge again, now two subgroups

• One more split, one more merge gives sort
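As a rough in-memory sketch of the natural merge sort idea, the code below finds the existing ordered runs with `std::is_sorted_until` and merges them in adjacent pairs until a single run remains. It does not model the two files F1 and F2; merging neighbouring runs of one `std::vector<int>` is an assumption made to keep the example short, and the name `naturalMergeSort` is illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Repeatedly merges neighbouring ordered runs until the whole list is one run.
void naturalMergeSort(std::vector<int>& f) {
    const std::size_t n = f.size();
    std::vector<int> out(n);
    bool sorted = false;
    while (!sorted) {
        sorted = true;
        std::size_t lo = 0;
        while (lo < n) {
            // [lo, mid) is one run and [mid, hi) is the run that follows it.
            const std::size_t mid = static_cast<std::size_t>(
                std::is_sorted_until(f.begin() + lo, f.end()) - f.begin());
            const std::size_t hi = static_cast<std::size_t>(
                std::is_sorted_until(f.begin() + mid, f.end()) - f.begin());
            std::merge(f.begin() + lo, f.begin() + mid,
                       f.begin() + mid, f.begin() + hi,
                       out.begin() + lo);
            if (mid < n) sorted = false;   // a second run existed, so check again
            lo = hi;
        }
        f.swap(out);
    }
}
```

When the input already contains long runs, few passes are needed; an already sorted list is confirmed in a single pass.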