
Merge and Radix Sorts Data Structures Fall, 2007 13 th.

Mar 26, 2015


Hannah Sullivan
Transcript
Page 1

Merge and Radix Sorts
Data Structures

Fall, 2007

13th

Page 2

Merge Sort (1/13)

Before looking at the merge sort algorithm to sort n records, let us see how one may merge two sorted lists to get a single sorted list.

Merging
The first merging algorithm, Program 7.7, uses O(n) additional space.
It merges the sorted lists (list[i], …, list[m]) and (list[m+1], …, list[n]) into a single sorted list, (sorted[i], …, sorted[n]).

Page 3

Merge (using O(n) space)
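
A minimal sketch of such a merge, assuming records that carry a single integer key. It is not the Program 7.7 listing itself: the element type and the local names left, right, and out are assumptions made for illustration, while list, sorted, i, m, and n follow the description on the previous page.

```c
/* Hypothetical record type: the slides only speak of records with keys. */
typedef struct {
    int key;
} element;

/*
 * Merge the sorted runs list[i..m] and list[m+1..n] into sorted[i..n].
 * The output array sorted is the O(n) additional space.
 */
void merge(const element list[], element sorted[], int i, int m, int n)
{
    int left = i;       /* next unread record of list[i..m]   */
    int right = m + 1;  /* next unread record of list[m+1..n] */
    int out = i;        /* next free position in sorted       */

    while (left <= m && right <= n) {
        /* <= takes from the left run on ties, which keeps the merge stable */
        if (list[left].key <= list[right].key)
            sorted[out++] = list[left++];
        else
            sorted[out++] = list[right++];
    }
    /* copy whatever remains of the run that was not exhausted */
    while (left <= m)
        sorted[out++] = list[left++];
    while (right <= n)
        sorted[out++] = list[right++];
}
```

Each record is copied exactly once, so the merge takes time linear in n - i + 1.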

Page 4

Merge Sort (3/13)

O(1) space merge

Steps in an O(1) space merge when the total number of records, n, is a perfect square and the number of records in each of the files to be merged is a multiple of √n:

Step 1: Identify the √n records with the largest keys. This is done by following right to left along the two files to be merged.

Step 2: Exchange the records of the second file that were identified in Step 1 with those just to the left of those identified from the first file, so that the √n records with the largest keys form a contiguous block.

Page 5

Merge Sort (4/13)

O(1) space merge (cont'd)

Step 3: Swap the block of the √n largest records with the leftmost block (unless it is already the leftmost block). Sort the rightmost block.

Step 4: Reorder the blocks, excluding the block of largest records, into nondecreasing order of the last key in the blocks.

Page 6

Merge Sort (5/13)

O(1) space merge (cont'd)

Step 5: Perform as many merge substeps as needed to merge the √n - 1 blocks other than the block with the largest keys.

0 1 2 3 w z u x 4 6 8 a | v y 5 7 9 b | c e g i j k | d f h o p q | l m n r s t

0 1 2 3 4 z u x w 6 8 a | v y 5 7 9 b | c e g i j k | d f h o p q | l m n r s t

Page 7

Merge Sort (6/13)

6, 7, 8 are merged

Segment one is merged (i.e., 0, 2, 4, 6, 8, a)

Change place marker (longest sorted sequence of records)

Segment one is merged (i.e., b, c, e, g, i, j, k)

Change place marker

Segment one is merged (i.e., o, p, q)

No other segment. Sort the largest keys.

Step 6: Sort the block with the largest keys.

Page 8

When selection sort is used to implement Step 4, each block is regarded as a single record with key equal to that of the last record in the block. The time needed for this step is O(n): there are √n blocks, so selection sort makes O(n) key comparisons and O(√n) block swaps, each moving √n records.
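
The block-level selection sort described above might look roughly like the following. This is only a sketch under simplifying assumptions: keys are plain ints stored contiguously, and the names swap_blocks, sort_blocks_by_last_key, num_blocks, and block_size are hypothetical and do not come from the slides.

```c
/* Exchange two blocks of block_size keys in place, using O(1) extra space. */
static void swap_blocks(int *a, int *b, int block_size)
{
    for (int k = 0; k < block_size; k++) {
        int tmp = a[k];
        a[k] = b[k];
        b[k] = tmp;
    }
}

/*
 * Selection sort over blocks. keys[] holds num_blocks consecutive blocks
 * of block_size keys each; a block is treated as one record whose key is
 * the last key stored in that block.
 */
void sort_blocks_by_last_key(int keys[], int num_blocks, int block_size)
{
    for (int i = 0; i < num_blocks - 1; i++) {
        int min = i;
        for (int j = i + 1; j < num_blocks; j++) {
            int last_j = keys[(j + 1) * block_size - 1];
            int last_min = keys[(min + 1) * block_size - 1];
            if (last_j < last_min)
                min = j;
        }
        if (min != i)
            swap_blocks(&keys[i * block_size], &keys[min * block_size],
                        block_size);
    }
}
```

With √n blocks of √n keys each, the two nested loops perform O(n) comparisons and at most √n block swaps of √n keys, which is where the O(n) bound comes from.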

The total time is O(n).
The additional space used is O(1).

Example:
Input list (26, 5, 77, 1, 61, 11, 59, 15, 48, 19)

Page 9

selection sort

void selectionSort(int numbers[], int array_size) {
    int i, j;
    int min, temp;

    for (i = 0; i < array_size-1; i++) {
        /* find the smallest key in numbers[i..array_size-1] */
        min = i;
        for (j = i+1; j < array_size; j++) {
            if (numbers[j] < numbers[min])
                min = j;
        }
        /* move it to position i */
        temp = numbers[i];
        numbers[i] = numbers[min];
        numbers[min] = temp;
    }
}

Time complexity: O(n^2)

Page 10

Merge Sort (7/13)

Iterative merge sort
1. We assume that the input sequence has n sorted lists, each of length 1.
2. We merge these lists pairwise to obtain n/2 lists of size 2.
3. We then merge the n/2 lists pairwise, and so on, until a single list remains.

Analysis
The total number of passes is the ceiling of log2 n.
Each pass merges sorted lists in linear time: O(n).
The total computing time is O(n log n).

Page 11

Merge Sort (8/13)

merge_pass
Invokes merge (Program 7.7) to merge the sorted sublists.
Performs one pass of the merge sort. It merges adjacent pairs of subfiles from list into sorted.
Parameters: n is the number of elements in the list; length is the length of the subfile.

[Figure: trace of merge_pass on list[0] to list[9] with n = 10 and length = 2, merging adjacent pairs of subfiles from list into sorted.]

Page 12

merge_sort: Perform a merge sort on the file

[Figure: trace of merge_sort on list[0] to list[9] with n = 10. length takes the values 1, 2, 4, 8, 16, and successive passes alternate between list and extra.]
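
The iterative scheme can be sketched as follows, assuming integer keys and elements stored in list[0] to list[n-1]. The names merge_pass, merge_sort, list, sorted, extra, n, and length follow the slides; the element type, the error return value, and the body of every function are assumptions for illustration, not the textbook listings.

```c
#include <stdlib.h>

typedef struct {
    int key;
} element;   /* hypothetical record type */

/* Merge sorted runs list[i..m] and list[m+1..n] into sorted[i..n]. */
static void merge(const element list[], element sorted[], int i, int m, int n)
{
    int left = i, right = m + 1, out = i;
    while (left <= m && right <= n)
        sorted[out++] = (list[left].key <= list[right].key)
                            ? list[left++] : list[right++];
    while (left <= m)
        sorted[out++] = list[left++];
    while (right <= n)
        sorted[out++] = list[right++];
}

/* One pass: merge adjacent pairs of subfiles of the given length. */
static void merge_pass(const element list[], element sorted[],
                       int n, int length)
{
    int i;
    for (i = 0; i + 2 * length <= n; i += 2 * length)
        merge(list, sorted, i, i + length - 1, i + 2 * length - 1);
    if (i + length < n) {
        /* one full subfile followed by a shorter one */
        merge(list, sorted, i, i + length - 1, n - 1);
    } else {
        /* at most one subfile left over: copy it unchanged */
        for (int j = i; j < n; j++)
            sorted[j] = list[j];
    }
}

/* Sort list[0..n-1]; returns 0 on success, -1 if allocation fails. */
int merge_sort(element list[], int n)
{
    if (n < 2)
        return 0;
    element *extra = malloc((size_t)n * sizeof *extra);
    if (extra == NULL)
        return -1;
    for (int length = 1; length < n; length *= 2) {
        merge_pass(list, extra, n, length);   /* list  -> extra */
        length *= 2;
        merge_pass(extra, list, n, length);   /* extra -> list  */
    }
    free(extra);
    return 0;
}
```

Doing two passes per loop iteration guarantees that the result always ends up back in list; when length already reaches n, the second pass degenerates into a copy.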

Page 13

Merge Sort (10/13)

Recursive merge sort concept


Page 18

listmerge: Takes two sorted chains and returns an integer that points to the start of the sorted list.

The link field in each record is initially set to -1.

Since the elements are numbered from 0 to n-1, we use list[n] to store the start pointer.

Page 19

rmerge: Sorts the list list[lower], …, list[upper]. The link field in each record is initially set to -1.

start = rmerge(list, 0, n-1);

[Figure: trace of the recursive calls to rmerge on list[0] to list[9], showing the values of lower, upper, and middle at each call.]
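
A possible shape for listmerge and rmerge, using integer links as described on these two pages. This is a sketch under assumptions: n is passed as an explicit parameter here (the slides' call rmerge(list, 0, n-1) does not pass it), the element type is hypothetical, and the array must have room for the extra record list[n].

```c
typedef struct {
    int key;
    int link;   /* index of the next record in the chain, -1 at the end */
} element;      /* hypothetical record type */

/*
 * Merge the sorted chains starting at first and second.
 * list[n] serves as a dummy head; the start of the merged chain
 * is returned as an index.
 */
int listmerge(element list[], int n, int first, int second)
{
    int last = n;   /* last record of the merged chain so far */

    while (first != -1 && second != -1) {
        if (list[first].key <= list[second].key) {
            list[last].link = first;
            last = first;
            first = list[first].link;
        } else {
            list[last].link = second;
            last = second;
            second = list[second].link;
        }
    }
    /* attach whichever chain still has records */
    list[last].link = (first == -1) ? second : first;
    return list[n].link;
}

/* Sort list[lower..upper] by relinking; returns the start of the chain. */
int rmerge(element list[], int n, int lower, int upper)
{
    if (lower >= upper)
        return lower;   /* a single record is already a sorted chain */
    int middle = (lower + upper) / 2;
    int left = rmerge(list, n, lower, middle);
    int right = rmerge(list, n, middle + 1, upper);
    return listmerge(list, n, left, right);
}
```

No records are moved; only link fields change, so the sorted order is read by following links from the returned start index until -1.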

Page 20

Merge Sort (13/13)

Variation: natural merge sort
We can modify merge_sort to take into account the prevailing order within the input list.
In this implementation we make an initial pass over the data to determine the sequences of records that are in order.
The merge sort then uses these initially ordered sublists for the remainder of the passes.
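
One way to illustrate the idea is the simplified variant below. It is an assumption-laden sketch rather than the implementation the slide refers to: instead of recording the initial runs once, it rescans for ordered runs at the start of every pass, and it sorts plain ints. The names natural_merge_sort and merge_runs are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Merge src[lo..mid-1] and src[mid..hi-1] into dst[lo..hi-1]. */
static void merge_runs(const int src[], int dst[], int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        dst[k++] = (src[i] <= src[j]) ? src[i++] : src[j++];
    while (i < mid)
        dst[k++] = src[i++];
    while (j < hi)
        dst[k++] = src[j++];
}

/* Sort a[0..n-1]; returns 0 on success, -1 if allocation fails. */
int natural_merge_sort(int a[], int n)
{
    if (n < 2)
        return 0;
    int *buf = malloc((size_t)n * sizeof *buf);
    if (buf == NULL)
        return -1;

    for (;;) {
        int lo = 0, merges = 0;
        while (lo < n) {
            /* first ordered run: a[lo..mid-1] */
            int mid = lo + 1;
            while (mid < n && a[mid - 1] <= a[mid])
                mid++;
            /* second ordered run: a[mid..hi-1] (empty if mid == n) */
            int hi = mid;
            if (mid < n) {
                hi = mid + 1;
                while (hi < n && a[hi - 1] <= a[hi])
                    hi++;
            }
            merge_runs(a, buf, lo, mid, hi);
            lo = hi;
            merges++;
        }
        memcpy(a, buf, (size_t)n * sizeof *a);
        if (merges <= 1)
            break;   /* the whole array was covered by one merge */
    }
    free(buf);
    return 0;
}
```

On input that is already in order, the first pass finds a single run and the sort stops after one linear scan, which is the benefit the variation aims for.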

Page 21

Heap Sort (1/3)

The challenges of merge sort
The merge sort requires additional storage proportional to the number of records in the file being sorted.
By using the O(1) space merge algorithm, the space requirements can be reduced to O(1), but the resulting sort is significantly slower than the original one.

Heap sort
Requires only a fixed amount of additional storage
Slightly slower than merge sort using O(n) additional space
Faster than merge sort using O(1) additional space
The worst case and average computing time is O(n log n), same as merge sort
Unstable

Page 22

adjust: adjust the binary tree to establish the heap

/* compare root and max. root */
/* move to parent */

[Figure: trace of adjust on the binary tree stored in positions [1] to [10] with keys 26, 5, 77, 1, 61, 11, 59, 15, 48, 19, for root = 1, n = 10, and rootkey = 26. child takes the values 2, 3, 6, 7, 14.]

Page 23

Heap Sort (3/3)

heapsort

[Figure: trace of heapsort on the binary tree stored in positions [1] to [10] with keys 26, 5, 77, 1, 61, 11, 59, 15, 48, 19 and n = 10. Labels: bottom-up, top-down, ascending order (max heap).]
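
A sketch of adjust and the sort built on it, assuming integer keys stored in positions 1 to n (position 0 is unused, matching the [1] to [10] numbering in the figures). The element type is hypothetical, and the sort is named heap_sort here only to avoid clashing with a heapsort function that some C libraries already declare; the bodies are illustrative rather than the textbook listings.

```c
typedef struct {
    int key;
} element;   /* hypothetical record type */

/*
 * Sift list[root] down within list[1..n] so that the subtree rooted at
 * root becomes a max heap, assuming both of its subtrees already are.
 */
void adjust(element list[], int root, int n)
{
    element temp = list[root];
    int rootkey = temp.key;
    int child = 2 * root;              /* left child */

    while (child <= n) {
        /* pick the larger of the two children */
        if (child < n && list[child].key < list[child + 1].key)
            child++;
        if (rootkey >= list[child].key)
            break;                     /* heap property holds here */
        list[child / 2] = list[child]; /* move the child up to its parent */
        child *= 2;
    }
    list[child / 2] = temp;
}

/* Sort list[1..n] into ascending order using a max heap. */
void heap_sort(element list[], int n)
{
    /* build the heap bottom-up, starting from the last internal node */
    for (int i = n / 2; i > 0; i--)
        adjust(list, i, n);
    /* repeatedly move the current maximum to the end of the heap */
    for (int i = n - 1; i > 0; i--) {
        element t = list[1];
        list[1] = list[i + 1];
        list[i + 1] = t;
        adjust(list, 1, i);
    }
}
```

Only the temporaries temp and t are needed beyond the array itself, which is the fixed amount of additional storage mentioned for heap sort.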

Page 24

Radix Sort (1/8)

We consider the problem of sorting records that have several keys.
These keys are labeled $K^0$ (most significant key), $K^1$, …, $K^{r-1}$ (least significant key).
Let $K_i^j$ denote key $K^j$ of record $R_i$.
A list of records $R_0, \ldots, R_{n-1}$ is lexically sorted with respect to the keys $K^0, K^1, \ldots, K^{r-1}$ iff
$(K_i^0, K_i^1, \ldots, K_i^{r-1}) \le (K_{i+1}^0, K_{i+1}^1, \ldots, K_{i+1}^{r-1})$, $0 \le i < n-1$.
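
To make the tuple comparison concrete, here is a small sketch that checks the lexical ordering condition. It assumes integer keys stored row by row (record i occupies keys[i*r] to keys[i*r + r-1]); the function names lex_leq and is_lexically_sorted and this storage layout are assumptions, not part of the slides.

```c
#include <stdbool.h>

/* True iff the r-tuple x is lexically less than or equal to the r-tuple y. */
bool lex_leq(const int x[], const int y[], int r)
{
    for (int j = 0; j < r; j++) {
        if (x[j] < y[j])
            return true;    /* first differing key decides */
        if (x[j] > y[j])
            return false;
    }
    return true;            /* all r keys are equal */
}

/* True iff records 0..n-1, each with r keys, are lexically sorted. */
bool is_lexically_sorted(const int keys[], int n, int r)
{
    for (int i = 0; i + 1 < n; i++)
        if (!lex_leq(&keys[i * r], &keys[(i + 1) * r], r))
            return false;
    return true;
}
```

The comparison looks at the most significant key first and only consults a less significant key when all earlier keys tie.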

Page 25

Radix Sort (2/8)

Example: sorting a deck of cards on two keys, suit and face value, in which the keys have the ordering relation:
$K^0$ [Suit]: ♣ < ♦ < ♥ < ♠
$K^1$ [Face value]: 2 < 3 < 4 < … < 10 < J < Q < K < A

Thus, a sorted deck of cards has the ordering:
2♣, …, A♣, …, 2♠, …, A♠

Two approaches to sort:
1. MSD (Most Significant Digit) first: sort on $K^0$, then $K^1$, ...
2. LSD (Least Significant Digit) first: sort on $K^{r-1}$, then $K^{r-2}$, ...

Page 26

Radix Sort (3/8)

MSD first
1. MSD sort first, e.g., bin sort with four bins, one per suit
2. LSD sort second: sort each bin on face value
Result: 2♣, …, A♣, …, 2♠, …, A♠

Page 27

Radix Sort (4/8)

LSD first
1. LSD sort first, e.g., face sort with 13 bins: 2, 3, 4, …, 10, J, Q, K, A
2. MSD sort second (may not be needed; we can just classify these 13 piles into 4 separate piles by considering them from face 2 to face A)

Simpler than the MSD approach because we do not have to sort the subpiles independently.

Result: 2♣, …, A♣, …, 2♠, …, A♠
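
The LSD-first approach on cards can be sketched as two stable bin sorts, first on face value and then on suit. Everything named here is an assumption for illustration: suits are encoded 0 to 3 in the suit order of the example, faces 0 to 12 with 0 standing for 2 and 12 for A, and the bins are realized by counting positions in a scratch array rather than by physical piles.

```c
#define NUM_SUITS 4
#define NUM_FACES 13

typedef struct {
    int suit;   /* most significant key, 0..3   */
    int face;   /* least significant key, 0..12 */
} card;         /* hypothetical record type */

/*
 * Stable bin sort of cards[0..n-1] on one key. by_suit selects the suit
 * (nonzero) or the face value (zero). scratch must hold n cards.
 */
static void bin_sort(card cards[], card scratch[], int n, int by_suit)
{
    int bins = by_suit ? NUM_SUITS : NUM_FACES;
    int start[NUM_FACES + 1] = {0};

    /* count how many cards fall into each bin */
    for (int i = 0; i < n; i++) {
        int k = by_suit ? cards[i].suit : cards[i].face;
        start[k + 1]++;
    }
    /* turn counts into the starting position of each bin */
    for (int b = 0; b < bins; b++)
        start[b + 1] += start[b];
    /* distribute in input order, which keeps the sort stable */
    for (int i = 0; i < n; i++) {
        int k = by_suit ? cards[i].suit : cards[i].face;
        scratch[start[k]++] = cards[i];
    }
    for (int i = 0; i < n; i++)
        cards[i] = scratch[i];
}

/* LSD-first sort: least significant key first, most significant last. */
void lsd_sort_cards(card cards[], card scratch[], int n)
{
    bin_sort(cards, scratch, n, 0);   /* 13 bins on face value */
    bin_sort(cards, scratch, n, 1);   /* 4 bins on suit        */
}
```

Because the second pass is stable, cards of the same suit stay in the face order produced by the first pass, so no subpile ever has to be sorted on its own.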

Page 28

Radix Sort (5/8)

We can also use an LSD or MSD sort when we have only one logical key, if we interpret this key as a composite of several keys.

Example:
integer: the digit in the far right position is the least significant and the digit in the far left position is the most significant
range: 0 ≤ K ≤ 999
using an LSD or MSD sort for three keys ($K^0$, $K^1$, $K^2$)
since an LSD sort does not require the maintenance of independent subpiles, it is easier to implement

[Figure: a three-digit key, each digit in the range 0-9, with the MSD at the left and the LSD at the right.]
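
Splitting one integer key into its three digit keys might be done as in the sketch below. The function name decompose is hypothetical, and the constants MAX_DIGIT and RADIX_SIZE are defined here with the values 3 and 10 that suit keys in the range 0 to 999.

```c
#define MAX_DIGIT 3     /* keys 0 to 999 have three decimal digits */
#define RADIX_SIZE 10

/*
 * Split k (0 <= k <= 999) into digits: key[0] is the most significant
 * digit and key[MAX_DIGIT - 1] the least significant one.
 */
void decompose(int k, int key[MAX_DIGIT])
{
    for (int i = MAX_DIGIT - 1; i >= 0; i--) {
        key[i] = k % RADIX_SIZE;
        k /= RADIX_SIZE;
    }
}
```

For example, 179 becomes the tuple (1, 7, 9), and 9 becomes (0, 0, 9), so short numbers are padded with leading zeros.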

Page 29

Radix Sort (6/8)

radix sort
decompose the sort key into digits using a radix r.
Ex: When r = 10, we get the common base 10 or decimal decomposition of the key.

LSD radix r sort
The records: $R_0, \ldots, R_{n-1}$
Keys: d-tuples $(x_0, x_1, \ldots, x_{d-1})$, where $0 \le x_i < r$.
Each record has a link field, and the input list is stored as a dynamically linked list.
We implement the bins as queues:
front[i], 0 ≤ i < r, points to the first record in bin i
rear[i], 0 ≤ i < r, points to the last record in bin i

#define MAX_DIGIT 3 /* 0 to 999 */
#define RADIX_SIZE 10
typedef struct list_node *list_pointer;
struct list_node {
    int key[MAX_DIGIT];
    list_pointer link;
};

Page 30

LSD Radix Sort

Time complexity: O(MAX_DIGIT(RADIX_SIZE+n))
There are MAX_DIGIT passes. Each pass takes O(RADIX_SIZE) to initialize the bins, O(n) to distribute the records into the bins, and O(RADIX_SIZE) to concatenate the bins back into a chain.

RADIX_SIZE = 10
MAX_DIGIT = 3

Initial input: 179→208→306→93→859→984→55→9→271→33

Bins after distributing on digit i=2 (f = front, r = rear):
f[0]: NULL
f[1]: 271
f[2]: NULL
f[3]: 93→33
f[4]: 984
f[5]: 55
f[6]: 306
f[7]: NULL
f[8]: 208
f[9]: 179→859→9

Chain after first pass, i=2: 271→93→33→984→55→306→208→179→859→9

Page 31

Radix Sort (8/8)

Simulation of radix_sort

[Figure: contents of the bins f[0] to f[9] and r[0] to r[9] during the second and third passes.]

Chain after second pass, i=1: 306→208→9→33→55→859→271→179→984→93

Chain after third pass, i=0: 9→33→55→93→179→208→271→306→859→984
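
The passes traced above could be produced by a routine along these lines. It is a sketch, not the textbook listing: the declarations follow the list_node layout with MAX_DIGIT = 3 and RADIX_SIZE = 10, while the body of radix_sort and the local names digit and last are assumptions.

```c
#include <stddef.h>

#define MAX_DIGIT 3     /* 0 to 999 */
#define RADIX_SIZE 10

typedef struct list_node *list_pointer;
struct list_node {
    int key[MAX_DIGIT];     /* key[0] is the most significant digit */
    list_pointer link;
};

/* LSD radix sort of a linked list; returns the head of the sorted chain. */
list_pointer radix_sort(list_pointer ptr)
{
    list_pointer front[RADIX_SIZE], rear[RADIX_SIZE];

    for (int i = MAX_DIGIT - 1; i >= 0; i--) {      /* MAX_DIGIT passes */
        /* empty all bins: O(RADIX_SIZE) */
        for (int j = 0; j < RADIX_SIZE; j++)
            front[j] = rear[j] = NULL;

        /* append each record to the queue for its digit: O(n) */
        while (ptr != NULL) {
            int digit = ptr->key[i];
            if (front[digit] == NULL)
                front[digit] = ptr;
            else
                rear[digit]->link = ptr;
            rear[digit] = ptr;
            ptr = ptr->link;
        }

        /* concatenate the nonempty bins in order: O(RADIX_SIZE) */
        list_pointer last = NULL;
        for (int j = 0; j < RADIX_SIZE; j++) {
            if (front[j] == NULL)
                continue;
            if (ptr == NULL)
                ptr = front[j];
            else
                last->link = front[j];
            last = rear[j];
        }
        if (last != NULL)
            last->link = NULL;      /* terminate the new chain */
    }
    return ptr;
}
```

Records are appended at the rear of each queue and the bins are concatenated from 0 to 9, so every pass is stable; that stability is what lets the later, more significant passes preserve the order established by the earlier ones.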

Page 32

Summary of Internal Sorting (1/2)

Insertion Sort
    Works well when the list is already partially ordered
    The best sorting method for small n

Merge Sort
    The best/worst case (O(n log n))
    Requires more storage than a heap sort
    Slightly more overhead than quick sort

Quick Sort
    The best average behavior
    The worst complexity in the worst case (O(n^2))

Radix Sort
    Depends on the size of the keys and the choice of the radix

Page 33

Summary of Internal Sorting (2/2)

Analysis of the average running times