Merge and Radix Sorts
Data Structures, Fall 2007, 13th
Mar 26, 2015
Merge Sort (1/13)
Before looking at the merge sort algorithm to sort n records, let us see how one may merge two sorted lists to get a single sorted list.
Merging
  The first merging algorithm, Program 7.7, uses O(n) additional space.
  It merges the sorted lists (list[i], …, list[m]) and (list[m+1], …, list[n]) into a single sorted list, (sorted[i], …, sorted[n]).
Merge (using O(n) space)
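The merge described above can be sketched as follows. This is a minimal sketch, not Program 7.7 itself: the slides do not reproduce that program, so the signature, the use of plain int keys, and the local variable names are assumptions made for illustration.

```c
/* Merge the sorted runs list[i..m] and list[m+1..n] into sorted[i..n].
 * All indices are inclusive. The signature is an assumption for this sketch. */
void merge(const int list[], int sorted[], int i, int m, int n)
{
    int left = i;        /* next unread element of the first run  */
    int right = m + 1;   /* next unread element of the second run */
    int out = i;         /* next free slot in sorted[]            */

    while (left <= m && right <= n) {
        /* "<=" takes from the first run on ties, which keeps the merge stable */
        if (list[left] <= list[right])
            sorted[out++] = list[left++];
        else
            sorted[out++] = list[right++];
    }
    /* exactly one of the two runs may still hold elements; copy them over */
    while (left <= m)
        sorted[out++] = list[left++];
    while (right <= n)
        sorted[out++] = list[right++];
}
```

The output array sorted[] is the O(n) additional space: every record is copied once, so the merge runs in time linear in n - i + 1.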
Merge Sort (3/13)
O(1) space merge
Steps in an O(1) space merge when the total number of records, n, is a perfect square and the number of records in each of the files to be merged is a multiple of √n:
Step 1: Identify the √n records with the largest keys. This is done by following right to left along the two files to be merged.
Step 2: Exchange the records of the second file that were identified in Step 1 with those just to the left of those identified from the first file, so that the √n records with the largest keys form a contiguous block.
Merge Sort (4/13)
O(1) space merge (cont’d)
Step 3: Swap the block of the √n largest records with the leftmost block (unless it is already the leftmost block). Sort the rightmost block.
Step 4: Reorder the blocks, excluding the block of largest records, into nondecreasing order of the last key in each block.
Merge Sort (5/13)
O(1) space merge (cont’d)
Step 5: Perform as many merge substeps as needed to merge the √n - 1 blocks other than the block with the largest keys.
0 1 2 3 w z u x 4 6 8 a | v y 5 7 9 b | c e g i j k | d f h o p q | l m n r s t
0 1 2 3 4 z u x w 6 8 a | v y 5 7 9 b | c e g i j k | d f h o p q | l m n r s t
Merge Sort (6/13)
6, 7, 8 are merged
Segment one is merged (i.e., 0, 2, 4, 6, 8, a)
Change place marker (longest sorted sequence of records)
Segment one is merged (i.e., b, c, e, g, i, j, k)
Change place marker
Segment one is merged (i.e., o, p, q)
No other segment. Sort the largest keys.
Step 6: Sort the block with the largest keys.
When selection sort is used to implement Step 4, each block is regarded as a single record with key equal to that of the last record in the block. The time needed for this is O(n).
The total time is O(n). The additional space used is O(1).
Example: Input list (26, 5, 77, 1, 61, 11, 59, 15, 48, 19)
Selection sort
void selectionSort(int numbers[], int array_size) {
    int i, j;
    int min, temp;

    for (i = 0; i < array_size - 1; i++) {
        min = i;                      /* index of the smallest key seen so far */
        for (j = i + 1; j < array_size; j++) {
            if (numbers[j] < numbers[min])
                min = j;
        }
        /* swap the smallest remaining key into position i */
        temp = numbers[i];
        numbers[i] = numbers[min];
        numbers[min] = temp;
    }
}
Time complexity: O(n^2)
Merge Sort (7/13)
Iterative merge sort
1. We assume that the input sequence has n sorted lists, each of length 1.
2. We merge these lists pairwise to obtain n/2 lists of size 2.
3. We then merge the n/2 lists pairwise, and so on, until a single list remains.
Analysis
  The total number of passes is the ceiling of log_2 n.
  Each pass merges sorted lists in linear time: O(n).
  The total computing time is O(n log n).
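The analysis can be written out as a short worked bound. The constant c, standing for the cost per record in one pass, is an assumption introduced here for illustration.

```latex
\begin{align*}
\text{passes} &= \lceil \log_2 n \rceil \\
T(n) &\le c \, n \, \lceil \log_2 n \rceil = O(n \log n) \\
n = 10:\quad \lceil \log_2 10 \rceil &= 4
  \quad \text{(sublist lengths } 1, 2, 4, 8\text{)}
\end{align*}
```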
Merge Sort (8/13)
merge_pass
  Invokes merge (Program 7.7) to merge the sorted sublists.
  Performs one pass of the merge sort. It merges adjacent pairs of subfiles from list into sorted.
  Parameters: n, the number of elements in the list; length, the length of the subfile.
[Figure: trace of merge_pass with n = 10 and length = 2 on list[0..9]; i takes the values 0, 4, 8 as adjacent pairs of subfiles are merged from list into sorted.]
merge_sort: performs a merge sort on the file.
[Figure: trace of merge_sort with n = 10 on list[0..9]; length doubles through 1, 2, 4, 8, 16 while the passes alternate between list and extra.]
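The two routines can be sketched together as below. This is a minimal sketch under stated assumptions: plain int keys, a heap-allocated extra array, an int return code for allocation failure, and a local merge helper standing in for Program 7.7, none of which are shown in the slides.

```c
#include <stdlib.h>

/* Stand-in for Program 7.7: merge list[i..m] and list[m+1..n] into sorted[i..n]. */
static void merge(const int list[], int sorted[], int i, int m, int n)
{
    int left = i, right = m + 1, out = i;
    while (left <= m && right <= n)
        sorted[out++] = (list[left] <= list[right]) ? list[left++] : list[right++];
    while (left <= m)  sorted[out++] = list[left++];
    while (right <= n) sorted[out++] = list[right++];
}

/* One pass: merge adjacent subfiles of `length` records from list into sorted. */
static void merge_pass(const int list[], int sorted[], int n, int length)
{
    int i;
    /* merge every complete pair of subfiles */
    for (i = 0; i + 2 * length <= n; i += 2 * length)
        merge(list, sorted, i, i + length - 1, i + 2 * length - 1);
    if (i + length < n) {
        /* one full subfile followed by a shorter one */
        merge(list, sorted, i, i + length - 1, n - 1);
    } else {
        /* at most one subfile is left: copy it unchanged */
        for (; i < n; i++)
            sorted[i] = list[i];
    }
}

/* Sort list[0..n-1]; returns 0 on success, -1 if the extra array cannot be allocated. */
int merge_sort(int list[], int n)
{
    if (n < 2)
        return 0;
    int *extra = malloc((size_t)n * sizeof *extra);
    if (extra == NULL)
        return -1;
    for (int length = 1; length < n; ) {
        merge_pass(list, extra, n, length);   /* list -> extra */
        length *= 2;
        merge_pass(extra, list, n, length);   /* extra -> list; a plain copy once length >= n */
        length *= 2;
    }
    free(extra);
    return 0;
}
```

Running the passes in pairs means the result always ends up back in list, at the cost of one extra copy pass when the number of merging passes is odd.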
Merge Sort (10/13)
Recursive merge sort concept
listmerge: takes two sorted chains and returns an integer that points to the start of the sorted list.
  The link field in each record is initially set to -1.
  Since the elements are numbered from 0 to n-1, we use list[n] to store the start pointer.
rmerge: sorts the list, list[lower], …, list[upper]. The link field in each record is initially set to -1.
  start = rmerge(list, 0, n-1);
[Figure: trace of rmerge on list[0..9], showing the values of lower, upper and middle for each recursive call, starting from lower = 0 and upper = 9.]
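A linked version of rmerge and listmerge might look like the sketch below. The element layout and the extra parameter n are assumptions made so the block is self-contained; the call shown in the slides takes only list, lower and upper, which suggests n is available some other way there.

```c
typedef struct {
    int key;
    int link;   /* index of the next record in the chain, -1 at the end */
} element;

/* Merge the sorted chains starting at first and second.
 * list[n] serves as a dummy header, so the array needs n + 1 slots. */
int listmerge(element list[], int n, int first, int second)
{
    int tail = n;   /* last record of the merged chain so far */

    while (first != -1 && second != -1) {
        if (list[first].key <= list[second].key) {
            list[tail].link = first;
            tail = first;
            first = list[first].link;
        } else {
            list[tail].link = second;
            tail = second;
            second = list[second].link;
        }
    }
    /* append whichever chain still has records */
    list[tail].link = (first == -1) ? second : first;
    return list[n].link;
}

/* Sort list[lower..upper] by relinking; returns the index of the first record. */
int rmerge(element list[], int n, int lower, int upper)
{
    if (lower >= upper)
        return lower;   /* a single record is already a sorted chain */
    int middle = (lower + upper) / 2;
    int left = rmerge(list, n, lower, middle);
    int right = rmerge(list, n, middle + 1, upper);
    return listmerge(list, n, left, right);
}
```

With every link preset to -1, a call such as rmerge(list, n, 0, n - 1) returns the index of the smallest record, and following the link fields from there visits the records in nondecreasing key order. No record is moved; only the links change.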
Merge Sort (13/13)
Variation: natural merge sort
  We can modify merge_sort to take into account the prevailing order within the input list.
  In this implementation we make an initial pass over the data to determine the sequences of records that are in order.
  The merge sort then uses these initially ordered sublists for the remainder of the passes.
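One way to realize this variation is sketched below. The function names, the bound array of run boundaries, and the int return code are all assumptions made for illustration; the slides do not give an implementation.

```c
#include <stdlib.h>

/* Merge src[lo..mid-1] and src[mid..hi-1] into dst[lo..hi-1]. */
static void merge_runs(const int src[], int dst[], int lo, int mid, int hi)
{
    int a = lo, b = mid, out = lo;
    while (a < mid && b < hi)
        dst[out++] = (src[a] <= src[b]) ? src[a++] : src[b++];
    while (a < mid) dst[out++] = src[a++];
    while (b < hi)  dst[out++] = src[b++];
}

/* Sort list[0..n-1]; returns 0 on success, -1 on allocation failure. */
int natural_merge_sort(int list[], int n)
{
    if (n < 2)
        return 0;
    int *extra = malloc((size_t)n * sizeof *extra);
    int *bound = malloc(((size_t)n + 1) * sizeof *bound);
    if (extra == NULL || bound == NULL) {
        free(extra);
        free(bound);
        return -1;
    }

    /* Initial pass: run k occupies positions bound[k] .. bound[k+1]-1. */
    int runs = 0;
    bound[0] = 0;
    for (int i = 1; i < n; i++)
        if (list[i] < list[i - 1])
            bound[++runs] = i;      /* a descent starts a new run */
    bound[++runs] = n;

    int *src = list, *dst = extra;
    while (runs > 1) {
        int out = 0, k;
        for (k = 0; k + 1 < runs; k += 2) {
            merge_runs(src, dst, bound[k], bound[k + 1], bound[k + 2]);
            bound[out++] = bound[k];            /* merged run starts here */
        }
        if (k < runs) {                         /* odd run out: copy it */
            for (int i = bound[k]; i < bound[k + 1]; i++)
                dst[i] = src[i];
            bound[out++] = bound[k];
        }
        bound[out] = n;
        runs = out;
        int *tmp = src; src = dst; dst = tmp;   /* swap roles for the next pass */
    }
    if (src != list)
        for (int i = 0; i < n; i++)
            list[i] = src[i];
    free(extra);
    free(bound);
    return 0;
}
```

On an input that is already sorted, the initial pass finds a single run and no merging pass is needed, which is the advantage over starting from n lists of length 1.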
Heap Sort (1/3)
The challenges of merge sort
  The merge sort requires additional storage proportional to the number of records in the file being sorted.
  By using the O(1) space merge algorithm, the space requirements can be reduced to O(1), but the resulting sort is significantly slower than the original one.
Heap sort
  Requires only a fixed amount of additional storage.
  Slightly slower than merge sort using O(n) additional space.
  Faster than merge sort using O(1) additional space.
  The worst-case and average computing time is O(n log n), the same as merge sort.
  Unstable.
adjust: adjusts the binary tree to establish the heap.
[Figure: adjust traced with root = 1, rootkey = 26 and n = 10 on the binary tree stored in list[1..10] = (26, 5, 77, 1, 61, 11, 59, 15, 48, 19). The code comments mark two steps: compare the root with the larger child, and move the child up to its parent.]
Heap Sort (3/3)
heapsort
[Figure: trace of heapsort with n = 10 on the binary tree stored in list[1..10] = (26, 5, 77, 1, 61, 11, 59, 15, 48, 19). Annotations: ascending order (max heap); bottom-up; top-down.]
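The two routines can be sketched as follows, assuming plain int keys stored in list[1..n] with list[0] unused. The sort routine is named heap_sort here, an assumption made to avoid clashing with the heapsort function that some C libraries declare in stdlib.h.

```c
/* Sift list[root] down so that the subtree rooted at root is a max heap.
 * Both subtrees of root must already be max heaps. Keys live in list[1..n]. */
void adjust(int list[], int root, int n)
{
    int rootkey = list[root];
    int child = 2 * root;               /* left child of root */

    while (child <= n) {
        /* choose the larger of the two children */
        if (child < n && list[child] < list[child + 1])
            child++;
        if (rootkey >= list[child])     /* compare root and larger child */
            break;
        list[child / 2] = list[child];  /* move the child up to its parent */
        child *= 2;
    }
    list[child / 2] = rootkey;
}

/* Sort list[1..n] into ascending order using a max heap. */
void heap_sort(int list[], int n)
{
    int i, temp;

    /* build the heap bottom-up, starting from the last internal node */
    for (i = n / 2; i > 0; i--)
        adjust(list, i, n);

    /* repeatedly move the maximum to the end and shrink the heap */
    for (i = n - 1; i > 0; i--) {
        temp = list[1];
        list[1] = list[i + 1];
        list[i + 1] = temp;
        adjust(list, 1, i);             /* restore the heap top-down */
    }
}
```

Only the single variable temp and a few indices are needed besides the array itself, which is the fixed amount of additional storage mentioned earlier.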
Radix Sort (1/8)
We consider the problem of sorting records that have several keys.
  These keys are labeled $K^0$ (most significant key), $K^1$, …, $K^{r-1}$ (least significant key).
  Let $K_i^j$ denote key $K^j$ of record $R_i$.
  A list of records $R_0, \ldots, R_{n-1}$ is lexically sorted with respect to the keys $K^0, K^1, \ldots, K^{r-1}$ iff
    $(K_i^0, K_i^1, \ldots, K_i^{r-1}) \le (K_{i+1}^0, K_{i+1}^1, \ldots, K_{i+1}^{r-1}), \quad 0 \le i < n-1$
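The tuple comparison in this definition can be sketched in code. The function names and the flat row-major layout, where key j of record i sits at keys[i * r + j], are assumptions made for illustration.

```c
/* Compare two r-key tuples, most significant key first.
 * Returns a negative value, 0, or a positive value, as strcmp does. */
int lexical_compare(const int a[], const int b[], int r)
{
    for (int j = 0; j < r; j++) {
        if (a[j] != b[j])
            return (a[j] < b[j]) ? -1 : 1;   /* first differing key decides */
    }
    return 0;                                /* all r keys are equal */
}

/* Return 1 if the n records, each holding r keys, are lexically sorted. */
int is_lexically_sorted(const int keys[], int n, int r)
{
    for (int i = 0; i + 1 < n; i++) {
        if (lexical_compare(keys + i * r, keys + (i + 1) * r, r) > 0)
            return 0;
    }
    return 1;
}
```

A later key is consulted only when all more significant keys are equal, so (0, 14) precedes (1, 2) even though 14 > 2.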
Radix Sort (2/8)
Example
  Sorting a deck of cards on two keys, suit and face value, in which the keys have the ordering relation:
    K^0 [Suit]: ♣ < ♦ < ♥ < ♠
    K^1 [Face value]: 2 < 3 < 4 < … < 10 < J < Q < K < A
  Thus, a sorted deck of cards has the ordering: 2♣, …, A♣, …, 2♠, …, A♠
Two approaches to sort:
  1. MSD (Most Significant Digit) first: sort on K^0, then K^1, ...
  2. LSD (Least Significant Digit) first: sort on K^{r-1}, then K^{r-2}, ...
Radix Sort (3/8)
MSD first
  1. MSD sort first, e.g., bin sort, four bins.
  2. LSD sort second.
  Result: 2♣, …, A♣, …, 2♠, …, A♠
Radix Sort (4/8)
LSD first
  1. LSD sort first, e.g., face sort, 13 bins: 2, 3, 4, …, 10, J, Q, K, A.
  2. MSD sort second (may not be needed; we can just classify these 13 piles into 4 separate piles by considering them from face 2 to face A).
  Simpler than the MSD one because we do not have to sort the subpiles independently.
  Result: 2♣, …, A♣, …, 2♠, …, A♠
Radix Sort (5/8)
We can also use an LSD or MSD sort when we have only one logical key, if we interpret this key as a composite of several keys.
Example:
  Integer: the digit in the far right position is the least significant and the digit in the far left position is the most significant.
  Range: 0 ≤ K ≤ 999, using an LSD or MSD sort for three keys (K^0, K^1, K^2).
  Since an LSD sort does not require the maintenance of independent subpiles, it is easier to implement.
[Figure: a three-digit key with positions labeled from MSD to LSD, each digit in the range 0-9.]
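Interpreting one integer key as three digit keys can be sketched as below. The helper name split_key is hypothetical; the constant names MAX_DIGIT and RADIX_SIZE follow the declarations these slides use for the radix sort.

```c
#define MAX_DIGIT 3     /* keys range from 0 to 999 */
#define RADIX_SIZE 10   /* decimal digits */

/* Split k (0 <= k <= 999) into digits, most significant first:
 * key[0] is the hundreds digit and key[MAX_DIGIT - 1] the ones digit. */
void split_key(int k, int key[MAX_DIGIT])
{
    for (int j = MAX_DIGIT - 1; j >= 0; j--) {
        key[j] = k % RADIX_SIZE;   /* peel off the least significant digit */
        k /= RADIX_SIZE;
    }
}
```

For example, 179 becomes (1, 7, 9), and 9 becomes (0, 0, 9), so short numbers are padded with leading zeros and every key has exactly MAX_DIGIT digits.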
Radix Sort (6/8)
Radix sort
  Decompose the sort key into digits using a radix r.
  Ex: when r = 10, we get the common base 10 or decimal decomposition of the key.
LSD radix r sort
  The records: R_0, …, R_{n-1}.
  Keys: d-tuples (x_0, x_1, …, x_{d-1}), where 0 ≤ x_i < r.
  Each record has a link field, and the input list is stored as a dynamically linked list.
  We implement the bins as queues:
    front[i], 0 ≤ i < r, pointing to the first record in bin i
    rear[i], 0 ≤ i < r, pointing to the last record in bin i
#define MAX_DIGIT 3    /* 0 to 999 */
#define RADIX_SIZE 10
typedef struct list_node *list_pointer;
struct list_node {
    int key[MAX_DIGIT];    /* key[0] is the most significant digit */
    list_pointer link;
};
LSD Radix Sort
Time complexity: O(MAX_DIGIT(RADIX_SIZE + n))
  MAX_DIGIT passes (MAX_DIGIT = 3)
  Each pass: initialize the bins, O(RADIX_SIZE); distribute the records, O(n); relink the bins, O(RADIX_SIZE), with RADIX_SIZE = 10
[Figure: bins after the first pass (i = 2), with queue pointers f[0..9] (front) and r[0..9] (rear). Bin 1: 271. Bin 3: 93 → 33. Bin 4: 984. Bin 5: 55. Bin 6: 306. Bin 8: 208. Bin 9: 179 → 859 → 9. The remaining bins are empty.]
Initial input: 179 → 208 → 306 → 93 → 859 → 984 → 55 → 9 → 271 → 33
Chain after first pass, i = 2: 271 → 93 → 33 → 984 → 55 → 306 → 208 → 179 → 859 → 9
Radix Sort (8/8)
Simulation of radix_sort
[Figure: bins f[0..9] and r[0..9] for the remaining passes of the simulation.]
Chain after second pass, i = 1: 306 → 208 → 9 → 33 → 55 → 859 → 271 → 179 → 984 → 93
Chain after third pass, i = 0: 9 → 33 → 55 → 93 → 179 → 208 → 271 → 306 → 859 → 984
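The passes simulated above can be sketched as below. The name radix_sort and the front/rear queues come from the slides; the function body is a reconstruction under the assumption that each record stores its digits in key[0..MAX_DIGIT-1], most significant first.

```c
#include <stddef.h>

#define MAX_DIGIT 3     /* keys range from 0 to 999 */
#define RADIX_SIZE 10

typedef struct list_node *list_pointer;
struct list_node {
    int key[MAX_DIGIT];  /* key[0] is the most significant digit */
    list_pointer link;
};

/* LSD radix sort of the chain starting at ptr; returns the new first record. */
list_pointer radix_sort(list_pointer ptr)
{
    list_pointer front[RADIX_SIZE], rear[RADIX_SIZE];
    int i, j, digit;

    for (i = MAX_DIGIT - 1; i >= 0; i--) {      /* MAX_DIGIT passes, LSD first */
        for (j = 0; j < RADIX_SIZE; j++)        /* empty the bins: O(RADIX_SIZE) */
            front[j] = rear[j] = NULL;

        while (ptr != NULL) {                   /* distribute: O(n) */
            digit = ptr->key[i];
            if (front[digit] == NULL)
                front[digit] = ptr;             /* first record in this bin */
            else
                rear[digit]->link = ptr;        /* append at the rear of the queue */
            rear[digit] = ptr;
            ptr = ptr->link;
        }

        /* relink the bins from bin 9 down to bin 0: O(RADIX_SIZE) */
        ptr = NULL;
        for (j = RADIX_SIZE - 1; j >= 0; j--) {
            if (front[j] != NULL) {
                rear[j]->link = ptr;
                ptr = front[j];
            }
        }
    }
    return ptr;
}
```

Appending at the rear of each queue preserves the order produced by the earlier passes among records that share the current digit, which is what lets the final pass on the most significant digit yield a fully sorted chain.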
Summary of Internal Sorting (1/2)
Insertion Sort
  Works well when the list is already partially ordered.
  The best sorting method for small n.
Merge Sort
  The best and worst cases are both O(n log n).
  Requires more storage than a heap sort.
  Slightly more overhead than quick sort.
Quick Sort
  The best average behavior.
  The worst complexity in the worst case (O(n^2)).
Radix Sort
  Depends on the size of the keys and the choice of the radix.
Summary of Internal Sorting (2/2)
Analysis of the average running times