Data structures and Algorithms: Sorting
Pham Quang Dung
Hanoi, 2012
Outline
1 Introduction to Sorting
2 Insertion Sort
3 Selection Sort
4 Bubble Sort
5 Merge Sort
6 Quick Sort
7 Heap Sort
8 Counting Sort
9 Radix Sort
10 Bucket Sort
Introduction to sorting
Put elements of a list in a certain order
Designing efficient sorting algorithms is very important for other algorithms (search, merge, etc.)
Each object is associated with a key, and sorting algorithms work on these keys
Two basic operations are used by most sorting algorithms:
  Swap(a, b): swap the values of variables a and b
  Compare(a, b): return true if a comes before b in the considered order, false otherwise
Without loss of generality, suppose we need to sort a list of numbers in nondecreasing order
A sorting algorithm is called in-place if the additional memory it requires is O(1), that is, independent of the size of the input array
A sorting algorithm is called stable if it maintains the relative order of elements with equal keys
A sorting algorithm that uses only comparisons to decide the order between two elements is called a comparison-based sorting algorithm
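To make the stability definition concrete, here is a minimal sketch that sorts (key, label) records by key only. The `Record` alias and the function name `sortByKeyStable` are illustrative assumptions, not part of the slides; `std::stable_sort` is the standard library's stable sort.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A record: an integer key plus a label that records the input order.
using Record = std::pair<int, std::string>;

// Sort by key only. std::stable_sort guarantees that records with equal
// keys keep their relative input order.
std::vector<Record> sortByKeyStable(std::vector<Record> v) {
    std::stable_sort(v.begin(), v.end(),
                     [](const Record& x, const Record& y) {
                         return x.first < y.first;  // labels are never compared
                     });
    return v;
}
```

For the input (2,a), (1,b), (2,c), (1,d), a stable sort returns (1,b), (1,d), (2,a), (2,c): within each key, the labels stay in input order.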
Insertion Sort
At iteration k, insert the kth element of the original list into its correct position among the first k − 1 elements, which are already sorted (∀k = 1, . . . , n)
Result: after the kth iteration, the first k elements of the original list form a sorted list
    void insertion_sort(int a[], int n) {
        // sorts a[1..n] (1-based indexing)
        int k;
        for (k = 2; k <= n; k++) {
            int last = a[k];
            int j = k;
            while (j > 1 && a[j-1] > last) {
                a[j] = a[j-1];
                j--;
            }
            a[j] = last;
        }
    }
Listing 1: insertion sort
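The listing above sorts a[1..n] with 1-based indices. As a sketch under that reading, the same loop can be written 0-based on a `std::vector`; the name `insertionSort` and the use of `std::vector` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 0-based variant of Listing 1: sorts v in nondecreasing order.
void insertionSort(std::vector<int>& v) {
    for (std::size_t k = 1; k < v.size(); k++) {
        int last = v[k];               // element to insert
        std::size_t j = k;
        // shift larger elements of the sorted prefix one slot to the right
        while (j > 0 && v[j - 1] > last) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = last;                   // v[0..k] is now sorted
    }
}
```

Because elements are only shifted past strictly larger ones, equal keys keep their relative order, so this variant is stable and in-place.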
Selection Sort
Put the smallest element of the original list in the first position
Put the second smallest element of the original list in the second position
Put the third smallest element of the original list in the third position
...
    void selection_sort(int a[], int n) {
        for (int k = 1; k <= n; k++) {
            int min = k;
            for (int i = k+1; i <= n; i++)
                if (a[min] > a[i])
                    min = i;
            swap(a[k], a[min]);
        }
    }
Listing 2: selection sort
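Selection sort as written in Listing 2 is in-place but not stable, because the swap can carry an element past another element with the same key. The sketch below shows this on tagged records; the `Item` alias and the name `selectionSortByKey` are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Item = std::pair<int, char>;  // (key, tag recording the input order)

// Selection sort as in Listing 2, 0-based, comparing keys only.
void selectionSortByKey(std::vector<Item>& v) {
    for (std::size_t k = 0; k < v.size(); k++) {
        std::size_t min = k;
        for (std::size_t i = k + 1; i < v.size(); i++)
            if (v[min].first > v[i].first)
                min = i;
        std::swap(v[k], v[min]);    // this swap may reorder equal keys
    }
}
```

On the input (2,a), (2,b), (1,c), the first swap exchanges (2,a) with (1,c), giving (1,c), (2,b), (2,a): the two records with key 2 end up in reversed order.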
Bubble sort
Pass from the beginning of the list: compare and swap two adjacent elements if they are not in the right order
Repeat the pass until no swaps are needed
    void bubble_sort(int a[], int n) {
        int swapped;
        do {
            swapped = 0;
            for (int i = 1; i < n; i++)
                if (a[i] > a[i+1]) {
                    swap(a[i], a[i+1]);
                    swapped = 1;
                }
        } while (swapped == 1);
    }
Listing 3: bubble sort
Merge sort
Divide-and-conquer
Divide the original list of n elements into two lists of n/2 elements
Recursively merge sort these two lists
Merge the two sorted lists
    void merge(int a[], int L, int M, int R) {
        // merge two sorted lists a[L..M] and a[M+1..R]
        // TA is an auxiliary array, assumed to be declared globally with the same capacity as a
        int i = L;   // first position of the first list a[L..M]
        int j = M+1; // first position of the second list a[M+1..R]
        for (int k = L; k <= R; k++) {
            if (i > M) {        // the first list is all scanned
                TA[k] = a[j]; j++;
            } else if (j > R) { // the second list is all scanned
                TA[k] = a[i]; i++;
            } else {
                if (a[i] <= a[j]) { // on ties take from the first list, which keeps the sort stable
                    TA[k] = a[i]; i++;
                } else {
                    TA[k] = a[j]; j++;
                }
            }
        }
        for (int k = L; k <= R; k++)
            a[k] = TA[k];
    }
    void merge_sort(int a[], int L, int R) {
        if (L < R) {
            int M = (L+R)/2;
            merge_sort(a, L, M);
            merge_sort(a, M+1, R);
            merge(a, L, M, R);
        }
    }
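The merge listing writes into an auxiliary array TA that is not declared on the slides. One possible self-contained arrangement, sketched here as an assumption rather than as the author's setup, allocates the buffer once in a wrapper and passes it down. The names `mergeRanges`, `mergeSortRec` and `mergeSort` are hypothetical, and indices are 0-based.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Merge the sorted ranges a[L..M] and a[M+1..R] (0-based, inclusive).
void mergeRanges(std::vector<int>& a, std::vector<int>& tmp,
                 std::size_t L, std::size_t M, std::size_t R) {
    std::size_t i = L, j = M + 1;
    for (std::size_t k = L; k <= R; k++) {
        if (i > M)             tmp[k] = a[j++];  // first range exhausted
        else if (j > R)        tmp[k] = a[i++];  // second range exhausted
        else if (a[i] <= a[j]) tmp[k] = a[i++];  // ties go to the first range
        else                   tmp[k] = a[j++];
    }
    for (std::size_t k = L; k <= R; k++) a[k] = tmp[k];
}

void mergeSortRec(std::vector<int>& a, std::vector<int>& tmp,
                  std::size_t L, std::size_t R) {
    if (L < R) {
        std::size_t M = L + (R - L) / 2;
        mergeSortRec(a, tmp, L, M);
        mergeSortRec(a, tmp, M + 1, R);
        mergeRanges(a, tmp, L, M, R);
    }
}

// Wrapper: allocates the auxiliary buffer once, so the sort is not in-place.
void mergeSort(std::vector<int>& a) {
    if (a.size() < 2) return;
    std::vector<int> tmp(a.size());
    mergeSortRec(a, tmp, 0, a.size() - 1);
}
```

The buffer has the same size as the input, which is why merge sort in this form needs O(n) additional memory.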
Quick sort
Pick an element, called a pivot, from the original list
Rearrange the list so that:
  All elements less than the pivot come before the pivot
  All elements greater than or equal to the pivot come after the pivot
After this step, the pivot is in its final position in the sorted list (it is fixed)
Recursively sort the sub-list before pivot and the sub-list after pivot
    void quick_sort(int a[], int L, int R) {
        if (L < R) {
            int index = (L+R)/2;
            index = partition(a, L, R, index);
            if (L < index)
                quick_sort(a, L, index-1);
            if (index < R)
                quick_sort(a, index+1, R);
        }
    }
Listing 4: Quick sort algorithm
    int partition(int a[], int L, int R, int indexPivot) {
        int pivot = a[indexPivot];
        swap(a[indexPivot], a[R]); // put the pivot at the end of the list
        int storeIndex = L; // stores the right position of the pivot at the end of the partition procedure

        for (int i = L; i <= R-1; i++) {
            if (a[i] < pivot) {
                swap(a[storeIndex], a[i]);
                storeIndex++;
            }
        }
        swap(a[storeIndex], a[R]); // put the pivot in the right position and return this position

        return storeIndex;
    }
Listing 5: partition
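To see what one call of the partition procedure does, the sketch below repeats Listing 5 on a 0-based `std::vector` so that it can be run on a small input. The name `partitionAt` is an assumption chosen to avoid clashing with `std::partition`.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Same scheme as Listing 5, on a[L..R] (0-based, inclusive).
int partitionAt(std::vector<int>& a, int L, int R, int indexPivot) {
    assert(L <= indexPivot && indexPivot <= R);
    int pivot = a[indexPivot];
    std::swap(a[indexPivot], a[R]);       // park the pivot at the end
    int storeIndex = L;                   // a[L..storeIndex-1] holds elements < pivot
    for (int i = L; i <= R - 1; i++) {
        if (a[i] < pivot) {
            std::swap(a[storeIndex], a[i]);
            storeIndex++;
        }
    }
    std::swap(a[storeIndex], a[R]);       // move the pivot to its final position
    return storeIndex;
}
```

For a = 3, 8, 2, 5, 1, 4, 7, 6 with the middle element 5 (index 3) as pivot, the call leaves a = 3, 2, 1, 4, 5, 6, 7, 8 and returns 4: everything before index 4 is less than 5 and everything after it is at least 5.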
Heap sort
Sort a list A[1..N] in nondecreasing order
1 Build a heap out of A[1..N]
2 Remove the largest element and put it in the Nth position of the list
3 Reconstruct the heap out of A[1..N − 1]
4 Remove the largest element and put it in the (N − 1)th position of the list
5 ...
Heap sort - Heap structure
Shape property: Complete binary tree with level L
Heap property: each node is greater than or equal to each of its children (max-heap)

            10
          /    \
         9      7
        / \    / \
       4   5  6   1
      / \
     3   2

    Index:  1   2   3   4   5   6   7   8   9
    Value: 10   9   7   4   5   6   1   3   2
Heap corresponding to a list A[1..N]
  Root of the tree is A[1]
  Left child of node A[i] is A[2 ∗ i]
  Right child of node A[i] is A[2 ∗ i + 1]
  Height (number of levels) is ⌊log2 N⌋ + 1
Operations
  Build-Max-Heap: construct a heap from the original list
  Max-Heapify: repair the following binary tree so that it becomes a Max-Heap:
    A tree with root A[i]
    A[i] < max(A[2 ∗ i], A[2 ∗ i + 1]): the heap property does not hold at A[i]
    Subtrees rooted at A[2 ∗ i] and A[2 ∗ i + 1] are Max-Heaps
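The index relations above can be checked directly on the example heap 10, 9, 7, 4, 5, 6, 1, 3, 2. The following is a minimal sketch; the function name `isMaxHeap` and the convention of leaving slot 0 unused (so that the heap occupies a[1..n] as on the slides) are assumptions of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Checks the max-heap property on a[1..n]; a[0] is an unused placeholder.
bool isMaxHeap(const std::vector<int>& a) {
    assert(!a.empty());                    // slot 0 must exist (it is ignored)
    std::size_t n = a.size() - 1;
    for (std::size_t i = 1; 2 * i <= n; i++) {
        if (a[2 * i] > a[i]) return false;                        // left child
        if (2 * i + 1 <= n && a[2 * i + 1] > a[i]) return false;  // right child
    }
    return true;
}
```

Only nodes 1..n/2 have children, so the loop stops as soon as 2 ∗ i exceeds n. This is the same bound at which Build-Max-Heap starts in the listing that follows.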
    void heapify(int a[], int i, int n) {
        // array to be heapified is a[i..n]
        int L = 2*i;
        int R = 2*i+1;
        int max = i;
        if (L <= n && a[L] > a[i])
            max = L;
        if (R <= n && a[R] > a[max])
            max = R;
        if (max != i) {
            swap(a[i], a[max]);
            heapify(a, max, n);
        }
    }
Listing 6: heapify
    void buildHeap(int a[], int n) {
        // array is a[1..n]
        for (int i = n/2; i >= 1; i--) {
            heapify(a, i, n);
        }
    }

    void heapSort(int a[], int n) {
        // array is a[1..n]
        buildHeap(a, n);
        for (int i = n; i > 1; i--) {
            swap(a[1], a[i]);
            heapify(a, 1, i-1);
        }
    }
Listing 7: heapSort
Counting sort
Input: array of n integers a1, . . . , an, where 0 ≤ ai ≤ k with k = O(n)
Main idea:
  For each element x, compute the rank r[x] as the number of elements of the array that are less than or equal to x
  Place x at position r[x]
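As a small worked example of the rank computation, the sketch below counts occurrences and then takes prefix sums. The function name `ranks` and the 0-based `std::vector` input are assumptions of this example.

```cpp
#include <cassert>
#include <vector>

// Returns r where r[x] = number of elements of a that are <= x, for x in 0..k.
// Assumes every element of a lies in 0..k.
std::vector<int> ranks(const std::vector<int>& a, int k) {
    std::vector<int> r(k + 1, 0);
    for (int v : a) {
        assert(0 <= v && v <= k);
        r[v]++;                        // r[v] = number of elements equal to v
    }
    for (int x = 1; x <= k; x++)
        r[x] += r[x - 1];              // prefix sums turn counts into ranks
    return r;
}
```

For a = 3, 1, 4, 1, 0 and k = 4 this gives r = 1, 3, 3, 4, 5. The value 4 therefore goes to position 5, and the two 1s go to positions 3 and 2 once the rank is decremented after each placement.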
    void countingSort(int a[], int r[], int c[], int n, int k) {
        // a[1..n] is the array to be sorted
        // n is the number of elements of the array a
        // k is the maximum of a, and 0 is the minimum of a
        // c[1..n] receives the sorted result

        // count r[i] - the number of elements of a having value i
        for (int i = 0; i <= k; i++) r[i] = 0;
        for (int i = 1; i <= n; i++) r[a[i]]++;
        // compute rank r[x] of x
        for (int x = 1; x <= k; x++) r[x] = r[x] + r[x-1];

        // sort
        for (int i = n; i >= 1; i--) {
            c[r[a[i]]] = a[i];      // place a[i] in its right position
            r[a[i]] = r[a[i]] - 1;  // reduce rank of a[i] by one
        }
    }
Radix sort
Not a comparison-based sorting algorithm
Each key (integer) is represented by a sequence of d numerical digits in a given radix (e.g., for radix 10: a_d a_{d−1} . . . a_1)
Principle
  Take the least significant digit of each key
  Group the keys based on that digit while keeping the original order of the keys (stable sort)
  Repeat the grouping process with each more significant digit
Algorithm 1: RadixSort(A, d)
foreach i ∈ 1..d do
    Sort A based on the ith digits of the keys using a stable sort;
Example
A[1..10] = 2980, 0020, 0242, 3002, 1145, 1045, 2626, 1005, 3180, 4146
Step 1: 2980, 0020, 3180, 0242, 3002, 1145, 1045, 1005, 2626, 4146
Step 2: 3002, 1005, 0020, 2626, 0242, 1145, 1045, 4146, 2980, 3180
Step 3: 3002, 1005, 0020, 1045, 1145, 4146, 3180, 0242, 2626, 2980
Step 4: 0020, 0242, 1005, 1045, 1145, 2626, 2980, 3002, 3180, 4146
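The four steps of the example can be reproduced with a short sketch of RadixSort(A, d). Note the swapped-in technique: each pass here uses `std::stable_sort` keyed on one decimal digit, which is comparison-based per pass and chosen only because it is stable and brief. The names `sortByDigit` and `radixSort` are assumptions of this example, and keys are assumed non-negative.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One pass: stable sort on the decimal digit selected by exp
// (exp = 1 for units, 10 for tens, and so on).
void sortByDigit(std::vector<int>& a, int exp) {
    std::stable_sort(a.begin(), a.end(), [exp](int x, int y) {
        return (x / exp) % 10 < (y / exp) % 10;
    });
}

// d passes, least significant digit first. Assumes d is small enough
// that 10^d fits in an int.
void radixSort(std::vector<int>& a, int d) {
    int exp = 1;
    for (int i = 1; i <= d; i++) {
        sortByDigit(a, exp);
        exp *= 10;
    }
}
```

Calling `sortByDigit(a, 1)` on the example array yields the order listed under Step 1, and `radixSort(a, 4)` yields the fully sorted list of Step 4. Stability of each pass is what preserves the ordering established by the earlier, less significant digits.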
    void radix_sort10(long a[], int n) {
        // sorts a[0..n-1]; keys are assumed non-negative and n <= MAX_SIZE
        long max = 0;
        for (int i = 0; i < n; i++) if (max < a[i]) max = a[i];
        long tmp[MAX_SIZE];
        long exp = 1;
        while (max/exp > 0) {
            int bin_sz[10] = {0}; // number of keys having each digit
            int idx[10];          // next free position for each digit
            for (int i = 0; i < n; i++) {
                int c = (a[i]/exp) % 10;
                bin_sz[c]++;
                tmp[i] = a[i];
            }
            idx[0] = 0;
            for (int i = 1; i < 10; i++)
                idx[i] = idx[i-1] + bin_sz[i-1];
            for (int i = 0; i < n; i++) {
                int c = (tmp[i]/exp) % 10;
                a[idx[c]] = tmp[i];
                idx[c]++;
            }
            exp = exp*10;
        }
    }
Bucket sort
Assume a uniform distribution on the input
The input (array A[1..n]) is generated by a random process that distributes elements uniformly and independently over the interval [0,1)
Main idea
  Interval [0,1) is divided into n equal-size subintervals (or buckets)
  Distribute A[1..n] into the buckets
  Sort elements in each bucket
  Concatenate the sorted lists of the buckets in order to establish the sorted list of the original array
Algorithm 2: BUCKET-SORT(A)
Let B[0..n − 1] be a new array;
foreach i = 0, . . . , n − 1 do
    Make B[i] an empty list;
foreach i = 1..n do
    Insert A[i] into list B[⌊n · A[i]⌋];
foreach i = 0, . . . , n − 1 do
    Sort list B[i] with insertion sort;
Concatenate the lists B[0], B[1], . . . , B[n − 1] together in order;
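The pseudocode above might be realised in C++ roughly as follows. This is a sketch under the slide's assumption that every input value lies in [0, 1); the name `bucketSort`, the 0-based `std::vector` input, and the clamp on the bucket index (a guard against floating-point rounding) are assumptions of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bucket sort for values in [0, 1), following BUCKET-SORT(A).
void bucketSort(std::vector<double>& a) {
    const std::size_t n = a.size();
    if (n < 2) return;
    std::vector<std::vector<double>> B(n);            // B[0..n-1], all empty
    for (double x : a) {
        assert(0.0 <= x && x < 1.0);
        std::size_t idx = static_cast<std::size_t>(n * x);  // floor(n * x)
        if (idx >= n) idx = n - 1;                    // rounding guard (assumption)
        B[idx].push_back(x);
    }
    std::size_t out = 0;
    for (auto& bucket : B) {
        // insertion sort inside the bucket
        for (std::size_t k = 1; k < bucket.size(); k++) {
            double last = bucket[k];
            std::size_t j = k;
            while (j > 0 && bucket[j - 1] > last) {
                bucket[j] = bucket[j - 1];
                j--;
            }
            bucket[j] = last;
        }
        for (double x : bucket) a[out++] = x;         // concatenate in bucket order
    }
}
```

Since ⌊n · x⌋ never decreases as x grows, every element of B[i] is no larger than every element of B[i + 1], so concatenating the sorted buckets gives a sorted array.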
Bucket sort - Analysis
Analysis of the average-case running time
Let $n_i$ be the random variable denoting the number of elements placed in bucket B[i]
$T(n) = \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)$
Average-case running time is $E[T(n)] = E\left[\Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)\right] = \Theta(n) + \sum_{i=0}^{n-1} O(E[n_i^2])$
We'll prove that $E[n_i^2] = 2 - \frac{1}{n}$ (shown next)
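Let $X_{ij} = 1$ if A[j] falls in bucket i, and $X_{ij} = 0$ otherwise, so that $n_i = \sum_{j=1}^{n} X_{ij}$
$E[n_i^2] = E\left[\left(\sum_{j=1}^{n} X_{ij}\right)^2\right] = E\left[\sum_{j=1}^{n} X_{ij}^2 + \sum_{1 \le j \le n} \sum_{k \in \{1,\dots,n\} \setminus \{j\}} X_{ij} X_{ik}\right] = \sum_{j=1}^{n} E[X_{ij}^2] + \sum_{1 \le j \le n} \sum_{k \in \{1,\dots,n\} \setminus \{j\}} E[X_{ij} X_{ik}]$
$X_{ij}$ is 1 with probability $\frac{1}{n}$ and 0 otherwise, therefore $E[X_{ij}^2] = 1^2 \cdot \frac{1}{n} + 0^2 \cdot \left(1 - \frac{1}{n}\right) = \frac{1}{n}$
When $k \ne j$, $X_{ij}$ and $X_{ik}$ are independent, hence $E[X_{ij} X_{ik}] = E[X_{ij}] E[X_{ik}] = \frac{1}{n} \cdot \frac{1}{n} = \frac{1}{n^2}$
Finally, the first sum has n terms and the double sum has n(n − 1) terms, so $E[n_i^2] = n \cdot \frac{1}{n} + n(n-1) \cdot \frac{1}{n^2} = 2 - \frac{1}{n} \Rightarrow E[T(n)] = \Theta(n) + n \cdot O\left(2 - \frac{1}{n}\right) = \Theta(n)$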
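The closed form can be sanity-checked numerically. Since $n_i$ is a sum of n independent indicators that are each 1 with probability 1/n, it follows a Binomial(n, 1/n) distribution, and its second moment can be summed term by term from the probability mass function. The sketch below does that; the function name `secondMoment` is an assumption, and it requires n ≥ 2 so that the ratio p/(1 − p) is finite.

```cpp
#include <cassert>
#include <cmath>

// Second moment E[n_i^2] for n_i ~ Binomial(n, 1/n), computed from the pmf.
double secondMoment(int n) {
    assert(n >= 2);
    const double p = 1.0 / n;
    double pmf = std::pow(1.0 - p, n);     // P(n_i = 0)
    double sum = 0.0;
    for (int k = 0; k <= n; k++) {
        sum += static_cast<double>(k) * k * pmf;
        // P(n_i = k+1) = P(n_i = k) * (n-k)/(k+1) * p/(1-p)
        if (k < n)
            pmf *= static_cast<double>(n - k) / (k + 1) * p / (1.0 - p);
    }
    return sum;
}
```

For n = 2 the sum is 0 · 0.25 + 1 · 0.5 + 4 · 0.25 = 1.5, which agrees with 2 − 1/2. The value stays below 2 for every n, which is why the expected cost of sorting each bucket is O(1).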