Sorting Algorithms - CmpE WEB · 2017-10-18
Sorting
Sorting is a process that organizes a collection of data into either ascending or descending order.
An internal sort requires that the collection of data fit entirely in the computer’s main memory.
We can use an external sort when the collection of data cannot fit in the computer’s main memory all at once but must reside in secondary storage such as on a disk (or tape).
We will analyze only internal sorting algorithms.
CMPE 250 Sorting Algorithms October 18, 2017 2 / 74
Why Sorting?
Any significant amount of computer output is generally arranged in some sorted order so that it can be interpreted.
Sorting also has indirect uses. An initial sort of the data can significantly enhance the performance of an algorithm.
The majority of programming projects use a sort somewhere, and in many cases, the sorting cost determines the running time.
A comparison-based sorting algorithm makes ordering decisions only on the basis of comparisons.
Sorting Algorithms
There are many sorting algorithms, such as:
  Selection Sort
  Insertion Sort
  Bubble Sort
  Merge Sort
  Quick Sort
  Heap Sort
  Shell Sort
The first three are the foundations for faster and more efficient algorithms.
Insertion Sort
Insertion sort is a simple sorting algorithm that is appropriate for small inputs.
It is the sorting technique most commonly used by card players.
The list is divided into two parts: sorted and unsorted.
In each pass, the first element of the unsorted part is picked up, transferred to the sorted sublist, and inserted at the appropriate place.
A list of n elements will take at most n − 1 passes to sort the data.
Insertion Sort Algorithm
// Simple insertion sort.
template <typename Comparable>
void insertionSort( vector<Comparable> & a )
{
    for( int p = 1; p < a.size( ); ++p )
    {
        Comparable tmp = std::move( a[ p ] );

        int j;
        for( j = p; j > 0 && tmp < a[ j - 1 ]; --j )
            a[ j ] = std::move( a[ j - 1 ] );
        a[ j ] = std::move( tmp );
    }
}
Insertion Sort – Analysis
Running time depends not only on the size of the array but also on the contents of the array.
Best-case: → $O(n)$
  Array is already sorted in ascending order.
  The body of the inner loop is never executed.
  The number of moves: $2 \times (n - 1) \to O(n)$
  The number of key comparisons: $(n - 1) \to O(n)$
Worst-case: → $O(n^2)$
  Array is in reverse order.
  The inner loop is executed $i - 1$ times, for $i = 2, 3, \ldots, n$
  The number of moves: $2 \times (n - 1) + (1 + 2 + \cdots + (n - 1)) = 2 \times (n - 1) + n \times (n - 1)/2 \to O(n^2)$
  The number of key comparisons: $(1 + 2 + \cdots + (n - 1)) = n \times (n - 1)/2 \to O(n^2)$
Average-case: → $O(n^2)$
  We have to look at all possible initial data organizations.
So, Insertion Sort is $O(n^2)$
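The best-case and worst-case counts above can be checked empirically. The following is a minimal sketch, assuming `int` keys; the names `SortCounts` and `countedInsertionSort` are hypothetical and not part of the course code. It is the same insertion sort with two counters added.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper type: tallies of the two operations the analysis counts.
struct SortCounts {
    long comparisons = 0;  // key comparisons (tmp < a[j - 1])
    long moves = 0;        // element moves (assignments of keys)
};

// Insertion sort on ints that also counts comparisons and moves.
SortCounts countedInsertionSort(std::vector<int>& a) {
    SortCounts c;
    for (std::size_t p = 1; p < a.size(); ++p) {
        int tmp = std::move(a[p]);
        ++c.moves;                       // move into tmp
        std::size_t j = p;
        while (j > 0) {
            ++c.comparisons;             // one key comparison per test
            if (!(tmp < a[j - 1])) break;
            a[j] = std::move(a[j - 1]);
            ++c.moves;                   // shift one element right
            --j;
        }
        a[j] = std::move(tmp);
        ++c.moves;                       // move tmp into its slot
    }
    return c;
}
```

For n = 5, an already sorted input should give 4 comparisons and 8 moves, and a reversed input should give 10 comparisons and 18 moves, matching $n - 1$, $2(n-1)$, $n(n-1)/2$ and $2(n-1) + n(n-1)/2$.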
Analysis of insertion sort
Which running time will be used to characterize this algorithm? Best, worst or average?
Worst:
  Longest running time (this is the upper limit for the algorithm).
  It is guaranteed that the algorithm will not be worse than this.
Sometimes we are interested in the average case. But there are some problems with the average case.
  It is difficult to figure out the average case, i.e. what is an average input?
  Are we going to assume all possible inputs are equally likely?
  In fact, for many algorithms the average case has the same order of growth as the worst case.
A lower bound for simple sorting algorithms
An inversion: an ordered pair $(A_i, A_j)$ such that $i < j$ but $A_i > A_j$
Example: 10, 6, 7, 15, 3, 1
Inversions are: (10,6), (10,7), (10,3), (10,1), (6,3), (6,1), (7,3), (7,1), (15,3), (15,1), (3,1)
Swapping
Swapping adjacent elements that are out of order removes one inversion.
A sorted array has no inversions.
Sorting an array that contains i inversions requires at least i swaps of adjacent elements.
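To make the link between inversions and adjacent swaps concrete, here is a small sketch. The function names are hypothetical, and the adjacent-swap sort is a plain bubble sort chosen only for illustration. It counts inversions by brute force and counts the swaps bubble sort performs; the two numbers agree.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Brute-force O(n^2) count of pairs (i, j) with i < j and a[i] > a[j].
std::size_t countInversions(const std::vector<int>& a) {
    std::size_t inversions = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = i + 1; j < a.size(); ++j)
            if (a[i] > a[j]) ++inversions;
    return inversions;
}

// Sorts using only swaps of adjacent out-of-order elements (bubble sort)
// and returns how many swaps were performed.
std::size_t sortByAdjacentSwaps(std::vector<int>& a) {
    std::size_t swaps = 0;
    bool swapped = true;
    while (swapped) {
        swapped = false;
        for (std::size_t i = 1; i < a.size(); ++i) {
            if (a[i - 1] > a[i]) {
                std::swap(a[i - 1], a[i]);  // removes exactly one inversion
                ++swaps;
                swapped = true;
            }
        }
    }
    return swaps;
}
```

On the example list 10, 6, 7, 15, 3, 1, both functions return 11, the number of inversions listed above.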
Theorems
Theorem 1: The average number of inversions in an array of N distinct elements is $N(N - 1)/4$
Theorem 2: Any algorithm that sorts by exchanging adjacent elements requires $\Omega(N^2)$ time on average.
For a sorting algorithm to run in less than quadratic time it must do something other than swap adjacent elements.
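The slides state the two theorems without proof. A short justification, sketched here with the standard pairing argument (the notation $L_r$ for the reversed list and $\mathrm{inv}$ for the inversion count is introduced only for this sketch):

```latex
For a list $L$ of $N$ distinct elements, let $L_r$ be $L$ in reverse order.
Every pair of distinct elements is an inversion in exactly one of $L$ and $L_r$, so
\[
  \mathrm{inv}(L) + \mathrm{inv}(L_r) = \binom{N}{2} = \frac{N(N-1)}{2}.
\]
All lists pair up as $L \leftrightarrow L_r$, so the average over all lists is
\[
  \frac{1}{2} \cdot \frac{N(N-1)}{2} = \frac{N(N-1)}{4}.
\]
Each swap of adjacent elements removes at most one inversion, so an algorithm
that only swaps adjacent elements performs at least $N(N-1)/4 = \Omega(N^2)$
swaps on average.
```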
Mergesort
The mergesort algorithm is one of the two important divide-and-conquer sorting algorithms (the other one is quicksort).
It is a recursive algorithm. It
  divides the list into halves,
  sorts each half separately, and
  then merges the sorted halves into one sorted array.
Mergesort
/**
 * Mergesort algorithm (driver).
 */
template <typename Comparable>
void mergeSort( vector<Comparable> & a )
{
    vector<Comparable> tmpArray( a.size( ) );

    mergeSort( a, tmpArray, 0, a.size( ) - 1 );
}
Mergesort (Cont.)
/**
 * Internal method that makes recursive calls.
 * a is an array of Comparable items.
 * tmpArray is an array to place the merged result.
 * left is the left-most index of the subarray.
 * right is the right-most index of the subarray.
 */
template <typename Comparable>
void mergeSort( vector<Comparable> & a, vector<Comparable> & tmpArray, int left, int right )
{
    if( left < right )
    {
        int center = ( left + right ) / 2;
        mergeSort( a, tmpArray, left, center );
        mergeSort( a, tmpArray, center + 1, right );
        merge( a, tmpArray, left, center + 1, right );
    }
}
Merge
/**
 * Internal method that merges two sorted halves of a subarray.
 * a is an array of Comparable items.
 * tmpArray is an array to place the merged result.
 * leftPos is the left-most index of the subarray.
 * rightPos is the index of the start of the second half.
 * rightEnd is the right-most index of the subarray.
 */
template <typename Comparable>
void merge( vector<Comparable> & a, vector<Comparable> & tmpArray,
            int leftPos, int rightPos, int rightEnd )
{
    int leftEnd = rightPos - 1;
    int tmpPos = leftPos;
    int numElements = rightEnd - leftPos + 1;

    // Main loop
    while( leftPos <= leftEnd && rightPos <= rightEnd )
        if( a[ leftPos ] <= a[ rightPos ] )
            tmpArray[ tmpPos++ ] = std::move( a[ leftPos++ ] );
        else
            tmpArray[ tmpPos++ ] = std::move( a[ rightPos++ ] );

    while( leftPos <= leftEnd )    // Copy rest of first half
        tmpArray[ tmpPos++ ] = std::move( a[ leftPos++ ] );

    while( rightPos <= rightEnd )  // Copy rest of right half
        tmpArray[ tmpPos++ ] = std::move( a[ rightPos++ ] );

    // Copy tmpArray back
    for( int i = 0; i < numElements; ++i, --rightEnd )
        a[ rightEnd ] = std::move( tmpArray[ rightEnd ] );
}
Mergesort – Analysis of Merge
[Figure: A worst-case instance of the merge step in mergesort]
Mergesort – Analysis of Merge (cont.)
Merging two sorted arrays of size k
Best-case:
  All the elements in the first array are smaller (or larger) than all the elements in the second array.
  The number of moves: 2k + 2k (2k into the temporary array, 2k back into the original array)
  The number of key comparisons: k
Worst-case:
  The number of moves: 2k + 2k
  The number of key comparisons: 2k − 1
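The comparison counts for the merge step can be observed directly. Below is a minimal sketch, assuming `int` keys and two separate input vectors rather than the in-place subarray version used in the course code; `MergeResult` and `countedMerge` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: the merged sequence plus the number of key comparisons.
struct MergeResult {
    std::vector<int> merged;
    std::size_t comparisons = 0;
};

// Merges two sorted vectors and counts the key comparisons performed.
MergeResult countedMerge(const std::vector<int>& x, const std::vector<int>& y) {
    MergeResult r;
    r.merged.reserve(x.size() + y.size());
    std::size_t i = 0, j = 0;
    while (i < x.size() && j < y.size()) {
        ++r.comparisons;
        if (x[i] <= y[j]) r.merged.push_back(x[i++]);
        else              r.merged.push_back(y[j++]);
    }
    while (i < x.size()) r.merged.push_back(x[i++]);  // rest of first input
    while (j < y.size()) r.merged.push_back(y[j++]);  // rest of second input
    return r;
}
```

With k = 4, merging {1, 2, 3, 4} with {5, 6, 7, 8} takes k = 4 comparisons, while the interleaved inputs {1, 3, 5, 7} and {2, 4, 6, 8} take 2k − 1 = 7.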
Mergesort - Analysis
[Figure: Levels of recursive calls to mergesort, given an array of eight items]
Mergesort - Analysis
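Worst-case number of key comparisons, for $n = 2^m$ items. At the level with $2^i$ merges, each merge combines two sorted runs of $2^{m-i-1}$ items and needs at most $2 \times 2^{m-i-1} - 1$ comparisons:

$$
\begin{aligned}
& 2^0 (2 \times 2^{m-1} - 1) + 2^1 (2 \times 2^{m-2} - 1) + \cdots + 2^{m-1} (2 \times 2^0 - 1) \\
&= (2^m - 1) + (2^m - 2) + \cdots + (2^m - 2^{m-1}) \quad (m \text{ terms}) \\
&= m \times 2^m - \sum_{i=0}^{m-1} 2^i \\
&= m \times 2^m - 2^m + 1
\end{aligned}
$$

Using $m = \log_2 n$:
$= n \log_2 n - n + 1 \to O(n \log_2 n)$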
Mergesort - Analysis
Mergesort is an extremely efficient algorithm with respect to time.
  Both the worst case and the average case are $O(n \log_2 n)$
But mergesort requires an extra array whose size equals the size of the original array.
If we use a linked list, we do not need an extra array
  But we need space for the links
  And it will be difficult to divide the list in half ( $O(n)$ )
Mergesort for Linked Lists
Merge sort is often preferred for sorting a linked list. The slow random-access performance of a linked list makes some other algorithms (such as quicksort) perform poorly, and others (such as heapsort) completely impossible.
MergeSort
1 If head is NULL or there is only one element in the linked list, then return.
2 Else divide the linked list into two halves a and b.
3 Sort the two halves a and b.
    MergeSort(&a);
    MergeSort(&b);
4 Merge the two parts of the list into a sorted one.
    *head = SortedMerge(a, b);
Mergesort for linked lists
#include <iostream>
using namespace std;

// Linked list node
typedef struct Node* listpointer;
struct Node
{
    int data;
    listpointer next;
};

// function prototypes
listpointer SortedMerge(listpointer a, listpointer b);
void FrontBackSplit(listpointer source, listpointer* frontRef, listpointer* backRef);

// sorts the linked list by changing next pointers (not data)
void MergeSort(listpointer* headRef)
{
    listpointer head = *headRef;
    listpointer a;
    listpointer b;

    // Base case -- length 0 or 1
    if ((head == NULL) || (head->next == NULL))
        return;

    // Split head into 'a' and 'b' sublists
    FrontBackSplit(head, &a, &b);

    // Recursively sort the sublists
    MergeSort(&a);
    MergeSort(&b);

    // answer = merge the two sorted lists together
    *headRef = SortedMerge(a, b);
}
Mergesort for linked lists (cont.)
listpointer SortedMerge(listpointer a, listpointer b)
{
    listpointer result = NULL;

    // Base cases
    if (a == NULL)
        return(b);
    else if (b == NULL)
        return(a);

    // Pick either a or b, and make recursive call
    if (a->data <= b->data)
    {
        result = a;
        result->next = SortedMerge(a->next, b);
    }
    else
    {
        result = b;
        result->next = SortedMerge(a, b->next);
    }
    return(result);
}
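The recursive SortedMerge makes one call per node, so its recursion depth grows with the length of the merged list. One possible alternative is an iterative merge that splices nodes behind a temporary dummy head. This is a sketch, not part of the course code: `sortedMergeIterative` is a hypothetical name, and the block redeclares a plain `Node` so that it stands alone.

```cpp
#include <cassert>
#include <cstddef>

// Stand-alone node type for this sketch (same fields as the slides' Node).
struct Node {
    int data;
    Node* next;
};

// Merges two sorted lists by relinking nodes; no allocation, no recursion.
Node* sortedMergeIterative(Node* a, Node* b) {
    Node dummy{0, nullptr};   // placeholder head, lives on the stack
    Node* tail = &dummy;      // last node of the merged list so far
    while (a != nullptr && b != nullptr) {
        if (a->data <= b->data) {   // <= keeps the merge stable
            tail->next = a;
            a = a->next;
        } else {
            tail->next = b;
            b = b->next;
        }
        tail = tail->next;
    }
    tail->next = (a != nullptr) ? a : b;  // append whichever list remains
    return dummy.next;
}
```

It has the same signature shape as SortedMerge, so it could stand in for it inside MergeSort if deep recursion on long lists is a concern.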
Mergesort for linked lists (cont.)
// Split the nodes of the given list into front and back halves,
// and return the two lists using the reference parameters.
// If the length is odd, the extra node should go in the front list.
// Uses the fast/slow pointer strategy.
void FrontBackSplit(listpointer source, listpointer* frontRef, listpointer* backRef)
{
    listpointer fast;
    listpointer slow;
    if (source == NULL || source->next == NULL)
    {
        // length < 2 cases
        *frontRef = source;
        *backRef = NULL;
    }
    else
    {
        slow = source;
        fast = source->next;

        // Advance 'fast' two nodes, and advance 'slow' one node
        while (fast != NULL)
        {
            fast = fast->next;
            if (fast != NULL)
            {
                slow = slow->next;
                fast = fast->next;
            }
        }

        // 'slow' is before the midpoint in the list, so split it in two
        // at that point.
        *frontRef = source;
        *backRef = slow->next;
        slow->next = NULL;
    }
}
Mergesort for linked lists (cont.)
// Function to print nodes in a given linked list
void printList(listpointer node)
{
    while (node != NULL)
    {
        cout << node->data << " ";
        node = node->next;
    }
}

// Function to insert a node at the beginning of the linked list
void push(listpointer* head_ref, int new_data)
{
    // allocate node
    listpointer new_node = new Node;

    // put in the data
    new_node->data = new_data;

    // link the old list off the new node
    new_node->next = (*head_ref);

    // move the head to point to the new node
    (*head_ref) = new_node;
}
Mergesort for linked lists (cont.)
// Driver program to test above functions
int main()
{
    // Start with the empty list
    listpointer a = NULL;
    int n, num;

    // Let us create an unsorted linked list to test the functions
    cout << endl << "Enter the number of data elements to be sorted: ";
    cin >> n;

    // Create linked list.
    for (int i = 0; i < n; i++)
    {
        cout << "Enter element " << i + 1 << ": ";
        cin >> num;
        push(&a, num);
    }

    // Sort the above created Linked List
    MergeSort(&a);

    cout << endl << "Sorted Linked List is: " << endl;
    printList(a);
    return 0;
}
Quicksort
Like mergesort, Quicksort is also based on the divide-and-conquer paradigm.
But it uses this technique in a somewhat opposite manner, as all the hard work is done before the recursive calls.
It works as follows:
  1 First, it partitions an array into two parts,
  2 Then, it sorts the parts independently,
  3 Finally, it combines the sorted subsequences by a simple concatenation.
Quicksort
Algorithm 1 Quicksort
1: Let S be the input set.
2: if |S| = 0 or |S| = 1 then
     return
3: Pick an element v in S. Call v the pivot.
4: Partition S − {v} into two disjoint groups:
     S1 = {x ∈ S − {v} | x ≤ v}
     S2 = {x ∈ S − {v} | x ≥ v}
   return quicksort(S1), v, quicksort(S2)
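Algorithm 1 can be transcribed almost literally if extra memory is acceptable. The sketch below is an illustration, not the in-place version developed in the following slides. It assumes `int` keys, picks the middle element as the pivot (an arbitrary choice), and resolves the ambiguity about elements equal to the pivot by collecting them in a third group. The name `quicksortBySets` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Set-style quicksort: returns a sorted copy of s.
std::vector<int> quicksortBySets(const std::vector<int>& s) {
    if (s.size() <= 1) return s;            // |S| = 0 or |S| = 1

    const int v = s[s.size() / 2];          // pivot (arbitrary choice for this sketch)
    std::vector<int> smaller, same, larger;
    for (int x : s) {
        if (x < v)      smaller.push_back(x);
        else if (v < x) larger.push_back(x);
        else            same.push_back(x);  // the pivot and any duplicates of it
    }

    // quicksort(S1), v, quicksort(S2)
    std::vector<int> result = quicksortBySets(smaller);
    result.insert(result.end(), same.begin(), same.end());
    const std::vector<int> sortedLarger = quicksortBySets(larger);
    result.insert(result.end(), sortedLarger.begin(), sortedLarger.end());
    return result;
}
```

Because `same` always contains at least the pivot, both recursive calls receive strictly smaller inputs, so the recursion terminates even when the input has duplicates.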
Issues To Consider
How to pick the pivot?
  Many methods (discussed later)
How to partition?
  Several methods exist.
  The one we consider is known to give good results and to be easy and efficient.
  We discuss the partition strategy first.
Partitioning Strategy
For now, assume that pivot = A[(left+right)/2].
We want to partition array A[left .. right].
First, get the pivot element out of the way by swapping it with the last element (swap pivot and A[right]).
Let i start at the first element and j start at the next-to-last element (i = left, j = right - 1)
Partitioning Strategy (Cont.)
We want to have
  A[k] ≤ pivot, for k < i
  A[k] ≥ pivot, for k > j
When i < j
  Move i right, skipping over elements smaller than the pivot
  Move j left, skipping over elements greater than the pivot
  When both i and j have stopped
    A[i] ≥ pivot
    A[j] ≤ pivot
    ⇒ A[i] and A[j] should now be swapped
Partitioning Strategy (Cont.)
When i and j have stopped and i is to the left of j (thus legal)
  Swap A[i] and A[j]
    The large element is pushed to the right and the small element is pushed to the left
  After swapping
    A[i] ≤ pivot
    A[j] ≥ pivot
  Repeat the process until i and j cross
Partitioning Strategy (Cont.)
When i and j have crossed
  swap A[i] and pivot
Result:
  A[k] ≤ pivot, for k < i
  A[k] ≥ pivot, for k > i
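The partitioning steps above can be sketched as a stand-alone function. This is an illustrative version assuming `int` keys and pivot = A[(left+right)/2]; the names `partitionAroundMiddle` and `quicksortMiddlePivot` are hypothetical. Unlike the median-of-three code later in the slides, there are no sentinel elements here, so the two scans carry explicit bounds checks.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Partitions a[left..right] around the middle element and returns the
// final index of the pivot.
int partitionAroundMiddle(std::vector<int>& a, int left, int right) {
    assert(0 <= left && left <= right && right < static_cast<int>(a.size()));
    std::swap(a[(left + right) / 2], a[right]);  // get the pivot out of the way
    const int pivot = a[right];
    int i = left, j = right - 1;
    for (;;) {
        while (i <= j && a[i] < pivot) ++i;      // skip elements smaller than pivot
        while (j >= i && a[j] > pivot) --j;      // skip elements greater than pivot
        if (i >= j) break;                       // i and j have met or crossed
        std::swap(a[i], a[j]);
        ++i;
        --j;
    }
    std::swap(a[i], a[right]);                   // put the pivot in its final place
    return i;
}

// Plain recursive quicksort built on the partition above (no cutoff).
void quicksortMiddlePivot(std::vector<int>& a, int left, int right) {
    if (left >= right) return;
    const int p = partitionAroundMiddle(a, left, right);
    quicksortMiddlePivot(a, left, p - 1);
    quicksortMiddlePivot(a, p + 1, right);
}
```

Both scans stop on elements equal to the pivot, as the strategy describes, which keeps the two sides balanced when the input contains many duplicates.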
Pivot Strategies
First element:
  Bad choice if input is sorted or in reverse sorted order
  Bad choice if input is nearly sorted
Random: even a malicious agent cannot arrange a bad input
Median-of-three: choose the median of the left, right, and center elements
Median of Three
// Return median of left, center, and right.
// Order these and hide the pivot.
template <typename Comparable>
const Comparable & median3( vector<Comparable> & a, int left, int right )
{
    int center = ( left + right ) / 2;

    if( a[ center ] < a[ left ] )
        std::swap( a[ left ], a[ center ] );
    if( a[ right ] < a[ left ] )
        std::swap( a[ left ], a[ right ] );
    if( a[ right ] < a[ center ] )
        std::swap( a[ center ], a[ right ] );

    // Place pivot at position right - 1
    std::swap( a[ center ], a[ right - 1 ] );
    return a[ right - 1 ];
}
Discussion
Small arrays: Quicksort is slower than insertion sort when N is small (say, N ≤ 20).
Optimization: Make |S| ≤ 20 the base case and call insertion sort.
Quicksort algorithm (driver)
template <typename Comparable>
void quicksort( vector<Comparable> & a )
{
    quicksort( a, 0, a.size( ) - 1 );
}
Quicksort algorithm (recursive)
// Uses median-of-three partitioning and a cutoff of 20.
// a is an array of Comparable items.
// left is the left-most index of the subarray.
// right is the right-most index of the subarray.
template <typename Comparable>
void quicksort( vector<Comparable> & a, int left, int right )
{
    if( left + 20 <= right )
    {
        const Comparable & pivot = median3( a, left, right );

        // Begin partitioning
        int i = left, j = right - 1;
        for( ; ; )
        {
            while( a[ ++i ] < pivot ) { }
            while( pivot < a[ --j ] ) { }
            if( i < j )
                std::swap( a[ i ], a[ j ] );
            else
                break;
        }

        std::swap( a[ i ], a[ right - 1 ] );  // Restore pivot

        quicksort( a, left, i - 1 );     // Sort small elements
        quicksort( a, i + 1, right );    // Sort large elements
    }
    else  // Do an insertion sort on the subarray
        insertionSort( a, left, right );
}
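The recursive routine calls a three-argument `insertionSort( a, left, right )`, which is not shown on the slides; only the whole-array version appears earlier. A plausible sketch of that overload, assuming it sorts the inclusive index range [left, right] with the same logic as the earlier insertion sort:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Assumed subarray overload: insertion sort on a[left..right], inclusive.
template <typename Comparable>
void insertionSort(std::vector<Comparable>& a, int left, int right) {
    assert(left >= 0 && right < static_cast<int>(a.size()));
    for (int p = left + 1; p <= right; ++p) {
        Comparable tmp = std::move(a[p]);
        int j;
        // Shift larger elements right, never moving past index left.
        for (j = p; j > left && tmp < a[j - 1]; --j)
            a[j] = std::move(a[j - 1]);
        a[j] = std::move(tmp);
    }
}
```

Elements outside the range are left untouched, which is what the cutoff branch relies on when it sorts one small subarray at a time.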
Analysis of Quicksort
Worst case: pivot is the smallest (or largest) element all the time.

$$
\begin{aligned}
T(N) &= T(N-1) + cN \\
T(N-1) &= T(N-2) + c(N-1) \\
T(N-2) &= T(N-3) + c(N-2) \\
&\;\;\vdots \\
T(2) &= T(1) + c(2)
\end{aligned}
$$

$$
T(N) = T(1) + c \sum_{i=2}^{N} i \to O(N^2)
$$

Best case: pivot is the median

$$
\begin{aligned}
T(N) &= 2T(N/2) + cN \\
T(N) &= cN \log N + N \to O(N \log N)
\end{aligned}
$$
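The best-case closed form follows from a telescoping sum. A sketch of the omitted steps, under the assumptions that $N$ is a power of two and $T(1) = 1$ (which is what the final $+N$ term implies):

```latex
Divide $T(N) = 2T(N/2) + cN$ by $N$:
\begin{align*}
  \frac{T(N)}{N}     &= \frac{T(N/2)}{N/2} + c \\
  \frac{T(N/2)}{N/2} &= \frac{T(N/4)}{N/4} + c \\
                     &\;\;\vdots \\
  \frac{T(2)}{2}     &= \frac{T(1)}{1} + c
\end{align*}
There are $\log_2 N$ equations. Adding them, the intermediate terms cancel:
\[
  \frac{T(N)}{N} = \frac{T(1)}{1} + c \log_2 N
  \quad\Longrightarrow\quad
  T(N) = cN \log_2 N + N .
\]
```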
Quicksort: Average case
Assume each of the sizes for $S_1$ is equally likely, with $0 \le |S_1| \le N - 1$.

$$
\begin{aligned}
T(N) &= \frac{1}{N} \sum_{i=0}^{N-1} \left[ T(i) + T(N-i-1) \right] + cN \\
T(N) &= \frac{2}{N} \sum_{i=0}^{N-1} T(i) + cN \\
N\,T(N) &= 2 \sum_{i=0}^{N-1} T(i) + cN^2 \\
(N-1)\,T(N-1) &= 2 \sum_{i=0}^{N-2} T(i) + c(N-1)^2
\end{aligned}
$$

Subtracting the last equation from the one before it:

$$
N\,T(N) - (N-1)\,T(N-1) = 2T(N-1) + 2cN - c
$$

Dropping the insignificant $-c$ and rearranging:

$$
N\,T(N) = (N+1)\,T(N-1) + 2cN
$$

Divide the equation by $N(N+1)$:

$$
\frac{T(N)}{N+1} = \frac{T(N-1)}{N} + \frac{2c}{N+1}
$$
Quicksort: Average case (Cont.)
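$$
\begin{aligned}
\frac{T(N-1)}{N} &= \frac{T(N-2)}{N-1} + \frac{2c}{N} \\
\frac{T(N-2)}{N-1} &= \frac{T(N-3)}{N-2} + \frac{2c}{N-1} \\
&\;\;\vdots \\
\frac{T(2)}{3} &= \frac{T(1)}{2} + \frac{2c}{3}
\end{aligned}
$$

$$
\frac{T(N)}{N+1} = \frac{T(1)}{2} + 2c \sum_{i=3}^{N+1} \frac{1}{i}
$$

$$
2c \sum_{i=3}^{N+1} \frac{1}{i} = 2c \left( H_{N+1} - \frac{3}{2} \right)
$$

$$
T(N) = (N+1) \left( \frac{T(1)}{2} + 2c \left( H_{N+1} - \frac{3}{2} \right) \right)
$$

$H_N \approx \log_e(N) + \gamma + \frac{1}{2N}$, where $\gamma = 0.577215664901$ (Euler-Mascheroni constant)

$$
T(N) \approx (N+1) \left[ \frac{T(1)}{2} + 2c \left( \log_e(N+1) + \gamma + \frac{1}{2(N+1)} - \frac{3}{2} \right) \right]
$$

$T(N) \to O(N \log N)$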
Heapsort
The priority queue can be used to sort N items by
  inserting every item into a binary heap and
  extracting every item by calling deleteMin N times, thus sorting the result.
An algorithm based on this idea is heapsort.
It is an $O(N \log N)$ worst-case sorting algorithm.
Heapsort
The main problem with this algorithm is that it uses an extra array for the items exiting the heap.
We can avoid this problem as follows:
  After each deleteMin, the heap shrinks by 1.
  Thus the cell that was last in the heap can be used to store the element that was just deleted.
  Using this strategy, after the last deleteMin, the array will contain all elements in decreasing order.
If we want them in increasing order we must use a max heap.
Heapsort Example
[Figure: Max heap after the buildHeap phase for the input sequence 59, 36, 58, 21, 41, 97, 31, 16, 26, 53]
Heapsort Example (Cont.)
[Figure: Heap after the first deleteMax operation]
Heapsort Example (Cont.)
[Figure: Heap after the second deleteMax operation]
Implementation
In the implementation of heapsort, the ADT BinaryHeap is not used.
  Everything is done in an array.
The root is stored in position 0.
Thus there are some minor changes in the code:
  Since we use a max heap, the logic of comparisons is changed from > to <.
  For a node in position i, the parent is in (i − 1)/2 (integer division), the left child is in 2i + 1, and the right child is next to the left child (2i + 2).
  Percolating down needs the current heap size, which is lowered by 1 at every deletion.
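The index arithmetic for a 0-based heap can be exercised with a small checker. In this sketch, `parent`, `rightChild` and `isMaxHeap` are hypothetical helpers (only `leftChild` appears in the course code), and `std::is_heap` from the standard library is used as an independent cross-check.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// 0-based heap index arithmetic, as described above.
inline int parent(int i)     { return (i - 1) / 2; }
inline int leftChild(int i)  { return 2 * i + 1; }
inline int rightChild(int i) { return 2 * i + 2; }

// True if every non-root element is no larger than its parent.
bool isMaxHeap(const std::vector<int>& a) {
    for (int i = 1; i < static_cast<int>(a.size()); ++i)
        if (a[parent(i)] < a[i]) return false;
    return true;
}
```

For example, the array 97, 53, 59, 26, 41, 58, 31, 16, 21, 36 passes this check. Working the percolate-down build by hand on the example input 59, 36, 58, 21, 41, 97, 31, 16, 26, 53 gives this arrangement; treat that correspondence as a hand-derived assumption, since the slide figure itself is not reproduced here.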
The Heapsort Algorithm
// Standard heapsort.
template <typename Comparable>
void heapsort( vector<Comparable> & a )
{
    for( int i = a.size( ) / 2 - 1; i >= 0; --i )  // buildHeap
        percDown( a, i, a.size( ) );
    for( int j = a.size( ) - 1; j > 0; --j )
    {
        std::swap( a[ 0 ], a[ j ] );               // deleteMax
        percDown( a, 0, j );
    }
}
percDown Algorithm
// Internal method for heapsort.
// i is the index of an item in the heap.
// Returns the index of the left child.
inline int leftChild( int i )
{
    return 2 * i + 1;
}

// Internal method for heapsort that is used in
// deleteMax and buildHeap.
// i is the position from which to percolate down.
// n is the logical size of the binary heap.
template <typename Comparable>
void percDown( vector<Comparable> & a, int i, int n )
{
    int child;
    Comparable tmp;

    for( tmp = std::move( a[ i ] ); leftChild( i ) < n; i = child )
    {
        child = leftChild( i );
        if( child != n - 1 && a[ child ] < a[ child + 1 ] )
            ++child;
        if( tmp < a[ child ] )
            a[ i ] = std::move( a[ child ] );
        else
            break;
    }
    a[ i ] = std::move( tmp );
}
Analysis of Heapsort
It is an $O(N \log N)$ algorithm.
  First phase: Build heap $O(N)$
  Second phase: N deleteMax operations: $O(N \log N)$.
Detailed analysis shows that the average case for heapsort is poorer than that of quicksort.
  Quicksort’s worst case however is far worse.
An average case analysis of heapsort is very complicated, but empirical studies show that there is little difference between the average and worst cases.
  Heapsort usually takes about twice as long as quicksort.
  Heapsort therefore should be regarded as something of an insurance policy:
  On average, it is more costly, but it avoids the possibility of $O(N^2)$.
How fast can we sort?
Heapsort, Mergesort, and Quicksort all run in $O(N \log N)$ best case running time.
Can we do any better?
The Answer is No! (if using comparisons only)
Our basic assumption: we can only compare two elements at a time. How does this limit the run time?
Suppose you are given N elements
  Assume no duplicates; any sorting algorithm must also work for this case
  How many possible orderings can you get?
How many possible orderings?
Example: a, b, c (N = 3)
Orderings:
  1 a b c
  2 b c a
  3 c a b
  4 a c b
  5 b a c
  6 c b a
6 orderings = 3 × 2 × 1 = 3!
For N elements: N! orderings
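The count of orderings can be confirmed by enumeration. A small sketch using `std::next_permutation` from the standard library; `allOrderings` is a hypothetical helper, and the items are assumed to be distinct characters.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Returns every ordering of the characters in items, in lexicographic order.
std::vector<std::string> allOrderings(std::string items) {
    std::sort(items.begin(), items.end());  // start from the smallest ordering
    std::vector<std::string> result;
    do {
        result.push_back(items);
    } while (std::next_permutation(items.begin(), items.end()));
    return result;
}
```

For "abc" this yields 6 orderings and for "abcd" it yields 24, in line with N!.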
A Decision Tree
[Figure: A decision tree whose leaves contain the possible orderings of a, b, c]
Decision Trees and Sorting
A Decision Tree is a Binary Tree such that:
  Each node = a set of orderings
  Each edge = 1 comparison
  Each leaf = 1 unique ordering
  How many leaves for N distinct elements?
Only 1 leaf has the sorted ordering
Each sorting algorithm corresponds to a decision tree
  Finds the correct leaf by following edges (= comparisons)
Run time ≥ maximum number of comparisons
  Depends on: depth of decision tree
  What is the depth of a decision tree for N distinct elements?
Lower Bound on Comparison-Based Sorting
Suppose you have a binary tree of depth d. How many leaves can the tree have?
  e.g. depth d = 1 → at most 2 leaves, d = 2 → at most 4 leaves, etc.
Lower Bound on Comparison-Based Sorting
A binary tree of depth d has at most $2^d$ leaves
  Number of leaves $L \le 2^d \Rightarrow d \ge \log_2 L$
Decision tree has $L = N!$ leaves ⇒ its depth $d \ge \log_2(N!)$
Lower Bound on Comparison-Based Sorting
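Stirling’s approximation:

$$
N! \approx \sqrt{2 \pi N} \left( \frac{N}{e} \right)^N
$$

$$
\begin{aligned}
\log(N!) &\approx \log\left( \sqrt{2 \pi N} \left( \frac{N}{e} \right)^N \right) \\
&= \log\left( \sqrt{2 \pi N} \right) + \log\left( \left( \frac{N}{e} \right)^N \right) \\
&= \frac{1}{2} \log(2 \pi N) + N (\log(N) - 1) \to \Omega(N \log N)
\end{aligned}
$$

(Here log is the natural logarithm, so log e = 1; changing the base only changes a constant factor.)
Conclusion: Any sorting algorithm based on comparisons between elements requires $\Omega(N \log N)$ comparisons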
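For small N the bound $d \ge \log_2(N!)$ can be evaluated exactly with integer arithmetic. The sketch below uses a hypothetical helper, `minComparisonsToSort`, restricted to N ≤ 20 so that N! fits in 64 bits. It returns the smallest depth d with $2^d \ge N!$.

```cpp
#include <cassert>
#include <cstdint>

// Smallest d such that 2^d >= n!, i.e. ceil(log2(n!)): a lower bound on the
// worst-case number of comparisons needed to sort n distinct elements.
int minComparisonsToSort(int n) {
    assert(n >= 0 && n <= 20);               // 20! still fits in 64 bits
    std::uint64_t factorial = 1;
    for (int i = 2; i <= n; ++i)
        factorial *= static_cast<std::uint64_t>(i);

    int depth = 0;
    std::uint64_t leaves = 1;                // a tree of depth d has at most 2^d leaves
    while (leaves < factorial) {
        leaves *= 2;
        ++depth;
    }
    return depth;
}
```

For N = 3 this gives 3 (6 orderings need at least 8 leaves), for N = 4 it gives 5, and for N = 5 it gives 7.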
Comparison of Sorting Algorithms
Algorithm         Worst case       Average case
Selection sort    $O(N^2)$         $O(N^2)$
Bubble sort       $O(N^2)$         $O(N^2)$
Insertion sort    $O(N^2)$         $O(N^2)$
Mergesort         $O(N \log N)$    $O(N \log N)$
Quicksort         $O(N^2)$         $O(N \log N)$
Radix sort        $O(N)$           $O(N)$
Treesort          $O(N^2)$         $O(N \log N)$
Heapsort          $O(N \log N)$    $O(N \log N)$
Sorting in linear time
Comparison sort:
  Lower bound: $\Omega(n \log n)$.
Non-comparison sort:
  Bucket sort, radix sort
  They can sort in linear time (under certain assumptions).
Bucket Sort
Assumption: uniform distribution
  Input numbers are uniformly distributed in [0, 1).
  Suppose input size is n.
Idea:
  Divide [0, 1) into n equal-sized subintervals (buckets).
  Distribute n numbers into buckets
  Expect that each bucket contains few numbers.
  Sort numbers in each bucket (insertion sort as default).
  Then go through buckets in order, listing elements.
Bucket Sort Algorithm
Algorithm 2 BucketSort(A)
1: n ← length[A]
2: for i ← 1 to n do
     insert A[i] into bucket B[⌊n·A[i]⌋]
3: for i ← 0 to n − 1 do
     sort bucket B[i] using insertion sort
4: Concatenate buckets B[0], B[1], ..., B[n−1]
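The pseudocode above can be sketched in C++ as follows. This is one possible rendering, assuming `double` inputs in [0, 1) and using a vector of vectors for the buckets; the clamp on the bucket index is a defensive addition for floating-point rounding and is not part of the pseudocode.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Bucket sort for values in [0, 1), following the four steps of Algorithm 2.
void bucketSort(std::vector<double>& a) {
    const std::size_t n = a.size();
    if (n < 2) return;

    // Step 2: distribute the n numbers into n buckets.
    std::vector<std::vector<double>> buckets(n);
    for (double x : a) {
        assert(0.0 <= x && x < 1.0);                            // input assumption
        std::size_t index = static_cast<std::size_t>(n * x);    // floor(n * x)
        index = std::min(index, n - 1);                         // guard against rounding
        buckets[index].push_back(x);
    }

    // Steps 3 and 4: insertion-sort each bucket, then concatenate in order.
    std::size_t out = 0;
    for (auto& bucket : buckets) {
        for (std::size_t p = 1; p < bucket.size(); ++p) {
            const double tmp = bucket[p];
            std::size_t j = p;
            for (; j > 0 && tmp < bucket[j - 1]; --j)
                bucket[j] = bucket[j - 1];
            bucket[j] = tmp;
        }
        for (double x : bucket) a[out++] = x;
    }
}
```

The linear expected running time depends on the uniform-distribution assumption; if many inputs fall into one bucket, the per-bucket insertion sort dominates.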
Analysis of Bucket Sort Algorithm
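Algorithm 3 BucketSort(A)
1: n ← length[A]                                   $\Theta(1)$
2: for i ← 1 to n do                               $O(n)$
     insert A[i] into bucket B[⌊n·A[i]⌋]           $\Theta(1)$ each (i.e. $O(n)$ in total)
3: for i ← 0 to n − 1 do                           $O(n)$
     sort bucket B[i] using insertion sort         $O(n_i^2)$
4: Concatenate buckets B[0], B[1], ..., B[n−1]     $O(n)$

where $n_i$ is the size of bucket B[i].
For uniformly distributed input, the expected value of $n_i^2$ is $2 - \frac{1}{n}$, so the expected running time is

$$
T(n) = \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2) = \Theta(n) + n \cdot O\left(2 - \frac{1}{n}\right) = \Theta(n)
$$

This is better than the $\Omega(n \log n)$ lower bound, which applies only to comparison-based sorting.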
Radix Sort
Origin: Herman Hollerith’s card-sorting machine for the 1890 U.S. Census
Digit-by-digit sort.
Hollerith’s original (bad) idea: sort on most-significant digit first.
Good idea: Sort on least-significant digit first with auxiliary stable sort.
Stable Sort Property: The relative order of any two items with the same key is preserved after the execution of the algorithm.
Radix Sort Algorithm
Algorithm 4 RadixSort(A, d)
1: for i ← 1 to d do
     use stable BucketSort to sort array A on digit i (digit 1 is the least significant).
Lemma: Given n d-digit numbers in which each digit can take on up to k possible values, RadixSort correctly sorts these numbers in $\Theta(d(n + k))$ time.
If d is constant and k = O(n), then time is $\Theta(n)$.
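A least-significant-digit radix sort can be sketched as below. Two choices here are assumptions made for illustration: keys are non-negative decimal integers (k = 10), and the stable per-digit pass is a counting sort rather than the bucket sort named in Algorithm 4. The function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Stable counting sort of a on the decimal digit selected by exp (1, 10, 100, ...).
void countingSortByDigit(std::vector<unsigned>& a, unsigned long long exp) {
    std::array<std::size_t, 10> count{};             // occurrences of each digit
    for (unsigned x : a) ++count[(x / exp) % 10];
    for (std::size_t d = 1; d < 10; ++d)             // prefix sums: end position per digit
        count[d] += count[d - 1];

    std::vector<unsigned> out(a.size());
    for (std::size_t i = a.size(); i-- > 0; ) {      // walk backwards to keep stability
        const std::size_t d = (a[i] / exp) % 10;
        out[--count[d]] = a[i];
    }
    a.swap(out);
}

// Sorts numbers of at most `digits` decimal digits, least significant digit first.
void radixSort(std::vector<unsigned>& a, int digits) {
    assert(digits >= 0 && digits <= 19);             // keeps exp within 64 bits
    unsigned long long exp = 1;
    for (int i = 0; i < digits; ++i, exp *= 10)
        countingSortByDigit(a, exp);
}
```

Each pass costs $\Theta(n + k)$ and there are d passes, which matches the $\Theta(d(n + k))$ bound in the lemma. Stability of each pass is what lets later passes preserve the order established by earlier, less significant digits.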