
Sorting

Jan 04, 2016


winter-myers

Transcript
Page 1: Sorting

Sorting

Consider the problem of sorting a list x1, x2, ..., xn:

Arrange the elements so that they (or some key fields in them) are in
  ascending order:   x1 <= x2 <= ... <= xn
or in
  descending order:  x1 >= x2 >= ... >= xn

Some O(n^2) sorting schemes are
  easy to understand and to implement, but
  not very efficient, especially for large data sets.

Three categories: selection sorts, exchange sorts, and insertion sorts.

Page 2: Sorting

Selection Sort

Basic idea: make a number of passes through the list or a part of the list and, on each pass, select one element to be correctly positioned.

For example, on each pass through a sublist, the smallest element in this sublist might be found and then moved to its proper location.

Suppose the following list is to be sorted into ascending order:
  67, 33, 21, 84, 49, 50, 75

Scan the list to locate the smallest element, 21, which is found in position 3.
Interchange this element with the first element, properly positioning the smallest element at the beginning of the list:
  21, 33, 67, 84, 49, 50, 75

In all subsequent scans, the first element need not be examined.

Page 3: Sorting

Selection Sort

Continue the sort by scanning the sublist of elements from position 2 on to find the smallest element.

Exchange it with the second element (itself in this case), properly positioning the next-to-smallest element in position 2.

In all subsequent scans, the first two elements need not be examined.

Continue in this manner, locating the smallest element in the sublist of elements from position i on and interchanging it with the ith element, until the sublist consists only of the last two elements; that final pass results in an exchange or not and thus completes the sort.

  21, 33, 49, 84, 67, 50, 75
  21, 33, 49, 50, 67, 84, 75
  21, 33, 49, 50, 67, 84, 75
  21, 33, 49, 50, 67, 75, 84

Page 4: Sorting

Selection Sort Algorithm

For i = 1 to n - 1 do:
  /* On the ith pass, first find the smallest element in sublist x[i] ... x[n] */
  a. Set smallPos = i.
  b. Set smallest = x[smallPos].
  c. For j = i + 1 to n do:
       If x[j] < smallest:   // smaller element found
         i.  Set smallPos = j.
         ii. Set smallest = x[smallPos].
  /* Swap smallest element found with element at the beginning of sublist */
  d. Set x[smallPos] = x[i].
  e. Set x[i] = smallest.

A version that can be used for linked lists is just as easy: replace the indices i and j with pointers that move through the list and sublists.
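
The array version of the selection sort algorithm can be sketched in C++ as follows. This is a minimal illustration, not the author's code: it assumes a 0-based std::vector instead of the 1-based array x[1..n] used on the slide, and the name selectionSort is hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts v into ascending order. On pass i, the smallest element of the
// unsorted sublist v[i..n-1] is located and swapped into position i.
template <typename ElementType>
void selectionSort(std::vector<ElementType>& v)
{
    const std::size_t n = v.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        std::size_t smallPos = i;               // position of smallest so far
        for (std::size_t j = i + 1; j < n; ++j) {
            if (v[j] < v[smallPos])             // smaller element found
                smallPos = j;
        }
        if (smallPos != i)                      // skip the "swap with itself" case
            std::swap(v[i], v[smallPos]);
    }
}
```

Calling selectionSort on a vector holding 67, 33, 21, 84, 49, 50, 75 performs the same interchanges as the walkthrough on the preceding pages.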

Page 5: Sorting

Selection Sort Complexity

The same number of comparisons is made by this sorting method in all cases.

On the first pass through the list, the first item is compared with each of the n - 1 elements that follow it.

On the second pass, the second element is compared with the n - 2 elements following it, etc.

A total of (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2 comparisons is thus required for any list.

It follows that the computing time is O(n^2) in all cases.

Quadratic complexity is not considered practical (feasible) for large n.

Page 6: Sorting

Exchange Sort

Exchange sorts systematically interchange pairs of elements that are out of order until eventually no such pairs remain, at which point the list is sorted.

One example of an exchange sort is bubble sort: very inefficient, but quite easy to understand.

Consider again the list
  67, 33, 21, 84, 49, 50, 75

On the first pass, compare the first two elements, 67 and 33, and interchange them because they are out of order:
  33, 67, 21, 84, 49, 50, 75

Next compare the second and third elements, 67 and 21, and interchange them, yielding:
  33, 21, 67, 84, 49, 50, 75

Page 7: Sorting

Bubble Sort

Next we compare 67 and 84 but do not interchange them (already ordered):
  33, 21, 67, 84, 49, 50, 75

Next, 84 and 49 are compared and interchanged:    33, 21, 67, 49, 84, 50, 75
Then 84 and 50 are compared and interchanged:     33, 21, 67, 49, 50, 84, 75
Finally 84 and 75 are compared and interchanged:  33, 21, 67, 49, 50, 75, 84

The first pass through the list is now complete. It is guaranteed that on this pass the largest element in the list will "sink" to the end of the list, since it is moved past all smaller elements.

Notice also that some of the smaller items have "bubbled up" toward their proper positions nearer the front of the list.

Scan the list again, leaving out the last item (already in its proper position).

Page 8: Sorting

Bubble Sort Algorithm

1. Initialize numPairs to n - 1.
   /* numPairs is the number of pairs to be compared on the current pass */

2. Do the following:
   a. Set last equal to 1.
      /* last marks the location of the last element involved in an interchange */
   b. For i = 1 to numPairs:
        If x[i] > x[i + 1]:
          i.  Interchange x[i] and x[i + 1].
          ii. Set last equal to i.
   c. Set numPairs equal to last - 1.
   while numPairs > 0.
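
The bubble sort algorithm above can be sketched in C++ roughly as follows. The sketch assumes a 0-based std::vector, so last and numPairs are shifted by one relative to the 1-based pseudocode; the function name bubbleSort is hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts v into ascending order by repeatedly interchanging adjacent
// out-of-order pairs. numPairs shrinks to the position of the last
// interchange, so an already sorted tail is never rescanned.
template <typename ElementType>
void bubbleSort(std::vector<ElementType>& v)
{
    if (v.size() < 2)
        return;
    std::size_t numPairs = v.size() - 1;    // pairs to compare on this pass
    while (numPairs > 0) {
        std::size_t last = 0;               // index of the last pair interchanged
        for (std::size_t i = 0; i < numPairs; ++i) {
            if (v[i + 1] < v[i]) {
                std::swap(v[i], v[i + 1]);
                last = i;
            }
        }
        numPairs = last;                    // pairs at index >= last are in order
    }
}
```

On a list that is already sorted, the first pass makes no interchange, last stays 0, and the loop stops after n - 1 comparisons.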

Page 9: Sorting

Bubble Sort Complexity

The worst case for bubble sort occurs when the list elements are in reverse order: only one item (the largest) is positioned correctly on each pass.

On the first pass through the list, n - 1 comparisons and interchanges are made, and only the largest element is correctly positioned.

On the next pass, the sublist consisting of the first n - 1 elements is scanned; there are n - 2 comparisons and interchanges, and the next largest element sinks to position n - 1.

Continue until the sublist consisting of the first two elements is scanned.

A total of (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2 comparisons and interchanges is made, so the worst-case computing time for bubble sort is O(n^2).

Page 10: Sorting

Insertion Sort

Insertion sorts are based on the process of repeatedly inserting a new element into an already sorted list.

At the ith stage, xi is inserted into its proper place among the already sorted x1, x2, ..., xi-1.

Compare xi with each of these elements, starting from the right end, and shift them to the right as necessary.

Use array position 0 to store a copy of xi to prevent "falling off the left end" in these right-to-left scans.

Page 11: Sorting

Insertion Sort Algorithm

For i = 2 to n do:
  /* Insert x[i] into its proper position among x[1], ..., x[i - 1] */
  a. Set nextElement equal to x[i].
  b. Set x[0] equal to nextElement.
  c. Set j equal to i.
  d. While nextElement < x[j - 1] do:
       // Shift element to the right to open a spot
       i.  Set x[j] equal to x[j - 1].
       ii. Decrement j by 1.
  // Now drop nextElement into the open spot
  e. Set x[j] equal to nextElement.

Shifting array elements is costly: inserting x[i] may move as many as i - 1 elements.

Insertion sort, however, works well with linked-list implementations, where no shifting is needed. It is essentially the same algorithm as constructing an ordered list.
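
The array algorithm with the position-0 sentinel can be sketched in C++ as shown here. The sketch assumes the caller stores the data in x[1..n] of a std::vector and leaves x[0] free as scratch space; the function name insertionSort is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Sorts x[1..n] into ascending order, where n = x.size() - 1.
// x[0] is overwritten: it holds a copy of the element being inserted,
// which stops the right-to-left scan without an explicit bounds check.
template <typename ElementType>
void insertionSort(std::vector<ElementType>& x)
{
    if (x.size() < 3)                     // zero or one data element
        return;
    const std::size_t n = x.size() - 1;
    for (std::size_t i = 2; i <= n; ++i) {
        ElementType nextElement = x[i];
        x[0] = nextElement;               // sentinel
        std::size_t j = i;
        while (nextElement < x[j - 1]) {
            x[j] = x[j - 1];              // shift right to open a spot
            --j;
        }
        x[j] = nextElement;               // drop into the open spot
    }
}
```

The scan always stops at j = 1 at the latest, because nextElement < x[0] compares the element with its own copy and is false.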

Page 12: Sorting

Insertion Sort Example

Given the list 67, 33, 21, 84, 49, 50, 75.

Only the sorted sublist produced at each stage is shown:

  67
  33 67
  21 33 67
  21 33 67 84
  21 33 49 67 84
  21 33 49 50 67 84
  21 33 49 50 67 75 84

Worst-case computing time is again O(n^2).

Page 13: Sorting

Heaps

A heap is a binary tree with the following properties:

1. left-complete: each level of the tree is completely filled, except possibly the bottom level, where the nodes are in the leftmost positions.

2. heap-ordered: the data item stored in each node is greater than or equal to the data items stored in its children.

[Figure: two example binary trees built from the values 12, 14, 22, 24, 28, one labeled "Not a heap" and the other labeled "Heap"]

Page 14: Sorting

Heaps

To implement a heap, an array or a vector can be used most effectively.

Simply number the nodes in the heap from top to bottom, number the nodes on each level from left to right, and store the data in the ith node in the ith location of the array.

The completeness property of a heap guarantees that these data items will be stored in consecutive locations at the beginning of the array.

If heap is the name of the array or vector used, the items in the previous heap are stored as follows:
  heap[1] = 24, heap[2] = 14, heap[3] = 28, heap[4] = 12, heap[5] = 22.

In an array implementation, it is easy to find the children of a given node: the children of the ith node are at locations 2*i and 2*i + 1.

Similarly, the parent of the ith node is easily seen to be in location i / 2 (integer division).
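
The index arithmetic for the 1-based array layout can be written down directly. The helper names below (leftChild, rightChild, parent, inHeap) are hypothetical and only illustrate the formulas on this page.

```cpp
#include <cstddef>

// Positions are 1-based, matching the node numbering described above.
constexpr std::size_t leftChild(std::size_t i)  { return 2 * i; }
constexpr std::size_t rightChild(std::size_t i) { return 2 * i + 1; }
constexpr std::size_t parent(std::size_t i)     { return i / 2; }   // integer division

// A position i refers to an actual node of an n-item heap when 1 <= i <= n.
constexpr bool inHeap(std::size_t i, std::size_t n)
{
    return i >= 1 && i <= n;
}
```

In a five-item heap, for instance, node 2 has children at positions 4 and 5, while node 3 has none because positions 6 and 7 lie past the end of the array.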

Page 15: Sorting

Convert Complete Binary Tree to a Heap

Given a complete binary tree stored in positions r through n of the array heap, whose left and right subtrees are heaps.

Percolate-Down:

Set c = 2 * r.   // c is location of left child
While c <= n do:
  // Find the largest child
  a. If c < n and heap[c] < heap[c + 1]:
       Increment c by 1.
  /* Swap node & largest child if needed, move down to the next subtree */
  b. If heap[r] < heap[c]:
       i.   Swap heap[r] and heap[c].
       ii.  Set r = c.
       iii. Set c = 2 * c.
     Else
       Terminate repetition.

Apply this percolate-down procedure to each non-leaf node, from position n/2 back to the root, to convert the whole tree to a heap.

Page 16: Sorting

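Convert Complete Binary Tree to a Heap

template <typename ElementType>
void PercolateDown(ElementType x[], int n, int r)
{
    // x[1..n] holds the tree; both subtrees of node r are already heaps
    int c = 2 * r;                      // location of left child
    bool done = false;
    while (c <= n && !done)             // <= so a lone left child at n is examined
    {
        // Find the largest child
        if (c < n && x[c] < x[c + 1])
            c++;
        // Swap node and largest child if needed, then move down
        if (x[r] < x[c])
        {
            ElementType temp = x[r];
            x[r] = x[c];
            x[c] = temp;
            r = c;
            c = 2 * c;
        }
        else
            done = true;
    }
}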

Page 17: Sorting

Convert Complete Binary Tree to a Heap

template <typename ElementType>
void Heapify(ElementType x[], int n)
{
    // Percolate down from the last non-leaf node back to the root
    for (int r = n / 2; r > 0; r--)
        PercolateDown(x, n, r);
}

After application of Heapify(), x is a heap.

Page 18: Sorting

Heapsort

1. Consider array x as a complete binary tree and use the Heapify algorithm to convert this tree to a heap.

2. For i = n down to 2:
   a. Interchange x[1] and x[i], thus putting the largest element in the sublist x[1], ..., x[i] at the end of the sublist.
   b. Apply the PercolateDown algorithm to convert the binary tree corresponding to the sublist stored in positions 1 through i - 1 of x back into a heap.

In PercolateDown, the number of items in the subtree considered at each stage is one-half the number of items in the subtree at the preceding stage. Thus, its worst-case computing time is O(log2 n).

The Heapify algorithm executes PercolateDown n/2 times, so its worst-case computing time is O(n log2 n).

Heapsort executes Heapify one time and PercolateDown n - 1 times; consequently, its worst-case computing time is O(n log2 n).

Page 19: Sorting

Heapsort

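template <typename ElementType>
void HeapSort(ElementType x[], int n)
{
    // x[1..n] holds the data
    Heapify(x, n);
    for (int index = n; index > 1; index--)
    {
        // Move the largest remaining element to the end of the sublist
        ElementType temp = x[1];
        x[1] = x[index];
        x[index] = temp;
        // Restore the heap in positions 1 through index - 1
        PercolateDown(x, index - 1, 1);
    }
}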
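
For experimenting with heapsort as a single unit, the following self-contained sketch may help. It assumes a std::vector whose position 0 is unused padding, so that the data occupy x[1..n] as on the slides; the lowercase names percolateDown and heapSort are hypothetical and the heapify loop is folded into heapSort.

```cpp
#include <utility>
#include <vector>

// Restores heap order in x[1..n] below node r, assuming both subtrees
// of r are already heaps.
template <typename T>
void percolateDown(std::vector<T>& x, int n, int r)
{
    int c = 2 * r;                        // left child of r
    while (c <= n) {
        if (c < n && x[c] < x[c + 1])     // pick the larger child
            ++c;
        if (!(x[r] < x[c]))               // heap order already holds
            break;
        std::swap(x[r], x[c]);
        r = c;
        c = 2 * r;
    }
}

// Sorts x[1..n] into ascending order; x[0] is never read or written.
template <typename T>
void heapSort(std::vector<T>& x)
{
    const int n = static_cast<int>(x.size()) - 1;
    for (int r = n / 2; r >= 1; --r)      // heapify: last non-leaf up to root
        percolateDown(x, n, r);
    for (int i = n; i >= 2; --i) {
        std::swap(x[1], x[i]);            // largest of x[1..i] moves to position i
        percolateDown(x, i - 1, 1);       // re-heap the remaining i - 1 items
    }
}
```

Note that the second loop percolates from the root (position 1) over a heap that shrinks by one item per pass.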

Page 20: Sorting

Quicksort

A more efficient exchange sorting scheme than bubble sort: because a typical exchange involves elements that are far apart, fewer interchanges are required to correctly position an element.

Quicksort uses a divide-and-conquer strategy, a recursive approach to problem-solving in which
  the original problem is partitioned into simpler subproblems,
  each subproblem is considered independently, and
  subdivision continues until the subproblems obtained are simple enough to be solved directly.

Choose some element called a pivot.
Perform a sequence of exchanges so that
  all elements that are less than this pivot are to its left and
  all elements that are greater than the pivot are to its right.

This divides the (sub)list into two smaller sublists, each of which may then be sorted independently in the same way.

Page 21: Sorting

Quicksort

1. If the list has 0 or 1 elements,
     return.   // the list is sorted
   Else do:
2. Pick an element in the list to use as the pivot.
3. Split the remaining elements into two disjoint groups:
     SmallerThanPivot = {all elements <= pivot}
     LargerThanPivot  = {all elements > pivot}
4. Return the list rearranged as:
     Quicksort(SmallerThanPivot), pivot, Quicksort(LargerThanPivot).
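
The four steps translate almost literally into a (not in-place) C++ sketch. It assumes the first element is taken as the pivot and that elements equal to the pivot are sent to the left group so that duplicates are not lost; both choices, and the function name quicksort, are assumptions of this illustration.

```cpp
#include <cstddef>
#include <vector>

// Returns a sorted copy of list, following the steps above literally.
template <typename T>
std::vector<T> quicksort(const std::vector<T>& list)
{
    if (list.size() <= 1)                       // step 1: already sorted
        return list;
    const T pivot = list.front();               // step 2: pick a pivot
    std::vector<T> smallerThanPivot, largerThanPivot;
    for (std::size_t k = 1; k < list.size(); ++k) {   // step 3: split the rest
        if (pivot < list[k])
            largerThanPivot.push_back(list[k]);
        else
            smallerThanPivot.push_back(list[k]);      // ties go left
    }
    // Step 4: sorted left group, then pivot, then sorted right group
    std::vector<T> result = quicksort(smallerThanPivot);
    result.push_back(pivot);
    const std::vector<T> right = quicksort(largerThanPivot);
    result.insert(result.end(), right.begin(), right.end());
    return result;
}
```

This version allocates new vectors at every level, which keeps it close to the pseudocode; an in-place version rearranges the elements within the original array instead.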

Page 22: Sorting

Quicksort Example

Given 75, 70, 65, 84, 98, 78, 100, 93, 55, 61, 81, 68 to sort.

Select, arbitrarily, the first element, 75, as pivot.
Search from the right for an element <= 75 and stop at the first one found.
Search from the left for an element > 75 and stop at the first one found.
Swap these two elements, and then repeat until the two searches meet at the same element.

  75, 70, 65, 84, 98, 78, 100, 93, 55, 61, 81, 68   (right search stops at 68, left search at 84)
  75, 70, 65, 68, 98, 78, 100, 93, 55, 61, 81, 84   (after the swap)

  75, 70, 65, 68, 98, 78, 100, 93, 55, 61, 81, 84   (right search stops at 61, left search at 98)
  75, 70, 65, 68, 61, 78, 100, 93, 55, 98, 81, 84   (after the swap)

  75, 70, 65, 68, 61, 78, 100, 93, 55, 98, 81, 84   (right search stops at 55, left search at 78)
  75, 70, 65, 68, 61, 55, 100, 93, 78, 98, 81, 84   (after the swap)

  75, 70, 65, 68, 61, 55, 100, 93, 78, 98, 81, 84   (searches meet at 55: done, swap 55 with the pivot)

Page 23: Sorting

Quicksort Example

The previous Split operation placed pivot 75 so that all elements to its left are <= 75 and all elements to its right are > 75.

  55, 70, 65, 68, 61, 75, 100, 93, 78, 98, 81, 84     pivot 75

75 is now placed appropriately. The sublists on either side of 75 need to be sorted (independently):
  55, 70, 65, 68, 61
  100, 93, 78, 98, 81, 84

Quicksort performance:
  O(n log n) if the pivot results in sublists of approximately the same size.
  O(n^2) in the worst case (list already ordered, or elements in reverse order), when Split() repeatedly results, for example, in one empty sublist.

Page 24: Sorting

Quicksort

// Requires <utility> for std::swap
template <typename ElementType>
void Split(ElementType x[], int first, int last, int& pos)
{
    ElementType pivot = x[first];   // pivot element
    int left = first,               // index for left search
        right = last;               // index for right search
    while (left < right)
    {
        // Search from right for element <= pivot
        while (x[right] > pivot)
            right--;
        // Search from left for element > pivot
        while (left < right && x[left] <= pivot)
            left++;
        // Interchange elements if searches haven't met
        if (left < right)
            std::swap(x[left], x[right]);
    }
    // End of searches; place pivot in correct position
    pos = right;
    x[first] = x[pos];
    x[pos] = pivot;
}

Page 25: Sorting

Quicksort

template <typename ElementType>
void Quicksort(ElementType x[], int first, int last)
{
    int pos;                        // final position of pivot
    if (first < last)               // list has more than one element
    {
        // Split into two sublists
        Split(x, first, last, pos);
        // Sort left sublist
        Quicksort(x, first, pos - 1);
        // Sort right sublist
        Quicksort(x, pos + 1, last);
    }
    // else list has 0 or 1 element and requires no sorting
}

This function is called with a statement of the form
    Quicksort(x, 1, n);

Page 26: Sorting

Quicksort Improvement I

Quicksort is a recursive function:
  a stack of activation records must be maintained by the system to manage recursion.
  The deeper the recursion is, the larger this stack will become.

The depth of the recursion and the corresponding overhead can be reduced: sort the smaller sublist at each stage first.

Another improvement aimed at reducing the overhead of recursion is to use an iterative version of Quicksort().
To do so, use a stack to store the first and last positions of the sublists sorted "recursively".
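
One way the iterative idea might look in C++ is sketched below. It assumes a 0-based std::vector, a first-element pivot, and an explicit std::stack of (first, last) pairs; the names split and quicksortIterative are hypothetical. The smaller sublist is handled immediately and the larger one is deferred, which keeps the explicit stack shallow.

```cpp
#include <stack>
#include <utility>
#include <vector>

// Partitions x[first..last] around the pivot x[first] and returns the
// pivot's final position.
template <typename T>
int split(std::vector<T>& x, int first, int last)
{
    const T pivot = x[first];
    int left = first, right = last;
    while (left < right) {
        while (pivot < x[right])                        // find element <= pivot
            --right;
        while (left < right && !(pivot < x[left]))      // find element > pivot
            ++left;
        if (left < right)
            std::swap(x[left], x[right]);
    }
    x[first] = x[right];
    x[right] = pivot;
    return right;
}

// Sorts x into ascending order without recursive calls.
template <typename T>
void quicksortIterative(std::vector<T>& x)
{
    std::stack<std::pair<int, int> > pending;   // sublists still to be sorted
    pending.push({0, static_cast<int>(x.size()) - 1});
    while (!pending.empty()) {
        int first = pending.top().first;
        int last = pending.top().second;
        pending.pop();
        while (first < last) {
            const int pos = split(x, first, last);
            if (pos - first < last - pos) {     // left side is smaller
                pending.push({pos + 1, last});  // defer the larger right side
                last = pos - 1;
            } else {                            // right side is smaller or equal
                pending.push({first, pos - 1}); // defer the larger left side
                first = pos + 1;
            }
        }
    }
}
```

Because the deferred sublist is always at least as large as the one processed next, each stacked range is at most half the size of the range it was split from, so the stack holds on the order of log2 n entries.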

Page 27: Sorting

Quicksort Improvement II

An arbitrary pivot gives a poor partition for nearly sorted lists (or lists in reverse order):
  virtually all the elements go into either SmallerThanPivot or LargerThanPivot all through the recursive calls.
  Quicksort takes quadratic time to do essentially nothing at all.

One common method for selecting the pivot is the median-of-three rule: select the median of the first, middle, and last elements in each sublist as the pivot.

Often the list to be sorted is already partially ordered; the median-of-three rule will then select a pivot closer to the middle of the sublist than will the "first-element" rule.
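
The median-of-three rule can be sketched as a small helper that reports which of the three candidate positions holds the median. The function name medianOfThreeIndex and the 0-based std::vector are assumptions of this illustration.

```cpp
#include <vector>

// Returns the index (first, middle, or last) whose element is the median
// of x[first], x[middle], and x[last].
template <typename T>
int medianOfThreeIndex(const std::vector<T>& x, int first, int last)
{
    const int mid = first + (last - first) / 2;
    const T& a = x[first];
    const T& b = x[mid];
    const T& c = x[last];
    if (a < b) {
        if (b < c)
            return mid;                    // a < b < c
        return (a < c) ? last : first;     // a < c <= b, or c <= a < b
    }
    if (a < c)
        return first;                      // b <= a < c
    return (b < c) ? last : mid;           // b < c <= a, or c <= b <= a
}
```

A partition routine that expects the pivot in the first position could swap x[first] with the element at the returned index before splitting. For a list that is already sorted, the helper returns the middle position, so the two sublists come out nearly equal in size.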

Page 28: Sorting

Quicksort Improvement III

For small files (n <= 20), quicksort is worse than insertion sort; small files occur often because of recursion.

Use an efficient sort (e.g., insertion sort) for small files.

Better yet, use Quicksort() until sublists are of a small size and then apply an efficient sort like insertion sort.

Page 29: Sorting

Mergesort

Sorting schemes are
  internal: designed for data items stored in main memory
  external: designed for data items stored in secondary memory.

Previous sorting schemes were all internal sorting algorithms:
  they required direct access to list elements (not possible for sequential files)
  they made many passes through the list (not practical for files)

Mergesort can be used both as an internal and an external sort.
The basic operation in mergesort is merging, that is, combining two lists that have previously been sorted so that the resulting list is also sorted.

Page 30: Sorting

Mergesort

For example:
  File1  15 20 25 35 45 60 65 70
  File2  10 30 40 50 55

Pair by pair, compare the smallest unmerged element in File1, call it x, with the smallest unmerged element in File2, call it y.
  If x < y, copy x from File1 to the "merged" file, File3.
  Else copy y from File2 to the "merged" file, File3.

  File1  15 20 25 35 45 60 65 70     File3  10
  File2  10 30 40 50 55

  File1  15 20 25 35 45 60 65 70     File3  10 15
  File2  10 30 40 50 55

Page 31: Sorting

Mergesort

  File1  15 20 25 35 45 60 65 70     File3  10 15 20
  File2  10 30 40 50 55

  File1  15 20 25 35 45 60 65 70     File3  10 15 20 25
  File2  10 30 40 50 55

  File1  15 20 25 35 45 60 65 70     File3  10 15 20 25 30
  File2  10 30 40 50 55

  File1  15 20 25 35 45 60 65 70     File3  10 15 20 25 30 35
  File2  10 30 40 50 55

  File1  15 20 25 35 45 60 65 70     File3  10 15 20 25 30 35 40
  File2  10 30 40 50 55

  File1  15 20 25 35 45 60 65 70     File3  10 15 20 25 30 35 40 45
  File2  10 30 40 50 55

Etc.

Page 32: Sorting

Mergesort

1. Open File1 and File2 for input, File3 for output.

2. Read first element x from File1 and first element y from File2.

3. Repeat the following until end of either File1 or File2 is reached:
     If x < y
       a. Write x to File3.
       b. Read a new x value from File1.
     Else
       a. Write y to File3.
       b. Read a new y value from File2.

4. If end of File1 was encountered,
     copy any remaining elements from File2 into File3.
   Else   // end of File2 was encountered
     copy the rest of File1 into File3.