Algorithms and Complexity
Notes part 2: Searching & Sorting
Mark D Dunlop
Mark Dunlop, Computer and Information Sciences, Strathclyde University http://www.cis.strath.ac.uk/~mdd/
Searching
Split the room
• Sort yourself by birthday (e.g. 1/1 .... 31/12)
• Sort yourself by first name (e.g. Aaron...Zakia)
Searching - examples of accesses
• How many people in your half are called David
• How many people in your half have the same birthday
• Does anyone share my birthday
• Whose birthday is next in your half
• How many people have a unique first name in your half
Search Complexity
• Linear search: proportional to n
• Binary search: proportional to log2(n)
– How many times you need to halve it
Average Performance of Shellsort
• insertion sort – very approx 0.000122n^2 + 0.2n + ....
• Shell's shellsort – very approx 0.0000002n^2 + 0.02n + ...
• Both still have a worst case of O(n^2), but the revisions remove this worst case for shell sort
Merge Sort
• Shell sort is the best we can do with a swap ‘em around strategy
• Merge sort is a divide and conquer strategy (c.f. binary search)
if number of items to sort <= 1 return
else
    sort left half; sort right half
    merge halves
Sorting is the easy bit
void sort(int[] A, int left, int right)
    if (left < right)   // ranges of 0 or 1 items are already sorted
        int centre = (left + right) / 2
        sort(A, left, centre)
        sort(A, centre+1, right)
        merge(A, left, centre, right)
void merge(int[] A, int left1, int centre, int right2)
    int[] B = new int[right2 - left1 + 1]
    int p1 = left1; int p2 = centre+1; int pB = 0
    while (p1 <= centre && p2 <= right2)
        if (A[p1] <= A[p2]) { B[pB] = A[p1]; pB++; p1++ }
        else { B[pB] = A[p2]; pB++; p2++ }
    // one list is empty now
    while (p1 <= centre) { B[pB] = A[p1]; pB++; p1++ }
    while (p2 <= right2) { B[pB] = A[p2]; pB++; p2++ }
    // copy back from list B
    for (int i = 0; i <= right2 - left1; i++) A[i+left1] = B[i]
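The sort and merge pseudocode can be tried out with a short runnable sketch. This is one possible Python rendering, not the lecturer's own code; the names merge_sort and merge and the default arguments are assumptions made for the example.

```python
def merge(a, left, centre, right):
    """Merge the sorted runs a[left..centre] and a[centre+1..right] in place."""
    b = []                      # temporary list, plays the role of B
    p1, p2 = left, centre + 1
    while p1 <= centre and p2 <= right:
        if a[p1] <= a[p2]:      # <= keeps equal keys in their original order
            b.append(a[p1])
            p1 += 1
        else:
            b.append(a[p2])
            p2 += 1
    # one run is empty now; copy whatever is left of the other
    b.extend(a[p1:centre + 1])
    b.extend(a[p2:right + 1])
    a[left:right + 1] = b       # copy back from the temporary list


def merge_sort(a, left=0, right=None):
    """Sort a[left..right] in place using top-down merge sort."""
    if right is None:
        right = len(a) - 1
    if left >= right:           # 0 or 1 items: nothing to do
        return
    centre = (left + right) // 2
    merge_sort(a, left, centre)
    merge_sort(a, centre + 1, right)
    merge(a, left, centre, right)


data = [5, 2, 9, 1, 5, 6]
merge_sort(data)
print(data)
```

Using a temporary list per merge keeps the sketch short; a tuned version would more likely allocate one scratch buffer up front.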
Complexity Analysis
• Merge is clearly O(n)
• But how many times is it called?
The call tree
• at level 3: 8 merges of size n/8
• at level i: 2^i merges of size n/2^i
• each level is O(n)
Call tree, level by level:
1 merge of size n (level 0)
2 merges of size n/2
4 merges of size n/4
8 merges of size n/8 (level 3)
So how many levels?
• log2(n) levels each of O(n)
• so merge sort is O(n log n)
splits    number covered
1         2  = 2^1
2         4  = 2^2
3         8  = 2^3
4         16 = 2^4
5         32 = 2^5
i         n  = 2^i
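The level-counting argument can be checked numerically. The sketch below is only an illustration: the helper names levels and merge_work are made up for this example, and it counts one unit of work per item merged, assuming the same halving rule as the sort pseudocode.

```python
def levels(n):
    """Number of times n items must be halved before every piece has 1 item."""
    if n <= 1:
        return 0
    return 1 + levels((n + 1) // 2)   # follow the larger half


def merge_work(n):
    """Total items handled by all merges when merge sorting n items."""
    if n <= 1:
        return 0
    half = (n + 1) // 2               # size of the left half
    return merge_work(half) + merge_work(n - half) + n


for n in (8, 1024):
    print(n, levels(n), merge_work(n))
```

For n a power of two, merge_work(n) comes out as exactly n times levels(n), which is the n log2(n) figure from the slide.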
Common classes of Big-Oh
[Chart: time (0 to 1000) plotted against number (0 to 120) for log n, n, n log n, n*n, 2*n*n and n!]
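To get a feel for how quickly these classes pull apart, a few values can be tabulated. This is a rough sketch; the function name growth_row and the choice of base-2 logarithms are assumptions for the example.

```python
import math


def growth_row(n):
    """Return the value of each common growth function at n (n >= 1)."""
    return {
        "log n": math.log2(n),
        "n": n,
        "n log n": n * math.log2(n),
        "n*n": n * n,
        "2*n*n": 2 * n * n,
        "n!": math.factorial(n),
    }


for n in (2, 4, 8, 16):
    print(n, growth_row(n))
```

Note that 2*n*n is only ever twice n*n, which is why constant factors are dropped in Big-Oh, while n! leaves every other curve behind almost immediately.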
Quick Sort
• Can we do mergesort in place?
• Yes, if we can get rid of the merges
• We can, if the data is split so that all data left of the split is below all data right of the split
• So, quicksort shuffles data before recursing
Overview of Quicksort
• Worst case is O(n^2) but this can be avoided
• Very tightly written innermost loops make it very fast
• Very tricky to implement correctly - slightly off and it's much slower
Quick Sort
if number of elements <= 1 return
else
    pick a pivot value v in A
    partition A into L and R such that
        ∀ i∈L: i≤v & ∀ j∈R: j≥v
    sort(L)
    sort(R)
Picking the pivot
• A[left] – wrong: if the data is already sorted this leads to O(n^2)
• Ideal is to split data in the middle ( |L|=|R| ) to give O(n log n), equalling merge sort
• Picking the median is perfect, but finding it the obvious way involves sorting!
Picking the pivot
• Safe choice is the value at (low+high)/2
– will do very well if already sorted
– but not guaranteed to split data equally
• Could pick one randomly - good on average
• Pragmatic choice: pick the median of
– A[low]
– A[(low+high)/2]
– A[high]
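The effect of the pivot choice on already sorted data can be measured with a small experiment. The sketch below deliberately uses a simple list-building partition rather than an in-place one, purely to count recursion depth; it assumes unique keys, and all function names are invented for the example.

```python
def first(items):
    return items[0]


def middle(items):
    return items[len(items) // 2]


def median_of_three(items):
    # median of the first, middle and last values
    return sorted([items[0], items[len(items) // 2], items[-1]])[1]


def depth(items, choose):
    """Recursion depth of a quicksort-style split using the given pivot rule."""
    if len(items) <= 1:
        return 0
    v = choose(items)
    left = [x for x in items if x < v]     # assumes unique keys
    right = [x for x in items if x > v]
    return 1 + max(depth(left, choose), depth(right, choose))


already_sorted = list(range(100))
for rule in (first, middle, median_of_three):
    print(rule.__name__, depth(already_sorted, rule))
```

On sorted input the first-element rule peels off one item per level, so the depth grows with n, while the other two rules halve the data each time.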
Very pseudo code
// assume all elements are unique for now
Pick the pivot
Get the pivot out of the way by putting it at the end
Search from left to right looking for a large element and
    also search from right to left looking for a small element
Swap the large and the small
Repeat the search and swap until the left and right searches cross over
Swap pivot with last large item
Sort left and right parts (pivot now in right place)
void sort(a, low, high)
    // assumes at least 3 elements (high - low >= 2); smaller ranges need a separate base case
    // Pick the pivot
    int mid = (low + high) / 2
    if (a[mid] < a[low]) swap(a, low, mid)
    if (a[high] < a[low]) swap(a, low, high)
    if (a[high] < a[mid]) swap(a, mid, high)
    // low, mid, high are now sorted - use mid as pivot
    // note a[low] is small and a[high] is large so leave them
    // Get the pivot out of the way by putting it at the end
    swap(a, mid, high-1)
    // begin partitioning
    pivot = a[high-1]
    int i = low+1; int j = high-2
    loop
        // Search from left to right looking for a large element
        while (a[i] < pivot) i++
        // Search from right to left looking for a small element
        while (a[j] > pivot) j--
        if (i < j) { swap(a, i, j); i++; j-- }   // Swap the large and the small
        else break                               // i.e. stop once swapped over
    // swap pivot with last large item we found above
    swap(a, i, high-1)
    // sort left and right parts - NB a[i] is in right place
    sort(a, low, i-1)
    sort(a, i+1, high)
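A runnable sketch of this median-of-three quicksort is given below. It is one possible reading of the slide, not a reference implementation. Two things are assumptions of the sketch: ranges shorter than CUTOFF are handed to an insertion sort (the median-of-three scheme needs at least three elements), and j starts at high-2 so that the parked pivot at high-1 is never swapped away.

```python
CUTOFF = 10   # assumed threshold; must be at least 3 for the partition to be safe


def insertion_sort(a, low, high):
    """Sort a[low..high] in place by insertion."""
    for k in range(low + 1, high + 1):
        x = a[k]
        m = k - 1
        while m >= low and a[m] > x:
            a[m + 1] = a[m]
            m -= 1
        a[m + 1] = x


def quicksort(a, low=0, high=None):
    """Sort a[low..high] in place with median-of-three quicksort."""
    if high is None:
        high = len(a) - 1
    if high - low + 1 < CUTOFF:          # small range: base case
        insertion_sort(a, low, high)
        return
    # pick the pivot: order a[low], a[mid], a[high]
    mid = (low + high) // 2
    if a[mid] < a[low]:
        a[low], a[mid] = a[mid], a[low]
    if a[high] < a[low]:
        a[low], a[high] = a[high], a[low]
    if a[high] < a[mid]:
        a[mid], a[high] = a[high], a[mid]
    # get the pivot out of the way by parking it at high-1
    a[mid], a[high - 1] = a[high - 1], a[mid]
    pivot = a[high - 1]
    i, j = low + 1, high - 2
    while True:
        while a[i] < pivot:              # stops at high-1 at the latest
            i += 1
        while a[j] > pivot:              # stops at low at the latest
            j -= 1
        if i < j:
            a[i], a[j] = a[j], a[i]      # swap the large and the small
            i += 1
            j -= 1
        else:
            break                        # the two searches have crossed
    a[i], a[high - 1] = a[high - 1], a[i]   # pivot into its final place
    quicksort(a, low, i - 1)
    quicksort(a, i + 1, high)


data = [31, 4, 15, 92, 65, 35, 89, 79, 32, 38, 46, 26, 43]
quicksort(data)
print(data)
```

Stepping i and j after each swap means the sketch also terminates when keys are repeated, which the slide sets aside by assuming unique elements.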
Average case complexity
• On average quicksort is O(n log n), and the O(n) partitioning pass at each level is very fast
• Insertion sort is faster for small lists - so good quicksort implementations actually call insertion sort when the list size falls below a cutoff of around 5 to 20
Bucket Sort
• Now for something completely different
• How about sorting based on the contents of the key rather than just the order of the keys?
• Similar to how we sort forms by name
– put the a's in a pile, a*, the b's into b*...
Pseudo Code
void bucketSort(queue q)
    // N = size of alphabet
    queue[N] bucket
    while q.isnotempty()
        x = q.removefromfront()
        bucket[x.key].add(x)
    for i = 0 to N-1
        while bucket[i].isnotempty()
            q.insertatend(bucket[i].removefromfront())
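The bucket sort pseudocode maps quite directly onto Python's collections.deque. The sketch below is an illustration under assumptions: the key is supplied as a function returning an integer in the range 0 to n-1, and the names bucket_sort and first_letter are made up for the example.

```python
from collections import deque


def bucket_sort(q, n, key):
    """Sort the deque q in place; key(x) must return an int in range(n)."""
    buckets = [deque() for _ in range(n)]   # one queue per possible key
    while q:
        x = q.popleft()                     # remove from front
        buckets[key(x)].append(x)
    for b in buckets:                       # buckets are visited in key order
        while b:
            q.append(b.popleft())           # insert at end


def first_letter(word):
    """Map a lower-case word to 0..25 by its first letter."""
    return ord(word[0]) - ord("a")


names = deque(["bob", "al", "cy", "ann"])
bucket_sort(names, 26, first_letter)
print(list(names))
```

Because each bucket is a first-in first-out queue, items with the same key come out in the order they went in.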
Radix Sort
• Not the full story, how do we sort...
– put the a's in a pile, a*, the b's into b*...
– split a* into aa* ab* ac*....
– split the aa* into aaa*, aab*....
• That’s radix sorting
• But first a new concept...
Stable Sort
• A stable sort is a sort that preserves the order of elements that have equal keys
• Not every sort honours this, but it's a nice feature, and one that radix sort relies on
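Stability is easy to see with a small example. Python's built-in sorted is documented as stable, so the sketch below uses it; the records themselves are invented for illustration.

```python
# (name, group) records, listed in alphabetical order of name
people = [("Ann", 3), ("Bob", 1), ("Cat", 3), ("Dan", 1)]

# sort on the group number only
by_group = sorted(people, key=lambda p: p[1])
print(by_group)
```

Within each group the names are still in their original relative order (Bob before Dan, Ann before Cat), which is exactly what a stable sort promises.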
Working backwards
• If we can do a stable sort of key_i, then we should sort key_(i+1) first, then stably sort key_i
• e.g. GH KI FE BH IU KS DE HJ KT HU
• sort second letter gives
– FE DE GH BH KI HJ KS KT IU HU
• sort first letter stably gives
– BH DE FE GH HJ HU IU KI KS KT
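The working-backwards idea can be sketched as a series of stable bucket passes, last letter first. This is a minimal illustration that assumes fixed-length keys made of upper-case letters A to Z; the names stable_pass and radix_sort are hypothetical, and the data is the set of ten keys that appears in the slide's results.

```python
def stable_pass(words, pos):
    """One stable bucket pass on the letter at index pos."""
    buckets = [[] for _ in range(26)]       # one bucket per letter A..Z
    for w in words:
        buckets[ord(w[pos]) - ord("A")].append(w)
    return [w for b in buckets for w in b]  # concatenate in letter order


def radix_sort(words, length):
    """Sort fixed-length upper-case keys, least significant letter first."""
    for pos in range(length - 1, -1, -1):
        words = stable_pass(words, pos)
    return words


keys = "GH KI FE BH IU KS DE HJ KT HU".split()
print(stable_pass(keys, 1))   # after sorting on the second letter
print(radix_sort(keys, 2))    # after then sorting on the first letter
```

Each pass touches every item once and every bucket once, so L passes over N items with A buckets cost on the order of L(N+A).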
Radix Sort Complexity
• Given an alphabet of A letters, key length of L letters and N items
• Radix sort works in O(L(N+A))
• Given that L & A are fixed for a given dataset, this gives an O(N) sorter!
• Pseudo code left as exercise 8-)