1 More Sorting; Searching Dan Barrish-Flood

Dec 15, 2015

Jake Isham

More Sorting; Searching

Dan Barrish-Flood

Bucket Sort

• Put keys into n buckets, then sort each bucket, then concatenate.

• If keys are uniformly distributed (quite an assumption), then each bucket will be small, so each bucket can be sorted very quickly (typically with insertion sort).

• The analysis is complex because it involves probability and random distributions (see CLRS pp. 175-176!!).

• Under the uniform-distribution assumption, bucket sort runs in Θ(n) expected time, i.e. linear!

Bucket Sort, cont’d

Assume the keys are in the range [0,1).

Divide the range [0,1) into n equal-sized buckets, so that

the 1st bucket is [0, 1/n)

the 2nd bucket is [1/n, 2/n)

....

the nth bucket is [(n-1)/n, 1)

If the buckets are B[0], B[1], ..., B[n-1], then the appropriate bucket for the key A[i] is B[FLOOR(n·A[i])].

Bucket Sort Code
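
The code from this slide did not survive in the transcript. As a stand-in, here is a minimal Python sketch of the procedure described on the previous slides, assuming keys in [0, 1) and insertion sort within each bucket. The names `bucket_sort` and `insertion_sort` are chosen for this sketch and are not taken from the slide.

```python
import math


def insertion_sort(items):
    """Sort a list in place with insertion sort and return it."""
    for j in range(1, len(items)):
        key = items[j]
        i = j - 1
        # Shift larger elements one slot right to make room for key.
        while i >= 0 and items[i] > key:
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = key
    return items


def bucket_sort(keys):
    """Return a sorted copy of keys, assuming every key lies in [0, 1)."""
    n = len(keys)
    buckets = [[] for _ in range(n)]
    for key in keys:
        if not 0 <= key < 1:
            raise ValueError("keys must lie in [0, 1)")
        # Bucket index is FLOOR(n * key); min() guards against
        # floating-point rounding pushing the index up to n.
        buckets[min(n - 1, math.floor(n * key))].append(key)
    result = []
    for bucket in buckets:
        result.extend(insertion_sort(bucket))
    return result
```

Each bucket holds a contiguous slice of the key range, so concatenating the sorted buckets in index order yields the sorted output.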

Bucket Sort in Action
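
The figure from this slide is not in the transcript. The short Python trace below is a rough substitute: it distributes ten illustrative keys (picked for this sketch, not necessarily the ones on the slide) into n = 10 buckets and prints each bucket before the per-bucket sort. The helper name `distribute` is an assumption.

```python
import math


def distribute(keys):
    """Place each key from [0, 1) into bucket FLOOR(n * key), with n = len(keys)."""
    n = len(keys)
    buckets = [[] for _ in range(n)]
    for key in keys:
        buckets[min(n - 1, math.floor(n * key))].append(key)
    return buckets


keys = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
buckets = distribute(keys)
for index, bucket in enumerate(buckets):
    print(index, bucket)  # keys appear in arrival order within a bucket

# Sort each bucket, then concatenate the buckets in index order.
sorted_keys = [key for bucket in buckets for key in sorted(bucket)]
print(sorted_keys)
```

With these keys, buckets 1, 2 and 7 each receive more than one key, while several buckets stay empty; no bucket grows large, which is what keeps the per-bucket sorting cheap.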

When Linear Sorts Can Be Used

                           Integer   Real   String
Counting Sort              Yes       No     No
Radix Sort                 Yes       No     Yes*
Binsort (= Bucket Sort)    Yes*      Yes    Yes*

*with modifications

The Selection Problem

Input: a set A of n (distinct) numbers, and a number i where 1 ≤ i ≤ n.

Output: the element x ∈ A that is larger than exactly i-1 other elements of A (in practice, we allow ties).

The ith smallest value is called the ith order statistic.

The 1st order statistic is the minimum.

The nth order statistic is the maximum.

The floor((n+1)/2)th order statistic is the median (OK, when n is even there are two medians, but we usually go with the lower one).

Q. Solve the selection problem in Θ(?) time...
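
One straightforward baseline, offered here as a sketch rather than as the slide's intended answer, is to sort and then index, which costs Θ(n lg n) with a comparison sort. The function name `select_by_sorting` is made up for illustration.

```python
def select_by_sorting(values, i):
    """Return the i-th order statistic (1-based) by sorting a copy of values."""
    if not 1 <= i <= len(values):
        raise ValueError("i must satisfy 1 <= i <= n")
    return sorted(values)[i - 1]


numbers = [7, 2, 9, 4]
print(select_by_sorting(numbers, 1))                       # minimum
print(select_by_sorting(numbers, len(numbers)))            # maximum
print(select_by_sorting(numbers, (len(numbers) + 1) // 2)) # lower median
```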

Min and Max

Can find the min (or max) in n-1 comparisons in the way you learned in your first programming class:

MINIMUM(A)
    min ← A[1]
    for i ← 2 to length[A]
        do if min > A[i]
            then min ← A[i]
    return min

The n-1 is also a lower bound....
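
For readers who want to run it, here is a rough Python rendering of MINIMUM that also counts comparisons. Python lists are 0-indexed, unlike the 1-indexed pseudocode, and the comparison counter is an addition for illustration only.

```python
def minimum(a):
    """Return (smallest element, number of comparisons) for a non-empty list."""
    smallest = a[0]
    comparisons = 0
    for value in a[1:]:
        comparisons += 1  # one comparison per remaining element
        if smallest > value:
            smallest = value
    return smallest, comparisons
```

For a list of n elements the counter always ends at n-1, matching the count stated above.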

Min and Max (Cont’d)

...thus we can find both min and max in 2n - 2 comparisons. But we can do a bit better (though not asymptotically better). We can find both min and max in at most 3·floor(n/2) comparisons:

• keep track of both min and max

• for each pair of elements:

first compare them to each other;
compare the larger to max;
compare the smaller to min.

This matches the lower bound of ceiling(3n/2) - 2 comparisons.
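
The pairwise scheme can be sketched in Python as follows. The initialization (seed with the first element when n is odd, or with the first pair when n is even) is one common way to set it up and is an assumption here, as are the function name and the comparison counter.

```python
def min_max(a):
    """Return (min, max, comparisons) for a non-empty list, processing pairs."""
    n = len(a)
    if n == 0:
        raise ValueError("a must be non-empty")
    comparisons = 0
    if n % 2 == 1:
        # Odd n: the first element seeds both min and max at no cost.
        lo = hi = a[0]
        start = 1
    else:
        # Even n: one comparison orders the first pair.
        comparisons += 1
        lo, hi = (a[0], a[1]) if a[0] < a[1] else (a[1], a[0])
        start = 2
    for j in range(start, n, 2):
        # Three comparisons per remaining pair.
        comparisons += 1
        small, large = (a[j], a[j + 1]) if a[j] < a[j + 1] else (a[j + 1], a[j])
        comparisons += 1
        if small < lo:
            lo = small
        comparisons += 1
        if large > hi:
            hi = large
    return lo, hi, comparisons
```

With this setup the count is 3(n-1)/2 for odd n and 3n/2 - 2 for even n, both within the 3·floor(n/2) bound.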

Selection in General

RANDOMIZED-PARTITION(A, p, r)
    i ← RANDOM(p, r)
    exchange A[r] with A[i]
    return PARTITION(A, p, r)

An average-case Θ(n) algorithm for finding the ith smallest element: like QuickSort, but only recurse on one part.

Average-case analysis along the same lines as quicksort. Worst-case running time?
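
A minimal Python sketch of the idea follows. The slide does not show PARTITION or the selection routine itself, so the last-element (Lomuto-style) `partition` and the name `randomized_select` are assumptions made for this example. Indices are 0-based, while the rank i is 1-based.

```python
import random


def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    pivot = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_partition(a, p, r):
    """Swap a random element of a[p..r] into position r, then partition."""
    i = random.randint(p, r)  # inclusive on both ends, like RANDOM(p, r)
    a[r], a[i] = a[i], a[r]
    return partition(a, p, r)


def randomized_select(a, p, r, i):
    """Return the i-th smallest (1-based) element of a[p..r]; rearranges a."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1  # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    if i < k:
        return randomized_select(a, p, q - 1, i)   # recurse on the low side only
    return randomized_select(a, q + 1, r, i - k)   # or on the high side only
```

A call such as `randomized_select(data, 0, len(data) - 1, 3)` returns the 3rd smallest element of `data` and leaves the list partially reordered.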

Solution in Linear Time (Worst-case)

A clever algorithm of largely theoretical interest.

SELECT(A, i) finds the ith smallest element of A in Θ(n) time, worst-case.

Idea: partition, but do some extra work to find a split that is guaranteed to be good.

the SELECT algorithm

More on SELECT

n = 5: o o o o o
find the median directly, or by sorting

n = 25:
1. make 5 groups of 5 each
2. find the median of each group (that’s 5 numbers)
3. find the median-of-medians, x.

note: x > about half the elements of about half the groups, and x < about half the elements of about half the groups. So x is in the middle half of the set (not necessarily the exact middle element!), so x is a great pivot for PARTITION. To put it more precisely...

SELECT high-level description

SELECT(A, i)

1. Divide A into floor(n/5) groups of 5 elements (and possibly one smaller group of left-overs, as in the figure in an earlier slide).

2. Find the median of each group (including the left-over group, if there is one).

3. Use SELECT recursively to find the median x of the ceiling(n/5) medians found in step 2.

4. PARTITION A around x. Let k be one more than the number of elements on the low side, so that x is the kth smallest element.

5. If i = k then return x; otherwise call SELECT recursively to find the ith smallest element on the low side if i < k, or the (i-k)th smallest element on the high side otherwise.
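
The five steps can be sketched in Python as below. This is an illustrative version under stated assumptions, not the slide's code: it builds new lists instead of partitioning in place, uses a three-way split so that ties are handled, and sorts directly once 5 or fewer elements remain. The name `select` mirrors the pseudocode.

```python
def select(a, i):
    """Return the i-th smallest (1-based) element of list a via median of medians."""
    if not 1 <= i <= len(a):
        raise ValueError("i must satisfy 1 <= i <= len(a)")
    if len(a) <= 5:
        return sorted(a)[i - 1]
    # Steps 1-2: split into groups of 5 (last group may be smaller) and
    # take the lower median of each group.
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Step 3: recursively find the lower median of the medians.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around x (three-way, so duplicates of x are grouped).
    low = [v for v in a if v < x]
    equal = [v for v in a if v == x]
    high = [v for v in a if v > x]
    # Step 5: with distinct keys, k = len(low) + 1 is the rank of x.
    if i <= len(low):
        return select(low, i)
    if i <= len(low) + len(equal):
        return x
    return select(high, i - len(low) - len(equal))
```

Copying into `low`, `equal` and `high` costs extra memory but keeps the sketch short; an in-place PARTITION would follow the pseudocode more literally.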

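Analysis of SELECT

How many elements are greater than x?

There are $\lceil n/5 \rceil$ groups.

At least 1/2 of them contribute 3 elements > x, except for the group containing x and the left-over group.

So

$$\text{number of elements} > x \;\ge\; 3\left(\left\lceil \frac{1}{2}\left\lceil \frac{n}{5} \right\rceil \right\rceil - 2\right) \;\ge\; \frac{3n}{10} - 6.$$

The same argument shows that at least $\frac{3n}{10} - 6$ elements are < x.

So the largest possible size of a partition is

$$n - \left(\frac{3n}{10} - 6\right) = \frac{7n}{10} + 6.$$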

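Recurrence for SELECT

$$T(n) \le \begin{cases} \Theta(1) & \text{if } n \le 80 \\ T(\lceil n/5 \rceil) + T\left(\frac{7n}{10} + 6\right) + O(n) & \text{if } n > 80 \end{cases}$$

Proof by substitution (i.e. induction)

Assume $T(n) \le cn$ for some c and all n ≤ 80.

$$\begin{aligned}
T(n) &\le c\left\lceil \frac{n}{5} \right\rceil + c\left(\frac{7n}{10} + 6\right) + O(n) \\
&\le \frac{cn}{5} + c + \frac{7cn}{10} + 6c + O(n) \\
&= \frac{9cn}{10} + 7c + O(n)
\end{aligned}$$

We need $\frac{9cn}{10} + 7c + O(n) \le cn$, or $c\left(\frac{n}{10} - 7\right) \ge O(n)$.

We can pick c large enough when n > 80, thus SELECT is Θ(n), worst-case.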