Mudasser Naseer 1 06/12/2 CS 332: Algorithms Lecture # 10 Medians and Order Statistics Structures for Dynamic Sets

CS 332: Algorithms Lecture # 10

Jan 23, 2016

Page 1: CS 332: Algorithms Lecture # 10

Mudasser Naseer 1 04/21/23

CS 332: Algorithms Lecture # 10

Medians and Order Statistics

Structures for Dynamic Sets


Review: Radix Sort

● Radix sort:
  ■ Assumption: input has d digits ranging from 0 to k
  ■ Basic idea:
    ○ Sort elements by digit starting with least significant
    ○ Use a stable sort (like counting sort) for each stage
  ■ Each pass over n numbers takes time O(n+k), so total time for d digits is O(dn+dk)
    ○ When d is constant and k=O(n), takes O(n) time
  ■ Fast! Stable! Simple!
  ■ Doesn’t sort in place
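The radix sort idea reviewed above can be sketched in Python roughly as follows. This is a minimal illustration, not code from the lecture: it assumes non-negative integer keys, uses base 10 by default (so k = 9), and the function names are made up for this example.

```python
def counting_sort_by_digit(nums, exp, base):
    """Stable counting sort of nums, keyed on the digit at place value exp."""
    counts = [0] * base
    for x in nums:
        counts[(x // exp) % base] += 1
    # Prefix sums: counts[d] becomes the number of keys whose digit is <= d.
    for d in range(1, base):
        counts[d] += counts[d - 1]
    out = [0] * len(nums)
    # Walk backwards so keys with equal digits keep their relative order.
    for x in reversed(nums):
        d = (x // exp) % base
        counts[d] -= 1
        out[counts[d]] = x
    return out


def radix_sort(nums, base=10):
    """Least-significant-digit radix sort; returns a new sorted list."""
    result = list(nums)
    if not result:
        return result
    largest = max(result)
    exp = 1
    while largest // exp > 0:  # one stable pass per digit
        result = counting_sort_by_digit(result, exp, base)
        exp *= base
    return result
```

Each pass allocates a fresh output list, which is why the sort is not in place.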


Review: Bucket Sort

● Bucket sort
  ■ Assumption: input is n reals from [0, 1)
  ■ Basic idea:
    ○ Create n linked lists (buckets) to divide interval [0,1) into subintervals of size 1/n
    ○ Add each input element to appropriate bucket and sort buckets with insertion sort
  ■ Uniform input distribution ⇒ O(1) expected bucket size
    ○ Therefore the expected total time is O(n)
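A small Python sketch of the bucket sort idea, under the same assumption that every input lies in [0, 1). Python lists stand in for the linked lists mentioned on the slide, and the function names are illustrative only.

```python
def insertion_sort(bucket):
    """Sort a small list in place."""
    for j in range(1, len(bucket)):
        key = bucket[j]
        i = j - 1
        while i >= 0 and bucket[i] > key:
            bucket[i + 1] = bucket[i]
            i -= 1
        bucket[i + 1] = key


def bucket_sort(values):
    """Sort floats drawn from [0, 1); returns a new list."""
    n = len(values)
    buckets = [[] for _ in range(n)]
    for v in values:
        # Bucket i covers [i/n, (i+1)/n); min() guards against float rounding.
        buckets[min(int(n * v), n - 1)].append(v)
    result = []
    for b in buckets:
        insertion_sort(b)
        result.extend(b)
    return result
```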


Review: Order Statistics

● The ith order statistic in a set of n elements is the ith smallest element

● The minimum is thus the 1st order statistic
● The maximum is (thus) the nth order statistic
● The median is the middle order statistic
  ■ If n is odd, the median is unique, at i = (n + 1)/2
  ■ If n is even, there are 2 medians
  ■ The lower median, at i = n/2, and
  ■ The upper median, at i = n/2 + 1.
  ■ We mean lower median when we use the phrase the median


Review: Order Statistics

● Could calculate order statistics by sorting
  ■ Time: O(n lg n) with comparison sort
  ■ We can do better


The Selection Problem

● The selection problem: find the ith smallest element of a set

● Selection problem can be solved in O(n lg n) time
  ■ Sort the numbers using an O(n lg n)-time algorithm
  ■ Then return the ith element in the sorted array.
● There are faster algorithms. Two algorithms:
  ■ A practical randomized algorithm with O(n) expected running time
  ■ A cool algorithm of theoretical interest only with O(n) worst-case running time


Minimum and maximum

MINIMUM(A, n)
    min ← A[1]
    for i ← 2 to n
        do if min > A[i]
            then min ← A[i]
    return min

● The maximum can be found in exactly the same way by replacing the > with < in the above algorithm.


Simultaneous minimum and maximum

● Finding them independently takes n − 1 comparisons for the minimum and n − 1 comparisons for the maximum, for a total of 2n − 2 comparisons. This results in Θ(n) time.
● In fact, at most 3⌊n/2⌋ comparisons are needed.
  ■ Maintain the min. and max. of elements seen so far
  ■ Process elements in pairs.
  ■ Compare the elements of a pair to each other.
  ■ Then compare the larger element to the maximum so far, and compare the smaller element to the minimum so far.


Simultaneous minimum and maximum

● This leads to only 3 comparisons for every 2 elements.
● Initial values for the min and max depend on whether n is odd or even
  ■ If n is even, compare the first two elements and assign the larger to max and the smaller to min. Then process the rest of the elements in pairs.
  ■ If n is odd, set both min and max to the first element. Then process the rest of the elements in pairs.


Total # of Comparisons

● If n is even, we do 1 initial comparison and then 3(n − 2)/2 more comparisons.

# of comparisons = 3(n − 2)/2 + 1
                 = (3n − 6)/2 + 1
                 = 3n/2 − 3 + 1
                 = 3n/2 − 2

● If n is odd, we do 3(n − 1)/2 = 3⌊n/2⌋ comparisons.
● In either case, the maximum number of comparisons is ≤ 3⌊n/2⌋.
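The pairwise scheme and its comparison count can be checked with a short Python sketch. The function name and the explicit comparison counter are additions for illustration; only comparisons between elements are counted.

```python
def min_max(a):
    """Return (minimum, maximum, comparisons) for a non-empty sequence."""
    n = len(a)
    if n == 0:
        raise ValueError("min_max() needs at least one element")
    comparisons = 0
    if n % 2 == 0:
        # Even n: one comparison seeds both min and max.
        comparisons += 1
        lo, hi = (a[0], a[1]) if a[0] < a[1] else (a[1], a[0])
        start = 2
    else:
        # Odd n: the first element seeds both, at no cost.
        lo = hi = a[0]
        start = 1
    for i in range(start, n, 2):
        comparisons += 3  # one within the pair, one against lo, one against hi
        if a[i] < a[i + 1]:
            small, large = a[i], a[i + 1]
        else:
            small, large = a[i + 1], a[i]
        if small < lo:
            lo = small
        if large > hi:
            hi = large
    return lo, hi, comparisons
```

For the 7-element input `[3, 1, 8, 2, 6, 7, 5]` this uses 3(7 − 1)/2 = 9 comparisons, and for a 4-element input it uses 3·4/2 − 2 = 4.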


Review: Randomized Selection

● Key idea: use partition() from quicksort
  ■ But, only need to examine one subarray
  ■ These savings show up in the running time: O(n) expected

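[Figure: array A[p..r] after partitioning around position q, with elements ≤ A[q] to the left of q and elements ≥ A[q] to the right]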


Review: Randomized Selection

RandomizedSelect(A, p, r, i)
    if (p == r) then return A[p];
    q = RandomizedPartition(A, p, r)
    k = q - p + 1;
    if (i == k) then return A[q]; // not in book
    if (i < k) then
        return RandomizedSelect(A, p, q-1, i);
    else
        return RandomizedSelect(A, q+1, r, i-k);

[Figure: the same partitioned array A[p..r], with k = q − p + 1 marking the number of elements in A[p..q]]
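A runnable Python version of RandomizedSelect might look like the sketch below. It assumes 0-based indices p and r (the rank i stays 1-based), and it fills in RandomizedPartition with a Lomuto-style partition, which is one common choice and not necessarily the one used in the course.

```python
import random


def randomized_partition(a, p, r):
    """Partition a[p..r] around a random pivot; return the pivot's final index."""
    s = random.randint(p, r)
    a[s], a[r] = a[r], a[s]  # move the chosen pivot to the end
    pivot = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_select(a, p, r, i):
    """Return the i-th smallest (1-based) element of a[p..r]; reorders a."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1  # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    if i < k:
        return randomized_select(a, p, q - 1, i)
    return randomized_select(a, q + 1, r, i - k)
```

To get the lower median of a list `xs`, one would call `randomized_select(list(xs), 0, len(xs) - 1, (len(xs) + 1) // 2)`; the copy avoids reordering the caller's list.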


Review: Randomized Selection

● Average case
  ■ For an upper bound, assume the ith element always falls in the larger side of the partition
  ■ T(n) = O(n)
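The recurrence behind this bound did not survive in the transcript. The usual textbook argument runs roughly as follows, assuming every split is equally likely, that partitioning n elements costs at most an for some constant a, and that T(k) ≤ ck holds for all k < n (the substitution method); the constants a and c are introduced here for illustration.

```latex
\begin{align*}
T(n) &\le \frac{1}{n}\sum_{k=0}^{n-1} T\bigl(\max(k,\; n-k-1)\bigr) + an \\
     &\le \frac{2}{n}\sum_{k=\lfloor n/2 \rfloor}^{n-1} T(k) + an \\
     &\le \frac{2c}{n}\sum_{k=\lfloor n/2 \rfloor}^{n-1} k + an
        && \text{(inductive hypothesis } T(k) \le ck\text{)} \\
     &\le \frac{2c}{n}\cdot\frac{3n^{2}}{8} + an
        \;=\; \frac{3cn}{4} + an \\
     &\le cn \qquad \text{whenever } c \ge 4a .
\end{align*}
```

So T(n) = O(n) in expectation, even under the pessimistic assumption that the recursion always continues into the larger side.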


Dynamic Sets

● In structures for dynamic sets
  ■ Elements have a key and satellite data
  ■ Dynamic sets support queries such as:
    ○ Search(S, k), Minimum(S), Maximum(S), Successor(S, x), Predecessor(S, x)
  ■ They may also support modifying operations like:
    ○ Insert(S, x), Delete(S, x)


Binary Search Trees

● Binary Search Trees (BSTs) are an important data structure for dynamic sets

● In addition to satellite data, elements have:
  ■ key: an identifying field inducing a total ordering
  ■ left: pointer to a left child (may be NULL)
  ■ right: pointer to a right child (may be NULL)
  ■ p: pointer to a parent node (NULL for root)


Binary Search Trees

● BST property: key[left(x)] ≤ key[x] ≤ key[right(x)]

● Example:

        F
       / \
      B   H
     / \   \
    A   D   K
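One way to make the BST property concrete is a checker that walks the tree while carrying the range of keys allowed at each node. The sketch below is illustrative only: the `Node` class and `is_bst` function are hypothetical names, and the example tree (root F, children B and H, then A and D under B and K under H) mirrors the slide's figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_bst(x, lo=None, hi=None):
    """True if every key under x lies in [lo, hi] and each subtree is a BST.

    A bound of None means "no bound on that side".
    """
    if x is None:
        return True
    if lo is not None and x.key < lo:
        return False
    if hi is not None and x.key > hi:
        return False
    # Keys in the left subtree must be <= x.key, in the right subtree >= x.key.
    return is_bst(x.left, lo, x.key) and is_bst(x.right, x.key, hi)


example = Node("F", Node("B", Node("A"), Node("D")), Node("H", None, Node("K")))
```

Checking only each node against its immediate children would not be enough: a G placed as the right child of B passes that local test but violates the property with respect to F.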


Inorder Tree Walk

● What does the following code do?

TreeWalk(x)
    if (x != NULL)
        TreeWalk(left[x]);
        print(x);
        TreeWalk(right[x]);

● A: prints elements in sorted (increasing) order
● This is called an inorder tree walk
  ■ Preorder tree walk: print root, then left, then right
  ■ Postorder tree walk: print left, then right, then root
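The three walks differ only in where the node itself is visited. A minimal Python sketch, which collects keys into a list instead of printing and uses a hypothetical `Node` class:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder(x, out):
    """Left subtree, then the node, then right subtree."""
    if x is not None:
        inorder(x.left, out)
        out.append(x.key)
        inorder(x.right, out)
    return out


def preorder(x, out):
    """The node first, then left subtree, then right subtree."""
    if x is not None:
        out.append(x.key)
        preorder(x.left, out)
        preorder(x.right, out)
    return out


def postorder(x, out):
    """Left subtree, then right subtree, then the node."""
    if x is not None:
        postorder(x.left, out)
        postorder(x.right, out)
        out.append(x.key)
    return out


# Tree with root F, children B and H, then A and D under B and K under H.
tree = Node("F", Node("B", Node("A"), Node("D")), Node("H", None, Node("K")))
```

Each walk visits every node exactly once, so all three take Θ(n) time on a tree with n nodes.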


Inorder Tree Walk

● Example:

        F
       / \
      B   H
     / \   \
    A   D   K

● How long will a tree walk take?
● Draw Binary Search Tree for {2,3,5,5,7,8}


Operations on BSTs: Search

● Given a key and a pointer to a node, returns an element with that key or NULL:

TreeSearch(x, k)
    if (x = NULL or k = key[x])
        return x;
    if (k < key[x])
        return TreeSearch(left[x], k);
    else
        return TreeSearch(right[x], k);


BST Search: Example

● Search for D and C:

        F
       / \
      B   H
     / \   \
    A   D   K


Operations on BSTs: Search

● Here’s an iterative function that does the same:

TreeSearch(x, k)
    while (x != NULL and k != key[x])
        if (k < key[x])
            x = left[x];
        else
            x = right[x];
    return x;

● Which of these two functions is more efficient?
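Both search routines translate almost line for line into Python. The sketch below uses a hypothetical `Node` class and returns `None` where the pseudocode returns NULL. Both versions follow a single root-to-leaf path, so they do O(h) work; the iterative one avoids the function-call overhead and the O(h) call stack.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_search_recursive(x, k):
    """Return a node with key k in the subtree rooted at x, or None."""
    if x is None or k == x.key:
        return x
    if k < x.key:
        return tree_search_recursive(x.left, k)
    return tree_search_recursive(x.right, k)


def tree_search_iterative(x, k):
    """Same result as the recursive version, using a loop."""
    while x is not None and k != x.key:
        x = x.left if k < x.key else x.right
    return x


# Tree with root F, children B and H, then A and D under B and K under H.
tree = Node("F", Node("B", Node("A"), Node("D")), Node("H", None, Node("K")))
```

Searching this tree for D follows F, B, D and succeeds; searching for C follows the same path and then falls off the tree at D's empty left child.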


BST Operations: Minimum/Max

● How can we implement a Minimum() query?
● What is the running time?

TREE-MINIMUM(x)
    while left[x] ≠ NIL
        do x ← left[x]
    return x

TREE-MAXIMUM(x)
    while right[x] ≠ NIL
        do x ← right[x]
    return x
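In Python the two routines might be sketched as below, assuming a non-empty subtree and a hypothetical `Node` class. Each loop follows one downward path, so the running time is O(h) for a tree of height h.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_minimum(x):
    """Leftmost node of the (non-empty) subtree rooted at x."""
    while x.left is not None:
        x = x.left
    return x


def tree_maximum(x):
    """Rightmost node of the (non-empty) subtree rooted at x."""
    while x.right is not None:
        x = x.right
    return x


# Tree with root F, children B and H, then A and D under B and K under H.
tree = Node("F", Node("B", Node("A"), Node("D")), Node("H", None, Node("K")))
```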


BST Operations: Successor

● For deletion, we will need a Successor() operation


● What is the successor of node 3? Node 15? Node 13?

● Time complexity?
● What are the general rules for finding the successor of node x? (hint: two cases)


BST Operations: Successor

● Two cases:
  ■ x has a right subtree: successor is minimum node in right subtree
  ■ x has no right subtree: successor is the lowest ancestor of x whose left child is also an ancestor of x (counting x as an ancestor of itself)
    ○ Intuition: As long as you move to the left up the tree, you’re visiting smaller nodes.

● Predecessor: similar algorithm
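The two cases can be sketched in Python as follows. The `Node` class here is hypothetical; its constructor sets each child's parent pointer `p` so that the upward climb in the second case works.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.p = None  # parent pointer, None for the root
        for child in (left, right):
            if child is not None:
                child.p = self


def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x


def tree_successor(x):
    """Node with the next larger key after x, or None if x holds the maximum."""
    if x.right is not None:
        # Case 1: the smallest key in the right subtree.
        return tree_minimum(x.right)
    # Case 2: climb while we arrive from a right child; the first parent
    # reached from its left child is the successor.
    y = x.p
    while y is not None and x is y.right:
        x = y
        y = y.p
    return y


# Tree with root F, children B and H, then A and D under B and K under H.
tree = Node("F", Node("B", Node("A"), Node("D")), Node("H", None, Node("K")))
```

Either case follows a single path up or down the tree, so the running time is O(h). Swapping left and right (and minimum for maximum) gives the predecessor.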


Operations of BSTs: Insert

● Adds an element x to the tree so that the binary search tree property continues to hold

● The basic algorithm
  ■ Like the search procedure above
  ■ Insert x in place of NULL
  ■ Use a “trailing pointer” to keep track of where you came from (like inserting into singly linked list)


Insert

TREE-INSERT(T, z)
    y ← NIL
    x ← root[T]
    while x ≠ NIL
        do y ← x
           if key[z] < key[x]
               then x ← left[x]
               else x ← right[x]
    p[z] ← y
    if y = NIL
        then root[T] ← z    // Tree T was empty
        else if key[z] < key[y]
            then left[y] ← z
            else right[y] ← z
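A direct Python rendering of TREE-INSERT, with `y` playing the role of the trailing pointer. The `Node` and `Tree` classes are minimal stand-ins invented for this sketch.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.p = None


class Tree:
    def __init__(self):
        self.root = None


def tree_insert(T, z):
    """Insert node z into tree T, keeping the BST property."""
    y = None          # trailing pointer: parent of x
    x = T.root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        T.root = z    # tree was empty
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z


T = Tree()
for key in "FBHADKC":
    tree_insert(T, Node(key))
```

Inserting F, B, H, A, D, K in that order builds a tree with F at the root, B and H below it, then A, D and K; a following insert of C walks F, B, D and attaches C as the left child of D.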


BST Insert: Example

● Example: Insert C

        F
       / \
      B   H
     / \   \
    A   D   K
       /
      C


BST Search/Insert: Running Time

● What is the running time of TreeSearch() or TreeInsert()?

● A: O(h), where h = height of tree
● What is the height of a binary search tree?
● A: worst case: h = O(n) when tree is just a linear string of left or right children
  ■ We’ll keep all analysis in terms of h for now
  ■ Later we’ll see how to maintain h = O(lg n)


Sorting With Binary Search Trees

● Informal code for sorting array A of length n:

BSTSort(A)
    for i=1 to n
        TreeInsert(A[i]);
    InorderTreeWalk(root);

● Argue that this is Ω(n lg n)
● What will be the running time in the
  ■ Worst case?
  ■ Average case? (hint: remind you of anything?)
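BSTSort can be sketched in Python as below. This is an illustrative version, not the lecture's code: it skips parent pointers, and it uses an explicit stack for the inorder walk so that an already-sorted input (which produces a tree of height n) does not exhaust Python's recursion limit.

```python
class Node:
    __slots__ = ("key", "left", "right")

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_sort(a):
    """Return the elements of a in increasing order via an unbalanced BST."""
    root = None
    for key in a:
        node = Node(key)
        if root is None:
            root = node
            continue
        x = root
        while True:  # walk down to the empty slot where key belongs
            if key < x.key:
                if x.left is None:
                    x.left = node
                    break
                x = x.left
            else:
                if x.right is None:
                    x.right = node
                    break
                x = x.right
    # Inorder walk with an explicit stack.
    out, stack, x = [], [], root
    while stack or x is not None:
        while x is not None:
            stack.append(x)
            x = x.left
        x = stack.pop()
        out.append(x.key)
        x = x.right
    return out
```

On sorted input every insert walks the whole chain, giving the Θ(n²) worst case; on random input the expected height is O(lg n), giving O(n lg n) on average.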


Sorting With BSTs

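● Average case analysis
  ■ It’s a form of quicksort!

for i=1 to n
    TreeInsert(A[i]);
InorderTreeWalk(root);

[Figure: inserting 3 1 8 2 6 7 5 in order, shown as successive partitions alongside the resulting tree]

Input:                       3 1 8 2 6 7 5
After partitioning on 3:     1 2 | 8 6 7 5
After partitioning on 1, 8:  2 | 6 7 5
After partitioning on 6:     5 | 7

Resulting tree:

          3
        /   \
       1     8
        \   /
         2 6
          / \
         5   7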


Sorting with BSTs

● Same partitions are done as with quicksort, but in a different order
  ■ In previous example
    ○ Everything was compared to 3 once
    ○ Then those items < 3 were compared to 1 once
    ○ Etc.
  ■ Same comparisons as quicksort, different order!
    ○ Example: consider inserting 5


Sorting with BSTs

● Since run time is proportional to the number of comparisons, BSTSort takes the same average-case time as quicksort: O(n lg n)

● Which do you think is better, quicksort or BSTsort? Why?


● A: quicksort
  ■ Better constants
  ■ Sorts in place
  ■ Doesn’t need to build data structure


More BST Operations

● BSTs are good for more than sorting. For example, can implement a priority queue

● What operations must a priority queue have?
  ■ Insert
  ■ Minimum
  ■ Extract-Min


BST Operations: Delete

● Deletion is a bit tricky
● 3 cases:
  ■ x has no children:
    ○ Remove x
  ■ x has one child:
    ○ Splice out x
  ■ x has two children:
    ○ Swap x with successor
    ○ Delete it using the no-children or one-child case

        F
       / \
      B   H
     / \   \
    A   D   K
       /
      C

Example: delete K or H or B


BST Operations: Delete

TREE-DELETE(T, z)
    if left[z] = NIL or right[z] = NIL
        then y ← z
        else y ← TREE-SUCCESSOR(z)
    if left[y] ≠ NIL
        then x ← left[y]
        else x ← right[y]
    if x ≠ NIL
        then p[x] ← p[y]
    if p[y] = NIL
        then root[T] ← x
        else if y = left[p[y]]
            then left[p[y]] ← x
            else right[p[y]] ← x
    if y ≠ z
        then key[z] ← key[y]
             copy y’s satellite data into z
    return y
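The deletion procedure can be tried out with the Python sketch below. The `Node` and `Tree` classes and the `keys_inorder` helper are made up for the example, nodes carry only a key (no satellite data), and because z has a right subtree whenever the successor is needed, the sketch calls `tree_minimum(z.right)` in place of TREE-SUCCESSOR(z).

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.p = None


class Tree:
    def __init__(self, keys=()):
        self.root = None
        for k in keys:
            tree_insert(self, Node(k))


def tree_insert(T, z):
    y, x = None, T.root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z


def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x


def tree_delete(T, z):
    """Remove z's key from T; return the node that was actually spliced out."""
    # y is the node to splice out: z itself, or z's successor if z has 2 children.
    if z.left is None or z.right is None:
        y = z
    else:
        y = tree_minimum(z.right)
    # x is y's only child (or None); y has at most one child by construction.
    x = y.left if y.left is not None else y.right
    if x is not None:
        x.p = y.p
    if y.p is None:
        T.root = x
    elif y is y.p.left:
        y.p.left = x
    else:
        y.p.right = x
    if y is not z:
        z.key = y.key  # move the successor's key into z's node
    return y


def keys_inorder(x):
    if x is None:
        return []
    return keys_inorder(x.left) + [x.key] + keys_inorder(x.right)
```

With the tree built from F, B, H, A, D, K, C, deleting K removes a leaf, deleting H splices it out in favor of K, and deleting B copies C (its successor) into B's node and splices out the old C node.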


BST Operations: Delete

● Why will the two-children case always reduce to the no-children or one-child case?
● A: because when x has 2 children, its successor is the minimum in its right subtree, so the successor has no left child
● Could we swap x with predecessor instead of successor?
● A: yes. Would it be a good idea?
● A: might be good to alternate


The End