CSE 326: Data Structures, Lecture #8
Binary Search Trees
Alon Halevy
Spring Quarter 2001
Binary Trees
• Many algorithms are efficient and easy to program for the special case of binary trees
• A binary tree is:
– a root
– a left subtree (possibly empty)
– a right subtree (possibly empty)
[Diagram: an example binary tree with nodes A through J]
Binary Search Tree Dictionary Data Structure
[Diagram: a binary search tree over keys 2–14 with root 8]
• Search tree property:
– all keys in left subtree smaller than root’s key
– all keys in right subtree larger than root’s key
• Result:
– easy to find any given key
– inserts/deletes by changing links
Example and Counter-Example
[Diagrams: one tree that satisfies the search tree property (BINARY SEARCH TREE) and one that violates it (NOT A BINARY SEARCH TREE)]
In Order Listing
visit left subtree
visit node
visit right subtree
[Diagram: the example BST — root 10; left child 5 with children 2 and 9 (9 has left child 7); right child 15 with right child 20 (20 has children 17 and 30)]
In order listing: 2 5 7 9 10 15 17 20 30
Finding a Node
Node *& find(Comparable x, Node *& root) {
  if (root == NULL)
    return root;
  else if (x < root->key)
    return find(x, root->left);
  else if (x > root->key)
    return find(x, root->right);
  else
    return root;
}
[Diagram: the example BST (root 10)]
runtime: O(depth of the tree)
Insert
Concept: proceed down the tree as in Find; if the new key is not found, insert a new node at the last spot traversed.

void insert(Comparable x, Node * root) {
  assert( root != NULL );
  if (x < root->key) {
    if (root->left == NULL)
      root->left = new Node(x);
    else
      insert( x, root->left );
  }
  else if (x > root->key) {
    if (root->right == NULL)
      root->right = new Node(x);
    else
      insert( x, root->right );
  }
}
BuildTree for BSTs
Suppose a1, a2, …, an are inserted into an initially empty BST:
1. a1, a2, …, an are in increasing order
2. a1, a2, …, an are in decreasing order
3. a1 is the median of all, a2 is the median of elements less than a1, a3 is the median of elements greater than a1, etc.
4. data is randomly ordered
Examples of Building from Scratch
• 1, 2, 3, 4, 5, 6, 7, 8, 9
• 5, 3, 7, 2, 4, 6, 8, 1, 9
Analysis of BuildTree
• Worst case is O(n²):
  1 + 2 + 3 + … + n = O(n²)
• Average case, assuming all input orderings equally likely, is O(n log n)
– not averaging over all binary trees; rather, averaging over all input sequences (inserts)
– equivalently: average depth of a node is O(log n)
– proof: see Introduction to Algorithms, Cormen, Leiserson & Rivest
Bonus: FindMin/FindMax
• Find minimum
• Find maximum
[Diagram: the example BST (root 10)]
Deletion
[Diagram: the example BST (root 10)]
Why might deletion be harder than insertion?
Deletion - Leaf Case
[Diagram: the example BST (root 10); 17 is a leaf]
Delete(17)
Deletion - One Child Case
[Diagram: the example BST after Delete(17); 15 now has one child, 20]
Delete(15)
Deletion - Two Child Case
[Diagram: BST after Delete(15) — root 10; left child 5 with children 2 and 9 (9 has left child 7); right child 20 with child 30]
Delete(5)
replace the node with a value guaranteed to be between the left and right subtrees: the successor
Could we have used the predecessor instead?
Finding the Successor
Find the next larger node in this node’s subtree.
– not the next larger in the entire tree

Node * succ(Node * root) {
  if (root->right == NULL)
    return NULL;
  else
    return min(root->right);
}

[Diagram: the example BST (root 10)]
How many children can the successor of a node have?
Predecessor
Find the next smaller node
in this node’s subtree.
Node * pred(Node * root) {
if (root->left == NULL)
return NULL;
else
return max(root->left);
}
[Diagram: the example BST (root 10)]
Deletion - Two Child Case
[Diagram: BST with root 10; left child 5 with children 2 and 9 (9 has left child 7); right child 20 with child 30]
Delete(5)
always easy to delete the successor – always has either 0 or 1 children!
Delete Code

void remove(Comparable x, Node *& p) {  // 'delete' is a reserved word in C++
  if (p != NULL) {
    if (p->key < x)
      remove(x, p->right);
    else if (p->key > x)
      remove(x, p->left);
    else { /* p->key == x */
      if (p->left == NULL)
        p = p->right;
      else if (p->right == NULL)
        p = p->left;
      else {
        Node * q = successor(p);
        p->key = q->key;
        remove(q->key, p->right);
      }
    }
  }
}
Lazy Deletion
• Instead of physically deleting nodes, just mark them as deleted
+ simpler
+ physical deletions done in batches
+ some adds just flip the deleted flag
– extra memory for the deleted flag
– many lazy deletions slow finds
– some operations may have to be modified (e.g., min and max)
[Diagram: the example BST (root 10)]
Lazy Deletion
[Diagram: the example BST (root 10), with deleted flags shown]
Delete(17)
Delete(15)
Delete(5)
Find(9)
Find(16)
Insert(5)
Find(17)
Dictionary Implementations
BSTs look good for shallow trees, i.e. when the depth D is small (log n); otherwise they are as bad as a linked list!

           unsorted array   sorted array   linked list    BST
  insert   find + O(n)      O(n)           find + O(1)    O(Depth)
  find     O(n)             O(log n)       O(n)           O(Depth)
  delete   find + O(1)      O(n)           find + O(1)    O(Depth)
Beauty is Only Θ(log n) Deep
• Binary Search Trees are fast if they’re shallow:
– e.g.: perfectly complete
– e.g.: perfectly complete except the “fringe” (leaves)
– any other good cases?
• What matters here? Problems occur when one branch is much longer than the other!
Balance
• Balance = height(left subtree) - height(right subtree)
– zero everywhere: perfectly balanced
– small everywhere: balanced enough
[Diagram: a node t whose left subtree has height 5 and right subtree has height 7]
• Balance between -1 and 1 everywhere gives maximum height of about 1.44 log n
AVL Tree Dictionary Data Structure
[Diagram: an AVL tree over keys 2–14 with root 8]
• Binary search tree properties
– binary tree property
– search tree property
• Balance property
– balance of every node satisfies -1 ≤ b ≤ 1
– result: depth is Θ(log n)
An AVL Tree
[Diagram: an AVL tree with each node annotated with its data, its height, and its children]
Not AVL Trees
[Diagrams: two trees that violate the AVL balance property; the unbalanced nodes have balance (-1) - 1 = -2 and 0 - 2 = -2]
Staying Balanced
Good case: inserting middle, small, and tall (in that order).
Insert(middle)
Insert(small)
Insert(tall)
[Diagram: M (height 1) with children S and T (height 0 each)]
Bad Case #1
Insert(small)
Insert(middle)
Insert(tall)
[Diagram: the inserts build the chain S, M, T with heights 2, 1, 0; S has balance -2]
Single Rotation
[Diagram: a single rotation turns the chain S, M, T (height 2) into M (height 1) with children S and T]
Basic operation used in AVL trees:
a right child could legally have its parent as its left child.
General Case: Insert Unbalances
[Diagram: before the insert, node a (height h + 1) has left subtree X (height h - 1) and right child b (height h) over subtrees Y and Z; inserting into Z raises b to height h + 1 and a to height h + 2, while X stays at height h - 1]
General Single Rotation
• Height of left subtree same as it was before insert!
• Height of all ancestors unchanged– We can stop here!
[Diagram: the single rotation makes b the root of the subtree, with a (over X and Y) as its left child and Z as its right subtree; the subtree's height returns to h + 1]
Bad Case #2
Insert(small)
Insert(tall)
Insert(middle)
[Diagram: the inserts build S (height 2) with right child T (height 1), which has left child M (height 0): a zig-zag]
Will a single rotation fix this?
Double Rotation
[Diagrams: the zig-zag S, T, M becomes M (height 1) with children S and T after a double rotation]
General Double Rotation
• Initially: insert into either X or Y unbalances tree (root height goes to h+2)
• “Zig zag” to pull up c – restores root height to h+1, left subtree height to h
[Diagram: node a (with subtree Z), its child b (with subtree W), and grandchild c (with subtrees X and Y); pulling c up makes it the root, with W and X hanging under b and Y and Z under a; the root's height returns to h + 1 and the rebuilt subtrees to height h]
Insert Algorithm
• Find spot for value
• Hang new node
• Search back up looking for imbalance
• If there is an imbalance:
– case #1: perform single rotation and exit
– case #2: perform double rotation and exit
Easy Insert
Insert(3)
[Diagram: the example AVL tree (root 10) with heights; Insert(3) hangs 3 below 2 and no node becomes unbalanced]
Hard Insert (Bad Case #1)
Insert(33)
[Diagram: Insert(33) below 30 unbalances a node on the path back to the root: bad case #1]
Single Rotation
[Diagrams: a single rotation pulls 20 up: 10's right child becomes 20, with children 15 (over 12 and 17) and 30 (over 33); the tree is balanced again]
Hard Insert (Bad Case #2)
Insert(18)
[Diagram: Insert(18) below 17 creates the zig-zag imbalance of bad case #2]
Single Rotation (oops!)
[Diagrams: a single rotation pulls 20 up, but the heavy subtree containing 17 and 18 just shifts to the other side; the tree is still unbalanced]
Double Rotation (Step #1)
[Diagrams: step #1 rotates within the right subtree, lifting 17 above 20. Look familiar? This is now bad case #1]
Double Rotation (Step #2)
[Diagrams: step #2 performs a single rotation, making 17 the right child of 10, with children 15 (over 12) and 20 (over 18 and 30); the tree is balanced]
AVL Algorithm Revisited
• Recursive:
1. Search downward for spot
2. Insert node
3. Unwind stack, correcting heights
   a. If imbalance #1, single rotate
   b. If imbalance #2, double rotate
• Iterative:
1. Search downward for spot, stacking parent nodes
2. Insert node
3. Unwind stack, correcting heights
   a. If imbalance #1, single rotate and exit
   b. If imbalance #2, double rotate and exit
Single Rotation Code

void RotateRight(Node *& root) {
  // Assumes root->right is non-NULL; a full implementation needs a
  // height(NULL) == -1 helper before reading children's heights here.
  Node * temp = root->right;
  root->right = temp->left;
  temp->left = root;
  root->height = max(root->right->height,
                     root->left->height) + 1;
  temp->height = max(temp->right->height,
                     temp->left->height) + 1;
  root = temp;
}
[Diagram: root with right child temp over subtrees X, Y, Z; temp's left subtree Y is re-hung as root's right subtree]
Double Rotation Code

void DoubleRotateRight(Node *& root) {
  RotateLeft(root->right);
  RotateRight(root);
}

First Rotation
[Diagram: the first rotation lifts c above b inside a's subtree, turning the zig-zag a, b, c into the chain a, c, b]
Double Rotation Completed
[Diagrams: the second rotation pulls c up to the root, with b (over W and X) and a (over Y and Z) as its children]