3 Trees: traversal and analysis of standard search trees
Summer Term 2010
Robert Elsässer
Binary Search Trees
Binary trees for storing sets of keys (in the internal nodes of the trees), such that the operations
- find
- insert
- delete (remove)
are supported.
Search tree property: All keys in the left subtree of a node p are smaller than the key of p, and the key of p is smaller than all keys in the right subtree of p.
Implementation:
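The implementation shown on the slide is not reproduced in this text. A minimal sketch of such a tree, assuming integer keys and reusing the names SearchNode, content, left, right and root that appear in the traversal code of this section, could look as follows. The class name SearchTreeSketch and the methods find and insert are illustrative assumptions, not the lecture's code.

```java
// Minimal sketch of a binary search tree with integer keys (names partly assumed).
class SearchNode {
    int content;            // key stored in this node
    SearchNode left, right; // subtrees; null stands for the empty tree

    SearchNode(int content) { this.content = content; }
}

class SearchTreeSketch {
    SearchNode root; // null for the empty tree

    // Returns true if key is stored in the tree.
    boolean find(int key) {
        SearchNode n = root;
        while (n != null) {
            if (key == n.content) return true;
            n = (key < n.content) ? n.left : n.right;
        }
        return false;
    }

    // Inserts key; returns false if it was already present.
    boolean insert(int key) {
        if (root == null) { root = new SearchNode(key); return true; }
        SearchNode n = root;
        while (true) {
            if (key == n.content) return false;
            if (key < n.content) {
                if (n.left == null) { n.left = new SearchNode(key); return true; }
                n = n.left;
            } else {
                if (n.right == null) { n.right = new SearchNode(key); return true; }
                n = n.right;
            }
        }
    }
}
```

Both operations follow a single path from the root, so their cost is bounded by the height of the tree.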
27.04.2010 Theory 1 - traversal and analysis of standard search trees 2
Standard binary search trees
Insert 5
[Figure: binary search tree illustrating the insertion of key 5]
The tree structure depends on the order of insertions into the initially empty tree.
The height can grow linearly in n, but it can also be in O(log n), more precisely ⌈log2(n+1)⌉.
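The dependence on the insertion order can be checked with a small experiment. The sketch below inserts the same seven keys in two different orders and reports the height, counted in nodes (an empty tree has height 0). The names HeightDemo, Node, insert, height and heightAfter are assumptions made for this illustration.

```java
// Sketch: the same keys inserted in different orders give trees of different height.
class HeightDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Standard (unbalanced) insertion; returns the possibly new subtree root.
    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    // Height counted in nodes: the empty tree has height 0.
    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    // Inserts the keys in the given order into an empty tree and returns the height.
    static int heightAfter(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return height(root);
    }

    public static void main(String[] args) {
        System.out.println(heightAfter(1, 2, 3, 4, 5, 6, 7)); // sorted order: 7
        System.out.println(heightAfter(4, 2, 6, 1, 3, 5, 7)); // "middle first" order: 3
    }
}
```

For n = 7 the second order reaches the minimum ⌈log2(n+1)⌉ = 3, while the sorted order degenerates into a linear list.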
Traversal of trees
Traversal of the nodes of a tree
- for output
- for calculating the sum, average, number of keys, ...
- for changing the structure
Most important traversal orders:
1. Preorder = NLR (Node-Left-Right): first visit the root, then recursively the left and right subtree (if existent)
2. Postorder = LRN
3. Inorder = LNR
4. The mirror image versions of 1-3
Preorder
Preorder traversal is recursively defined as follows:
Traversal of all nodes of a binary tree with root p in preorder:
Visit p, traverse the left subtree of p in preorder, traverse the right subtree of p in preorder.

        17
       /  \
     11    22
    /  \
   7    14
       /
     12
Preorder implementation
// Preorder: Node-Left-Right
void preOrder() {
    preOrder(root);
    System.out.println();
}

void preOrder(SearchNode n) {
    if (n == null) return;
    System.out.print(n.content + " ");
    preOrder(n.left);
    preOrder(n.right);
}

// Postorder: Left-Right-Node
void postOrder() {
    postOrder(root);
    System.out.println();
}
// ...
Inorder
The traversal order is: first the left subtree, then the root, then the right subtree:
// Inorder: Left-Node-Right
void inOrder() {
    inOrder(root);
    System.out.println();
}

void inOrder(SearchNode n) {
    if (n == null) return;
    inOrder(n.left);
    System.out.print(n.content + " ");
    inOrder(n.right);
}
Example
        17
       /  \
     11    22
    /  \
   7    14
       /
     12

Preorder: 17, 11, 7, 14, 12, 22
Postorder: 7, 12, 14, 11, 22, 17
Inorder: 7, 11, 12, 14, 17, 22
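The three sequences of the example can be reproduced with a short, self-contained program. This is only a sketch: the class TraversalDemo, its Node type and the integer codes for the traversal order are assumptions chosen for brevity, and the tree is built by inserting the keys in the order 17, 11, 22, 7, 14, 12, which yields the example tree.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: the three traversal orders on the example tree (class and method names assumed).
class TraversalDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    // Builds a search tree by inserting the keys in the given order.
    static Node build(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }

    // order: 0 = preorder (NLR), 1 = inorder (LNR), 2 = postorder (LRN)
    static void visit(Node n, int order, List<Integer> out) {
        if (n == null) return;
        if (order == 0) out.add(n.key);
        visit(n.left, order, out);
        if (order == 1) out.add(n.key);
        visit(n.right, order, out);
        if (order == 2) out.add(n.key);
    }

    static List<Integer> traverse(Node root, int order) {
        List<Integer> out = new ArrayList<>();
        visit(root, order, out);
        return out;
    }

    public static void main(String[] args) {
        Node root = build(17, 11, 22, 7, 14, 12);
        System.out.println(traverse(root, 0)); // [17, 11, 7, 14, 12, 22]
        System.out.println(traverse(root, 1)); // [7, 11, 12, 14, 17, 22]
        System.out.println(traverse(root, 2)); // [7, 12, 14, 11, 22, 17]
    }
}
```

The inorder sequence is sorted, which is a direct consequence of the search tree property.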
Non-recursive variants with threaded trees
Recursion can be avoided if, instead of null pointers, so-called thread pointers to the successors or predecessors are used.
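To illustrate the idea, here is a sketch of one common variant, a right-threaded search tree, in which every missing right child is replaced by a thread to the inorder successor. The text does not say which threading variant the lecture uses, so the class ThreadedTreeSketch, the flag rightIsThread and the helper leftmost are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a right-threaded search tree (one possible threading variant, names assumed).
class ThreadedTreeSketch {
    static class Node {
        int key;
        Node left, right;
        boolean rightIsThread; // true: right points to the inorder successor, not to a child
        Node(int key) { this.key = key; }
    }

    Node root;

    void insert(int key) {
        if (root == null) { root = new Node(key); return; }
        Node cur = root;
        while (true) {
            if (key == cur.key) return; // ignore duplicates
            if (key < cur.key) {
                if (cur.left == null) {
                    Node n = new Node(key);
                    n.right = cur;          // successor of a new left child is its parent
                    n.rightIsThread = true;
                    cur.left = n;
                    return;
                }
                cur = cur.left;
            } else {
                if (cur.right == null || cur.rightIsThread) {
                    Node n = new Node(key);
                    n.right = cur.right;    // new node inherits the parent's thread (or null)
                    n.rightIsThread = cur.rightIsThread;
                    cur.right = n;
                    cur.rightIsThread = false;
                    return;
                }
                cur = cur.right;
            }
        }
    }

    private static Node leftmost(Node n) {
        while (n != null && n.left != null) n = n.left;
        return n;
    }

    // Inorder traversal without recursion and without a stack.
    List<Integer> inOrder() {
        List<Integer> out = new ArrayList<>();
        Node cur = leftmost(root);
        while (cur != null) {
            out.add(cur.key);
            cur = cur.rightIsThread ? cur.right : leftmost(cur.right);
        }
        return out;
    }
}
```

After visiting a node, the traversal either follows the thread directly to the successor or descends to the leftmost node of the right subtree, so no call stack is needed.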
Example for Search, Insertion, Deletion
        17
       /  \
     11    22
    /  \
   7    14
       /
     12
Sorting with standard search trees
Idea: Create a search tree for the input sequence and output the keys by an inorder traversal.
Remark: Depending on the input sequence, the search tree may degenerate.
Complexity: Depends on the internal path length.
Worst case: Sorted input ⇒ Ω(n²) steps.
Best case: We get a complete search tree of minimal height of about log n. Then n insertions and outputs are possible in time O(n log n).
Average case: ?
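The idea above can be sketched in a few lines. The following code is an illustration under assumptions, not the lecture's implementation: the class TreeSortSketch and its methods are hypothetical names, and equal keys are sent to the right subtree so that duplicates in the input are preserved.

```java
// Sketch of sorting via a standard search tree (class and method names assumed).
class TreeSortSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Iterative insertion; equal keys go to the right so duplicates are kept.
    static Node insert(Node root, int key) {
        Node fresh = new Node(key);
        if (root == null) return fresh;
        Node cur = root;
        while (true) {
            if (key < cur.key) {
                if (cur.left == null) { cur.left = fresh; return root; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = fresh; return root; }
                cur = cur.right;
            }
        }
    }

    // Inorder traversal writing keys into out, starting at pos; returns the next free position.
    // The recursion depth equals the tree height, so a degenerate tree needs a deep stack.
    static int fill(Node n, int[] out, int pos) {
        if (n == null) return pos;
        pos = fill(n.left, out, pos);
        out[pos++] = n.key;
        return fill(n.right, out, pos);
    }

    static int[] sort(int[] input) {
        Node root = null;
        for (int k : input) root = insert(root, k);
        int[] out = new int[input.length];
        fill(root, out, 0);
        return out;
    }
}
```

The cost of the n insertions is the sum of the lengths of the n search paths, which is why the complexity is governed by the internal path length of the resulting tree.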
Analysis of search trees
Two possible approaches to determine the internal path length:
1. Random tree analysis, i.e. average over all possible permutations of the keys to be inserted (into the initially empty tree).
2. Shape analysis, i.e. average over all structurally different trees with n keys.
Difference of the expected values for the internal path length:
1. ≈ 1.386 n log2 n - 0.846 n + O(log n)
2. ≈ n·√(πn) + O(n)
Reason for the difference
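[Figure: the five structurally different binary search trees with the keys 1, 2, 3, each labelled with the insertion orders that produce it]
Insertion orders per tree: 3,2,1 | 3,1,2 | 1,3,2 | 1,2,3 | 2,1,3 and 2,3,1
The balanced tree (root 2) results from two of the six permutations, each of the other four trees from only one.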
Random tree analysis counts more balanced trees more often.
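The counting argument can be verified by brute force. The sketch below inserts every permutation of 1, ..., n into an empty tree and counts how often each shape occurs. The class ShapeCountDemo and the string encoding of shapes ("." for an empty subtree, "(left right)" for a node) are assumptions made for this illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch: how many permutations of 1..n produce each tree shape (names assumed).
class ShapeCountDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else n.right = insert(n.right, key);
        return n;
    }

    // Encodes only the shape: "." for an empty subtree, "(" left right ")" for a node.
    static String shape(Node n) {
        return n == null ? "." : "(" + shape(n.left) + shape(n.right) + ")";
    }

    static void permute(int[] perm, boolean[] used, int pos, Map<String, Integer> counts) {
        int n = perm.length;
        if (pos == n) {
            Node root = null;
            for (int k : perm) root = insert(root, k);
            counts.merge(shape(root), 1, Integer::sum);
            return;
        }
        for (int k = 1; k <= n; k++) {
            if (used[k - 1]) continue;
            used[k - 1] = true;
            perm[pos] = k;
            permute(perm, used, pos + 1, counts);
            used[k - 1] = false;
        }
    }

    // Maps every shape to the number of insertion orders of 1..n that produce it.
    static Map<String, Integer> countShapes(int n) {
        Map<String, Integer> counts = new TreeMap<>();
        permute(new int[n], new boolean[n], 0, counts);
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countShapes(3)); // five shapes; "((..)(..))" is the balanced one
    }
}
```

For n = 3 the balanced shape "((..)(..))" is counted twice and every other shape once, so averaging over permutations weights the balanced tree with 2/6 instead of 1/5.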
Internal path length
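Internal path length I: a measure for judging the quality of a search tree t.
Recursive definition:
1. If t is empty, then $I(t) = 0$.
2. For a tree t with left subtree $t_l$ and right subtree $t_r$: $I(t) = I(t_l) + I(t_r) + \text{number of nodes of } t$.
Clearly: $I(t) = \sum_{p} (\mathrm{depth}(p) + 1)$, where the sum runs over all internal nodes p of t and the root has depth 0.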
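As a small check, the internal path length can be computed in two ways. The sketch assumes the convention that every node contributes the number of nodes on its search path (depth + 1, with the root at depth 0), i.e. the recursive form I(t) = I(t_l) + I(t_r) + number of nodes of t. The formulas are not preserved in this text, so this convention is an assumption, as are the names PathLengthDemo, internalPathLength and sumOfLevels.

```java
// Sketch: internal path length computed recursively and as a sum over levels (names assumed).
class PathLengthDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    static Node build(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }

    static int size(Node t) {
        return t == null ? 0 : 1 + size(t.left) + size(t.right);
    }

    // Recursive definition (assumed form): I(t) = I(t_l) + I(t_r) + number of nodes of t.
    static int internalPathLength(Node t) {
        if (t == null) return 0;
        return internalPathLength(t.left) + internalPathLength(t.right) + size(t);
    }

    // Equivalent sum: every node contributes its level, the root being on level 1.
    static int sumOfLevels(Node t, int level) {
        if (t == null) return 0;
        return level + sumOfLevels(t.left, level + 1) + sumOfLevels(t.right, level + 1);
    }

    public static void main(String[] args) {
        Node root = build(17, 11, 22, 7, 14, 12);
        System.out.println(internalPathLength(root)); // 15
        System.out.println(sumOfLevels(root, 1));     // 15
    }
}
```

For the example tree with root 17 the levels are 1, 2, 2, 3, 3, 4, which sum to 15 in both computations.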
Average search path length
For a tree t, the average search path length is defined by:
$D(t) = I(t)/n$, where n is the number of internal nodes of t.
Question: What is the size of D(t) in the
- best
- worst
- average
case for a tree t with n internal nodes?
Internal path: best case
We obtain a complete binary tree.
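The slide's formulas are not preserved in this text. As a hedged worked example, assume that I(t) counts depth + 1 for every internal node, that D(t) = I(t)/n, and that the complete tree has height h, i.e. n = 2^h - 1 nodes with 2^i nodes at depth i. Under these assumptions:

```latex
\begin{align*}
I(t) &= \sum_{i=0}^{h-1} (i+1)\, 2^{i} = (h-1)\, 2^{h} + 1
      = (n+1)\log_2(n+1) - n, \\
D(t) &= \frac{I(t)}{n} = \Bigl(1 + \frac{1}{n}\Bigr)\log_2(n+1) - 1 .
\end{align*}
```

For n = 7 this gives I(t) = 17 and D(t) = 17/7 ≈ 2.43, so the best case grows like log2 n.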
Internal path: worst case
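The slide's content is not preserved in this text. As a hedged worked example, assume again that I(t) counts depth + 1 for every internal node and that D(t) = I(t)/n. The worst case is then a tree degenerated into a linear list, with exactly one node on each of the levels 1, ..., n:

```latex
\begin{align*}
I(t) &= \sum_{i=1}^{n} i = \frac{n(n+1)}{2}, &
D(t) &= \frac{I(t)}{n} = \frac{n+1}{2} .
\end{align*}
```

Under these assumptions the average search path length is linear in n in the worst case.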
Random trees
Without loss of generality, let 1, …, n be the keys to be inserted.
Let s1, …, sn be a random permutation of these keys.
Hence the probability that s1 has the value k is P(s1 = k) = 1/n.
If k is the first key, k will be stored in the root.
Then the left subtree contains k-1 elements (the keys 1, …, k-1) and the right subtree contains n-k elements (the keys k+1, …, n).
Expected internal path length
EI(n): expectation of the internal path length of a randomly generated binary search tree with n nodes.
Clearly we have:
$EI(0) = 0$, $EI(1) = 1$,
$EI(n) = n + \frac{1}{n} \sum_{k=1}^{n} \bigl( EI(k-1) + EI(n-k) \bigr)$.
Claim: EI(n) ≈ 1.386 n log2 n - 0.846 n + O(log n).
Proof (1)
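Since both sums in the recurrence run over the same values, we have
$EI(n+1) = (n+1) + \frac{2}{n+1} \sum_{k=0}^{n} EI(k)$
and hence
$(n+1) \cdot EI(n+1) = (n+1)^2 + 2 \sum_{k=0}^{n} EI(k)$,
$n \cdot EI(n) = n^2 + 2 \sum_{k=0}^{n-1} EI(k)$.
From the last two equations it follows that
$(n+1) \cdot EI(n+1) - n \cdot EI(n) = 2n + 1 + 2 \cdot EI(n)$,
$EI(n+1) = \frac{n+2}{n+1} \cdot EI(n) + \frac{2n+1}{n+1}$.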
Proof (2)
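By induction over n it is possible to show that for all n ≥ 1:
$EI(n) = 2(n+1) H_n - 3n$,
where $H_n = 1 + \frac{1}{2} + \dots + \frac{1}{n}$ is the n-th harmonic number,
which can be estimated as follows:
$H_n = \ln n + \gamma + \frac{1}{2n} + O(1/n^2)$,
where $\gamma = 0.5772\ldots$ is the so-called Euler constant.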
Proof (3)
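Thus,
$EI(n) = 2n \ln n - (3 - 2\gamma)\, n + 2 \ln n + 1 + 2\gamma + O(1/n)$
and hence,
$\frac{EI(n)}{n} = 2 \ln n - (3 - 2\gamma) + \frac{2 \ln n}{n} + \dots \approx 1.386 \log_2 n - (3 - 2\gamma) + \frac{2 \ln n}{n} + \dots$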
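The relation between a recurrence and its closed form can be checked numerically. The sketch below assumes the recurrence EI(0) = 0, EI(n) = n + (2/n)·(EI(0) + ... + EI(n-1)), which corresponds to counting depth + 1 for every node, and compares it with the closed form 2(n+1)·H_n - 3n. Both formulas are stated here as assumptions, since the slide's equations are not preserved in this text; the class and method names are hypothetical.

```java
// Sketch: expected internal path length of random search trees, recurrence vs. closed form.
class ExpectedPathLengthDemo {
    // EI(n) from the assumed recurrence EI(n) = n + (2/n) * (EI(0) + ... + EI(n-1)), EI(0) = 0.
    static double[] byRecurrence(int maxN) {
        double[] ei = new double[maxN + 1];
        double prefix = 0.0; // sum of ei[0..n-1]
        for (int n = 1; n <= maxN; n++) {
            prefix += ei[n - 1];
            ei[n] = n + 2.0 * prefix / n;
        }
        return ei;
    }

    // Assumed closed form 2(n+1) H_n - 3n with the harmonic number H_n = 1 + 1/2 + ... + 1/n.
    static double closedForm(int n) {
        double h = 0.0;
        for (int k = 1; k <= n; k++) h += 1.0 / k;
        return 2.0 * (n + 1) * h - 3.0 * n;
    }

    public static void main(String[] args) {
        double[] ei = byRecurrence(1000);
        for (int n : new int[] { 1, 2, 3, 10, 100, 1000 })
            System.out.printf("n=%d  recurrence=%.4f  closed form=%.4f%n",
                    n, ei[n], closedForm(n));
    }
}
```

For n = 3 both computations give 17/3, which matches the direct average over the six insertion orders: four linear lists with path length 6 and two balanced trees with path length 5.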
Observation
Search, insertion and deletion of a key in a randomly generated binary search tree with n keys can be done, on average, in O(log n) steps.
In the worst case, the complexity can be Ω(n).
One can show that the average distance of a node from the root in a randomly generated tree is only about 40% above the optimal value.
However, if deletions always replace the removed key by its symmetric successor, the behaviour becomes worse.
If n² update operations are carried out in a randomly generated search tree with n keys, the expected average search path length is Θ(√n).
Typical binary tree for a random sequence of keys
Resulting binary tree after n² updates
Structural analysis of binary trees
Question: What is the average search path length of a binary tree with N internal nodes if the average is taken over all structurally different binary trees with N internal nodes?
Answer: Let
I_N = total internal path length of all structurally different binary trees with N internal nodes
B_N = number of all structurally different binary trees with N internal nodes
Then I_N/B_N is the average internal path length over these trees.
Number of structurally different binary trees
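The slide's derivation is not preserved in this text. A standard way to count the shapes, sketched here as an assumption about what the slide derives, is to split a tree with N nodes into its root, a left subtree with k nodes and a right subtree with N-1-k nodes, which gives B_0 = 1 and B_N = Σ_{k=0}^{N-1} B_k · B_{N-1-k} (the Catalan numbers). The class TreeCountDemo and its method countTrees are hypothetical names.

```java
// Sketch: number of structurally different binary trees with n nodes (names assumed).
class TreeCountDemo {
    // b[n] = number of shapes with n nodes; the values fit into long for maxN up to about 30.
    static long[] countTrees(int maxN) {
        long[] b = new long[maxN + 1];
        b[0] = 1; // exactly one empty tree
        for (int n = 1; n <= maxN; n++) {
            // root plus a left subtree with k nodes and a right subtree with n-1-k nodes
            for (int k = 0; k < n; k++) {
                b[n] += b[k] * b[n - 1 - k];
            }
        }
        return b;
    }

    public static void main(String[] args) {
        long[] b = countTrees(10);
        for (int n = 0; n <= 10; n++) System.out.println("B_" + n + " = " + b[n]);
    }
}
```

For N = 3 the recurrence yields 5, the number of shapes obtained from the permutations of three keys.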
Total internal path length of all trees with N nodes
For each tree t with left subtree $t_l$ and right subtree $t_r$: $I(t) = I(t_l) + I(t_r) + \text{number of nodes of } t$.
Summary
The average search path length in a tree with N internal nodes (averaged over all structurally different trees with N internal nodes) is:
$\frac{1}{N} \cdot \frac{I_N}{B_N} = \sqrt{\pi N} + O(1)$
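The quantity in the summary can be tabulated for small N. The sketch assumes that every node contributes depth + 1 to the internal path length and that a tree with N nodes consists of a root, a left subtree with k nodes and a right subtree with N-1-k nodes. This gives B_N = Σ_k B_k · B_{N-1-k} and I_N = Σ_k (I_k · B_{N-1-k} + B_k · I_{N-1-k}) + N · B_N. These recurrences and the names ShapeAverageDemo, totals and averageSearchPathLength are assumptions for illustration.

```java
// Sketch: averages over all structurally different trees (shape analysis), names assumed.
class ShapeAverageDemo {
    // Returns {B, I}: B[n] = number of shapes, I[n] = total internal path length of all
    // shapes with n nodes, where each node contributes depth + 1 (assumption).
    // The values fit into long for maxN up to about 30.
    static long[][] totals(int maxN) {
        long[] b = new long[maxN + 1];
        long[] i = new long[maxN + 1];
        b[0] = 1;
        for (int n = 1; n <= maxN; n++) {
            for (int k = 0; k < n; k++) {
                // left subtree has k nodes, right subtree has n-1-k nodes
                b[n] += b[k] * b[n - 1 - k];
                i[n] += i[k] * b[n - 1 - k] + b[k] * i[n - 1 - k];
            }
            i[n] += (long) n * b[n]; // every tree additionally contributes its number of nodes
        }
        return new long[][] { b, i };
    }

    // (1/N) * I_N / B_N for N >= 1
    static double averageSearchPathLength(int n) {
        long[][] t = totals(n);
        return (double) t[1][n] / ((double) n * t[0][n]);
    }

    public static void main(String[] args) {
        for (int n : new int[] { 5, 10, 20, 30 })
            System.out.printf("N=%d  average=%.3f  sqrt(pi*N)=%.3f%n",
                    n, averageSearchPathLength(n), Math.sqrt(Math.PI * n));
    }
}
```

For N = 3 the totals are B_3 = 5 and I_3 = 29 (four linear lists with path length 6 and one balanced tree with path length 5), so the average search path length is 29/15.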