Introduction to Trees, Binary Search Trees

9 Trees
Another very important data structure is the tree. By definition:
A tree may be empty, or it may consist of a root and 0 or more subtrees. Each subtree is
also a tree.
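[Figure: a tree with root A. A has children B and C. B has children D, E and F. C has children G and H. H has children I and J.]
9.1 Definitions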
For examples, refer to the diagram above.
Node: an element of the tree that stores data (each circle in the diagram).
Root node: the topmost node, from which all other nodes descend. Note that every node
can be considered the root node of its own subtree.
Ex. A is the root of the entire tree. B is the root of the subtree containing B, D, E and F.
Leaf node: a node with no subtrees.
Ex. D, E, F, I, J and G are all leaf nodes.
Child: the root node of a subtree of a node is a child of that node.
Ex. B is a child of A. I is a child of H.
Parent: the opposite of a child node.
Ex. A is the parent of B. H is the parent of I.
Siblings: all nodes that have the same parent node are siblings.
Ex. E and F are siblings of D, but H is not.
Ancestors: the ancestors of a node are all nodes that can be reached from it by moving
only in an upward direction in the tree.
Ex. C, A and H are all ancestors of I, but G and B are not.
Descendants: the descendants of a node are the nodes that can be reached from it by only
going down in the tree.
Ex. The descendants of C are G, H, I and J.
Levels: The node in Level 0 of our tree is A. The nodes in Level 1 are B and C. The nodes
in Level 2 are D, E, F, G and H. The nodes in Level 3 are I and J.
Height: the number of levels in the tree (in our case 4).
Path: the sequence of branches taken to connect an ancestor of a node to the node. It is
usually described by the nodes encountered along the path.
Binary tree: a binary tree is a tree where every node has 2 subtrees that are also binary
trees. The subtrees may be empty. Each node therefore has a left child and a right child,
either of which may be absent.
The following are NOT trees.
[Figure: two node-and-link diagrams that are not trees.]
9.2 Tree Implementations
By their nature, trees are typically implemented using a node/link data structure like that of
a linked list. In a linked list, each node has data and a next (and sometimes also a
previous) pointer. In a tree, each node has data and a number of node pointers that point
to the node's children.
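As a rough illustration of the node/link idea, the two node types might be declared as below. The binary node reuses the TNode name and the data_, left_ and right_ members that appear in the code later in these notes. The constructors, the GNode name and its children_ member are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Binary tree node: data plus exactly two child links.
// Either link may be NULL to indicate an empty subtree.
template <class TYPE>
struct TNode {
    TYPE data_;
    TNode<TYPE>* left_;
    TNode<TYPE>* right_;
    TNode(const TYPE& data) : data_(data), left_(NULL), right_(NULL) {}
};

// General tree node (hypothetical name): data plus any number of child links.
template <class TYPE>
struct GNode {
    TYPE data_;
    std::vector<GNode<TYPE>*> children_;
    GNode(const TYPE& data) : data_(data) {}
};
```

A leaf is simply a node whose child links are all NULL (or, for GNode, whose children_ vector is empty).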
However, you could also use an array-based implementation for a tree. Each element
would be an object containing data and several integers giving the indexes of the
roots of its subtrees. An array is how we implemented the Heap.
For BINARY TREES ONLY, you can use the array indexes themselves to indicate a node's level
and parent. Element 0 is the root. Elements 1 and 2 are its children. Elements 3 and 4 are
the children of element 1. Elements 5 and 6 are the children of element 2. In other words,
for node k, its left child is node 2k+1 and its right child is node 2k+2. The parent of node k
is node (k-1)/2, using integer division. Array implementations are good only when the tree is
complete or nearly so; otherwise much space will be wasted.
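The index arithmetic for the array layout can be sketched as three small helpers. The function names are assumptions chosen for this example; only the formulas come from the text.

```cpp
#include <cstddef>

// Index of the left child of the node stored at index k.
inline std::size_t leftChild(std::size_t k) { return 2 * k + 1; }

// Index of the right child of the node stored at index k.
inline std::size_t rightChild(std::size_t k) { return 2 * k + 2; }

// Index of the parent of the node stored at index k.
// Only meaningful for k > 0, since the root (index 0) has no parent.
inline std::size_t parent(std::size_t k) { return (k - 1) / 2; }
```

For example, leftChild(2) is 5 and rightChild(2) is 6, and parent(5) and parent(6) both give 2 because the division truncates.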
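9.3 Binary Trees
A binary tree is a tree where every node has 2 subtrees that are also binary trees. The
subtrees may be empty. The following is a binary tree:
[Figure: a binary tree with root A containing the nodes A, B, C, D, F, G, H, I and J. No node has more than two children.]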
The following isn't (B has 3 subtrees):
[Figure: a tree with root A in which B has three children: D, E and F.]
A binary tree that is ordered is called a binary search tree. This is not the same as a tree
with the heap order property. A binary search tree is a binary tree where, for every node,
all values in the left subtree < value in the current node < all values in the right subtree.
The following is NOT a binary search tree:
[Figure: a binary tree with A at the root whose values are not in search-tree order.]
The following IS a binary search tree:
[Figure: a binary search tree holding the values A to I, with D at the root.]
A binary search tree allows us to do searches quickly even when using a linked
structure (under certain conditions). To find a value, we start at the root and compare
our key with the value stored there. If they are equal, we are done. If the key is less than
the value, we search the left subtree. If the key is greater than the value, we search the
right subtree. This gives us a way to do a "binary search" on a linked structure.
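template <class TYPE>
TNode<TYPE>* BST<TYPE>::Search(TYPE key){
    TNode<TYPE>* retval=NULL;
    TNode<TYPE>* curr=root_;
    // keep going until the key is found or we run off the bottom of the tree
    while(!retval && curr){
        if(curr->data_==key)
            retval=curr;
        else if(key < curr->data_)
            curr=curr->left_;
        else
            curr=curr->right_;
    }
    return retval;   // NULL if the key is not in the tree
}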
9.3.1 Insertion
To insert into a binary search tree, we must maintain the nodes in sorted order. For a given
tree there is only one place a new item can go. Example: insert the letters D, B, C, E, A
and F into an initially empty binary search tree in the order listed.
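Insert D.
[Figure: D alone at the root.]
Insert B. It must go on the left as B < D.
[Figure: D at the root with left child B.]
Insert C. C < D so look at the left subtree. C > B so it goes on the right branch from B.
[Figure: D at the root; B is D's left child; C is B's right child.]
Insert E. E > D. D's right subtree is empty so put E there.
[Figure: D has children B and E; C is B's right child.]
Insert A. A < D so look in the left subtree. A < B so make it B's left child.
[Figure: D has children B and E; B has children A and C.]
Insert F. F > D and F > E so make it E's right child.
[Figure: D has children B and E; B has children A and C; F is E's right child.]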
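The walk-through above can be sketched in code. This is a minimal version written as a free function rather than a member of the BST class. The constructor, the function names insert and destroy, and the decision to ignore duplicate values are all assumptions made for this example.

```cpp
#include <cstddef>

template <class TYPE>
struct TNode {
    TYPE data_;
    TNode<TYPE>* left_;
    TNode<TYPE>* right_;
    TNode(const TYPE& data) : data_(data), left_(NULL), right_(NULL) {}
};

// Inserts data into the tree rooted at root and returns the root
// (which changes only when the tree was empty).
// Assumption: a value already in the tree is ignored.
template <class TYPE>
TNode<TYPE>* insert(TNode<TYPE>* root, const TYPE& data) {
    if (!root) return new TNode<TYPE>(data);
    TNode<TYPE>* curr = root;
    while (true) {
        if (data < curr->data_) {
            // belongs in the left subtree; attach here if it is empty
            if (!curr->left_) { curr->left_ = new TNode<TYPE>(data); break; }
            curr = curr->left_;
        } else if (curr->data_ < data) {
            // belongs in the right subtree; attach here if it is empty
            if (!curr->right_) { curr->right_ = new TNode<TYPE>(data); break; }
            curr = curr->right_;
        } else {
            break;  // already present
        }
    }
    return root;
}

// Frees every node in the subtree rooted at node.
template <class TYPE>
void destroy(TNode<TYPE>* node) {
    if (node) {
        destroy(node->left_);
        destroy(node->right_);
        delete node;
    }
}
```

Starting from an empty tree (a NULL root) and calling root = insert(root, x) for D, B, C, E, A and F in turn produces the same shape as the final figure above.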
9.3.2 Traversal
Suppose that I want to print all values in the tree from smallest to biggest. How would I
do this?
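template <class TYPE>
void PrintNode(TNode<TYPE>* node){
    if(node){
        PrintNode(node->left_);    // everything smaller first
        node->data_.print();       // then the current node
        PrintNode(node->right_);   // then everything bigger
    }
}
template <class TYPE>
void Print(BST<TYPE>& tree){
    PrintNode(tree.root_);
}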
A tree is defined recursively, so many tree algorithms are most easily written recursively.
This is especially true for functions involving tree traversals. A tree traversal is an algorithm
that visits every node exactly once and applies some function to that node. In the code
above we visit each node once and print it. The above code is what is called an InOrder
traversal. This means all values less than the current node are dealt with first, then
the current node, then the nodes with values greater than the current node. Aside from
InOrder traversal there are also PreOrder traversal and PostOrder traversal. In PreOrder
traversal, the current node is visited first and then the children are visited (left first, then right).
In PostOrder traversal, the children are visited first and then the current node is visited.
If we did a PreOrder printing of the tree the code would look like:
template <class TYPE>
void PrePrintNode(TNode<TYPE>* node){
    if(node){
        node->data_.print();          // current node first
        PrePrintNode(node->left_);
        PrePrintNode(node->right_);
    }
}
template <class TYPE>
void PreOrderPrint(BST<TYPE>& tree){
    PrePrintNode(tree.root_);
}
If we did a PostOrder Printing of the tree the code would look like:
template <class TYPE>
void PostPrintNode(TNode<TYPE>* node){
    if(node){
        PostPrintNode(node->left_);
        PostPrintNode(node->right_);
        node->data_.print();          // current node last
    }
}
template <class TYPE>
void PostOrderPrint(BST<TYPE>& tree){
    PostPrintNode(tree.root_);
}
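To compare the three orders side by side, the sketch below appends each visited value to a string instead of printing it. The CNode type and the function names are assumptions made for this example; a plain char node is used so that no print() member is needed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical node holding a single character.
struct CNode {
    char data_;
    CNode* left_;
    CNode* right_;
    CNode(char d, CNode* l = NULL, CNode* r = NULL)
        : data_(d), left_(l), right_(r) {}
};

// Left subtree, then the node, then the right subtree.
void inOrder(const CNode* n, std::string& out) {
    if (n) {
        inOrder(n->left_, out);
        out += n->data_;
        inOrder(n->right_, out);
    }
}

// The node, then the left subtree, then the right subtree.
void preOrder(const CNode* n, std::string& out) {
    if (n) {
        out += n->data_;
        preOrder(n->left_, out);
        preOrder(n->right_, out);
    }
}

// Left subtree, then the right subtree, then the node.
void postOrder(const CNode* n, std::string& out) {
    if (n) {
        postOrder(n->left_, out);
        postOrder(n->right_, out);
        out += n->data_;
    }
}
```

On the tree built in the insertion example (D at the root, B with children A and C on the left, E with right child F on the right), inOrder yields ABCDEF, preOrder yields DBACEF and postOrder yields ACBFED.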
9.3.3 Deletion
In order to delete a node, we must be sure to link up the subtree(s) of the node properly.
Let us consider the following situations:
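[Figure: a binary search tree with root D. D has children B and H. B has children A and C. H has children F and L. F has children E and G. L has left child J. J has children I and K.]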
Delete the node containing G. G has no children, so we simply delete the node and make the
pointer from its parent node point to NULL.
[Figure: the same tree with G removed. F now has only a left child, E.]
Now, let's delete a node like F, which has a left child but no right child. This is also
easy: all we need to do is make the pointer from the parent node point to the left child.
[Figure: the tree before the deletion, with the link from H to F about to be redirected to F's left child, E.]
Thus our tree is
[Figure: the tree with F removed. E is now the left child of H.]
Next, delete node E. This is easy, since E has no children, and it leaves H as a node with
just a right subtree.
[Figure: the tree with E removed. H now has only a right child, L.]
Now let's delete node H. Node H has only a right subtree. The process is the same
as that of deleting a node with just a left subtree: make the parent node point to the right
subtree instead of to the node itself.
[Figure: the tree with H removed. D now has children B and L.]
Delete node B. B has two children, which means we can't just make D point to B's
children: there are two of them and only one available link from D. To delete B, we
must:
1) find its inorder successor (the next biggest value).
This is done by going right once and then going left as far as possible.
2) promote the inorder successor to replace the node to be deleted (think of it as deleting
the inorder successor and then replacing the value in the current node with the value of the
inorder successor).
Note that the inorder successor will always either be a leaf or have only a right child, since
it never has a left child.
The inorder successor of B is C. Delete the node containing C and replace the data in node B
with the data from node C.
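[Figure: the tree after B is deleted. D has children C and L. C has left child A. L has left child J. J has children I and K.]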
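What would the tree look like if we were to delete node D?
The inorder successor of D is I. Remove and promote I.
[Figure: the tree after D is deleted. I is the root with children C and L. C has left child A. L has left child J. J has right child K.]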
This promotion can be handled in two ways:
1) copy the data into the node and do a delete on the inorder successor.
2) change the links involved so that the successor node itself takes the place of the deleted
node (only a few pointers are affected).
Method 2 is better, as the original addresses of the nodes are preserved, and thus any other
data structures that had stored these addresses remain valid.
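The deletion cases can be sketched with method 2, relinking nodes rather than copying data. This is one possible arrangement, not necessarily the one used in the course. The function name removeKey, the constructor, and the pointer-to-pointer technique for tracking the parent's link are assumptions of this example.

```cpp
#include <cstddef>

template <class TYPE>
struct TNode {
    TYPE data_;
    TNode<TYPE>* left_;
    TNode<TYPE>* right_;
    TNode(const TYPE& data) : data_(data), left_(NULL), right_(NULL) {}
};

// Removes the node holding key from the tree whose root pointer is root.
// Returns true if a node was removed, false if key was not found.
template <class TYPE>
bool removeKey(TNode<TYPE>*& root, const TYPE& key) {
    // link is the address of the pointer that leads to the current node
    // (either the root pointer or some parent's left_ or right_).
    TNode<TYPE>** link = &root;
    while (*link && !((*link)->data_ == key)) {
        link = (key < (*link)->data_) ? &(*link)->left_ : &(*link)->right_;
    }
    TNode<TYPE>* victim = *link;
    if (!victim) return false;

    if (!victim->left_) {
        *link = victim->right_;   // leaf, or right subtree only
    } else if (!victim->right_) {
        *link = victim->left_;    // left subtree only
    } else {
        // Two children: go right once, then left as far as possible.
        TNode<TYPE>** succLink = &victim->right_;
        while ((*succLink)->left_) succLink = &(*succLink)->left_;
        TNode<TYPE>* succ = *succLink;
        *succLink = succ->right_;       // unlink successor (it has no left child)
        succ->left_ = victim->left_;    // successor adopts the victim's subtrees
        succ->right_ = victim->right_;
        *link = succ;                   // parent now points at the successor
    }
    delete victim;
    return true;
}
```

Because the successor node itself is moved into place, a pointer to it held elsewhere still refers to the same value after the deletion.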
So far this seems pretty good. Inserts and deletes are not too hard to do AND search is
very fast.
The only problem is that the following is ALSO a binary search tree:
[Figure: B at the root with left child A. C, D, E, F, G, H and I form a chain of right children below B.]
If we were to insert in the following order: B,A,C,D,E,F,G,H,I we would get the above
tree. A search for I in this tree must visit eight nodes, which is no better than searching a
linked list.
The problem with our algorithms so far is that the order in which the data is presented
determines how efficient the structure will be. This is not good.
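The effect of insertion order can be checked by measuring the height (number of levels) of the resulting tree. The sketch below uses a recursive insert for brevity. The function names insert, height and build are assumptions made for this example, and the second insertion order is simply one that happens to give a well-balanced tree.

```cpp
#include <cstddef>

template <class TYPE>
struct TNode {
    TYPE data_;
    TNode<TYPE>* left_;
    TNode<TYPE>* right_;
    TNode(const TYPE& data) : data_(data), left_(NULL), right_(NULL) {}
};

// Recursive insert; returns the root of the updated subtree.
// Assumption: a value already in the tree is ignored.
template <class TYPE>
TNode<TYPE>* insert(TNode<TYPE>* root, const TYPE& data) {
    if (!root) return new TNode<TYPE>(data);
    if (data < root->data_)
        root->left_ = insert(root->left_, data);
    else if (root->data_ < data)
        root->right_ = insert(root->right_, data);
    return root;
}

// Height as defined in these notes: the number of levels.
// An empty tree has height 0.
template <class TYPE>
int height(TNode<TYPE>* node) {
    if (!node) return 0;
    int l = height(node->left_);
    int r = height(node->right_);
    return 1 + (l > r ? l : r);
}

// Hypothetical helper: inserts each character of keys, in order,
// into an initially empty tree and returns the root.
// The caller owns the nodes.
TNode<char>* build(const char* keys) {
    TNode<char>* root = NULL;
    for (; *keys; ++keys) root = insert(root, *keys);
    return root;
}
```

With these definitions, height(build("BACDEFGHI")) is 8, while the same nine letters inserted as "ECGBDFHAI" give a height of 4.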