Threaded Binary Trees & Tree Balancing

Threaded Binary Trees
Given a binary tree with n nodes,
 the total number of links in the tree is 2n.
Each node (except the root) has exactly one incoming arc, so
 only n - 1 links point to nodes
 the remaining n + 1 links are null.
One can use these null links to simplify some traversal processes.
A threaded binary search tree is a BST in which the unused links are employed to
point to other tree nodes.
 Traversals (as well as other operations, such as backtracking)
are made more efficient.
A BST can be threaded with respect to inorder, preorder or postorder
successors.
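The link count at the start of this slide can be checked with a small sketch. The Node struct and the function names countNodes and countNullLinks are illustrative assumptions, not part of the original slides.

```cpp
#include <cassert>

// Minimal binary tree node, assumed for illustration only.
struct Node {
    int data;
    Node* left;
    Node* right;
};

// Number of nodes n in the tree rooted at node.
int countNodes(const Node* node) {
    return node ? 1 + countNodes(node->left) + countNodes(node->right) : 0;
}

// Number of null left/right links in a non-empty tree.
// Each null pointer reached during the descent counts as one null link.
int countNullLinks(const Node* node) {
    return node ? countNullLinks(node->left) + countNullLinks(node->right) : 1;
}
```

For any non-empty tree, countNullLinks(root) should equal countNodes(root) + 1, which matches the n + 1 null links derived above.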
Given the following BST, thread it to facilitate inorder traversal:
[Figure: a BST rooted at H containing the nodes A through M]
The first node visited in an inorder traversal is the leftmost leaf, node A.
Since A has no right child, the next node visited in this traversal is its
parent, node B.
Use the right pointer of node A as a thread to parent B to make backtracking
easy.
[Figure: the same BST, with the right pointer of A drawn as a thread to B]
The thread from A to B is shown as the arrow in the diagram above.
The next node visited is C, and since its right pointer is null, it also can be
used as a thread to its parent D:
[Figure: the BST with threads from A to B and from C to D]
Node D has a null right pointer which can be replaced with a pointer to D’s
inorder successor, node E:
[Figure: the BST with threads from A to B, from C to D, and from D to E]
The next node visited with a null right pointer is G. Replace the null pointer
with the address of G’s inorder successor: H
[Figure: the BST with threads from A to B, C to D, D to E, and G to H]
Finally, we replace:
first, the null right pointer of I with a pointer to its parent
and then, likewise, the null right pointer of J with a pointer to its parent
[Figure: the fully right-threaded BST, with threads from A to B, C to D, D to E, G to H, I to J, and J to K]
Algorithm to right-thread BST
Perform an inorder traversal of the BST.
Whenever a node x with a null right pointer is encountered,
 replace this right link with a thread to the inorder successor of x.
If x is a left child of S, then S is the inorder successor of x.
If x is not a left child, then the inorder successor S of x is
the nearest ancestor of x that contains x in its left subtree.
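The threading step described above can be sketched as follows. This is one possible formulation, not necessarily the one the slides intend: instead of searching for the successor of x, it remembers the previously visited node and, if that node had a null right link, points it at the node being visited now (which is its inorder successor). The Node struct and the name threadInorder are assumptions made for this example.

```cpp
#include <cassert>

// Simplified stand-in for the node class; char data is an assumption.
struct Node {
    char data;
    bool rightThread = false;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(char d) : data(d) {}
};

// Inorder walk. prev is the node visited just before the current one.
// When prev has a null right link, the current node is prev's inorder
// successor, so the null link is replaced by a thread to it.
void threadInorder(Node* node, Node*& prev) {
    if (node == nullptr) return;
    threadInorder(node->left, prev);
    if (prev != nullptr && prev->right == nullptr) {
        prev->right = node;
        prev->rightThread = true;
    }
    prev = node;
    // Only descend through a real child link, never through a thread.
    if (!node->rightThread) threadInorder(node->right, prev);
}
```

Call it with the root and a Node* variable initialized to nullptr. The last node in inorder has no successor, so its right link stays null.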
Algorithm to traverse a right-threaded BST inorder
1. Initialize a pointer p to the root of the tree.
2. While p is not null:
   a. While p->left is not null,
         p = p->left.
   b. Visit the node pointed to by p.
   c. While p->right is a thread, do:
         i. p = p->right.
         ii. Visit the node pointed to by p.
   d. p = p->right.
Must be able to distinguish between right links that are threads and those
that are pointers to right children:
 add a boolean field rightThread to the node declaration.
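A minimal C++ sketch of the numbered steps above. The Node struct is a simplified stand-in with char data, and the function name inorder is an assumption. "Visiting" a node here just appends its data to a string.

```cpp
#include <cassert>
#include <string>

// Simplified node with the rightThread flag; char data is an assumption.
struct Node {
    char data;
    bool rightThread = false;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(char d) : data(d) {}
};

// Non-recursive inorder traversal of a right-threaded BST.
// No stack is needed: threads lead back up to the next node to visit.
std::string inorder(Node* root) {
    std::string visited;
    Node* p = root;                          // step 1
    while (p != nullptr) {                   // step 2
        while (p->left != nullptr)           // step 2a
            p = p->left;
        visited += p->data;                  // step 2b
        while (p->rightThread) {             // step 2c
            p = p->right;
            visited += p->data;
        }
        p = p->right;                        // step 2d
    }
    return visited;
}
```

The tree passed in must already be right-threaded, with rightThread set to true on every thread link. Otherwise step 2c never fires and ancestors reached only through threads are skipped.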
class BinNode
{
public:
  DataType data;
  bool rightThread;
  BinNode *left,
          *right;

  BinNode()
  {
    rightThread = false;   // reset when tree threaded
    left = right = 0;
  }

  BinNode(DataType item)
  {
    data = item;
    rightThread = false;   // reset when tree threaded
    left = right = 0;
  }
}; // end of class BinNode declaration
Tree Balancing
Binary search trees are designed for fast searching.
But the order of insertion into a binary search tree determines the shape
of the tree and hence the efficiency with which the tree can be searched.
If it grows so that the tree is as “fat” as possible (fewest levels), then it
can be searched most efficiently.
However, if the tree grows “lopsided”, with many more items in one
subtree than another, then the search time degrades from logarithmic to
linear.
Optimally, a BST with n nodes should have about log2 n levels.
In the worst case, there could be n levels (one node per level), in which case
binary search degenerates to sequential search.
EXAMPLE: A BST OF STATE ABBREVIATIONS
Construct the BST of state abbreviations when states NY, IL, GA, RI,
MA, PA, DE, IN, VT, TX, OH, and WY are inserted, in this order, into
an empty tree.
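[Figure: the resulting BST. NY is the root with children IL and RI; IL has children GA and MA; RI has children PA and VT; DE, IN, and OH are the left children of GA, MA, and PA; TX and WY are the children of VT.]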
For such nicely balanced trees, the search time is O(log2 n), where n is
the number of nodes in the tree.
If the abbreviations were inserted alphabetically,
i.e. DE, GA, IL, IN, MA, NY, OH, PA, RI, TX, VT, WY,
DE would be the root, GA its right child, IL the right child of GA, IN the
right child of IL, etc.
 BST degenerates into a linked list
 search time increases to O(n)
Need a way to keep BST height-balanced!
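The effect of insertion order on the shape of the tree can be reproduced with a short sketch. It uses plain, unbalanced BST insertion. The Node struct and the names insert, levels, and levelsAfterInserting are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Minimal BST node keyed by a state abbreviation.
struct Node {
    std::string key;
    std::unique_ptr<Node> left, right;
    explicit Node(const std::string& k) : key(k) {}
};

// Standard (unbalanced) BST insertion: walk down to the null link
// where the key belongs and attach a new node there.
void insert(std::unique_ptr<Node>& root, const std::string& key) {
    std::unique_ptr<Node>* link = &root;
    while (*link) {
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
    }
    link->reset(new Node(key));
}

// Number of levels in the tree (0 for an empty tree).
int levels(const Node* node) {
    if (node == nullptr) return 0;
    return 1 + std::max(levels(node->left.get()), levels(node->right.get()));
}

// Builds a BST from keys in the given order and reports its level count.
int levelsAfterInserting(const std::vector<std::string>& keys) {
    std::unique_ptr<Node> root;
    for (const std::string& k : keys) insert(root, k);
    return levels(root.get());
}
```

With the twelve abbreviations in the order given in the example, the tree has 4 levels. With the same keys in alphabetical order, every node has only a right child and the tree has 12 levels.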
AVL Trees
An AVL tree is a binary tree that is height-balanced:
The difference in height between the left and right subtrees
at any point in the tree is restricted.
Define the balance factor of node x to be
the height of x’s left subtree minus the height of its right subtree
An AVL tree is a BST in which the balance factor of each node is
0, -1, or 1
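The balance factor definition can be turned into a direct check. The sketch below assumes an empty subtree has height 0 and recomputes heights on every call, which is simple but not efficient. The Node struct and function names are illustrative. It tests only the height condition, not the BST ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Minimal binary tree node, assumed for illustration.
struct Node {
    int data;
    Node* left;
    Node* right;
};

// Height in levels; an empty tree has height 0 (an assumed convention).
int height(const Node* node) {
    return node ? 1 + std::max(height(node->left), height(node->right)) : 0;
}

// Height of the left subtree minus height of the right subtree.
int balanceFactor(const Node* node) {
    return height(node->left) - height(node->right);
}

// True if every node has balance factor 0, -1, or 1.
bool isHeightBalanced(const Node* node) {
    if (node == nullptr) return true;
    return std::abs(balanceFactor(node)) <= 1
        && isHeightBalanced(node->left)
        && isHeightBalanced(node->right);
}
```

An AVL implementation would normally store the balance factor in each node and update it during insertion rather than recomputing heights this way.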
[Figure: sample trees, grouped into AVL trees and trees that are not AVL trees]
AVL tree class definition
template <typename DataType>
class AVLTree
{
private:
  class AVLNode
  {
  public:
    DataType data;
    short int balanceFactor;
    AVLNode *left,
            *right;

    AVLNode()
    {
      balanceFactor = 0;
      left = right = 0;
    }

    AVLNode(DataType item)
    {
      data = item;
      balanceFactor = 0;
      left = right = 0;
    }
  }; // end of class AVLNode declaration
  . . .
}; // end of class AVLTree declaration
Keeping AVL trees balanced
When a new item is inserted into a balanced binary tree,
 the resulting tree may become unbalanced
 the tree can be rebalanced:
 transform the subtree rooted at the node that is the nearest ancestor of
 the new node with an unacceptable balance factor.
This transformation is carried out by a rotation: the relative order of nodes in a
subtree is shifted, changing the root and the number of nodes in the left and
the right subtrees.
Four standard types of rotations are performed on AVL trees.
1. Simple right rotation:
Used when the new item is inserted in the left subtree of the left child B of the
nearest ancestor A with balance factor +2.
2. Simple left rotation:
Used when the new item is inserted in the right subtree of the right child B of the
nearest ancestor A with balance factor -2.
3. Left-right rotation:
Used when the new item is inserted in the right subtree of the left child B of the
nearest ancestor A with balance factor +2.
4. Right-left rotation:
Used when the new item is inserted in the left subtree of the right child B of the
nearest ancestor A with balance factor -2.
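The four cases above amount to a small decision table, sketched here. The enum Rotation and the function chooseRotation are hypothetical names introduced for this example. The caller is assumed to know the balance factor of A and whether the new item went into the left or right subtree of B.

```cpp
#include <cassert>

enum class Rotation { None, Right, Left, LeftRight, RightLeft };

// balanceOfA: balance factor (left height minus right height) of the
//             nearest ancestor A that became unbalanced.
// insertedInLeftSubtreeOfB: true if the new item went into the left
//             subtree of B, where B is A's child on the heavy side.
Rotation chooseRotation(int balanceOfA, bool insertedInLeftSubtreeOfB) {
    if (balanceOfA == 2) {   // B is the left child of A
        return insertedInLeftSubtreeOfB ? Rotation::Right
                                        : Rotation::LeftRight;
    }
    if (balanceOfA == -2) {  // B is the right child of A
        return insertedInLeftSubtreeOfB ? Rotation::RightLeft
                                        : Rotation::Left;
    }
    return Rotation::None;   // no rebalancing needed at this node
}
```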
Each of these rotations is performed by simply resetting some links.
For example, consider a simple right rotation, used when an item is inserted in
the left subtree of the left child B of the nearest ancestor A with balance
factor +2.
A simple right rotation involves resetting three links:
1. Reset the link from the parent of A so that it points to B.
2. Set the left link of A equal to the right link of B.
3. Set the right link of B to point to A.
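The three link resets can be sketched as follows. The Node struct and the name rotateRight are assumptions. The link from the parent of A is passed by reference, so the same code works whether A is the root or a child. Updating the stored balance factors is omitted from this sketch.

```cpp
#include <cassert>

// Minimal node, assumed for illustration; balance factors are left out.
struct Node {
    int data;
    Node* left;
    Node* right;
};

// parentLink is the pointer that currently points to A: either the
// tree's root pointer or the left/right link of A's parent.
void rotateRight(Node*& parentLink) {
    Node* a = parentLink;
    assert(a != nullptr && a->left != nullptr);  // A must have a left child B
    Node* b = a->left;
    parentLink = b;        // 1. link from the parent of A now points to B
    a->left = b->right;    // 2. left link of A takes the right link of B
    b->right = a;          // 3. right link of B points to A
}
```

A simple left rotation is the mirror image, with left and right exchanged.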
Rebalancing cost/benefit
Balanced binary trees with n nodes will have a depth of O(log2 n).
AVL trees thus guarantee search time will be O(log2 n).
The overhead involved in rebalancing the AVL tree is
justified when search operations exceed insertions:
faster searches compensate for slower insertions.
Empirical studies indicate that, on average, rebalancing is required for
approximately 45 percent of the insertions. Roughly one-half of these
require double rotations.
M-ary trees
Other trees, such as
parse trees, game trees, and genealogical trees,
need to be searched efficiently but have nodes with more than two children:
 there may be several different directions that a search path may follow
(rather than left if less, right if greater)
 search paths are shorter because there are fewer levels in the tree.
Definition:
An m-node in a search tree stores m - 1 data values k_1 < k_2 < ... < k_{m-1},
and has links to m subtrees T_1, ..., T_m, where for each i,
all data values in T_i < k_i <= all data values in T_{i+1}
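The ordering condition in the definition determines which subtree a search follows at each node. A minimal sketch, assuming integer keys stored in sorted order and a hypothetical MNode struct (children is empty for a leaf and otherwise holds one more entry than keys):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical m-node: keys are sorted, k_1 < ... < k_{m-1}.
struct MNode {
    std::vector<int> keys;
    std::vector<MNode*> children;  // empty for a leaf, else keys.size() + 1
};

// Searches the tree rooted at node for target.
bool contains(const MNode* node, int target) {
    while (node != nullptr) {
        // Find the first key that is not smaller than the target.
        std::size_t i = 0;
        while (i < node->keys.size() && target > node->keys[i]) ++i;
        if (i < node->keys.size() && node->keys[i] == target) return true;
        if (node->children.empty()) return false;  // leaf: nowhere to go
        // All values below keys[i] (and not below keys[i-1]) are in child i.
        node = node->children[i];
    }
    return false;
}
```

Within a node this scans the keys linearly. With many keys per node, a binary search over keys would be the usual refinement.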
2-3-4 tree
A 2-3-4 tree is a tree with the following properties:
1. Each node stores at most 3 data values.
2. Each internal node is a 2-node, a 3-node, or a 4-node.
3. All the leaves are on the same level.
Basic Operations:
construct, determine if empty, and search
insert a new item in the 2-3-4 tree so the result is a 2-3-4 tree
delete an item from the 2-3-4 tree so the result is a 2-3-4 tree
Expression trees
Links in 2-3-4 trees
Each node must have one link for each possible child,
even though most of the nodes will not use all of these links.
 the amount of “wasted” space may be quite large.
Say a 2-3-4 tree has n nodes.
Linked representation requires 4 links for each node, for a total of 4n links.
But only n - 1 of these links are used to connect the n nodes. Thus
 4n - (n - 1) = 3n + 1 of the links are null
 the fraction of unused links is (3n + 1) / 4n, approximately 75% of the links.
There is a special kind of binary tree, called a red-black tree,
that can be used to represent 2-3-4 trees.
Red-black trees
A binary search tree with two kinds of links, red and black,
which satisfies the following properties:
1. Every path from root to leaf has the same number of black links.
2. No path from root to leaf has two or more consecutive red links.
Basic Operations:
construct, determine if empty, search, and traverse as for BSTs
insert new item in the red-black tree and maintain red-black properties
delete item from the red-black tree and maintain red-black properties
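The two red-black properties can be checked with one recursive pass. The sketch below makes several assumptions: each node stores the color of the link from its parent, the root's (nonexistent) parent link is treated as black, and every path is taken to end at a null link. The names Node, blackLinks, and isRedBlack are illustrative, and BST ordering is not checked.

```cpp
#include <cassert>

enum ColorType { RED, BLACK };

// Simplified node: parentColor is the color of the link from the parent.
struct Node {
    int data;
    ColorType parentColor;
    Node* left;
    Node* right;
};

// Returns the number of black links on every path from the link into
// node down to a null link, or -1 if either property is violated.
int blackLinks(const Node* node, bool linkAboveIsRed) {
    if (node == nullptr) return 0;
    bool linkIsRed = (node->parentColor == RED);
    if (linkIsRed && linkAboveIsRed) return -1;  // two consecutive red links
    int l = blackLinks(node->left, linkIsRed);
    int r = blackLinks(node->right, linkIsRed);
    if (l < 0 || r < 0 || l != r) return -1;     // unequal black counts
    return l + (linkIsRed ? 0 : 1);
}

bool isRedBlack(const Node* root) {
    return blackLinks(root, false) >= 0;
}
```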
Red-black tree class
enum ColorType {RED, BLACK};
class RedBlackTreeNode
{
public:
  TreeElementType data;
  ColorType parentColor;   // RED if link from parent is red,
                           // BLACK otherwise
  RedBlackTreeNode *parent,
                   *left,
                   *right;
};
AVL rotations
AVL trees have been replaced in many applications
by 2-3-4 trees or red-black trees
AVL rotations are still used to keep a red-black tree balanced.
To construct a red-black tree, use top-down 2-3-4 tree insertion with 4-node
splitting during descent:
1. Search for a place to insert the new node.
(Keep track of its parent, grandparent, and great grandparent.)
2. When a 4-node q is found along the search path, split it as follows:
a. Change both links of q to black.
b. Change the link from the parent to red.
3. If there now are two consecutive red links
(from grandparent gp to parent p to q),
perform the appropriate AVL-type rotation,
as determined by the direction (LL, RR, LR, RL).
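Step 2 above, the color change that splits a 4-node, can be sketched as follows. In the red-black representation a 4-node is taken to be a node whose two child links are both red. The Node struct and the names isFourNode and splitFourNode are assumptions. Leaving the root's color untouched is also an assumption, since the root has no link from a parent. The rotation in step 3 is not shown.

```cpp
#include <cassert>

enum ColorType { RED, BLACK };

// Simplified node: parentColor is the color of the link from the parent.
struct Node {
    int data;
    ColorType parentColor;
    Node* parent;
    Node* left;
    Node* right;
};

// A 4-node appears as a node with two red child links.
bool isFourNode(const Node* q) {
    return q != nullptr && q->left != nullptr && q->right != nullptr
        && q->left->parentColor == RED && q->right->parentColor == RED;
}

// Step 2 of the insertion procedure: recolor the links around q.
void splitFourNode(Node* q) {
    assert(isFourNode(q));
    q->left->parentColor = BLACK;    // 2a. both links of q become black
    q->right->parentColor = BLACK;
    if (q->parent != nullptr)        // 2b. link from the parent becomes red
        q->parentColor = RED;        //     (skipped at the root: assumption)
}
```

After the split, the caller would check whether the link from the grandparent to the parent of q is also red and, if so, apply the rotation described in step 3.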
B-trees
A B-tree of order m is a generalization of a 2-3-4 tree,
used for external searching.
A tree with the following properties:
1. Each node stores at most m-1 data values.
2. Each internal node is a 2-node, a 3-node, … , or an m-node.
3. All the leaves are on the same level.
Basic Operations:
construct, determine if empty, and search
insert a new item in the B-tree so the result is a B-tree
delete an item from the B-tree so the result is a B-tree
[Figure: sample B-tree]