Chapter 5 : Trees

Instructors:
C. Y. Tang and J. S. Roger Jang
All the material is integrated from the textbook "Fundamentals of Data Structures in
C", with some supplements from the slides of Prof. Hsin-Hsi Chen (NTU).
Outline (1)


Introduction (5.1)
Binary Trees (5.2)





Binary Tree Traversals (5.3)
Additional Binary Tree Operations (5.4)
Threaded Binary Trees (5.5)
Heaps (5.6) & (Chapter 9)
Binary Search Trees (5.7)
Outline (2)





Selection Trees (5.8)
Forests (5.9)
Set Representation (5.10)
Counting Binary Trees (5.11)
References & Exercises
5.1 Introduction


What is a “Tree”?
For Example :

Figure 5.1 (a)


A binary tree of ancestors
Figure 5.1 (b)

The ancestry of modern European languages
The Definition of Tree (1)

A tree is a finite set of one or more
nodes such that :

(1) There is a specially designated node
called the root.
(2) The remaining nodes are partitioned into
n ≥ 0 disjoint sets T1, …, Tn, where each of
these sets is a tree.
We call T1, …, Tn the sub-trees of the root.
[Figure: a root with sub-trees T1, T2, …, Tn]
The Definition of Tree (2)


The root of this tree is node A. (Fig. 5.2)
Definitions:





Parent (A)
Children (E, F)
Siblings (C, D)
Root (A)
Leaf / Leaves

K, L, F, G, M, I, J…
The Definition of Tree (3)


The degree of a node is the number
of sub-trees of the node.
The level of a node:



Initially letting the root be at level one
For all other nodes, the level is the level of
the node’s parent plus one.
The height or depth of a tree is the
maximum level of any node in the tree.
Representation of Trees (1)

List Representation


The root comes first, followed by a list of subtrees
Example: (A(B(E(K,L),F),C(G),D(H(M),I, J)))
Node layout: | data | link 1 | link 2 | ... | link n |
A node must have a varying number of link
fields depending on the number of branches
Representation of Trees (2)

Left Child-Right Sibling
Representation


Fig.5.5
A Degree Two Tree


Rotate clockwise by 45°
A Binary Tree
Node layout: | data | left child | right sibling |
5.2 Binary Trees


A binary tree is a finite set of nodes that is
either empty or consists of a root and two
disjoint binary trees called the left sub-tree and
the right sub-tree.
Any tree can be transformed into a
binary tree.

By using the left child-right sibling
representation
The left and right subtrees are
distinguished
[Figure: a root with its left sub-tree and its right sub-tree]
Abstract Data Type
Binary_Tree (structure 5.1)
Structure Binary_Tree (abbreviated BinTree) is:
Objects: a finite set of nodes either empty or
consisting of a root node, left Binary_Tree, and
right Binary_Tree.
Functions:
For all bt, bt1, bt2 ∈ BinTree, item ∈ element
BinTree Create() ::= creates an empty binary tree
Boolean IsEmpty(bt) ::= if (bt == empty binary
tree) return TRUE else return FALSE
BinTree MakeBT(bt1, item, bt2) ::= return a binary tree
whose left subtree is bt1, whose right subtree is bt2,
and whose root node contains the data item
BinTree Lchild(bt) ::= if (IsEmpty(bt)) return error
else return the left subtree of bt
element Data(bt) ::= if (IsEmpty(bt)) return error
else return the data in the root node of bt
BinTree Rchild(bt) ::= if (IsEmpty(bt)) return error
else return the right subtree of bt
Special Binary Trees

Skewed Binary Trees


Fig.5.9 (a)
Complete Binary
Trees


Fig.5.9 (b)
This will be defined
shortly
Properties of Binary Trees (1)

Lemma 5.1 [Maximum number of nodes] :




(1) The maximum number of nodes on level i of
a binary tree is $2^{i-1}$, i ≥ 1.
(2) The maximum number of nodes in a binary
tree of depth k is $2^k - 1$, k ≥ 1.
The proof is by induction on i.
Lemma 5.2 :

For any nonempty binary tree, T, if n0 is the
number of leaf nodes and n2 the number of
nodes of degree 2, then n0 = n2 +1.
Properties of Binary Trees (2)


A full binary tree of
depth k is a binary
tree of depth k
having $2^k - 1$ nodes,
k ≥ 0.
A binary tree with n nodes and depth k is
complete iff its nodes correspond to the
nodes numbered from 1 to n in the full
binary tree of depth k.
Binary Tree Representation


Array Representation (Fig. 5.11)
Linked Representation (Fig. 5.13)
Array Representation

Lemma 5.3 : If a complete binary tree with n
nodes (depth = $\lfloor \log_2 n \rfloor + 1$) is represented
sequentially, then for any node with index i,
1 ≤ i ≤ n, we have:

(1) parent(i) is at $\lfloor i/2 \rfloor$, i ≠ 1.
(2) left-child(i) is at 2i, if 2i ≤ n.
(3) right-child(i) is at 2i+1, if 2i+1 ≤ n.
For complete binary trees, this representation is
ideal since it wastes no space. However, for the
skewed tree, less than half of the array is utilized.
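The index rules of Lemma 5.3 can be sketched as three small helpers. This is a minimal illustration, assuming 1-based indices as on the slide; the function names bt_parent, bt_left and bt_right are hypothetical, and returning 0 for "no such node" is a convention chosen here, not something from the textbook.

```c
#include <stddef.h>

/* 1-based index arithmetic for a complete binary tree with n nodes
   stored sequentially in an array (Lemma 5.3).
   Each helper returns 0 when the requested node does not exist. */

/* Parent of node i: floor(i / 2); the root (i == 1) has no parent. */
size_t bt_parent(size_t i) {
    return i > 1 ? i / 2 : 0;
}

/* Left child of node i: 2i, provided 2i <= n. */
size_t bt_left(size_t i, size_t n) {
    return 2 * i <= n ? 2 * i : 0;
}

/* Right child of node i: 2i + 1, provided 2i + 1 <= n. */
size_t bt_right(size_t i, size_t n) {
    return 2 * i + 1 <= n ? 2 * i + 1 : 0;
}
```

For a tree with n = 6 nodes, node 3 has a left child at index 6 but no right child, since 7 > n.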
Linked Representation
typedef struct node *tree_pointer;
struct node {
    int data;
    tree_pointer left_child, right_child;
};
5.3 Binary Tree Traversals

Traversing order : L, V, R






L : moving left
V : visiting the node
R : moving right
Inorder Traversal : LVR
Preorder Traversal : VLR
Postorder Traversal : LRV
For Example



Inorder Traversal : A / B * C * D + E
Preorder Traversal : + * * / A B C D E
Postorder Traversal : A B / C * D * E +
Inorder Traversal (1)

A recursive function starting from the root

Move left Visit node Move right
Inorder Traversal (2)
In-order Traversal :
A/B*C*D+E
Preorder Traversal

A recursive function starting from the root

Visit node Move left  Move right
Postorder Traversal

A recursive function starting from the root

Move left Move right Visit node
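The three recursive traversals can be sketched as follows. This is an illustrative version, not the textbook programs: the node type tnode, the helper make_node, the builder example_tree, and the choice to write visited characters into a caller-supplied buffer (instead of printing) are all assumptions made here so the result can be checked.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct tnode {
    char data;
    struct tnode *left_child, *right_child;
} tnode;

/* Hypothetical helper: allocate a node with the given children. */
tnode *make_node(char data, tnode *left, tnode *right) {
    tnode *p = malloc(sizeof *p);
    if (!p) { fprintf(stderr, "out of memory\n"); exit(1); }
    p->data = data;
    p->left_child = left;
    p->right_child = right;
    return p;
}

/* Each traversal appends the visited characters to out and
   returns the next free position in the buffer. */
char *inorder(const tnode *p, char *out) {      /* L V R */
    if (p) {
        out = inorder(p->left_child, out);
        *out++ = p->data;
        out = inorder(p->right_child, out);
    }
    return out;
}

char *preorder(const tnode *p, char *out) {     /* V L R */
    if (p) {
        *out++ = p->data;
        out = preorder(p->left_child, out);
        out = preorder(p->right_child, out);
    }
    return out;
}

char *postorder(const tnode *p, char *out) {    /* L R V */
    if (p) {
        out = postorder(p->left_child, out);
        out = postorder(p->right_child, out);
        *out++ = p->data;
    }
    return out;
}

/* Builds the expression tree for A / B * C * D + E used in the example. */
tnode *example_tree(void) {
    tnode *d  = make_node('/', make_node('A', NULL, NULL), make_node('B', NULL, NULL));
    tnode *m1 = make_node('*', d,  make_node('C', NULL, NULL));
    tnode *m2 = make_node('*', m1, make_node('D', NULL, NULL));
    return make_node('+', m2, make_node('E', NULL, NULL));
}
```

On the example tree the three functions produce A/B*C*D+E, +**/ABCDE and AB/C*D*E+, matching the sequences listed earlier.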
Other Traversals

Iterative Inorder Traversal



Using a stack to simulate recursion
Time Complexity: O(n), where n is the number of nodes.
Level Order Traversal


Visiting at each new level from the leftmost node to the right-most
Using Data Structure : Queue
Iterative In-order Traversal (1)
Iterative In-order Traversal (2)
Add “+” in stack
Add “*”
Add “*”
Add “/”
Add “A”
Delete “A” & Print
Delete “/” & Print
Add “B”
Delete “B” & Print
Delete “*” & Print
Add “C”
Delete “C” & Print
Delete “*” & Print
Add “D”
Delete “D” & Print
Delete “+” & Print
Add “E”
Delete “E” & Print
In-order Traversal :
A/B*C*D+E
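The add/delete trace above corresponds to a loop of roughly the following shape. This is a sketch under assumptions: the node type inode, the fixed capacity MAX_STACK, and returning the visit count (or -1 on stack overflow) are choices made for this illustration.

```c
#include <stddef.h>

#define MAX_STACK 100   /* assumed capacity of the explicit stack */

typedef struct inode {
    char data;
    struct inode *left_child, *right_child;
} inode;

/* Iterative inorder traversal using an explicit stack.
   Writes the visited characters into out (no terminator) and returns
   how many nodes were visited, or -1 if the stack would overflow. */
int iter_inorder(inode *root, char *out) {
    inode *stack[MAX_STACK];
    int top = -1, count = 0;
    inode *p = root;
    for (;;) {
        /* push the node and everything along its left spine */
        for (; p; p = p->left_child) {
            if (top == MAX_STACK - 1) return -1;
            stack[++top] = p;
        }
        if (top < 0) break;          /* stack empty: traversal finished */
        p = stack[top--];            /* delete from the stack and visit */
        out[count++] = p->data;
        p = p->right_child;          /* continue with the right subtree */
    }
    return count;
}
```

Every node is pushed once and popped once, which is where the O(n) bound comes from.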
Level Order Traversal (1)
Level Order Traversal (2)
Add “+” in Queue
Deleteq “+”
Addq “*”
Addq “E”
Deleteq “*”
Addq “*”
Addq “D”
Deleteq “E”
Deleteq “*”
Addq “/”
Addq “C”
Deleteq “D”
Deleteq “/”
Addq “A”
Addq “B”
Deleteq “C”
Deleteq “A”
Deleteq “B”
Level-order Traversal :
+*E*D/CAB
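The queue trace above can be reproduced with a short loop. This sketch assumes a node type qnode and a simple linear (non-circular) array queue of capacity MAX_QUEUE, so it handles trees of at most MAX_QUEUE nodes; both names are hypothetical.

```c
#include <stddef.h>

#define MAX_QUEUE 100   /* assumed capacity: at most this many nodes in total */

typedef struct qnode {
    char data;
    struct qnode *left_child, *right_child;
} qnode;

/* Level-order traversal using a queue.
   Writes the visited characters into out (no terminator) and returns
   how many nodes were visited, or -1 if the queue would overflow. */
int level_order(qnode *root, char *out) {
    qnode *queue[MAX_QUEUE];
    int front = 0, rear = 0, count = 0;
    if (!root) return 0;
    queue[rear++] = root;                      /* add the root */
    while (front < rear) {
        qnode *p = queue[front++];             /* delete from the front */
        out[count++] = p->data;
        if (p->left_child) {                   /* add children left to right */
            if (rear == MAX_QUEUE) return -1;
            queue[rear++] = p->left_child;
        }
        if (p->right_child) {
            if (rear == MAX_QUEUE) return -1;
            queue[rear++] = p->right_child;
        }
    }
    return count;
}
```

Because children are added left to right, each level comes out from its leftmost node to its rightmost node.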
5.4 Additional Binary Tree Operations

Copying Binary Trees


Testing for Equality of Binary Trees


Program 5.6
Program 5.7
The Satisfiability Problem (SAT)
Copying Binary Trees

Modified from postorder traversal program
Testing for Equality of Binary Trees

Equality: two binary trees having identical topology
and data are said to be equivalent.
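Copying and equality testing can both be written as short recursive functions. The versions below are a sketch in the spirit of Programs 5.6 and 5.7 rather than a transcription of them; the node type cnode and the function names copy_tree and equal_tree are assumptions, and error handling on allocation failure is deliberately minimal.

```c
#include <stdlib.h>

typedef struct cnode {
    int data;
    struct cnode *left_child, *right_child;
} cnode;

/* Returns a newly allocated copy of the tree rooted at original.
   Both subtrees are copied before the root's data is filled in,
   mirroring a postorder traversal. Returns NULL for an empty tree
   or if malloc fails for this node (a failed subtree copy is not
   detected in this sketch). */
cnode *copy_tree(const cnode *original) {
    cnode *temp;
    if (!original) return NULL;
    temp = malloc(sizeof *temp);
    if (!temp) return NULL;
    temp->left_child = copy_tree(original->left_child);
    temp->right_child = copy_tree(original->right_child);
    temp->data = original->data;
    return temp;
}

/* Returns 1 if the two trees have the same shape and the same data
   in corresponding nodes, 0 otherwise. */
int equal_tree(const cnode *first, const cnode *second) {
    if (!first && !second) return 1;          /* both empty */
    return first && second
        && first->data == second->data
        && equal_tree(first->left_child, second->left_child)
        && equal_tree(first->right_child, second->right_child);
}
```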
SAT Problem (1)

Formulas

Variables : X1, X2, …, Xn


Two possible values: True or False
Operators : And (∧), Or (∨), Not (¬)

A variable is an expression.
If x and y are expressions,
then ¬x, x ∧ y, x ∨ y are expressions.
Parentheses can be used to alter the normal
order of evaluation,
which is ¬ before ∧ before ∨.
SAT Problem (2)
SAT Problem (3)

The SAT problem



Is there an assignment of values to the variables
that causes the value of the expression to be true?
For n variables, there are $2^n$ possible
combinations of true and false.
The algorithm takes $O(g \cdot 2^n)$ time

g is the time required to substitute the true and
false values for variables and to evaluate the
expression.
SAT Problem (4)

Node Data Structure for SAT in C
SAT Problem (5)

An Enumeration Algorithm

Time Complexity : $O(2^n)$
SAT Problem (6)
/* Evaluate a propositional formula tree bottom-up (postorder). */
void post_order_eval(tree_pointer node) {
    if (node) {
        post_order_eval(node->left_child);
        post_order_eval(node->right_child);
        switch (node->data) {
        case not:   /* unary: the operand is the right child */
            node->value = !node->right_child->value;
            break;
        case and:
            node->value = node->right_child->value &&
                          node->left_child->value;
            break;
        case or:
            node->value = node->right_child->value ||
                          node->left_child->value;
            break;
        case true:
            node->value = TRUE;
            break;
        case false:
            node->value = FALSE;
            break;
        }
    }
}
5.5 Threaded Binary Trees (1)

Linked Representation of Binary Tree


more null links than actual pointers (waste!)
Threaded Binary Tree


Make use of these null links
Threads


Replace the null links by pointers (called threads)
If ptr -> left_thread = TRUE



Then ptr -> left_child is a thread (to the node before ptr)
Else ptr -> left_child is a pointer to left child
If ptr -> right_thread = TRUE


Then ptr -> right_child is a thread (to the node after ptr)
Else ptr -> right_child is a pointer to right child
5.5 Threaded Binary Trees (2)
typedef struct threaded_tree *threaded_pointer;
typedef struct threaded_tree {
    short int left_thread;
    threaded_pointer left_child;
    char data;
    short int right_thread;
    threaded_pointer right_child;
} threaded_tree;
5.5 Threaded Binary Trees (3)
Head node of
the tree
Actual
tree
Inorder Traversal of
a Threaded Binary Tree (1)


Threads simplify inorder traversal algorithm
An easy O(n) algorithm (Program 5.11.)

For any node, ptr, in a threaded binary tree

If ptr -> right_thread = TRUE
The inorder successor of ptr = ptr -> right_child
Else (ptr -> right_thread = FALSE)
Follow a path of left_child links from the right_child of ptr
until finding a node with left_thread = TRUE
Function insucc (Program 5.10.)

Finds the inorder successor of any node (without
using a stack)
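The successor rule and the traversal built on it can be sketched as follows, in the spirit of Programs 5.10 and 5.11. The sketch assumes the usual head-node convention (the head's left_child points to the actual tree, its right_child points back to the head with right_thread FALSE, and the outermost threads point to the head); writing the visited characters into a buffer is an assumption made here so the result can be checked.

```c
typedef struct threaded_tree *threaded_pointer;
struct threaded_tree {
    short int left_thread;        /* TRUE (1): left_child is a thread */
    threaded_pointer left_child;
    char data;
    short int right_thread;       /* TRUE (1): right_child is a thread */
    threaded_pointer right_child;
};

/* Inorder successor of tree in a threaded binary tree, without a stack. */
threaded_pointer insucc(threaded_pointer tree) {
    threaded_pointer temp = tree->right_child;
    if (!tree->right_thread)              /* real right child: go to its */
        while (!temp->left_thread)        /* leftmost descendant         */
            temp = temp->left_child;
    return temp;
}

/* Inorder traversal starting from the head node.
   Writes the visited characters into out and returns the node count. */
int tinorder(threaded_pointer head, char *out) {
    int count = 0;
    threaded_pointer temp = head;
    for (;;) {
        temp = insucc(temp);
        if (temp == head) break;          /* back at the head: done */
        out[count++] = temp->data;
    }
    return count;
}
```

Starting from the head, the first call to insucc lands on the leftmost node of the actual tree, and the last thread leads back to the head, which ends the loop.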
Inorder Traversal of
a Threaded Binary Tree (2)
Inserting a Node into
a Threaded Binary Tree

Insert a new node as a child of a parent node



Insert as a left child (left as an exercise)
Insert as a right child (see examples 1 and 2)
Is the original child node an empty subtree?

Empty child node (parent -> child_thread = TRUE)


See example 1
Non-empty child node (parent -> child_thread = FALSE)

See example 2
Inserting a node as the right child of
the parent node (empty case)





parent(B) -> right_thread = FALSE
child(D) -> left_thread & right_thread = TRUE
child -> left_child = parent
child -> right_child = parent -> right_child
parent -> right_child = child
[Figure: insertion steps (1) to (3), empty case]
Inserting a node as the right child of
the parent node (non-empty case)
[Figure: insertion steps (1) to (4), non-empty case]
Right insertion in a threaded
binary tree
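void insert_right(threaded_pointer parent,
                  threaded_pointer child) {
    threaded_pointer temp;
    /* (1) child inherits the parent's right link */
    child->right_child = parent->right_child;
    child->right_thread = parent->right_thread;
    /* (2) child's left link is a thread back to the parent */
    child->left_child = parent;
    child->left_thread = TRUE;
    /* (3) child becomes the parent's right child */
    parent->right_child = child;
    parent->right_thread = FALSE;
    /* (4) non-empty case: the inorder successor of child
       now threads back to child */
    if (!child->right_thread) {
        temp = insucc(child);
        temp->left_child = child;
    }
}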
5.6 Heaps


An application of complete binary tree
Definition

A max (or min) tree


a tree in which the key value in each node is no
smaller (or greater) than the key values in its
children (if any).
A max (or min) heap

a max (or min) complete binary tree
A max heap
Heap Operations

Creation of an empty heap
Insertion of a new element into the heap: $O(\log_2 n)$
Deletion of the largest element from
the (max) heap: $O(\log_2 n)$
PS. To build a heap: $O(n \log n)$
Application of Heap

Priority Queues
Insertion into a Max Heap (1)
(Figure 5.28)
Insertion into a Max Heap (2)
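void insert_max_heap(element item, int *n) {
    /* insert item into a max heap of current size *n */
    int i;
    if (HEAP_FULL(*n)) {
        fprintf(stderr, "The heap is full.\n");
        exit(1);
    }
    i = ++(*n);
    /* bubble up: shift smaller ancestors down one level */
    while ((i != 1) && (item.key > heap[i/2].key)) {
        heap[i] = heap[i/2];
        i /= 2;
    }
    heap[i] = item;
}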


the height of an n-node heap = $\lceil \log_2 (n+1) \rceil$
Time complexity = O(height) = $O(\log_2 n)$
Deletion from a Max Heap

Delete the max (root) from a max heap



Step 1 : Remove the root
Step 2 : Move the last element to the root
Step 3 : Heapify (Reestablish the heap)
Delete_max_heap (1)
element delete_max_heap(int *n)
{
    /* delete the element with the highest key from the heap */
    int parent, child;
    element item, temp;
    if (HEAP_EMPTY(*n)) {
        fprintf(stderr, "The heap is empty\n");
        exit(1);
    }
    /* save value of the element with the highest key */
    item = heap[1];
    /* use last element in heap to adjust heap */
    temp = heap[(*n)--];
Delete_max_heap (2)
    parent = 1; child = 2;
    while (child <= *n) {
        /* find the larger child of the current parent */
        if ((child < *n) &&
            (heap[child].key < heap[child+1].key))
            child++;
        if (temp.key >= heap[child].key) break;
        /* move the larger child up, then go to the next lower level */
        heap[parent] = heap[child];
        parent = child;
        child *= 2;
    }
    heap[parent] = temp;
    return item;
}
5.7 Binary Search Trees

Heap : search / delete arbitrary element


O(n) time
Binary Search Trees (BST)




Searching → O(h), h is the height of BST
Insertion → O(h)
Deletion → O(h)
Can be done quickly by both key value and
rank
Definition

A binary search tree is a binary tree, that
may be empty or satisfies the following
properties :



(1) every element has a unique key.
(2&3) The keys in a nonempty left (right) subtree must be smaller (larger) than the key in the
root of the subtree.
(4) The left and right sub-trees are also binary
search trees.
Searching a BST (1)
Searching a BST (2)

Time Complexity


search → O(h), h is the height of BST.
search2 → O(h)
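The two search functions named above, one recursive and one iterative, can be sketched as follows. The node type bnode is an assumption for this illustration, and an int key stands in for whatever element type the course code uses.

```c
#include <stddef.h>

typedef struct bnode {
    int data;
    struct bnode *left_child, *right_child;
} bnode;

/* Recursive search: returns the node holding key, or NULL if absent. */
bnode *search(bnode *root, int key) {
    if (!root) return NULL;
    if (key == root->data) return root;
    if (key < root->data)
        return search(root->left_child, key);
    return search(root->right_child, key);
}

/* Iterative search: same result, no recursion. */
bnode *search2(bnode *tree, int key) {
    while (tree) {
        if (key == tree->data) return tree;
        tree = key < tree->data ? tree->left_child : tree->right_child;
    }
    return NULL;
}
```

Each step moves one level down, so both take O(h) time for a tree of height h.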
Inserting into a BST (1)

Step 1 : Check that the key to be inserted is
different from those of existing elements

Run search function → O(h)
Step 2 : Run insert_node function

Program 5.17 → O(h)
Inserting into a BST (2)
void insert_node(tree_pointer *node, int num) {
    /* if num is not already in the tree, add a new node holding it */
    tree_pointer ptr, temp = modified_search(*node, num);
    if (temp || !(*node)) {
        ptr = (tree_pointer) malloc(sizeof(*ptr));
        if (IS_FULL(ptr)) {
            fprintf(stderr, "The memory is full\n");
            exit(1);
        }
        ptr->data = num;
        ptr->left_child = ptr->right_child = NULL;
        if (*node) {
            /* temp is the last node examined by the search */
            if (num < temp->data) temp->left_child = ptr;
            else temp->right_child = ptr;
        } else {
            *node = ptr;
        }
    }
}
Deletion from a BST

Delete a non-leaf node with two children

Replace it with the largest element in its left sub-tree
Or replace it with the smallest element in its right sub-tree
Then delete that element recursively, down to a leaf → O(h)
Height of a BST

The height of a binary search tree is
$O(\log_2 n)$ on average.

Worst case (skewed) → O(h) = O(n)
Balanced Search Trees

With a worst-case height of $O(\log_2 n)$
AVL Trees, 2-3 Trees, Red-Black Trees

Chapter 10
5.8 Selection Trees

Application Problem



Merge k ordered sequences into a single
ordered sequence
Definition: A run is an ordered sequence
Build a k-run Selection tree
Time Complexity


Number of levels in the selection tree → $\lceil \log_2 k \rceil + 1$
Time to restructure the tree after each output → $O(\log_2 k)$
Total time to merge n records → $O(n \log_2 k)$
For Example
Tree of losers

The previous selection tree is called a winner tree


Each node records the winner of the two children
Loser Tree





Leaf nodes represent the first record in each run
Each non-leaf node retains a pointer to the loser
Overall winner is stored in the additional node, node 0
Each newly inserted record is now compared with its
parent (not its sibling) → the loser stays, the winner moves up
without being stored.
Slightly faster than winner trees
Loser tree example
*Figure 5.36: Tree of losers corresponding to Figure 5.34 (p.235)
5.9 Forests


A forest is a set of n ≧ 0 disjoint trees.
T1, …, Tn is a forest of trees

Transforming a forest into a Binary Tree
B(T1, …, Tn)




(1) If n = 0, return the empty binary tree.
(2) Otherwise, the root is root(T1);
the left sub-tree is B(T11, T12, …, T1m), where
T11, T12, …, T1m are the sub-trees of root(T1);
the right sub-tree is B(T2, …, Tn).
Transforming a forest into a Binary Tree
Root(T1)
T11,T12, T13
B(T2, T3)
Forest Traversals
Pre-order :
In-order :
Post-order :
5.10 Set Representation


Elements : 0, 1, …, n -1.
Sets : S1, S2, …, Sm

pairwise disjoint


If Si and Sj are two sets and i ≠ j, then there is no
element that is in both Si and Sj.
Operations

Disjoint Set Union


Ex: S1 ∪ S2
Find (i )
Union Operation

Disjoint Set Union

S1 ∪ S2 = {0, 6, 7, 8, 1, 4, 9}
Implementation of the Data Structure
Union & Find Operation

Union(i, j)

parent[i] = j → j becomes the parent of i (and the root of the merged set)
Find(i)

while (parent[i] ≥ 0)

i = parent[i] → move up toward the root of the set
return i; → return the root of the set
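The simple union and find operations can be sketched as follows. This assumes the parent-array representation in which a root stores a negative value (here -1); the names init_sets, union1 and find1 and the capacity MAX_ELEMENTS are chosen for this illustration.

```c
#define MAX_ELEMENTS 10

int parent[MAX_ELEMENTS];   /* parent[i] < 0 means i is the root of its set */

/* Put every element in a set of its own. */
void init_sets(void) {
    int i;
    for (i = 0; i < MAX_ELEMENTS; i++)
        parent[i] = -1;
}

/* Follow parent links until a root is reached. */
int find1(int i) {
    while (parent[i] >= 0)
        i = parent[i];
    return i;
}

/* i and j must be roots of two different sets: j becomes the parent of i. */
void union1(int i, int j) {
    parent[i] = j;
}
```

Building S1 = {0, 6, 7, 8} and S2 = {1, 4, 9} and then calling union1(find1(8), find1(9)) leaves all seven elements under root 1.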
Performance



Run a sequence of union-find operations
Total n-1 unions → n-1 times, O(n)
Time of finds → $\sum_{i=2}^{n} i = O(n^2)$
Weighting rule for union(i, j)

If # of nodes in i < # of nodes in j


Then j becomes the parent of i
Else i becomes the parent of j
New Union Function

Prevent the tree from growing too high

To avoid the creation of degenerate trees

No node in T has level greater than $\lfloor \log_2 n \rfloor + 1$
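void union2(int i, int j) {
    /* i and j are roots; parent[root] = -(number of nodes in that tree) */
    int temp = parent[i] + parent[j];
    if (parent[i] > parent[j]) {
        parent[i] = j;      /* i has fewer nodes: make j the new root */
        parent[j] = temp;
    }
    else {
        parent[j] = i;      /* otherwise make i the new root */
        parent[i] = temp;
    }
}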
Figure 5.45 Trees achieving
worst case bound (p.245)
Collapsing Rule (for new find
function)


Definition: If j is a node on the path
from i to its root then make j a child of
the root
The new find function (see next slide):


Roughly doubles the time for an individual
find
Reduces the worst-case time over a
sequence of finds.
New Find Function

Collapse all nodes from i to the root

To lower the height of tree
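A find function with the collapsing rule can be sketched as follows. The name find2 and the helper init_sets are assumptions for this illustration; the array convention (a root stores the negated size of its tree) matches the weighted union shown earlier, which is repeated here only so the block is self-contained.

```c
#define MAX_ELEMENTS 16

int parent[MAX_ELEMENTS];   /* root r stores -(number of nodes in its tree) */

void init_sets(void) {
    int i;
    for (i = 0; i < MAX_ELEMENTS; i++)
        parent[i] = -1;
}

/* Weighted union of the trees rooted at i and j (i != j, both roots). */
void union2(int i, int j) {
    int temp = parent[i] + parent[j];
    if (parent[i] > parent[j]) {    /* i has fewer nodes */
        parent[i] = j;
        parent[j] = temp;
    } else {
        parent[j] = i;
        parent[i] = temp;
    }
}

/* Find with the collapsing rule: first locate the root, then make
   every node on the path from i a direct child of the root. */
int find2(int i) {
    int root, trail, lead;
    for (root = i; parent[root] >= 0; root = parent[root])
        ;
    for (trail = i; trail != root; trail = lead) {
        lead = parent[trail];
        parent[trail] = root;
    }
    return root;
}
```

The second pass is what roughly doubles the cost of a single find, and it is also what makes later finds on the same path take one step.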
Performance of New Algorithm

Let T(m, n) be the maximum time required to
process an intermixed sequence of m finds
(m ≥ n) and n - 1 unions. Then:

$k_1 m \alpha(m, n) \le T(m, n) \le k_2 m \alpha(m, n)$



k1, k2 : some positive constants
α(m, n) is a very slowly growing function and is a
functional inverse of Ackermann’s function A(p, q).
Function A(p, q) is a very rapidly growing function.
Equivalence Classes

Using the union-find algorithms to
process the equivalence pairs of
Section 4.6 (p.167)

Time: at most $O(m \alpha(2m, n))$
Uses less space
5.11 Counting Binary Trees

Three disparate problems having the same solution:

Determine the number of distinct binary trees
having n nodes (problem 1)
Determine the number of distinct
permutations of the numbers from 1 to n
obtainable by a stack (problem 2)
Determine the number of distinct ways of
multiplying n + 1 matrices (problem 3)
Distinct binary trees

N=1: only one binary tree
N=2: 2 distinct binary trees
N=3: 5 distinct binary trees
N=…
Stack Permutations (1)

Traversals of a binary tree




Pre-order : A B C D E F G H I
In-order : B C A E D G H F I
Is this binary tree unique?
Constructing this binary tree
Stack Permutations (2)


For a given preorder permutation 1, 2, 3, what are
the possible inorder permutations?
Possible inorder permutations obtainable by a stack:



(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 2, 1)
(3, 1, 2) is impossible
Each inorder permutation represents a distinct binary
tree
Matrix Multiplication (1)

The product of n matrices: M1 * M2 * … * Mn

Matrix multiplication is associative

Can be performed in any order
N = 3, 2 ways to perform:
(M1 * M2) * M3
M1 * (M2 * M3)
N = 4, 5 possibilities
Matrix Multiplication (2)


Let bn be the number of different ways
to compute the product of n matrices.
We have :
number of distinct binary trees

Approximation by solving the recurrence of
the equation
Solution : (when x →∞)
∵


∴
Simplification :
Approximation :
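As a numerical check on the counts quoted earlier (1, 2 and 5 binary trees for 1, 2 and 3 nodes; 2 and 5 ways to multiply 3 and 4 matrices), the standard recurrence for the number of binary trees, b(0) = 1 and b(n) = b(0)b(n-1) + b(1)b(n-2) + … + b(n-1)b(0), can be evaluated directly. The function name count_binary_trees and the cut-off at n = 35 (chosen to stay within 64-bit arithmetic) are assumptions of this sketch. The closed form of these numbers is the Catalan number C(2n, n) / (n + 1).

```c
#include <stdint.h>

/* Number of distinct binary trees with n nodes, computed from
   b(0) = 1, b(n) = sum over i = 0..n-1 of b(i) * b(n-1-i).
   Returns 0 for n outside 0..35 (the range this sketch supports). */
uint64_t count_binary_trees(int n) {
    uint64_t b[36] = {1};            /* b[0] = 1, remaining entries 0 */
    int k, i;
    if (n < 0 || n > 35) return 0;
    for (k = 1; k <= n; k++)
        for (i = 0; i < k; i++)      /* i nodes on the left, k-1-i on the right */
            b[k] += b[i] * b[k - 1 - i];
    return b[n];
}
```

The number of ways to multiply n matrices is count_binary_trees(n - 1), which gives 2 for n = 3 and 5 for n = 4.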
Heapsort—An optimal sorting algorithm

A heap : parent ≥ son

output the maximum and restore:

Heapsort:
Phase 1: construction
Phase 2: output
Phase 1: construction

input data: 4, 37, 26, 15, 48


restore the subtree rooted
at A(2):
restore the tree rooted at
A(1):
Phase 2: output
Implementation

using a linear array
not a binary tree.


The sons of A(h) are A(2h) and A(2h+1).
time complexity: O(n log n)
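The two phases can be sketched on a linear array as follows. This is an illustrative version, assuming a 1-based int array with a[0] unused; the names restore and heap_sort are chosen here (heap_sort rather than heapsort, to avoid clashing with a library function of that name on some systems).

```c
/* Restore the max-heap property for the subtree rooted at index root,
   assuming both of its subtrees are already heaps. Uses a[1..n]. */
void restore(int a[], int root, int n) {
    int temp = a[root];
    int child = 2 * root;
    while (child <= n) {
        if (child < n && a[child] < a[child + 1])
            child++;                     /* pick the larger son */
        if (temp >= a[child]) break;
        a[child / 2] = a[child];         /* move the son up one level */
        child *= 2;
    }
    a[child / 2] = temp;
}

/* Sorts a[1..n] into ascending order. */
void heap_sort(int a[], int n) {
    int i, t;
    /* Phase 1: construction, restoring subtrees from the last
       internal node A(n/2) back to the root A(1) */
    for (i = n / 2; i >= 1; i--)
        restore(a, i, n);
    /* Phase 2: output the maximum, shrink the heap, restore */
    for (i = n; i >= 2; i--) {
        t = a[1]; a[1] = a[i]; a[i] = t;
        restore(a, 1, i - 1);
    }
}
```

With the input data 4, 37, 26, 15, 48 stored in a[1..5], the array ends up as 4, 15, 26, 37, 48.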
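Time complexity
Phase 1: construction
$d = \lfloor \log n \rfloor$ : depth
# of comparisons is at most:
$$\sum_{L=0}^{d-1} 2(d-L)2^L = 2d\sum_{L=0}^{d-1} 2^L - 4\sum_{L=0}^{d-1} L\,2^{L-1}$$
(using $\sum_{L=0}^{k} L\,2^{L-1} = 2^k(k-1)+1$)
$$= 2d(2^d-1) - 4\left(2^{d-1}(d-1-1)+1\right)$$
$$= cn - 2\lfloor \log n \rfloor - 4, \quad 2 \le c \le 4$$
[Figure: level L holds at most $2^L$ nodes, each with at most d-L levels below it]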
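Time complexity
Phase 2: output
$$\sum_{i=1}^{n-1} 2\lfloor \log i \rfloor = \cdots = 2n\lfloor \log n \rfloor - 4cn + 4, \quad 2 \le c \le 4$$
$$= O(n \log n)$$
[Figure: a heap of i nodes has depth $\lfloor \log i \rfloor$]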
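Given: the pairwise distances between 4 cities
[Figure: weighted graph on cities 1, 2, 3, 4]
Minimum spanning tree problem
Find the most economical way to connect the four cities
[Figure: the same weighted graph on cities 1, 2, 3, 4]
Traveling Salesman Problem (TSP)
Find the shortest tour that starts at city 1 and returns to city 1
[Figure: the same weighted graph on cities 1, 2, 3, 4]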
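TSP is widely recognized as a hard problem
NP-Complete

Meaning: at present we cannot find an efficient
algorithm that works for all inputs

Knowing this avoids wasting time searching for a better algorithm
Ref: Horowitz & Sahni,
Fundamentals of Computer Algorithms,
p. 528.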


$2^n$ is quite frightening
Input size   n            n^2          2^n
10           0.00001 s    0.0001 s     0.001 s
30           0.00003 s    0.0009 s     17.9 min
50           0.00005 s    0.0025 s     35.7 years
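For problems such as the satisfiability problem,
only exponential algorithms are currently known; nobody has
found a polynomial algorithm (you might as well give up trying!)
Problems of this kind are NP-Complete problems
Garey & Johnson "Computers & Intractability"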
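Exhaustive enumeration (Enumerating)
(Think about it: which problems cannot be solved by enumeration?)
Traveling salesman problem: 3! tours for 4 cities, (n-1)! in general
Minimum spanning tree problem: 16 trees for 4 cities, $n^{n-2}$ in general (Cayley's Thm.)
Ref: Even, Graph Algorithms, pp. 26-28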



Labeled tree ↔ Number sequence
One-to-One Mapping
A labeled tree with N nodes can be represented by a
number sequence of length N-2.
Encoding: Data Compression.
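Labeled tree → Number sequence
In each iteration, cut off the lowest-numbered node among all current
leaves together with its edge, and record the node it was attached to
(the cut point). Continue until only one edge remains.
Example:
[Figure: a labeled tree on nodes 1 to 7]
Prune-sequence: 7,4,4,7,5 (cut points)

The node with the largest label is always on the last remaining edge.
The original degree of each node = the number of times the node
appears in the prune-sequence + 1.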
Number sequence → Labeled tree
Prune-sequence: 7,4,4,7,5
k            1  2  3  4  5  6  7
deg(k)       1  1  1  3  2  1  3
Iteration 1  0  1  1  3  2  1  2
Iteration 2  0  0  1  2  2  1  2
Iteration 3  0  0  0  1  2  1  2
Iteration 4  0  0  0  0  2  1  1
Iteration 5  0  0  0  0  1  0  1
Iteration 6  0  0  0  0  0  0  0
In each iteration, choose the lowest-numbered node with degree 1, connect it to the
corresponding node in the prune-sequence, and then decrease the degree of both nodes by 1.
[Figure: the tree after each iteration]
Iteration 1: add edge (1, 7)
Iteration 2: add edge (2, 4)
Iteration 3: add edge (3, 4)
Iteration 4: add edge (4, 7)
Iteration 5: add edge (6, 5)
Iteration 6: add edge (5, 7)
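The decoding procedure (number sequence to labeled tree) can be sketched as follows. The function name decode_prune_sequence, the bound MAX_NODES and the edge-array output format are assumptions of this illustration, and the input is assumed to be a valid sequence of length n-2 over the labels 1..n.

```c
#define MAX_NODES 64   /* assumed upper bound on the number of nodes */

/* Rebuilds the n-1 edges of a labeled tree on nodes 1..n from its
   prune-sequence seq[0..n-3]. Edge k is stored as edges[k][0], edges[k][1].
   Returns the number of edges written, or -1 if n is out of range. */
int decode_prune_sequence(const int *seq, int n, int edges[][2]) {
    int deg[MAX_NODES + 1];
    int v, k, leaf, u, w;
    if (n < 2 || n > MAX_NODES) return -1;
    /* degree = number of appearances in the sequence + 1 */
    for (v = 1; v <= n; v++) deg[v] = 1;
    for (k = 0; k < n - 2; k++) deg[seq[k]]++;
    for (k = 0; k < n - 2; k++) {
        /* lowest-numbered node whose remaining degree is 1 */
        for (leaf = 1; deg[leaf] != 1; leaf++)
            ;
        edges[k][0] = leaf;
        edges[k][1] = seq[k];
        deg[leaf]--;
        deg[seq[k]]--;
    }
    /* exactly two nodes of degree 1 remain: they form the last edge */
    for (u = 1; deg[u] != 1; u++)
        ;
    for (w = u + 1; deg[w] != 1; w++)
        ;
    edges[n - 2][0] = u;
    edges[n - 2][1] = w;
    return n - 1;
}
```

For the prune-sequence 7,4,4,7,5 with n = 7, the edges come out as (1,7), (2,4), (3,4), (4,7), (6,5), (5,7).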
Minimal spanning tree
Kruskal's Algorithm
[Figure: weighted graph on vertices A, B, C, D, E with edge weights 50, 65, 70, 75, 80, 90, 200, 300]
Begin
T <- null
While T contains fewer than n-1 edges:
choose an edge (v, w) from E of smallest weight 【 using a priority queue (heap): O(log n) 】,
delete (v, w) from E.
If adding (v, w) to T does not create a cycle in T 【 using union and find: O(log m) 】
then add (v, w) to T;
else discard (v, w).
Repeat.
End.
O(m log m)
m = # of edges
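The pseudocode above can be sketched in C as follows. Two things differ from the slide and are assumptions of this sketch: the edges are sorted once with qsort instead of being drawn from a heap-based priority queue (the total is still O(m log m)), and vertices are numbered 0..n-1. The names edge, kruskal, by_weight and find_root are hypothetical, and edge weights are assumed to be non-negative so that -1 can signal failure.

```c
#include <stdlib.h>

typedef struct {
    int v, w;       /* endpoints, numbered 0..n-1 */
    int weight;     /* assumed non-negative */
} edge;

/* qsort comparator: ascending by weight */
static int by_weight(const void *a, const void *b) {
    const edge *x = a, *y = b;
    return (x->weight > y->weight) - (x->weight < y->weight);
}

/* Root of the set containing i; roots store -(size of their set). */
static int find_root(const int *parent, int i) {
    while (parent[i] >= 0)
        i = parent[i];
    return i;
}

/* Sorts edges[0..m-1] in place, writes the chosen edges into tree
   (capacity n-1) and returns the total weight of the spanning tree,
   or -1 if the graph is not connected or memory runs out. */
long kruskal(edge *edges, int m, int n, edge *tree) {
    int *parent = malloc((size_t)n * sizeof *parent);
    long total = 0;
    int chosen = 0, i, k;
    if (!parent) return -1;
    for (i = 0; i < n; i++) parent[i] = -1;
    qsort(edges, (size_t)m, sizeof *edges, by_weight);
    for (k = 0; k < m && chosen < n - 1; k++) {
        int r1 = find_root(parent, edges[k].v);
        int r2 = find_root(parent, edges[k].w);
        if (r1 == r2) continue;          /* would create a cycle: discard */
        if (parent[r1] > parent[r2]) {   /* weighted union of the two sets */
            parent[r2] += parent[r1];
            parent[r1] = r2;
        } else {
            parent[r1] += parent[r2];
            parent[r2] = r1;
        }
        tree[chosen++] = edges[k];
        total += edges[k].weight;
    }
    free(parent);
    return chosen == n - 1 ? total : -1;
}
```

The cycle test is exactly the set view of the algorithm: an edge is accepted only when its two endpoints lie in different sets, after which the two sets are merged.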
A priority queue can be implemented with
heap operations: O(log n) each
Initial construction: O(n)
[Figure: a heap with nodes 1 to 7]
Tarjan: Union & Find can be made almost linear (amortized)
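Correctness

Suppose a minimal spanning tree is obtained without choosing the smallest edge
Adding the smallest edge to it creates a cycle
Deleting the largest edge on that cycle gives a tree with smaller cost
(Contradiction!)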
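Building a spanning tree can be viewed as
adding links to a spanning forest
1. Adding edge (2,3) is not allowed
2. Adding edge (1,4) is allowed
Another way to look at it:
S1 = {1,2,3}
S2 = {4,5}
The two endpoints of an edge must be in different sets
Find and Union on sets:
O(log n)
[Figure: a spanning forest on nodes 1 to 5]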