Trees

Binary Trees
Tree Example
Tree Structures



A tree is a hierarchical structure that places
elements in nodes along branches that
originate from a root.
Nodes in a tree are subdivided into levels in
which the topmost level holds the root node.
Any node in a tree can have multiple
successors at the next level.

Therefore, a tree is a nonlinear structure.
Tree Structures (continued)

Operating systems use a general tree to
maintain file structures.
Tree Terminology

Tree structure

Collection of nodes that originate from a unique
starting node called the root.


Each node consists of a value and a set of zero or more
links to successor nodes.
The terms parent and child describe the relationship
between a node and any of its successor nodes.
Tree Terminology (continued)
[Figure: a tree with nodes A through J, annotated with the terms root, parent, child, siblings, interior (internal) node, leaf node, and subtree]
Tree Terminology (continued)

Consists of nodes connected by edges

A tree is an instance of a more general category called
a graph (a later slideshow discusses graphs)
[Figure: a set of nodes, a set of edges, and nodes connected by edges that form a graph but not a tree]
Tree Terminology









Tree – recursively defined as empty or a root node with
zero or more sub-trees
Node – a holder for data plus edges to children
Edge – connects a parent node to a child node
Root – a pointer to the first node, if it exists, or NULL
Leaf node – a node with no children
Internal node – a node with at least one child (one or two in a binary tree)
Path – sequence of edges between two nodes
Height – length of the longest path in a tree from the root to a leaf
Depth – number of edges from root to a node
Tree Terminology (continued)

A path between a parent node P and any node N in
its subtree is a sequence of nodes

$P = X_0, X_1, \ldots, X_k = N$

k is the length of the path
Each node $X_i$ in the sequence is the parent of $X_{i+1}$ for $0 \le i \le k-1$.

Depth (also called Level) of a node

Length of the path from root to the node.
Equivalent - number of edges from root to the node.

Viewing a node as a root of its subtree, the height of a
node is the length of the longest path from the node to a
leaf in the subtree.
The height of a tree is the maximum level in the tree.
Tree Terminology (continued)
Height = 3
Sorted Binary Trees





In a binary tree, each parent has no more than two children
each node (item in the tree) has a value
a total order (linear order) is defined on these values
left subtree of a node contains only values less than the node's value
right subtree contains only values greater than or equal to the node's value.
[Figure: a sorted binary tree holding the values 21, 10, 5, 2, 34, 25, 7, 39, 33]
Binary Trees

A compiler builds unsorted binary trees while
parsing expressions in a program's source
code.
Binary Trees (continued)

Each node of a binary tree defines a left and
a right subtree. Each subtree is itself a tree.
Right child of T
Left child of T
Binary Trees (continued)

A recursive definition of a binary tree:

T is a binary tree if T


has no node (T is an empty tree)
or
has a root node with at most two subtrees, each of which is itself a binary tree.
Height of a Binary Tree


The height of a binary tree is the length of the longest path
from the root to a leaf node
Let TN be the subtree with root N and TL and TR be the roots
of the left and right subtrees of N. Then
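$$\text{height}(N) = \text{height}(T_N) = \begin{cases} -1 & \text{if } T_N \text{ is empty} \\ 1 + \max(\text{height}(T_L), \text{height}(T_R)) & \text{if } T_N \text{ is not empty} \end{cases}$$

A leaf node will always have a height of 0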
Height of a Binary Tree (concluded)
Degenerate binary tree (least dense)
Density of a Binary Tree

In a binary tree, the number of nodes at each
level falls within a range of values.

At level 0, there is 1 node, the root; at level 1 there
can be 1 or 2 nodes.
At any level k, the number of nodes is in the range
from 1 to $2^k$.
The number of nodes per level contributes to the
density of the tree.
Intuitively, density is a measure of the size of a tree
(number of nodes) relative to the height of the tree.
Density of a Binary Tree (continued)
Density of Binary Tree

If degenerate trees allowed:

Problem – search in basic binary tree is O(N)


Value could be anywhere in tree
No better than a list
Density of a Binary Tree (continued)

A complete binary tree of height h has all
possible nodes through level h-1, and the nodes
on depth h exist left to right with no gaps

Complete binary trees are excellent storage
structures due to packing a large number of
nodes near the root
Density of a Binary Tree (continued)
Density of a Binary Tree (continued)

Determine the minimum height of a complete
tree that holds n elements.



Through level h - 1, the total number of nodes is
$1 + 2 + 4 + \cdots + 2^{h-1} = 2^h - 1$
At depth h, the number of additional nodes ranges
from a minimum of 1 to a maximum of $2^h$.
Hence the number of nodes n in a complete binary
tree of height h ranges between
$2^h - 1 + 1 = 2^h \le n \le 2^h - 1 + 2^h = 2^{h+1} - 1 < 2^{h+1}$
Density of a Complete Binary Tree

After applying the logarithm base 2 to all terms
in the inequality, we have
$h \le \log_2 n < h + 1$
and conclude that a complete
binary tree with n nodes must
have height
$h = \text{int}(\log_2 n)$
search in complete binary tree is O(log2(n))
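As a quick check of the formula h = int(log2 n), the following minimal sketch computes the height of a complete binary tree from its node count. The class and method names are hypothetical and are not part of the slides' code.

```java
// Hypothetical helper class; the name is not from the slides.
final class CompleteTreeHeight {
    // Returns int(log2 n) for n >= 1: the height of a complete binary tree with n nodes.
    static int height(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // The position of the highest set bit of n equals floor(log2 n).
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 3, 4, 7, 8, 10}) {
            System.out.println("n = " + n + ", height = " + height(n));
        }
    }
}
```

For example, n = 7 fills levels 0 through 2 exactly, so the height is 2; adding an eighth node starts level 3.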
Binary Tree Nodes

Define a binary tree node as an instance of
the generic TNode class.

A node contains three fields.


The data value, called nodeValue.
The reference variables, left and right that identify the
left child and the right child of the node respectively.
Binary Tree Nodes (continued)

The TNode class allows us to construct
a binary tree as a collection of TNode objects.
TNode Class
// declared as an inner class within the class building the tree
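The class listing itself did not survive in this transcript. A minimal sketch of what such a node class could look like, using the field names nodeValue, left, and right given above, is shown here. The outer class name, the constructors, and the five-node example tree are assumptions for illustration.

```java
// Hypothetical outer class; the slides only name TNode and its three fields.
final class TreeBuilderSketch {
    // A binary tree node: a data value plus references to the left and right children.
    static class TNode<T> {
        T nodeValue;
        TNode<T> left, right;

        // Creates a leaf node (both children null).
        TNode(T item) {
            this(item, null, null);
        }

        // Creates a node with the given children (either may be null).
        TNode(T item, TNode<T> left, TNode<T> right) {
            this.nodeValue = item;
            this.left = left;
            this.right = right;
        }
    }

    // Builds a small tree from the bottom up: A has children B and C,
    // B has left child D, and C has left child E (the shape is an assumption).
    static TNode<Character> buildSmallTree() {
        TNode<Character> d = new TNode<>('D');
        TNode<Character> e = new TNode<>('E');
        TNode<Character> b = new TNode<>('B', d, null);
        TNode<Character> c = new TNode<>('C', e, null);
        return new TNode<>('A', b, c);
    }

    public static void main(String[] args) {
        TNode<Character> root = buildSmallTree();
        System.out.println("root = " + root.nodeValue
                + ", left = " + root.left.nodeValue
                + ", right = " + root.right.nodeValue);
    }
}
```

Children are created before their parents, so each constructor call can link to subtrees that already exist.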
Building a Binary Tree
Using previous slide’s
TNode class
Scanning a Binary Tree

Next issue:

How will you retrieve the
data stored in the tree?
Iterative Level-Order Scan

A level-order scan visits the root, then nodes on
level 1, then nodes on level 2, etc.
Iterative Level-Order Scan

A level-order scan is an iterative process that uses
a queue as an intermediate storage collection.


Initially, the root enters the queue.
Start a loop ending when the queue is empty




Remove a node from the queue
Perform some action with the node
Add its children onto the queue
Because siblings enter the queue during a visit of their
parent, the siblings (on the same level) will exit the queue
in successive iterations.
Notice in the next example how I'm using Java's Queue interface
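The original example code is not preserved here, so the following is a sketch of the steps just listed, written against Java's Queue interface. The class name, the method name levelOrder, and the sample tree (reconstructed from the scan orders given for the ten-node tree A through J) are assumptions.

```java
import java.util.LinkedList;
import java.util.Queue;

// Hypothetical class name; TNode and its fields follow the slides' naming.
final class LevelOrderSketch {
    static class TNode<T> {
        T nodeValue;
        TNode<T> left, right;

        TNode(T item, TNode<T> left, TNode<T> right) {
            this.nodeValue = item;
            this.left = left;
            this.right = right;
        }
    }

    // Visits nodes level by level and returns their values separated by spaces.
    static <T> String levelOrder(TNode<T> root) {
        StringBuilder s = new StringBuilder();
        if (root == null) {
            return "";
        }
        Queue<TNode<T>> q = new LinkedList<>();
        q.add(root);                              // initially, the root enters the queue
        while (!q.isEmpty()) {
            TNode<T> p = q.remove();              // remove a node from the queue
            s.append(p.nodeValue).append(' ');    // perform some action with the node
            if (p.left != null) q.add(p.left);    // add its children onto the queue
            if (p.right != null) q.add(p.right);
        }
        return s.toString().trim();
    }

    // Ten-node sample tree; its shape is inferred from the scan orders in the slides.
    static TNode<Character> sampleTree() {
        TNode<Character> j = new TNode<>('J', null, null);
        TNode<Character> h = new TNode<>('H', j, null);
        TNode<Character> g = new TNode<>('G', null, null);
        TNode<Character> d = new TNode<>('D', g, h);
        TNode<Character> b = new TNode<>('B', d, null);
        TNode<Character> e = new TNode<>('E', null, null);
        TNode<Character> i = new TNode<>('I', null, null);
        TNode<Character> f = new TNode<>('F', null, i);
        TNode<Character> c = new TNode<>('C', e, f);
        return new TNode<>('A', b, c);
    }

    public static void main(String[] args) {
        System.out.println(levelOrder(sampleTree()));
    }
}
```

LinkedList is used only because it is one standard implementation of Queue; any FIFO implementation would serve.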
Iterative Level-Order Scan (continued)
Iterative Level-Order Scan (continued)
Step 1: remove A then add
B and C into queue
Step 2: remove B then add
D into the queue
Step 3: remove C then add
E into the queue
Step 4: remove D
Step 5: remove E
Iterative Level-Order Scan (continued)
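Queue contents after each removal (front of the queue on the left):
start:    A
remove A: B C
remove B: C D
remove C: D E F
remove D: E F G H
remove E: F G H
remove F: G H I
remove G: H I
remove H: I J
remove I: J
remove J: (empty)
Notice the order in which nodes are removed from the queue and appended to the output s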
Recursive Binary Tree-Scan Algorithms


The stopping condition is current node == null
To scan a tree recursively




Visit and display the node (D)
scan the left subtree (L) and scan the right subtree (R)
The order in which you perform the D, L, R tasks determines the
order in which nodes are retrieved
In the following code, t is initially the reference to the root node:
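The scan code is not preserved in this transcript. The sketch below shows the three orderings of the D, L, R tasks, each returning the output string s. Only the name inorderDisplay appears in the slides; preorderDisplay, postorderDisplay, the class name, and the sample tree builder are assumed names.

```java
// Hypothetical class name; TNode and its fields follow the slides' naming.
final class RecursiveScanSketch {
    static class TNode<T> {
        T nodeValue;
        TNode<T> left, right;

        TNode(T item, TNode<T> left, TNode<T> right) {
            this.nodeValue = item;
            this.left = left;
            this.right = right;
        }
    }

    // L D R: left subtree, the node itself, right subtree.
    static <T> String inorderDisplay(TNode<T> t) {
        if (t == null) return "";                 // stopping condition
        return inorderDisplay(t.left) + t.nodeValue + " " + inorderDisplay(t.right);
    }

    // L R D: left subtree, right subtree, the node itself.
    static <T> String postorderDisplay(TNode<T> t) {
        if (t == null) return "";
        return postorderDisplay(t.left) + postorderDisplay(t.right) + t.nodeValue + " ";
    }

    // D L R: the node itself, left subtree, right subtree.
    static <T> String preorderDisplay(TNode<T> t) {
        if (t == null) return "";
        return t.nodeValue + " " + preorderDisplay(t.left) + preorderDisplay(t.right);
    }

    // Ten-node sample tree; its shape is inferred from the scan orders in the slides.
    static TNode<Character> sampleTree() {
        TNode<Character> j = new TNode<>('J', null, null);
        TNode<Character> h = new TNode<>('H', j, null);
        TNode<Character> g = new TNode<>('G', null, null);
        TNode<Character> d = new TNode<>('D', g, h);
        TNode<Character> b = new TNode<>('B', d, null);
        TNode<Character> e = new TNode<>('E', null, null);
        TNode<Character> i = new TNode<>('I', null, null);
        TNode<Character> f = new TNode<>('F', null, i);
        TNode<Character> c = new TNode<>('C', e, f);
        return new TNode<>('A', b, c);
    }

    public static void main(String[] args) {
        TNode<Character> t = sampleTree();
        System.out.println("Inorder:   " + inorderDisplay(t).trim());
        System.out.println("Postorder: " + postorderDisplay(t).trim());
        System.out.println("Preorder:  " + preorderDisplay(t).trim());
    }
}
```

The three methods differ only in where the node's own value is appended relative to the two recursive calls.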
Inorder Scan L D R

Scan is in order of visits to the left subtree,
the node's own value, and visits to the right
subtree
Inorder Scan: G D J H B A E C F I
inorderDisplay method call stack:
inorderDisplay(a)
  inorderDisplay(b)
    inorderDisplay(d)
      inorderDisplay(g)
        inorderDisplay(null)
        append g to s
        inorderDisplay(null)
      append d to s
      inorderDisplay(h)
        inorderDisplay(j)
          inorderDisplay(null)
          append j to s
          inorderDisplay(null)
        append h to s
        inorderDisplay(null)
    append b to s
    inorderDisplay(null)
  append a to s
  inorderDisplay(c)
    inorderDisplay(e)
      inorderDisplay(null)
      append e to s
      inorderDisplay(null)
    append c to s
    inorderDisplay(f)
      inorderDisplay(null)
      append f to s
      inorderDisplay(i)
        inorderDisplay(null)
        append i to s
        inorderDisplay(null)
Postorder Scan L R D

Scan is in order of visits to the left
subtree, visits to the right subtree,
and the node's own value
Scan order: G J H D B E I F C A
Can you write the method call stack for this?
Method Call Stack
for Postorder
postorderDisplay method call stack:
postorderDisplay(a)
  postorderDisplay(b)
    postorderDisplay(d)
      postorderDisplay(g)
        postorderDisplay(null)
        postorderDisplay(null)
        append g to s
      postorderDisplay(h)
        postorderDisplay(j)
          postorderDisplay(null)
          postorderDisplay(null)
          append j to s
        postorderDisplay(null)
        append h to s
// rest left to you…
Scan order: G J H D B E I F C A
Preorder Scan D L R

Scan is in order of the node's own value,
visits to the left subtree, and visits to the
right subtree
Scan order: A B D G H J C E F I
More Recursive Scanning Examples
Preorder (NLR):  A B D G C E H I F
Inorder (LNR):   D G B A H E I C F
Postorder (LRN): G D B H I E F C A
Visitor Design Pattern

When an action is needed on each element of a collection

Don't know in advance what the type of each element will be
Defines the visit() method which denotes what a visitor does
For a specific visitor pattern
1. Create a Visitor interface
2. Create a class that implements the Visitor interface
3. In class scanning a tree
Create an object of the class implementing the Visitor interface
During traversal, call Visitor object's visit() and pass the current value as an
argument
Visitor Design Pattern (continued)
1.
2.
Another possibility (requires T to be "Comparable"):
2.
3. scanInorder()

This recursive method scanInorder()
provides a generalized inorder traversal of a
tree that performs an action specified by a
visitor pattern.
Uses Visitor object's visit
instead of System.out.println
scanInOrder()

If Visitor parameter is VisitorOutput


Prints output to the console
If Visitor parameter is VisitMax

VisitMax parameter stores the tree element with
the max value after scanInOrder finishes
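The visitor code is not preserved in this transcript, so here is a sketch of how the pieces named above might fit together. Visitor, visit(), VisitorOutput, VisitMax, and scanInorder come from the slides; the outer class name, the getMax() accessor, and the sample tree are assumptions.

```java
// Hypothetical outer class name; getMax() and sampleTree() are assumed names.
final class VisitorSketch {
    // Step 1: the Visitor interface defines what a visitor does with one element.
    interface Visitor<T> {
        void visit(T item);
    }

    // Step 2a: a visitor that prints each element to the console.
    static class VisitorOutput<T> implements Visitor<T> {
        public void visit(T item) {
            System.out.print(item + " ");
        }
    }

    // Step 2b: a visitor that remembers the largest element seen (requires Comparable).
    static class VisitMax<T extends Comparable<? super T>> implements Visitor<T> {
        private T max = null;

        public void visit(T item) {
            if (max == null || item.compareTo(max) > 0) {
                max = item;
            }
        }

        public T getMax() {
            return max;
        }
    }

    static class TNode<T> {
        T nodeValue;
        TNode<T> left, right;

        TNode(T item, TNode<T> left, TNode<T> right) {
            this.nodeValue = item;
            this.left = left;
            this.right = right;
        }
    }

    // Step 3: generalized inorder scan that calls visit() instead of System.out.println.
    static <T> void scanInorder(TNode<T> t, Visitor<T> v) {
        if (t == null) return;
        scanInorder(t.left, v);
        v.visit(t.nodeValue);
        scanInorder(t.right, v);
    }

    // Small sample tree: 21 with children 10 and 34.
    static TNode<Integer> sampleTree() {
        TNode<Integer> l = new TNode<>(10, null, null);
        TNode<Integer> r = new TNode<>(34, null, null);
        return new TNode<>(21, l, r);
    }

    public static void main(String[] args) {
        TNode<Integer> root = sampleTree();
        scanInorder(root, new VisitorOutput<Integer>());
        System.out.println();
        VisitMax<Integer> vm = new VisitMax<>();
        scanInorder(root, vm);
        System.out.println("max = " + vm.getMax());
    }
}
```

The traversal code never changes; swapping the visitor object changes what the scan does with each value.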
Program B_Tree

Illustrates use of all the scanning methods
Computing Tree Height

Recall that the height of a binary tree can be
computed recursively.
$$\text{height}(T) = \begin{cases} -1 & \text{if } T \text{ is empty} \\ 1 + \max(\text{height}(T_L), \text{height}(T_R)) & \text{if } T \text{ is nonempty} \end{cases}$$
Computing Tree Height (continued)
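The recursive definition of height translates almost directly into code. This is a sketch under the assumption of a TNode with left and right fields as described earlier; the class name and sample tree are hypothetical.

```java
// Hypothetical class name; TNode follows the slides' naming.
final class TreeHeightSketch {
    static class TNode<T> {
        T nodeValue;
        TNode<T> left, right;

        TNode(T item, TNode<T> left, TNode<T> right) {
            this.nodeValue = item;
            this.left = left;
            this.right = right;
        }
    }

    // height(T) = -1 if T is empty, else 1 + max(height(TL), height(TR)).
    static <T> int height(TNode<T> t) {
        if (t == null) {
            return -1;
        }
        return 1 + Math.max(height(t.left), height(t.right));
    }

    // Degenerate three-node tree: A -> B -> C, each node a left child.
    static TNode<Character> sampleTree() {
        TNode<Character> c = new TNode<>('C', null, null);
        TNode<Character> b = new TNode<>('B', c, null);
        return new TNode<>('A', b, null);
    }

    public static void main(String[] args) {
        System.out.println("height = " + height(sampleTree()));
    }
}
```

Returning -1 for the empty tree is what makes a leaf come out with height 0.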
Copying a Binary Tree


Simple case – exact duplicate
Duplicate with additional information

Contain nodes with additional field

Possibility - references the parent - this allows a scan up the
tree along the path of parents
Copying a Binary Tree (continued)

Copy a tree using a postorder scan

Build the duplicate tree from the bottom up.
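A minimal sketch of the simple case (exact duplicate) follows. The method name copyTree and the class name are assumptions. Both subtrees are copied before the new parent node is created, which is the postorder, bottom-up construction described above.

```java
// Hypothetical class and method names; TNode follows the slides' naming.
final class CopyTreeSketch {
    static class TNode<T> {
        T nodeValue;
        TNode<T> left, right;

        TNode(T item, TNode<T> left, TNode<T> right) {
            this.nodeValue = item;
            this.left = left;
            this.right = right;
        }
    }

    // Postorder copy: duplicate the left subtree, then the right subtree,
    // then create the new parent that links to both copies.
    static <T> TNode<T> copyTree(TNode<T> t) {
        if (t == null) {
            return null;
        }
        TNode<T> newLeft = copyTree(t.left);
        TNode<T> newRight = copyTree(t.right);
        return new TNode<>(t.nodeValue, newLeft, newRight);
    }

    // Three-node sample tree: A with children B and C.
    static TNode<Character> sampleTree() {
        TNode<Character> b = new TNode<>('B', null, null);
        TNode<Character> c = new TNode<>('C', null, null);
        return new TNode<>('A', b, c);
    }

    public static void main(String[] args) {
        TNode<Character> original = sampleTree();
        TNode<Character> copy = copyTree(original);
        System.out.println("same node objects? " + (copy == original));
        System.out.println("copy root = " + copy.nodeValue);
    }
}
```

The copy shares the stored values with the original but none of its node objects.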
Clearing a Binary Tree


Clear a tree with a postorder scan
Remove the left and right subtrees before
removing the node.
Binary Search (Sorted) Trees
Binary Search Trees


Assume each data element has some key value
For every node



Key is greater than all keys found in the left subtree
Key less than all keys found in the right subtree
All nodes can be considered ordered (i.e., sorted)
BSTs - More


Average depth of a balanced tree is log2N
Function Definitions





Operation             Running time
Make_Empty            O(N)
Find                  O(log2N)
Find_Min / Find_Max   O(log2N)
Insert                O(log2N)
Remove                O(log2N)
Most operations on a binary search tree take time directly proportional to
the tree's height, so it is desirable to keep the height small. Ordinary
binary search (unbalanced) trees have the primary disadvantage that
they can attain very large heights in rather ordinary situations, such as
when the keys are inserted in order. The result is a data structure similar
to a linked list, making all operations on the tree expensive.
Binary Search (sorted) Trees



a total order (linear order) is defined on node or key values
left subtree of a node contains only values less than node or key value
right subtree contains only values greater than or equal to node or key value.
[Figure: a sorted binary tree holding the values 21, 10, 5, 2, 34, 25, 7, 39, 33]
Inserting into a Sorted Binary Tree


Create the new node (set children to null)
If first node in tree
    Make new node be the root
Else – determine where to insert the new node
    Set current node to root
    Loop until done
        If new node's value is greater than or equal to current node's value
            If current.right is null – end of a branch
                Set current.right to the new node
                Set done to true
            Else
                Set current to current.right
        Similar for less than on the left side…
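The insertion steps above can be sketched as follows. This is an illustration using int values for brevity; the class name, the Node type, and the inorder helper are assumptions. Values equal to the current node go to the right, matching the ordering rule stated for sorted binary trees.

```java
// Hypothetical class; uses int values for brevity rather than a generic type.
final class BstInsertSketch {
    static class Node {
        int value;
        Node left, right;            // children start out null

        Node(int value) {
            this.value = value;
        }
    }

    // Inserts value and returns the (possibly new) root.
    static Node insert(Node root, int value) {
        Node newNode = new Node(value);
        if (root == null) {
            return newNode;          // first node in the tree becomes the root
        }
        Node current = root;
        boolean done = false;
        while (!done) {
            if (value >= current.value) {
                if (current.right == null) {      // end of a branch
                    current.right = newNode;
                    done = true;
                } else {
                    current = current.right;
                }
            } else {
                if (current.left == null) {       // end of a branch
                    current.left = newNode;
                    done = true;
                } else {
                    current = current.left;
                }
            }
        }
        return root;
    }

    // Inorder scan; for a sorted binary tree this lists the values in ascending order.
    static String inorder(Node t) {
        if (t == null) return "";
        return inorder(t.left) + t.value + " " + inorder(t.right);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int v : new int[] {21, 10, 34, 5, 25, 39, 2, 7, 33}) {
            root = insert(root, v);
        }
        System.out.println(inorder(root).trim());
    }
}
```

An inorder scan of the result is a convenient check that the ordering property holds.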
Deleting from a Sorted Binary Tree

Several cases to consider

1. If delete leaf node
    Remove node from tree
2. If delete node with one child
    Delete node and replace with its child
3. If delete node with two children
    Find the successor of the node
    Copy the successor value into the deletion position
    Delete the successor node (cases 1 and 2 above)
    Successor will not have a left branch
    See following slides
Deleting from a Sorted Binary Tree

Delete node with two children example
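[Figure: deleting 25, a node with two children, from a tree rooted at 50. Before: 25 has children 15 and 35, and its successor 30 is marked "Delete this node". After: the successor value 30 is copied into the position of 25.]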
Deleting from a Sorted Binary Tree

How to find successor (node replacing deleted node)

[Figure: the tree rooted at 50 with 25 marked for deletion; the search for the successor starts at 35 and moves left to 30]
Start with deleted node's right child
Then follow path of left children to
the end – this is the successor
30 has no left child – this is the successor
The successor will never have a left child
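To make the three deletion cases concrete, here is a recursive sketch using int values. The class name, the Node type, and the helper methods are assumptions, and a recursive formulation was chosen for brevity; the slides do not say whether their version is recursive or iterative. The example rebuilds the tree from the slides and deletes 25.

```java
// Hypothetical class; a recursive formulation chosen for brevity.
final class BstRemoveSketch {
    static class Node {
        int value;
        Node left, right;

        Node(int value) {
            this.value = value;
        }
    }

    // Standard insertion; equal values go to the right.
    static Node insert(Node t, int value) {
        if (t == null) return new Node(value);
        if (value >= t.value) t.right = insert(t.right, value);
        else t.left = insert(t.left, value);
        return t;
    }

    // Removes one node holding value (if present) and returns the new subtree root.
    static Node remove(Node t, int value) {
        if (t == null) return null;                  // value not found
        if (value < t.value) {
            t.left = remove(t.left, value);
        } else if (value > t.value) {
            t.right = remove(t.right, value);
        } else {
            // Cases 1 and 2: leaf or one child, replace the node with its child (or null).
            if (t.left == null) return t.right;
            if (t.right == null) return t.left;
            // Case 3: two children. Start at the right child, follow left children to the end.
            Node successor = t.right;
            while (successor.left != null) {
                successor = successor.left;
            }
            t.value = successor.value;               // copy successor value into position
            t.right = remove(t.right, successor.value); // successor has no left child
        }
        return t;
    }

    static String inorder(Node t) {
        if (t == null) return "";
        return inorder(t.left) + t.value + " " + inorder(t.right);
    }

    // Rebuilds the example tree by inserting in an order that reproduces its shape.
    static Node sampleTree() {
        Node root = null;
        for (int v : new int[] {50, 25, 15, 35, 5, 20, 30, 40, 32, 31, 33}) {
            root = insert(root, v);
        }
        return root;
    }

    public static void main(String[] args) {
        Node root = remove(sampleTree(), 25);
        System.out.println("replacement = " + root.left.value);
        System.out.println(inorder(root).trim());
    }
}
```

After the deletion, 30 sits where 25 was, and 32 (the old right child of 30) becomes the left child of 35.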
Searching a Sorted Binary Tree for a Value


Set current to root
Loop while current is not null and current's value isn't the item's value
    If the item's value is less than current's value
        Set current to current.left
    Else
        Set current to current.right
If current is null, the item was not found
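A sketch of the search loop, using int values: smaller items are sought in the left subtree and larger items in the right. The class name, the Node type, the method name find, and the sample tree are assumptions.

```java
// Hypothetical class; uses int values for brevity.
final class BstFindSketch {
    static class Node {
        int value;
        Node left, right;

        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Returns the node holding item, or null if the item is not in the tree.
    static Node find(Node root, int item) {
        Node current = root;
        while (current != null && current.value != item) {
            if (item < current.value) {
                current = current.left;    // smaller values live in the left subtree
            } else {
                current = current.right;   // larger (or equal) values live on the right
            }
        }
        return current;                    // null means the item was not found
    }

    // Small sorted tree: 21 with subtrees (10: 5, -) and (34: 25, 39).
    static Node sampleTree() {
        Node n10 = new Node(10, new Node(5, null, null), null);
        Node n34 = new Node(34, new Node(25, null, null), new Node(39, null, null));
        return new Node(21, n10, n34);
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println("25 found? " + (find(root, 25) != null));
        System.out.println("26 found? " + (find(root, 26) != null));
    }
}
```

Each iteration descends one level, so the number of comparisons is bounded by the height of the tree plus one.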
Heaps


Chapter 22, Ford and Topp
Array based binary trees
Array-Based Binary Trees




A complete binary tree of depth d
Contains all possible nodes through level d-1
Nodes at level d in the leftmost positions in the tree.
An array a can be viewed as a complete binary tree




root is a[0]
first-level children are a[1] and a[2]
second-level children are a[3], a[4], a[5], a[6]
and so forth.
Array-Based Binary Trees (continued)
Integer[] arr = {5, 1, 3, 9, 6, 2, 4, 7, 0, 8};
Array-Based Binary Trees (concluded)

For element a[i] in an n-element array-based
binary tree:
Left child of a[i] is a[2*i + 1], undefined if 2*i + 1 ≥ n
Right child of a[i] is a[2*i + 2], undefined if 2*i + 2 ≥ n
Parent of a[i] is a[(i-1)/2], undefined if i = 0
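The index formulas can be checked with a few lines of code. In this sketch the helper names are hypothetical and -1 stands in for "undefined"; the array is the ten-element example from the earlier slide.

```java
// Hypothetical helper class; -1 is used here to signal "undefined".
final class ArrayTreeIndexSketch {
    // Index of the left child of a[i] in an n-element array-based tree, or -1.
    static int leftChild(int i, int n) {
        int c = 2 * i + 1;
        return c < n ? c : -1;
    }

    // Index of the right child of a[i], or -1.
    static int rightChild(int i, int n) {
        int c = 2 * i + 2;
        return c < n ? c : -1;
    }

    // Index of the parent of a[i], or -1 for the root.
    static int parent(int i) {
        return i == 0 ? -1 : (i - 1) / 2;
    }

    public static void main(String[] args) {
        Integer[] arr = {5, 1, 3, 9, 6, 2, 4, 7, 0, 8};
        int n = arr.length;
        int i = 4;   // arr[4] is 6
        System.out.println("left child index of 4:  " + leftChild(i, n));
        System.out.println("right child index of 4: " + rightChild(i, n));
        System.out.println("parent index of 4:      " + parent(i));
    }
}
```

With n = 10, the element at index 4 has a left child at index 9 but no right child, because index 10 is past the end of the array.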
Heaps


Array based tree structure
Max heap:
If B is a child node of A, then key(A) >= key(B)
Min heap:
The relationship is reversed: key(A) <= key(B)
Root is max if max heap, min if min heap
Heaps

A maximum heap is an array-based
tree in which the value of a parent is ≥ the value of its
children. A minimum heap uses the relation ≤.
Inserting into a Heap

Assume that an array with elements in
the index range 0 ≤ i < last, where last < n, forms a heap.

The new element will enter the array at index last with
the heap expanding by one element.
Inserting into a Heap (continued)

Move nodes on the path of parents down one
level until the item is assigned as a parent that
has heap ordering.
Path of parents for insertion of item = 50
Inserting into a Heap (continued)

The static method pushHeap() of the class Heaps inserts
a new value in the heap. The parameter list includes the
array arr, the index last, the new value item, and a
Comparator object of type Greater or Less indicating
whether the heap is a maximum or minimum heap.
Inserting into a Heap (continued)

The algorithm uses an iterative scan with variable
currPos initially set to last. At each step, compare the
value item with the value of the parent and if item is
larger, copy the parent value to the element at index
currPos and assign the parent index as the new value for
currPos. The effect is to move the parent down one
level. Stop when the parent is larger and assign item to
the position currPos.
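The listing for pushHeap() is not preserved here. The following sketch follows the description above but is not the Heaps class itself: it takes a plain java.util.Comparator, and by this sketch's own convention comp.compare(item, parent) > 0 means the item belongs above the parent, so Comparator.naturalOrder() yields a maximum heap. The class name is hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical class; not the Heaps class from the slides.
final class PushHeapSketch {
    // Inserts item into the heap occupying arr[0..last-1]; the heap grows to arr[0..last].
    // Assumed convention: comp.compare(a, b) > 0 means a belongs nearer the root than b.
    static <T> void pushHeap(T[] arr, int last, T item, Comparator<? super T> comp) {
        int currPos = last;
        int parentPos = (currPos - 1) / 2;
        while (currPos != 0) {
            if (comp.compare(item, arr[parentPos]) > 0) {
                arr[currPos] = arr[parentPos];   // move the parent down one level
                currPos = parentPos;
                parentPos = (currPos - 1) / 2;
            } else {
                break;                           // heap ordering holds at currPos
            }
        }
        arr[currPos] = item;
    }

    public static void main(String[] args) {
        Integer[] heap = new Integer[10];
        int last = 0;
        for (int v : new int[] {9, 12, 17, 30, 50}) {
            pushHeap(heap, last, v, Comparator.<Integer>naturalOrder());
            last++;
        }
        System.out.println(Arrays.toString(Arrays.copyOf(heap, last)));
    }
}
```

Note that item is written only once, at its final position; the loop shifts parents down rather than swapping at every level.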
Inserting into a Heap (continued)
Deleting from a Heap


Deletion from a heap is normally restricted to the root
only. Hence, the operation removes the maximum (or
minimum) element.
To erase the root of an n-element heap, exchange the
element at index n-1 and the root and filter the root down
into its correct position in the tree.
Deleting from a Heap (continued)
Deleting from a Heap (continued)
[Figure: filtering 18 down from the root arr[0] of a nine-element array-based tree, arr[0] through arr[8]]
adjustHeap()

The implementation of adjustHeap()
uses the integer variables currPos and childPos to scan
the path of children.


Let currPos = first and target = arr[first]. The iterative scan
proceeds until we reach a leaf node or target is ≥ the values of
the children at the current position.
Move currPos and childPos down the path of children in tandem.
Set childPos = index of the largest (smallest) of
arr[2*currPos + 1] and arr[2*currPos + 2].
Implementing popHeap()

The implementation first captures the root
and then exchanges it with the last value in
the heap (arr[last-1]). A call to adjustHeap()
reestablishes heap order in a heap which
now has index range
[0, last-1). Method popHeap() concludes by
returning the original root value.
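A sketch of adjustHeap() and popHeap() as described above follows. It is not the Heaps class from the slides: the parameter list of adjustHeap() is an assumption, and the comparator convention (comp.compare(a, b) > 0 means a belongs nearer the root) is chosen for this sketch so that Comparator.naturalOrder() gives a maximum heap.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical class; not the Heaps class from the slides.
final class PopHeapSketch {
    // Filters arr[first] down within the range [first, last) until heap order holds.
    // Assumed convention: comp.compare(a, b) > 0 means a belongs nearer the root than b.
    static <T> void adjustHeap(T[] arr, int first, int last, Comparator<? super T> comp) {
        int currPos = first;
        T target = arr[first];
        int childPos = 2 * currPos + 1;
        while (childPos < last) {
            int rightPos = childPos + 1;
            // pick the child that belongs nearer the root
            if (rightPos < last && comp.compare(arr[rightPos], arr[childPos]) > 0) {
                childPos = rightPos;
            }
            if (comp.compare(arr[childPos], target) > 0) {
                arr[currPos] = arr[childPos];    // move the child up one level
                currPos = childPos;
                childPos = 2 * currPos + 1;
            } else {
                break;                           // target is in heap order here
            }
        }
        arr[currPos] = target;
    }

    // Removes and returns the root of the heap in arr[0..last-1];
    // afterwards the heap occupies the index range [0, last-1).
    static <T> T popHeap(T[] arr, int last, Comparator<? super T> comp) {
        T top = arr[0];
        arr[0] = arr[last - 1];                  // exchange root and last element
        arr[last - 1] = top;
        adjustHeap(arr, 0, last - 1, comp);
        return top;
    }

    public static void main(String[] args) {
        Integer[] heap = {50, 30, 12, 9, 17};
        Integer top = popHeap(heap, heap.length, Comparator.<Integer>naturalOrder());
        System.out.println("popped " + top + ", array now " + Arrays.toString(heap));
    }
}
```

The popped value is left at index last-1, outside the shrunken heap, which is the property the heap sort relies on.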
Complexity of Heap Operations

A heap stores elements in an
array-based tree that is a complete tree. The
pushHeap() and adjustHeap() operations
reorder elements in the tree by moving up the
path of parents for push() and down the path
of largest (smallest) children for pop().
Assuming the heap has n elements, the
maximum length for a path between a leaf
node and the root is int(log2 n), so the runtime
efficiency of the algorithms is O(log2 n)
Sorting with a Heap

If the original array is a maximum heap, an
efficient sorting algorithm can be devised.


For each iteration i, the largest element is arr[0].
Exchange arr[0] with arr[i] and then reorder the
array so that elements in the index range [0, i) are
a heap. This is precisely the action of popHeap(),
which is an O(log2n) algorithm.
By transforming an arbitrary array into a heap, this
algorithm will sort the array.
Building a Heap

Transforming an arbitrary array into a heap is
called "heapifying" the array.


The method makeHeap() in the Heaps class
performs this transformation.
Turn an n-element array into a heap by filtering
down each parent in the tree beginning with the
last parent at index
(n-2)/2 and ending with the root node at index 0
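The heapifying loop can be sketched as below. This is an illustration rather than the Heaps class itself: the adjustHeap() helper, its parameter list, and the comparator convention (comp.compare(a, b) > 0 means a belongs nearer the root, so Comparator.naturalOrder() gives a maximum heap) are assumptions of the sketch.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical class; not the Heaps class from the slides.
final class MakeHeapSketch {
    // Filters arr[first] down within [first, last) until heap order holds.
    static <T> void adjustHeap(T[] arr, int first, int last, Comparator<? super T> comp) {
        int currPos = first;
        T target = arr[first];
        int childPos = 2 * currPos + 1;
        while (childPos < last) {
            int rightPos = childPos + 1;
            if (rightPos < last && comp.compare(arr[rightPos], arr[childPos]) > 0) {
                childPos = rightPos;
            }
            if (comp.compare(arr[childPos], target) > 0) {
                arr[currPos] = arr[childPos];
                currPos = childPos;
                childPos = 2 * currPos + 1;
            } else {
                break;
            }
        }
        arr[currPos] = target;
    }

    // Filters down each parent, from the last parent at (n-2)/2 back to the root at 0.
    static <T> void makeHeap(T[] arr, Comparator<? super T> comp) {
        int n = arr.length;
        for (int i = (n - 2) / 2; i >= 0; i--) {
            adjustHeap(arr, i, n, comp);
        }
    }

    public static void main(String[] args) {
        Integer[] arr = {9, 12, 17, 30, 50, 20, 60, 65, 4, 19};
        makeHeap(arr, Comparator.<Integer>naturalOrder());
        System.out.println(Arrays.toString(arr));
    }
}
```

For this ten-element array the first call, at index 4, changes nothing; the calls at indices 3 and 2 each move one value down a level; the calls at indices 1 and 0 filter 12 and 9 down to the bottom level.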
Building a Heap (continued)
Integer[] arr = {9, 12, 17, 30, 50, 20, 60, 65, 4, 19};
Building a Heap (continued)
[Figure: three steps of makeHeap() on the array above]
(a) adjustHeap() at 4: no changes
(b) adjustHeap() at 3: move 30 down 1 level
(c) adjustHeap() at 2: move 17 down 1 level
Heapsort

The heap sort is a modified version of the
selection sort for an array arr that is a heap.


For each i = n, n-1, ..., 2, call popHeap() which pops
arr[0] from the heap and assigns it at index i-1.
With a maximum heap, the array is sorted in
ascending order. A minimum heap sorts the array in
descending order.
Heapsort (continued)
// requires: import java.util.Comparator;
public static <T> void
heapSort(T[] arr, Comparator<? super T> comp)
{
    // "heapify" the array arr
    Heaps.makeHeap(arr, comp);
    int i, n = arr.length;
    // iteration that determines elements
    // arr[n-1] ... arr[1]
    for (i = n; i > 1; i--)
    {
        // call popHeap() to move the root of the
        // remaining heap to arr[i-1]
        Heaps.popHeap(arr, i, comp);
    }
}
Heapsort (concluded)



A mathematical analysis shows that the worst
case running time of makeHeap() is O(n).
During the second phase of the heap sort,
popHeap() executes n - 1 times. Each
operation has efficiency O(log2 n).
The worst-case complexity of the heap sort is
O(n) + O(n log2 n) = O(n log2 n).
Implementing a Priority Queue

Recall that the HeapPQueue class implements
the PQueue interface.


The class uses a heap as the underlying storage
structure.
The user is free to specify either a Less or Greater
comparator which dictates whether a deletion removes
the minimum or the maximum element from the
collection.
Implementing a Priority Queue (continued)
The HeapPQueue Class
// requires: import java.util.Comparator;
public class HeapPQueue<T> implements PQueue<T>
{
    // heapElt holds the priority queue elements
    private T[] heapElt;
    // number of elements in the priority queue
    private int numElts;
    // Comparator used for comparisons
    private Comparator<T> comp;
    // create an empty maximum priority queue
    // (a Greater comparator selects a maximum heap)
    public HeapPQueue()
    {
        comp = new Greater<T>();
        numElts = 0;
        heapElt = (T[]) new Object[10];
    }
    . . .
}
HeapPQueue Class peek()
// return the highest priority item
// Precondition: the priority queue is not empty;
// if it is empty, throws NoSuchElementException
public T peek()
{
    // check for an empty heap
    if (numElts == 0)
        throw new NoSuchElementException(
            "HeapPQueue peek(): empty queue");
    // return the root of the heap
    return heapElt[0];
}
HeapPQueue Class pop()
// erase the highest priority item and return it
// Precondition: the priority queue is not empty;
// if it is empty, throws NoSuchElementException
public T pop()
{
    // check for an empty priority queue
    if (numElts == 0)
        throw new NoSuchElementException(
            "HeapPQueue pop(): empty queue");
    // pop the heap and save the return value in top
    T top = Heaps.popHeap(heapElt, numElts, comp);
    // heap has one less element
    numElts--;
    return top;
}
HeapPQueue Class push()
// insert item into the priority queue
public void push(T item)
{
    // if the current capacity is used up, reallocate
    // with double the capacity
    if (numElts == heapElt.length)
        enlargeCapacity();
    // insert item into the heap
    Heaps.pushHeap(heapElt, numElts, item, comp);
    numElts++;
}