Lecture9

CS 221
Analysis of Algorithms
Ordered Dictionaries and Search Trees
 Portions of these slides come from
 Michael Goodrich and Roberto Tamassia,
Algorithm Design: Foundations, Analysis
and Internet Examples, 2002, John Wiley
and Sons.
 and its authors, Michael Goodrich and
Roberto Tamassia,
 the book's publisher, John Wiley & Sons
 and…
 www.wikipedia.org
Reading material
 Goodrich and Tamassia, 2002
 Chapter 2, section 2.5, pages 114-137
 see also section 2.6
 Chapter 3, section 3.1, pages 141-151
 Wikipedia:
 http://en.wikipedia.org/wiki/AVL_trees
in the previous episode…
 …we defined a data structure which we
called a dictionary. It was…
 a container to hold multiple objects or in
Goodrich and Tamassia’s terminology “items”
 each item = a (key, element) pair
 element = a “piece” of data
 think = name, address, phone number
 key = a value we associate with the element to help
us find, retrieve, delete, etc. that element
 think = RDBMS autoincrement key, student ID#
Dictionaries
 Up until now we looked at
 Unordered dictionaries
 container for (k,e) pairs but…
 in no particular order
 Logfiles
 Hash Tables
Dictionaries
 A terminology note
 for purposes of our discussion –
 A linear unordered dictionary = logfile
 A linear ordered dictionary = lookup table
Game Time
 Twenty Questions
 One person thinks of an object that can
be any person, place or thing…
 and does not disclose the selected object
until it is specifically identified by the
other players…
 All other players take turns asking
Yes/No questions in an attempt to
identify the mystery object
Game Time
 Twenty Questions
 An efficient problem solving strategy is
to ask questions for which the answers
will optimally narrow the size of the
problem space (possible solutions)
 for example,
 Q: Is it a person?
 A: Yes ….we just eliminated all places and
non-human objects from the solution set
Game Time
 Twenty Questions
 Size of problem?
 N=??? large ~∞
 Yes/No attack makes this a binary search
problem…
 So, what size of problem space can we
effectively search?
 220
Game Time
 Twenty Questions
 Something to think about…
 N is conceivably much larger than 2^20
 So, how is it that we can usually solve this
problem in 20 steps or fewer…
 i.e. correctly identify the mystery object
Dictionaries
 Ordered Dictionaries
 suppose the items in a dictionary are
ordered (sorted)
 like low to high
 Would that make a difference in terms of the following operations?
 size()
 isEmpty()
 findElement()
 insertItem()
 removeItem()
Dictionaries
 Ordered Dictionaries
 suppose we implement an ordered dictionary as
a linear data structure or more specifically a
vector
 items are in vector in key order
 we gain considerable efficiency because we can
visit D[x], where x is a rank, in O(1) time
 Can we achieve the same findElement()
time if the ordered dictionary were implemented
as a linked list?
Binary Search
Binary search performs operation findElement(k) on a
dictionary implemented by means of an array-based
sequence, sorted by key
 similar to the high-low game
 at each step, the number of candidate items is halved
 terminates after O(log n) steps
Example: findElement(7)
[Figure: four successive steps of binary search for key 7 in a sorted array of keys ranging from 0 to 19. The markers l, m, and h show the low, middle, and high positions at each step; the search ends with l = m = h at key 7.]
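The halving step can be sketched in Python as follows. This is a minimal illustration, not the textbook's code: the dictionary is assumed to be a list of (key, element) pairs sorted by key, `NO_SUCH_KEY` is modelled as `None`, and the sample keys and element strings are made up for the example.

```python
NO_SUCH_KEY = None  # assumption: the "not found" sentinel is modelled as None


def find_element(items, k):
    """Binary search over a list of (key, element) pairs sorted by key."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        mid_key = items[mid][0]
        if k == mid_key:
            return items[mid][1]
        if k < mid_key:
            high = mid - 1   # discard the upper half
        else:
            low = mid + 1    # discard the lower half
    return NO_SUCH_KEY


keys = [1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19]   # hypothetical sample keys
table = [(key, f"elem{key}") for key in keys]
print(find_element(table, 7))   # elem7
print(find_element(table, 6))   # None
```

Each loop iteration discards at least half of the remaining range, which is where the O(log n) bound comes from.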
Binary Search
Method            Logfile    Lookup Table
findElement       O(n)       O(log n)
insertItem        O(1)       O(n)
removeElement     O(n)       O(n)
closestKeyBefore  O(n)       O(log n)
 Lookup tables are not very efficient for dynamic data (lots of
insertItem and removeElement operations)
 Lookup tables are efficient for dictionaries where the
predominant access is findElement, with relatively few
inserts or removes
 e.g. credit card authorizations, code translation tables,…
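The trade-off can be seen in a small sorted-vector sketch. It assumes a lookup table kept as two parallel Python lists and uses the standard `bisect` module for the binary search; the class and method names are illustrative, and `closest_key_before` is assumed to return the largest key less than or equal to k.

```python
import bisect


class LookupTable:
    """Ordered dictionary kept as a sorted vector (illustrative sketch)."""

    def __init__(self):
        self._keys = []    # keys in ascending order
        self._elems = []   # _elems[i] is the element stored with _keys[i]

    def find_element(self, k):
        i = bisect.bisect_left(self._keys, k)        # O(log n) binary search
        if i < len(self._keys) and self._keys[i] == k:
            return self._elems[i]
        return None

    def insert_item(self, k, e):
        i = bisect.bisect_left(self._keys, k)        # O(log n) to find the rank...
        self._keys.insert(i, k)                      # ...but O(n) to shift items over
        self._elems.insert(i, e)

    def closest_key_before(self, k):
        i = bisect.bisect_right(self._keys, k)       # O(log n)
        return self._keys[i - 1] if i > 0 else None


t = LookupTable()
for key in (14, 3, 9, 7):
    t.insert_item(key, str(key))
print(t.find_element(9))         # 9
print(t.closest_key_before(8))   # 7
```

Finding the insertion rank is fast; shifting the later items to make room is what makes insertItem linear.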
Binary Search Tree
 Binary tree for holding (k,e) items,
such that…
 each internal node v stores an element e with
key k
 keys stored in the left subtree of v are <= the key of v
 keys stored in the right subtree of v are >= the key of v
 external nodes store no elements…
 only a placeholder (NULL_NODE)
Binary Search Tree
 Every key in a left
subtree is less
than its parent's key
 Every key in a right
subtree is
greater than its
parent's key
 All leaf nodes
hold no items
[Figure: example binary search tree holding the keys 58, 31, 90, 25, 12, 42, 36, 62, 75]
Search
Algorithm findElement(k, v)
  if T.isExternal(v)
    return NO_SUCH_KEY
  if k < key(v)
    return findElement(k, T.leftChild(v))
  else if k = key(v)
    return element(v)
  else { k > key(v) }
    return findElement(k, T.rightChild(v))
[Figure: searching for key 4 in a tree with keys 1, 2, 4, 6, 8, 9: 4 < 6 at the root, then 4 > 2, then 4 = 4.]
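The recursive search translates almost line for line into Python. In this sketch external nodes are represented by `None` instead of placeholder nodes, and `NO_SUCH_KEY` is modelled as `None`; both are assumptions made for brevity, as are the `Node` class and the sample element strings.

```python
NO_SUCH_KEY = None  # assumption: "not found" sentinel


class Node:
    """Internal node of a binary search tree; None stands for an external node."""

    def __init__(self, key, element, left=None, right=None):
        self.key = key
        self.element = element
        self.left = left
        self.right = right


def find_element(k, v):
    if v is None:                         # T.isExternal(v)
        return NO_SUCH_KEY
    if k < v.key:
        return find_element(k, v.left)
    if k == v.key:
        return v.element
    return find_element(k, v.right)       # k > key(v)


# Tree with keys 1, 2, 4, 6, 8, 9 and root 6.
root = Node(6, "six",
            Node(2, "two", Node(1, "one"), Node(4, "four")),
            Node(9, "nine", Node(8, "eight")))
print(find_element(4, root))   # four
print(find_element(5, root))   # None
```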
removeElement(k) – simple case
 To perform operation removeElement(k), we search for key k
 Assume key k is in the tree, and let v be the node storing k
 If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w)
Example: remove 4
[Figure: removing key 4 (node v, with leaf child w) from a tree with keys 1, 2, 4, 5, 6, 8, 9; afterwards the tree holds 1, 2, 5, 6, 8, 9.]
removeElement(k) – more complicated case
 We consider the case where the key k to be removed is stored at a node v whose children are both internal
 we find the internal node w that follows v in an inorder traversal
 we copy key(w) into node v
 we remove node w and its left child z (which must be a leaf) by means of operation removeAboveExternal(z)
Example: remove 3
[Figure: removing key 3 (node v) from a tree with keys 1, 2, 3, 5, 6, 8, 9. Node w holds key 5, the inorder successor of v, and z is its left leaf child; afterwards v holds 5 and the tree holds 1, 2, 5, 6, 8, 9.]
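Both removal cases can be sketched together. The version below is a simplification under stated assumptions: external nodes are `None` rather than placeholder nodes, so removeAboveExternal becomes "splice the node out", and nodes carry only keys (with elements, the element of w would be copied along with its key). The names `remove_element` and `inorder` are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def remove_element(v, k):
    """Return the root of subtree v after removing key k (no change if k is absent)."""
    if v is None:
        return None
    if k < v.key:
        v.left = remove_element(v.left, k)
    elif k > v.key:
        v.right = remove_element(v.right, k)
    elif v.left is None:          # simple case: left child is external
        return v.right
    elif v.right is None:         # simple case: right child is external
        return v.left
    else:                         # both children internal
        w = v.right               # w = inorder successor of v
        while w.left is not None:
            w = w.left
        v.key = w.key             # copy key(w) into v
        v.right = remove_element(v.right, w.key)   # then remove w (simple case)
    return v


def inorder(v):
    return [] if v is None else inorder(v.left) + [v.key] + inorder(v.right)


# "remove 3" example: 3 has two internal children, its successor is 5.
root = Node(1, None, Node(3, Node(2), Node(8, Node(6, Node(5)), Node(9))))
root = remove_element(root, 3)
print(inorder(root))     # [1, 2, 5, 6, 8, 9]
print(root.right.key)    # 5
```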
Binary Search Tree Performance
 Consider a dictionary with n items implemented by means of a binary search tree of height h
 the space used is O(n)
 methods findElement, insertItem and removeElement take O(h) time
 The height h is O(n) in the worst case and O(log n) in the best case
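The two extremes are easy to reproduce. The sketch below uses a plain, non-balancing insertion, with `None` standing for external nodes and height counted so that an external node has height 0. The helper names and the two key orders are chosen for illustration only.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def insert(v, k):
    """Plain BST insertion with no rebalancing."""
    if v is None:
        return Node(k)
    if k < v.key:
        v.left = insert(v.left, k)
    else:
        v.right = insert(v.right, k)
    return v


def height(v):
    """Length of the longest path from v down to an external (None) node."""
    return 0 if v is None else 1 + max(height(v.left), height(v.right))


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


sorted_order = list(range(1, 16))
level_order = [8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]
print(height(build(sorted_order)))   # 15: the tree degenerates into a list
print(height(build(level_order)))    # 4: a perfectly balanced tree
```

The same 15 keys give a height of 15 or 4 depending only on insertion order, which is the motivation for balancing.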
Balanced Trees
 When a path in a tree gets very long
relative to other paths in the tree…
 the tree is unbalanced
 In fact, in its extreme form an
unbalanced tree is a linear list.
 So, to achieve optimal performance…
 you need to keep the tree balanced
AVL Trees
 we want to maintain a balanced tree
 recall height of a node v = longest path from v
to an external node
 We want to maintain the principle
that
 for every node v the height of its
children can differ by no more than 1
 Height-Balance Property
AVL Trees
 h(right_subtree) - h(left_subtree) = Balance Factor
 In a balanced tree, |h(right_subtree) - h(left_subtree)| ∈ {0, 1}
 Tree with Balance Factor ∉ {-1, 0, 1}
 Unbalanced Tree
 Must be rebalanced
 Balance Factor exists for every node v
 except (trivially) external nodes
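As a minimal sketch of these definitions, the balance factor can be computed straight from subtree heights. `None` stands for an external node (height 0); the `Node` class and function names are illustrative. Recomputing heights recursively is done here only for clarity; a practical AVL tree would store the height in each node.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def height(v):
    return 0 if v is None else 1 + max(height(v.left), height(v.right))


def balance_factor(v):
    """h(right_subtree) - h(left_subtree) for an internal node v."""
    return height(v.right) - height(v.left)


def is_height_balanced(v):
    """True if every internal node has a balance factor of -1, 0, or 1."""
    if v is None:
        return True
    return (abs(balance_factor(v)) <= 1
            and is_height_balanced(v.left)
            and is_height_balanced(v.right))


chain = Node(1, None, Node(2, None, Node(3)))   # 1 -> 2 -> 3, all right children
balanced = Node(2, Node(1), Node(3))
print(balance_factor(chain), is_height_balanced(chain))         # 2 False
print(balance_factor(balanced), is_height_balanced(balanced))   # 0 True
```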
AVL Trees
 If Balance Factor = -1, 0, or 1
 tree balanced
 does not need to be restructured
 If Balance Factor = -2 or 2
 tree unbalanced
 needs to be restructured
 restructuring is done by a process called
rotation
AVL Trees
 Rotation
 Four types – but they form two symmetrical pairs
 Left Single Rotation
 Right Single Rotation
 Left Double Rotation
 Right Double Rotation
 Since the pairs are symmetrical, we only
consider single and double rotation
AVL Trees
 Rotation
 if BF = 2
AVL Trees
 Binary search trees that maintain the
Height-Balance Property are called
 AVL trees
 the name comes from the inventors
 G.M. Adelson-Velsky and E.M. Landis, in their 1962
paper entitled “An Algorithm for the
Organization of Information”
AVL Trees
Unbalanced Tree
Balanced Tree
from:http://en.wikipedia.org/wiki/AVL_trees
AVL Trees
 h(right_subtree) - h(left_subtree) =
Balance Factor (BF)
 If BF ∈ {-1, 0, 1} then tree balanced
(do nothing)
 If BF ∉ {-1, 0, 1} then tree unbalanced
(must be restructured)
 Restructuring done by rotation
AVL Trees - Insertion
 Rotation
 If BF > 2, the imbalance occurred further
down in the right subtree
 Recursively walk down the subtree until |BF| = 2
 If BF < -2, the imbalance occurred further
down in the left subtree
 Recursively walk down the subtree until |BF| = 2
AVL Trees - Insertion
 Rotation
 If BF = 2, the imbalance occurred in the right
subtree
 Recursively walk down the subtree until |BF| = 2
 If BF = -2, the imbalance occurred in the left
subtree
 Recursively walk down the subtree until |BF| = 2
AVL Trees - Insertion
 Rotation
 If BF = 2, the imbalance occurred in the right
subtree
 Step down into that subtree to find where the
insertion occurred
 If BF = -2, the imbalance occurred in the left
subtree
 Step down into that subtree to find where the
insertion occurred
AVL Trees - Insertion
 Rotation
 (shown for the BF = 2 case; the BF = -2 case is the mirror image)
 If BF at subtree = 1
 insertion occurred on the right leaf node
 single rotation required
 If BF at subtree = -1
 insertion occurred on the left leaf node
 double rotation required
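The rotation cases can be sketched as a small AVL insertion routine. This is an illustrative implementation under stated assumptions, not the textbook's restructure operation: external nodes are `None` with height 0, each node caches its own height, keys only are stored, and all names (`rotate_left`, `rebalance`, and so on) are made up for the example.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None
        self.height = 1                      # external (None) children have height 0


def h(v):
    return v.height if v is not None else 0


def update(v):
    v.height = 1 + max(h(v.left), h(v.right))


def bf(v):
    return h(v.right) - h(v.left)            # balance factor as defined on the slides


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x              # y becomes the root of this subtree
    update(x)
    update(y)
    return y


def rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    update(x)
    update(y)
    return y


def rebalance(v):
    update(v)
    if bf(v) == 2:                           # right-heavy
        if bf(v.right) < 0:                  # insertion on the left side: double rotation
            v.right = rotate_right(v.right)
        return rotate_left(v)                # single rotation
    if bf(v) == -2:                          # left-heavy: mirror image
        if bf(v.left) > 0:
            v.left = rotate_left(v.left)
        return rotate_right(v)
    return v


def insert(v, k):
    if v is None:
        return Node(k)
    if k < v.key:
        v.left = insert(v.left, k)
    else:
        v.right = insert(v.right, k)
    return rebalance(v)                      # fix balance on the way back up


root = None
for key in range(1, 8):                      # sorted input, the worst case for a plain BST
    root = insert(root, key)
print(root.key, root.height)                 # 4 3
```

Each rotation relinks a constant number of references, so rebalancing adds only O(1) work per node on the insertion path.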
AVL Trees - Insertion
 Rotation
 See
 http://en.wikipedia.org/wiki/AVL_trees
AVL Trees - Insertion
 Performance
 rotations – O(1)
 Recall h(T) maintained at O(log n)
 insertItem – O(log n)
 balanced tree - priceless
Bounded-depth Search Trees
 Search efficiency in tree is related to
the depth of the tree
 Can use depth-bounded trees to create
ordered dictionaries whose search and update
operations run in O(log n) time
Multi-way Search Trees
 Remember Binary Search Trees
 any node v can have at most 2 children
 what if we get rid of that rule?
 Suppose a node could have multiple
children (>2)
 Terminology – if v has d children, v is a
d-node
Multi-way Search Trees
 Multi-way Search Tree - T
 Each internal node must have at least
two children, i.e. each internal node is a d-node
with d ≥ 2
 Internal nodes store collections of items
(k,e)
 Each d-node stores d-1 items, with keys k_1 ≤ … ≤ k_(d-1)
 Special keys k_0 = -∞ and k_d = ∞
 External nodes are only placeholders
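Searching such a tree generalizes the binary case: inside a d-node, locate the position of k among the d-1 keys, then either report a match or descend into the child between the two neighboring keys. The sketch below assumes keys within a node are kept sorted, uses `None` for external placeholders, and uses the standard `bisect` module for the search inside a node. The class name, function name, and sample data are illustrative.

```python
import bisect


class MultiWayNode:
    """A d-node: d-1 sorted keys with their elements, and d children."""

    def __init__(self, keys, elements, children=None):
        self.keys = keys                  # k_1 <= ... <= k_(d-1)
        self.elements = elements          # elements[i] is stored with keys[i]
        # children[i] holds keys between keys[i-1] and keys[i];
        # None stands for an external placeholder node.
        self.children = children or [None] * (len(keys) + 1)


def find_element(v, k):
    while v is not None:
        i = bisect.bisect_left(v.keys, k)
        if i < len(v.keys) and v.keys[i] == k:
            return v.elements[i]
        v = v.children[i]                 # descend between the neighboring keys
    return None                           # reached an external node: no such key


root = MultiWayNode(
    [10, 20], ["e10", "e20"],
    [MultiWayNode([3, 7], ["e3", "e7"]),
     MultiWayNode([15], ["e15"]),
     MultiWayNode([25, 30, 40], ["e25", "e30", "e40"])],
)
print(find_element(root, 15))   # e15
print(find_element(root, 30))   # e30
print(find_element(root, 12))   # None
```

Here the root is a 3-node and its children are a 3-node, a 2-node, and a 4-node; the sentinel keys k_0 and k_d never need to be stored because index 0 and index d-1 of `children` play their role.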