Binary Search Trees

advertisement
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
3.2 B INARY S EARCH T REES
‣ BSTs
‣ ordered operations
Algorithms
F O U R T H
E D I T I O N
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
‣ deletion
3.2 B INARY S EARCH T REES
‣ BSTs
‣ ordered operations
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
‣ deletion
Binary search trees
Definition. A BST is a binary tree in symmetric order.
root
a left link
A binary tree is either:
a subtree
・Empty.
・Two disjoint binary trees (left and right).
right child
of root
null links
Anatomy of a binary tree
Symmetric order. Each node has a key,
and every node’s key is:
・
・Smaller than all keys in its right subtree.
Larger than all keys in its left subtree.
parent of A and R
left link
of E
S
E
X
A
R
C
keys smaller than E
key
H
9
value
associated
with R
keys larger than E
Anatomy of a binary search tree
3
Binary search tree demo
Search. If less, go left; if greater, go right; if equal, search hit.
successful search for H
S
E
X
A
R
C
H
M
4
Binary search tree demo
Insert. If less, go left; if greater, go right; if null, insert.
insert G
S
E
X
A
R
C
H
G
M
5
BST representation in Java
Java definition. A BST is a reference to a root Node.
A Node is composed of four fields:
・A Key and a Value.
・A reference to the left and right subtree.
smaller keys
larger keys
private class Node
{
private Key key;
private Value val;
private Node left, right;
public Node(Key key, Value val)
{
this.key = key;
this.val = val;
}
}
BST
Node
key
left
BST with smaller keys
val
right
BST with larger keys
Binary search tree
Key and Value are generic types; Key is Comparable
6
BST implementation (skeleton)
public class BST<Key extends Comparable<Key>, Value>
{
private Node root;
private class Node
{ /* see previous slide */
root of BST
}
public void put(Key key, Value val)
{ /* see next slides */ }
public Value get(Key key)
{ /* see next slides */ }
public void delete(Key key)
{ /* see next slides */ }
public Iterable<Key> iterator()
{ /* see next slides */ }
}
7
BST search: Java implementation
Get. Return value corresponding to given key, or null if no such key.
public Value get(Key key)
{
Node x = root;
while (x != null)
{
int cmp = key.compareTo(x.key);
if
(cmp < 0) x = x.left;
else if (cmp > 0) x = x.right;
else if (cmp == 0) return x.val;
}
return null;
}
Cost. Number of compares is equal to 1 + depth of node.
8
BST insert
Put. Associate value with key.
inserting L
S
E
X
A
H
C
Search for key, then two cases:
・Key in tree ⇒ reset value.
・Key not in tree ⇒ add new node.
R
M
P
search for L ends
at this null link
S
E
X
A
R
H
C
M
create new node
P
L
S
E
X
A
R
C
H
M
reset links
on the way up
L
P
Insertion into a BST
9
BST insert: Java implementation
Put. Associate value with key.
public void put(Key key, Value val)
{ root = put(root, key, val); }
concise, but tricky,
recursive code;
read carefully!
private Node put(Node x, Key key, Value val)
{
if (x == null) return new Node(key, val);
int cmp = key.compareTo(x.key);
if
(cmp < 0)
x.left = put(x.left, key, val);
else if (cmp > 0)
x.right = put(x.right, key, val);
else if (cmp == 0)
x.val = val;
return x;
}
Cost. Number of compares is equal to 1 + depth of node.
10
S
C
Tree shape
R
E
A
X
・Many BSTs correspond to same set of keys. typical case
S
best
case
is equal to 1 + depth of node.
・Number of compares for search/insert
H
E
X
S
C
best case
R
X
typical case
X
A
R
H
C
E
H
R
H
S
X
S
E
worst case
C
R
C
H
C
X
A
R
A
S
E
S
C
E
X
typical case
H
A
R
E
A
A
A
worst case
C
BST possibilities
E
H
R
S
BottomAline. Tree
depends on order of X
insertion.
worstshape
case
C
E
BST possibilities
H
11
BST insertion: random order visualization
Ex. Insert keys in random order.
12
Sorting with a binary heap
Q. What is this sorting algorithm?
0. Shuffle the array of keys.
1. Insert all keys into a BST.
2. Do an inorder traversal of BST.
A. It's not a sorting algorithm (if there are duplicate keys)!
Q. OK, so what if there are no duplicate keys?
Q. What are its properties?
13
Correspondence between BSTs and quicksort partitioning
P
H
T
D
O
E
A
I
S
U
Y
C
M
L
Remark. Correspondence is 1–1 if array has no duplicate keys.
14
BSTs: mathematical analysis
Proposition. If N distinct keys are inserted into a BST in random order,
the expected number of compares for a search/insert is ~ 2 ln N.
Pf. 1–1 correspondence with quicksort partitioning.
Proposition. [Reed, 2003] If N distinct keys are inserted in random order,
expected height of tree is ~ 4.311 ln N.
How Tall is a Tree?
How Tall is a Tree?
Bruce Reed
Bruce Reed
CNRS, Paris, France
CNRS, Paris, France
reed@moka.ccr.jussieu.fr
reed@moka.ccr.jussieu.fr
ABSTRACT
purpose of this note
ABSTRACT
have:
purpose
this note
that
for /3 -- ½ + ~ 3,
Let H~ be the height of a random
binaryofsearch
tree toonprove
n
nodes.
Wesearch
show that
there
constants a = 4.31107...
Let H~ be the height of a random
binary
tree on
n existshave:
THEOREM 1. E(H
a
n
d
/
3
=
1.95...
such
that
E(H~)
= c~logn - / 3 1 o g l o g n +
nodes. We show that there exists constants a = 4.31107...
Var(Hn)
.
O(1),
We
also
show
that
Var(H~)
=
O(1).
THEOREM 1. E(H~) = ~ l o g n - / 3 1 o g l=o gO(1)
n + O(1
a n d / 3 = 1.95... such that E(H~) = c~logn - / 3 1 o g l o g n +
Var(Hn) = O(1) .
O(1), We also show that Var(H~) = O(1).
R e m a r k By the def
Categories and Subject Descriptors
nition
given
is more
R e m a r k By the definition of a, /3 = 7"g~"
3~ The
firs
as
we
will
see.
Categories and Subject Descriptors
E.2 [ D a t a S t r u c t u r e s ] : Trees nition given is more suggestive of why this value is co
For more informatio
as we will see.
E.2 [ D a t a S t r u c t u r e s ] : Trees
consult
[6],[7],
For more information on randommay
binary
search
trees[
1. THE RESULTS
R eand
m a r[8].
k After I ann
may consult [6],[7], [1], [2], [9], [4],
A binary search tree is a binary tree to each node of which
1. THE RESULTS
developed
an alterna
Rkeys
e m aaxe
r k drawn
After I from
announced
these results, Drmota(unp
we
have
associated
a
key;
these
some
O(1)
using
complete
A binary search tree is a binary tree to each node of which
developed
anbealternative
proof of the fact that
Var(H
totally
ordered
set
and
the
key
at
v
cannot
larger
than
proofs
illuminate
we have associated a key; these keys axe drawn from some
O(1) than
using
different techniques. As
ou
15dif
at its
child
thecompletely
key at its left
decided
to
submit
th
totally ordered set and the key atthev key
cannot
beright
larger
thannor smaller
proofs illuminate different aspects of the problem, we
But… Worst-case height is N.
[ exponentially small chance when keys are inserted in random order ]
ST implementations: summary
guarantee
average case
operations
on keys
implementation
search
insert
search hit
insert
sequential search
(unordered list)
N
N
½N
N
equals()
binary search
(ordered array)
lg N
N
lg N
½N
compareTo()
BST
N
N
1.39 lg N
1.39 lg N
compareTo()
Why not shuffle to ensure a (probabilistic) guarantee of 4.311 ln N?
16
3.2 B INARY S EARCH T REES
‣ BSTs
‣ ordered operations
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
‣ deletion
Minimum and maximum
Minimum. Smallest key in table.
Maximum. Largest key in table.
max()max
S
min()
E
min
X
A
R
C
H
M
Examples of BST order queries
Q. How to find the min / max?
18
Floor and ceiling
Floor. Largest key ≤ a given key.
Ceiling. Smallest key ≥ a given key.
floor(G)
max()
S
min()
E
X
A
R
C
H
ceiling(Q)
M
floor(D)
Examples of BST order queries
Q. How to find the floor / ceiling?
19
Computing the floor
Case 1. [k equals the key in the node]
finding floor(G)
The floor of k is k.
S
E
X
A
R
H
C
Case 2. [k is less than the key in the node]
M
G is less than S so
floor(G) must be
on the left
The floor of k is in the left subtree.
S
E
Case 3. [k is greater than the key in the node]
The floor of k is in the right subtree
(if there is any key ≤ k in right subtree);
X
A
R
H
C
G is greater than E so
floor(G) could be
M
on the right
S
E
otherwise it is the key in the node.
X
A
R
H
C
M
floor(G)in left
subtree is null
S
E
A
C
result
X
R
H
M
20
Computing the floor
finding floor(G)
public Key floor(Key key)
{
Node x = floor(root, key);
if (x == null) return null;
return x.key;
}
private Node floor(Node x, Key key)
{
if (x == null) return null;
int cmp = key.compareTo(x.key);
if (cmp == 0) return x;
if (cmp < 0)
return floor(x.left, key);
Node t = floor(x.right, key);
if (t != null) return t;
else
return x;
S
E
X
A
R
H
C
M
G is less than S so
floor(G) must be
on the left
S
E
X
A
R
H
C
G is greater than E so
floor(G) could be
M
on the right
S
E
X
A
R
H
C
M
floor(G)in left
subtree is null
S
}
E
A
C
result
X
R
H
M
21
Rank and select
Q. How to implement rank() and select() efficiently?
A. In each node, we store the number of nodes in the subtree rooted at
that node; to implement size(), return the count at the root.
node count
8
6
S
E
2
A
3
1
2
C
H
1
X
R
1
M
22
BST implementation: subtree counts
private class Node
{
private Key key;
private Value val;
private Node left;
private Node right;
private int count;
}
public int size()
{ return size(root);
}
private int size(Node x)
{
if (x == null) return 0;
ok to call
return x.count;
when x is null
}
number of nodes in subtree
initialize subtree
private Node put(Node x, Key key, Value val)
count to 1
{
if (x == null) return new Node(key, val, 1);
int cmp = key.compareTo(x.key);
if
(cmp < 0) x.left = put(x.left, key, val);
else if (cmp > 0) x.right = put(x.right, key, val);
else if (cmp == 0) x.val = val;
x.count = 1 + size(x.left) + size(x.right);
return x;
}
23
Rank
Rank. How many keys < k ?
Easy recursive algorithm (3 cases!)
node count
8
6
S
E
2
A
3
1
2
C
H
1
X
R
1
M
public int rank(Key key)
{ return rank(key, root);
}
private int rank(Key key, Node x)
{
if (x == null) return 0;
int cmp = key.compareTo(x.key);
if
(cmp < 0) return rank(key, x.left);
else if (cmp > 0) return 1 + size(x.left) + rank(key, x.right);
else if (cmp == 0) return size(x.left);
}
24
Inorder traversal
・Traverse left subtree.
・Enqueue key.
・Traverse right subtree.
public Iterable<Key> keys()
{
Queue<Key> q = new Queue<Key>();
inorder(root, q);
return q;
}
private void inorder(Node x, Queue<Key> q)
{
if (x == null) return;
inorder(x.left, q);
q.enqueue(x.key);
inorder(x.right, q);
}
BST
key
left
val
right
BST with larger keys
BST with smaller keys
smaller keys, in order
key
larger keys, in order
all keys, in order
Property. Inorder traversal of a BST yields keys in ascending order.
25
BST: ordered symbol table operations summary
sequential
search
binary
search
BST
search
N
lg N
h
insert
N
N
h
min / max
N
1
h
floor / ceiling
N
lg N
h
rank
N
lg N
h
select
N
1
h
ordered iteration
N log N
N
N
h = height of BST
(proportional to log N
if keys inserted in random order)
order of growth of running time of ordered symbol table operations
26
3.2 B INARY S EARCH T REES
‣ BSTs
‣ ordered operations
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
‣ deletion
ST implementations: summary
guarantee
average case
implementation
ordered
ops?
operations
on keys
search
insert
delete
search hit
insert
delete
sequential search
(linked list)
N
N
N
½N
N
½N
binary search
(ordered array)
lg N
N
N
lg N
½N
½N
✔
compareTo()
BST
N
N
N
1.39 lg N
1.39 lg N
???
✔
compareTo()
equals()
Next. Deletion in BSTs.
28
BST deletion: lazy approach
To remove a node with a given key:
・Set its value to null.
・Leave key in tree to guide search (but don't consider it equal in search).
E
E
A
S
A
S
delete I
C
H
☠
C
I
R
N
H
tombstone
R
N
Cost. ~ 2 ln N' per insert, search, and delete (if keys in random order),
where N' is the number of key-value pairs ever inserted in the BST.
Unsatisfactory solution. Tombstone (memory) overload.
29
Deleting the minimum
To delete the minimum key:
・Go left until finding a node with a null left link.
・Replace that node by its right link.
・Update subtree counts.
go left until
reaching null
left link
A
S
X
E
R
H
C
M
return that
node’s right link
S
X
E
R
A
public void deleteMin()
{ root = deleteMin(root);
H
C
}
private Node deleteMin(Node x)
{
if (x.left == null) return x.right;
x.left = deleteMin(x.left);
x.count = 1 + size(x.left) + size(x.right);
return x;
}
M
available for
garbage collection
update links and node counts
after recursive calls
7
S
5
X
E
R
C
H
M
Deleting the minimum in a BST
30
Hibbard deletion
To delete a node with key k: search for node t containing key k.
Case 0. [0 children] Delete t by setting parent link to null.
deleting C
update counts after
recursive calls
S
S
E
X
A
R
C
E
R
M
node to delete
replace with
null link
1
E
X
A
R
H
H
H
S
5
X
A
7
M
M
available for
garbage
collection
C
31
Hibbard deletion
To delete a node with key k: search for node t containing key k.
Case 1. [1 child] Delete t by replacing parent link.
deleting R
update counts after
recursive calls
S
S
E
X
A
R
C
H
M
node to delete
E
C
X
E
M
X
H
A
replace with
child link
S
5
H
A
7
C
M
available for
garbage
collection
R
32
Hibbard deletion
deleting E
node to delete
S
To delete a node with key k: search for node tEcontainingX key k.
A
R
H
C
M
Case 2. [2 children]
・
・Delete the minimum in t's right subtree.
A
C
・Put x in t's spot.
Find successor x of t.
t
S
E
node to delete
S
E
X
H
C
M
x
search for key E
successor
M
go right, then
go left until
reaching null
left link
min(t.right)
X
deleteMin(t.right)
X
M
7
S
5
H
A
X
R
C
S
E
E
R
R
H
C
min(t.right)
S
H
X
x
M
C
S
E
x
t.left
still a BST
successor
H
A
t
A
R
go right, then
go left until
reaching null
left link
R
x has no left child
but
X don't garbage collect x
x
deleting E
A
search for key E
M update links and
node counts after
recursive calls
Deletion in a BST
33
Hibbard deletion: Java implementation
public void delete(Key key)
{ root = delete(root, key);
}
private Node delete(Node x, Key key) {
if (x == null) return null;
int cmp = key.compareTo(x.key);
if
(cmp < 0) x.left = delete(x.left, key);
else if (cmp > 0) x.right = delete(x.right, key);
else {
if (x.right == null) return x.left;
if (x.left == null) return x.right;
Node t = x;
x = min(t.right);
x.right = deleteMin(t.right);
x.left = t.left;
}
x.count = size(x.left) + size(x.right) + 1;
return x;
search for key
no right child
no left child
replace with
successor
update subtree
counts
}
34
Hibbard deletion: analysis
Unsatisfactory solution. Not symmetric.
Surprising consequence. Trees not random (!) ⇒ √ N per op.
Longstanding open problem. Simple and efficient delete for BSTs.
35
ST implementations: summary
guarantee
average case
implementation
ordered
ops?
operations
on keys
search
insert
delete
search hit
insert
delete
sequential search
(linked list)
N
N
N
½N
N
½N
binary search
(ordered array)
lg N
N
N
lg N
½N
½N
✔
compareTo()
BST
N
N
N
1.39 lg N
1.39 lg N
√N
✔
compareTo()
equals()
other operations also become √N
if deletions allowed
Next lecture. Guarantee logarithmic performance for all operations.
36
Download