Lecture 2 Search Trees: Splay Trees, ”Multi-Way” Search Trees, B-Trees TDDC32

advertisement
Lecture 2
Search Trees: Splay Trees,
”Multi-Way” Search Trees, B-Trees
TDDC32
Lecture notes in Design and Implementation of a Software Module in Java 16 January 2013
Tommy Färnqvist, IDA, Linköping University
2.1
Lecture Topics
Innehåll
1
Splay Trees
1
2
(2, 4) Trees
8
3
B-Trees
9
1
2.2
Splay Trees
Splay Tree - basic idea
Recall the basic BST:
• Simple insert and delete when balanced, but. . .
• The “balance” is determined by order of inserts and deletes. . .
Combine with the “keep recent objects first” heuristics for lists?
• Often-used elements should be near the root!
insert: 1,2,4,5,8
insert: 5,2,1,4,8
2.3
The splay(k) operation
• Perform a normal search for k, remember all nodes we pass. . .
• Label the last node we inspect P
– If k is in T , then k is in node P,
– otherwise P is the parent of an empty tree
• Return back to root, at each node do a rotation to move P up the tree. . . (3 cases)
2.4
1
The splay(k) operation
• zig: parent(P) is the root: rotate around P:
2.5
Q
P
P
Q
c
a
a
b
b
c
The splay(k) operation
• zig-zig: P and parent(P) are both left children (or both right children): perform two rotations to shift
P up:
R
Q
Q
P
P
R
Q
d
a
P
R
a
c
a
b
c
d
b
b
c
d
2.6
The splay(k) operation
• zig-zag: One of P and parent(P) is a left child and the other is a right child or vice versa: Perform
two rotations in different directions:
R
P
Q
R
Q
a
P
a
d
b
b
c
d
c
Note: These rotations may increase the height of the tree!
2
2.7
find and insert
function FIND(k, T )
SPLAY (k, T )
if KEY(ROOT(T )) = k then return (k, v)
else return null
function INSERT(k, v, T )
SPLAY (k, T )
if KEY(ROOT(T )) = k then update its value with v
else make (k, v) a new root in T
2.8
Example: inserting 14
2.9
Example: inserting 14
3
2.10
Example: inserting 14
2.11
Example: inserting 14
4
2.12
Example: inserting 14
2.13
Example: inserting 14
5
2.14
delete
function DELETE(k, T )
if k is at a leaf then
perform SPLAY on the parent of the leaf
else if k is at an internal node then
replace the node with its inorder predecessor
perform SPLAY on the parent of the predecessor
Of course, using the inorder successor also works.
2.15
Exampel: deleting 8
2.16
6
Exampel: deleting 8
2.17
Exampel: deleting 8
2.18
Performance
• Each operation may face a totally unbalanced tree
– not guaranteed to operate in O(log n) time in worst case
• Amortized time is logarithmic:
7
– Any sequence of length m of these operations, starting with an empty tree, will use a total
amount of O(m log m) time
– Hence, the amortized cost/time of an operation is O(log n), although individual operations may
be significantly worse
2.19
2 (2, 4) Trees
New approach: Relax some condition
• AVL Tree: binary tree, accepts a small unbalance. . .
• Recall: Full binary tree: non-empty; degree is either 0 or 2 for each node Perfect binary tree: full, all
leaves have the same depth
• Can we build and maintain a perfect tree (if we skip “binary”)? Then we would always know the
worst search time exactly!
2.20
(2, 4) Tree
Previously:
• A single “pivot element”
• If larger we search to the right
• If smaller we search to the left
Now:
• Allow multiple (namely 1–3) pivot elements
• No. of children of an internal node = no. of pivot elements + 1 (i.e. 2–4)
5
2
5
2
8
10
8
15
12
18
2.21
More generally: (a, b)-trees
• Each node is either a leaf, or has c children, where a ≤ c ≤ b Each node has a − 1 to b − 1 pivot
elements
• 2 ≤ a ≤ (b + 1)/2 (but the root may have only two children (or none) even when a > 2)
• find works approximately as before
• insert must check that a node does not overflow (then we split the node)
• remove must check that a node does not become empty (then we merge nodes or transfer values
between nodes
Proposition 1 (14.1 in course book). The height of an (a, b)-tree storing n entries is Ω(log n/ log b) and
O(log n/ log a).
⇒ Shorter/flatter trees, but more work to be done in nodes.
8
2.22
Insertion in (a, b)-treewith a = 2 and b = 4
• As long as there is room in the child we find, add element to that child. . .
• If full, split and push the median (rounded up) pivot element upwards. . . . . . this may happen repeatedly
12
Insert(15)
4 6 12
4 6 12 15
4 6
15
2.23
Delete in a (2, 4)-tree
Three cases
• No constraints are violated by removal
• A leaf is removed (becomes empty) Then transfer some other key to that leaf, . . . ok if one of the two
nearest siblings have 2+ elements
30 50
25
35 40
Transfer of
30 and 35
30 50
Delete(25)
?
35 40
60
35 50
? 50
60
30
35 40
60
40
30
60
2.24
Delete in a (2, 4)-tree
• If a leaf is removed (becomes empty)
• Then transfer some other key to that leaf, or
• Merge (fuse) it with a neighbour
35 50
Delete(60)
30
35 50
40
?
35
35
30
40
60
30
30
40
40 50
50
2.25
Delete in a (2, 4)-tree
• An internal node becomes empty Root: replace with inorder predecessor or successor Then repair
inconsistencies with suitable merge and transfer operations. . .
3
2.26
B-Trees
B-Tree
•
•
•
•
Used to keep an index over external data (e.g. content of a disc)
It’s only an (a, b)-tree where a = db/2e, i.e. b = 2a − 1 or b = 2a
We may now choose b so that b references to children (and b − 1 keys) fit into a single disc block
By defining a = db/2e we will always fill up a disc block when two blocks are merged!
B-Trees are widely used for file systems and databases
• Windows: HPFS
• Mac: HFS, HFS+
• Linux: ReiserFS, XFS, Ext3FS, JFS
• Databases: ORACLE, DB2, INGRES, PostgreSQL
9
2.27
Voluntary Homework Problem
Show what happens step-by-step when you remove the values 39, 18, 50 from the following (2, 4)-tree:
20
10
3
30 40 50
18
23
39
44
59
2.28
10
Download