B'-Trees/B+

advertisement
+
B -Trees
• Same structure as B-trees.
• Dictionary pairs are in leaves only. Leaves form a
doubly-linked list.
• Remaining nodes have following structure:
j a0 k 1 a1 k2 a2 … k j aj
• j = number of keys in node.
• ai is a pointer to a subtree.
• ki <= smallest key in subtree ai and > largest
in ai-1.
Example B+-tree
9
5
1 3
16 30
5 6
 index node
 leaf/data node
9
16 17 30 40
B+-tree—Search
9
5
1 3
key = 5
6 <= key <= 20
16 30
5 6
9
16 17 30 40
B+-tree—Insert
9
5
1
Insert 10
16 30
5 6
9
16 17 30 40
Insert
9
5
1 3
16 30
5 6
• Insert a pair with key = 2.
• New pair goes into a 3-node.
9
16 17 30 40
Insert Into A 3-node
• Insert new pair so that the keys are in
ascending order.
123
• Split into two nodes.
1
23
• Insert smallest key in new node and pointer
to this new node into parent.
2
1
23
Insert
9
5
2
16 30
2 3
1
5 6
9
16 17 30 40
• Insert an index entry 2 plus a pointer into parent.
Insert
9
2 5
1
2 3
16 30
5 6
9
• Now, insert a pair with key = 18.
16 17 30 40
Insert
9
17
2 5
1
2 3
16 30
5 6
9
16
17 18
30 40
• Now, insert a pair with key = 18.
• Insert an index entry17 plus a pointer into parent.
Insert
17
9
2 5
1
2 3
16
5 6
9
30
16
17 18
• Now, insert a pair with key = 18.
• Insert an index entry17 plus a pointer into parent.
30 40
Insert
9 17
2 5
1
2 3
16
5 6
9
30
16
• Now, insert a pair with key = 7.
17 18 30 40
Delete
9
2 5
1
2 3
16 30
5 6
9
• Delete pair with key = 16.
• Note: delete pair is always in a leaf.
16 17 30 40
Delete
9
2 5
1
2 3
16 30
5 6
9
• Delete pair with key = 16.
• Note: delete pair is always in a leaf.
17
30 40
Delete
9
2 5
1
2 3
16 30
5 6
9
17
30 40
• Delete pair with key = 1.
• Get >= 1 from sibling and update parent key.
Delete
9
3 5
2
3
16 30
5 6
9
17
30 40
• Delete pair with key = 1.
• Get >= 1 from sibling and update parent key.
Delete
9
3 5
2
3
16 30
5 6
9
17
30 40
• Delete pair with key = 2.
• Merge with sibling, delete in-between key in parent.
Delete
9
5
3
16 30
5 6
9
17
30 40
• Delete pair with key = 3.
•Get >= 1 from sibling and update parent key.
Delete
9
6
5
16 30
6
9
17
30 40
• Delete pair with key = 9.
• Merge with sibling, delete in-between key in parent.
Delete
9
6
5
30
6
17
30 40
Delete
9
6
5
16 30
6
9
17
30 40
• Delete pair with key = 6.
• Merge with sibling, delete in-between key in parent.
Delete
9
16 30
5
9
17
30 40
• Index node becomes deficient.
•Get >= 1 from sibling, move last one to parent, get
parent key.
Delete
16
9
5
30
9
17
30 40
• Delete 9.
• Merge with sibling, delete in-between key in parent.
Delete
16
30
5
17
30 40
•Index node becomes deficient.
• Merge with sibling and in-between key in parent.
Delete
16 30
5
17
30 40
•Index node becomes deficient.
• It’s the root; discard.
B*-Trees
• Root has between 2 and 2 * floor((2m – 2)/3) + 1
children.
• Remaining nodes have between ceil((2m – 1)/3)
and m children.
• All external/failure nodes are on the same level.
Insert
• When insert node is overfull, check adjacent
sibling.
• If adjacent sibling is not full, move a dictionary
pair from overfull node, via parent, to nonfull
adjacent sibling.
• If adjacent sibling is full, split overfull node,
adjacent full node, and in-between pair from
parent to get three nodes with floor((2m – 2)/3),
floor((2m – 1)/3), floor(2m/3) pairs plus two
additional pairs for insertion into parent.
Delete
• When combining, must combine 3 adjacent
nodes and 2 in-between pairs from parent.
 Total # pairs involved = 2 * floor((2m-2)/3) +
[floor((2m-2)/3) – 1] + 2.
 Equals 3 * floor((2m-2)/3) + 1.
• Combining yields 2 nodes and a pair that is
to be inserted into the parent.
 m mod 3 = 0 => nodes have m – 1 pairs each.
 m mod 3 = 1 => one node has m – 1 pairs and
the other has m – 2.
 m mod 3 = 2 => nodes have m – 2 pairs each.
Splay Trees
• Binary search trees.
• Search, insert, delete, and split have amortized
complexity O(log n) & actual complexity O(n).
• Actual and amortized complexity of join is O(1).
• Priority queue and double-ended priority queue
versions outperform heaps, deaps, etc. over a
sequence of operations.
• Two varieties.
 Bottom up.
 Top down.
Bottom-Up Splay Trees
• Search, insert, delete, and join are done as in an
unbalanced binary search tree.
• Search, insert, and delete are followed by a
splay operation that begins at a splay node.
• When the splay operation completes, the splay
node has become the tree root.
• Join requires no splay (or, a null splay is done).
• For the split operation, the splay is done in the
middle (rather than end) of the operation.
Splay Node – search(k)
20
10
6
2
40
15
8
30
25
• If there is a pair whose key is k, the node
containing this pair is the splay node.
• Otherwise, the parent of the external node where
the search terminates is the splay node.
Splay Node – insert(newPair)
20
10
6
2
40
15
8
30
25
• If there is already a pair whose key is
newPair.key, the node containing this pair is the
splay node.
• Otherwise, the newly inserted node is the splay
node.
Splay Node – delete(k)
20
10
6
2
40
15
8
30
25
• If there is a pair whose key is k, the parent of the
node that is physically deleted from the tree is the
splay node.
• Otherwise, the parent of the external node where
the search terminates is the splay node.
Splay Node – split(k)
• Use the unbalanced binary search tree insert
algorithm to insert a new pair whose key is k.
• The splay node is as for the splay tree insert
algorithm.
• Following the splay, the left subtree of the root is
S, and the right subtree is B.
m
S
B
• m is set to null if it is the newly inserted pair.
Splay
• Let q be the splay node.
• q is moved up the tree using a series of splay
steps.
• In a splay step, the node q moves up the tree by
0, 1, or 2 levels.
• Every splay step, except possibly the last one,
moves q two levels up.
Splay Step
• If q = null or q is the root, do nothing (splay
is over).
• If q is at level 2, do a one-level move and
terminate the splay operation.
q
p
q
a
c
b
• q right child of p is symmetric.
a
p
b
c
Splay Step
• If q is at a level > 2, do a two-level move
and continue the splay operation.
q
gp
a
p
d
q
a
p
c
b
b
• q right child of right child of gp is symmetric.
gp
c
d
2-Level Move (case 2)
q
gp
p
p
a
gp
d
a
q
b
c
• q left child of right child of gp is symmetric.
b
c
d
Per Operation Actual Complexity
• Start with an empty splay tree and insert
pairs with keys 1, 2, 3, …, in this order.
1
1
2
2
1
Per Operation Actual Complexity
• Start with an empty splay tree and insert
pairs with keys 1, 2, 3, …, in this order.
3
2
1
3
2
3
1
2
1
4
Per Operation Actual Complexity
• Worst-case height = n.
• Actual complexity of search, insert, delete,
and split is O(n).
Digital Search Trees & Binary Tries
• Analog of radix sort to searching.
• Keys are binary bit strings.
 Fixed length – 0110, 0010, 1010, 1011.
 Variable length – 01, 00, 101, 1011.
• Application – IP routing, packet classification,
firewalls.
 IPv4 – 32 bit IP address.
 IPv6 – 128 bit IP address.
Digital Search Tree
• Assume fixed number of bits.
• Not empty =>
 Root contains one dictionary pair (any pair).
 All remaining pairs whose key begins with a 0
are in the left subtree.
 All remaining pairs whose key begins with a 1
are in the right subtree.
 Left and right subtrees are digital subtrees on
remaining bits.
Example
• Start with an empty digital search tree and
insert a pair whose key is 0110.
0110
• Now, insert a pair whose key is 0010.
0110
0010
Example
• Now, insert a pair whose key is 1001.
0110
0010
0110
0010
1001
Example
• Now, insert a pair whose key is 1011.
0110
0110
0010
1001
0010
1001
1011
Example
• Now, insert a pair whose key is 0000.
0110
0110
0010
0010
1001
1011
0000
1001
1011
Search/Insert/Delete
0110
0010
0000
1001
1011
• Complexity of each operation is O(#bits in a key).
• #key comparisons = O(height).
• Expensive when keys are very long.
Binary Trie
• Information Retrieval.
• At most one key comparison per operation.
• Fixed length keys.
 Branch nodes.
• Left and right child pointers.
• No data field(s).
 Element nodes.
• No child pointers.
• Data field to hold dictionary pair.
Example
0
1
0
0
1
1100
0
0001
1
0
0011
0
1000
1
1001
At most one key comparison for a search.
Variable Key Length
• Left and right child fields.
• Left and right pair fields.
 Left pair is pair whose key terminates at root of
left subtree or the single pair that might
otherwise be in the left subtree.
 Right pair is pair whose key terminates at root
of right subtree or the single pair that might
otherwise be in the right subtree.
 Field is null otherwise.
Example
0
null
1
0
00
01100
10
0
0000
11111
0
001
1000
101
1
00100
001100
At most one key comparison for a search.
Fixed Length Insert
0
0
0
0001
1
1
0
1
1
1100
0111
0
0011
0
1
1000
Insert 0111.
1001
Zero compares.
Fixed Length Insert
0
0
0
0001
1
1
0
1
1100
0111
0
0011
0
1000
Insert 1101.
1
1
1001
Fixed Length Insert
1100
0
0
0
0001
1
1
0
1
0111
1
0
0
0011
0
1000
Insert 1101.
1
1001
0
Fixed Length Insert
0
0
0
0001
1
1
0
1
0111
1
0
0
0011
0
1000
Insert 1101.
1
1001
0
1100
One compare.
1
1101
Fixed Length Delete
0
0
0
0001
1
1
0
1
0111
1
0
0
0011
0
1000
Delete 0111.
1
1001
0
1100
1
1101
Fixed Length Delete
0
1
0
0
0
0001
1
1
0
0
0011
0
1000
Delete 0111.
1
1001
0
1100
One compare.
1
1101
Fixed Length Delete
0
1
0
0
0
0001
1
1
0
0
0011
0
1000
Delete 1100.
1
1001
0
1100
1
1101
Fixed Length Delete
0
1
0
0
0
0001
1
0
1
0
0011
0
1000
Delete 1100.
1
1001
1
1101
Fixed Length Delete
1101
0
1
0
0
0
0001
1
0
0
0011
0
1000
Delete 1100.
1
1
1001
Fixed Length Delete
1101
0
1
0
0
0
0001
1
0
0011
0
1000
Delete 1100.
1
1001
1
Fixed Length Delete
0
1
0
0
0
0001
1
1101
0
0011
0
1000
Delete 1100.
1
1
1001
One compare.
Compressed Binary Tries
• No branch node whose degree is 1.
• Add a bit# field to each branch node.
• bit# tells you which bit of the key to use to
decide whether to move to the left or right
subtrie.
Binary Trie
1
0
3
0
0001
1
0
0
1
0011
2
4
0
1000
1
0
4
1
1001
bit# field shown in black outside branch node.
0
1100
0
1
1101
Compressed Binary Trie
0
1
1
3
0
0001
2
1
0011
0
4
0
1000
1
4
1
1001
bit# field shown in black outside branch node.
0
1100
1
1101
Compressed Binary Trie
0
1
1
3
0
0001
2
1
0011
0
4
0
1000
#branch nodes = n – 1.
1
4
1
1001
0
1100
1
1101
Insert
0
1
1
3
0
0001
2
1
0011
0
4
0
1000
Insert 0010.
1
4
1
1001
0
1100
1
1101
Insert
0
1
1
3
2
0
1
4
0001
0
0010
1
0011
0
4
0
1000
Insert 0100.
1
4
1
1001
0
1100
1
1101
Insert
0
1
1
2
2
3
1
0
0100
0
1
4
0001
0
0010
1
0011
0
4
0
1000
1
4
1
1001
0
1
1100
1101
Delete
0
1
1
2
2
3
1
0
0100
0
1
4
0001
0
0010
1
0011
0
4
0
1000
1
4
1
1001
Delete 0010.
0
1100
1
1101
Delete
0
1
1
2
2
3
0
0001
1
0
0100
1
0011
Delete 1001.
0
4
0
1000
1
4
1
1001
0
1100
1
1101
Delete
0
1
1
2
2
3
0
0001
1
0
0100
1
0011
0
1000
1
4
0
1100
1
1101
Download