Overview Data Structures for Databases He Tan Sept 9, 2004

Data Structures for Databases
Sept 9, 2004
AHz-04.27-1.0
He Tan
hetan@ida.liu.se
IISLAB
IDA
Overview
[Figure: the real world is described by a model and stored in databases. The DBMS receives queries and returns answers; it handles the processing of queries and updates and the access to stored data in the physical database.]
september 2007
What is this about?
• How to build more efficient kinds of indexes
What do you need to learn?
• Multilevel indexing
• Indexes on multiple keys
• Hashing
TDDB38/TDDI60 - HT 2004
Record access
• Assume an ordered data file with 1,000,000 records of size 1000 bytes and a block size of 4,096 bytes. Assume an index record size of 32 bytes.
▪ On average, how many block accesses need to be performed to find a single record when searching for the key field
a) Using no index?
The number of blocks for the data file is 250,000 (4 records fit in one block).
• A sequential algorithm needs to access all 250,000 blocks in the worst case (transferring all blocks into main memory).
• A binary search on the blocks would need to access log2 b = log2 250000 ≈ 18 blocks.
b) Using a primary index?
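The arithmetic for this example can be checked with a short script. This is a minimal sketch, not the course's official solution: it assumes unspanned records, and for part b) it assumes a sparse primary index with one 32-byte entry per data block, searched by binary search, plus one access for the data block itself. All variable names are illustrative.

```python
import math

RECORDS = 1_000_000
RECORD_SIZE = 1000        # bytes
BLOCK_SIZE = 4096         # bytes
INDEX_ENTRY_SIZE = 32     # bytes

# Records per data block (unspanned): floor(4096 / 1000) = 4
bfr = BLOCK_SIZE // RECORD_SIZE
data_blocks = math.ceil(RECORDS / bfr)

# a) No index: binary search directly on the ordered data file
binary_search_accesses = math.ceil(math.log2(data_blocks))

# b) Primary index (assumption: one index entry per data block)
index_bfr = BLOCK_SIZE // INDEX_ENTRY_SIZE
index_blocks = math.ceil(data_blocks / index_bfr)
# Binary search on the index file, then one access to the data block
primary_index_accesses = math.ceil(math.log2(index_blocks)) + 1

print(data_blocks, binary_search_accesses, index_blocks, primary_index_accesses)
```

Under these assumptions the index occupies far fewer blocks than the data file, which is why the binary search on the index is cheaper than the binary search on the data.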
Multilevel Indexes
• "Index on the index"
▪ Reduce the search space of the index by building an index on the index, each level fitting in fewer blocks, until the top-level index fits in one block.
• The reduction at each level is determined by the blocking factor of the index.
• This blocking factor is called the fan-out (fo).
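How the fan-out shrinks the index level by level can be sketched as follows. The numbers come from the running example in these slides (4,096-byte blocks, 32-byte index records, 250,000 data blocks); the assumption that the first index level has one entry per data block, and the function name, are mine.

```python
import math

def multilevel_index_levels(first_level_entries, fan_out):
    """Return the number of index blocks at each level, bottom-up,
    stopping when a level fits in a single block."""
    levels = []
    entries = first_level_entries
    while True:
        blocks = math.ceil(entries / fan_out)
        levels.append(blocks)
        if blocks == 1:
            return levels
        entries = blocks  # the next level indexes the blocks of this level

fan_out = 4096 // 32                       # index entries per block
levels = multilevel_index_levels(250_000, fan_out)
block_accesses = len(levels) + 1           # one block per index level + the data block

print(fan_out, levels, block_accesses)
```

Each level divides the number of blocks by roughly the fan-out, so the number of levels grows only logarithmically with the size of the file.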
Multilevel Index Example
• Assume an ordered data file with 1,000,000 records of size 1000 bytes and a block size of 4,096 bytes. Assume an index record size of 32 bytes.
▪ On average, how many block accesses need to be performed to find a single record when searching for the key field
c) Using a multilevel index?

Problems with Multilevel Indexes
Problems when inserting and deleting data
▪ All levels are based on physically ordered files.
• Use an overflow file and re-create the index during file reorganisation.
• Use a dynamic multilevel index structure

Search Tree
• A search tree is a tree that is used to guide the search for a record.
• An ordinary search tree of order p consists of nodes that have at most p-1 values and p pointers.
Search Trees
Each node has the form
<P1, K1, P2, K2, …, Pq-1, Kq-1, Pq>
where q ≤ p and Pi is a pointer to a child node (or a null pointer)
1. Within each node, K1 < K2 < … < Kq-1
2. For all values X in the subtree pointed at by Pi:
If 1 < i < q, Ki-1 < X < Ki
If i = 1, X < K1
If i = q, Kq-1 < X
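The two constraints above are what make a lookup possible: at each node, the search value either equals one of the Ki or falls into exactly one subtree Pi. A minimal sketch follows; the `Node` class, the `search` function, and the sample tree are hypothetical illustrations, not taken from the slides.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                                       # K1 < K2 < ... < Kq-1
        self.children = children or [None] * (len(keys) + 1)   # P1 ... Pq

def search(node, x):
    """Follow the pointers of a search tree; return True if x is stored."""
    while node is not None:
        i = bisect_left(node.keys, x)      # smallest i with x <= keys[i]
        if i < len(node.keys) and node.keys[i] == x:
            return True
        node = node.children[i]            # subtree with K(i-1) < x < Ki
    return False

# A small tree of order p = 3 (at most 2 values and 3 pointers per node)
tree = Node([5], [Node([3]), Node([6, 9])])

print(search(tree, 9), search(tree, 4))
```

Note that nothing in this structure keeps the tree balanced, which is the problem B-trees address.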
Search Tree: Example, order p=3
[Figure: example search tree of order p=3]

B-Tree
B-tree = Balanced tree.
▪ all leaves are on the same level
▪ all nodes except the root and leaves have at most p pointers and at least ⌈p/2⌉ pointers.

B-Tree: Example, order p=3
[Figure: example B-tree of order p=3]

B-tree: Order
One node must fit in one block:
$$p \cdot P_{block} + (p - 1) \cdot (P_{record} + K) \le B \;\Rightarrow\; p \le \frac{B + P_{record} + K}{P_{block} + P_{record} + K}$$
$p$: order, number of block pointer entries in a node
$P_{block}$: size of a block pointer
$P_{record}$: size of a record pointer
$K$: size of a search key field

B-tree: Number of entries
• Given: B = 4096 bytes, Precord = 16 bytes, Pblock = 8 bytes, K = 64 bytes, fill percentage = 69%
→ p <= 47

          Nodes     Pointers              Entries
Root      1         0.69*47 ≈ 33          33-1 = 32
Level 1   33        33*33 = 1,089         33*32 = 1,056
Level 2   1,089     33^3 = 35,937         33^2*32 = 34,848
Level 3   35,937    33^4 = 1,185,921      33^3*32 = 1,149,984

The number of entries held in the 3-level B-tree: 1,185,920

B+-tree
• A variation of the B-tree
• Data pointers only stored in leaf nodes.
• The leaf nodes are usually linked to provide ordered access.
• Most common dynamic multilevel index implementation
B+-Tree: Example, order p=3, pleaf=2
Order of insertion: 8, 5, 1, 7, 3, 12, 9, 6
[Figure: the resulting B+-tree. Root: 5. Internal nodes: 3 and 7, 8. Leaves (linked): 1 3 | 5 | 6 7 | 8 | 9 12. The leaf entries point to data records (Andersson, Hagberg, French, Silver, Daniels, Young, Zhing, Baker).]

B+-trees: Internal nodes
1. Each internal node is of the form
<P1, K1, P2, K2, …, Pq-1, Kq-1, Pq>
2. Within each internal node K1 < K2 < … < Kq-1
3. For all search field values X in the subtree pointed at by Pi, we have
Ki-1 < X ≤ Ki for 1 < i < q
X ≤ Ki for i = 1
Ki-1 < X for i = q
4. Each internal node has at most p tree pointers.
5. Each internal node, except the root, has at least ⌈p/2⌉ tree pointers. The root node has at least two tree pointers if it is an internal node.
6. An internal node with q pointers (q ≤ p) has q-1 search field values.
[Figure: internal node P1 K1 … Ki-1 Pi Ki … Kq-1 Pq; the subtree under P1 holds X ≤ K1, the subtree under Pi holds Ki-1 < X ≤ Ki, and the subtree under Pq holds Kq-1 < X.]

B+-trees: Leaf nodes
1. Each leaf node is of the form
<<K1, Pr1>, <K2, Pr2>, …, <Kq-1, Prq-1>, Pnext>
where Pri is a record pointer and Pnext points to the next leaf node.
2. Within each leaf node K1 < K2 < … < Kq-1
3. Each entry contains a pointer to the record whose search field value corresponds to the entry.
4. Each leaf node has at least ⌈pleaf/2⌉ values.
5. All leaf nodes are at the same level.
[Figure: leaf node K1 Pr1 … Ki Pri … Kq-1 Prq-1 Pnext]
B+-tree Order
One internal node must fit in one block:
$$p \cdot P_{block} + (p - 1) \cdot K \le B \;\Rightarrow\; p \le \frac{B + K}{P_{block} + K}$$
One leaf node must fit in one block:
$$p_{leaf} \cdot (P_{record} + K) + P_{block} \le B \;\Rightarrow\; p_{leaf} \le \frac{B - P_{block}}{P_{record} + K}$$
$B$: block size
$p$: order, number of pointer entries in an internal node
$p_{leaf}$: number of record pointer entries in a leaf node
$P_{block}$: size of a block pointer
$K$: size of a search key field
$P_{record}$: size of a record pointer

B+-trees
• Given: B=4096 bytes, Precord=16 bytes, Pblock=8 bytes, K=64 bytes, fill percentage=70%
→ p <= 57; pleaf <= 51

             Nodes     Pointers            Entries
Root         1         0.7*57 ≈ 40         40-1 = 39
Level 1      40        40*40 = 1,600       40*39 = 1,560
Level 2      1,600     40^3 = 64,000       40^2*39 = 62,400
Leaf level   64,000    Record pointers: 64,000*0.7*51 = 2,284,800

For comparison, the number of entries held in the 3-level B-tree: 1,185,920
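The order and capacity figures for both the B-tree and the B+-tree can be reproduced with a few lines of arithmetic. This is a sketch under one stated assumption: the slides appear to round the average fan-out up (0.69*47 = 32.43 is taken as 33, 0.7*57 = 39.9 as 40), so the code uses `math.ceil`. Function and variable names are illustrative.

```python
import math

B, P_BLOCK, P_RECORD, K = 4096, 8, 16, 64     # bytes, as given on the slides

# B-tree: p*Pblock + (p-1)*(Precord+K) <= B
p_btree = (B + P_RECORD + K) // (P_BLOCK + P_RECORD + K)

# B+-tree: internal nodes hold no record pointers, leaves hold no tree pointers
p_internal = (B + K) // (P_BLOCK + K)
p_leaf = (B - P_BLOCK) // (P_RECORD + K)

def btree_entries(fan_out, levels_below_root):
    """Entries in a B-tree where every node has fan_out pointers and
    fan_out - 1 entries, summed over the root and the levels below it."""
    return sum(fan_out ** lvl * (fan_out - 1)
               for lvl in range(levels_below_root + 1))

# Assumption: average fan-out rounded up, which matches the slide tables
fo_btree = math.ceil(69 * p_btree / 100)       # 69% fill
fo_bplus = math.ceil(70 * p_internal / 100)    # 70% fill

btree_total = btree_entries(fo_btree, 3)
bplus_leaves = fo_bplus ** 3                   # pointers leaving level 2
bplus_record_pointers = bplus_leaves * 70 * p_leaf // 100

print(p_btree, p_internal, p_leaf)
print(btree_total, bplus_record_pointers)
```

The comparison shows why the B+-tree is preferred: with record pointers moved to the leaves, internal nodes have a larger fan-out, and a tree of the same height indexes roughly twice as many records.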
B+-trees Search
• Very fast searching in the index structure; the number of levels to visit is about
$$\lceil \log_{p \cdot f} N \rceil$$
$N$: number of search values
$p$: order, number of block pointers per node
$f$: fill factor, 0 ≤ f ≤ 1
• Insertion and deletion can be expensive.

B+-Tree Search
Search: 8
[Figure: search for 8 in the example B+-tree. Root: 5. Internal nodes: 3 and 7, 8. Leaves: 1 3 | 5 | 6 7 | 8 | 9 12.]
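To get a feeling for the search-cost formula, it can be evaluated for concrete values. In this sketch the order p = 57 and fill factor f = 0.7 come from the B+-tree example on the slides, while N = 1,000,000 search values is my assumption, borrowed from the record count of the running example; the function name is illustrative.

```python
import math

def levels_to_search(n_values, order, fill_factor):
    """Evaluate ceil(log_{p*f}(N)): the approximate number of tree levels
    needed to index n_values with an effective fan-out of order * fill_factor."""
    return math.ceil(math.log(n_values) / math.log(order * fill_factor))

levels = levels_to_search(1_000_000, 57, 0.7)
print(levels)
```

Because the base of the logarithm is the effective fan-out (about 40 here), even a file with millions of records is reached in a handful of block accesses.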
B+-trees Insertion and Deletion
B+-tree: Insertion
When a leaf node is full, an insertion causes an overflow
▪ The first ⌈p/2⌉ entries in the node are kept there; the remaining entries are moved to a new leaf.
▪ The search value of the new node moves up to the parent. If the parent is full, it will overflow.
▪ The resulting split can propagate all the way up to the root.
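The overflow and split rules can be sketched in code. This is a minimal illustration, not a full B+-tree: it assumes the common textbook convention that an overflowing leaf keeps the first half of its entries (rounded up) and copies its largest remaining key to the parent, while an overflowing internal node moves its middle key up. It omits the Pnext leaf links, duplicate keys, and deletion. `Node`, `insert`, and `leaves` are hypothetical names.

```python
from bisect import bisect_left, insort

P, P_LEAF = 3, 2    # order of internal nodes and of leaves (slide example)

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children          # None marks a leaf node

def _insert(node, key):
    """Insert key below node. Return (separator, new_right_node) if the
    node had to split, otherwise None."""
    if node.children is None:                         # leaf node
        insort(node.keys, key)
        if len(node.keys) <= P_LEAF:
            return None
        keep = (len(node.keys) + 1) // 2              # first half (rounded up) stays
        right = Node(node.keys[keep:])
        node.keys = node.keys[:keep]
        return node.keys[-1], right                   # separator is copied up
    i = bisect_left(node.keys, key)                   # subtree with K(i-1) < key <= Ki
    split = _insert(node.children[i], key)
    if split is None:
        return None
    sep, right = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right)
    if len(node.children) <= P:
        return None
    j = len(node.children) // 2                       # pointers kept in the left node
    up = node.keys[j - 1]                             # middle key moves up
    new_right = Node(node.keys[j:], node.children[j:])
    node.keys, node.children = node.keys[:j - 1], node.children[:j]
    return up, new_right

def insert(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    sep, right = split
    return Node([sep], [root, right])                 # root overflow: new level

def leaves(node):
    """Return the key lists of all leaves, left to right."""
    if node.children is None:
        return [node.keys]
    return [leaf for child in node.children for leaf in leaves(child)]

root = Node([])
for k in [8, 5, 1, 7, 3, 12, 9, 6]:
    root = insert(root, k)

print(root.keys, [c.keys for c in root.children], leaves(root))
```

With order p=3, pleaf=2 and the insertion order 8, 5, 1, 7, 3, 12, 9, 6, the sketch ends with root 5, internal nodes 3 and 7, 8, and leaves 1 3 | 5 | 6 7 | 8 | 9 12.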
B+-Tree: Insertion example
[Figures: the tree after each insertion step]
• Insert: 8
• Insert: 5
• Insert: 1 (Overflow – create a new level)
• Insert: 7
• Insert: 3 (Overflow – Split)
• Insert: 12 (Overflow – Split, propagates to a new level)
• Insert: 9
• Insert: 6 (Overflow – Split)
B+-Tree
[Figure: Resulting B+-tree. Root: 5. Internal nodes: 3 and 7, 8. Leaves: 1 3 | 5 | 6 7 | 8 | 9 12.]

B+-tree: Deletion
When a leaf node becomes less than half full, it causes an underflow
▪ Redistribute entries with a sibling, or merge with a sibling.
▪ The resulting merge can also propagate to internal nodes.
B+-Tree: Deletion example
[Figures: the tree after each deletion step]
• Delete: 5 (Underflow - redistribute)
• Delete: 12
• Delete: 9 (Underflow: merge with the left, propagate, reduce the tree levels)
B+-Tree
[Figure: the tree after the deletions. Root: 1, 6. Leaves: 1 | 6 | 7 8.]

B+-trees
• Many variations
▪ B-trees
▪ B+-trees
▪ B*-trees (B+-tree with a fill factor of at least 2/3)
• Common modifications
▪ Change the fill factor (between 0.5 and 1.0)
▪ Allow a node to become empty before merging
Indexes on Multiple Keys
e.g. select * from employee where dept = 'CS' and age = '40'
• Possible strategies for processing this query using indices on single attributes:
▪ use index on dept to find employees with dept = 'CS', then test them individually to see if age = '40'
▪ use index on age to find employees with age = '40', then test them individually to see if dept = 'CS'
▪ use dept index to find pointers to all records of the CS department, and use age index similarly, then take the intersection of both sets of pointers
• If the set of records that matches each condition is large, but the combination is not, an index on the composite may be useful.
▪ ordered index on multiple attributes, treat the composite as a single value
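The strategies can be compared on a toy data set. In this sketch, Python dictionaries stand in for the indexes (so they behave like hash indexes rather than ordered ones), record ids stand in for record pointers, age is stored as an integer, and the sample rows are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (record id, dept, age)
employees = [
    (1, "CS", 40), (2, "CS", 35), (3, "EE", 40), (4, "CS", 40), (5, "ME", 29),
]
rows = {rid: (dept, age) for rid, dept, age in employees}

dept_index = defaultdict(set)        # dept -> record ids
age_index = defaultdict(set)         # age -> record ids
composite_index = defaultdict(set)   # (dept, age) -> record ids
for rid, dept, age in employees:
    dept_index[dept].add(rid)
    age_index[age].add(rid)
    composite_index[(dept, age)].add(rid)

# Strategy 1: use the dept index, then test each candidate's age
by_dept_then_filter = {rid for rid in dept_index["CS"] if rows[rid][1] == 40}

# Strategy 3: intersect the pointer sets from both single-attribute indexes
by_intersection = dept_index["CS"] & age_index[40]

# Composite index: the pair (dept, age) is treated as a single value
by_composite = composite_index[("CS", 40)]

print(by_dept_then_filter, by_intersection, by_composite)
```

All three return the same records; they differ in how many candidates are touched. The first strategy fetches every CS employee, the intersection handles two potentially large pointer sets, and the composite lookup goes straight to the matching entries.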
Hashing
• Basic idea: static hashing
• Dynamic hashing techniques
▪ Extendible hashing
▪ Linear hashing

Static Hashing
• Buckets contain index entries.
• Fast search with equality condition on hash field.
• Hash function h(field) yields block address, e.g. h(key) = key mod M
• Collisions are handled with overflow buckets.
• address space = bfr * M
[Figure: key → h → buckets 0, 1, …, M-1, with overflow buckets attached to full buckets]
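A static hash file with h(key) = key mod M can be sketched as below. This is an illustration only: each bucket is a Python list limited to bfr entries, and one overflow list per bucket stands in for a chain of overflow buckets. The class and method names are hypothetical.

```python
class StaticHashFile:
    def __init__(self, m, bfr):
        self.m = m                                   # number of buckets M (fixed)
        self.bfr = bfr                               # entries that fit in one bucket
        self.buckets = [[] for _ in range(m)]
        self.overflow = [[] for _ in range(m)]       # stand-in for overflow chains

    def h(self, key):
        return key % self.m

    def insert(self, key):
        b = self.h(key)
        if len(self.buckets[b]) < self.bfr:
            self.buckets[b].append(key)
        else:
            self.overflow[b].append(key)             # collision on a full bucket

    def search(self, key):
        b = self.h(key)                              # only one bucket is examined
        return key in self.buckets[b] or key in self.overflow[b]

f = StaticHashFile(m=4, bfr=2)
for key in [4, 8, 12, 5, 7]:
    f.insert(key)

print(f.buckets, f.overflow, f.search(12), f.search(16))
```

Because M is fixed, a growing file produces longer overflow chains and slower searches, which is the motivation for the dynamic techniques.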
Extendible Hashing
• Additional access structure: directory
[Figure: Insert 20, global depth d=2, directory entries 00, 01, 10, 11.
Before: Bucket A (d'=1): 4* 12* 13* 16*; Bucket B (d'=2): 10* 14*; Bucket C (d'=2): 15* 7* 19*.
After: Bucket A (d'=2, entry 00): 4* 12* 16* 20*; Bucket A' (d'=2, entry 01): 13*; Bucket B (d'=2, entry 10): 10* 14*; Bucket C (d'=2, entry 11): 15* 7* 19*.]
Extendible Hashing
• Extend → double the directory
▪ Before the insert, local depth of bucket = global depth. The insert causes the local depth to become > global depth.
• Shrink → halve the directory
▪ If removal of data entry makes bucket empty
▪ If each directory element points to the same bucket
• Gain: no performance degradation due to collisions
• At the cost of: 2 block accesses per record (directory + data), space for the directory, and bucket reorganization.

Linear Hashing
• hi(K) = K mod M
• Extend when collision
▪ Split bucket n in two.
▪ Distribute entries in bucket n based on hi+1(K) = (K mod 2M)
▪ n=n+1
• Retrieve: if (K mod M) < n then return hi+1(K) else return hi(K)
• Shrink: the buckets also are combined linearly
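The linear hashing rules (split bucket n, redistribute with K mod 2M, advance n, and choose the hash function on retrieval) can be sketched as follows. Assumptions on my part: a split is triggered whenever an insert leaves a bucket with more than `capacity` entries, a bucket that stays over capacity simply keeps its extra entries (standing in for an overflow chain), and a round ends when n reaches M, at which point M doubles. Shrinking is omitted. The class name is hypothetical.

```python
class LinearHashFile:
    def __init__(self, m=4, capacity=2):
        self.m = m                  # M: number of buckets at the start of the round
        self.n = 0                  # next bucket to split
        self.capacity = capacity
        self.buckets = [[] for _ in range(m)]

    def address(self, key):
        a = key % self.m                      # h_i(K) = K mod M
        if a < self.n:                        # bucket a was already split
            a = key % (2 * self.m)            # h_{i+1}(K) = K mod 2M
        return a

    def insert(self, key):
        bucket = self.buckets[self.address(key)]
        bucket.append(key)
        if len(bucket) > self.capacity:       # overflow: split bucket n,
            self._split()                     # not necessarily this bucket

    def _split(self):
        old = self.buckets[self.n]
        self.buckets.append([])               # new bucket gets address M + n
        self.buckets[self.n] = [k for k in old if k % (2 * self.m) == self.n]
        self.buckets[-1] = [k for k in old if k % (2 * self.m) != self.n]
        self.n += 1
        if self.n == self.m:                  # every bucket split: next round
            self.m *= 2
            self.n = 0

    def contains(self, key):
        return key in self.buckets[self.address(key)]

lh = LinearHashFile(m=4, capacity=2)
for key in [4, 8, 12]:
    lh.insert(key)

print(lh.n, lh.buckets, lh.contains(12), lh.contains(20))
```

No directory is needed: the pair (M, n) is enough to decide which of the two hash functions applies to a key.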
Indexes in reality – MySQL
▪ InnoDB storage engine
• Create a clustered index for each table
• Rows are physically ordered by the primary key
• B-trees

Indexes in reality – Oracle: Cluster Index
• Keep together (on disk) what belongs together → faster retrieval of data
• A cluster is made up of a group of tables that share common columns and are often used together.

Unclustered Tables: related data stored apart

EMP
EmpNo   EmpName   EmpDeptNo
100     Smith     10
101     Wilson    10
102     Jones     20
103     Baker     20

DEPT
DeptNo   DeptName
10       Sales
20       Admin

Clustered Tables: related data stored together

EMP_DEPT
ClusterKey Deptno 10   DeptName Sales
    EmpNo   EmpName
    100     Smith
    101     Wilson
ClusterKey Deptno 20   DeptName Admin
    EmpNo   EmpName
    102     Jones
    103     Baker
Indexes in reality – Oracle: Cluster Index
CREATE CLUSTER emp_dept (deptno NUMBER(3));
CREATE TABLE dept (
  deptno NUMBER(3) PRIMARY KEY,
  deptname VARCHAR2(10) NOT NULL )
  CLUSTER emp_dept (deptno);
CREATE TABLE emp (
  empno NUMBER(5) PRIMARY KEY,
  empname VARCHAR2(15) NOT NULL,
  empdeptno NUMBER(3) REFERENCES dept)
  CLUSTER emp_dept (empdeptno);
CREATE INDEX emp_dept_index ON CLUSTER emp_dept;

Indexes in reality – Oracle: Bitmap Index
• On columns with a low or medium number of distinct values
• Can even index NULL values
• Each bit in the bitmap corresponds to a possible record pointer

Record Pointer   EmpName   EmpSalary   Currency   Bitmap '$'   Bitmap '€'
0x011            Smith     2000        $          1            0
0x012            Baker     1900        $          1            0
0x022            Jones     1950        $          1            0
0x023            Müller    2020        €          0            1
0x034            Meier     2010        €          0            1
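The idea behind the bitmap index, one bit vector per distinct column value with one bit per record, can be sketched in a few lines. This illustrates the principle only and says nothing about how Oracle stores or compresses bitmaps; the rows repeat the Currency example, and `build_bitmaps` is a hypothetical helper.

```python
# (record pointer, EmpName, EmpSalary, Currency), as in the Currency example
rows = [
    ("0x011", "Smith", 2000, "$"),
    ("0x012", "Baker", 1900, "$"),
    ("0x022", "Jones", 1950, "$"),
    ("0x023", "Müller", 2020, "€"),
    ("0x034", "Meier", 2010, "€"),
]

def build_bitmaps(rows, column):
    """Return {value: bit list}; bit i is 1 if row i has that value."""
    bitmaps = {}
    for pos, row in enumerate(rows):
        bitmaps.setdefault(row[column], [0] * len(rows))[pos] = 1
    return bitmaps

bitmaps = build_bitmaps(rows, column=3)

# Answering "Currency = '€'" means reading off the set bits
euro_names = [rows[i][1] for i, bit in enumerate(bitmaps["€"]) if bit]

print(bitmaps, euro_names)
```

With few distinct values the bitmaps stay small, and conditions on several columns can be combined with bitwise AND and OR before any record is fetched.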
Summary
• Index files (primary, clustering, secondary)
• Search trees, B+-trees