CS 245: Database System Principles Notes 03: Disk Organization Topics for today

advertisement
Topics for today
CS 245: Database System
Principles
• How to lay out data on disk
• How to move it to memory
Notes 03: Disk Organization
Hector Garcia-Molina
CS 245
Notes 3
1
What are the data items we want to store?
•
•
•
•
a
a
a
a
salary
name
date
picture
CS 245
Notes 3
2
What are the data items we want to store?
•
•
•
•
a
a
a
a
salary
name
date
picture
What we have available: Bytes
8
bits
CS 245
Notes 3
3
To represent:
CS 245
4
To represent:
• Integer (short): 2 bytes
• Characters
 various coding schemes suggested,
e.g., 35 is
00000000
00100011
most popular is ascii
Example:
A: 1000001
a:
1100001
5:
0110101
LF: 0001010
• Real, floating point
n bits for mantissa, m for exponent….
CS 245
Notes 3
Notes 3
5
CS 245
Notes 3
6
1
To represent:
• Boolean
e.g., TRUE
FALSE
To represent:
• Boolean
e.g., TRUE
FALSE
1111 1111
0000 0000
• Application specific
e.g., RED  1 GREEN  3
BLUE  2 YELLOW  4 …
1111 1111
0000 0000
• Application specific
e.g., RED  1 GREEN  3
BLUE  2 YELLOW  4 …
Can we use less than 1 byte/code?
Yes, but only if desperate...
CS 245
Notes 3
7
To represent:
Notes 3
9
To represent:
8
• String of characters
– Null terminated
e.g.,
c a
t
– Length given
e.g.,
3 c a
t
- Fixed length
CS 245
Notes 3
10
Key Point
• Bag of bits
Length
Notes 3
To represent:
• Dates
e.g.: - Integer, # days since Jan 1, 1900
- 8 characters, YYYYMMDD
- 7 characters, YYYYDDD
(not YYMMDD! Why?)
• Time
e.g. - Integer, seconds since midnight
- characters, HHMMSSFF
CS 245
CS 245
• Fixed length items
Bits
• Variable length items
- usually length given at beginning
CS 245
Notes 3
11
CS 245
Notes 3
12
2
Overview
Also
Data Items
Records
• Type of an item: Tells us how to
interpret
(plus size if fixed)
Blocks
Files
Memory
CS 245
Notes 3
13
Record - Collection of related data
items (called FIELDS)
Notes 3
14
• Main choices:
– FIXED vs VARIABLE FORMAT
– FIXED vs VARIABLE LENGTH
15
Fixed format
CS 245
Notes 3
16
Example: fixed format and length
A SCHEMA (not record) contains
following information
- # fields
- type of each field
- order in record
- meaning of each field
CS 245
Notes 3
Types of records:
E.g.: Employee record:
name field,
salary field,
date-of-hire field, ...
CS 245
CS 245
Notes 3
Employee record
(1) E#, 2 byte integer
(2) E.name, 10 char.
(3) Dept, 2 byte code
17
CS 245
55 s m i t h
02
83 j o n e s
01
Notes 3
Schema
Records
18
3
Variable format
Example: variable format and length
# Fields
Code identifying
field as E#
Integer type
46 4 S 4
F O RD
Code for Ename
String type
Length of str.
2 5 I
• Record itself contains format
“Self Describing”
Field name codes could also be strings, i.e. TAGS
CS 245
Notes 3
19
Variable format useful for:
CS 245
Notes 3
20
• EXAMPLE: var format record with
repeating fields
Employee  one or more  children
• “sparse” records
• repeating fields
• evolving formats
3
E_name: Fred
Child: Sally Child: Tom
But may waste space...
CS 245
Notes 3
21
Note: Repeating fields does not imply
- variable format, nor
- variable size
John
Sailing
Chess
CS 245
Notes 3
22
Note: Repeating fields does not imply
- variable format, nor
- variable size
--
John
Sailing
Chess
--
• Key is to allocate maximum number of
repeating fields (if not used  null)
CS 245
Notes 3
23
CS 245
Notes 3
24
4
Many variants between
fixed - variable format:
Record header - data at beginning
that describes record
Example: Include record type in record
5
27
May contain:
- record type
- record length
- time stamp
- other stuff ...
....
record type
record length
tells me what
to expect
(i.e. points to schema)
CS 245
Notes 3
25
Exercise: How to store XML data?
<table>
<description> people on the fourth floor <\description>
<people>
<person>
<name> Alan <\name>
<age> 42 <\age>
<email> agb@abc.com <\email>
<\person>
<person>
<name> Sally <\name>
<age> 30 <\age>
<email> sally@abc.com <\email>
<\person>
<\people>
<\table>
CS 245
Notes 3
CS 245
trusted
processor
E(r)
• Compression
from:
Data on the Web,
Abiteboul et al
– within record - e.g. code selection
– collection of records - e.g. find common
patterns
• Encryption
27
CS 245
Notes 3
Notes 3
28
Encrypting Records
search
F(r)=x
dbms
trusted
processor
??
E(r1)
E(r2)
E(r3)
E(r4)
...
CS 245
26
Other interesting issues:
Encrypting Records
new
record
r
Notes 3
dbms
E(r1)
E(r2)
E(r3)
E(r4)
...
29
CS 245
Notes 3
30
5
Search Key in the Clear
search
k=2
trusted
processor
Q: k=2
dbms
[1,
[2,
[3,
[4,
Encrypt Key
search
k=2
A: [2, E(b2)]
• each record is [k,b]
• store [E(k), E(b)]
• can search for records with k=E(x)
31
CS 245
Issues
Notes 3
32
How Do We Search Now?
???
• Hard to do range queries
• Encryption not good
• Better to use encryption that does not
always generate same cyphertext
search
k=2
trusted
processor
Q: k’=E(2)
dbms
A: [E(2,dhe), E(b2)]
[E(2, lkz), E(b4)]
[E(1, abc), E(b1)]
[E(2, dhe), E(b2)]
[E(3, nft), E(b3)]
[E(2, lkz), E(b4)]
...
k
E(k, random)
E
A: [E(2), E(b2)]
dbms
[E(1), E(b1)]
[E(2), E(b2)]
[E(3), E(b3)]
[E(4), E(b4)]
...
Notes 3
k
Q: k’=E(2)
E(b1)]
E(b2)]
E(b3)]
E(b4)]
...
• each record is [k,b]
• store [k, E(b)]
• can search for records with k=x
CS 245
trusted
processor
• each record is [k,b]
• store [E(k, rand), E(b)]
• can search for records with k=E(x,???)?
D
simplification
CS 245
Notes 3
33
CS 245
Notes 3
Solution?
34
Solution?
• Develop new decryption function:
D(f(k1), E(k2, rand)) is true if k1=k2
• Develop new decryption function:
D(f(k1), E(k2, rand)) is true if k1=k2
search
k=2
Q: check if D(f(2),*) true
trusted
processor
dbms
A: [E(2,dhe), E(b2)]
[E(2, lkz), E(b4)]
[E(1, abc), E(b1)]
[E(2, dhe), E(b2)]
[E(3, nft), E(b3)]
[E(2, lkz), E(b4)]
...
CS 245
Notes 3
35
CS 245
Notes 3
36
6
What are choices/issues with
data compression?
Issues?
• Cannot do non-equality predicates
• Hard to build indexes
• Leaving search keys uncompressed not as
bad
• Larger compression units:
– better for compression efficiency
– worse for decompression overhead
• Similar data compresses better
– compress columns?
CS 245
Notes 3
37
Next: placing records into blocks
CS 245
Notes 3
38
Next: placing records into blocks
assume fixed
length blocks
blocks
...
blocks
...
a file
CS 245
Notes 3
a file
39
Options for storing records in blocks:
(1)
(2)
(3)
(4)
CS 245
assume a single file (for now)
Notes 3
40
(1) Separating records
Block
separating records
spanned vs. unspanned
sequencing
indirection
Notes 3
CS 245
R1
R2
R3
(a) no need to separate - fixed size recs.
(b) special marker
(c) give record lengths (or offsets)
- within each record
- in block header
41
CS 245
Notes 3
42
7
(2) Spanned vs. Unspanned
With spanned records:
• Unspanned: records must be within one
block
block 1
R1
block 2
R2
R3
R1
...
R4 R5
• Spanned
block 1
R1
R2
R3
(a)
R2
R3
R4
(b)
R5
R6
R7
(a)
need indication
need indication
of partial record
“pointer” to rest
of continuation
(+ from where?)
block 2
R3
(a)
CS 245
R3
R4
(b)
R5
R6
R7
(a)
...
Notes 3
43
• Unspanned is much simpler, but may
waste space…
• Spanned essential if
record size > block size
Notes 3
Notes 3
44
(3) Sequencing
Spanned vs. unspanned:
CS 245
CS 245
• Ordering records in file (and block) by
some key value
Sequential file (  sequenced)
45
CS 245
Notes 3
46
Why sequencing?
Sequencing Options
Typically to make it possible to efficiently
read records in order
(a) Next record physically contiguous
(e.g., to do a merge-join — discussed later)
R1
Next (R1)
...
(b) Linked
R1
CS 245
Notes 3
47
CS 245
Next (R1)
Notes 3
48
8
Sequencing Options
(c)
Sequencing Options
Overflow area
Records
in sequence
(c)
header
R1
Records
in sequence
R1
R2
R3
CS 245
Overflow area
R4
R4
R5
R5
Notes 3
49
(4) Indirection
R2.1
R2
R3
CS 245
R1.3
R4.7
Notes 3
50
(4) Indirection
• How does one refer to records?
• How does one refer to records?
Rx
Rx
Many options:
Physical
CS 245
Notes 3
51
CS 245
Purely Physical
E.g., Record
Address
or ID
CS 245
=
Notes 3
52
Fully Indirect
E.g., Record ID is arbitrary bit string
Device ID
Cylinder #
Block ID
Track #
Block #
Offset in block
Notes 3
Indirect
map
rec ID
r
53
CS 245
Rec ID Physical
addr.
Notes 3
address
a
54
9
Tradeoff
Flexibility
Cost
to move records
Physical
of indirection
Many options
in between …
(for deletions, insertions)
CS 245
Notes 3
55
Header
Free space
R3
R4
R2
Notes 3
57
Options for storing records in blocks:
(1)
(2)
(3)
(4)
Notes 3
56
May contain:
A block:
CS 245
CS 245
Block header - data at beginning that
describes block
Example: Indirection in block
R1
Indirect
- File ID (or RELATION or DB ID)
- This block ID
- Record directory
- Pointer to free space
- Type of block (e.g. contains recs type 4;
is overflow, …)
- Pointer to other blocks “like it”
- Timestamp ...
CS 245
Notes 3
58
Case Study: salesforce.com
• salesforce.com provides CRM services
• salesforce customers are tenants
• Tenants run apps and DBMS as service
separating records
spanned vs. unspanned
sequencing
indirection
tenant A
tenant B
salesforce.com
CRM App
data
tenant C
CS 245
Notes 3
59
CS 245
Notes 3
60
10
Tenants have similar data
Options for Hosting
• Separate DBMS per tenant
• One DBMS, separate tables per tenant
• One DBMS, shared tables
tenant 1:
tenant 2:
CS 245
Notes 3
61
salesforce.com solution
customer tenant
1
1
2
2
A
a1
a2
a3
a1
cust-other tenant
1
1
2
3
CS 245
B
b1
b2
b3
b1
A
a1
a2
a1
a4
C
c1
c2
c2
c1
f1
D
E
G
D
Notes 3
customer A
a3
a1
a4
CS 245
B
b3
b1
-
C
c2
c1
-
D G
- - g1
d1
Notes 3
62
Other Topics
fixed schema for
all tenants
v1 f2 v2 ...
d1 E e1
e2 F f2
g1
d1
customer A B C D E F
a1 b1 c1 d1 e1 a2 b2 c2 - e2 f2
(1) Insertion/Deletion
(2) Buffer Management
(3) Comparison of Schemes
var schema for
all tenants
63
CS 245
Notes 3
64
Options:
Deletion
(a) Immediately reclaim space
(b) Mark deleted
Block
Rx
CS 245
Notes 3
65
CS 245
Notes 3
66
11
Options:
As usual, many tradeoffs...
(a) Immediately reclaim space
(b) Mark deleted
• How expensive is to move valid record
to free space for immediate reclaim?
• How much space is wasted?
– May need chain of deleted records
(for re-use)
– e.g., deleted records, delete fields, free
space chains,...
– Need a way to mark:
• special characters
• delete field
• in map
CS 245
Notes 3
67
Concern with deletions
CS 245
Notes 3
68
Solution #1: Do not worry
Dangling pointers
R1
CS 245
?
Notes 3
69
CS 245
Notes 3
70
Solution #2: Tombstones
Solution #2: Tombstones
E.g., Leave “MARK” in map or old location
E.g., Leave “MARK” in map or old location
• Physical IDs
A block
This space
never re-used
CS 245
Notes 3
71
CS 245
This space can
be re-used
Notes 3
72
12
Solution #2: Tombstones
Insert
E.g., Leave “MARK” in map or old location
Easy case: records not in sequence
 Insert new record at end of file or
in deleted slot
 If records are variable size, not
as easy...
• Logical IDs
map
ID
LOC
Never reuse
ID 7788 nor
space in map...
7788
CS 245
Notes 3
73
Insert
CS 245
Notes 3
74
Interesting problems:
Hard case: records in sequence
 If free space “close by”, not too bad...
 Or use overflow idea...
CS 245
Notes 3
75
• How much free space to leave in each
block, track, cylinder?
• How often do I reorganize file + overflow?
CS 245
Notes 3
76
Buffer Management
Free
space
CS 245
•
•
•
•
•
•
Notes 3
77
DB features needed
Why LRU may be bad
Pinned blocks
Forced output
Double buffering
Swizzling
CS 245
Notes 3
Read
Textbook!
in Notes02
78
13
Swizzling
Swizzling
Memory
Disk
block 1
Rec A
CS 245
Memory
block 1
block 1
block 2
block 2
Notes 3
79
Disk
block 1
Rec A
CS 245
Rec A
Notes 3
80
Row vs Column Store
Row Store
• So far we assumed that fields of a
record are stored contiguously (row
store)...
• Another option is to store like fields
together (column store)
• Example: Order consists of
CS 245
– id, cust, prod, store, price, date, qty
Notes 3
81
id1
cust1
prod1
store1
price1
date1
qty1
id2
cust2
prod2
store2
price2
date2
qty2
id3
cust3
prod3
store3
price3
date3
qty3
CS 245
Notes 3
Column Store
Row vs Column Store
• Example: Order consists of
• Advantages of Column Store
– id, cust, prod, store, price, date, qty
id1
id2
id3
id4
...
cust1
cust2
cust3
cust4
...
id1
id2
id3
id4
...
prod1
prod2
prod3
prod4
...
id1
id2
id3
id4
...
price1
price2
price3
price4
...
block 2
qty1
qty2
qty3
qty4
...
82
– more compact storage (fields need not
start at byte boundaries)
– efficient reads on data mining operations
• Advantages of Row Store
– writes (multiple fields of one record)more
efficient
– efficient reads for record access (OLTP)
ids may or may not be stored explicitly
CS 245
Notes 3
83
CS 245
Notes 3
84
14
Interesting paper to read:
Comparison
• Mike Stonebreaker, Elizabeth (Betty)
O'Neil, Pat O’Neil, Xuedong Chen, et al.
" C-Store: A Column-oriented DBMS,"
Presented at the 31st VLDB Conference,
September 2005.
• http://www.cs.umb.edu/%7Eponeil/
vldb05_cstore.pdf
• There are 10,000,000 ways to organize
my data on disk…
CS 245
Notes 3
85
Space Utilization
Complexity
Performance
CS 245
Notes 3
CS 245
-
87
Example
86
fetch record given key
fetch record with next key
insert record
append record
delete record
update record
read all file
reorganize file
CS 245
Notes 3
88
Summary
How would you design Megatron 3000
storage system? (for a relational DB, low end)
• How to lay out data on disk
Data Items
– Variable length records?
– Spanned?
– What data types?
– Fixed format?
– Record IDs ?
– Sequencing?
– How to handle deletions?
CS 245
Notes 3
To evaluate a given strategy, compute
following parameters:
-> space used for expected data
-> expected time to
Issues:
Flexibility
Which is right for me?
Notes 3
Records
Blocks
Files
Memory
DBMS
89
CS 245
Notes 3
90
15
Next
How to find a record quickly,
given a key
CS 245
Notes 3
91
16
Download