Why and How to Build a Trusted Database System on Untrusted Storage?
Radek Vingralek
STAR Lab, InterTrust Technologies
In collaboration with U. Maheshwari and W. Shapiro
What?
 Trusted Storage: can be read and written only by trusted programs
Why?
Digital Rights Management
[figure: content governed by a contract]
What? Revisited
[figure: a trusted processor and volatile memory, a large amount of untrusted storage, and a small amount (<50 B) of trusted storage]
What? Refined
 Must also protect against accidental data corruption
• atomic updates
• efficient backups
• type-safe interface
• automatic index maintenance
 Must run in an embedded environment
• small footprint
 Must provide acceptable performance
What? Refined (cont.)
 Can assume single-user workload
• no or simple concurrency control
• optimized for response time, not throughput
• lots of idle time (can be used for database reorganization)
 Can assume a small database
• 100 KB to 10 MB
• can cache the working set
– no-steal buffer management
A Trivial Solution
[figure: plaintext data passes through an encryption and hashing layer before reaching a COTS dbms, which stores the encrypted db in untrusted storage; the key and H(db) are kept in trusted storage]
 Critique:
• does not protect metadata
• cannot use sorted indexes
A Better Solution
[figure: plaintext data goes through a (COTS) dbms, and the entire db is encrypted and hashed below it before reaching untrusted storage; the key and H(db) are kept in trusted storage]
 Critique:
• must scan, hash and crypt the entire db to read or write
Yet A Better Solution
[figure: plaintext data goes through a (COTS) dbms over an encryption and hashing layer; chunks A through G in untrusted storage form a hash tree in which each parent stores the hashes of its children, and the key and H(A) are kept in trusted storage]
 Open issues:
• could we do better than a logarithmic overhead?
• could we integrate the tree search with data location?
TDB Architecture
 Backup Store
• full / incremental
• validated restore
 Collection Store (exposes collections of objects)
• index maintenance
• scan, match, range
 Object Store (exposes objects; an object is an abstract type)
• object cache
• concurrency control
 Chunk Store (exposes chunks; a chunk is a byte sequence of 100 B to 100 KB)
• encryption, hashing
• atomic updates
 Below the Chunk Store: untrusted storage plus a small amount of trusted storage
Chunk Store - Specification
 Interface
• allocate() -> ChunkId
• write( ChunkId, Buffer )
• read( ChunkId ) -> Buffer
• deallocate( ChunkId )
 Crash atomicity
• commit = [ write | deallocate ]*
 Tamper detection
• raise an exception if chunk validation fails
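A minimal C++ sketch of such an interface (type and method names here are illustrative, not TDB's actual API):

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

using ChunkId = std::uint64_t;
using Buffer  = std::vector<std::uint8_t>;

// Raised when a chunk fails hash validation on read (tamper detection).
struct TamperDetected : std::runtime_error {
    using std::runtime_error::runtime_error;
};

class ChunkStore {
public:
    virtual ~ChunkStore() = default;

    virtual ChunkId allocate() = 0;                         // reserve a new chunk id
    virtual void    write(ChunkId id, const Buffer& b) = 0; // new version of the chunk
    virtual Buffer  read(ChunkId id) = 0;                   // throws TamperDetected on mismatch
    virtual void    deallocate(ChunkId id) = 0;

    // Crash atomicity: all writes and deallocates since the previous commit
    // become durable together, or not at all.
    virtual void commit() = 0;
};
```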
Chunk Store - Storage Organization
 Log-structured Storage Organization
• no static representation of chunks outside of the log
• log in the untrusted storage
 Advantages
• traffic analysis cannot link updates to the same chunk
• atomic updates for free
• easily supports variable-sized chunks
• copy-on-write snapshots for fast backups
• integrates well with hash verification (see next slide)
 Disadvantages
• destroys clustering (cacheable working set)
• cleaning overhead (expect plenty of idle time)
Chunk Store - Chunk Map
 Integrates hash tree and location map
• Map: ChunkId → Handle
• Handle = ‹Hash, Location›
• MetaChunk = Array[Handle]
[figure: trusted storage holds H(R); meta chunks R, S, T form the tree, with data chunks X and Y at the leaves]
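A sketch of the structures this implies, continuing the interface above; the 12-byte truncated SHA-1 matches the evaluation setup later in the talk, and all field names are assumptions:

```cpp
#include <array>
#include <cstdint>
#include <vector>

using Hash = std::array<std::uint8_t, 12>;   // truncated SHA-1, as in the evaluation setup

// Handle = <Hash, Location>: enough both to locate a chunk version in the log
// and to validate it, so the location map and the hash tree are one structure.
struct Handle {
    Hash          hash;      // hash of the chunk's current version
    std::uint64_t location;  // position of that version in the log
};

// A meta chunk is just an array of handles for its children.
struct MetaChunk {
    std::vector<Handle> children;
};

// Trusted storage only needs the handle of the root meta chunk R.
struct TrustedState {
    Handle root;
};
```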
Chunk Store - Read
 Basic scheme: Dereference handles from root to X
 Dereference a handle:
• use its location to fetch
• use its hash to validate
 Optimized
• trusted cache: ChunkId → Handle
• look for a cached handle upward from X
• dereference handles down to X
• avoids validating the entire path
[figure: meta chunks R, S, T over data chunks X, Y; H(R) is in trusted storage and a handle on the path to X is already cached]
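A sketch of the optimized read path, continuing the structures above. fetch, validate, handleWithin and pathFromRoot are assumed helpers (reading the log, checking a hash, finding a child's handle inside a meta chunk, and giving the meta-chunk path to a chunk); none of them are TDB's literal API:

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

Buffer fetch(std::uint64_t location);                         // read a chunk version from the log
bool   validate(const Buffer& data, const Hash& expected);    // hash check
Handle handleWithin(const Buffer& metaChunk, ChunkId child);  // child's handle inside a meta chunk
std::vector<ChunkId> pathFromRoot(ChunkId id);                // e.g. {R, S, X} in the figure

std::unordered_map<ChunkId, Handle> handleCache;  // trusted cache: ChunkId -> Handle
TrustedState trusted;

Buffer readChunk(ChunkId id) {
    // Walk upward from id until a cached (already validated) handle is found;
    // fall back to the root handle kept in trusted storage.
    std::vector<ChunkId> path = pathFromRoot(id);
    std::size_t start = 0;
    Handle h = trusted.root;
    for (std::size_t i = path.size(); i-- > 0; ) {
        auto it = handleCache.find(path[i]);
        if (it != handleCache.end()) { h = it->second; start = i; break; }
    }
    // Dereference handles downward, validating only the uncached tail of the path.
    Buffer data;
    for (std::size_t i = start; i < path.size(); ++i) {
        data = fetch(h.location);
        if (!validate(data, h.hash)) throw TamperDetected("chunk hash mismatch");
        handleCache[path[i]] = h;
        if (i + 1 < path.size()) h = handleWithin(data, path[i + 1]);
    }
    return data;
}
```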
Chunk Store - Write
 Basic: write chunks from X to root
 Optimized:
• buffer dirty handle of X in cache
• defer upward propagation
[figure: the handle of X is dirty in the trusted cache; meta chunks R, S, T and the root hash H(R) are not yet rewritten]
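The corresponding write path is short: append the new version to the log, buffer its dirty handle in the trusted cache, and leave the ancestors alone until checkpoint. appendToLog and hashOf are assumed primitives, continuing the sketch above:

```cpp
#include <unordered_set>

std::uint64_t appendToLog(ChunkId id, const Buffer& data);  // returns the version's log position
Hash          hashOf(const Buffer& data);

std::unordered_set<ChunkId> dirty;  // handles not yet propagated into their parent meta chunks

void writeChunk(ChunkId id, const Buffer& data) {
    Handle h{ hashOf(data), appendToLog(id, data) };
    handleCache[id] = h;   // buffer the dirty handle in the trusted cache
    dirty.insert(id);      // upward propagation is deferred until checkpoint
}
```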
Chunk Store - Checkpointing the Map
 When dirty handles fill cache
• write affected meta chunks to log
• write root chunk last
[figure: the affected meta chunks are appended to the log as ... T, S, with the root chunk R written last; H(R) is in trusted storage]
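A sketch of checkpointing under the same assumptions: each dirty handle is folded into its parent meta chunk, which is rewritten through writeChunk and therefore becomes dirty in turn, so propagation naturally ends with the root being the last meta chunk appended to the log. parentOf, indexInParent, loadMetaChunk and serialize are assumed helpers:

```cpp
#include <optional>

std::optional<ChunkId> parentOf(ChunkId id);   // empty for the root meta chunk
std::size_t indexInParent(ChunkId id);         // child slot of id inside its parent
MetaChunk   loadMetaChunk(ChunkId id);         // current contents of a meta chunk
Buffer      serialize(const MetaChunk& m);

void checkpoint() {
    while (!dirty.empty()) {
        ChunkId id = *dirty.begin();
        dirty.erase(dirty.begin());
        std::optional<ChunkId> parent = parentOf(id);
        if (!parent) {                        // id is the root meta chunk
            trusted.root = handleCache[id];   // new root handle recorded in trusted storage
            continue;
        }
        MetaChunk m = loadMetaChunk(*parent);
        m.children[indexInParent(id)] = handleCache[id];
        writeChunk(*parent, serialize(m));    // marks the parent dirty in turn
    }
}
```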
Chunk Store - Crash Recovery
 Process log from last root chunk
• residual log
• checkpointed log
 Must validate residual log
[figure: the checkpointed log ends with the meta chunks ... T S R; the chunks written after R (e.g. X, Y) up to the crash form the residual log; H(R) is in trusted storage]
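A sketch of the recovery flow under the same assumptions: the chunk map rooted at the last root chunk covers the checkpointed log, and every record after that root chunk is re-applied as a dirty handle, exactly as if it had just been written. lastRootLocation and residualRecordsAfter are assumed helpers, and the residual log must first be validated (next slide):

```cpp
struct LogRecord {
    ChunkId       id;
    Buffer        data;
    std::uint64_t location;   // where this version sits in the log
};

std::uint64_t          lastRootLocation();                       // position of the last root chunk
std::vector<LogRecord> residualRecordsAfter(std::uint64_t pos);  // records written after it

void recover() {
    // Everything up to the last root chunk is reachable and validated through
    // the chunk map; only the residual log needs to be replayed.
    for (const LogRecord& r : residualRecordsAfter(lastRootLocation())) {
        handleCache[r.id] = Handle{ hashOf(r.data), r.location };
        dirty.insert(r.id);
    }
}
```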
Chunk Store - Validating the Log
 Keep incremental hash of residual log in trusted storage
• updated after each commit
 Hash protects all current chunks
• in residual log: directly
• in checkpointed log: through chunk map
[figure: same log layout as before, but trusted storage now holds H*(residual-log), the incremental hash over the chunks written after the last root chunk R]
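A sketch of the incremental hash, with an assumed chaining primitive (for instance, hashing the previous value concatenated with the new record); the exact construction is not specified in the talk:

```cpp
Hash chainHash(const Hash& prev, const Buffer& record);   // assumed chaining primitive

Hash residualLogHash{};   // H*(residual-log), kept in trusted storage

// At commit: extend the incremental hash with every record of the commit set.
void onCommit(const std::vector<Buffer>& commitSet) {
    for (const Buffer& rec : commitSet)
        residualLogHash = chainHash(residualLogHash, rec);
}

// At checkpoint the residual log is emptied, so H* restarts from a fixed value.
void onCheckpoint() { residualLogHash = Hash{}; }

// At recovery: recompute the hash over the residual log found on disk and compare.
bool validateResidualLog(const std::vector<Buffer>& residualRecords) {
    Hash h{};
    for (const Buffer& rec : residualRecords)
        h = chainHash(h, rec);
    return h == residualLogHash;   // mismatch means tampering or lost records
}
```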
Chunk Store - Counter-Based Log Validation
 A commit chunk is written with each commit
• contains a sequential hash of commit set
• signed with system secret key
 One-way counter used to prevent replays
 Benefits:
• allows bounded discrepancy between trusted and untrusted
storage
• doesn’t require writing to trusted storage after each
transaction
[figure: the residual log interleaves data chunks with commit chunks (c.c.) carrying counter values 73 and 74; each commit chunk holds a hash over its commit set]
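A sketch of the counter-based variant, reusing chainHash from above. signWithSecretKey (a MAC or signature under the system secret key) and readOneWayCounter are assumed primitives, and the acceptance rule simply bounds how far the log's last commit chunk may lag behind the one-way counter:

```cpp
Hash          signWithSecretKey(const Hash& digest, std::uint64_t counter);  // assumed MAC/signature
std::uint64_t readOneWayCounter();                                           // monotonic counter

struct CommitChunk {
    Hash          commitSetHash;  // sequential hash over the commit set
    std::uint64_t counter;        // one-way counter value at commit time
    Hash          signature;      // binds hash and counter to the secret key
};

CommitChunk makeCommitChunk(const std::vector<Buffer>& commitSet) {
    Hash h{};
    for (const Buffer& rec : commitSet) h = chainHash(h, rec);
    std::uint64_t c = readOneWayCounter();
    return CommitChunk{ h, c, signWithSecretKey(h, c) };
}

// A replayed (old) log is rejected because its last commit chunk lags too far
// behind the one-way counter; maxGap bounds the allowed discrepancy, so trusted
// storage need not be written after every transaction.
bool acceptLog(const CommitChunk& last, std::uint64_t maxGap) {
    if (signWithSecretKey(last.commitSetHash, last.counter) != last.signature) return false;
    return readOneWayCounter() - last.counter <= maxGap;
}
```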
Chunk Store - Log Cleaning
 Log cleaner creates free space by reclaiming obsolete chunk
versions
 Segments
• Log divided into fixed-size regions called segments (~100 KB)
• Segments are securely linked in the residual log for recovery
 Cleaning step
• read 1 or more segments
• check chunk map to find live chunk versions
– ChunkId’s in the headers of chunk versions
• write live chunk versions to the end of log
• mark segments as free
 May not clean segments in residual log
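A sketch of one cleaning step under the same assumptions; readSegment, currentHandle, markSegmentFree and inResidualLog are assumed helpers, and liveness is tested by comparing a version's position with the handle the chunk map currently holds:

```cpp
struct ChunkVersion {
    ChunkId       id;        // taken from the version's header
    std::uint64_t location;
    Buffer        data;
};

std::vector<ChunkVersion> readSegment(std::uint64_t segmentNo);
Handle currentHandle(ChunkId id);              // looked up through the chunk map
void   markSegmentFree(std::uint64_t segmentNo);
bool   inResidualLog(std::uint64_t segmentNo);

void cleanSegment(std::uint64_t segmentNo) {
    if (inResidualLog(segmentNo)) return;      // residual-log segments are never cleaned

    for (const ChunkVersion& v : readSegment(segmentNo)) {
        // A version is live iff the chunk map still points at this location.
        if (currentHandle(v.id).location == v.location)
            writeChunk(v.id, v.data);          // relocate the live version to the log tail
    }
    // (In practice the segment can only be reused once the relocated handles
    // have been checkpointed.)
    markSegmentFree(segmentNo);
}
```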
Chunk Store - Multiple Partitions
 Partitions may use separate crypto parameters (algorithms, keys)
 Enables fast copy-on-write snapshots and efficient backups
 More difficult for the cleaner to test chunk version liveness
[figure: a partition map points to per-partition position maps for P and Q, which point to the data chunks; after a copy-on-write snapshot both position maps share data chunk D until an update produces a new version D2 for one partition]
Chunk Store - Cleaning and Partition Snapshots
[figure: partition P is snapshotted as Q, so both initially share chunks P.a, P.b, P.c; P then updates c and the cleaner relocates Q's copy of c; the relocated and updated versions of P.c end up in the checkpointed and residual log, where a crash may strike before the next checkpoint]
Backup Store
 Creates and restores backups of partitions
 Backups of partitions can be full or incremental
 Backup creation utilizes snapshots to guarantee backup consistency (w.r.t. concurrent updates) without locking
 Backup Store must verify during a backup restore
• integrity of the backup (using a signature)
• correctness of incremental restore sequencing
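A sketch of the two restore-time checks; the header layout and signBackup (a signature or MAC under a key the restorer trusts) are assumptions, since the talk only states that integrity and incremental sequencing are verified:

```cpp
struct BackupHeader {
    std::uint64_t partition;
    std::uint64_t sequenceNo;  // 0 = full backup, n = n-th incremental after it
    Hash          signature;   // over the backup contents
};

Hash signBackup(const Buffer& contents);   // assumed signing/MAC primitive

bool verifyRestoreStep(const BackupHeader& h, const Buffer& contents,
                       std::uint64_t expectedSequenceNo) {
    if (signBackup(contents) != h.signature) return false;   // integrity of the backup
    return h.sequenceNo == expectedSequenceNo;                // correct incremental ordering
}
```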
Object Store
 Provides type-safe access to named C++ objects
• objects provide pickle and unpickle methods for persistence
• but no transparent persistence
 Implements full transactional semantics
• in addition to atomic updates
 Maps each object into a single chunk
• less data written to and read from the log
• simplifies concurrency control
 Provides an in-memory cache of decrypted, validated,
unpickled, type-checked C++ objects
 Implements no-steal buffer management policy
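A sketch of the pickle/unpickle convention: each persistent object serializes itself into the bytes of the single chunk that backs it. The base class and the example type are illustrative, not TDB's own classes:

```cpp
#include <cstdint>

class Persistent {
public:
    virtual ~Persistent() = default;
    virtual Buffer pickle() const = 0;            // object -> chunk bytes
    virtual void   unpickle(const Buffer& b) = 0; // chunk bytes -> object
};

// Example persistent type: a TPC-B style account with a 64-bit balance.
class Account : public Persistent {
public:
    std::int64_t balance = 0;

    Buffer pickle() const override {
        Buffer b(8);
        for (int i = 0; i < 8; ++i) b[i] = static_cast<std::uint8_t>(balance >> (8 * i));
        return b;
    }
    void unpickle(const Buffer& b) override {
        std::uint64_t v = 0;
        for (int i = 0; i < 8; ++i) v |= std::uint64_t(b[i]) << (8 * i);
        balance = static_cast<std::int64_t>(v);
    }
};
```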
Collection Store
 Provides access to indexed collections of C++ objects using
scan, exact match and range queries
 Performs automatic index maintenance during updates
• implements insensitive iterators
 Uses functional indices
• an extractor function is used to obtain a key from an object
 Collections and indexes are represented as objects
• index nodes locked according to 2PL
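A sketch of a functional index: the collection is created with an extractor that computes the key from an object, and the index maps keys to object ids for exact-match and range lookups. Names are illustrative; Account is the example type from the previous sketch:

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <vector>

using ObjectId = std::uint64_t;

template <typename T, typename Key>
class FunctionalIndex {
public:
    explicit FunctionalIndex(std::function<Key(const T&)> extractor)
        : extract_(std::move(extractor)) {}

    // Called by the collection store on insert/update (automatic index maintenance).
    void insert(ObjectId id, const T& obj) { index_.emplace(extract_(obj), id); }

    // Range query: ids of all objects whose key lies in [lo, hi].
    std::vector<ObjectId> range(const Key& lo, const Key& hi) const {
        std::vector<ObjectId> out;
        for (auto it = index_.lower_bound(lo); it != index_.end() && !(hi < it->first); ++it)
            out.push_back(it->second);
        return out;
    }

private:
    std::function<Key(const T&)> extract_;
    std::multimap<Key, ObjectId> index_;
};

// Usage: index Account objects by balance.
// FunctionalIndex<Account, std::int64_t> byBalance([](const Account& a) { return a.balance; });
```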
Performance Evaluation - Benchmark
 Compared TDB to BerkeleyDB using TPC-B
 Used TPC-B because:
• implementation included with BerkeleyDB
• BerkeleyDB functionality limited choice of benchmarks (e.g., 1
index per collection)
Performance Evaluation - Setup
 Evaluation platform
• 733 MHz Pentium III, 256 MB
• Windows NT 4.0, NTFS files
• EIDE disk, 8.9 ms read / 10.9 ms write seek time
• 7200 RPM (4.2 ms avg. rotational latency)
• one-way counter: file on NTFS
 Both systems used a 4 MB cache
 Crypto parameters (for secure version of TDB):
• SHA-1 for hashing (hash truncated to 12 B)
• 3DES for encryption
Performance Evaluation - Results
 Response Time (avg over 100,000 transactions in a steady state):
 TDB utilization was set to 60%
[bar chart: avg. response time (ms) for BerkeleyDB, TDB and TDB-S; the plotted values are 3.8, 5.8 and 6.8 ms]
Response Time vs. Utilization
 Measured response times for different TDB utilizations:
[line chart: avg. response time (ms, 0 to 8) versus utilization (0.5 to 0.9) for TDB and BerkeleyDB]
Related Work
 Theoretical work
• Merkle Tree 1980
• Checking correctness of memory (Blum et al. 1992)
 Secure audit logs, Schneier & Kelsey 1998
• append-only data
• read sequentially
 Secure file systems
• Cryptographic FS, Blaze ‘93
• Read-only SFS, Fu et al. ‘00
• Protected FS, Stein et al. ‘01
A Retrospective Instead of Conclusions
 Got lots of mileage from using log-structured storage
 Partitions add lots of complexity
 Cleaning not a big problem
 Crypto overhead small on modern PCs (< 6%)
 Code footprint too large for many embedded systems
• needs to be within 10 KB
• GnatDb (see the technical report)
 For More Information:
• OSDI 2000 -- “How to Build a Trusted Database System on
Untrusted Storage.” U. Maheshwari, R. Vingralek, W. Shapiro
• Technical Reports available at http://www.star-lab.com/tr/
Database Size vs. Utilization
[line chart: database size (MB, 0 to 350) versus utilization (0.5 to 0.9) for TDB and BerkeleyDB]