Client-server caching and
object stores
Benjamin Atkin
batkin@cs.cornell.edu
Client-server database design
 Low-level considerations
   How can database systems exploit powerful client machines?
   What implementation techniques are required?
 High-level considerations
   What interface is provided to applications?
   How can we efficiently implement it?
Client-server caching
Overview
 Client-server systems
 Advantages of caching
 Object-oriented databases
 Wisconsin's EXODUS storage manager
   Cache consistency and transactions
   Implementation of programming interface: QuickStore
Client-server systems
 Simplify the client machines
 Share services: filesystem, database, ...
 Run on powerful, dedicated hosts
 User machines are "clients" of servers
   enables data sharing
   centralised maintenance
   greater security
   e.g. Sun’s Network File System
Networks of workstations
 c. 1990: more powerful clients
 Move some processing to clients
   faster response time
   better utilisation of client machines
   less load on server
   greater scalability
   autonomy in the face of server failure
Naive client-server data access
"read blue object"
...
"read blue object"
Client-server caching
"read blue object"
...
"read blue object"
Caching principles
 Analogous to hardware caching
 Server stores the canonical copy of data
 Client caches the results of each read
 Subsequent accesses served from cache
 What if the data changes? Alternatives:
   "cheap to detect incorrect data", e.g. DNS
   "validate before use"
   "notify on change"
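The "validate before use" and "notify on change" alternatives from the caching principles can be sketched as follows. This is a minimal in-memory illustration, not code from any system discussed in these slides: the classes `Server`, `ValidatingClient` and `NotifiedClient`, and the per-object version counter, are assumptions made for the example.

```python
class Server:
    """Holds the canonical copy of each object plus a version counter (assumed scheme)."""

    def __init__(self):
        self.data = {}       # key -> (version, value)
        self.listeners = []  # clients that asked to be notified of changes

    def write(self, key, value):
        version = self.data.get(key, (0, None))[0] + 1
        self.data[key] = (version, value)
        for client in self.listeners:   # "notify on change"
            client.invalidate(key)

    def read(self, key):
        return self.data[key]           # (version, value)

    def version(self, key):
        return self.data[key][0]


class ValidatingClient:
    """'Validate before use': ask the server for the version on every access."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.fetches = 0

    def read(self, key):
        cached = self.cache.get(key)
        if cached is None or cached[0] != self.server.version(key):
            self.cache[key] = self.server.read(key)
            self.fetches += 1
        return self.cache[key][1]


class NotifiedClient:
    """'Notify on change': trust the cache until the server invalidates an entry."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.fetches = 0
        server.listeners.append(self)

    def invalidate(self, key):
        self.cache.pop(key, None)

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.server.read(key)
            self.fetches += 1
        return self.cache[key][1]
```

In this sketch the validating client still contacts the server on every read (a cheap version check), while the notified client contacts it only after an invalidation; the cost moves to the server, which must track who caches what.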
Caching in distributed file systems
 CMU's Andrew File System
   clients cache all files on local disks
   50 client machines for each server
 UC Berkeley's Sprite OS
   file cache competes with virtual memory
 Coda (follow-up to AFS), UCLA's Ficus
   client can completely disconnect from server
   prediction algorithms to determine what to cache
Disadvantages of caching
 Increases client workload, complexity
 We may cache the wrong data!
   potentially wasted network traffic
   uses valuable space in the cache
 Data consistency problem
   stale cached data
   simultaneous writeback
Client-server caching revisited
Michael J. Franklin and Michael J. Carey
Dividing the work
 Query shipping
   clients send queries to the server
 Data shipping
   clients request data from server
   transactions run locally
   potential for caching
Why cache data?
 A client may
   read a data object repeatedly
   read and write an object
   execute multiple transactions on an object
 Cache an object and execute transactions locally
 Write back final value on commit
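The idea of running a transaction against a cached copy and writing back only the final value can be sketched like this. It is a toy model that ignores locking, aborts and concurrent clients; `PageServer`, `CachingClient` and their methods are hypothetical names introduced for the example.

```python
class PageServer:
    """Hypothetical server holding the committed value of each object."""

    def __init__(self, initial):
        self.store = dict(initial)
        self.reads = 0    # number of fetch requests served
        self.writes = 0   # number of write-backs received

    def fetch(self, key):
        self.reads += 1
        return self.store[key]

    def store_value(self, key, value):
        self.writes += 1
        self.store[key] = value


class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}     # survives across transactions
        self.dirty = set()  # objects modified by the current transaction

    def read(self, key):
        if key not in self.cache:           # only a miss goes to the server
            self.cache[key] = self.server.fetch(key)
        return self.cache[key]

    def write(self, key, value):
        self.cache[key] = value             # update the local copy only
        self.dirty.add(key)

    def commit(self):
        for key in self.dirty:              # write back final values
            self.server.store_value(key, self.cache[key])
        self.dirty.clear()
```

Five local increments of one object then cost one fetch and one write-back, rather than five round trips each way.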
Database client caching
client
server
begin transaction
read A
cache A
write A
end transaction
store A
begin transaction
read A
read B
...
The downside
 Introduces a consistency problem
 Increases work at client
 Slower under some conditions
 Potentially higher abort rates ...
 ...
Caching in EXODUS
 Small objects are grouped in fixed-size disk pages
 Caching and locking at the page level
 Client has buffer manager, lock manager
 Franklin+Livny investigate the best strategy for caching with transactions
Alternatives for caching
 Intra-transaction versus inter-transaction caching
 Caching locks as well as data
 Local versus global locking
 Optimistic versus centralised locking
 Invalidation versus propagation of updates
What to do on writeback?
begin transaction
...
fetch blue object
What to do on writeback?
commit transaction
?
propagate or invalidate?
A taxonomy of strategies
 Primary-copy server 2PL
 Caching 2PL
   no lock caching, validate data before use
 Optimistic 2PL variants
   O2PL-Dynamic, O2PL-New Dynamic
 Callback locking
Optimistic 2PL
 During transaction acquire local locks
 At commit, validate with server
 Propagation variant requires 2PC
 Dynamic variant's propagation heuristic
   page is resident at receiving site
   accessed since last propagation of page
   previously invalidated this page incorrectly
Callback locking
 Global locks required during transaction
 On lock conflict, server calls back other clients to revoke their locks
 No validation required on commit
 CB-Read: cache only read locks
 CB-All: cache write locks as well, lock downgrade on conflict
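A rough sketch of the CB-Read idea (clients cache read locks, and a write request triggers callbacks) might look like the following. It is single-threaded and assumes that a callback always succeeds immediately; it does not model a writer holding its lock until commit, blocking, or deadlock. `CallbackServer`, `LockClient` and their methods are hypothetical names, not the EXODUS interface.

```python
class CallbackServer:
    def __init__(self):
        self.readers = {}  # page -> set of clients caching a read lock on it

    def read_lock(self, client, page):
        self.readers.setdefault(page, set()).add(client)

    def write_lock(self, client, page):
        holders = self.readers.setdefault(page, set())
        # Conflict: call back every other client that caches a read lock.
        for other in list(holders):
            if other is not client:
                other.callback(page)
                holders.discard(other)
        holders.add(client)


class LockClient:
    def __init__(self, server):
        self.server = server
        self.cached_read_locks = set()

    def read(self, page):
        """Return where permission to read came from: 'cache' or 'server'."""
        if page in self.cached_read_locks:
            return "cache"                  # no message to the server
        self.server.read_lock(self, page)
        self.cached_read_locks.add(page)
        return "server"

    def write(self, page):
        self.server.write_lock(self, page)  # write locks are never cached here
        self.cached_read_locks.add(page)    # keep a read lock afterwards

    def callback(self, page):
        # Assumption: the page is not in use, so the lock is given up at once.
        self.cached_read_locks.discard(page)
```

Because a client only reads under a lock the server knows about, no validation step is needed at commit; the price is the callback round trip whenever another client wants to write.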
Experiments
 Vary data access patterns
 Vary bottlenecks in the system
HOTCOLD workload, slow network
FEED workload, slow network
HICON workload, fast network
Summary
 CB-Read, O2PL-ND come out best
 CB-Read implemented in EXODUS
   lower abort rate than O2PL-ND
   scales better with data contention
 Natural consequences of the optimistic approach?
QuickStore: a high-performance mapped object store
Seth J. White and David J. DeWitt
Object-oriented versus object-relational DBs
 Distributed application support
 Persistent store for program data
 Access through programming language (C++), not SQL
 Transactions over objects
The programming interface
 Application manipulates object identifiers (OIDs)
 "Swizzling" resolves OID to the object
   hardware swizzling: OID is a pointer, use VM manipulations to do mapping
   software swizzling: OID contains a pointer, reached through an indirection
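Software swizzling can be illustrated with a reference object that carries the OID and fills in a direct pointer on first use. This is only a sketch of the general technique, not E's implementation; `ObjectStore` and `Ref` are hypothetical names. Hardware swizzling has no direct Python analogue, since it relies on virtual-memory page protection.

```python
class ObjectStore:
    """Hypothetical store mapping OIDs to objects."""

    def __init__(self, objects):
        self.objects = objects
        self.lookups = 0

    def load(self, oid):
        self.lookups += 1
        return self.objects[oid]


class Ref:
    """Software-swizzled reference: an OID plus, once resolved, a direct pointer."""

    def __init__(self, store, oid):
        self.store = store
        self.oid = oid
        self.target = None  # in-memory pointer, filled in on first dereference

    def deref(self):
        if self.target is None:   # this check is paid on every dereference
            self.target = self.store.load(self.oid)
        return self.target
```

The residency check in `deref` is the indirection cost of the software approach: it is small, but it is paid on every access, whereas a hardware-swizzled pointer is dereferenced directly once its page is mapped.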
Design alternatives
 White+DeWitt compare QuickStore and E
   QuickStore uses hardware swizzling
   E uses software swizzling, interpreter
 Both extend C++, over EXODUS storage manager (ESM)
 All objects are accessed in transactions
QuickStore structure
client
frame A
page a
buffer pool
ESM
object store
Fine points of pointer swizzling
 Complication: objects can contain pointers to other objects
 When a page is mapped to a frame
   if a pointer in page points to a mapped page, make it point to the correct frame
   otherwise, make it point to a new frame
 Use page protection to catch accesses to non-mapped pages: Unix mmap
Page faults
 Page frames in memory have protection bits: read-only, no access, etc.
 Incorrect access generates a "fault"
 Protection faults can be handled by the application itself
 In QuickStore, reference to no-access frame => bring the page from the object server
QuickStore page faults
client
0x180
0x223
?fault
object store
The mapping procedure
 Fault on a pointer dereference
 Request a page from the server
 Load into buffer pool
 Rewrite pointers in page
 Map buffer slot to required frame
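The steps of the mapping procedure can be simulated without real page protection. In the sketch below a missing entry in `loaded` stands in for a no-access frame, and `handle_fault` plays the role of the fault handler; the buffer pool and the frame mapping are collapsed into that one dictionary. The page layout (slots tagged `"val"`, `"ptr"` or `"addr"`), the class `SwizzlingClient` and the addresses are all assumptions made for illustration, not QuickStore's data structures.

```python
PAGE_SIZE = 0x1000


class SwizzlingClient:
    def __init__(self, server_pages):
        # page id -> {offset: ("val", data) or ("ptr", page id, offset)}
        self.server_pages = server_pages
        self.frame_of = {}        # page id -> virtual frame address
        self.loaded = {}          # frame address -> swizzled page contents
        self.next_frame = 0x10000

    def frame_for(self, page_id):
        """Return the frame reserved for a page, reserving a new one if needed."""
        if page_id not in self.frame_of:
            self.frame_of[page_id] = self.next_frame
            self.next_frame += PAGE_SIZE
        return self.frame_of[page_id]

    def handle_fault(self, frame):
        page_id = next(p for p, f in self.frame_of.items() if f == frame)
        raw = self.server_pages[page_id]      # request the page from the server
        swizzled = {}
        for offset, slot in raw.items():      # rewrite pointers in the page
            if slot[0] == "ptr":
                _, target_page, target_offset = slot
                slot = ("addr", self.frame_for(target_page) + target_offset)
            swizzled[offset] = slot
        self.loaded[frame] = swizzled         # map the contents to the frame

    def deref(self, address):
        offset = address % PAGE_SIZE
        frame = address - offset
        if frame not in self.loaded:          # stands in for a protection fault
            self.handle_fault(frame)
        return self.loaded[frame][offset]
```

Note that rewriting the pointers in page `a` reserves a frame for page `b` without fetching it; `b` is only requested when one of those pointers is actually dereferenced.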
The ESM buffer manager
 Limited buffer pool space available
 Frame-to-page mapping may need to be removed to reclaim a buffer slot
 Modified clock algorithm for page replacement
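For reference, the textbook clock (second-chance) replacement policy is sketched below. The slide says ESM uses a modified clock algorithm and does not describe the modification, so this shows only the unmodified baseline; `ClockBuffer` is a hypothetical name. In QuickStore, evicting a victim would additionally mean removing its frame-to-page mapping.

```python
class ClockBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []   # each slot is [page_id, referenced_bit]
        self.hand = 0     # index of the slot the clock hand points at

    def access(self, page_id):
        """Touch a page; return the id of the evicted page, or None."""
        for slot in self.slots:
            if slot[0] == page_id:          # hit: set the reference bit
                slot[1] = True
                return None
        if len(self.slots) < self.capacity: # free slot available
            self.slots.append([page_id, True])
            return None
        # Sweep: clear reference bits until an unreferenced slot is found.
        while self.slots[self.hand][1]:
            self.slots[self.hand][1] = False
            self.hand = (self.hand + 1) % self.capacity
        victim = self.slots[self.hand][0]
        self.slots[self.hand] = [page_id, True]
        self.hand = (self.hand + 1) % self.capacity
        return victim
```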
Optimisations
 Rewriting pointers is expensive
 Store pointers in disk pages
 Try to remap page to its previous frame
 Changing protection bits is expensive!
 Try to change many at a time
 Log optimisation with page diffs
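The slide does not say how the page diffs are computed. As a generic illustration of the idea, the sketch below compares a page image taken before the update with the image at commit and keeps only the byte runs that changed; `page_diff` and `apply_diff` are hypothetical helpers, not QuickStore functions.

```python
def page_diff(before, after):
    """Return a list of (offset, new_bytes) runs where the two page images differ."""
    assert len(before) == len(after)
    runs, start = [], None
    for i in range(len(after)):
        if before[i] != after[i]:
            if start is None:
                start = i                   # a changed run begins here
        elif start is not None:
            runs.append((start, after[start:i]))
            start = None
    if start is not None:                   # run extends to the end of the page
        runs.append((start, after[start:]))
    return runs


def apply_diff(before, runs):
    """Rebuild the updated page image from the old image and the logged runs."""
    page = bytearray(before)
    for offset, data in runs:
        page[offset:offset + len(data)] = data
    return bytes(page)
```

If only a few bytes of a page change, the log record holds those bytes plus their offsets rather than the whole page.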
Hardware versus software swizzling
 Page-level swizzling obscures object identity
 Pointers to deleted objects are still valid
 Using VM pointers allows a more compact OID representation
Comparison: the OO7 benchmark
 Parts database representative of "CAD/CAM/CASE application"
 Multiple possible database sizes
 Hierarchical structure, composite parts
 Benchmark operations specified
   traversals of parts tree
   queries which retrieve random parts
Cold times, small database
Hot times, small database
Cold times, medium database
QuickStore and E compared
 QuickStore is not necessarily better!
 E performs better with low locality
 Compact representation
   small database: 6.6MB versus 10.5MB
   medium database: 54.2MB versus 94.1MB
 Log optimisation reduces commit times