Composable Consistency for Wide Area Replication
Sai Susarla
Advisor: Prof. John Carter
PhD Dissertation Defense
Overview
Goal: middleware support for wide area caching in diverse distributed applications
Key Hurdle: flexible consistency management
Our Solution: a novel consistency interface/model, Composable Consistency
Benefit: supports a broader set of sharing needs than existing models
• Examples: file systems, databases, directories, collaborative apps – a wider variety than any existing consistency model can support
Demo Platform: novel P2P middleware data store - Swarm
Caching: Overview
The Idea: cache frequently used items locally for quick retrieval
Benefits
• Within cluster: load-balancing, scalability
• Across WAN: lower latency, improved throughput & availability
Applications
• Data stored in one place, accessed from multiple locations
• Examples:
  » File system: personal files, calendars, log files, software, …
  » Database: online shopping, inventory, auctions, …
  » Directory: DNS, LDAP, Active Directory, KaZaa, …
  » Collaboration: chat, multi-player games, meetings, …
Centralized Service
[Figure: clients and users access a primary server cluster across the Internet]
Proxy-based Caching
[Figure: caching proxy server clusters near clients run a consistency protocol with the primary server cluster across the Internet]
Caching: The Challenge
Applications have diverse consistency needs
Application                                | Sharing Characteristics                      | Consistency needs
Static web content, media, s/w updates     | Read-mostly                                  | Stale data, manual reload ok
Chat, whiteboard                           | Concurrent appends                           | Real-time sync, causal msg order
Auctions, ticket sales, Financial DB       | Write-sharing, conflicts, varying contention | Serializability, latest data, atomicity (ACID)
…                                          | …                                            | …
Caching: The Problem
Consistency requirements are diverse
Caching is difficult over WANs:
• Variable delays, node failures, network partitions, admin domains, …
Thus, most WAN applications either:
• Roll their own caching solution, or
• Do not cache and live with the latency
Can we do better?
Thesis
"A consistency management system that provides
a small set of customizable consistency mechanisms
can efficiently satisfy the data sharing needs of
a wide variety of distributed applications."
Outline
Further Motivation
Application study → a new taxonomy to classify application sharing needs
Composable Consistency (CC) model
• Novel interface to express consistency semantics for each access
• Small option set can express more diverse semantics
Evaluation
Existing Models are Inadequate
Provide a few packaged consistency semantics for specific needs:
• e.g., optimistic/eventual, close-to-open, strong
Or, lack enough flexibility to support diverse needs:
• TACT (cannot express weak consistency or session semantics)
• Bayou (cannot support strong consistency)
Or, leave the consistency management burden on applications:
• e.g., Oceanstore, Globe
Existing Middleware is Inadequate
Existing middleware supports only specific sharing needs:
• Read-only data: PAST, BitTorrent
• Rare write-sharing: file systems (NFS, Coda, Ficus, …)
• Master-slave (read-only) replication: storage vendors, mySQL
• Scheduled (nightly) replication: storage and DB services
• Read-write replication in a cluster: commercial DB vendors, Petal
Application Survey
40+ applications with diverse consistency needs
Application                                | Sharing Characteristics                      | Consistency needs
Static web content, media, s/w updates     | Read-mostly                                  | Stale data, manual reload ok
Stock quotes                               | Read-only                                    | Limit max. staleness to T secs
Chat, whiteboard                           | Concurrent appends                           | Real-time sync, causal msg order
Multiplayer game                           | Heavy write-sharing                          | Real-time sync, totally order play moves
Auctions, ticket sales, Financial DB       | Write-sharing, conflicts, varying contention | Serializability, latest data, atomicity (ACID)
Personal file access                       | Rare write-sharing                           | Eventual consistency
Mobile file access, collaborative sharing  | Sequential write-sharing                     | Latest data, session semantics
Directory, calendars, groupware            | Write-sharing, mergeable writes              | Tight sync within campus, relaxed sync across campuses
Survey Results
Found common issues, overlapping choices:
• Are parallel reads and writes OK?
• How often should replicas synchronize?
• Does update order matter?
• What if some copies are inaccessible?
• …
Can we exploit this commonality?
Composable Consistency:
Novel interface to express consistency semantics
Concurrency control: access mode = concurrent | exclusive
Replica synchronization: sync frequency = manual (push, pull) | T-seconds stale | N missed writes; strength = hard | soft
Failure handling: inaccessible copy = ignore | fail access
Update ordering: none | total | serial
Causality: yes | no
Atomicity: yes | no
View isolation: accept updates = session | immediately
Update visibility: reveal updates = on close() | immediately
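To make the option space concrete, the following is a minimal sketch, in C, of how a CC option vector might be encoded. The struct, field, and constant names are hypothetical illustrations, not Swarm's actual interface.

/* Hypothetical encoding of the CC option vector (illustration only). */
enum cc_access_mode { CC_CONCURRENT, CC_EXCLUSIVE };
enum cc_strength    { CC_HARD, CC_SOFT };
enum cc_ordering    { CC_ORDER_NONE, CC_ORDER_TOTAL, CC_ORDER_SERIAL };
enum cc_on_failure  { CC_IGNORE_COPY, CC_FAIL_ACCESS };
enum cc_isolation   { CC_ACCEPT_AT_OPEN, CC_ACCEPT_IMMEDIATELY };
enum cc_visibility  { CC_REVEAL_ON_CLOSE, CC_REVEAL_IMMEDIATELY };

struct cc_options {
    enum cc_access_mode mode;        /* concurrency control: concurrent or exclusive access */
    int staleness_secs;              /* replica sync: max seconds stale (-1 = manual push/pull) */
    int missed_writes;               /* replica sync: max writes a replica may lag behind */
    enum cc_strength strength;       /* hard (block/fail to honor bounds) vs. soft (best effort) */
    enum cc_ordering ordering;       /* update ordering: none, total, or serial */
    int causal;                      /* preserve causal order of updates? */
    int atomic;                      /* apply a session's updates atomically? */
    enum cc_on_failure on_failure;   /* failure handling for inaccessible copies */
    enum cc_isolation isolation;     /* view isolation: when remote updates become visible */
    enum cc_visibility visibility;   /* update visibility: when local updates are revealed */
};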
Example: Close-to-open (AFS)
• Allow parallel reads and writes
• Latest data guaranteed at open() (0 seconds stale)
• Fail access when partitioned
• Accept remote updates only at open()
• Reveal local updates to others only on close()
[Figure: the CC option vector with the choices for close-to-open semantics highlighted]
Example: Eventual Consistency (Bayou)
• Allow parallel reads and writes
• Sync copies at most once every 10 minutes
• Syncing should not block or fail operations
• Accept remote updates as they arrive
• Reveal local updates to others as they happen
[Figure: the CC option vector with the choices for eventual consistency highlighted]
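Assuming the hypothetical struct cc_options sketched under the taxonomy, the two examples above could be written roughly as the following initializers (values and constant names illustrative only, not Swarm's actual constants):

/* Close-to-open (AFS-like) session options -- a sketch. */
struct cc_options close_to_open = {
    .mode           = CC_CONCURRENT,       /* parallel reads and writes allowed */
    .staleness_secs = 0,                   /* latest data guaranteed at open() */
    .strength       = CC_HARD,             /* fail rather than serve stale data */
    .on_failure     = CC_FAIL_ACCESS,      /* fail access when partitioned */
    .isolation      = CC_ACCEPT_AT_OPEN,   /* accept remote updates only at open() */
    .visibility     = CC_REVEAL_ON_CLOSE,  /* reveal local updates only on close() */
};

/* Eventual consistency (Bayou-like) session options -- a sketch. */
struct cc_options eventual = {
    .mode           = CC_CONCURRENT,
    .staleness_secs = 600,                    /* sync copies at most every 10 minutes */
    .strength       = CC_SOFT,                /* syncing never blocks or fails operations */
    .on_failure     = CC_IGNORE_COPY,         /* proceed despite unreachable copies */
    .isolation      = CC_ACCEPT_IMMEDIATELY,  /* accept remote updates as they arrive */
    .visibility     = CC_REVEAL_IMMEDIATELY,  /* reveal local updates as they happen */
};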
Handling Conflicting Semantics
What if two sessions have different semantics?
• If they conflict, block one session until the conflict goes away (serialize)
• Otherwise, allow them in parallel
Simple rules for checking conflicts (a conflict matrix, sketched below). Examples:
• Exclusive write vs. exclusive read vs. eventual write: serialize
• Write-immediate vs. session-grain isolation: serialize
• Write-immediate vs. eventual read: no conflict
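A minimal sketch of such a pairwise conflict check, reusing the hypothetical struct cc_options from earlier; the real conflict matrix covers more option combinations than shown here.

/* Return nonzero if two sessions' semantics conflict and must be serialized. */
int cc_conflict(const struct cc_options *a, const struct cc_options *b)
{
    /* Any exclusive-mode session conflicts with every other session. */
    if (a->mode == CC_EXCLUSIVE || b->mode == CC_EXCLUSIVE)
        return 1;
    /* Immediate update visibility conflicts with session-grain isolation. */
    if ((a->visibility == CC_REVEAL_IMMEDIATELY && b->isolation == CC_ACCEPT_AT_OPEN) ||
        (b->visibility == CC_REVEAL_IMMEDIATELY && a->isolation == CC_ACCEPT_AT_OPEN))
        return 1;
    return 0;    /* e.g., write-immediate vs. eventual read: no conflict */
}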
Using Composable Consistency
Perform data access within a session, e.g.:
  session_id = open(object, CC_option_vector);
  read(session_id, buf);
  write(session_id, buf);    /* or: update(session_id, incr_counter(value)); */
  close(session_id);
Specify consistency semantics per session at open() via the CC option vector:
• Concurrency control, replica synchronization, failure handling, view isolation, and update visibility
The system enforces the semantics by mediating each access
Composable Consistency Benefits
Powerful: Small option set can express diverse semantics
Customizable: allows different semantics for each access
Effective: amenable to efficient WAN implementation
Benefit to middleware:
• Can provide read-write caching to a broader set of apps
Benefit to an application:
• Can customize consistency to diverse and varying sharing needs
• Can simultaneously enforce different semantics on the same data for different users (see the sketch below)
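For instance, here is a hedged sketch of two sessions on the same object with different semantics, using the session interface of the previous slide and the hypothetical option vectors defined earlier (the object name and variables are illustrative):

/* A browsing user tolerates relaxed, bounded-staleness reads ... */
session_id = open("auction/item42", &eventual);
read(session_id, buf);
close(session_id);

/* ... while a bidder updating the same object demands close-to-open semantics. */
session_id = open("auction/item42", &close_to_open);
read(session_id, buf);
write(session_id, buf);
close(session_id);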
Evaluation
Swarm: A Middleware Providing CC
Swarm:
• Shared file interface with CC options
• Location-transparent page-grained file access
• Aggressive P2P caching
• Dynamic cycle-free replica hierarchy per file
Prototype implements CC (except causality & atomicity)
• Per-file, per-replica, and per-session consistency
Network economy (exploit nearby replicas)
Contention-aware replication (RPC vs. caching)
Multi-level leases for failure resilience
Client-server BerkeleyDB Application
[Figure: app users on remote LANs reach the primary app server (app logic, BerkeleyDB, kernel FS) across the Internet]
BerkeleyDB Application using Swarm
[Figure: the primary app server links its app logic against an RDB wrapper; a co-located Swarm server with an RDB plugin holds the DB on the kernel FS and serves app users on remote LANs across the Internet]
Caching Proxy App Server using Swarm
[Figure: a caching proxy app server on the remote LAN runs its own app logic, RDB wrapper, and Swarm server with an RDB plugin; the proxy's Swarm server keeps its DB copy consistent with the primary app server's Swarm server across the Internet]
Swarm-based Applications
SwarmDB: Transparent BerkeleyDB database replication across WANs
SwarmFS: wide area P2P read-write file system
SwarmProxy: Caching WAN proxies for an auction service with strong consistency
SwarmChat: Efficient message/event dissemination
No single model can support the sharing needs of all these applications
SwarmDB: Replicated BerkeleyDB
Replication support built as wrapper library
Uses unmodified BerkeleyDB binary
Evaluated with five consistency flavors:
• Lock-based updates, eventual reads
• Master-slave writes, eventual reads
• Close-to-open reads, writes
• Staleness-bounded reads, writes
• Eventual reads, writes
Compared against the BerkeleyDB-provided RPC version
Order-of-magnitude throughput gains over RPC by relaxing consistency
SwarmDB Evaluation
BerkeleyDB B-tree index replicated across N nodes
Nodes linked via 1 Mbps links to a common router, 40 ms RTT to each other
Full-speed workload:
• 30% writes: inserts, deletes, updates
• 70% reads: lookups, cursor scans
Varied # replicas from 1 to 48
SwarmDB Write Throughput/replica
[Graph: per-replica write throughput vs. number of replicas for a local SwarmDB server, optimistic, 20 msec stale, 10 msec stale, master-slave writes with eventual reads, close-to-open, locking writes with eventual reads, and RPC over WAN]
SwarmDB Query Throughput/replica
[Graph: per-replica query throughput vs. number of replicas for a local SwarmDB server, optimistic, 10 msec stale, close-to-open, and RPC over WAN]
SwarmDB Results
Customizing consistency can improve WAN caching performance dramatically
An app can enforce diverse semantics simply by modifying CC options
Updates & queries with different semantics are possible
SwarmFS Distributed File System
Sample SwarmFS path:
• /swarmfs/swid:0x1234.2/home/sai/thesis.pdf
Performance summary:
• Achieves >80% of local FS performance on the Andrew Benchmark
• More network-efficient than Coda for wide area access
• Correctly supports fine-grain collaboration across WANs
• Correctly supports file locking for RCS repository sharing
SwarmFS: Distributed Development
Replica Topology
SwarmFS vs. Coda Roaming File Access
Network economy: Coda-s always gets files from the distant home server U1, while SwarmFS gets files from the nearest copy.
[Graph: compile latency (seconds) from a cold cache for Coda-s vs. SwarmFS at U1 and at I1 (24 ms), C1 (50 ms), T1 (160 ms), and F1 (130 ms RTT to U1)]
SwarmFS vs. Coda Roaming File Access
P2P protocol more efficient: Coda-s writes files through to U1 for close-to-open semantics, while Swarm's P2P pull-based protocol avoids this; hence, SwarmFS performs better for temporary files.
[Graph: compile latency (seconds) from a cold cache for Coda-s vs. SwarmFS at U1 and at I1 (24 ms), C1 (50 ms), T1 (160 ms), and F1 (130 ms); LAN#-node#, RTT to home (U1)]
SwarmFS vs. Coda Roaming File Access
Eventual consistency inadequate: Coda-w behaves incorrectly. Its trickle reintegration pushed huge object files to U1, clogging the network link; `make' skipped files, the linker found corrupt object files, and compiles ended in errors.
[Graph: compile latency (seconds) from a cold cache for Coda-s, Coda-w, and SwarmFS at U1 and at I1 (24 ms), C1 (50 ms), T1 (160 ms), and F1 (130 ms)]
Evaluation Summary
SwarmDB: gains of customizable consistency
SwarmFS: network economy under write-sharing
SwarmProxy: strong consistency over WANs under varying contention
SwarmChat: update dissemination in real time
By employing CC, the Swarm middleware data store can support diverse app needs effectively
Related Work
Flexible consistency models/interfaces
• Munin, WebFS, Fluid Replication, TACT
Wide area caching solutions/middleware
• File systems and data stores: AFS, Coda, Ficus, Pangaea, Bayou, Thor, …
• Peer-to-peer systems: Napster, PAST, Farsite, Freenet, Oceanstore, BitTorrent, …
Future Work
Security and authentication
Fault-tolerance via first-class replication
Thesis Contributions
Survey of sharing needs of numerous applications
New taxonomy to classify application sharing needs
Composable consistency model based on taxonomy
Demonstrated that the CC model is practical and supports diverse applications across WANs effectively
Conclusion
Can a storage service provide effective WAN caching support for diverse distributed applications? YES
Key enabler: a novel, flexible consistency interface called Composable Consistency
Allows an application to customize consistency to diverse and varying sharing needs
Allows middleware to serve a broader set of apps effectively
SwarmDB Control Flow
Composing Master-slave
Master-slave replication:
• Serialize updates
  » Concurrent-mode writes (WR)
  » Serial update ordering (apply updates at a central master)
• Eventual consistency for queries
  » Options mentioned earlier (see the sketch below)
Use: mySQL read-only DB replication across WANs
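Expressed with the hypothetical struct cc_options from the taxonomy sketch, this composition might look roughly as follows (illustrative only, not Swarm's actual constants):

/* Writes: concurrent (WR) mode, but applied in serial order at a central master. */
struct cc_options ms_write = {
    .mode     = CC_CONCURRENT,
    .ordering = CC_ORDER_SERIAL,
    .strength = CC_HARD,
};

/* Queries: eventual consistency at the slave replicas. */
struct cc_options ms_read = {
    .mode           = CC_CONCURRENT,
    .staleness_secs = 600,                   /* staleness bound is illustrative */
    .strength       = CC_SOFT,
    .on_failure     = CC_IGNORE_COPY,
    .isolation      = CC_ACCEPT_IMMEDIATELY,
    .visibility     = CC_REVEAL_IMMEDIATELY,
};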
Clustered BerkeleyDB
BerkeleyDB Proxy using Swarm
A Swarm-based Chat Room
[Figure: update propagation path among the chat room's replicas (P and clients 1-4)]
callback(handle, newdata)            /* invoked whenever new chat messages arrive */
{
    display(newdata);
}

main()
{
    handle = sw_open(kid, "a+");     /* open the shared chat transcript in append mode */
    sw_snoop(handle, callback);      /* register for notification of remote updates */
    while (!done) {
        read(&newdata);              /* read the local user's input */
        display(newdata);
        sw_write(handle, newdata);   /* append it to the shared transcript */
    }
    sw_close(handle);
}
Sample Chat client code
Chat transcript: WR mode, 0 second soft staleness, immediate visibility, no isolation