Wide-Area Service Composition:
Evaluation of Availability and Scalability
Bhaskaran Raman
SAHARA, EECS, U.C. Berkeley
[Figure: example composed services spanning providers: a video-on-demand server (Provider A) feeding a transcoder (Provider B) that delivers to a thin client, and an email repository with a text-to-audio service (Providers Q and R) delivering email as audio to a cellular phone]
Problem Statement and Goals

[Figure: a composed path from a video-on-demand server (Provider A) through a transcoder (Provider B) to a thin client, spanning multiple provider domains]

Problem Statement
– The path could stretch across
  – multiple service providers
  – multiple network domains
– Inter-domain Internet paths:
  – Poor availability [Labovitz’99]
  – Poor time-to-recovery [Labovitz’00]
– Take advantage of service replicas

Goals
– Performance: choose the set of service instances
– Availability: detect and handle failures quickly
– Scalability: Internet-scale operation
Related Work
– TACC: composition within cluster
– Web-server choice: SPAND, Harvest
– Routing around failures: Tapestry, RON
We address: wide-area network performance and failure issues for long-lived composed sessions
Is “quick” failure detection possible?
• What is a “failure” on an Internet path?
– Outage periods happen for varying durations
• Study outage periods using traces
– 12 pairs of hosts
• Berkeley, Stanford, UIUC, UNSW (Aus), TU-Berlin (Germany)
• Results could be skewed due to Internet2 backbone?
– Periodic UDP heart-beat, every 300 ms
– Study “gaps” between receive-times
• Results:
– Short outage (1.2-1.8 sec) ⇒ long outage (> 30 sec)
• Sometimes this is true over 50% of the time
– False-positives are rare:
• O(once an hour) at most
– Similar results with ping-based study using ping-servers
– Take-away: it is okay to react to short outage periods by switching the service-level path (a minimal heartbeat-probe sketch follows)
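The probe below is a minimal sketch of the kind of heartbeat stream used in this study: one side sends a UDP datagram every 300 ms, the other records gaps between receive times and flags gaps above ~1.8 sec as outages. The port number and threshold constants are illustrative assumptions, not the original measurement code.

# Minimal heartbeat probe sketch (illustrative; port and thresholds are assumptions).
import socket
import time

PORT = 5678          # hypothetical port for the heartbeat stream
PERIOD = 0.300       # send a heartbeat every 300 ms, as in the trace study
OUTAGE = 1.8         # treat receive-time gaps above ~1.8 s as outages

def send_heartbeats(dest_host: str) -> None:
    """Send a periodic UDP heartbeat to the destination host."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    seq = 0
    while True:
        sock.sendto(seq.to_bytes(8, "big"), (dest_host, PORT))
        seq += 1
        time.sleep(PERIOD)

def measure_gaps() -> None:
    """Receive heartbeats and log gaps between successive receive times."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", PORT))
    last = None
    while True:
        sock.recvfrom(64)
        now = time.time()
        if last is not None and now - last >= OUTAGE:
            print(f"outage: no heartbeat for {now - last:.2f} s")
        last = now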
UDP-based keep-alive stream

HB destination | HB source  | Total time (h:mm:ss) | Num. false positives | Num. failures
Berkeley       | UNSW       | 130:48:45            | 135                  | 55
UNSW           | Berkeley   | 130:51:45            | 9                    | 8
Berkeley       | TU-Berlin  | 130:49:46            | 27                   | 8
TU-Berlin      | Berkeley   | 130:50:11            | 174                  | 8
TU-Berlin      | UNSW       | 130:48:11            | 218                  | 7
UNSW           | TU-Berlin  | 130:46:38            | 24                   | 5
Berkeley       | Stanford   | 124:21:55            | 258                  | 7
Stanford       | Berkeley   | 124:21:19            | 2                    | 6
Stanford       | UIUC       | 89:53:17             | 4                    | 1
UIUC           | Stanford   | 76:39:10             | 74                   | 1
Berkeley       | UIUC       | 89:54:11             | 6                    | 5
UIUC           | Berkeley   | 76:39:40             | 3                    | 5
Acknowledgements: Mary Baker, Mema Roussopoulos, Jayant Mysore, Roberto Barnes, Venkatesh Pranesh,
Vijaykumar Krishnaswamy, Holger Karl, Yun-Shen Chang, Sebastien Ardon, Binh Thai
Architecture

[Figure: layered architecture. Application plane: composed services running from source to destination across the Internet. Logical platform: peering relations forming an overlay network, with peering clusters exchanging performance information. Hardware platform: service clusters.]

Service cluster: a compute cluster capable of running services

Functionalities at the Cluster-Manager
– Finding overlay entry/exit
– Location of service replicas
– Service-level path creation, maintenance, and recovery (a path-selection sketch follows this slide)
– Link-state propagation
– At-least-once UDP, performance measurement, liveness detection
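As one concrete illustration of path creation, a cluster-manager holding the overlay's link-state view could pick a service-level path by running a shortest-path computation over the peering graph and chaining overlay segments through one replica of each required service. The graph representation, additive cost metric, and replica choice below are assumptions made for illustration, not the system's actual algorithm.

# Sketch of service-level path selection from link-state information
# (assumed representation: links = {node: [(neighbor, cost), ...]}).
import heapq

def shortest_path(links, src, dst):
    """Dijkstra over the overlay link-state graph; returns the node sequence."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, cost in links.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])   # assumes dst is reachable from src
    return list(reversed(path))

def compose_path(links, replicas, source, services, client):
    """Chain overlay segments through one replica of each required service.
    For simplicity this takes the first listed replica; a real cluster-manager
    would pick replicas using measured performance and liveness information."""
    hops = [source] + [replicas[s][0] for s in services] + [client]
    full = []
    for a, b in zip(hops, hops[1:]):
        seg = shortest_path(links, a, b)
        full += seg if not full else seg[1:]
    return full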
Evaluation

1. What is the effect of the recovery mechanism on the application?
– Text-to-speech application
– [Figure: the text source connects over leg-1 to the text-to-audio service, which connects over leg-2 to the end-client]
– Two possible places of failure: leg-1 and leg-2
– Request-response protocol
– Data (text, or RTP audio)
– Keep-alive soft-state refresh
– Application soft-state (for restart on failure)
– 20-node overlay network
– One service instance for each service
– Deterministic failure for 10 sec during the session
– Metric: gap between arrival of successive audio packets at the client (see the sketch after this list)
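A minimal sketch of how this gap metric could be computed at the client, assuming the receive loop records an arrival timestamp per audio packet; the 100 ms reporting threshold matches the CDF shown later. The function name and output fields are illustrative.

# Client-side gap metric sketch (assumes arrival_times holds per-packet
# receive timestamps in seconds, collected by the client's receive loop).
def gap_stats(arrival_times, threshold=0.100):
    """Return inter-arrival gaps and the subset above the reporting threshold."""
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    large = sorted(g for g in gaps if g > threshold)
    return {
        "num_gaps": len(gaps),
        "num_above_threshold": len(large),
        "max_gap": max(gaps, default=0.0),
        "gaps_above_threshold": large,   # sorted: can be plotted as an empirical CDF
    }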
2. What is the scaling bottleneck?
– Parameter: number of client sessions across peering clusters
  • A measure of the instantaneous load when the failure occurs
– 5,000 client sessions in a 20-node overlay network
– Deterministic failure of 12 different links (12 data-points in the graph)
– Metric: average time-to-recovery
Recovery of Application Session: CDF of gaps > 100 ms (question 1)

[Graph annotations:]
– Recovery time, leg-2 failure: 2,963 ms
– Recovery time, leg-1 failure: 822 ms (quicker than leg-2 due to the buffer at the text-to-audio service)
– Recovery time without the recovery algorithm: 10,000 ms
– Jump at 350-400 ms: due to synchronous text-to-audio processing (implementation artefact)
Average Time-to-Recovery vs. Instantaneous Load (question 2)

[Graph: average time-to-recovery vs. instantaneous load, end-to-end recovery algorithm]
• Two services in each path
• Two replicas per service
• Each data-point is a separate run
• High variance due to varying path length
• At a load of 1,480 paths on the failed link, the average path recovery time is 614 ms
Results: Discussion

• Recovery after failure, leg-2 (question 1): 2,963 ms = 1,800 + O(700) + O(450)
  – 1,800 ms: timeout to conclude failure
  – 700 ms: signaling to set up the alternate path
  – 450 ms: recovery of application soft-state (re-processing the current sentence)
• Without the recovery algorithm: recovery takes as long as the failure duration
• O(3 sec) recovery
  – Can be completely masked with buffering
  – Interactive apps: still much better than without recovery
• Quick recovery is possible since failure information does not have to propagate across the network
• The 12th data point (instantaneous load of 1,480) stresses the emulator's limits (question 2)
  – 1,480 translates to about 700 simultaneous paths per cluster-manager
  – In comparison, our text-to-speech implementation can support O(15) clients per machine
• Other scaling limits? Link-state floods? Graph computation?
Summary
• Service Composition: flexible service creation
• We address performance, availability, scalability
• Initial analysis: failure detection -- meaningful to time out in O(1.2-1.8 sec)
• Design: Overlay network of service clusters
• Evaluation: results so far
– Good recovery time for real-time applications:
O(3 sec)
– Good scalability -- minimal additional provisioning for cluster-managers
• Ongoing work:
– Overlay topology issues: how many nodes, peering
– Stability issues
Feedback, Questions?
Presentation made using VMware
Emulation Testbed

[Figure: four emulated nodes, each running the application (App) over the composition library (Lib); the emulator applies a per-link rule to traffic between node pairs (e.g., rules for links 1-2, 1-3, 3-4, 4-3)]

Operational limits of the emulator: 20,000 pkts/sec, for up to 500-byte packets, on a 1.5 GHz Pentium-4
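A rough sketch of the per-link rules suggested by the figure, assuming each directed node pair has a rule carrying a latency, a loss probability, and a down flag that the emulator consults before forwarding; the testbed's actual rule format is not shown in the slides, so the class and field names below are hypothetical.

# Emulator rule sketch (assumed rule format: one rule per directed node pair).
import random

class LinkRule:
    def __init__(self, delay_ms, loss_prob=0.0, down=False):
        self.delay_ms = delay_ms      # one-way latency applied by the emulator
        self.loss_prob = loss_prob    # probability of dropping a packet
        self.down = down              # True while the link is deterministically failed

class Emulator:
    def __init__(self, rules):
        self.rules = rules            # {(src_node, dst_node): LinkRule}

    def forward(self, src, dst, packet, deliver):
        """Apply the rule for src->dst, then hand the packet to `deliver`."""
        rule = self.rules[(src, dst)]
        if rule.down or random.random() < rule.loss_prob:
            return                    # drop: failed or lossy link
        deliver(dst, packet, delay_ms=rule.delay_ms)

# Example: fail link 3->4 deterministically, as in the recovery experiments.
rules = {(1, 2): LinkRule(30), (1, 3): LinkRule(50),
         (3, 4): LinkRule(40, down=True), (4, 3): LinkRule(40)}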