12. Experimental Evaluation
18-749: Fault-Tolerant Distributed Systems
Tudor Dumitraş &
Prof. Priya Narasimhan
Carnegie Mellon University
Recommended readings and these lecture slides are available
on CMU’s BlackBoard
Electrical & Computer Engineering
What Are We Going To Do Today?
• Overview of experimental techniques
• Case study: “Fault-Tolerant Middleware and the Magical 1%”
• Experimental requirements for the project
Overview of Experimental Techniques
• Basics
  – Probability distributions, density functions
  – Outlier detection: the 3σ test
• Visual representation of data
  – Boxplots
  – 3D and contour plots
  – Multivariate plots
• Do’s and don’ts of experimental science
Experimental Research
“God has chosen that which is the most simple in hypotheses and the most rich in phenomena [...] But when a rule is extremely complex, that which conforms to it passes for random.”
— Gottfried Wilhelm Leibniz, Discours de Métaphysique, 1686
Statistical Distributions
• If a metric is measured repeatedly, we can estimate its probability density function (PDF)
  – PDF(x) is the probability density of the metric taking the value x
  – $\int_a^b \mathrm{PDF}(x)\,dx = \Pr[a \le \text{metric} \le b]$
  – Matlab function: ksdensity
• Common statistics (see the sketch below)
  – Mean = sum of values / number of measurements (mean)
  – Median = half the measured values are below this point (median)
  – Mode = the value that appears most often in the dataset
  – Standard deviation (σ) = how widely spread the data points are (std)
      $\sigma = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}$
    where $X_i$ is a measurement and $\bar{X}$ is the mean
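A minimal Matlab sketch of these statistics, assuming the measured latencies (in µs) are stored one per line in a hypothetical file latencies.txt:

  % Minimal sketch: basic statistics of a latency sample (one measurement per line)
  lat = load('latencies.txt');     % hypothetical file name; returns a column vector

  m  = mean(lat);                  % arithmetic mean
  md = median(lat);                % 50th percentile
  mo = mode(lat);                  % most frequent value
  s  = std(lat);                   % standard deviation (uses the 1/(n-1) estimator)

  % Estimate and plot the probability density function
  [f, x] = ksdensity(lat);         % kernel-smoothed PDF estimate
  plot(x, f);
  xlabel('Latency [\mus]'); ylabel('Estimated PDF');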
Statistical Tools
• Percentiles
  – “The Nth percentile” is a value X such that N% of the measured samples are less than X
  – The median is the 50th percentile
  – Matlab function: prctile
• Outlier detection: the 3σ test (see the sketch below)
  – Any value that is more than 3 standard deviations away from the mean is an outlier
  – For example, for latency, an outlier is any sample with Latency_outlier > mean(Latency) + 3σ
  – In Matlab: outliers = a(a > mean(a) + 3*std(a))
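A minimal Matlab sketch of the percentile and 3σ computations, reusing the hypothetical latency vector lat from the previous sketch:

  % Minimal sketch: percentiles and the 3-sigma outlier test on a latency vector 'lat'
  p99 = prctile(lat, 99);                   % 99th percentile
  p   = prctile(lat, [1 5 50 95 99]);       % several percentiles at once

  cutoff   = mean(lat) + 3*std(lat);        % 3-sigma threshold
  outliers = lat(lat > cutoff);             % samples failing the test
  fprintf('%d outliers (%.2f%% of samples)\n', ...
          numel(outliers), 100*numel(outliers)/numel(lat));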
Basic Plots
• Line plot (plot)
  – Y-axis is a function of the X-axis values
  – Can use error bars to show the standard deviation (see the sketch below)
  – Can also do an area plot to emphasize overhead or the difference between similar metrics
• Scatter plot (plot, scatter)
  – Determine a relationship between two variables
  – Reveal clustering of data
• Bar graphs (bar, bar3)
  – Compare discrete values
• Pie charts (pie, pie3)
  – Break a metric down into its constituent components
[Example figures with axes: Rounds, Client-perceived throughput [bytes/s], Nodes reached, Latency [µs]; series for 0% and 50% data upsets.]
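As an illustration of a line plot with error bars, a minimal Matlab sketch; the numbers below are made up purely for illustration, not measured data:

  % Minimal sketch: mean latency vs. number of clients with error bars (made-up values)
  clients  = [1 4 7 10];                      % x-axis values
  mean_lat = [1200 1500 1900 2600];           % hypothetical mean latency per configuration [us]
  std_lat  = [100 180 250 400];               % hypothetical standard deviation [us]

  errorbar(clients, mean_lat, std_lat, '-o'); % line plot with +/- one standard deviation
  xlabel('Clients'); ylabel('Latency [\mus]');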
Boxplots
• A “box and whisker” plot describes a probability distribution
  – The box represents the inter-quartile range (the difference between the 25th and 75th percentiles of the dataset)
  – The whiskers indicate the maximum and minimum values
  – The median is also shown
  – Matlab function: boxplot (see the sketch below)
• In 1970, the US Congress instituted a random selection process for the military draft
  – All 366 possible birth dates were placed in a rotating drum and selected one by one
  – The order in which the dates were drawn defined the priority for drafting
  – The boxplots show that men born later in the year were more likely to be drafted
  – From http://lib.stat.cmu.edu/DASL/Stories/DraftLottery.html
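A minimal boxplot sketch, assuming hypothetical latency vectors lat1, lat4 and lat10 measured with 1, 4 and 10 clients:

  % Minimal sketch: compare latency distributions across configurations with boxplots
  data   = [lat1(:); lat4(:); lat10(:)];
  groups = [repmat({'1 client'},   numel(lat1),  1);
            repmat({'4 clients'},  numel(lat4),  1);
            repmat({'10 clients'}, numel(lat10), 1)];

  boxplot(data, groups);                      % one box per group
  ylabel('Latency [\mus]');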
Impact of Two Variables
• 3D plots
  – The Z axis is a function of the X and Y values
  – Surface plots: mesh, surf
  – Scatter plots: plot3, scatter3
  – Volume: display the convex hull using convhulln and trisurf
• Contour plots (see the sketch below)
  – Represent a function of 2 variables (the X and Y axes)
  – Suggest the values of the function through color and annotations
  – Display the isolines (variable combinations that yield the same value) of the function
[Example figure: contour plot of a function of p and p_upset, with annotated isolines.]
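A minimal Matlab sketch of a surface and a contour plot; the function Z below is made up purely for illustration:

  % Minimal sketch: surface and contour plots of a function of two variables
  [X, Y] = meshgrid(0:0.05:1, 0:0.05:1);      % grid over the two parameters
  Z = 60 + 50*X.*Y;                           % made-up function of X and Y

  subplot(1,2,1); surf(X, Y, Z);              % 3D surface
  xlabel('p'); ylabel('p_{upset}'); zlabel('Z');

  subplot(1,2,2); [C, h] = contour(X, Y, Z);  % isolines of the same function
  clabel(C, h);                               % annotate the isolines with their values
  xlabel('p'); ylabel('p_{upset}');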
Impact of Many Variables
• Multi-variate plot
Do …
• Make Results Comparable
  – Use the same hardware for all the experiments
  – Use the same versions of your software
  – Avoid interference from other programs, or make sure you always get the same interference
  – Vary one parameter at a time
• Make Results Reproducible
  – Record and report all the parameters of your experimental setup
  – Archive and publish the raw data
• Be Rigorous
  – Minimize the impact of your monitoring infrastructure
  – Report the number of runs
  – Report mean values and standard deviations
  – Examine the statistical distributions (modes, long tails, etc.)
Don’t …
• Forget to label the axes of your figures
• Use different axis limits when comparing results (see the sketch below)
• Plot mean values without looking at the error margin
[Example figures: latency vs. number of clients plotted with mismatched axis limits, and mean latency plotted without error bars.]
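A small Matlab sketch of how to avoid the first two pitfalls: label every axis and force the same limits on plots that will be compared (clients, lat_config_a and lat_config_b are hypothetical vectors):

  % Minimal sketch: labeled axes and identical limits on two plots being compared
  subplot(1,2,1); plot(clients, lat_config_a, '-o');   % hypothetical data, configuration A
  xlabel('Clients'); ylabel('Latency [\mus]');
  axis([0 20 0 14000]);                                % same limits on both panels

  subplot(1,2,2); plot(clients, lat_config_b, '-o');   % hypothetical data, configuration B
  xlabel('Clients'); ylabel('Latency [\mus]');
  axis([0 20 0 14000]);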
FT Middleware and the Magical 1%
• Unpredictability of FT middleware
• Unpredictability limited to 1% of remote invocations

T. Dumitraş and P. Narasimhan. Fault-Tolerant Middleware and the Magical 1%. In ACM/IFIP/USENIX Conference on Middleware, Grenoble, France, Nov.–Dec. 2005.
http://www.ece.cmu.edu/~tdumitra/public_documents/dumitras05magical.pdf
Predictability in FT Middleware Systems?
[Diagram: clients and server replicas, each running an application over CORBA and a Replicator, connected through group communication on top of the host OS and the network.]
• Faults are inherently unpredictable
• What about the fault-free case?
System Configuration for Predictability
• Can we configure an FT CORBA system for predictable latency?
• Software configuration
  – Operating system: RedHat Linux w/ TimeSys 3.1 kernel
  – Group communication: Spread v. 1.3.1
  – Replication: MEAD v. 1.1
  – ORB: TAO Real-Time ORB v. 1.4
  – Micro-benchmark: 10,000 remote invocations per client
• Hardware configuration
  – 25 hosts on the Emulab testbed
  – Pentium III at 850 MHz
  – 100 Mb/s LAN
Experimental Methodology
• Parameters varied:
  – Replication style: active, warm passive
  – Replication degree: 1, 2, 3 replicas
  – Number of clients: 1, 4, 7, 10, 13, 16, 19, 22
  – Request arrival rates: 0, 0.5, 2, 8, 32 ms client pause
  – Sizes of reply messages: 16, 256, 4096, 65536 bytes
• Tested all 960 combinations, collected 9.1 Gb of data (the sketch below enumerates the combinations)
  – Trace available at: www.ece.cmu.edu/~tdumitra/MEAD_trace
• Statistical analysis of end-to-end latency:
  – Means, medians, standard deviations
  – Maximum and minimum values
  – 1st, 5th, 95th, 99th percentiles
  – Numbers and sizes of the outliers
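As a sanity check on the size of the parameter space, a minimal Matlab sketch that enumerates the combinations listed above:

  % Minimal sketch: count the experimental parameter combinations from the slide
  styles  = {'active', 'warm_passive'};      % replication style
  degrees = [1 2 3];                         % replication degree
  clients = [1 4 7 10 13 16 19 22];          % number of clients
  pauses  = [0 0.5 2 8 32];                  % client pause [ms]
  sizes   = [16 256 4096 65536];             % reply size [bytes]

  n = numel(styles) * numel(degrees) * numel(clients) * numel(pauses) * numel(sizes);
  fprintf('Total configurations: %d\n', n);  % prints 960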
Example of Unpredictability
• The maximum latency can be several orders of magnitude larger than the average
• The distribution is skewed to the right and has a long tail
• The long tail occurs on only one side because the latency cannot be arbitrarily low
  – MEAD latency is lower-bounded by the CORBA and group-communication latency
Systematic Unpredictability
• Average values increase linearly with the number of clients
• Maximum values are unpredictable
Counting the Outliers
• An outlier is a measurement that fails the 3σ test
• In most cases, less than 1% of the measured latencies are outliers
• Outliers originate in various modules of the system:
  – The ORB
  – The group communication
  – The application
The “Magical” 1%
• The “haircut” effect of removing the 1% highest remote-invocation latencies (see the sketch below)
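A minimal sketch of the haircut in Matlab, assuming the measured latencies for one configuration are in a vector lat (in µs):

  % Minimal sketch: drop the 1% highest latencies and compare the extremes
  cutoff  = prctile(lat, 99);                 % 99th percentile of the measured latencies
  trimmed = lat(lat <= cutoff);               % remaining 99% of the samples

  fprintf('max before:  %.0f us, max after:  %.0f us\n', max(lat),  max(trimmed));
  fprintf('mean before: %.1f us, mean after: %.1f us\n', mean(lat), mean(trimmed));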
Observable Trends
[3D plots: maximum latency and 99th-percentile latency (log scale) as functions of request size [bytes] and request rate [req/s].]
• The 99th percentile helps us identify trends in the data
  – E.g., latency increases with the request rate and the request size
Interpretation
• Predictable maximum latencies are hard to achieve
  – We tried to achieve predictability by selecting a good FT CORBA configuration
  – Even in the fault-free case, end-to-end latencies have skewed distributions for almost all 960 parameter combinations
  – Maximums are several orders of magnitude higher than averages
  – Unpredictability cannot be isolated to a single component
• Magical 1%: achieving predictability through statistical approaches
  – We remove the 1% highest measured latencies
  – The remaining samples have more deterministic properties
    • The 99th percentile helps us identify trends in the data
  – This allows us to extract tunable, predictable behavior out of fairly complex, dependable systems
Experimental Evaluation of 18-749 Projects
• Requirements for experimental evaluation
  – List of client invocations
  – Probes
  – Graphs
• Tips
• Digging deeper
Requirements for Experimental Evaluation
• Things to hand in:
  – List of client invocations – the server methods you’re going to exercise
  – Raw data from the 7 probes in your application
  – Graphs of end-to-end latency
  – Interpretation of the results
• Constraints
  – All clients must run on separate machines
  – Each client must issue at least 10,000 requests
  – All requests must receive a reply (two-way invocations)
  – The middle tier must have 2 replicas (e.g., primary & backup)
  – Try all 48 combinations of the following:
    • Number of clients: 1, 4, 7, 10
    • Size of reply message: original, 256, 512, 1024 bytes
    • Inter-request time: 0 (no pause), 20, 40 ms
• Administrative
  – Each team must designate a chief experimenter
List of Client Invocations
  METHOD        ONE_WAY   DB_ACCESS   SZ_REQUEST   SZ_REPLY
  createObj()   No        Yes         16           4
  getInfo()     No        Yes         4            256
  deleteObj()   No        Yes         4            4

• METHOD: name of the remote invocation
• ONE_WAY: is it a one-way invocation (no reply)?
• DB_ACCESS: does it require a DB access (all 3 tiers are involved)?
• SZ_REQUEST: size of the forward message before marshaling (the combined sizes of all the in and inout parameters)
• SZ_REPLY: size of the return message before marshaling (the combined sizes of all the out and inout parameters)
Application Modifications
• Use only two-way invocations
  – The client must receive a reply from the server for each invocation
  – Suggestion: have at least 2 different invocations in your benchmark
• Tunable size of replies
  – Add a variable-sized parameter that is returned by the server (e.g., sequence<octet>)
  – Try the following reply sizes: original, 256 bytes, 512 bytes and 1024 bytes
• Inter-request time
  – Insert a pause in between requests
  – Try the following pauses: 0 (no pause), 20, 40 ms
  – CAUTION:
    • sleep(0) inserts a non-zero pause
    • On most Linux kernels, you cannot pause for less than 10 ms
    • For more information: http://www.atl.lmco.com/projects/QoS/RTOS_html/periodic.html
Experiments Make Your Life Meaningful
Stages of an Invocation
[Diagram: a request travels from the client application down through the replication and middleware layers onto the network, up through the same layers into the server application (and on to the database); the reply retraces the same path back to the client.]
Data Probes (1 of 7)
[Diagram: invocation path with probe P1 placed where the request leaves the client application.]
• File name: DATA749_app_out_cli_${STY}_2srv_${C}cli_${IRT}us_${BYT}req_${HOST}_team${N}.txt
• Data: time (in µs) when each request is issued
• Example: 67605, 69070, 69877, 72807, ...
• Legend:
  – ${STY}: replication style (ACTIVE or WARM_PASSIVE)
  – ${C}: number of clients
  – ${IRT}: inter-request time (in µs)
  – ${BYT}: reply size (in bytes)
  – ${HOST}: hostname
  – ${N}: your team number
Data Probes (2 of 7)
[Diagram: invocation path with probe P2 placed where the reply arrives back at the client application.]
• File name: DATA749_app_in_cli_${STY}_2srv_${C}cli_${IRT}us_${BYT}req_${HOST}_team${N}.txt
• Data: time (in µs) when each reply is received
• Example: 67605, 69070, 69877, 72807, ...
• Legend: same as for probe P1
Data Probes (3 of 7)
[Diagram: invocation path with probe P3 placed at the client application.]
• File name: DATA749_app_msg_cli_${STY}_2srv_${C}cli_${IRT}us_${BYT}req_${HOST}_team${N}.txt
• Data: name of each invocation
• Example: createObj(), createObj(), getInfo(), deleteObj(), ...
• Legend: same as for probe P1
Data Probes (example)
[Diagram: probes P1, P2 and P3 on the client side of the invocation path.]
Example (client-side pseudocode; gettimeofday() stands for any µs-resolution clock call):
  probe1.record(new Long(gettimeofday()));    // P1: timestamp just before the request is issued
  remoteFactory.createObj();                  // the remote invocation itself
  probe2.record(new Long(gettimeofday()));    // P2: timestamp when the reply is received
  probe3.record(new String("createObj()"));   // P3: name of the invocation
Data Probes (4 of 7)
[Diagram: invocation path with probe P4 placed where the request is received by the server application.]
• File name: DATA749_app_in_srv_${STY}_2srv_${C}cli_${IRT}us_${BYT}req_${HOST}_team${N}.txt
• Data: time (in µs) when each request is received
• Example: 67605, 69070, 69877, 72807, ...
• Legend: same as for probe P1
Data Probes (5 of 7)
[Diagram: invocation path with probe P5 placed where the reply is completed by the server application.]
• File name: DATA749_app_out_srv_${STY}_2srv_${C}cli_${IRT}us_${BYT}req_${HOST}_team${N}.txt
• Data: time (in µs) when each reply is completed
• Example: 67605, 69070, 69877, 72807, ...
• Legend: same as for probe P1
Data Probes (6 of 7)
[Diagram: invocation path with probe P6 placed at the server application.]
• File name: DATA749_app_msg_srv_${STY}_2srv_${C}cli_${IRT}us_${BYT}req_${HOST}_team${N}.txt
• Data: name of each invocation
• Example: createObj(), createObj(), getInfo(), deleteObj(), ...
• Legend: same as for probe P1
Data Probes (7 of 7)
[Diagram: invocation path with probe P7 placed at the server application.]
• File name: DATA749_app_source_srv_${STY}_2srv_${C}cli_${IRT}us_${BYT}req_${HOST}_team${N}.txt
• Data: hostname of the client sending the invocation
• Example: black, black, blue, magenta, ...
• Legend: same as for probe P1
Probe Invariant
[Diagram: invocation path with probes P1–P3 on the client side and P4–P7 on the server side.]
Probes at the same side and same level must have the same number of records!
Computing End-To-End Latency
[Diagram: probes P1 and P2 at the client application bracket the entire round trip.]
For request i:
  Latency(i) = P2(i) − P1(i)
(computed in the sketch below)
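A minimal Matlab sketch of this computation; the file names are shortened here for illustration (use the full DATA749_... names produced by probes P1 and P2):

  % Minimal sketch: end-to-end latency from the client-side probe files
  p1 = load('app_out_cli.txt');    % P1: time each request was issued [us]
  p2 = load('app_in_cli.txt');     % P2: time each reply was received [us]

  latency = p2 - p1;               % Latency(i) = P2(i) - P1(i)
  fprintf('mean %.0f us, median %.0f us, max %.0f us\n', ...
          mean(latency), median(latency), max(latency));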
Computing the Components of Latency
[Diagram: probes P4 and P5 at the server application bracket the time spent in the server.]
For request i:
  Server(i) = P5(i) − P4(i)
  Middleware(i) = Latency(i) − Server(i)
Computing the Request Arrival Rate
[Diagram: probe P4 records the arrival time of each request at the server.]
For request i:
  Req_rate(i) = 10^6 / (P4(i) − P4(i−1))
(the 10^6 factor converts the µs timestamps into requests per second)
Computing the Server Throughput
[Diagram: probe P4 records the arrival time of each request at the server.]
For request i:
  Throughput(i) = 10^6 × Size_reply / (P4(i) − P4(i−1))
(see the combined sketch below)
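A minimal Matlab sketch combining the last three slides, assuming the latency vector from the earlier sketch, server-side probe files from P4 and P5 (file names shortened), and matching record counts per the probe invariant; reply_size is whatever value was used in the run:

  % Minimal sketch: latency components, request rate and throughput from server-side probes
  p4 = load('app_in_srv.txt');                 % P4: time each request was received [us]
  p5 = load('app_out_srv.txt');                % P5: time each reply was completed [us]
  reply_size = 256;                            % Size_reply in bytes for this run

  server_time     = p5 - p4;                   % Server(i)     = P5(i) - P4(i)
  middleware_time = latency - server_time;     % Middleware(i) = Latency(i) - Server(i)

  req_rate   = 1e6 ./ diff(p4);                % Req_rate(i) [req/s], defined from request 2 onward
  throughput = 1e6 * reply_size ./ diff(p4);   % Throughput(i) [bytes/s]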
Graphs Required
• Line plots of latency for an increasing number of clients and different reply sizes (no pause)
• Area plots of (mean, max) latency and (mean, 99%) latency, sorted by increasing mean values
• Bar graphs of the latency-component breakdown for outliers and normal requests
• 3D scatter plots of the impact of reply size and request rate on the maximum and 99th-percentile latency (an example sketch follows below)
• Latency vs. throughput
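As one example, a minimal Matlab sketch of the required 3D scatter plot, assuming per-experiment summary vectors reply_sizes, rates and lat99 (one entry per configuration) have already been computed:

  % Minimal sketch: 99th-percentile latency vs. reply size and request rate
  scatter3(reply_sizes, rates, lat99, 'filled');
  set(gca, 'ZScale', 'log');                   % latency spans orders of magnitude
  xlabel('Reply size [bytes]'); ylabel('Request rate [req/s]'); zlabel('99% latency [\mus]');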
Interpretation of Results
• Short write-up containing the “lessons learned” from the experiments
• What did you learn about your system?
  – What can you tell (good or bad) about the performance, dependability and robustness of your application?
  – Were the results surprising?
  – If you observed some behavior you didn’t expect, how can you explain it? What further experiments would be needed to verify your hypothesis?
  – Do your results confirm or refute the “magical 1%” theory?
Tips for Experimental Evaluation
• Avoid interference
  – Use separate machines for each client, server replica, NamingService/JNDI, FT manager, database, etc.
  – Make sure there are no other processes using your CPU or bandwidth
• Minimize the impact of monitoring
  – Store data in a pre-allocated memory buffer
  – Flush the buffers to disk at the end
  – Record timestamps as time from the start of the process
    • Use 4-byte integers (long) for the timestamps
• Automate the experimental process as much as possible
  – Create scripts for launching the servers and clients, for collecting the data, for analyzing it and for creating the graphs
• Use Matlab for graphs and data processing
  – It is installed on the ECE cluster and is available to students
    • You can also download it from https://www.cmu.edu/myandrew/
  – If you need help with plotting your graphs, please send email to us
Digging Deeper
• Do the same thing while injecting faults
• Other probes
  – CPU usage (time spent in kernel and user mode)
  – Memory (total, resident set)
  – Bandwidth usage
  – Context switches
  – Major/minor page faults (page not in physical memory)
• Other ways to represent the data
  – Boxplots for end-to-end latency
  – Impact of varying the number of clients, reply size and request rate on the number and size of outliers, latency, etc.
  – Do you see multi-modal distributions (and can you explain them)?
• Interpretation of results
  – Are outliers isolated or do they come in bursts?
  – What is the source of the outliers?
  – Can you predict anything about the behavior of your system?
  – What questions can you answer by looking at this data?
Summary of Lecture
• What matters to you?
  – What experiments should you run?
  – What data should you collect?
  – How should you present your data?
  – What should you analyze?
  – What lessons might you learn about your system?
• Email all questions to the course mailing list
  – The other two TAs and myself (Tudor) are on this list
  – We’re happy to sit down and work out the details with you and to help you run your experiments
• It might sound like a lot of work, but the hard part is behind you – you’ve already built your system
  – Now it’s time to understand what you actually built!