Final Presentation

Extension to PerfCenter: A Modeling and
Simulation Tool for Datacenter Applications
Nikhil R. Ramteke,
Advisor: Prof. Varsha Apte,
Department of CSA, IISc
27th May 2011
Multi-tiered Networked Applications
Fig: A multi-tiered networked application with web servers, an auth server, and a DB server
Important Performance Metrics
• Response time
• Utilization
• Throughput
• Waiting time
• Queue length
• Arrival rate
• Blocking probability
• Average service time
Flow of a request through such a system
Fig: Flow of a “login” request through a deployment of nine machines hosting web, auth, and DB servers
PerfCenter
• Performance modeling tool: it builds and solves the system model
• It takes the system details as input and builds the system model
• The system model is built as a network of queues
• The built model is solved either by simulation or by analytical methods
• Open source, available at:
http://www.cse.iitb.ac.in/perfnet/softperf/cape/home/wosp2008page
Fig: PerfCenter tool structure
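To make the “network of queues solved by simulation” idea concrete, here is a minimal sketch in Python of simulating a single FCFS queue (illustrative only, not PerfCenter's simulation engine); the arrival and service rates chosen here match the validation scenario used later in this presentation.

# Sketch: solving one FCFS queue by simulation; PerfCenter builds and
# solves a whole network of such queues.
import random

def simulate_fcfs_queue(arrival_rate, service_rate, n_requests, seed=1):
    random.seed(seed)
    clock, server_free_at, total_response = 0.0, 0.0, 0.0
    for _ in range(n_requests):
        clock += random.expovariate(arrival_rate)    # next arrival time
        start = max(clock, server_free_at)           # wait if server is busy
        service = random.expovariate(service_rate)
        server_free_at = start + service
        total_response += server_free_at - clock     # response time of this request
    return total_response / n_requests               # average response time

print(simulate_fcfs_queue(arrival_rate=10, service_rate=100, n_requests=100000))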
PerfCenter
(Input Language)
Host Specification:

host machine1[2]
ram 1000
cpu count 1
cpu buffer 99999
cpu schedP fcfs
cpu speedup 1
disk count 1
disk buffer 99999
disk schedP fcfs
disk speedup 1
...
end

Server Specification:

server web
thread count 1
thread buffer 9999
thread schedP fcfs
thread size 0.610
staticsize 100
requestsize 0.5
task node2
task node5
task node9
...
end
Feature Enhancements to PerfCenter
(Problem Definition)
Among the various enhancements possible, our contributions are the
following:
Memory Model:
Memory can be a bottleneck when deploying servers on a host.
Individual server utilization on a device:
• PerfCenter can predict the device utilization of a host,
• but it cannot estimate how much each server contributes to that utilization.
• This feature enables the user to find the bottleneck server quickly.
Timeout and Retries:
Aimed at capturing user behavior such as “stop-reload”.
Memory Usage Modeling
PerfCenter system model for memory usage:

Servers:
• Static size of server
• Per-thread memory usage
• Per-request memory usage (increases with queue length)

Host:
• RAM size for each host

Per-server RAM utilization:
Per server util = (Static size + Thread size * total threads + Request size * Avg. queue length of request queue) / RAM size
Input language specification:
Server web
staticsize 80
thread size 2
requestsize 2
end
Host host1
ram 2000
end
Metrics:
• util(host_name:ram) //overall RAM util
• util(host_name:server_name:ram) //RAM util by a server
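As an illustration of the formula above, here is a minimal sketch (plain Python, not part of PerfCenter) of the per-server RAM utilization, using the sample values staticsize 80, thread size 2, requestsize 2, and ram 2000 from the specification; the thread count and average queue length used here are hypothetical.

# Sketch: per-server RAM utilization, per the formula above.
def ram_utilization(static_size, thread_size, thread_count,
                    request_size, avg_queue_length, ram_size):
    used = (static_size
            + thread_size * thread_count          # memory held by threads
            + request_size * avg_queue_length)    # memory held by queued requests
    return used / ram_size

# Hypothetical example: server "web" on host1 with 10 threads and an
# average of 15 requests in its queue.
util = ram_utilization(static_size=80, thread_size=2, thread_count=10,
                       request_size=2, avg_queue_length=15, ram_size=2000)
print(f"util(host1:web:ram) = {util:.2%}")        # 6.50%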
Software Design Changes Required for
Memory model and Individual Server Utilization
• Memory model:
  • Added members static size, thread size, and request size to the software server class,
  • Added member RAM size to the host class,
  • No change required to the dynamic statistics calculation in simulation,
  • The average queue length calculated at the end of simulation is used.
• Individual server utilization of host devices:
  • Must keep track of which server issued each request to the device.
Fig: Device queue in which each queued request (R) is tagged with the server (S1, S2, S3) that issued it
  • Class member update: per-server total busy time and utilization variables added to the software queue class,
  • Some additional bookkeeping during simulation (per-server statistics).
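A minimal sketch (plain Python, not PerfCenter source; class and method names are hypothetical) of this extra bookkeeping: each device queue accumulates busy time per issuing server, from which per-server device utilization follows.

from collections import defaultdict

class DeviceQueue:
    """Queue of a host device (e.g. CPU or disk)."""
    def __init__(self, name):
        self.name = name
        self.busy_time = defaultdict(float)   # busy time attributed to each server

    def record_service(self, server_name, service_time):
        # Called when the device finishes serving a request; the request
        # carries the name of the software server that issued it.
        self.busy_time[server_name] += service_time

    def server_utilization(self, server_name, sim_duration):
        # util(host:server:device) = time busy on behalf of the server / simulated time
        return self.busy_time[server_name] / sim_duration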
Timeouts and Retries:
• Characteristics of real users of server systems:
• Impatience: users abandon a request if the response is not received within their expected time
• Retries: users often retry just after abandoning a request (e.g., “stop-reload” behavior on a Web browser)
This behavior is common in client-server based applications.
Timeouts may affect system performance in the following ways:
• Reduction in throughput,
• Completed requests may have already timed out, so successful requests need to be counted separately,
• Utilization may decrease due to the lower throughput,
• Average response time decreases due to the increase in request timeouts.
Timeouts and Retries:
When a request is submitted to an application, one of the following
things can happen:
(Arrival of request → Server; failed requests may be retried)

• Drop [Drop rate (D)]
• Timeout in buffer [Timeout-in-buffer rate (Tb)]: the request does not leave the queue immediately; when it is picked up by the s/w server it is counted as failed.
• Timeout during service [Badput (B)]: request processing is not aborted immediately; processing goes to completion, but the request is counted as failed.
• Successfully completed [Goodput (G)]
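A minimal sketch (plain Python, not PerfCenter source) of how a finished request could be classified into these four outcomes; the function and parameter names are hypothetical.

def classify(arrival, timeout, pickup=None, completion=None, dropped=False):
    """Return 'D', 'Tb', 'B' or 'G' for a request with the given timestamps."""
    deadline = arrival + timeout
    if dropped:
        return "D"            # dropped on arrival
    if pickup >= deadline:
        return "Tb"           # timed out while still waiting in the buffer
    if completion > deadline:
        return "B"            # timed out during service: counted as badput
    return "G"                # completed before its deadline: goodput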
Timeouts and Retries:
(PerfCenter system model)
The mean timeout value and a distribution are taken as input; the
timeout value of each request is set by drawing from this
distribution.
Input language:
loadparams
timeout distribution_name(distribution_parameters)
.
.
end
E.g.: loadparams
timeout exp(0.5)
end
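A minimal sketch (plain Python, not PerfCenter source) of the behavior described above: each request gets its own timeout value drawn from the configured distribution, here exponential as in timeout exp(0.5); whether the parameter denotes the mean or the rate is an assumption made only for this illustration.

import random

class Request:
    def __init__(self, arrival_time, timeout_value):
        self.arrival_time = arrival_time
        self.timeout_value = timeout_value   # per-request timeout
        self.timed_out = False               # set later by the simulation

def new_request(arrival_time, mean_timeout=0.5):
    # Exponentially distributed timeout with the given mean.
    return Request(arrival_time, random.expovariate(1.0 / mean_timeout))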
Timeouts and Retries:
(PerfCenter system model)
Overall G, B, D, and Tb can now be estimated with PerfCenter as follows.
Output Language:
gput()          // overall Goodput
bput()          // overall Badput
buffTimeout()   // overall timeout-in-buffer rate
droprate()      // overall drop rate
Timeouts and Retries:
Software Design Changes:
• Added members timeout flag and mean timeout to the Request class,
• Added number of requests processed, number of requests timed out in
buffer, number of requests timed out in service, Goodput, Badput,
drop rate, and timeout-in-buffer rate to the scenario simulation class.
• No extra events are added.
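A minimal sketch (plain Python, not PerfCenter source; names are hypothetical) of the counters described above and of how the reported rates G, B, Tb, and D could be derived from them at the end of a run.

class ScenarioStats:
    def __init__(self):
        self.completed_ok = 0          # completed before their timeout
        self.timed_out_in_service = 0  # finished service but already timed out
        self.timed_out_in_buffer = 0   # timed out while waiting in the buffer
        self.dropped = 0               # dropped on arrival

    def rates(self, sim_duration):
        # These correspond to gput(), bput(), buffTimeout(), droprate().
        return {
            "gput": self.completed_ok / sim_duration,
            "bput": self.timed_out_in_service / sim_duration,
            "buffTimeout": self.timed_out_in_buffer / sim_duration,
            "droprate": self.dropped / sim_duration,
        }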
Validation:
• Validation done using sanity checks: results should follow expected rules and trends
• Scenario used for validation:
  • Type of system: Open
  • Service rate: 100
  • Arrival rate: varied from 10 to 100
  • Timeout rate: 10
  • Timeout distribution: Exponential
  • Requests simulated: 1000000
  • Number of repetitions: 20
• Input file
Results
Fig : RAM utilization v/s Arrival Rate
Results
Annotation: Goodput decreases as more requests time out in the buffer.
Fig : G, B, Tb, D v/s Arrival Rate
Results
Annotation: The utilization curve follows throughput (G + B); it starts decreasing because more requests are timing out in the buffer.
Fig : Utilization, Throughput v/s Arrival Rate
Results
Annotation: Utilization decreases due to more request timeouts.
Fig : Individual server utilization v/s Arrival Rate
Results
Annotation: Average response time decreases due to timeouts.
Fig : Average Response Time v/s Arrival Rate
Summary of Work Done
Before Midterm:
Background Study
• Queuing theory,
• Simulation modeling,
• Performance issues of multi-tiered systems,
• PerfCenter
After Midterm:
Developed an abstraction and an input language, and updated the
PerfCenter simulation engine for:
• Adding the memory model,
• Updating the utilization model for individual server utilization
on a device,
• Adding the timeouts and retries model.
Conclusion:
PerfCenter is a performance modeling tool, and it can now be used by
performance analysts with a few more useful features added, the most
important one being timeouts and retries.
We validated our models using test experiments. Illustrative results
show how PerfCenter can be used for estimating application
performance in the presence of the following features:
• Memory model
• Individual server utilization
• Timeout and retries model.
As results show, this can change data center sizing plans.
Future work:
• Predicting G, B, Tb, and D for individual queuing systems,
• More validation is needed to increase confidence in the tool,
• More features need to be added to increase the power of the tool.
References:
1. R. P. Verlekar, V. Apte, P. P. Goyal, and B. Aggarwal. PerfCenter: A methodology and tool for performance analysis of application hosting centers. MASCOTS '07: Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, 2007, pages 201-208.
2. Supriya Marathe, Varsha Apte, and Akhila Deshpande. PerfCenter: A performance modeling tool for application hosting centers. WOSP '08: Proceedings of the 7th International Workshop on Software and Performance, 2008.
3. Kishor S. Trivedi. Probability and Statistics with Reliability, Queuing, and Computer Science Applications. PHI Learning Private Limited, Eastern Economy edition, 2009.
References:
4. Averill M. Law and W. David Kelton. Simulation Modeling and Analysis. Tata McGraw-Hill, 2000.
5. Daniel A. Menasce and Virgilio A. F. Almeida. Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning. Prentice Hall PTR, 2000.
6. Supriya Marathe. Performance Modeling for Distributed Systems. Master's thesis, IIT Bombay, Mumbai, India, June 2008.
7. Puram Niranjan Kumar. Validation, Defect Resolution and Feature Enhancements of PerfCenter. Master's thesis, IIT Bombay, Mumbai, India, June 2008.