Instructor:
Brian Davison
Presented by: Hua Jiang, Titima Boondarig
Web workload model
Examples
Hot topics
Problem-based approach
Content covers:
“A workload consists of the set
of all inputs a system receives over
a period of time.”
---- from the text book
A synthetic model
Derives from an explicit mathematical model that can be inspected, analyzed, and criticized
Modeling the real world
Controlled manner
Advantage
Disadvantage
Difficult to characterize
Measurement differs
Inter-affected
Time-span effort
“To show something, to test something, to verify that the design
meets the requirements, to document the design!”
---Dr. John Hines, Air Force Research Laboratory
Identifying parameters
Analyzing measurement data
Validating the model
$F(x) = e^{-\lambda x}$
Exponential distribution
$F(x) = P(X > x)$
Complementary cumulative distribution function
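The complementary cumulative distribution function can be estimated directly from measured samples. The following is a minimal sketch, not taken from the slides: `empirical_ccdf` and `exponential_ccdf` are hypothetical helper names, the sample values are made up, and the closed form $e^{-\lambda x}$ is the standard result for an exponential distribution with rate $\lambda$.

```python
import math

def empirical_ccdf(samples, x):
    """Estimate P(X > x) as the fraction of samples strictly greater than x."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return sum(1 for s in samples if s > x) / len(samples)

def exponential_ccdf(x, lam):
    """Closed-form P(X > x) for an exponential distribution with rate lam."""
    return math.exp(-lam * x) if x >= 0 else 1.0

# Made-up measurements, for illustration only.
samples = [1, 2, 2, 3, 5, 8, 13, 21]
print(empirical_ccdf(samples, 4))   # 4 of the 8 samples exceed 4
print(exponential_ccdf(1.0, 0.5))   # model value to compare against
```

Plotting the empirical values against the model values on a log scale is one common way to check how well a candidate distribution fits the tail.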
Probability distribution
Mean
Median
Variance
Statistics
Pareto:
$F(x) = (k/x)^{a}$, $x \ge k$
Zipf-like:
$P(r) = k r^{-c}$
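The two formulas above can be turned into a sampler and a popularity table. This is a rough sketch under stated assumptions: the function names are hypothetical, the parameter values are arbitrary, Pareto samples are drawn by inverting the complementary CDF, and the Zipf-like constant is chosen so the probabilities over ranks 1..n sum to 1.

```python
import random
import statistics

def pareto_ccdf(x, k, a):
    """P(X > x) = (k / x) ** a for x >= k, and 1 below the minimum k."""
    return 1.0 if x < k else (k / x) ** a

def sample_pareto(k, a, rng):
    """Inverse-transform sampling: solve u = (k / x) ** a for x."""
    u = 1.0 - rng.random()          # uniform on (0, 1], avoids division by zero
    return k / u ** (1.0 / a)

def zipf_probabilities(n, c):
    """P(r) proportional to r ** -c over ranks 1..n, normalized to sum to 1."""
    weights = [r ** -c for r in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(1)              # fixed seed so runs are repeatable
sizes = [sample_pareto(1.0, 1.2, rng) for _ in range(10000)]
# With a heavy tail the mean is pulled well above the median.
print(statistics.median(sizes), statistics.mean(sizes))
print(zipf_probabilities(5, 1.0))
```

For this Pareto form the theoretical median is $k \cdot 2^{1/a}$, which gives a quick sanity check on the sampler.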
HTTP message characteristics
Resource characteristics
User behavior
Request method
GET: web page retrieval
POST: web forms
PUT: file upload
DELETE: file deletion
OPTIONS: capabilities
HEAD: status
Response code
1xx: informational
2xx: success
3xx: redirection
4xx: client error
5xx: server error
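The method and response-code breakdown can be tallied from a server log. The following is a minimal sketch, assuming the log has already been parsed into (method, status code) pairs; `status_class` and `tally` are hypothetical helper names and the sample entries are made up.

```python
from collections import Counter

# First digit of the status code -> class name, as listed above.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def status_class(code):
    """Map a numeric status code such as 304 to its class name."""
    return STATUS_CLASSES.get(code // 100, "unknown")

def tally(entries):
    """Count request methods and response classes in (method, code) pairs."""
    entries = list(entries)         # allow generators to be scanned twice
    methods = Counter(method for method, _ in entries)
    classes = Counter(status_class(code) for _, code in entries)
    return methods, classes

log = [("GET", 200), ("GET", 304), ("POST", 200), ("GET", 404)]
methods, classes = tally(log)
print(methods)
print(classes)
```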
10%~30% (304)
75%~90%
small fraction
majority
Zipf-like
Lognormal
Resource popularity
Resource changes
Temporal locality
Pareto
Pareto(tail), Lognormal(body)
Response sizes
Number of embedded resources
Pareto(tail), Lognormal(body)
Content types
Resource sizes
Arun K. Iyengar, Mark S. Squillante, Li Zhang, "Analysis and Characterization
of Large-Scale Web Server Access Patterns and Performance," World Wide
Web, vol. 2, Baltzer, 1999.
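Temporal locality is often quantified with LRU stack distances: how far down a most-recently-used stack each re-referenced resource is found. The slides do not say which measure they use, so this is an assumed choice, and `stack_distances` is a hypothetical helper working on a made-up request stream.

```python
def stack_distances(requests):
    """Return the LRU stack distance of each request, or None on first reference.

    A distance of 1 means the same resource was requested immediately before.
    """
    stack = []                      # most recently used resource is at the end
    distances = []
    for resource in requests:
        if resource in stack:
            index = stack.index(resource)
            distances.append(len(stack) - index)
            stack.pop(index)
        else:
            distances.append(None)  # first reference has no distance
        stack.append(resource)      # move (or add) to the top of the stack
    return distances

print(stack_distances(["a", "b", "a", "a", "c", "b"]))
# [None, None, 2, 1, None, 3]
```

Small distances dominating the result indicate strong temporal locality. The list-based stack is quadratic in the worst case, which is fine for a sketch but not for a large trace.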
Exponential
Pareto
Pareto (tail)
Session and request arrivals
Clicks per session
Request inter-arrival time
Combining workload parameters
Validating the workload model
Generating synthetic traffic
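The three steps above can be sketched as a small trace generator. Everything here is an assumption made for illustration: `generate_requests` is a hypothetical function, inter-arrival times are taken to be exponential, resource popularity is taken to be Zipf-like, and the parameter values are arbitrary rather than fitted to any measurement.

```python
import random

def generate_requests(n, rate, num_resources, zipf_c, seed=0):
    """Generate n synthetic requests as (arrival_time, resource_rank) pairs."""
    rng = random.Random(seed)       # seeded so the trace is reproducible
    ranks = list(range(1, num_resources + 1))
    weights = [r ** -zipf_c for r in ranks]   # Zipf-like popularity by rank
    now = 0.0
    trace = []
    for _ in range(n):
        now += rng.expovariate(rate)          # exponential inter-arrival gap
        rank = rng.choices(ranks, weights=weights, k=1)[0]
        trace.append((now, rank))
    return trace

# Simple validation step: the mean gap should be close to 1 / rate.
trace = generate_requests(n=10000, rate=5.0, num_resources=100, zipf_c=1.0)
gaps = [b[0] - a[0] for a, b in zip(trace, trace[1:])]
print(sum(gaps) / len(gaps))
```

A fuller model would also draw response sizes and session lengths from their own distributions and compare each generated statistic against the measured one.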
Log & Privacy policies
New technical developments
Application of user-level data
Server
Browser
WHY?
Proxy
Information available to software components
HTTP header
Configuring the browser
Directing requests to an anonymizing proxy
Using SSL and HTTPS
Access to user-level data
Identify performance
Benchmarking web components
Capacity planning
ApacheBench: small and simple; reports how fast it can retrieve content
httperf: flexible tool; fine-grained; similar methodology to SPECweb99
WebStone: original and still popular web server benchmark; free benchmarking system
SPECweb99: a commercial-grade benchmark system ($200-$800)
SPECweb96: first World Wide Web server benchmark; standardized workload, agreed to by major players in the WWW market; retired on April 24, 2000
K. Kant and Y. Won, "Server Capacity Planning for Web Traffic Workload," IEEE Transactions on Knowledge and Data Engineering, Oct. 1999, pp. 731-747.
Server application changes
Sophisticated clients
Large and distributed servers
Social factors
Others
Overload control
Dynamic content
Locality
Self-similarity
The property of an object whose appearance is unchanged regardless of the scale at which it is viewed.
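One way to look at a traffic series "at different scales" is to average it over blocks of increasing length and watch how the variance changes. The following is a sketch of that aggregated-variance idea, not a method given in the slides; `aggregate` and `variance_by_scale` are hypothetical names and the input is synthetic independent noise used only as a baseline.

```python
import random
import statistics

def aggregate(series, m):
    """Average the series over non-overlapping blocks of length m."""
    blocks = len(series) // m       # a trailing partial block is dropped
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(blocks)]

def variance_by_scale(series, scales):
    """Population variance of the aggregated series at each block length."""
    return {m: statistics.pvariance(aggregate(series, m)) for m in scales}

rng = random.Random(3)
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
for m, var in variance_by_scale(noise, [1, 10, 100]).items():
    print(m, var)
```

For independent data the variance falls roughly as 1/m. A self-similar series with Hurst parameter H between 0.5 and 1 decays more slowly, roughly as m^(2H-2), which is why bursts remain visible at coarse time scales.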
Web workload
Modeling approach
Characteristics
Applications
Future trends