Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes

Mark E. Crovella and Azer Bestavros
Computer Science Dept,
Boston University
Presented by
Kalyan Boggavarapu
CSC 497
Lehigh University
Self-Similarity
• Def: a self-similar object is one whose appearance is
unchanged regardless of the scale at which it
is viewed.
• Heavy-tailed:
  - a distribution whose tail follows a power law.
  - E.g.: the geographical distribution of the
people in the world.
• World Wide Web traffic can show self-similarity
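A heavy tail can be made concrete with a small numerical sketch. The Python below assumes the usual power-law form P[X > x] ~ x^(-α), draws synthetic Pareto samples (not the paper's traces), and recovers the tail index α with a Hill-style estimator. The function name `hill_tail_index`, the seed, the sample size, and the cutoff `k` are illustrative assumptions.

```python
import math
import random

def hill_tail_index(samples, k):
    """Hill estimate of the tail index alpha from the k largest samples."""
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest value
    log_excess = sum(math.log(x / threshold) for x in ordered[:k])
    return k / log_excess

random.seed(0)
alpha_true = 1.2  # assumed tail index, chosen only for illustration
# random.paretovariate draws from a Pareto distribution with minimum value 1
sizes = [random.paretovariate(alpha_true) for _ in range(20000)]
alpha_hat = hill_tail_index(sizes, k=2000)
print(round(alpha_hat, 2))
```

The estimate should land close to the assumed α. A smaller α means a heavier tail, so a few very large values dominate the total.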
5/28/2016
Kalyan Boggavarapu CSC 497
Lehigh University
2
Data Set
• Traces from NCSA Mosaic
  - January and February 1995
• Logs: URL, session, user, and
workstation ID
• Experiment environment:
  - 37 SPARCstation 2 workstations
Self Similarity Characteristics
Parameters
• Degree of self-similarity: H
  - Hurst parameter H, in the range (1/2, 1)
  - As H -> 1, the degree of self-similarity increases
 In this paper we would see
Analysis in two stages
• Stage 1:
  - What is the appropriate value of H?
• Stage 2:
  - Which property of the traffic accounts for this
value of H?
Stage 1:
Estimate the value of H
Self Similarity for different time
intervals
• Step 1: Estimate for short intervals (1 second and above)
  - Using: web traffic data for a single hour
  - Plots: variance-time plot, rescaled range plot, periodogram plot
• Step 2: Estimate for scaling to large intervals
  - Whittle estimator
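The variance-time plot from Step 1 can be sketched in a few lines of Python. The series is averaged over non-overlapping blocks of size m, the variance of each aggregated series is plotted against m on log-log axes, and the fitted slope -β gives H = 1 - β/2. This is a minimal sketch under stated assumptions: the helper names `aggregate` and `variance_time_hurst` are hypothetical, and the input is uncorrelated Gaussian noise standing in for a traffic trace, so the estimate should come out near 1/2 rather than in the self-similar range.

```python
import math
import random
import statistics

def aggregate(series, m):
    """Average the series over non-overlapping blocks of size m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

def variance_time_hurst(series, block_sizes):
    """Estimate H from the slope of log(variance) against log(m)."""
    xs, ys = [], []
    for m in block_sizes:
        xs.append(math.log(m))
        ys.append(math.log(statistics.variance(aggregate(series, m))))
    x_mean, y_mean = statistics.fmean(xs), statistics.fmean(ys)
    # Ordinary least-squares slope of the log-log points
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return 1 + slope / 2  # slope = -beta, and H = 1 - beta/2

random.seed(1)
# Stand-in data: independent noise, not a real traffic trace
noise = [random.gauss(0, 1) for _ in range(2 ** 14)]
h_noise = variance_time_hurst(noise, [2 ** k for k in range(8)])
print(round(h_noise, 2))
```

For self-similar traffic the variance decays more slowly than 1/m, so the same procedure would return a value between 1/2 and 1.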
Self-similarity characteristics: graphs
[Figure: three plots, each annotated to show that the slope of the fitted line gives H]
Whittle Estimator
• Estimates: H together with a confidence interval
• Based on: a time series and an assumed model
  - FGN (Fractional Gaussian Noise) model
• Now check: whether the estimated H stays consistent
as the time series is aggregated
• Infer: WWW traffic at stub networks is
self-similar when traffic demand is
high.
Expected feature: aggregation => H
Aggregation over a long range shows the stability of
the hypothesis.
[Figure: Whittle estimates of H with 95% confidence intervals, for hours ranging from fully busy to least busy]
• The Whittle estimator confirms our earlier calculations of H.
• H decreases as the traffic becomes less busy.
Stage 2:
Which property explains the estimated
value of H?
Which property is responsible for
self-similarity?
File requests => file transfers => unique files distribution
Heavy-tail index: α ≈ 1.2
Estimated H: 0.7 to 0.8
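One way to relate the two numbers above is the ON/OFF source model result that heavy-tailed ON or OFF periods with tail index α produce self-similar aggregate traffic. The relation below is an idealized result brought in as an assumption here; it is not stated on the slide.

```latex
H = \frac{3 - \alpha}{2}, \qquad 1 < \alpha < 2
\quad\Longrightarrow\quad
\alpha = 1.2 \;\mapsto\; H = \frac{3 - 1.2}{2} = 0.9
```

The idealized value lies above the 0.7 to 0.8 range estimated from the traces, so the relation is best read as a qualitative link between heavy-tailed transfers and self-similarity rather than an exact prediction.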
It's the Available Files
Available files => heavy-tailed behavior of file transfers
Conclusion:
Distribution of available files
=>
(Web traffic self-similarity =
heavy-tailed distribution of
file transfers)
Sources:

Mark Crovella and Azer Bestavros, “Self-Similarity in World Wide Web Traffic: Evidence and Possible
Causes,” Proceedings of SIGMETRICS'96: The ACM International Conference on Measurement
and Modeling of Computer Systems, 1996.