DOC format - AU Journal

Web Server Workload Forecasting –
Fuzzy Linguistic Approach
Chin Wen Cheong , Amy Lim Hui Lan & V.Ramachandran
Faculty of Information Technology
Multimedia University
63100 Cyberjaya, Selangor, Malaysia
E-mail : wcchin@mmu.edu.my
ABSTRACT
Web server workload forecasting is one of the essential considerations in web server management and network upgrading. Because the server workload distribution varies with users' unpredictable surfing behavior, the Web server performance metrics are characterized and modeled in a fuzzy manner. A fuzzy inference system is formed from four Web server performance metrics, and a server utilization index is derived to determine the server's utilization state for every time period. A fuzzy Markov model is proposed to illustrate the state transitions of server resource utilization based on experts' linguistic evaluation of the stationary transition probability. A steady-state algorithm is applied to explore the convergence of server resource utilization after n transition periods.
Keywords: Fuzzy Logic, Fuzzy Inference System, Markov Chain.
I. INTRODUCTION
The WWW is growing rapidly, and web service responsiveness is degrading under unprecedented workload. The exponential growth of client demand causes imbalanced workload distribution across homogeneous and heterogeneous web servers. Traffic congestion becomes critical when the DNS scheduling [1,2] of a multiserver system fails to provide the scalability and flexibility needed to handle the traffic load. Additionally, traffic burstiness during peak hours [3, 4] at certain time scales slows web server services to a crawl. Thus, to provide highly reliable service on the business-oriented WWW, workload handling and forecasting are vital.
A great deal of work has been carried out on forecasting, including statistical and artificial intelligence approaches. The statistical approach comprises moving average, exponential smoothing, time series, regression and economic modeling [5]. Artificial intelligence (AI) concepts, which include knowledge engineering, expert systems, fuzzy logic, artificial neural networks (ANN) and genetic algorithms (GA), have also been introduced to forecasting. However, both the traditional and the AI forecasting methods have drawbacks. Statistical methods are ill suited to representing vague input data and human judgement. Likewise, although GA and ANN forecasting possess powerful search and learning capabilities over past data, they are better suited to numerical data than to linguistic values.
Due to the fluctuation of the server workload distribution, the numerical data collected is vague. To increase typicality and accuracy, the forecasting method is expected to process numerical information together with expert judgements for future workload estimation. Thus, fuzzy control seems to be a better candidate, since it is an effective approach for utilizing linguistic rules derived from numerical data pairs. Many fuzzy forecasting models have been proposed, such as fuzzy self-regression by Feng and Guang [6], the fuzzy-neural approach to time series prediction by Nie [7] and the forecasting method of Wang and Mendel [8].
In this paper, a fuzzy Markov model is formed to represent the transitions of server resource utilization and to forecast the future server workload state based on experts' linguistic evaluation of the state transition matrix. The experts' judgement may refer to the distribution of the server utilization index derived from a fuzzy inference system, which is characterized by four Web server performance metrics, namely the server's latency (milliseconds), service rate (connections per second), throughput (megabits per second) and the occurrence of errors (errors per second). Fuzzy rules are established by taking into account different combinations of fuzzified workload metric values with regard to predefined membership functions. A fuzzy algorithm is utilized to explore the steady states of the server workload after n transition periods.
II. WEB SERVER WORKLOAD MODELING
The performance of a server system in terms of throughput is normally viewed as the rate at which requests are served. Owing to the fluctuation of the file sizes requested by users over the WWW, throughput [9] is sometimes measured in megabits per second as well. Additionally, the response time or CPU time of a server system, as a part of the overall latency, is considered as a metric to evaluate web server performance. Normally, it depends on the availability of bandwidth, the server's overall performance and the performance of the client's machine. Finally, the existence of a connection queue can prevent users from interacting with the particular server. Consequently, the degradation of the server's response is measured in terms of errors per second. For every time window, the server utilization index is governed by four predefined Web server performance metrics, which are specified by a 4-tuple such that [10]:
I = { M1, M2, M3, M4 }
where:
M1 - server’s service rate(connection per second)
M2 - server’s throughput (megabits per second)
M3 - server’s latency(milliseconds)
M4 - server’s error frequency(error per millisecond)
Let A1, A2, A3 and A4 represent the fuzzy sets for M1, M2, M3 and M4 over time window Tc, where c is an integer. The fuzzification of the four metric values is based on predefined membership functions, as illustrated in Fig. 1. At a specific instant of time, all the measured data for M1, M2, M3 and M4 are fuzzified by the membership functions into their associated fuzzy sets A1, A2, A3 and A4.
For instance, the workload metrics are fuzzified into four fuzzy linguistic spaces as follows:
A1 = {Low, Medium, High}
A2 = {Small, Fair, Large}
A3 = {Slow, Moderate, Fast}
A4 = {Not frequent, Moderate, Very frequent}
III. FUZZY INFERENCE SYSTEM FOR SERVER'S UTILIZATION DETERMINATION
A fuzzy inference system [11] is based on fuzzy set theory and a fuzzy rule-based approach to decision analysis in a variety of fields, especially those dealing with uncertain and complex systems. The portability of a fuzzy system allows a human linguistic approach to determine the grades of membership as well as the fuzzy rules for the inference engine. A fuzzy inference system involves several procedures that conclude with either a fuzzy or a crisp result, depending on the user's intention. The components of a fuzzy system are fuzzification (fuzzy input memberships), the inference engine and defuzzification.
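The fuzzification step can be sketched as follows. This is only an illustration: the triangular shape and the breakpoints in `SERVICE_RATE_TERMS` are hypothetical choices, not the membership functions used in the paper, and the helper names `triangular` and `fuzzify` are introduced here for the example.

```python
from typing import Dict, Tuple


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Grade of membership of x in a triangle with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical breakpoints for M1 (connections per second); not from the paper.
SERVICE_RATE_TERMS: Dict[str, Tuple[float, float, float]] = {
    "Low": (-10.0, 0.0, 10.0),
    "Medium": (0.0, 10.0, 20.0),
    "High": (10.0, 20.0, 30.0),
}


def fuzzify(x: float, terms: Dict[str, Tuple[float, float, float]]) -> Dict[str, float]:
    """Map a crisp measurement to its grade of membership in every linguistic term."""
    return {name: triangular(x, *abc) for name, abc in terms.items()}


# A measured service rate of 5.5 connections per second.
grades = fuzzify(5.5, SERVICE_RATE_TERMS)
print(grades)
```

With these assumed breakpoints a single measurement belongs partly to Low and partly to Medium, which is the overlap the fuzzy rules later exploit.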
The rules that decide the utilization values are given below:
Table 1: Fuzzy Rules
| IF: A1 | O   | A2    | O   | A3       | O   | A4            | THEN: Utilization State |
|--------|-----|-------|-----|----------|-----|---------------|-------------------------|
| Low    | And | Small | And | Slow     | And | None          | Not Significant         |
| Low    | And | Small | And | Moderate | And | Not Frequent  | Not Significant         |
| Low    | And | Small | And | Moderate | And | Moderate      | Not Significant         |
| Low    | And | Small | And | Moderate | And | Very Frequent | Normal                  |
| High   | And | Large | And | Moderate | And | Moderate      | Extremely Critical      |
| High   | And | Large | And | Moderate | And | Very Frequent | Extremely Critical      |
| High   | And | Large | And | Fast     | And | Not Frequent  | Extremely Critical      |
| High   | And | Large | And | Fast     | And | Moderate      | Extremely Critical      |
| High   | And | Large | And | Fast     | And | Very Frequent | Extremely Critical      |
Fuzzy inferencing uses max-min composition to calculate the server's utilization value. Fuzzy inference with the predefined rules integrates the intensities of the 4-tuple and resolves the server's utilization into one of four states: Extremely Critical, Critical, Normal and Insignificant. Fig. 1 illustrates the linguistic input and output of the model system.
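A minimal sketch of max-min rule evaluation is given here for illustration. The four rules are taken from Table 1, but the grades in `GRADES` are hypothetical inputs, and the function name `infer` is an assumption of this example rather than part of the paper. Each rule fires with the minimum of its antecedent grades (the And operator), and rules that share a conclusion are combined with the maximum.

```python
from typing import Dict, List, Tuple

# (A1, A2, A3, A4) antecedent terms and the resulting utilization state.
Rule = Tuple[Tuple[str, str, str, str], str]

RULES: List[Rule] = [
    (("Low", "Small", "Moderate", "Not frequent"), "Not Significant"),
    (("Low", "Small", "Moderate", "Moderate"), "Not Significant"),
    (("Low", "Small", "Moderate", "Very frequent"), "Normal"),
    (("High", "Large", "Fast", "Very frequent"), "Extremely Critical"),
]

# Hypothetical fuzzified inputs for M1..M4; not measured data.
GRADES: List[Dict[str, float]] = [
    {"Low": 0.8, "Medium": 0.2, "High": 0.0},
    {"Small": 0.6, "Fair": 0.4, "Large": 0.0},
    {"Slow": 0.1, "Moderate": 0.7, "Fast": 0.2},
    {"Not frequent": 0.3, "Moderate": 0.5, "Very frequent": 0.2},
]


def infer(grades: List[Dict[str, float]], rules: List[Rule]) -> Dict[str, float]:
    """Return the max-min strength of every utilization state."""
    strengths: Dict[str, float] = {}
    for antecedent, state in rules:
        # And: a rule is only as strong as its weakest antecedent.
        firing = min(g[term] for g, term in zip(grades, antecedent))
        # Rules with the same conclusion are combined by taking the maximum.
        strengths[state] = max(strengths.get(state, 0.0), firing)
    return strengths


result = infer(GRADES, RULES)
print(result)
```

A defuzzification step, which this sketch omits, would then turn these strengths into the crisp utilization index.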
Figure 1: Parameters and Associated Membership Functions

IV. FUZZY MARKOV MODELING OF SERVER RESOURCE UTILIZATION
Let a finite process in discrete time have the discrete state space (S1, S2, S3, ..., Sn). A fuzzy process is similar to a stochastic process; therefore a finite fuzzy process can be established based on the following conditions:

(a) The transition probabilities are arranged in a finite square n × n matrix M with the following structure, which is known as the fuzzy state transition matrix of a fuzzy process (rows and columns are indexed by the states S1, S2, ..., Sn):

\[
M = (m_{ij}) =
\begin{pmatrix}
m_{11} & m_{12} & \cdots & m_{1n} \\
m_{21} & m_{22} & \cdots & m_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
m_{n1} & m_{n2} & \cdots & m_{nn}
\end{pmatrix}
\]

where $m_{ij}$ represents the grade of membership of the transition going from state i to state j, which is confined to $0 < m_{ij} < 1$.

(b) Generally, the fuzzy state $U_X(k)$ at each time interval is a row vector as below:

\[
U_X(k) = ( V_{i1}(k), V_{i2}(k), \ldots, V_{in}(k) )
\]

where k denotes the instant of time and V denotes the grade of membership of the sets with respect to fuzzy set X. Additionally, the row vector at the initial time k = 0 is known as the initial state designator of X and has the following form:

\[
U_X(0) = ( V_{i1}(0), V_{i2}(0), \ldots, V_{in}(0) )
\]

(c) Hence, the state designator of X at time T = n can be obtained by the following equation:

\[
U_X(n) = U_X(0) \circ M^{(n)} = U_X(0) \circ [\, m_{ij}^{(n)} \,]
\]

where the multiplication between the initial designator row vector and the fuzzy transition matrix involves the basic union and intersection operations.

IV. NUMERICAL ESTIMATES OF SERVER RESOURCES UTILIZATION
The imprecision and vagueness in a server's resource utilization are represented by assigning a linguistic variable as the utilization index. The monitoring target is a web server that delivers several types of objects, such as images, HTML pages, audio and video clips and some formatted documents, and the four predefined metrics are analyzed for it. Initially, it is assumed that the web server is monitored for each hour. By implementing the FIS, the Web administrator classifies the web server's utilization status as illustrated in Table 2.

Table 2: Utilization States References
| States                 | Server's utilization state |
|------------------------|----------------------------|
| Extremely Critical (1) | > 0.600                    |
| Critical (2)           | 0.400-0.600                |
| Normal (3)             | 0.200-0.400                |
| Insignificant (4)      | < 0.200                    |

The initial predefined workload parameters are shown in Table 3. Based on Table 2, the occurrences of each state within the time period are counted, and their normalized values are illustrated in Table 4.

Table 3: Server's Utilization States Determination
| Time period in an hour (3 minutes) | Average C/s | Average M/s | Average t/ms | Average e/ms | FIS output | Server utilization state as determined |
|------------------------------------|-------------|-------------|--------------|--------------|------------|----------------------------------------|
| Time 1                             | 5.5         | 2.0         | 3            | 0.001        | 0.28       | Normal(3)                              |
| Time 2                             | 12.5        | 3.5         | 4            | 0.005        | 0.38       | Normal(3)                              |
| Time 3                             | 18.3        | 4.0         | 16           | 0.002        | 0.57       | Critical(2)                            |
| ...                                | ...         | ...         | ...          | ...          | ...        | ...                                    |
| Time 18                            | 15.0        | 3.0         | 15.0         | 0.006        | 0.51       | Critical(2)                            |
| Time 19                            | 1.0         | 5.0         | 5.0          | 0.005        | 0.33       | Normal(3)                              |
| Time 20                            | 4.0         | 3.0         | 5.0          | 0.001        | 0.31       | Normal(3)                              |

Table 4: Normalized Value of State Occurrence
| States                 | Number of occurrence | Normalized value |
|------------------------|----------------------|------------------|
| Extremely critical (1) | 6                    | 0.75             |
| Critical (2)           | 8                    | 1.00             |
| Normal (3)             | 4                    | 0.50             |
| Not significant (4)    | 2                    | 0.33             |
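The mapping from FIS output to utilization state, and the normalization of the state counts, can be sketched as follows. Two assumptions are made that the paper does not state: boundary values (exactly 0.200, 0.400 or 0.600) are assigned to the more critical neighbouring state except at 0.600, and the counts are normalized by dividing by the largest count. Under the second assumption the first three normalized values of Table 4 are reproduced, while the fourth evaluates to 0.25. The names `classify` and `normalize` are introduced only for this example.

```python
from typing import List


def classify(index: float) -> str:
    """Map a utilization index to a state using the ranges of Table 2."""
    if index > 0.6:
        return "Extremely Critical(1)"
    if index >= 0.4:  # assumed handling of the 0.400 and 0.600 boundaries
        return "Critical(2)"
    if index >= 0.2:  # assumed handling of the 0.200 boundary
        return "Normal(3)"
    return "Insignificant(4)"


def normalize(counts: List[int]) -> List[float]:
    """Divide every count by the largest count (assumed normalization)."""
    peak = max(counts)
    return [c / peak for c in counts]


# FIS outputs of Time 1 to Time 3 in Table 3.
states = [classify(v) for v in (0.28, 0.38, 0.57)]
# Occurrence counts of states 1 to 4 in Table 4.
normalized = normalize([6, 8, 4, 2])
print(states, normalized)
```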
The state transition matrix is determined by the experts' subjective evaluation, referring to Table 4 for the distribution and changes of the server utilization index. It is assumed that the state transition matrix and the initial designator vector for this specific case are given as follows:

\[
P =
\begin{pmatrix}
H  & VH & M & L  \\
VM & EH & L & VL \\
M  & L  & M & L  \\
L  & VL & M & L
\end{pmatrix}
\]

with the discrete state space (1, 2, 3, 4), and

\[
U_X(0) = [\; VH \quad EH \quad M \quad L \;]
\]

The state space diagram is illustrated in Fig. 2.
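The max-min product behind $U_X(n) = U_X(0) \circ M^{(n)}$ can be illustrated numerically. In the sketch below intersection is taken as min and union as max; the matrix `M` and the vector `U0` hold hypothetical numeric grades chosen for the example and are not a translation of the linguistic matrix P. The function names are likewise assumptions of this example.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def maxmin_vec(u: Vector, m: Matrix) -> Vector:
    """Row vector times matrix with min as product and max as sum."""
    return [
        max(min(u[i], m[i][j]) for i in range(len(u)))
        for j in range(len(m[0]))
    ]


def maxmin_mat(a: Matrix, b: Matrix) -> Matrix:
    """Max-min product of two fuzzy matrices, row by row."""
    return [maxmin_vec(row, b) for row in a]


# Hypothetical 4-state fuzzy transition matrix and initial designator.
M: Matrix = [
    [0.6, 0.3, 0.2, 0.1],
    [0.4, 0.5, 0.3, 0.2],
    [0.2, 0.4, 0.5, 0.3],
    [0.1, 0.2, 0.4, 0.6],
]
U0: Vector = [1.0, 0.0, 0.0, 0.0]

M2 = maxmin_mat(M, M)      # two-step transition matrix
U2 = maxmin_vec(U0, M2)    # state designator after two transitions
print(U2)
```

Because the max-min product is associative, composing the vector with `M` twice gives the same designator as composing it once with `M2`.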
Figure 2: State Space Diagram (four states, 1 to 4)

The web server resource utilization state transition may occur due to users' implosive demands or the variability of network conditions. Based on the experts' subjective judgements, the fuzzy grades of membership of the linguistic variables Low, Moderate, High and Extremely High are defined by the fuzzy sets below:

Low (L) = 0.9/0.1 + 0.7/0.2 + 0.4/0.3 + 0.2/0.4 + 0.1/0.5
Moderate (M) = 0.3/0.2 + 0.5/0.3 + 1.0/0.4 + 0.7/0.5 + 0.2/0.6
High (H) = 0.2/0.5 + 0.7/0.6 + 1.0/0.7 + 0.5/0.8 + 0.2/0.9
Extremely High (EH) = 0.6/0.6 + 0.7/0.7 + 0.8/0.8 + 0.9/0.9 + 1.0/1.0

According to [12], the fuzzy sets for very low, very medium, very high and very extremely high are interpreted as below, with each grade of membership squared:

Very Low (VL) = (Low)^2 = 0.81/0.1 + 0.49/0.2 + 0.16/0.3 + 0.04/0.4 + 0.01/0.5
Very Medium (VM) = 0.09/0.2 + 0.25/0.3 + 1.0/0.4 + 0.49/0.5 + 0.04/0.6
Very High (VH) = 0.04/0.5 + 0.49/0.6 + 1.0/0.7 + 0.25/0.8 + 0.04/0.9
Very Extremely High (VEH) = 0.36/0.6 + 0.49/0.7 + 0.64/0.8 + 0.81/0.9 + 1.00/1.0

By using the maxmin decision principle [12], every state transition is selected for P for time T=2. The multiplication of the fuzzy state transition matrix $P^2$ is illustrated in the tree diagram in Fig. 3. The tree diagram demonstrates the state status after 2 steps given that the fuzzy process started in state 1, state 2 and state 3 respectively. Based on Fig. 3, $P^2$ is defined as follows:

\[
P^2 =
\begin{pmatrix}
H  & VH & M & L  \\
VM & EH & M & VL \\
M  & M  & M & L  \\
M  & L  & M & L
\end{pmatrix}
\]

Therefore, for time T=2, the state designator $U_X(2)$ is determined as below:

\[
U_X(2) = U_X(0) \circ P^2 = [\; VH \quad EH \quad M \quad L \;] \circ
\begin{pmatrix}
H  & VH & M & L  \\
VM & EH & M & VL \\
M  & M  & M & L  \\
M  & L  & M & L
\end{pmatrix}
\]

Based on the basic union and intersection operations, the state designator $U_X(2)$ is obtained as below:

\[
U_X(2) = [\, u_{11}, u_{12}, u_{13}, u_{14} \,] = [\; H \quad VH \quad M \quad L \;]
\]

where

\[
\begin{aligned}
u_{11} &= (VH \wedge H) \vee (EH \wedge VM) \vee (M \wedge M) \vee (L \wedge M) = H \vee VM \vee M \vee L = H \\
u_{12} &= (VH \wedge VH) \vee (EH \wedge EH) \vee (M \wedge M) \vee (L \wedge L) = VH \\
u_{13} &= (VH \wedge M) \vee (EH \wedge M) \vee (M \wedge M) \vee (L \wedge M) = M \\
u_{14} &= (VH \wedge L) \vee (EH \wedge VL) \vee (M \wedge L) \vee (L \wedge L) = L
\end{aligned}
\]
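The hedge "very" and the reading of a crisp value from a fuzzy set can be checked with a short sketch. The fuzzy sets are copied from the definitions of Low, Moderate and High; representing a set as a dictionary from element to grade, and the function names `very` and `peak`, are assumptions of this example. `peak` returns the element with the highest grade, which is one plausible reading of how a rough numerical estimate such as [0.7 0.7 0.4 0.1] is obtained from [H VH M L].

```python
from typing import Dict

FuzzySet = Dict[float, float]  # element -> grade of membership

LOW: FuzzySet = {0.1: 0.9, 0.2: 0.7, 0.3: 0.4, 0.4: 0.2, 0.5: 0.1}
MODERATE: FuzzySet = {0.2: 0.3, 0.3: 0.5, 0.4: 1.0, 0.5: 0.7, 0.6: 0.2}
HIGH: FuzzySet = {0.5: 0.2, 0.6: 0.7, 0.7: 1.0, 0.8: 0.5, 0.9: 0.2}


def very(s: FuzzySet) -> FuzzySet:
    """Concentration hedge: square every grade (rounded to hide float noise)."""
    return {x: round(g * g, 4) for x, g in s.items()}


def peak(s: FuzzySet) -> float:
    """Element of the support that carries the highest grade."""
    return max(s, key=lambda x: s[x])


very_low = very(LOW)
very_high = very(HIGH)
estimate = [peak(HIGH), peak(very_high), peak(MODERATE), peak(LOW)]
print(very_low, estimate)
```

Squaring leaves the peak element unchanged, which is why H and VH map to the same crisp value in this reading.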
For quantitative measurement of the fuzzy linguistic terms, a rough estimate of $U_X(2)$ can be obtained by selecting the optimal grades of membership, which indicate the highest possibility of the server utilization in states 1 and 2 after two transitions. The quantitative calculation is shown as follows:

U_X(2) = [ H  VH  M  L ]
       = [ 0.2/0.5 + 0.7/0.6 + 1.0/0.7 + 0.5/0.8 + 0.2/0.9,
           0.04/0.5 + 0.49/0.6 + 1.0/0.7 + 0.25/0.8 + 0.04/0.9,
           0.3/0.2 + 0.5/0.3 + 1.0/0.4 + 0.7/0.5 + 0.2/0.6,
           0.9/0.1 + 0.7/0.2 + 0.4/0.3 + 0.2/0.4 + 0.1/0.5 ]
       = [ 0.7  0.7  0.4  0.1 ]

For further analysis, if the fuzzy transition matrix M converges to a limit when T = n, where $n \to \infty$, all the rows of $\lim_{n \to \infty} [U_X(n)]$ are equivalent to the steady-state fuzzy designator vector U = [u1, u2, u3, u4]. The fuzzy steady-state designator vector U = [u1, u2, u3, u4] can be determined according to this equation:

\[
\bigvee_{i=1}^{n} ( u_i \wedge m_{ij} ) = u_j, \quad \text{where } i \text{ and } j = 1, 2, \ldots, n \text{ (states)}
\]

Consequently, by considering the four-state vector, the four components of the fuzzy steady-state row vector are shown below:

\[
\begin{aligned}
u_1 &= (u_1 \wedge m_{11}) \vee (u_2 \wedge m_{21}) \vee (u_3 \wedge m_{31}) \vee (u_4 \wedge m_{41}) \\
u_2 &= (u_1 \wedge m_{12}) \vee (u_2 \wedge m_{22}) \vee (u_3 \wedge m_{32}) \vee (u_4 \wedge m_{42}) \\
u_3 &= (u_1 \wedge m_{13}) \vee (u_2 \wedge m_{23}) \vee (u_3 \wedge m_{33}) \vee (u_4 \wedge m_{43}) \\
u_4 &= (u_1 \wedge m_{14}) \vee (u_2 \wedge m_{24}) \vee (u_3 \wedge m_{34}) \vee (u_4 \wedge m_{44})
\end{aligned}
\]

The algorithm [13] to solve the row vector is given as follows:

Step 1: Initialize the four components u1, u2, u3 and u4.
Step 2: Let $\varepsilon$ be the threshold limit and use the equation below to evaluate each component of the vector:

\[
| R_x - L_x | < \varepsilon
\]

where x is the state number, and R and L are the right-hand side and left-hand side of the four equations respectively.
Step 3: Terminate the computation if the desired $\varepsilon$ has been fulfilled. Otherwise, randomly generate the values of u1, u2, u3 and u4 and go to Step 2.
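The stopping test of Step 2 can be sketched in code. Note the substitution: instead of regenerating candidate vectors at random as in Step 3, this sketch reapplies the max-min product to the current vector until every residual |Rx - Lx| falls below the threshold, and gives up after `max_iter` rounds because a max-min sequence may also oscillate. The matrix `M`, the start vector `U0` and all function names are hypothetical and serve only the example.

```python
from typing import List, Optional

Vector = List[float]
Matrix = List[List[float]]


def compose(u: Vector, m: Matrix) -> Vector:
    """Right-hand sides R_x: max over i of min(u_i, m_ix)."""
    return [
        max(min(u[i], m[i][j]) for i in range(len(u)))
        for j in range(len(m[0]))
    ]


def residuals(u: Vector, m: Matrix) -> Vector:
    """|R_x - L_x| for every state x, with L_x = u_x."""
    return [abs(r - l) for r, l in zip(compose(u, m), u)]


def steady_state(u0: Vector, m: Matrix, eps: float = 1e-6,
                 max_iter: int = 100) -> Optional[Vector]:
    """Iterate u <- u o M until the Step 2 test passes; None if it never does."""
    u = list(u0)
    for _ in range(max_iter):
        if all(r < eps for r in residuals(u, m)):
            return u
        u = compose(u, m)  # swapped in for the random regeneration of Step 3
    return None


# Hypothetical numeric transition matrix and initial designator.
M: Matrix = [
    [0.6, 0.3, 0.2, 0.1],
    [0.4, 0.5, 0.3, 0.2],
    [0.2, 0.4, 0.5, 0.3],
    [0.1, 0.2, 0.4, 0.6],
]
U0: Vector = [1.0, 0.0, 0.0, 0.0]

steady = steady_state(U0, M)
print(steady)
```

For this example the vector stops changing after three applications of the product, and the returned vector satisfies all four steady-state equations exactly.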
V. CONCLUSION
In this paper, a fuzzy inference system is established to derive the server utilization index based on four workload parameters. A fuzzy Markov model is utilized to predict the possibility of the server's resource utilization states after some transition periods, and its steady states are derived. This fuzzy Markov model for web server workload forecasting presents an alternative that integrates human experience and judgement with the numerical data collected to forecast the incoming server workload. The involvement of the human factor is expected to increase the typicality and accuracy of estimation, since it adapts to current adjustments. This forecasting model provides a useful reference, especially for future WWW accessibility and planning.
***
REFERENCES
[1] Michele Colajanni, Philip S. Yu and
Valeria Cardellini, “ Dynamic Load
Balancing in Geographically Distributed
Heterogeneous
Web
Servers”,
Proceeding of 18th International
Conference on Distributed Computing
Systems, 1998 , Page(s): 295 –302
[2] Michele Colajanni, Philip S. Yu,
“Scheduling Algorithms for Distributed
Web Servers”, Proc. ICDCS’97,
Baltimore, MD, May 1997, pp. 169-176.
[3] Crovella, M. and Bestavros, A.,
"Explaining World Wide Web Traffic
Self-Similarity", Tech. Rep. BUCS-TR95F-015,
Boston University, CS Dept,
Boston MA 02215, 1995.
[4] W. Leland and M. Taqqu, "On the
Self-Similar Nature of Ethernet Traffic", In
Proceedings of SIGCOMM'93, 1993.
[5] Chiraphadhanakul, S., Dangprasert, P.
and Avatchanakorn, V., “Genetic
Forecasting Algorithm with Financial
Applications”, Intelligent Information
Systems, 1997. ISS'97. Proceedings,
1997, Page(s): 174 –178, 1997.
[6] L.Feng and X.X.Guang, “A Forecasting
Model of Fuzzy Self Regression”,
Fuzzy Sets and Systems, 38, 239-242,
1993.
[7] J.Nie, “A Fuzzy-Neural Approach To
Time-Series Prediction”, in Proceeding
of IEEE International Conference on
Neural Network (Piscataway, NJ, IEEE
Service Center, 1994), pp.3164-3169,
1994.
[8] L. X. Wang and J. M. Mendel,
“Generating Fuzzy Rules By Learning
From Example”, IEEE Transaction
Systems, Man and Cybernetics, 22,
1414-1427,1992.
[9] Daniel A. Menasce, “Capacity Planning
for Web Performance: Metrics, Models
& Methods”, Prentice Hall, Inc., 1998.
[10] V. Ramachandran & V.
Sankaranarayanan, “Fuzzy Concepts
Applied To Statistical Decision Making
Methods”, 15th IFIP Conference, Zurich,
1991.
[11] J.-S.R. Jang, C.-T. Sun & E. Mizutani,
“Neuro-Fuzzy and Soft Computing – A
Computational Approach to Learning
and Machine Intelligence”, Prentice
Hall, Inc., 1997.
[12] Zadeh, L. A., “Linguistic Approach and
Its Applications in Decision Analysis”,
Directions in Large-Scale Systems –
Book, Plenum Press, pp 339-357, 1975.
[13] V. Ramachandran, V. Sankaranarayanan
and S. Seahasayee, “Fuzzy Reliability
Modelling – Linguistic Approach”,
Microelectron. Reliab., Vol. 32, No.9,
pp.1311-1328, 1992.
Figure 3: Maxmin Principle Tree Diagram Transition States
(Tree diagram over T=0, T=1 and T=2: each two-step path from state i to state j is labeled with min(mij), and the entry mij(2) of P^2 is the maxmin(mij) over those paths.)