Simulation Modeling and Analysis: Input Modeling

Outline
• Introduction
• Data Collection
• Matching Distributions with Data
• Parameter Estimation
• Goodness of Fit Testing
• Input Models without Data
• Multivariate and Time Series Input Models
Introduction
• Steps in Developing an Input Data Model
– Collect data from the real system
– Identify a probability distribution to represent the data
– Select the distribution parameters
– Test goodness of fit
Data Collection
• Useful Suggestions
– Plan, practice, preobserve
– Analyze data as it is collected
– Combine homogeneous data sets
– Watch out for censoring
– Build scatter diagrams
– Check for autocorrelation
Identifying the Distribution
• Construction of Histograms
– Divide range of data into equal subintervals
– Label horizontal and vertical axes appropriately
– Determine the frequency of occurrences within each subinterval
– Plot frequencies
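The histogram steps above can be sketched in a few lines of Python. This is only a minimal illustration: the function names `histogram` and `print_histogram`, the number of subintervals, and the sample values are assumptions made for the example, not part of the original slides.

```python
def histogram(data, k):
    """Count observations in k equal-width subintervals spanning the data range."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / k
    counts = [0] * k
    for x in data:
        # The largest observation sits on the right edge; put it in the last cell.
        i = min(int((x - lo) / width), k - 1)
        counts[i] += 1
    edges = [lo + i * width for i in range(k + 1)]
    return edges, counts


def print_histogram(edges, counts):
    """Plot the frequencies as a text bar chart, one row per subinterval."""
    for i, c in enumerate(counts):
        print(f"{edges[i]:7.2f} to {edges[i + 1]:7.2f} | {'*' * c} ({c})")


# Hypothetical sample values, for illustration only.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 10]
edges, counts = histogram(sample, k=3)
print_histogram(edges, counts)
```

Changing `k` shows how the apparent shape of the data depends on the number of subintervals, which is why it is worth trying several values.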
Physical Basis of Common
Distributions
• Binomial: Number of successes in n independent trials, each with success probability p.
• Negative Binomial: Number of trials required to achieve k successes (Geometric is the special case k = 1).
• Poisson: Number of independent events occurring in a fixed amount of time or space (Time between events is Exponential).
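The link between Poisson counts and exponential times between events can be checked with a small simulation. The sketch below is an assumption-laden illustration (the function name `counts_per_interval`, the rate of 3 events per unit time, and the seed are all chosen for the example): it draws exponential interarrival times and counts how many events fall in each unit interval.

```python
import random
from statistics import mean, pvariance


def counts_per_interval(rate, n_intervals, seed=0):
    """Count events per unit interval when interarrival times are exponential(rate)."""
    rng = random.Random(seed)
    counts = [0] * n_intervals
    t = rng.expovariate(rate)          # time of the first event
    while t < n_intervals:
        counts[int(t)] += 1            # event falls in interval [int(t), int(t) + 1)
        t += rng.expovariate(rate)     # next exponential gap
    return counts


counts = counts_per_interval(rate=3.0, n_intervals=2000)
# For a Poisson count, the mean and the variance should both be close to the rate.
print("mean count:", mean(counts))
print("variance of counts:", pvariance(counts))
```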
Physical Basis of Common
Distributions - contd
• Normal: Processes which are the sum of
component processes.
• Lognormal: Processes which are the product
of component processes.
• Exponential: Times between independent
events (Number of events is Poisson).
• Gamma: Many applications. Non-negative
random variables only.
Physical Basis of Common
Distributions - contd
• Beta: Many applications. Bounded random
variables only.
• Erlang: Processes which are the sum of
several exponential component processes.
• Weibull: Time to failure.
• Uniform: Complete uncertainty.
• Triangular: When only minimum, most
likely and maximum values are known.
Quantile-Quantile Plots
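• If X is a RV with cdf F, the q-quantile of X is the value $\gamma$ such that $F(\gamma) = P(X \le \gamma) = q$
• Raw data $\{x_i\}$, $i = 1, \dots, n$
• Data rearranged by magnitude $\{y_j\}$, with $y_1 \le y_2 \le \dots \le y_n$
• Then $y_j$ is an estimate of the $(j - 1/2)/n$ quantile of X, i.e.
$y_j \approx F^{-1}\left(\frac{j - 1/2}{n}\right)$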
Quantile-Quantile Plots -contd
• If F is a member of the appropriate family, a plot of $y_j$ vs. $F^{-1}\left(\frac{j - 1/2}{n}\right)$ is approximately a straight line
• If F also has the appropriate parameter values, the line has slope 1.
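A quantile-quantile comparison can be sketched without plotting by computing the paired points and fitting a straight line through them. The example below assumes a normal candidate distribution fitted with the sample mean and standard deviation; the helper names `qq_points` and `fit_line` and the simulated data are illustrative assumptions. It relies on `statistics.NormalDist.inv_cdf` from the Python standard library (3.8 or later).

```python
import random
from statistics import NormalDist, mean, stdev


def qq_points(data, dist):
    """Pair each ordered observation y_j with the model quantile F^{-1}((j - 1/2)/n)."""
    y = sorted(data)
    n = len(y)
    q = [dist.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]
    return q, y


def fit_line(x, y):
    """Least-squares slope and intercept of y against x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Simulated data standing in for real observations (assumption for the example).
rng = random.Random(1)
data = [rng.gauss(10, 2) for _ in range(200)]
fitted = NormalDist(mean(data), stdev(data))
q, y = qq_points(data, fitted)
slope, intercept = fit_line(q, y)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A slope near 1 is consistent with both the family and the parameter values being appropriate. In practice the points would also be plotted, since curvature in the tails is easier to see than to summarize in one number.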
Parameter Estimation
• Once a distribution family has been
determined, its parameters must be
estimated.
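• Most estimators are built from the sample mean and the sample standard deviation S:
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$, $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$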
Parameter Estimation -contd
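• Suggested Estimators
– Poisson (mean $\alpha$): $\hat{\alpha} = \bar{X}$
– Exponential (rate $\lambda$): $\hat{\lambda} = 1/\bar{X}$
– Uniform (on [0, b]): $\hat{b} = \frac{n+1}{n} \max(X)$
– Normal: $\hat{\mu} = \bar{X}$; $\hat{\sigma}^2 = S^2$
where $\bar{X}$ is the sample mean and $S^2$ is the sample variance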
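The suggested estimators translate directly into code. The sketch below uses only the Python standard library; the function names are assumptions chosen for the example, and `statistics.variance` is used because it divides by n - 1, matching the sample variance.

```python
from statistics import mean, variance


def estimate_poisson_mean(xs):
    """Poisson: the mean parameter is estimated by the sample mean."""
    return mean(xs)


def estimate_exponential_rate(xs):
    """Exponential: the rate is estimated by 1 / sample mean."""
    return 1 / mean(xs)


def estimate_uniform_upper(xs):
    """Uniform on [0, b]: b is estimated by (n + 1) * max(X) / n."""
    n = len(xs)
    return (n + 1) * max(xs) / n


def estimate_normal(xs):
    """Normal: returns (sample mean, sample variance S^2)."""
    return mean(xs), variance(xs)


# Tiny hypothetical data sets, for illustration only.
print(estimate_uniform_upper([1, 2, 3, 4]))
print(estimate_exponential_rate([2, 4]))
print(estimate_normal([1, 2, 3]))
```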
Goodness of Fit Tests
• Test the hypothesis that a random sample of
size n of the random variable X follows a
specific distribution.
– Chi-Square Test (large n; continuous and
discrete distributions)
– Kolmogorov-Smirnov Test (small n;
continuous distributions only)
Chi-Square Test
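• Statistic
$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$
• Follows approximately the chi-square distribution with $k - s - 1$ degrees of freedom (s = number of parameters of the hypothesized distribution estimated from the data)
• Here $E_i = n p_i$ is the expected frequency in cell i, while $O_i$ is the observed frequency.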
Chi-Square Test -contd
• Steps
– Arrange the n observations into k cells
– Compute the statistic $\chi_0^2 = \sum_{i=1}^{k} (O_i - E_i)^2 / E_i$
– Find the critical value of $\chi^2$ (Handout)
– Reject the null hypothesis if $\chi_0^2$ exceeds the critical value; otherwise do not reject it
• Example: Stat::Fit
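The steps of the chi-square test can be sketched as follows. This is a minimal illustration, not the procedure of any particular package: the function names, the three-cell example, and the observed counts are made up for the example. The critical value 5.99 is the standard tabulated value for a 0.05 significance level and 2 degrees of freedom.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over the k cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(k, s):
    """k cells, s parameters estimated from the data."""
    return k - s - 1


def reject_null(statistic, critical_value):
    """Reject H0 when the statistic exceeds the tabulated critical value."""
    return statistic > critical_value


# Hypothetical counts: 30 observations, three equally likely cells, no estimated parameters.
observed = [8, 12, 10]
n = sum(observed)
expected = [n * p for p in (1 / 3, 1 / 3, 1 / 3)]   # E_i = n * p_i
statistic = chi_square_statistic(observed, expected)
df = degrees_of_freedom(k=3, s=0)
critical = 5.99   # tabulated chi-square value for alpha = 0.05, 2 degrees of freedom
print(statistic, df, reject_null(statistic, critical))
```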
Chi-Square Test - contd
• If the test involves a discrete distribution, each value of the RV should be its own class interval, unless adjacent intervals must be combined (for example, to avoid very small expected frequencies).
• If the test involves a continuous distribution, class intervals should be selected to be equal in probability rather than in width.
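For a continuous distribution, equal-probability class intervals follow from the inverse cdf. The sketch below works this out for an exponential distribution, where the interior edges are $-\ln(1 - i/k)/\lambda$. The function names, the choice of k = 8, and the simulated data are assumptions for the example.

```python
import math
import random
from bisect import bisect_right


def exponential_cell_edges(rate, k):
    """Edges of k cells that each have probability 1/k under exponential(rate)."""
    inner = [-math.log(1 - i / k) / rate for i in range(1, k)]
    return [0.0] + inner + [math.inf]


def cell_counts(data, edges):
    """Observed frequency O_i in each cell [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        counts[bisect_right(edges, x) - 1] += 1
    return counts


# Simulated data standing in for real observations (assumption for the example).
rng = random.Random(7)
data = [rng.expovariate(0.5) for _ in range(400)]
rate_hat = len(data) / sum(data)            # estimated rate = 1 / sample mean
k = 8
edges = exponential_cell_edges(rate_hat, k)
observed = cell_counts(data, edges)
expected = len(data) / k                    # every cell has E_i = n / k
statistic = sum((o - expected) ** 2 / expected for o in observed)
df = k - 1 - 1                              # s = 1 parameter estimated from the data
print(observed, round(statistic, 2), df)
```

With equal-probability cells every expected frequency is n/k, so k can be chosen directly to keep the expected frequencies from becoming too small.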
Chi-Square Test - contd
• Example: Exponential distribution.
• Example: Weibull distribution.
• Example: Normal distribution.
Kolmogorov-Smirnov Test
• Identify the maximum absolute difference D between the empirical cdf of the random sample and the cdf of the specified theoretical distribution.
• Compare D against its critical value (Handout).
• Reject H0 if D exceeds the critical value; otherwise do not reject it
• Example.
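The statistic D can be computed by checking the gap between the empirical cdf and the theoretical cdf just before and just after each ordered observation. The sketch below is an illustration under assumptions: the function names are invented for the example, and the five values tested against a uniform distribution on [0, 1] are illustrative, not data from the slides.

```python
def ks_statistic(data, cdf):
    """Maximum absolute difference between the empirical cdf and a theoretical cdf."""
    y = sorted(data)
    n = len(y)
    # Empirical cdf jumps from (j - 1)/n to j/n at the j-th ordered value.
    d_plus = max(j / n - cdf(v) for j, v in enumerate(y, start=1))
    d_minus = max(cdf(v) - (j - 1) / n for j, v in enumerate(y, start=1))
    return max(d_plus, d_minus)


def uniform_cdf(x):
    """cdf of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


# Illustrative values only.
values = [0.44, 0.81, 0.14, 0.05, 0.93]
d = ks_statistic(values, uniform_cdf)
print(f"D = {d:.2f}")
```

The resulting D is then compared with the tabulated critical value for the sample size and significance level in use.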
Input Models without Data
• When hard data are not available, use:
– Engineering data (specs)
– Expert opinion
– Physical and/or conventional limitations
– Information on the nature of the process
– Uniform, triangular or beta distributions
• Check sensitivity!
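When only a minimum, most likely, and maximum value are available, a triangular input model and a simple sensitivity check might look like the sketch below. The numbers (minimum 2, maximum 11, candidate most likely values 4, 5, 6) and the function names are hypothetical. Note that `random.triangular` takes its arguments in the order low, high, mode.

```python
import random
from statistics import mean


def sample_triangular(low, mode, high, n, seed=0):
    """Draw n values from a triangular distribution on [low, high] with the given mode."""
    rng = random.Random(seed)
    # Argument order of random.triangular is (low, high, mode).
    return [rng.triangular(low, high, mode) for _ in range(n)]


def mean_sensitivity(low, high, modes, n=20000):
    """Sample mean of the input for each candidate 'most likely' value."""
    return {m: mean(sample_triangular(low, m, high, n)) for m in modes}


# Hypothetical expert estimates: min 2, max 11, most likely somewhere between 4 and 6.
result = mean_sensitivity(2.0, 11.0, modes=[4.0, 5.0, 6.0])
for mode, avg in result.items():
    print(f"mode = {mode}: mean input = {avg:.3f}")
```

In a full study the same comparison would be made on the simulation outputs rather than on the input alone, to see whether the conclusions change with the assumed input model.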
Multivariate and Time-Series
Input Models
• If input variables are not independent, their relationship must be taken into consideration (multivariate input model).
• If input variables constitute a sequence (in time) of related random variables, their relationship must be taken into account (time-series input model).
Covariance and Correlation
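• Measure the linear dependence between two random variables $X_1$ (mean $\mu_1$, std dev $\sigma_1$) and $X_2$ (mean $\mu_2$, std dev $\sigma_2$)
$X_1 - \mu_1 = \beta (X_2 - \mu_2) + \varepsilon$
• Covariance:
$\mathrm{cov}(X_1, X_2) = E(X_1 X_2) - \mu_1 \mu_2$
• Correlation:
$\rho = \frac{\mathrm{cov}(X_1, X_2)}{\sigma_1 \sigma_2}$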
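The covariance and correlation formulas can be estimated from paired observations by replacing expectations with sample averages. The sketch below does this with the standard library; the function names and the small data sets are assumptions for illustration. It divides by n, mirroring the formula E(X1 X2) minus the product of the means.

```python
import math
from statistics import mean


def covariance(x1, x2):
    """Sample analogue of cov(X1, X2) = E(X1 X2) - mu1 * mu2 (divides by n)."""
    return mean([a * b for a, b in zip(x1, x2)]) - mean(x1) * mean(x2)


def correlation(x1, x2):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(x1, x2) / math.sqrt(covariance(x1, x1) * covariance(x2, x2))


# Hypothetical paired data: x2 is an exact linear function of x1.
x1 = [1, 2, 3, 4]
x2 = [2, 4, 6, 8]
print(covariance(x1, x2))
print(correlation(x1, x2))
```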
Multivariate Input Models
• If $X_1$ and $X_2$ are normally distributed and correlated, they can be modeled by a bivariate normal distribution
• Steps
– Generate $Z_1$ and $Z_2$, independent standard normal RVs
– Set $X_1 = \mu_1 + \sigma_1 Z_1$
– Set $X_2 = \mu_2 + \sigma_2 \left(\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2\right)$
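The generation steps for the bivariate normal can be sketched as below. Parameter values, the seed, and the function names are assumptions for the example; the sample correlation of the generated pairs should come out close to the chosen ρ.

```python
import math
import random
from statistics import mean, pstdev


def bivariate_normal(mu1, sigma1, mu2, sigma2, rho, rng):
    """One correlated pair (X1, X2) built from two independent standard normals."""
    z1 = rng.gauss(0, 1)
    z2 = rng.gauss(0, 1)
    x1 = mu1 + sigma1 * z1
    x2 = mu2 + sigma2 * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
    return x1, x2


def sample_correlation(xs, ys):
    """Sample correlation, using divide-by-n moments throughout."""
    cov = mean([a * b for a, b in zip(xs, ys)]) - mean(xs) * mean(ys)
    return cov / (pstdev(xs) * pstdev(ys))


rng = random.Random(42)
pairs = [bivariate_normal(5.0, 2.0, 20.0, 3.0, 0.8, rng) for _ in range(20000)]
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]
r = sample_correlation(xs, ys)
print(f"sample correlation = {r:.3f}")
```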
Time-Series Input Models
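• Let $X_1, X_2, X_3, \dots$ be a sequence of identically distributed and covariance-stationary RVs. The lag-h correlation is $\rho_h = \mathrm{corr}(X_t, X_{t+h})$; for the two models below,
$\rho_h = \rho^h$
where $\rho$ is the lag-1 correlation
• If all $X_t$ are normal: AR(1) model.
• If all $X_t$ are exponential: EAR(1) model.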
AR(1) model
• For a time series model
$X_t = \mu + \phi (X_{t-1} - \mu) + \varepsilon_t$
where
$\varepsilon_t$ are independent normal with mean 0 and variance $\sigma_\varepsilon^2$, and $-1 < \phi < 1$
AR(1) model -contd
1.- Generate $X_1$ from a normal with mean $\mu$ and variance $\sigma_\varepsilon^2 / (1 - \phi^2)$. Set t = 2.
2.- Generate $\varepsilon_t$ from a normal with mean 0 and variance $\sigma_\varepsilon^2$.
3.- Set $X_t = \mu + \phi (X_{t-1} - \mu) + \varepsilon_t$
4.- Set t = t+1 and go to 2.
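The AR(1) generation steps can be sketched as a short loop. The function names, parameter values, and seed below are assumptions for the example. The lag-1 autocorrelation of the generated series is estimated at the end as a rough check that it is close to φ.

```python
import math
import random
from statistics import mean


def generate_ar1(mu, phi, sigma_eps, n, seed=0):
    """Generate n values of X_t = mu + phi * (X_{t-1} - mu) + eps_t."""
    if not -1 < phi < 1:
        raise ValueError("phi must lie strictly between -1 and 1")
    rng = random.Random(seed)
    # Step 1: start from the stationary distribution, variance sigma_eps^2 / (1 - phi^2).
    x = rng.gauss(mu, sigma_eps / math.sqrt(1 - phi ** 2))
    series = [x]
    for _ in range(n - 1):
        eps = rng.gauss(0, sigma_eps)        # Step 2: normal noise term
        x = mu + phi * (x - mu) + eps        # Step 3: AR(1) recursion
        series.append(x)                     # Step 4: advance t and repeat
    return series


def lag1_autocorrelation(xs):
    """Estimate corr(X_t, X_{t+1}) from one realization."""
    m = mean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den


series = generate_ar1(mu=10.0, phi=0.7, sigma_eps=1.0, n=20000)
print(f"mean = {mean(series):.3f}, lag-1 autocorrelation = {lag1_autocorrelation(series):.3f}")
```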
EAR(1) model
• For a time series model
$X_t = \phi X_{t-1}$ with probability $\phi$
$X_t = \phi X_{t-1} + \varepsilon_t$ with probability $1 - \phi$
where
$\varepsilon_t$ are independent exponential with mean $1/\lambda$ and $0 \le \phi < 1$
EAR(1) model - contd
1.- Generate $X_1$ from an exponential with mean $1/\lambda$. Set t = 2.
2.- Generate U from a uniform on [0,1]. If $U < \phi$ set $X_t = \phi X_{t-1}$. Otherwise generate $\varepsilon_t$ from an exponential with mean $1/\lambda$ and set $X_t = \phi X_{t-1} + \varepsilon_t$
3.- Set t = t+1 and go to 2.
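The EAR(1) generation steps can be sketched in the same way. The function names, the values λ = 2 and φ = 0.5, and the seed are assumptions for the example. The checks at the end look at whether the sample mean is near 1/λ and the lag-1 autocorrelation is near φ.

```python
import random
from statistics import mean


def generate_ear1(lam, phi, n, seed=0):
    """Generate n values of an EAR(1) series with exponential(mean 1/lam) marginals."""
    if lam <= 0 or not 0 <= phi < 1:
        raise ValueError("need lam > 0 and 0 <= phi < 1")
    rng = random.Random(seed)
    x = rng.expovariate(lam)                     # Step 1: X_1 ~ exponential, mean 1/lam
    series = [x]
    for _ in range(n - 1):
        if rng.random() < phi:                   # Step 2: with probability phi
            x = phi * x
        else:                                    # otherwise add an exponential term
            x = phi * x + rng.expovariate(lam)
        series.append(x)                         # Step 3: advance t and repeat
    return series


def lag1_autocorrelation(xs):
    """Estimate corr(X_t, X_{t+1}) from one realization."""
    m = mean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den


series = generate_ear1(lam=2.0, phi=0.5, n=20000)
print(f"mean = {mean(series):.3f}, lag-1 autocorrelation = {lag1_autocorrelation(series):.3f}")
```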