Selecting Input Probability Distributions

Techniques for Assessing Sample Independence - 1
• Most methods to specify input distributions assume the observed data X1, X2, ..., Xn are an independent (random) sample from some underlying distribution
  – If not, most methods are invalid
  – Need a way to check the data empirically for independence
  – Heuristic plots vs. formal statistical tests for independence

Techniques for Assessing Sample Independence - 2
• Correlation plot: If the data are observed in a time sequence, compute the sample correlation at lag j and plot it against j:
  $$\hat{\rho}_j = \frac{\sum_{i=1}^{n-j}\bigl(X_i - \bar{X}(n)\bigr)\bigl(X_{i+j} - \bar{X}(n)\bigr)}{(n-j)\,S^2(n)}$$
  – If the data are independent, the correlations should be near zero for all lags
  – Keep in mind that these are just estimates

Techniques for Assessing Sample Independence - 3
• Scatter diagram: Plot the pairs (Xi, Xi+1)
  – If the data are independent, the pairs should be scattered randomly
  – If the data are positively (negatively) correlated, the pairs will lie along a positively (negatively) sloping line

Techniques for Assessing Sample Independence
• Independent draws from an expo(1) distribution (independent by construction)

Techniques for Assessing Sample Independence
• Delays in queue from an M/M/1 queue with utilization factor ρ = 0.8 (positively correlated)
• There are also formal statistical testing procedures, such as the "runs" test
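As a minimal sketch of the correlation-plot heuristic above (function and variable names are my own, not from the slides), the following Python snippet computes $\hat{\rho}_j$ for lags 1 through a chosen maximum; for truly independent data the printed values should all be near zero.

```python
import numpy as np

def lag_correlations(x, max_lag):
    """Sample correlation rho_hat_j at lags j = 1..max_lag, per the
    slide formula: lag-j cross products divided by (n - j) * S^2(n)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()
    s2 = x.var(ddof=1)  # S^2(n), the usual sample variance
    rhos = []
    for j in range(1, max_lag + 1):
        num = np.sum((x[:n - j] - xbar) * (x[j:] - xbar))
        rhos.append(num / ((n - j) * s2))
    return np.array(rhos)

# Example: independent expo(1) draws should give correlations near 0.
rng = np.random.default_rng(42)
print(lag_correlations(rng.exponential(1.0, size=500), max_lag=5))
```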
Activity I: Hypothesizing Families of Distributions - 1
• First, need to decide what form or family to use: exponential, gamma, or what?
• Later, need to estimate parameters and assess goodness of fit
• Sometimes prior knowledge of a random variable's role in the simulation may be useful:
  – Requires no data
  – Use theoretical knowledge of the random variable's role in the simulation

Activity I: Hypothesizing Families of Distributions - 2
• Sometimes prior knowledge of a random variable's role in the simulation may be useful:
  – We seldom have enough prior knowledge to specify a distribution completely. Exceptions are:
    • Arrivals one-at-a-time, at a constant mean rate, independently: exponential interarrival times, i.e., a Poisson arrival process
    • Sum of many independent pieces: normal
    • Product of many independent pieces: lognormal
    • Percentage/probability: beta (range is [0, 1])
  – Often use prior knowledge to rule out distributions on the basis of range
    • Service times: not normal (the normal range always extends to negative values)
  – Still should be supported by data (e.g., for parameter-value estimation)

Summary Statistics
• Compare simple sample statistics with their theoretical population versions for some distributions to get a hint
  – Bear in mind that we get only estimates, which are subject to uncertainty
• If the sample mean $\bar{X}(n)$ and sample median $\hat{x}_{0.5}(n)$ are close, a symmetric distribution is suggested
• Coefficient of variation of a distribution: cv = σ/μ; estimate it via $\widehat{cv} = S(n)/\bar{X}(n)$; sometimes useful for discriminating among continuous distributions
  – cv < 1 suggests gamma or Weibull with shape parameter α > 1
  – cv = 1 suggests exponential
  – cv > 1 suggests gamma or Weibull with shape parameter α < 1

Summary Statistics
• Lexis ratio of a distribution: τ = σ²/μ; estimate it via $\hat{\tau} = S^2(n)/\bar{X}(n)$; sometimes useful for discriminating among discrete distributions
  – τ < 1 suggests binomial
  – τ = 1 suggests Poisson
  – τ > 1 suggests negative binomial or geometric
• Other summary statistics: range, skewness, kurtosis

Histograms: Continuous Data Set
• hj: basically an unbiased estimate of Δb f(x), where f(x) is the true (unknown) underlying density of the observed data and Δb is a constant (the interval width)
• Break the range of the data into k intervals of width Δb each
  – k and Δb are determined basically by trial and error
  – One rule of thumb is Sturges's rule: $k = \lfloor 1 + \log_2 n \rfloor = \lfloor 1 + 3.322 \log_{10} n \rfloor$
  – Compute the proportion hj of the data falling in the jth interval; plot a bar of height hj above the jth interval
  – The shape of the plot should resemble the density of the underlying distribution; compare the histogram's shape to candidate density shapes

Histograms: Discrete Data Set
• Basically an unbiased estimate of the (unknown) underlying probability mass function of the data
• For each possible value xj that can be assumed by the data, let hj be the proportion of the data equal to xj; plot a bar of height hj above xj
• The shape of the plot should resemble the mass function of the underlying distribution; compare the histogram's shape to the mass-function shapes in Sec. 6.2.3
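The summary-statistic hints and Sturges's rule above are easy to compute. Here is a minimal Python sketch (the function name and dictionary keys are mine, purely illustrative):

```python
import numpy as np

def summary_hints(x):
    """Sample statistics used to hint at a distribution family."""
    x = np.asarray(x, dtype=float)
    xbar = x.mean()
    return {
        "mean": xbar,
        "median": np.median(x),             # close to mean -> symmetric?
        "cv_hat": x.std(ddof=1) / xbar,     # S(n)/Xbar(n), continuous hint
        "lexis_hat": x.var(ddof=1) / xbar,  # S^2(n)/Xbar(n), discrete hint
        # Sturges's rule for the number of histogram intervals
        "sturges_k": int(np.floor(1 + 3.322 * np.log10(len(x)))),
    }

# Example: exponential data should give cv_hat near 1.
rng = np.random.default_rng(0)
print(summary_hints(rng.exponential(2.0, size=200)))
```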
Multimodal Data
• A histogram might have multiple local modes rather than just one; no single "standard" distribution adequately represents it
• Reason: the data can often be separated on some context-dependent basis (e.g., observed machine downtimes classified as minor vs. major)
• Separate the data on this basis, fit each part separately, and recombine the fits as a mixture

Quantile Summaries
• A numerical synopsis of sample quantiles, useful for detecting whether the underlying density or mass function is symmetric or skewed one way or the other
• Definition of quantiles: Suppose the CDF F(x) is continuous and strictly increasing whenever 0 < F(x) < 1, and let q be strictly between 0 and 1. Then the q-quantile of F(x) is the number x_q such that F(x_q) = q. If F⁻¹ is the inverse of F, then x_q = F⁻¹(q)
  – q = 0.5: median
  – q = 0.25 or 0.75: quartiles
  – q = 0.125 or 0.875: octiles
  – q = 0 or 1: extremes

Quantile Summaries
• Quantile summary: list the median, the average of the quartiles, the average of the octiles, and the average of the extremes
• If the distribution is symmetric, then median ≈ average of quartiles ≈ average of octiles ≈ average of extremes
• If the distribution is skewed right, then median < average of quartiles < average of octiles < average of extremes
• If the distribution is skewed left, then median > average of quartiles > average of octiles > average of extremes

Box Plots
• Graphical display of the quantile summary
• On the horizontal axis, plot the median, the extremes, the octiles, and a box ending at the quartiles
• Symmetry or asymmetry of the plot indicates symmetry or skewness of the distribution
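A minimal sketch of the quantile summary described above (names are mine; np.quantile's default interpolation is one of several reasonable conventions for sample quantiles):

```python
import numpy as np

def quantile_summary(x):
    """Median plus averages of quartiles, octiles, and extremes.
    Roughly equal values suggest symmetry; an increasing sequence
    suggests right skewness, a decreasing one left skewness."""
    q = lambda p: np.quantile(x, p)
    return {
        "median": q(0.5),
        "avg_quartiles": (q(0.25) + q(0.75)) / 2,
        "avg_octiles": (q(0.125) + q(0.875)) / 2,
        "avg_extremes": (np.min(x) + np.max(x)) / 2,
    }

# Example: gamma data are right-skewed, so expect increasing values.
data = np.random.default_rng(5).gamma(2.0, size=300)
print(quantile_summary(data))
```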
Hypothesizing a Family of Distributions: Example with Continuous Data
• A sample of n = 219 interarrival times of cars at a drive-up bank over a 90-minute peak-load period is obtained
  – The number of cars arriving in each of the six 15-minute periods was approximately equal (suggesting a stationary arrival rate)
  – Sample mean = 0.399 (all times in minutes) > median = 0.270; skewness = +1.458 (suggesting right skewness)
  – cv = 0.953, close to 1 (suggesting an exponential distribution)
• Histograms (for different choices of interval width Δb) suggest the exponential distribution

Hypothesizing a Family of Distributions: Example with Discrete Data
• A sample of n = 156 observations on the number of items demanded per week from an inventory over a three-year period is obtained
  – Range: 0 through 11
  – Sample mean = 1.891 > median = 1.00; skewness = +1.655 (suggesting right skewness)
  – Lexis ratio = 5.285/1.891 = 2.795 > 1 (suggesting a negative binomial or geometric (a special case of negative binomial) distribution)
• The histogram suggests the geometric distribution

Activity II: Estimation of Parameters
• Have: a hypothesized parametric distribution
• Need: numerical estimates of its parameter(s); this constitutes the "fit"
• Many methods exist to estimate distribution parameters:
  – Method of moments
  – Unbiased estimation
  – Least-squares estimation
  – Maximum likelihood estimation (MLE)
• In some sense, MLE is the preferred method for our purposes:
  – Good statistical properties
  – Somewhat justifies the chi-square goodness-of-fit test
  – Intuitive
  – Allows estimates of the error in the parameters (sensitivity analysis)

Activity II: Estimation of Parameters
• Idea behind MLEs:
  – Have an observed sample X1, X2, ..., Xn
  – It came from some true (unknown) parameter(s) of the distribution form
  – Pick the parameter(s) that make it most likely that you would get what you did get (or close to what you got, in the continuous case)
  – An optimization (mathematical-programming) problem, often messy

Desirable Properties of MLEs
• Invariance principle: Let $\hat{\theta}_1, \ldots, \hat{\theta}_m$ be the MLEs of the parameters θ1, ..., θm. Then the MLE of any function h(θ1, ..., θm) of these parameters is the function $h(\hat{\theta}_1, \ldots, \hat{\theta}_m)$ of the MLEs
• Asymptotic behavior of MLEs: when the sample size is large,
  – the MLE is approximately the minimum-variance unbiased estimator (MVUE) of θ, and
  – the MLE is approximately normally distributed

MLEs for Discrete Distributions
• Have a hypothesized family with PMF pθ(xj) = Pθ(X = xj)
• Single (for now) unknown parameter θ to be estimated
• For any trial value of θ, the probability of getting the already-observed sample is
  P{getting X1, X2, ..., Xn} = P(X = X1) P(X = X2) ... P(X = Xn) = pθ(X1) pθ(X2) ... pθ(Xn) = likelihood function L(θ)
• Task: find the value of θ that makes L(θ) as big as it can be
• How? Differential calculus, taking logarithms (turns products into sums), nonlinear-programming methods, tabled values in the book
• MLEs for continuous distributions are obtained by replacing the PMF with the PDF
• MLEs for multiparameter distributions are obtained similarly, where the likelihood function is a function of all of the parameters

Example of Continuous MLE
• {Xi | i = 1, ..., 219}: a random sample with $\bar{X}(219) = 0.399$
• Hypothesized distribution: exponential, with density
  $$f(x) = \begin{cases} \frac{1}{\beta} e^{-x/\beta} & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
• The likelihood function is
  $$L(\beta) = \Bigl(\tfrac{1}{\beta} e^{-X_1/\beta}\Bigr)\Bigl(\tfrac{1}{\beta} e^{-X_2/\beta}\Bigr)\cdots\Bigl(\tfrac{1}{\beta} e^{-X_n/\beta}\Bigr) = \beta^{-n} \exp\Bigl(-\tfrac{1}{\beta}\textstyle\sum_{i=1}^n X_i\Bigr)$$
• Want the value of β that maximizes L(β) over all β > 0
• Equivalent (and easier): maximize the log-likelihood function l(β) = ln L(β), since ln is monotonically increasing

Example of Continuous MLE
$$l(\beta) = -n \ln \beta - \frac{1}{\beta} \sum_{i=1}^n X_i$$
$$\frac{dl(\beta)}{d\beta} = -\frac{n}{\beta} + \frac{1}{\beta^2} \sum_{i=1}^n X_i = 0 \quad\Rightarrow\quad \hat{\beta} = \frac{\sum_{i=1}^n X_i}{n} = \bar{X}(n)$$
$$\frac{d^2 l(\beta)}{d\beta^2} = \frac{n}{\beta^2} - \frac{2}{\beta^3} \sum_{i=1}^n X_i, \qquad \left.\frac{d^2 l(\beta)}{d\beta^2}\right|_{\beta = \hat{\beta}} = \frac{n}{\bar{X}(n)^2} - \frac{2n\bar{X}(n)}{\bar{X}(n)^3} = -\frac{n}{\bar{X}(n)^2} < 0$$
so $\hat{\beta} = \bar{X}(n)$ is indeed a maximizer; $\hat{\beta} = \bar{X}(n) = 0.399$ from the observed sample of n = 219 points

Example of Discrete MLE
• Hypothesized distribution: geometric, with PMF
  $$p_p(x) = p(1-p)^x, \qquad x = 0, 1, 2, \ldots$$
• The likelihood function is
  $$L(p) = p(1-p)^{X_1}\, p(1-p)^{X_2} \cdots p(1-p)^{X_n} = p^n (1-p)^{\sum_{i=1}^n X_i}$$
• Want the value of p that maximizes L(p) over all 0 < p < 1
• Equivalent (and easier): maximize the log-likelihood function l(p) = ln L(p), since ln is monotonically increasing

Example of Discrete MLE
$$\frac{dl(p)}{dp} = \frac{n}{p} - \frac{\sum_{i=1}^n X_i}{1-p} = 0 \quad\Rightarrow\quad \hat{p} = \frac{1}{\bar{X}(n) + 1}$$
$$\frac{d^2 l(p)}{dp^2} = -\frac{n}{p^2} - \frac{\sum_{i=1}^n X_i}{(1-p)^2} < 0$$
• The MLE is $\hat{p} = \dfrac{1}{1.891 + 1} = 0.346$ from the observed sample of n = 156 points
• Since $E[X_i] = (1-p)/p$,
  $$E\left[\frac{d^2 l(p)}{dp^2}\right] = -\frac{n}{p^2} - \frac{n(1-p)/p}{(1-p)^2} = -\frac{n}{p^2(1-p)}$$
  so the asymptotic variance of $\hat{p}$ is
  $$\delta^2(p) = -1 \Big/ E\left[\frac{d^2 l(p)}{dp^2}\right] = \frac{p^2(1-p)}{n}$$
• An approximate confidence interval for p is $\hat{p} \pm z_{1-\alpha/2}\,\sqrt{\hat{p}^2(1-\hat{p})/n}$; here the 90% CI is
  $$0.346 \pm 1.645 \sqrt{\frac{0.346^2\,(1 - 0.346)}{156}} = 0.346 \pm 0.037 = [0.309,\, 0.383]$$
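The two closed-form MLEs above are easy to verify numerically. Below is a minimal Python sketch (names are mine) computing $\hat{\beta}$ for exponential data and $\hat{p}$, with its approximate 90% confidence interval, for geometric data:

```python
import numpy as np

def exp_mle(x):
    """MLE of the exponential mean: beta_hat = sample mean."""
    return np.mean(x)

def geom_mle_with_ci(x, z=1.645):
    """MLE of the geometric parameter p (PMF p(1-p)^x, x = 0,1,2,...)
    plus an approximate 90% CI from asymptotic normality of the MLE."""
    n = len(x)
    p_hat = 1.0 / (np.mean(x) + 1.0)
    half_width = z * np.sqrt(p_hat**2 * (1.0 - p_hat) / n)
    return p_hat, (p_hat - half_width, p_hat + half_width)

# With the slides' sample statistics (n = 156, mean = 1.891) this gives
# p_hat ~ 0.346 and a 90% CI of roughly [0.309, 0.383].
```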
Activity III: Determining How Representative the Fitted Distributions Are
• Have: a hypothesized family and estimated parameters
• Question: Does the fitted distribution agree with the observed data?
• Approaches:
  – Heuristic
    • Density/histogram overplots and frequency comparisons
    • Distribution-function-differences plots
    • Probability and quantile plots (P-P plots and Q-Q plots)
  – Statistical tests
    • Chi-square goodness-of-fit tests
    • Kolmogorov-Smirnov tests
    • Anderson-Darling tests
    • Poisson-process tests

Heuristic Procedures: Continuous Data
• Density/histogram overplots and frequency comparisons
  – Density/histogram overplot: plot $\Delta b\,\hat{f}(x)$ over the histogram h(x) and look for similarities (recall that the area under h(x) is Δb, and $\hat{f}$ is the density of the fitted distribution)
  – Interarrival-time data for the drive-up bank and the fitted exponential distribution

Heuristic Procedures: Continuous Data
• Data Set 3 against a Weibull distribution
  [Figure: density-histogram plot of Data Set 3 vs. fitted Weibull; 13 intervals of width 1.2; interval midpoint vs. density/proportion]

Heuristic Procedures
• Frequency comparison
  – Take histogram intervals [b_{j-1}, b_j) for j = 1, 2, ..., k, each of width Δb
  – Let hj be the observed proportion of the data in the jth interval
  – Let
    $$r_j = \int_{b_{j-1}}^{b_j} \hat{f}(x)\,dx$$
    be the expected proportion of the data in the jth interval if the fitted distribution is correct
  – Plot hj and rj together and look for similarities

Heuristic Procedures: Discrete Data
• Frequency comparison
  – Let hj be the observed proportion of the data equal to the jth possible value xj
  – Let $r_j = \hat{p}(x_j)$ be the expected proportion of the data equal to xj if the fitted probability mass function $\hat{p}$ is correct
  – Plot hj and rj together and look for similarities
  – Demand-size data for the inventory and the fitted geometric distribution

Distribution-Function-Differences Plots
• Density/histogram overplots compare individual probabilities of the fitted distribution with the observed individual probabilities
• Instead of individual probabilities, one can compare cumulative probabilities via the fitted CDF $\hat{F}(x)$ against a (new) empirical CDF
  $$F_n(x) = \frac{\text{number of } X_i\text{'s} \le x}{n} = \text{proportion of the data that are} \le x$$
• One could plot $\hat{F}(x)$ together with $F_n(x)$ and look for similarities, but it is harder to see such similarities for cumulative than for individual probabilities
• Alternatively, plot the difference $\hat{F}(x) - F_n(x)$ against the range of x values and look for closeness to a flat horizontal line at 0 (a computational sketch follows below)

Distribution-Function-Differences Plots
• Interarrival-time data for the drive-up bank and the fitted exponential distribution
  [Figure: distribution-function-differences plot]

Distribution-Function-Differences Plots
• Data Set 3 against the fitted Weibull distribution
  [Figure: distribution-function-differences plot, Data Set 3 vs. Weibull; mean difference = 0.01706. Use caution if the plot crosses the zero line.]
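Here is a minimal Python sketch of the differences computation (names and the simulated data are mine); it evaluates $\hat{F}(x) - F_n(x)$ at the sorted data points, which is where a plot would be drawn:

```python
import numpy as np
from scipy import stats

def cdf_differences(x, fitted_cdf):
    """Evaluate Fhat(x) - Fn(x) at the sorted data, where Fn is the
    empirical CDF (proportion of data <= x). Values near 0 everywhere
    suggest a good fit."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = len(xs)
    Fn = np.arange(1, n + 1) / n  # empirical CDF at the order statistics
    return xs, fitted_cdf(xs) - Fn

# Example: exponential fitted by MLE (beta_hat = sample mean).
rng = np.random.default_rng(1)
data = rng.exponential(0.4, size=219)
beta_hat = data.mean()
xs, diffs = cdf_differences(data, lambda t: stats.expon.cdf(t, scale=beta_hat))
print(np.abs(diffs).max())  # worst vertical discrepancy
```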
Distribution-Function-Differences Plots
• Demand-size data for the inventory and the fitted geometric distribution
  [Figure: distribution-function-differences plot]

Probability Plots
• Another way to compare the CDF of the fitted distribution with an empirical distribution directly from the data
• Sort the data into increasing order: X(1), X(2), ..., X(n) (the order statistics of the data)
• Another empirical-CDF definition, defined only at the order statistics, is
  $$F_n(X_{(i)}) = \text{observed proportion of the data} \le X_{(i)} = i/n \quad (\text{adjusted to } (i - 0.5)/n)$$
• If F(x) is the true (unknown) CDF of the data, then F(x) = P(X ≤ x) for any x
• So taking x = X(i), F(X(i)) = P(X ≤ X(i)), which is estimated by (i − 0.5)/n
• Thus we should have F(X(i)) ≈ (i − 0.5)/n for all i = 1, 2, ..., n

P-P Plots
• A P-P plot charts the fitted model CDF $\hat{F}$ against the empirical distribution $\tilde{F}_n$: for each x, it compares the two cumulative probabilities at that x
• If the fitted distribution (with CDF $\hat{F}$) is correct (i.e., close to the true unknown F), we should have
  $$\hat{F}(X_{(i)}) \approx \frac{i - 0.5}{n}$$
• So plotting the pairs
  $$\left(\frac{i - 0.5}{n},\; \hat{F}(X_{(i)})\right)$$
  for all i = 1, 2, ..., n should result in an approximately straight line from (0, 0) to (1, 1) if $\hat{F}$ is correct
• This plot is valid for both continuous and discrete data
• It is sensitive to misfits in the center of the range of the distribution

P-P Plot
• P-P plot of the interarrival-time data for the fitted exponential distribution
  [Figure]
• P-P plot of the demand-size data for the fitted geometric distribution
  [Figure]
• P-P plot of Data Set 3 against a Weibull distribution
  [Figure: P-P plot, sample value vs. model value; discrepancy = 0.03753]

Q-Q Plot
• A Q-Q plot compares quantiles instead of probabilities: for each q, it plots the sample q-quantile against the model q-quantile
• Taking the inverse $\hat{F}^{-1}$ of the P-P plot, we get the new plot of the pairs
  $$\left(\hat{F}^{-1}\!\left(\frac{i - 0.5}{n}\right),\; X_{(i)}\right)$$
• Plotting these pairs for all i = 1, 2, ..., n should result in an approximately straight line from (X(1), X(1)) to (X(n), X(n)) if $\hat{F}$ is correct
• This plot is valid only for continuous data
• Depending on the form of the fitted distribution, there may not be a closed-form formula for $\hat{F}^{-1}$
• It is sensitive to misfits in the tails of the distributions

Q-Q Plot
• Q-Q plot of the interarrival-time data for the fitted exponential distribution
  [Figure]
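A minimal sketch of the coordinates behind both plots (names are mine; `dist` is any scipy.stats frozen distribution, so the fitted parameters are baked in):

```python
import numpy as np
from scipy import stats

def pp_qq_points(x, dist):
    """Coordinates for P-P and Q-Q plots against a fitted frozen
    distribution. Both should hug a straight line if the fit is good."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = len(xs)
    probs = (np.arange(1, n + 1) - 0.5) / n   # (i - 0.5)/n
    pp = (probs, dist.cdf(xs))                # P-P: should run (0,0) -> (1,1)
    qq = (dist.ppf(probs), xs)                # Q-Q: should lie on y = x
    return pp, qq

# Example with an exponential fit (beta_hat = sample mean):
rng = np.random.default_rng(7)
data = rng.exponential(0.4, size=219)
pp, qq = pp_qq_points(data, stats.expon(scale=data.mean()))
```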
Goodness-of-Fit Tests
• Formal statistical hypothesis tests of
  – H0: the observed data X1, X2, ..., Xn are IID random variables with distribution function $\hat{F}$
• Caution: failure to reject H0 does not constitute "proof" that the fit is good
  – The power of some goodness-of-fit tests is low, particularly for small sample size n
  – Conversely, large n yields high power, so the tests will nearly always reject H0
• Keep in mind that null hypotheses are seldom literally true; we are looking for an "adequate" fit of the distribution

Chi-Square Test
• Very old (Karl Pearson, 1900) and general (continuous or discrete data)
• A statistical formalization of frequency comparisons
• Divide the range of the data into k intervals, not necessarily of equal width:
  – [a0, a1), [a1, a2), ..., [a_{k-1}, a_k]
  – a0 could be −∞, or a_k could be +∞
• Compare the actual amount of observed data in each interval with what the fitted distribution would predict
  – Let Nj = the number of observed data points in the jth interval
  – Let pj = the expected proportion of the data in the jth interval if the fitted distribution were literally true:
    $$p_j = \int_{a_{j-1}}^{a_j} \hat{f}(x)\,dx \ \text{ for continuous data}, \qquad p_j = \sum_{a_{j-1} \le x \le a_j} \hat{p}(x) \ \text{ for discrete data}$$
• Thus n pj = the expected (under the fitted distribution) number of points in the jth interval
• If the fitted distribution is correct, we would expect Nj ≈ n pj
• The test statistic is
  $$\chi^2 = \sum_{j=1}^{k} \frac{(N_j - n p_j)^2}{n p_j}$$

Chi-Square Test
• Under H0 (the fitted distribution is correct), χ² has (approximately; see the book for details) a chi-square distribution with k − 1 degrees of freedom
• Reject H0 at level α if χ² exceeds the upper 1 − α critical value
• Advantages:
  – Completely general
  – Asymptotically valid (as n → ∞) if MLEs were used
• Drawbacks:
  – Arbitrary choice of intervals (which can affect the test's conclusion)
  – Conventional advice:
    • Want n pj ≥ 5 or so for all but a couple of the j's
    • Pick intervals so that the pj's are close to each other (a sketch of this follows below)

Chi-Square Test
Exponential distribution fitted to the interarrival-time data:
• Choose k = 20 intervals so that pj = 1/20 = 0.05 for each interval (see the book for details on how the endpoints were chosen, which involved inverting the exponential CDF and taking a20 = +∞)
• Thus n pj = (219)(0.05) = 10.95 for each interval
• Counted the observed frequencies Nj and computed the test statistic χ² = 22.188
• Degrees of freedom = k − 1 = 19; the upper 0.10 critical value is χ²_{19,0.90} = 27.204
• Since the test statistic does not exceed the critical value, do not reject H0

Chi-Square Test
Geometric distribution fitted to the demand-size data:
• Since the data are discrete, we cannot choose intervals so that the pj's are exactly equal to each other
• Choose k = 3 intervals (classes): {0}, {1, 2}, and {3, 4, ...}
• Obtained np1 = 53.960, np2 = 58.382, and np3 = 43.658
• Counted the observed frequencies Nj and computed the test statistic χ² = 1.930
• Degrees of freedom = k − 1 = 2; the upper 0.10 critical value is χ²_{2,0.90} = 4.605
• Since the test statistic does not exceed the critical value, do not reject H0

Chi-Square Test
• Data Set 3 against the Weibull distribution
  [Figure]

Chi-Square Test
• Data Set 3 against the gamma distribution
  [Figure]
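A minimal Python sketch of the equiprobable-interval version of the test used in the exponential example (names are mine; it uses the slides' k − 1 degrees of freedom, ignoring the reduction some authors apply for estimated parameters):

```python
import numpy as np
from scipy import stats

def chi_square_equiprob(x, dist, k=20):
    """Chi-square GOF statistic with k equiprobable intervals whose
    endpoints come from inverting the fitted CDF, so each p_j = 1/k
    and n*p_j is the same for every interval."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    edges = dist.ppf(np.linspace(0.0, 1.0, k + 1))  # last edge may be +inf
    Nj, _ = np.histogram(x, bins=edges)
    expected = n / k
    chi2 = np.sum((Nj - expected) ** 2 / expected)
    crit = stats.chi2.ppf(0.90, df=k - 1)           # upper 0.10 critical value
    return chi2, crit

# Exponential fit to simulated interarrival-like data:
rng = np.random.default_rng(3)
data = rng.exponential(0.4, size=219)
chi2, crit = chi_square_equiprob(data, stats.expon(scale=data.mean()))
print(chi2, crit)  # do not reject H0 at level 0.10 if chi2 <= crit
```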
Kolmogorov-Smirnov Test
• Advantages with respect to chi-square tests:
  – No arbitrary choices, such as intervals
  – Exactly valid for any (finite) n
• Disadvantage with respect to chi-square tests:
  – Not as general
• A kind of statistical formalization of probability plots
  – Compare the empirical CDF from the data against the fitted CDF
• Yet another version of the empirical distribution function is used:
  – Fn(x) = proportion of the Xi data that are ≤ x (a piecewise-constant step function)
  – On the other hand, we have the fitted CDF $\hat{F}(x)$
  – In a perfect world, Fn(x) = $\hat{F}(x)$ for all x

Kolmogorov-Smirnov Test
• The worst (vertical) discrepancy is
  $$D_n = \sup_x \left| F_n(x) - \hat{F}(x) \right|$$
• Computing Dn (must be done carefully; it is sometimes stated incorrectly):
  $$D_n^+ = \max_{i=1,\ldots,n} \left\{ \frac{i}{n} - \hat{F}(X_{(i)}) \right\}, \qquad D_n^- = \max_{i=1,\ldots,n} \left\{ \hat{F}(X_{(i)}) - \frac{i-1}{n} \right\}, \qquad D_n = \max\{D_n^+,\, D_n^-\}$$
• Reject H0 (the fitted distribution is correct) if Dn is too big
• If all parameters are known (none estimated from the data), reject H0 at level α if
  $$\left( \sqrt{n} + 0.12 + \frac{0.11}{\sqrt{n}} \right) D_n > c_{1-\alpha}$$
• There are several different kinds of tables, depending on the form and specification of the hypothesized distribution (see the book for details and an example)

Anderson-Darling Test
• As in the K-S test, look at the vertical discrepancies between $\hat{F}(x)$ and Fn(x)
• Difference:
  – K-S weights the differences the same for every x
  – Sometimes we are more interested in accuracy in the (right) tail
    • Queueing applications
    • The P-K (Pollaczek-Khinchine) formula depends on the variance of the service times
  – A-D applies increasing weight to differences toward the tails
  – A-D is more sensitive (powerful) than K-S to tail discrepancies
• Define the weight function
  $$\psi(x) = \frac{1}{\hat{F}(x)\,\bigl[1 - \hat{F}(x)\bigr]}$$
• Note that ψ(x) is smallest (= 4) in the middle (at the median, where $\hat{F}(x) = 1/2$) and largest (→ ∞) in either tail

Anderson-Darling Test
• The test statistic is
  $$A_n^2 = n \int_{-\infty}^{\infty} \bigl[ F_n(x) - \hat{F}(x) \bigr]^2 \psi(x)\,\hat{f}(x)\,dx = -\frac{\sum_{i=1}^{n} (2i - 1)\Bigl\{ \ln \hat{F}(X_{(i)}) + \ln\bigl[1 - \hat{F}(X_{(n+1-i)})\bigr] \Bigr\}}{n} - n$$
• Reject H0 (the fitted distribution is correct) if $A_n^2$ is too big
• There are several different kinds of tables, depending on the form and specification of the hypothesized distribution (see the book for details and an example)
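A minimal sketch of both sorted-data computations above (names are mine; it assumes a continuous fit so no $\hat{F}(X_{(i)})$ hits 0 or 1 exactly, and the critical values still come from the appropriate tables):

```python
import numpy as np

def ks_and_ad_statistics(x, fitted_cdf):
    """K-S statistic D_n (via D+ and D-) and Anderson-Darling A_n^2,
    computed from the sorted-data formulas on the slides."""
    z = np.sort(fitted_cdf(np.asarray(x, dtype=float)))  # Fhat(X_(i)), sorted
    n = len(z)
    i = np.arange(1, n + 1)
    d_plus = np.max(i / n - z)
    d_minus = np.max(z - (i - 1) / n)
    Dn = max(d_plus, d_minus)
    # z[::-1] supplies Fhat(X_(n+1-i)) term-by-term against z = Fhat(X_(i))
    A2 = -np.sum((2 * i - 1) * (np.log(z) + np.log(1 - z[::-1]))) / n - n
    return Dn, A2
```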
Poisson-Process Test
• A common situation in simulation is modeling an event or arrival process over time:
  – Arrivals of customers or jobs, breakdowns of machines, occurrences of accidents
• A popular (and often realistic) model is the Poisson process with some rate λ
• Equivalent definitions:
  1. The number of events in (t1, t2] ~ Poisson with mean λ(t2 − t1)
  2. The times between successive events ~ IID exponential with mean 1/λ
  3. Given the number of events in a fixed period of time, the event times are distributed uniformly over that period
• Use the second or third definition to develop a test for observed data coming from a Poisson process:
  – Test whether the inter-event times are exponential (chi-square, K-S, A-D, ...)
  – Test whether the placement of events over time is uniform
• See the book for details and an example

The ExpertFit Software
• Software assistance is needed to carry out these statistical procedures
• Standard statistical-analysis packages do not suffice:
  – Often too oriented toward normal-theory and related distributions
  – A wider variety of "nonstandard" distributions is needed to achieve an adequate fit
  – Difficult calculations are involved, such as inverting non-closed-form CDFs and computing critical values and p-values for tests
• The ExpertFit package is tailored to these needs
• Other packages exist, sometimes bundled with simulation-modeling software
• See the book for details on ExpertFit and an extended, in-depth example
• In Arena, distribution fitting is done with the Input Analyzer

Shifted Distributions
• Many standard distributions have range [0, ∞)
  – Exponential, gamma, Weibull, lognormal, PT5, PT6, log-logistic
• But in some situations we would like the range to be [γ, ∞) for some parameter γ
  – A service time cannot physically be arbitrarily close to 0; there is some absolute positive minimum γ for the service time
• One can shift one of the above distributions to the right by γ
  – Replace x in the density definitions by x − γ (including in the definition of the ranges)
• This introduces a new parameter γ that must be estimated from the data
  – Depending on the distribution form, this may be relatively easy (e.g., exponential) or very challenging (e.g., global MLEs are ill-defined for the gamma, Weibull, and lognormal)

Truncated Distributions
• Data are well fitted by a distribution with range [0, ∞), but the physical situation dictates that no value can exceed some finite constant b
• Need to truncate the distribution above b to make the effective range [0, b]
• This is really a random-variate-generation issue (covered in Chapter 8)

Multivariate Distributions, Correlations, and Stochastic Processes
• Assumption so far: we want to generate independent, identically distributed (IID) random variables as input to drive the simulation
• Sometimes there is correlation between random variables in reality:
  – A = interarrival time of a job from an upstream process
  – S = service time of the job at the station being modeled
  – Perhaps a large A means that the job is "large," taking a lot of time upstream; then it will probably take a lot of time here too (S large), so that Cor(A, S) > 0
  – Ignoring this correlation can lead to serious errors in output validity
  – Need ways to estimate this dependence, and (later) to generate it in the simulation

Multivariate Distributions
• Some of the model's input random variables together form a jointly distributed random vector
• One must specify the joint distribution form and estimate its parameters
• The correlations between the random variables are then determined by the joint distribution form
• At present, this is limited to several specific special cases (see the book for details): multivariate normal, multivariate lognormal, multivariate Johnson, and bivariate Bézier
• For example, $(X_1, X_2, \ldots, X_n) \sim \text{MVN}(\mu, \Sigma)$, with $E[X_i] = \mu_i$ and $\text{Cov}(X_i, X_j) = \Sigma_{ij}$, has joint density
  $$f(x_1, x_2, \ldots, x_n) = \frac{1}{(2\pi)^{n/2} \sqrt{\det \Sigma}}\, e^{-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)}$$

Arbitrary Marginal Distributions and Correlations
• This is less ambitious than specifying the joint distribution, but it affords greater flexibility
• Allow for possible correlation between input random variables, but fit their univariate (marginal) distributions separately
• Must specify the univariate marginal distributions (earlier methods) and estimate the correlations (fairly easy)
• Does not, in general, uniquely specify the joint distribution
  – Except in the multivariate normal case, where specifying the marginal distributions and all the correlations uniquely specifies the joint distribution
• Must take care that the correlations are compatible with the marginal distributions
  – The marginal distributions place constraints on what correlation structure is theoretically possible
• How to generate this structure for input to the simulation?

Stochastic Processes
• Suppose you have an input stochastic process {X1, X2, ...} where the Xi's have the same distribution but are correlated at various lags; for example, Xi is the size of the ith incoming message in a communications system, and it could be that large messages tend to be followed by other large messages (or the reverse)
• One can regard this as an infinite-dimensional random vector for input
• Some specific models (see the book for details) are:
  – Autoregressive (AR):
    $$X_i = \mu + \phi_1 (X_{i-1} - \mu) + \phi_2 (X_{i-2} - \mu) + \cdots + \phi_p (X_{i-p} - \mu) + \varepsilon_i$$
  – Autoregressive moving average (ARMA):
    $$X_i = \mu + \phi_1 (X_{i-1} - \mu) + \cdots + \phi_p (X_{i-p} - \mu) + \theta_1 \varepsilon_{i-1} + \theta_2 \varepsilon_{i-2} + \cdots + \theta_q \varepsilon_{i-q} + \varepsilon_i$$

Selecting a Distribution in the Absence of Data
• What if there are no data?
• One must rely to some extent on subjective information (guesses)
• You can ask an "expert" for:
  – min, max: uniform distribution on [a, b]
  – min, max, mode: triangular distribution with parameters [a, b, c]
  – min, max, mode, mean: beta distribution with parameters [a, b, c, μ]
• See the book for details and an example
• It is a good idea to do sensitivity analysis: change the input distributions and see whether the output changes appreciably

Models of Arrival Processes
• In many applications we need to model arrival processes
• As with distributions, we need to specify the form and estimate the parameters
• Three common models: the Poisson process, the nonstationary Poisson process, and the compound Poisson process

Poisson Process
• Three "behavioral" assumptions:
  1. Events occur one at a time
  2. The number of events in a time interval is independent of the past
  3. The expected rate of events is constant over time
• Fitting: fit an exponential distribution to the inter-event times via MLE
• Testing: chi-square or K-S test for the exponential distribution
  $$P(T_1 > t) = e^{-\lambda t}, \qquad P(N(t) = n) = \frac{e^{-\lambda t} (\lambda t)^n}{n!}$$
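As a sketch of the fit-and-test recipe above (names are mine): fit λ by MLE on the inter-event gaps, then run K-S checks of exponential gaps and of uniform event placement. Since the rate is estimated from the same data, the exponential test's p-value is only approximate; the book's tables handle this case exactly.

```python
import numpy as np
from scipy import stats

def fit_and_test_poisson_process(event_times):
    """MLE fit (lambda_hat = 1 / mean gap) plus two heuristic K-S checks:
    gaps ~ exponential, and event times ~ uniform over the window."""
    t = np.sort(np.asarray(event_times, dtype=float))
    gaps = np.diff(t)
    lam_hat = 1.0 / gaps.mean()
    ks_expo = stats.kstest(gaps, "expon", args=(0, 1.0 / lam_hat))
    ks_unif = stats.kstest(t, "uniform", args=(t[0], t[-1] - t[0]))
    return lam_hat, ks_expo.pvalue, ks_unif.pvalue
```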
Nonstationary Poisson Process
• Drop behavioral assumption 3 above and keep the others
• This allows the expected rate of events to vary with time, so replace the constant arrival rate λ with a function λ(t); define the cumulative rate
  $$\Lambda(t) = \int_0^t \lambda(s)\,ds$$
• Then
  $$P\bigl(N(t_2) - N(t_1) = n\bigr) = \frac{e^{-[\Lambda(t_2) - \Lambda(t_1)]}\,\bigl[\Lambda(t_2) - \Lambda(t_1)\bigr]^n}{n!}, \qquad P\bigl(T_{n+1} - T_n > t \mid T_n\bigr) = e^{-[\Lambda(T_n + t) - \Lambda(T_n)]}$$
• Estimation of the rate function:
  – Assume the rate function is constant over subintervals of time
    • Must specify subintervals thought to be appropriate
    • Must be careful to keep the units straight
  – Other methods exist (see the book for discussion and references)

Compound Poisson Process (Batch Arrivals)
• Drop behavioral assumption 1 above and keep the others
• This allows the number of events arriving at each arrival epoch (the batch size) to be a discrete random variable, independent of the event-time process
• The event-time process itself is a Poisson process:
  $$X(t) = B_1 + B_2 + \cdots + B_{N(t)}$$
• Fitting:
  – Fit an exponential distribution to the inter-event times via MLE
  – Fit a discrete random variable to the observed "group" (batch) sizes
• Testing:
  – Chi-square or K-S tests, separately for the inter-event times and the group sizes

Assessing the Homogeneity of Different Data Sets
• Sometimes we have different data sets on related but separate processes
  – For example, service-time observations for ten different days
  – Can the data sets be merged?
  – In other words, is the underlying distribution the same for each day?
• Advantages of merging (if it can be justified):
  – Larger sample size, so a better specification of the input distribution
  – Just one specification problem rather than several
  – Just one distribution from which to generate in the simulation model
• We need to test
  – H0: all the population distribution functions are identical, vs.
  – H1: at least one of the populations tends to yield larger observations than at least one of the other populations
• The formal statistical test is the Kruskal-Wallis test, a nonparametric test based on the ranks of the pooled data (see the book for details; a sketch follows below)
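A minimal sketch of the Kruskal-Wallis check via scipy (the function name, threshold, and sample data are mine, purely illustrative):

```python
from scipy import stats

def can_merge(*samples, alpha=0.10):
    """Kruskal-Wallis H-test of H0: all groups share one distribution.
    A large p-value gives no evidence against merging the data sets,
    though, as with any test, it is not proof of homogeneity."""
    h_stat, p_value = stats.kruskal(*samples)
    return p_value > alpha, h_stat, p_value

# Example: service times from three hypothetical days.
day1 = [4.2, 3.9, 5.1, 4.7, 4.0]
day2 = [4.5, 4.1, 3.8, 5.0, 4.4]
day3 = [4.0, 4.8, 4.3, 3.7, 4.9]
mergeable, h, p = can_merge(day1, day2, day3)
print(mergeable, h, p)
```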