INPUT MODELING
What distribution to pick
How to estimate parameters

BASICS
• Many phenomena in the modeled system are stochastic, with unknown parametric properties
– Life length of a bulb, miss distance of a bullet, time until the next arrival, number of patrols in a given week
• Must make two decisions:
– Distribution (based on the physics of the phenomenon)
– Parameters (based on data)

THINGS TO THINK ABOUT
• Is the value bounded (above or below)?
• Is there a most likely value?
• Is the value positive?
• Is the value a function of several sub-values?
– Is it the minimum?
– Is it the sum?
• Is it discrete (one of a small set of values) or continuous (all numbers [in an interval] are possible)?

Bernoulli Trials
Context: Random events with two possible values
Two events: Yes/No, True/False, Success/Failure
Two possible values: 1 for success, 0 for failure.
Example: Tossing a coin, packet transmission status
An experiment comprises n trials.
PMF (probability in one trial):
$$p_j(x_j) = \begin{cases} p, & x_j = 1 \\ 1 - p = q, & x_j = 0 \\ 0, & \text{otherwise} \end{cases} \qquad j = 1, 2, \ldots, n$$
Expected Value: $E[X_j] = p$
Variance: $V[X_j] = \sigma^2 = p(1 - p)$
Bernoulli Process (n trials): $p(x_1, x_2, \ldots, x_n) = p_1(x_1)\, p_2(x_2) \cdots p_n(x_n)$

Geometric Distribution
Context: the number of Bernoulli trials until achieving the FIRST SUCCESS. It is used to represent the random time until a first transition occurs.
PMF:
$$p(X = k) = \begin{cases} q^{k-1} p, & k = 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases}$$
CDF: $F(k) = P(X \le k) = 1 - (1 - p)^k$
Expected Value: $E[X] = 1/p$
Variance: $V[X] = \sigma^2 = q/p^2 = (1 - p)/p^2$

Discrete Uniform
Discrete example: roll of a die
$p(x) = 1/6$ for $x = 1, 2, \ldots, 6$
$\sum_{\text{all } x} p(x) = 1$

BINOMIAL DISTN
• Let X1, X2, …, Xn be independent Bernoulli (yes/no's) w.p. p
• Let B = their sum
• B is Binomial(n, p)
• E[B] = np
• VAR[B] = np(1-p)
• $\mathrm{Prob}(B = k) = \binom{n}{k} p^k (1 - p)^{n-k}$

Probability Distribution
Let X be a continuous rv.
Then a probability distribution or probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with a ≤ b,
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
The graph of f is the density curve.
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc.

Probability Density Function
For f(x) to be a pdf:
1. f(x) ≥ 0 for all x.
2. The area of the region between the graph of f and the x-axis is equal to 1.
[Figure: density curve y = f(x), area = 1]
P(a ≤ X ≤ b) is given by the area of the shaded region.
[Figure: density curve y = f(x), shaded between a and b]

UNIFORM RANDOM VARIABLE
[Figure: flat density of height 1/(b-a) between a and b]
• Hard boundaries at a and b
• Every value in between has equal density, 1/(b-a)
• Special case: a = 0, b = 1

TRIANGULAR
[Figure: triangular density over a, c, b]
• Sum two independent, identically distributed uniforms to get a triangular; c then bisects ab
• If you know the min, max, and most likely value, use the triangular

Exponential Distribution
• Usually, the exponential distribution is used to describe the time or distance until some event happens.
• It has the form
$$f(x) = \frac{1}{\mu} e^{-x/\mu}$$
– where x ≥ 0 and μ > 0. μ is the mean or expected value.
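A minimal Python sketch of the pdf ideas above, applied to the exponential density. The mean of 20, the truncation point, the interval [10, 30], and the helper names `expon_pdf` and `integrate` are assumptions chosen for illustration. It checks numerically that the area under the density is about 1, that the mean is about μ, and that an interval probability is an area under the curve.

```python
import math


def expon_pdf(x, mu):
    """Density of an exponential random variable with mean mu (mu > 0)."""
    if x < 0:
        return 0.0
    return math.exp(-x / mu) / mu


def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


mu = 20.0          # illustrative mean (assumption for this sketch)
upper = 50 * mu    # finite truncation point standing in for infinity

# Area under the density: should be close to 1.
area = integrate(lambda x: expon_pdf(x, mu), 0.0, upper)

# Expected value, the integral of x * f(x): should be close to mu.
mean = integrate(lambda x: x * expon_pdf(x, mu), 0.0, upper)

# P(10 <= X <= 30) as the area under the density between 10 and 30.
prob_10_30 = integrate(lambda x: expon_pdf(x, mu), 10.0, 30.0)

print(f"area = {area:.6f}, mean = {mean:.4f}, P(10 <= X <= 30) = {prob_10_30:.6f}")
```

The midpoint rule is used only to keep the sketch dependency-free; any numerical integrator would serve the same purpose.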
What should the exponential distribution look like?
[Figure: R(x) plotted against x; vertical axis from 0 to 1]

Exponential Distribution (µ = 20)
$$f(x) = \begin{cases} \frac{1}{20} \exp\left(-\frac{x}{20}\right), & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$
$$F(x) = \begin{cases} 0, & x < 0 \\ 1 - \exp\left(-\frac{x}{20}\right), & x \ge 0 \end{cases}$$

DISTINCTIVE PROPERTIES OF EXPONENTIALS (1)
$P[X > t + s \mid X > s] = P[X > t]$
• Memoryless (very weird)
• X always > 0
• X possible over the whole positive real line – you can shift this around
• The minimum of a set of independent exponentials is exponential

DISTINCTIVE PROPERTIES OF EXPONENTIALS (2)
$\min(X_1, \ldots, X_n) \sim \mathrm{Expon}(n\lambda)$
• Let X1, …, Xn ~ Expon(λ), independent
• λ is the rate of failure, arrival, transition
– Plants die at a rate of 20/year
• E[X] = 1/λ
– The expected life length of a plant is 1/20 years
• The minimum of the X's is also exponential and has rate nλ

Standard Normal Cumulative Areas
[Figure: standard normal curve; shaded area to the left of z = Φ(z)]

NORMALS
• Most overused distribution in input modeling
• Support on (-inf, inf)
• 2 intuitive parameters to estimate – μ and σ
• Available in Excel using NORMDIST and NORMINV

CENTRAL LIMIT THEOREM
$$\sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2) \quad \text{(approximately)}$$
• So X-bar ~ N(μ, σ²/n), approximately
• The approximation depends on the size of n and the underlying distribution of the X's
• The X's themselves do not become Normal just because there are a bunch of them

SOME CHALLENGES
• Number of minutes we wait before a machine fails.
• Grab every 100th component off of an assembly line, inspect for a specific defect
– A single outcome
– Number of samples before a defect is detected
– Number of defects in 10 samples
– Number of defects in 300 samples
• Height of a random student on a basketball team
– If you know the tallest and shortest
– If you know the tallest, the shortest, and the most common height
– If you know all of the heights

THE SECOND STEP
• Distribution choice
– Based on the physical/operational properties of the quantity
• Parameters
– Based on data we observe
– Use the Method of Moments
  • Measure things from the data like X-bar, VAR, max, min
  • Use the properties of the distribution to estimate the parameters

EXAMPLE: BINOMIAL
• Recall that μ = np, σ² = np(1-p)
• Binomials (B) are sums of Bernoullis (n X's)
• We observe M samples of B
• Easy case: n known
$$\bar{B} = \frac{1}{M} \sum_{j=1}^{M} B_j = \frac{1}{M} \sum_{j=1}^{M} \sum_{i=1}^{n} X_{ij} = n\hat{p} \quad\Rightarrow\quad \hat{p} = \frac{\bar{B}}{n}$$

EXAMPLE: BINOMIAL
• Hard case: neither n nor p known
$$s^2 = \hat{n}\hat{p}(1 - \hat{p}), \qquad \hat{p} = \frac{\bar{B}}{\hat{n}}$$
$$s^2 = \hat{n} \cdot \frac{\bar{B}}{\hat{n}} \left(1 - \frac{\bar{B}}{\hat{n}}\right) = \bar{B}\left(1 - \frac{\bar{B}}{\hat{n}}\right)$$
$$\frac{s^2}{\bar{B}} = 1 - \frac{\bar{B}}{\hat{n}} \quad\Rightarrow\quad \frac{\bar{B}}{\hat{n}} = 1 - \frac{s^2}{\bar{B}}$$
$$\hat{n} = \frac{\bar{B}}{1 - \dfrac{s^2}{\bar{B}}} = \frac{\bar{B}^2}{\bar{B} - s^2}$$
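The two binomial cases above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `binomial_mom`, the simulated data (n = 20, p = 0.3, M = 5000), and the seed are all assumptions introduced here. The sample variance s² uses the usual M - 1 divisor, and the estimate of n is left unrounded.

```python
import random
import statistics


def binomial_mom(samples, n=None):
    """Method-of-moments estimates (n_hat, p_hat) for Binomial(n, p) data.

    Easy case (n given):  p_hat = B_bar / n.
    Hard case (n unknown): n_hat = B_bar**2 / (B_bar - s2), p_hat = B_bar / n_hat.
    """
    b_bar = statistics.fmean(samples)
    if n is not None:
        return n, b_bar / n
    s2 = statistics.variance(samples)  # sample variance, divisor M - 1
    if s2 >= b_bar:
        # A binomial has variance below its mean, so the formula breaks down here.
        raise ValueError("sample variance must be smaller than the sample mean")
    n_hat = b_bar ** 2 / (b_bar - s2)
    return n_hat, b_bar / n_hat


# Hypothetical data: M = 5000 observations of B ~ Binomial(20, 0.3),
# each built as a sum of 20 Bernoulli trials.
random.seed(1)
data = [sum(random.random() < 0.3 for _ in range(20)) for _ in range(5000)]

print("easy case (n known):", binomial_mom(data, n=20))
print("hard case (n unknown):", binomial_mom(data))
```

The hard-case estimate of n is noticeably noisier than the estimate of p because it divides by the difference between the sample mean and the sample variance, which can be small.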