Week 5 (Sep 29 – Oct 3): Four Mini-Lectures
QMM 510, Fall 2014

Chapter 7 Continuous Probability Distributions

Chapter Contents
7.1 Describing a Continuous Distribution
7.2 Uniform Continuous Distribution
7.3 Normal Distribution
7.4 Standard Normal Distribution
7.5 Normal Approximations
7.6 Exponential Distribution
7.7 Triangular Distribution (Optional)

So many topics, so little time …

ML 5.1
Chapter 7 Continuous Distributions

Events as Intervals
• Discrete Variable – each value of X has its own probability P(X).
• Continuous Variable – events are intervals and probabilities are areas under continuous curves. A single point has no probability.

PDF – Probability Density Function
Continuous PDF:
• Denoted f(x)
• Must be nonnegative
• Total area under curve = 1
• Mean, variance, and shape depend on the PDF parameters
• PDF reveals the shape of the distribution

CDF – Cumulative Distribution Function
Continuous CDF:
• Denoted F(x)
• Shows P(X ≤ x), the cumulative proportion below x.
• Shows the area to the left of any given point on the PDF.
• There are Excel functions for either the PDF or CDF.

Probabilities as Areas
Continuous probability functions:
• Unlike discrete distributions, the probability at any single point is 0.
• The entire area under any PDF, by definition, is 1.
• Mean is the balance point of the distribution.

Expected Value and Variance
The mean and variance of a continuous random variable are analogous to E(X) and Var(X) for a discrete random variable. Here the integral sign replaces the summation sign. Calculus is required to compute the integrals.

Chapter 7 Normal Distribution

Characteristics of the Normal Distribution
• The normal or Gaussian (or bell-shaped) distribution was named for German mathematician Karl Gauss (1777 – 1855).
• Defined by two parameters, µ and σ.
• Denoted N(µ, σ).
• Domain is −∞ < X < +∞ (continuous scale).
• Almost all (99.7%) of the area under the normal curve is included in the range µ − 3σ < X < µ + 3σ.
• Symmetric and unimodal about the mean.

Characteristics of the Normal Distribution (continued)
• Normal PDF f(x) reaches a maximum at µ and has points of inflection at µ ± σ.
• The curve is bell-shaped. Note: All normal distributions have the same shape but differ in the axis scales.
• Excel function for PDF (height of the function at x) is =NORM.DIST(x, µ, σ, 0). The last argument is 0 for PDF, 1 for CDF.
• Normal CDF has a “lazy-S” shape.
• Excel function for CDF (area to left of x) is =NORM.DIST(x, µ, σ, 1).

Characteristics of the Standard Normal Distribution
• Since for every value of µ and σ there is a different normal distribution, we transform a normal random variable to a standard normal distribution with µ = 0 and σ = 1 using the formula z = (x − µ)/σ.

Chapter 7 Standard Normal Distribution

Characteristics of the Standard Normal
• Standard normal PDF f(z) reaches a maximum at z = 0 and has points of inflection at ±1.
• Shape is unaffected by the transformation. It is still a bell-shaped curve.
• Excel function for CDF (area to left of z) is =NORM.S.DIST(z, 1).
[Figure 7.11: standard normal CDF]
• A common scale from −3 to +3 is used.
• Entire area under the curve is unity.
• The probability of an event P(z1 < Z < z2) is a definite integral of f(z).
• However, standard normal tables or Excel functions can be used to find the desired probabilities.

Normal Areas from Appendix C-1
• Appendix C-1 allows you to find the area under the curve from 0 to z.
• For example, P(0 < Z < 1.96) = .4750.
• Now find P(−1.96 < Z < 1.96).
• Due to symmetry, P(−1.96 < Z) is the same as P(Z < 1.96), so the area from −1.96 to 0 equals the area from 0 to 1.96.
• So, P(−1.96 < Z < 1.96) = .4750 + .4750 = .9500, or 95% of the area under the curve.

Basis for the Empirical Rule
• Approximately 68% of the area under the curve is within ±1σ of the mean.
• Approximately 95% of the area under the curve is within ±2σ of the mean.
• Approximately 99.7% of the area under the curve is within ±3σ of the mean.

Normal Areas from Appendix C-2
• Appendix C-2 allows you to find the area under the curve to the left of z (similar to Excel).
• This table is the CDF (not the PDF).
• For example:
  P(Z < 1.96): =NORM.S.DIST(1.96,1)
  P(Z < −1.96): =NORM.S.DIST(-1.96,1)
  P(−1.96 < Z < 1.96): =NORM.S.DIST(1.96,1)-NORM.S.DIST(-1.96,1)

Normal Areas from Appendices C-1 and C-2
• Appendices C-1 and C-2 yield identical results.
• Use whichever table is easiest.

Finding z for a Given Area
• Appendices C-1 and C-2 can be used to find the z-value corresponding to a given probability.
• For example, what z-value defines the top 1% of a normal distribution?
• This implies that 49% of the area lies between 0 and z, which gives z = 2.33 by looking for an area of 0.4900 in Appendix C-1.

Finding Areas Using Standardized Variables
• John took an economics exam and scored 86 points. The class mean was 75 with a standard deviation of 7. What percentile is John in? That is, what is P(X < 86), where X represents the exam scores?
• z = (86 − 75)/7 = 1.57, so John’s score is 1.57 standard deviations above the mean.
• P(X < 86) = P(Z < 1.57) = .9418 (from Appendix C-2).
• John is approximately in the 94th percentile.

Appendix C-2: Cumulative Standard Normal Distribution (continued)
This table shows the normal area less than z.
z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0   0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1   0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2   0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3   0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4   0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5   0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6   0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7   0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8   0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9   0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2.0   0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817

Finding Areas by Using Standardized Variables
• You can use Excel, Minitab, TI83/84, etc. to compute these probabilities directly.
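
As one more way to compute these areas directly, here is a minimal sketch in Python. It is not part of the course materials; it assumes Python 3.8 or later, whose standard library provides statistics.NormalDist. The variable names are illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1

# Areas that the appendix tables give by lookup
area_0_to_196 = Z.cdf(1.96) - Z.cdf(0.0)   # Appendix C-1 style: area from 0 to z
central_area = Z.cdf(1.96) - Z.cdf(-1.96)  # area between -1.96 and +1.96
z_top_1pct = Z.inv_cdf(0.99)               # z that leaves 1% in the upper tail

# John's exam score: X ~ N(75, 7), find P(X < 86)
scores = NormalDist(mu=75, sigma=7)
p_direct = scores.cdf(86)                  # no standardizing, unrounded z
z_john = (86 - 75) / 7                     # standardize first
p_table = Z.cdf(round(z_john, 2))          # mimic a table lookup at z = 1.57

print(f"P(0 < Z < 1.96)     = {area_0_to_196:.4f}")
print(f"P(-1.96 < Z < 1.96) = {central_area:.4f}")
print(f"z for top 1%        = {z_top_1pct:.2f}")
print(f"P(X < 86), direct   = {p_direct:.4f}")
print(f"P(Z < 1.57), table  = {p_table:.4f}")
```

The two results for John's score should differ slightly for the same reason the table and software answers do: the table lookup rounds z to two decimals.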
The Excel functions are shown:
• Without standardizing: =NORM.DIST(x, µ, σ, 1), e.g., =NORM.DIST(86, 75, 7, 1) = .9420
• With standardizing: =NORM.S.DIST(z, 1), e.g., =NORM.S.DIST(1.57, 1) = .9418
• The slight difference is due to rounding z to 1.57.

ML 5.2
Chapter 7 Inverse Normal

Inverse Normal
• How can we find the various normal percentiles (5th, 10th, 25th, 75th, 90th, 95th, etc.), known as the inverse normal? That is, how can we find X for a given area?
• We simply turn the standardizing transformation around: solving for x in z = (x − μ)/σ gives x = μ + zσ.

Inverse Normal: Excel
• Finding x: =NORM.INV(area, μ, σ)
• Finding z: =NORM.S.INV(area)

Inverse Normal: Example
• John’s economics professor decides that any student who scores below the 10th percentile must retake the exam.
• The exam scores are normal with μ = 75 and σ = 7.
• What is the score that would require a student to retake the exam?
• We need to find the value of x that satisfies P(X < x) = .10.
• The z-score for the 10th percentile is z = −1.28.

The logical steps to solve the problem are:
• Use Appendix C to find z = −1.28 to satisfy P(Z < −1.28) = .10.
• Substitute z = −1.28 into z = (x − μ)/σ to get −1.28 = (x − 75)/7.
• Solve for x to get x = 75 − (1.28)(7) = 66.04 (or 66 after rounding).
• Students who score below 66 points on the economics exam will be required to retake the exam.
• Or use Excel to obtain z: =NORM.S.INV(0.1) = −1.282
• Or use Excel to solve in one step: =NORM.INV(0.1,75,7) = 66.03 (slightly different because Excel does not round z).

Chapter 7 Normal Approximations

Normal Approximation to the Binomial
• Binomial probabilities are difficult to calculate when n is large.
• Use a normal approximation to the binomial distribution.
• As n becomes large, the binomial bars become smaller and the PDF approaches a continuous distribution.
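
The last point can be illustrated numerically. The sketch below is an illustration rather than anything from the slides: it compares each binomial bar with the height of the normal curve that has the same mean and standard deviation, and reports the largest gap. The choice of 0.5 for the success probability and the particular values of n are assumptions made for the demonstration; the helper name max_gap is hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def max_gap(n: int, p: float = 0.5) -> float:
    """Largest |binomial probability - normal density| over k = 0..n."""
    normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return max(
        abs(comb(n, k) * p**k * (1 - p) ** (n - k) - normal.pdf(k))
        for k in range(n + 1)
    )


# Assumed sample sizes, chosen only to show the trend as n grows
gaps = {n: max_gap(n) for n in (8, 32, 128, 512)}

for n, gap in gaps.items():
    print(f"n = {n:4d}   largest gap = {gap:.6f}")
```

The gaps should shrink as n increases, which is the sense in which the bars approach a smooth curve.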
Normal Approximation to the Binomial (continued)
• Rule of thumb: when nπ ≥ 10 and n(1 − π) ≥ 10, it is appropriate to use the normal approximation to the binomial distribution.
• Set the normal µ and σ equal to the mean and standard deviation of the binomial distribution: µ = nπ and σ = √(nπ(1 − π)).

Example: Coin Flips
• If we flip a coin n = 32 times and π = .50, are the requirements for a normal approximation to the binomial distribution met?
• Yes, because:
  nπ = 32 × .50 = 16 (at least 10 “successes”)
  n(1 − π) = 32 × (1 − .50) = 16 (at least 10 “failures”)
• When translating a discrete scale into a continuous scale, care must be taken about individual points.
• For example, find the probability of more than 17 heads in 32 flips of a fair coin. This can be written as P(X ≥ 18).
• However, “more than 17” actually falls between 17 and 18 on a discrete scale.
• Since the cutoff point for “more than 17” is halfway between 17 and 18, we add 0.5 to the lower limit and find P(X > 17.5).
• This addition to X is called the Continuity Correction.
• At this point, the problem can be completed as any normal distribution problem.
• With µ = 16 and σ = √(32 × .50 × .50) = 2.828, z = (17.5 − 16)/2.828 = 0.53, so
  P(X > 17) = P(X ≥ 18) ≈ P(X ≥ 17.5) = P(Z > 0.53) = 0.2981

Normal Approximation to the Poisson
• The normal approximation to the Poisson distribution works best when λ is large (e.g., when λ exceeds the values in Appendix B).
• Set the normal µ and σ equal to the mean and standard deviation for the Poisson distribution: µ = λ and σ = √λ.

Example: Utility Bills
• On Wednesday between 10 a.m. and noon customer billing inquiries arrive at a mean rate of 42 inquiries per hour at Consumers Energy.
• What is the probability of receiving more than 50 calls in an hour?
• λ = 42, which is too big to use the Poisson table.
• Use the normal approximation with µ = 42 and σ = √42 = 6.48074.
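
Both approximation examples can be checked against exact calculations. The following is a minimal sketch, not the textbook's procedure: it assumes Python 3.8 or later, applies a continuity-corrected cutoff halfway between the two adjacent integers in each case (17.5 for the coin flips, 50.5 for the billing inquiries), and uses the hypothetical helper name normal_upper_tail.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist


def normal_upper_tail(cutoff: float, mu: float, sigma: float) -> float:
    """P(X > cutoff) when X is N(mu, sigma)."""
    return 1 - NormalDist(mu, sigma).cdf(cutoff)


# Coin flips: more than 17 heads in n = 32 flips of a fair coin
n, p_heads = 32, 0.50
coin_approx = normal_upper_tail(17.5, n * p_heads, sqrt(n * p_heads * (1 - p_heads)))
coin_exact = sum(
    comb(n, k) * p_heads**k * (1 - p_heads) ** (n - k) for k in range(18, n + 1)
)

# Utility bills: more than 50 inquiries in an hour, mean rate 42 per hour
lam = 42
bills_approx = normal_upper_tail(50.5, lam, sqrt(lam))
bills_exact = 1 - sum(exp(-lam) * lam**k / factorial(k) for k in range(51))

print(f"Coin flips: normal approx = {coin_approx:.4f}, exact binomial = {coin_exact:.4f}")
print(f"Inquiries:  normal approx = {bills_approx:.4f}, exact Poisson  = {bills_exact:.4f}")
```

The coin-flip approximation should land close to the 0.2981 obtained from the table; any small difference comes from rounding z to 0.53.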
Example: Utility Bills (continued)
• To find P(X > 50) calls, use the continuity-corrected cutoff point halfway between 50 and 51 (i.e., X = 50.5).
• At this point, the problem can be completed as any normal distribution problem.

Bottom Line:
• With Excel, we do not need these approximations for calculations.
• They are still useful when Excel is not available.
• They are taught to show the logical connection between discrete and continuous distributions.

ML 5.3
Chapter 7 Exponential Distribution

Characteristics of the Exponential Distribution
• If events per unit of time follow a Poisson distribution (e.g., customer arrivals), the waiting time until the next event (e.g., customer arrival) follows the exponential distribution.
• The time until the next event is a continuous variable.
• Note: We seek tail probabilities such as P(X > x) or P(X ≤ x).
• Probability of waiting less than or equal to x: P(X ≤ x) = 1 − e^(−λx)
• Probability of waiting more than x: P(X > x) = e^(−λx)
• Note: A point has no area, so P(X ≤ x) is the same as P(X < x) and, similarly, P(X > x) is the same as P(X ≥ x).

Example: Customer Waiting Time
• Between 2 p.m. and 4 p.m. on Wednesday, patient insurance inquiries arrive at Blue Choice insurance at a mean rate of 2.2 calls per minute.
• What is the probability of waiting more than 30 seconds (i.e., 0.50 minutes) for the next call?
• Set λ = 2.2 events/min and x = 0.50 min.
• P(X > 0.50) = e^(−λx) = e^(−(2.2)(0.5)) = .3329, or a 33.29% chance of waiting more than 30 seconds for the next call.
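
The waiting-time calculation is short enough to verify by hand, but a small Python sketch makes the two tail formulas explicit. The function names exp_tail and exp_cdf are hypothetical, introduced only for this illustration.

```python
from math import exp


def exp_tail(x: float, lam: float) -> float:
    """P(X > x): probability of waiting more than x."""
    return exp(-lam * x)


def exp_cdf(x: float, lam: float) -> float:
    """P(X <= x): probability of waiting x or less."""
    return 1 - exp(-lam * x)


lam = 2.2  # mean arrival rate, calls per minute
x = 0.50   # 30 seconds expressed in minutes (same time unit as lam)

p_more = exp_tail(x, lam)
p_less = exp_cdf(x, lam)

print(f"P(X > {x})  = {p_more:.4f}")
print(f"P(X <= {x}) = {p_less:.4f}")
```

Note that x and λ must use the same time unit; converting 30 seconds to 0.50 minutes is what keeps the exponent dimensionless.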
For the customer waiting time example (λ = 2.2 inquiries per minute, x = 0.50 minutes):
P(X > 0.50) = e^(−λx) = e^(−(2.2)(0.5)) = .3329
P(X ≤ 0.50) = 1 − .3329 = .6671

Inverse Exponential
• If the mean arrival rate is 2.2 calls per minute, what is the 90th percentile for waiting time (the top 10% of waiting time)?
• Find the x-value that defines the upper 10%.
• Setting P(X > x) = e^(−λx) = .10 and solving for x gives x = −ln(.10)/λ = 2.3026/2.2 = 1.0466 minutes.

Mean Time Between Events
• The mean of an exponential distribution is 1/λ, so with λ = 2.2 calls per minute the mean time between calls is 1/2.2 = 0.4545 minutes (about 27 seconds).

Bottom Line:
You may encounter the exponential model in any situation that involves customer arrivals, waiting lines, and queueing (e.g., retail business, call center, concert, theme park, bank, grocery store, airline check-in, traffic planning). Such applications are not rare in our crowded world. Study simulation (Chapter 18) to learn more about how such situations can be modeled to plan facility capacity, predict waiting times, and study system throughput.

ML 5.4
Chapter 7 Other Continuous Distributions

Characteristics of the Triangular Distribution
• The triangular distribution is a way of thinking about variation that corresponds rather well to what-if analysis in business.
• It is not surprising that business analysts are attracted to the triangular model.
• Its finite range and simple form are more understandable than a normal distribution.
• It is more versatile than a normal because it can be skewed in either direction.
• Yet it has some of the nice properties of a normal, such as a distinct mode.
• The triangular model is especially handy for what-if analysis when the business case depends on predicting a stochastic variable (e.g., the price of a raw material, an interest rate, a sales volume).
• If the analyst can anticipate the range (a to c) and most likely value (b), it will be possible to calculate probabilities of various outcomes.
• Many times, distributions will be skewed, so a normal wouldn’t be much help.

Triangular Distribution: Example T(15, 20, 30)

Chapter 7 Uniform Continuous Distribution

Characteristics of the Uniform Distribution
• If X is a random variable that is uniformly distributed between a and b, its PDF has constant height.
• Denoted U(a, b)
• Area = base × height = (b − a) × 1/(b − a) = 1

Example: Anesthesia Effectiveness
• An oral surgeon injects a painkiller prior to extracting a tooth. Given the varying characteristics of patients, the dentist views the time for anesthesia effectiveness as a uniform random variable that takes between 15 minutes and 30 minutes.
• X is U(15, 30)
• a = 15, b = 30, find the mean and standard deviation:
  μ = (a + b)/2 = (15 + 30)/2 = 22.5 minutes
  σ = [(b − a)²/12]^(1/2) = [(30 − 15)²/12]^(1/2) = [18.75]^(1/2) = 4.33 minutes
• Find the probability that the effectiveness of the anesthetic takes between 20 and 25 minutes:
  P(20 < X < 25) = (25 − 20)/(30 − 15) = 5/15 = 0.3333 = 33.33%

Uses of Uniform Distribution
• Can be a conservative “what-if” baseline model.
• Excel’s =RAND() function follows this model, with a = 0 and b = 1:
  μ = (a + b)/2 = (0 + 1)/2 = .5000
  σ = [(b − a)²/12]^(1/2) = [(1 − 0)²/12]^(1/2) = [1/12]^(1/2) = .2887

Try it yourself!
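
For readers working outside Excel, the same experiment can be sketched in Python. This is an illustration under stated assumptions: random.random() stands in for =RAND(), the seed and the sample size of 10,000 are arbitrary choices, and the helper names are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev


def uniform_mean(a: float, b: float) -> float:
    """Mean of U(a, b)."""
    return (a + b) / 2


def uniform_sd(a: float, b: float) -> float:
    """Standard deviation of U(a, b)."""
    return sqrt((b - a) ** 2 / 12)


def uniform_prob(lo: float, hi: float, a: float, b: float) -> float:
    """P(lo < X < hi) for X ~ U(a, b), assuming a <= lo <= hi <= b."""
    return (hi - lo) / (b - a)


# Anesthesia example: X ~ U(15, 30)
print(uniform_mean(15, 30), round(uniform_sd(15, 30), 2), round(uniform_prob(20, 25, 15, 30), 4))

# Simulated U(0, 1) draws, analogous to filling a column with =RAND()
random.seed(510)  # arbitrary seed so the run is repeatable
sample = [random.random() for _ in range(10_000)]
sample_mean = mean(sample)
sample_sd = stdev(sample)

print(f"theory: mean = {uniform_mean(0, 1):.4f}, sd = {uniform_sd(0, 1):.4f}")
print(f"sample: mean = {sample_mean:.4f}, sd = {sample_sd:.4f}")
```

With a sample this large, the sample mean and standard deviation should sit close to the theoretical values; a small sample can miss by noticeably more.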
Calculate a bunch of =RAND() values in Excel, and look at the mean and standard deviation. They should be close to the above predictions (if the sample is large).

Sample =RAND() values:
0.84328  0.33170  0.45351  0.53490  0.46443  0.43802  0.00549
0.68397  0.69134  0.56953  0.04807  0.70129  0.15553  0.96473
0.62752  0.98558  0.25002  0.37406  0.08978  0.32222  0.63328
0.09071  0.65731  0.36416  0.78566  0.05013  0.29142  0.28581
0.72185  0.73706  0.34992  0.79984  0.33627  0.71570  0.82808
0.34901  0.61517  0.09537  0.47772  0.25935  0.27208  0.81790
0.78645  0.97143  0.80646  0.14220  0.50000  0.36504  0.59686
Mean = 0.494637
St Dev = 0.271894

Comparison of Models
• The normal distribution is the one used most often.
• The exponential is useful in modeling waiting lines (queues).
• The triangular distribution is a way of thinking about variation that corresponds well to what-if analysis in business.
• The uniform distribution is useful as a baseline model or for random sampling (randomizing a list).