Lincoln High School Mathematics Department
© T. D. Phillipps 2007

Topic 5 – Probability Distributions (AS 3.6 - 90646)

Achievement Criteria:

Achieved: Use probability distribution models to solve straightforward problems.
  - Binomial probabilities using formulae and tables.
  - Binomial probabilities involving turning successes to failures.
  - Mean and variance of Binomial random variables.
  - Poisson probabilities using formulae and tables.
  - Mean and variance of Poisson random variables.
  - Standard Normal distributions.
  - Other Normal distributions.

Merit: Use probability distribution models to solve problems.
  - Sum of independent normally distributed variables.
  - Inverse Normal problems.
  - Advanced Binomial problems.
  - Advanced Poisson problems.
  - Calculating probabilities using combined events of Normal, Poisson & Binomial distributions.
  - Calculating sample statistics as estimates of population parameters.
  - Calculating probabilities using continuity corrections.

Excellence: Use and justify probability distribution models to solve complex problems.
  - Inverse Poisson problems.
  - Linear combinations of independent normally distributed variables.
  - Selecting and justifying the use of models.

There are many families of probability distributions. We will focus on three of the most common: the Normal Distribution (continuous data), the Binomial Distribution (discrete data), and the Poisson Distribution (discrete data).

Normal Distribution

The most widely used probability distributions belong to the Normal Distribution family. The graphs of these distributions are "bell-shaped" curves that are symmetrical about the mean, \( \mu \), and are given by the probability density function

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \]

[Diagram: bell-shaped curve of y against x, symmetrical about the mean]

The shape of the curve depends on the standard deviation, \( \sigma \): a small \( \sigma \) gives a tall, narrow curve and a large \( \sigma \) gives a low, wide curve.

[Diagram: two normal curves, one labelled "small" and one labelled "large" standard deviation]

Standard Normal Distribution

The standard normal distribution is special in that it has a mean of \( \mu = 0 \) and a standard deviation of \( \sigma = 1 \).
The random variable associated with this distribution is Z, and the probabilities associated with it are tabulated. The tables give the probability that Z lies between 0 and some value z, that is P(0 < Z < z).

[Diagram: standard normal curve with the area between 0 and z shaded]

When solving problems involving the standard normal distribution, it always pays to draw a diagram to clarify what area you are looking for.

eg. Use tables to calculate the following:

1. P(0 < Z < 1.85)
   [Diagram: area between 0 and 1.85 shaded]
   Look up 1.8 down the left of the table and then move across to the 5 column.
   P(0 < Z < 1.85) = 0.4678

2. P(0 < Z < 0.148)
   [Diagram: area between 0 and 0.148 shaded]
   Look up 0.1 down the left of the table, then move across to the 4 column and note the probability. Finally go to 8 in the Differences column and add the value to the last digit(s) of the probability.
   P(0 < Z < 0.148) = 0.0589

3. P(-1.2 < Z < 0)
   [Diagram: area between -1.2 and 0 shaded]
   Because the standard normal curve is symmetrical about 0, we can treat this just like P(0 < Z < 1.2). Look up 1.2 down the left of the table.
   P(-1.2 < Z < 0) = 0.3849

4. P(Z < 1.72)
   [Diagram: area to the left of 1.72 shaded]
   This includes everything to the left of 1.72. As the curve is symmetrical, the probability of getting below 0 is 0.5 (half the graph).
   P(Z < 1.72) = 0.5 + P(0 < Z < 1.72) = 0.5 + 0.4573 = 0.9573

5. P(Z > 2.14)
   [Diagram: area to the right of 2.14 shaded]
   If we look up 2.14 it will give us the probability from 0 to 2.14, but we want the part to the right of 2.14. To get this we subtract the value we look up from 0.5.
   P(Z > 2.14) = 0.5 - P(0 < Z < 2.14) = 0.5 - 0.4838 = 0.0162

6. P(1.3 < Z < 2.485)
   [Diagram: area between 1.3 and 2.485 shaded]
   Looking up 2.485 gives everything from 0 to 2.485. However, we don't want the part from 0 to 1.3, so we need to subtract it.
   P(1.3 < Z < 2.485) = P(0 < Z < 2.485) - P(0 < Z < 1.3) = 0.4935 - 0.4032 = 0.0903

Inverse Standard Normal Problems

Sometimes we know the probability but not the Z value that produces it. In this case we use the table in reverse: we find the probability in the body of the table and use it to read off the Z value.

eg.
Find the value of c such that P(Z < c) = 0.9.

0.9 is greater than 0.5, so c must be to the right of the mean, 0. As the table only gives the probability from 0 to c, we have:
P(0 < Z < c) = 0.9 - 0.5 = 0.4

[Diagram: area to the left of c shaded, total probability 0.9]

So we look in the body of the table for the Z value that has a probability of 0.4.
c = 1.281 (or 1.282)

Other Normal Distributions

For Normal Distributions with other means and standard deviations, we can calculate probabilities by first converting them to the Standard Normal Distribution.

Rule: If X is Normally distributed with mean \( \mu \) and standard deviation \( \sigma \), then
\[ Z = \frac{X - \mu}{\sigma} \]
has a Standard Normal Distribution.

NB: Effectively the Z value is "how many standard deviations the x value is away from the mean".

eg. X is a normally distributed random variable with a mean of 8 and a standard deviation of 2. Find P(X < 11).

Step 1: Draw a diagram.
[Diagram: normal curve with mean 8, area to the left of x = 11 (z = 1.5) shaded]

Step 2: Convert the X value to a Z value.
\[ Z = \frac{X - \mu}{\sigma} = \frac{11 - 8}{2} = 1.5 \]

Step 3: Calculate the probability.
P(X < 11) = P(Z < 1.5) = 0.5 + 0.4332 = 0.9332

Inverse Normal Problems

eg. The weights of cans of soup filled by a particular machine are normally distributed with a mean of 425 g and a standard deviation of 3.2 g. If 6% of cans produced by the machine are rejected as being underweight, what is the largest weight that could be considered underweight?

Let W be a random variable representing the weight of a can of soup produced by the machine.
\[ W \sim N(\mu = 425, \sigma = 3.2) \]

Want c such that P(W < c) = 0.06, which means finding a such that P(Z < a) = 0.06. Since 0.06 is less than 0.5, a must be to the left of the mean, 0, so:
P(a < Z < 0) = 0.5 - 0.06 = 0.44

[Diagram: normal curve with mean 425 (z = 0), area of 0.06 to the left of c (z = a) shaded]

So we look in the body of the table for the Z value that has a probability of 0.44, which is 1.555. Because a is below the mean, a = -1.555.

Now
\[ Z = \frac{c - 425}{3.2} = -1.555 \]
so c = -1.555 × 3.2 + 425 = 420.02 g

Continuity Correction

When the data presented is rounded we need to consider what the true values must be.
To do this we use a continuity correction to go from the discrete rounded variable to a continuous variable. This involves considering which continuous values would end up being rounded to the discrete value given. To find these values we look at the midpoints between consecutive values. For example:

Discrete Probability     Equivalent Continuous Probability
P(x = 6)                 P(5.5 < x < 6.5)
P(x > 2)                 P(x > 2.5)
P(x ≥ 2)                 P(x > 1.5)
P(x ≤ 10)                P(x < 10.5)
P(x < 10)                P(x < 9.5)
P(9 ≤ x < 15)            P(8.5 < x < 14.5)

Sum and Difference of Normally Distributed Random Variables

If you have two independent normally distributed random variables, then adding or subtracting them produces a new random variable which is also normally distributed.

Rule: If A and B are independent normally distributed random variables with means \( \mu_A \) and \( \mu_B \), and standard deviations \( \sigma_A \) and \( \sigma_B \) respectively, then

T = A + B is normally distributed with mean \( \mu_T = \mu_A + \mu_B \) and standard deviation \( \sigma_T = \sqrt{\sigma_A^2 + \sigma_B^2} \)

and

D = A - B is normally distributed with mean \( \mu_D = \mu_A - \mu_B \) and standard deviation \( \sigma_D = \sqrt{\sigma_A^2 + \sigma_B^2} \).

NB: In both cases you must add the variances.

eg. The weight of sausages produced by a particular machine can be considered to be distributed normally with a mean weight of 52 g and a standard deviation of 3 g. The sausages come in packs of 8. What is the probability that the total weight of a packet of sausages exceeds 425 g?

Let T be the total weight of a packet of 8 sausages, assuming the individual weights are independent.

\[ \mu_T = 52 + 52 + 52 + \dots + 52 = 8 \times 52 = 416 \text{ g} \]
\[ \sigma_T = \sqrt{3^2 + 3^2 + 3^2 + \dots + 3^2} = \sqrt{8 \times 3^2} = \sqrt{72} = 8.485 \text{ (4sf)} \]
\[ Z = \frac{425 - 416}{\sqrt{72}} = 1.061 \]

[Diagram: normal curve for T with mean 416 (z = 0), area to the right of T = 425 shaded]

P(T > 425) = P(Z > 1.061) = 0.5 - P(0 < Z < 1.061) = 0.5 - 0.3556 = 0.1444

eg. A machine has two critical components, A and B, that wear out independently of each other. The distributions of their lifetimes are normal with parameters:
A: \( \mu_A \) = 1200 hours, \( \sigma_A \) = 120 hours
B: \( \mu_B \) = 1500 hours, \( \sigma_B \) = 50 hours
What is the probability that B wears out before A?

Want P(B < A) = P(A - B > 0).
Let D = A - B.

\[ \mu_D = 1200 - 1500 = -300 \]
\[ \sigma_D = \sqrt{120^2 + 50^2} = 130 \]
\[ Z = \frac{0 - (-300)}{130} = 2.308 \]

[Diagram: normal curve for D with mean -300 (z = 0), area to the right of D = 0 shaded]

P(D > 0) = P(Z > 2.308) = 0.5 - P(0 < Z < 2.308) = 0.5 - 0.4895 = 0.0105

eg. The time it takes to do a regular service on a car at a particular workshop is said to be normally distributed with a mean of 55 minutes and a standard deviation of 6.2 minutes. The time it takes to change the filter at the same workshop is also normally distributed, with a mean of 24 minutes and a standard deviation of 3.4 minutes. The workshop charges $90 per hour for labour. Calculate the probability that a regular service with a filter change will have labour costs exceeding $120.

Let R be the number of hours spent on a regular service and F be the number of hours spent on changing the filter. The labour cost is given by C = 90R + 90F. Assuming R and F are independent, C is normally distributed with

\[ \mu_C = 90 \times \frac{55}{60} + 90 \times \frac{24}{60} = 118.50 \]

and

\[ \sigma_C = \sqrt{90^2 \left(\frac{6.2}{60}\right)^2 + 90^2 \left(\frac{3.4}{60}\right)^2} = 10.61 \]

Want P(C > 120).

\[ Z = \frac{120 - 118.5}{10.61} = 0.141 \]

[Diagram: normal curve for C with mean 118.5 (z = 0), area to the right of C = 120 shaded]

P(C > 120) = P(Z > 0.141) = 0.5 - 0.0561 = 0.4439
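
The three sum-and-difference examples in this section can be cross-checked numerically. The following is a minimal Python sketch, not part of the original notes, assuming Python 3.8 or later for `statistics.NormalDist`. The helper names `combine` and `prob_above` are hypothetical and chosen only for illustration. It applies the same rule as above: means add (with their coefficients) and variances add (with squared coefficients), which is only valid when the variables are independent.

```python
from math import sqrt
from statistics import NormalDist


def combine(terms):
    """Distribution of sum(coef * X) over independent normal variables X.

    terms is a list of (coef, mean, sd) tuples, one per variable.
    Hypothetical helper: independence is assumed, not checked.
    """
    mean = sum(coef * mu for coef, mu, _ in terms)
    variance = sum((coef * sd) ** 2 for coef, _, sd in terms)
    return NormalDist(mean, sqrt(variance))


def prob_above(dist, x):
    """P(X > x) for a normal distribution."""
    return 1 - dist.cdf(x)


# Sausages: total weight of 8 sausages, each N(52, 3), exceeds 425 g.
sausages = combine([(1, 52, 3)] * 8)
p_sausages = prob_above(sausages, 425)

# Components: D = A - B, with A ~ N(1200, 120) and B ~ N(1500, 50).
diff = combine([(1, 1200, 120), (-1, 1500, 50)])
p_b_first = prob_above(diff, 0)

# Labour cost: times are in minutes, so each coefficient is 90/60 dollars per minute.
cost = combine([(90 / 60, 55, 6.2), (90 / 60, 24, 3.4)])
p_cost = prob_above(cost, 120)

print(f"P(T > 425) = {p_sausages:.4f}")
print(f"P(D > 0)   = {p_b_first:.4f}")
print(f"P(C > 120) = {p_cost:.4f}")
```

Because this uses exact normal probabilities rather than four-figure tables with rounded Z values, the printed answers may differ from the table-based answers in the last decimal place.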