Normal Distribution

Normal distributions are a family of distributions that have the same general shape.

• They are symmetric, with scores more concentrated in the middle than in the tails.
• Normal distributions are sometimes described as bell shaped.
• The area under each curve is the same.
• The height of a normal distribution can be specified mathematically in terms of two parameters: the mean ($\mu$) and the standard deviation ($\sigma$).
• The height (ordinate) of a normal curve is defined by the equation below.

Equation

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Features

• It is bell-shaped.
• It is symmetrical about the mean.
• It extends from $-\infty$ to $+\infty$.
• The total area under the curve is 1.
• The maximum value of $f(x)$ is $\frac{1}{\sigma\sqrt{2\pi}}$, attained at $x = \mu$.

Approximately 95% of the distribution lies within two standard deviations of the mean. Approximately 99.7% of the distribution lies within three standard deviations of the mean. The shape depends on the value of $\sigma$: a larger $\sigma$ gives a lower, wider curve.

Definition

The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. Normal distributions can be transformed to standard normal distributions by the formula:

$$z = \frac{X - \mu}{\sigma}$$

where $X$ is a score from the original normal distribution, $\mu$ is the mean of the original normal distribution, and $\sigma$ is the standard deviation of the original normal distribution. The standard normal distribution is sometimes called the z distribution.

A z score always reflects the number of standard deviations a particular score lies above or below the mean. For instance, if a person scored 70 on a test with a mean of 50 and a standard deviation of 10, then they scored 2 standard deviations above the mean. Converting the test scores to z scores, an $X$ of 70 would be:

$$z = \frac{70 - 50}{10} = 2$$

So a z score of 2 means the original score was 2 standard deviations above the mean. Note that the z distribution will only be a normal distribution if the original distribution ($X$) is normal.
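The height formula and the z score conversion above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `z_score` and `normal_pdf` are illustrative choices and do not come from the notes.

```python
import math


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


def normal_pdf(x, mean=0.0, sd=1.0):
    """Height (ordinate) of the normal curve with the given mean and sd at x."""
    z = z_score(x, mean, sd)
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Test score example from the notes: X = 70, mean 50, standard deviation 10.
print(z_score(70, 50, 10))  # 2.0

# The curve peaks at the mean, where its height is 1 / (sd * sqrt(2 * pi)).
print(normal_pdf(50, 50, 10))
```

The density is written in terms of the z score so that the link between the two formulas stays visible: the exponent is simply $-z^2/2$.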
Applying the formula will always produce a transformed distribution with a mean of zero and a standard deviation of one. However, the shape of the distribution will not be affected by the transformation.

Using the chart

• You need to know how many standard deviations you are from the mean.
• Use $z = \frac{x - \mu}{\sigma}$.

Readings can be made to the left, 'P' (0.5 + chart value): $P(Z < 1.377) = 0.9158$

To the centre, 'Q' (chart value): $P(0 < Z < 1.377) = 0.4158$

Or to the right, 'R' (0.5 - chart value): $P(Z > 1.377) = 0.0842$

Lengths of metal strips produced by a machine are normally distributed with a mean length of 150 cm and a standard deviation of 10 cm.

• Find the probability that the length of a randomly selected strip is shorter than 165 cm.

$$z = \frac{165 - 150}{10} = 1.5$$

$P(X < 165) = P(Z < 1.5) = 0.9332$

• Find the probability that the length of a randomly selected strip is within 5 cm of the mean.

$$z = \frac{155 - 150}{10} = 0.5$$

$P(145 < X < 155) = P(-0.5 < Z < 0.5) = 2 \times 0.1915 = 0.383$

The time taken by the milkman to deliver to the High Street is normally distributed with a mean of 12 mins and a standard deviation of 2 mins. He delivers milk every day.

• Estimate the number of days during the year when he takes longer than 17 mins.

$$z = \frac{17 - 12}{2} = 2.5$$

$P(X > 17) = P(Z > 2.5) = 0.5 - 0.4938 = 0.0062$

$0.0062 \times 365 \approx 2.3$, so about two days.

• Estimate the number of days during the year when he takes less than ten mins.
$$z = \frac{10 - 12}{2} = -1$$

$P(X < 10) = P(Z < -1) = 0.5 - 0.3413 = 0.1587$

$0.1587 \times 365 \approx 58$ days.

• Estimate the number of days during the year when he takes between nine and 13 mins.

$$P\left(\frac{9 - 12}{2} < Z < \frac{13 - 12}{2}\right) = P(-1.5 < Z < 0.5) = 0.4332 + 0.1915 = 0.6247$$

$0.6247 \times 365 \approx 228$ days.

Inverse Normal

The heights of female students at a particular school are normally distributed with a mean of 169 cm and a standard deviation of 9 cm.

• Given that 80% of these female students have a height less than $h$ cm, find the value of $h$.
• Given that 60% of these female students have a height greater than $s$ cm, find the value of $s$.

For $h$, use $z = \frac{x - \mu}{\sigma}$ with $P(Z < z) = 0.80$, which gives $z = 0.842$:

$$\frac{h - 169}{9} = 0.842 \quad\Rightarrow\quad h = 169 + 9 \times 0.842 = 176.58$$

For $s$, use $z = \frac{x - \mu}{\sigma}$ with $P(Z > z) = 0.60$, which gives $z = -0.253$:

$$\frac{s - 169}{9} = -0.253 \quad\Rightarrow\quad s = 169 - 9 \times 0.253 = 166.723$$

Batteries for a transistor radio have a mean life under normal usage of 160 hours, with a standard deviation of 30 hours. Assuming a normal distribution:

• Calculate the percentage of batteries which have a life between 150 hours and 180 hours. Answer: 37.8%
• Calculate the range, symmetrical about the mean, within which 75% of the battery lives lie. Answer: 125.5 hours to 194.5 hours

The masses of boxes of oranges are normally distributed such that 30% of them are greater than 4.00 kg and 20% are greater than 4.53 kg. Estimate the mean and standard deviation of the masses. Answer: mean 3.13 kg, standard deviation 1.67 kg

The speeds of cars passing a certain point on a motorway can be taken to be normally distributed. Observations show that, of cars passing the point, 95% are travelling at less than 85 kph and 10% are travelling at less than 55 kph.

• Find the average speed of the cars passing the point. Answer: 68 kph
• Find the proportion of cars that travel at more than 70 kph. Answer: 0.4282

Sometimes the normal distribution (a continuous distribution) is used to approximate situations that are really discrete. This occurs when data are measured to the nearest whole number.

• The distribution takes on the shape of a normal distribution. In fact, the normal curve was introduced by De Moivre as an approximation to the binomial.
• The discrete data are represented by their limits. E.g. 7 becomes the interval $6.5 \le X < 7.5$.

Normal Approximation to the Binomial Distribution

Notice that as N increases, the binomial distribution approaches a normal distribution.

[Figure: bar charts of binomial distributions for N = 5, 10, 20 and 30, each shown with p = 0.2 and p = 0.5.]

The binomial distribution can be approximated by a normal distribution provided that $np$ and $n(1 - p)$ are both greater than 5.

Careful of the language

• As a binomial is a discrete distribution, a continuity correction is necessary.
• P(at most 3): $P(X \le 3)$ becomes $P(X < 3.5)$
• P(fewer than 3): $P(X < 3)$ becomes $P(X < 2.5)$
• P(exactly 3): $P(X = 3)$ becomes $P(2.5 < X < 3.5)$
• P(more than 3): $P(X > 3)$ becomes $P(X > 3.5)$
• P(at least 3): $P(X \ge 3)$ becomes $P(X > 2.5)$

Example

• It is given that 40% of the population support the Gambage Party. 150 members of the population are selected at random. Use a suitable approximation to find the probability that more than 55 out of these 150 support the Gambage Party.

• Binomial distribution: $N = 150$, $p = 0.4$
• $Np = 60$ and $N(1 - p) = 90$, both greater than 5
• Use a normal distribution with $\mu = Np = 60$ and $\sigma = \sqrt{150 \times 0.4 \times 0.6} = 6$

$$z = \frac{55.5 - 60}{6} = -0.75$$

$P(X > 55) = P(Z > -0.75) = 0.7734$

Normal approximation to the Poisson distribution

Notice that as the value of $\lambda$ increases, the distribution becomes approximately normal.

[Figure: Poisson distributions for $\lambda = 2$, $\lambda = 3$, $\lambda = 5$ and $\lambda = 10$.]

As $\lambda$ increases, the normal approximation gets better. We use the criterion $\lambda > 15$. The approximating normal distribution has mean $\lambda$ and standard deviation $\sqrt{\lambda}$. Poisson is a discrete distribution, and hence we need to use a continuity correction.

The number of bacteria on a plate follows a Poisson distribution with a parameter 60. Find the probability that there are between 55 and 75 bacteria on a plate.

Since $\lambda = 60 > 15$, the normal approximation can be used:

$$z_1 = \frac{55.5 - 60}{\sqrt{60}} = -0.581 \qquad z_2 = \frac{74.5 - 60}{\sqrt{60}} = 1.872$$

$P(55 < X < 75) = 0.6887$

The number of bacteria on a plate follows a Poisson distribution with a parameter 60. A plate is rejected if fewer than 38 bacteria are found. If 2000 such plates are reviewed, how many will be rejected?
$$z = \frac{37.5 - 60}{\sqrt{60}} = -2.905$$

$P(X < 38) = 0.00183$

Number rejected $= 2000 \times 0.00183 \approx 3.7$, so about 4.

Sums and differences of normally distributed random variables

When two random variables are added, their sum is another random variable. For independent $X$ and $Y$, with $T = X + Y$:

$$E(T) = E(X + Y) = E(X) + E(Y)$$

$$VAR(T) = VAR(X + Y) = VAR(X) + VAR(Y)$$

Masses of a particular toy are normally distributed with mean 20 g and standard deviation 2 g. A random sample of 12 such toys is chosen. Find the probability that the total mass is greater than 230 g.

• Each toy mass is treated as an independent value.

$$E(T) = E(M_1 + M_2 + \dots + M_{12}) = 20 + 20 + \dots + 20 = 240 \text{ g}$$

$$VAR(T) = VAR(M_1 + M_2 + \dots + M_{12}) = 2^2 + 2^2 + \dots + 2^2 = 48$$

$$\sigma(T) = \sqrt{48} = 6.928$$

$$z = \frac{230 - 240}{\sqrt{48}} = -1.443$$

$P(T > 230) = P(Z > -1.443) = 0.9255$

When two random variables are subtracted, their difference is another random variable. For independent $X$ and $Y$, with $T = X - Y$:

$$E(T) = E(X - Y) = E(X) - E(Y)$$

$$VAR(T) = VAR(X - Y) = VAR(X) + VAR(Y)$$

A machine produces rubber balls whose diameters are normally distributed with a mean of 5.50 cm and standard deviation 0.08 cm. The balls are packed in cylindrical tubes whose inside diameters are normally distributed with mean 5.70 cm and standard deviation 0.12 cm. If a randomly selected ball is placed in a randomly selected tube, what is the probability that the clearance is between 0.05 cm and 0.25 cm?
Let $T$ be the tube diameter, $B$ the ball diameter and $C = T - B$ the clearance.

$$E(C) = E(T) - E(B) = 5.70 - 5.50 = 0.2$$

$$VAR(C) = VAR(T - B) = 0.08^2 + 0.12^2 = 0.0208$$

$$z_1 = \frac{0.05 - 0.2}{\sqrt{0.0208}} = -1.040 \qquad z_2 = \frac{0.25 - 0.2}{\sqrt{0.0208}} = 0.347$$

$P(0.05 < C < 0.25) = 0.4865$

Multiples of Independent Normal Variables

$$E(aX + bY) = aE(X) + bE(Y)$$

$$VAR(aX + bY) = a^2\,VAR(X) + b^2\,VAR(Y)$$

Great care must be taken in distinguishing between a sum of random variables and a multiple of a random variable.

A soft drinks manufacturer sells bottles of drinks in two sizes. The amount in each bottle is normally distributed.

Size    Mean (ml)   Variance (ml²)
Small   252         4
Large   1012        25

• A bottle of each size is selected at random. Find the probability that the large bottle contains less than four times the amount in the small bottle.
Here a single small bottle is multiplied by 4, so we need $P(L < 4S) = P(L - 4S < 0)$.

$$E(L - 4S) = E(L) - 4E(S) = 1012 - 4 \times 252 = 4$$

$$VAR(L - 4S) = VAR(L) + 4^2\,VAR(S) = 25 + 16 \times 4 = 89$$

$$z = \frac{0 - 4}{\sqrt{89}} = -0.424$$

$P(L - 4S < 0) = 0.3358$

• One large bottle and four small bottles are selected at random. Find the probability that the amount in the large bottle is less than the total amount in the four small bottles.

$$P(L < S_1 + S_2 + S_3 + S_4) = P(L - (S_1 + S_2 + S_3 + S_4) < 0)$$
$$E(L - (S_1 + S_2 + S_3 + S_4)) = E(L) - E(S_1 + S_2 + S_3 + S_4) = 1012 - 1008 = 4$$

$$VAR(L - (S_1 + S_2 + S_3 + S_4)) = VAR(L) + VAR(S_1 + S_2 + S_3 + S_4) = 25 + 4 + 4 + 4 + 4 = 41$$

$$z = \frac{0 - 4}{\sqrt{41}} = -0.625$$

$P(L - (S_1 + \dots + S_4) < 0) = 0.2661$

Check that you understand the difference between these two questions.
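The two bottle questions have the same mean difference but different variances, and the contrast can be checked numerically. The following is a minimal sketch using only the Python standard library. The helper names `standard_normal_cdf` and `prob_negative` are illustrative and not part of the notes. It computes the normal probability with `math.erf` instead of reading the chart, so the last decimal place may differ slightly from the table values.

```python
import math


def standard_normal_cdf(z):
    """P(Z < z) for a standard normal Z, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_negative(mean, variance):
    """P(D < 0) for a normally distributed D with the given mean and variance."""
    return standard_normal_cdf((0.0 - mean) / math.sqrt(variance))


# Values from the table in the notes (amounts in ml).
MEAN_S, VAR_S = 252.0, 4.0
MEAN_L, VAR_L = 1012.0, 25.0

# Multiple: L - 4S uses ONE small bottle scaled by 4,
# so the small-bottle variance is scaled by 4**2.
p_multiple = prob_negative(MEAN_L - 4 * MEAN_S, VAR_L + 4**2 * VAR_S)

# Sum: L - (S1 + S2 + S3 + S4) uses FOUR independent small bottles,
# so the small-bottle variance is simply added four times.
p_sum = prob_negative(MEAN_L - 4 * MEAN_S, VAR_L + 4 * VAR_S)

print(f"multiple of one bottle: {p_multiple:.4f}")
print(f"sum of four bottles:    {p_sum:.4f}")
```

Both results should agree with the values 0.3358 and 0.2661 above to about three decimal places. The only difference between the two calculations is the factor applied to the small-bottle variance: $4^2$ for a multiple, 4 for a sum of four independent bottles.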