Probability and Statistics
Chapter 6: Using the Normal Model

The Normal model is an idealized model. No data are ever perfectly normally distributed, but when data are roughly unimodal and symmetric, a Normal model can be helpful.

Normal Model: 68-95-99.7 Rule
68% of the data are within 1 SD of the mean
95% of the data are within 2 SD of the mean
99.7% of the data are within 3 SD of the mean

When creating a Normal model, you need to know two things: the mean and the standard deviation. This is often written as N(mean, standard deviation).

Sketch Normal models using the 68-95-99.7 Rule

Example 1: Birthweights of babies: N(7.6 lb, 1.3 lb)
Sketch a model…

Example 2: ACT scores at a specific college: N(21.2, 4.4)
Sketch a model…

Use the 68-95-99.7 Rule to answer the following questions

Example 3: A radar unit is used to measure speeds of cars on a motorway. The speeds are normally distributed with N(90 km/hr, 10 km/hr).
What is the probability that a car is traveling at more than 100 km/hr?
Sketch a model. Realize that 100 km/hr is z = 1. The Rule puts 68% of the data within 1 SD, which leaves 32% for the two tails, so the upper tail (more than z = 1) is 16%.

Example 4: The lengths of similar components produced by a company are approximated by N(5 cm, 0.02 cm).
What is the probability that the length of a component is between 4.98 cm and 5.02 cm?
Sketch a Normal model… Realize that you are between z = -1 and z = 1, so using the Rule, 68%.
What is the probability that the length of a component is between 4.96 cm and 5.04 cm?
Sketch a Normal model… Realize that you are between z = -2 and z = 2, so using the Rule, 95%.

Using the Normal model and the 68-95-99.7 Rule to find percentages of data when you know the z-score:
% of data with -1 < z < 1: 68%
% of data with -2 < z < 2: 95%
% of data with z < 1: 84%
% of data with z > -1: 84%
% of data with z < -2 or z > 2: 5%

What about intervals whose z-scores are not integers (not exactly 1, 2, or 3 standard deviations away from the mean)?
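The Rule's percentages, and the same question for z-scores that are not whole numbers, can also be checked in software. What follows is a minimal sketch in Python, assuming the standard library's statistics.NormalDist is an acceptable stand-in for a sketch or a table; the helper name percent_between is made up for this illustration and is not part of the notes.

```python
from statistics import NormalDist

def percent_between(mean, sd, low, high):
    """Percent of a Normal(mean, sd) model that lies between low and high.

    Hypothetical helper for illustration only.
    """
    model = NormalDist(mu=mean, sigma=sd)
    return 100 * (model.cdf(high) - model.cdf(low))

# 68-95-99.7 Rule on the standard Normal model N(0, 1)
for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {percent_between(0, 1, -k, k):.1f}%")

# Example 3: speeds N(90, 10); percent of cars faster than 100 km/hr
over_100 = 100 * (1 - NormalDist(90, 10).cdf(100))
print(f"faster than 100 km/hr: {over_100:.1f}%")

# Example 4: lengths N(5, 0.02); percent between 4.96 cm and 5.04 cm
print(f"between 4.96 and 5.04 cm: {percent_between(5, 0.02, 4.96, 5.04):.1f}%")
```

The computed values are slightly more precise than the Rule (about 68.3%, 95.4%, and 99.7%), which is why the Rule's 16% for Example 3 comes out closer to 15.9% here. The same helper accepts cutoffs that are not a whole number of SDs from the mean.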
If the z-scores are not integers, we use a table to compute percentages, or use normalcdf(lower z, upper z) on the calculator. A bound of -99 or 99 stands in for "no lower limit" or "no upper limit".
% of data with z < -1.59: 0.0559
% of data with z > 2.57: 0.0051
% of data with -0.47 < z < 1.17: 0.5598

If you don't know the z-score, but are given the data range, you:
i) Calculate the z-scores
ii) Use the z-scores to find the percentage

Example 5: The fuel efficiency of cars can be described by the Normal model. The mean is 24 mpg with a standard deviation of 6 mpg.
What percent of all cars get less than 15 mpg?
(15 - 24)/6 = -1.5
normalcdf(-99, -1.5) = 6.68%
What percent of all cars get between 20 and 30 mpg?
(20 - 24)/6 = -4/6
(30 - 24)/6 = 1
normalcdf(-4/6, 1) = 58.9%
What percent of cars get more than 40 mpg?
(40 - 24)/6 = 16/6
normalcdf(16/6, 99) = 0.383%

Example 6: Birthweights of babies: N(7.6 lb, 1.3 lb)
What percent of babies weigh more than 10 pounds?
(10 - 7.6)/1.3 = 1.846
normalcdf(1.846, 99) = 3.24%
I was born weighing 5.81 pounds. What percentage of newborn babies weighed less than me?
(5.81 - 7.6)/1.3 = -1.377
normalcdf(-99, -1.377) = 8.426%

NEW STUFF: Going backwards
If you are given the percentage, you can go backwards and find the original data. The first question is done for you as an example; use it to answer the others.

Example 7: Using the fuel efficiency example with N(24, 6).
A car with fuel efficiency greater than 80% of the other cars will have what mpg?
I want to find the z-score that has 80% of the Normal model below it. I look in my table, not on the edges, but in the middle, to find the decimal closest to .8000.
invNorm(0.8) = 0.8416 = (x - 24)/6
x = 29.0 mpg
What is the mpg range for the middle 50% of the cars?
invNorm(0.25) = -0.6745 = (x - 24)/6
x = 20.0
invNorm(0.75) = 0.6745 = (x - 24)/6
x = 28.0
Middle 50%: 20 to 28 mpg
What mpg marks the bottom 30% of fuel efficiency?
invNorm(0.3) = -0.5244 = (x - 24)/6
x = 20.9 mpg (cars at or below about 20.9 mpg are in the bottom 30%)
Describe the fuel efficiency of the worst 20% of all cars.
invNorm(0.2) = -0.8416 = (x - 24)/6
x = 18.95 mpg (the worst 20% of cars get less than about 18.95 mpg)

Example 8: Consider the birthweight example again: N(7.6, 1.3)
What weight corresponds to the third quartile (greater than 75% of babies)?
invNorm(0.75) = 0.6745 = (x - 7.6)/1.3
x = 8.48 lb
What weight range corresponds to the middle 50%?
invNorm(0.25) = -0.6745 = (x - 7.6)/1.3
x = 6.72 lb
Together with the third quartile of 8.48 lb, the middle 50% is 6.72 to 8.48 lb.
What weights represent the extreme 2% (top 1% and bottom 1%)?
invNorm(0.01) = -2.326 = (x - 7.6)/1.3
x = 4.58 lb, so x < 4.58 lb
invNorm(0.99) = 2.326 = (x - 7.6)/1.3
x = 10.6 lb, so x > 10.6 lb

If you are given the % and the data value itself, you can find the z-score and then solve to find a missing value of the mean or standard deviation.
If you are given a %: invNorm(%) tells you the z-score where that % of the data is less than it.
Ex: invNorm(0.84) is approximately 1, because 84% of the data is less than the value where z = 1.
From the z-score, you can find the original data point if you know the mean and SD of the Normal model.
If you already know the data point, you can find the mean or SD if one of them is unknown.

Example 9: The ACT scores of admitted students for State University are approximated by N(22.6, 2.1).
- Carlos knows that he must be in the top 25% in order to be admitted. What score must he obtain in order to be admitted?
invNorm(0.75) = 0.6745 = (x - 22.6)/2.1
x = 24.0
- Suppose he obtains this score and applies to another university where the mean admitted score is 24.4. What is the standard deviation for this school if he is in the top 60%?
invNorm(0.4) = -0.2533 = (24 - 24.4)/SD
SD = -0.4/-0.2533 = 1.58
(Note: You use 40% because 40% is to the left of, or less than, him.)

Example 10: At a local orchard, the number of apples that fell from each tree was approximated by a Normal model with a standard deviation of 10.2 apples.
- 20% of the trees had fewer than 100 apples fall from them. What is the mean of the Normal model?
invNorm(0.2) = -0.8416 = (100 - mean)/10.2
mean = 108.58 apples
- How many apples fell from at least 75% of the trees?
invNorm(0.25) = -0.6745 = (x - 108.58)/10.2
x = 101.7 apples (75% of the trees dropped at least about 101.7 apples)

If you are given two data points and two corresponding %, you can find the mean and the standard deviation by solving a system of equations.
Find the z-scores for each.
Then set up two equations with two unknowns (mean, SD).
Solve for the mean and SD of the Normal model.

Example 11: At State College, the middle 50% of the SAT scores for accepted students was in the range 1130 to 1290. What were the mean and standard deviation, assuming the scores were approximated by a Normal model?
invNorm(0.25) = -0.6745 = (1130 - mean)/SD, so -0.6745 SD = 1130 - mean
invNorm(0.75) = 0.6745 = (1290 - mean)/SD, so 0.6745 SD = 1290 - mean
(Use a system of equations: add the two equations.)
0 = 2420 - 2*mean
Mean = 1210
SD = (1290 - 1210)/0.6745 = 118.6
What score marked the top 90%?
invNorm(0.1) = -1.28 = (x - 1210)/118.6
x = 1058

Example 12: At 32 years old, Kobe Bryant is older than 68% of all NBA players. At 25 years old, LeBron James is older than only 22% of all NBA players. Assuming the ages can be approximated by a Normal model, what are the average age and the standard deviation of all NBA players?
invNorm(0.68) = 0.4677 = (32 - mean)/SD, so 0.4677 SD = 32 - mean
invNorm(0.22) = -0.7722 = (25 - mean)/SD, so -0.7722 SD = 25 - mean
1.2399 SD = 7 (Subtract Eqn 2 from Eqn 1)
SD = 5.65 years
Mean = 32 - 0.4677(5.65) = 29.36 years
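
The two-point method of Examples 11 and 12 can be sketched in code as a cross-check on the hand algebra. This is a minimal illustration in Python, assuming statistics.NormalDist().inv_cdf from the standard library plays the role of invNorm; the function name fit_normal and its argument order are invented for this sketch and are not from the notes.

```python
from statistics import NormalDist

def fit_normal(x1, p1, x2, p2):
    """Find (mean, sd) of a Normal model from two data values and the
    fraction of data below each one.

    Hypothetical helper: solves the system
        z1 = (x1 - mean) / sd
        z2 = (x2 - mean) / sd
    where z1 and z2 come from the inverse Normal CDF (invNorm).
    """
    z1 = NormalDist().inv_cdf(p1)   # z-score with fraction p1 below it
    z2 = NormalDist().inv_cdf(p2)   # z-score with fraction p2 below it
    sd = (x2 - x1) / (z2 - z1)      # subtract the two equations
    mean = x1 - z1 * sd             # back-substitute into the first
    return mean, sd

# Example 11: middle 50% of SAT scores runs from 1130 to 1290
sat_mean, sat_sd = fit_normal(1130, 0.25, 1290, 0.75)
print(f"SAT: mean = {sat_mean:.0f}, SD = {sat_sd:.1f}")

# Score that 90% of accepted students exceed (10% below it)
cutoff = NormalDist(sat_mean, sat_sd).inv_cdf(0.10)
print(f"top 90% cutoff = {cutoff:.0f}")

# Example 12: age 25 is above 22% of players, age 32 is above 68%
age_mean, age_sd = fit_normal(25, 0.22, 32, 0.68)
print(f"NBA ages: mean = {age_mean:.2f}, SD = {age_sd:.2f}")
```

Subtracting the two equations eliminates the mean, which is the same step labeled "Subtract Eqn 2 from Eqn 1" in Example 12; the sketch simply carries more decimal places than the hand calculation.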