THE NORMAL DISTRIBUTION
CHAPTER 6

INTRODUCTION
• Researchers often use the normal distribution to determine normal intervals for specific medical tests
• Many continuous variables have distributions that are bell-shaped
• This type of distribution is known as a bell curve or a Gaussian distribution (named for the German mathematician Carl Friedrich Gauss)

KEY TERMS
• Symmetric distribution
  • The data values are evenly distributed about the mean
• Negatively or left-skewed distribution
  • The majority of data values fall to the right of the mean; the mean is left of the median, and the mean and median are left of the mode
• Positively or right-skewed distribution
  • The majority of data values fall to the left of the mean; the mean is right of the median, and the mean and median are right of the mode

6.1 – NORMAL DISTRIBUTIONS
• A theoretical curve, called a normal distribution, can be used to study many variables that are not perfectly normally distributed but are approximately normal
• Mathematical expression for a normal distribution:
  $$y = \frac{e^{-(X-\mu)^2 / (2\sigma^2)}}{\sigma\sqrt{2\pi}}$$
• Where
  $e \approx 2.718$
  $\pi \approx 3.14$
  $\mu$ = population mean
  $\sigma$ = population standard deviation

NORMAL DISTRIBUTION
• Normal distribution
  • A continuous, symmetric, bell-shaped distribution of a variable
• The shape and position of a normal distribution curve depend on two parameters, the mean and the standard deviation

PROPERTIES OF NORMAL DISTRIBUTIONS
1. The normal distribution curve is bell-shaped
2. The mean, median, and mode are equal and are located at the center of the distribution
3. The normal distribution curve is unimodal
4. The curve is symmetric about the mean
5. The curve is continuous, with no gaps or holes
6. The curve never touches the X-axis
7. The total area under a normal distribution curve is 1.00 or 100%
8. The area under the part of a normal curve that lies within 1 standard deviation of the mean is approx. 68%, within 2 standard deviations approx. 95%, and within 3 standard deviations, approx.
99.7%

STANDARD NORMAL DISTRIBUTION
• Standard normal distribution
  • A normal distribution with a mean of 0 and a standard deviation of 1
• Any normally distributed variable can be transformed into the standard normally distributed variable by using the standard score formula:
  $$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{X - \mu}{\sigma}$$

PROCEDURE FOR FINDING AREA UNDER NORMAL CURVE
• The z value is the number of standard deviations that a particular X value lies away from the mean. Table E in Appendix C gives the area under the curve to the left of any z value

Procedure:
1. To the left of any z value:
   • Look up the z value in the table and use the area given
2. To the right of any z value:
   • Look up the z value and subtract the area from 1
3. Between any two z values:
   • Look up both z values and subtract the smaller area from the larger area

EXAMPLES
• 6–1
  • Find the area to the left of z = 1.99
• 6–2
  • Find the area to the right of z = -1.16
• 6–3
  • Find the area between z = +1.68 and z = -1.37

EXAMPLES
• A normal distribution curve can be used as a probability distribution curve for normally distributed variables
• NOTE: In a continuous distribution, the probability of any exact z value is 0
• 6–4
  • Find each probability:
    A. $P(0 < z < 2.32)$
    B. $P(z < 1.65)$
    C. $P(z > 1.91)$
• 6–5
  • Find the z value such that the area under the standard normal distribution curve between 0 and the z value is 0.2123

6.2 – APPLICATIONS OF THE NORMAL DISTRIBUTION
• To solve problems by using the standard normal distribution, transform the original variable to a standard normal distribution variable by using the z score formula

EXAMPLES
• 6–6
  • A survey by the National Retail Federation found that women spend on average $146.21 for the Christmas holidays. Assume the standard deviation is $29.44. Find the percentage of women who spend less than $160.00. Assume the variable is normally distributed.

EXAMPLES
• 6–7
  • Each month, an American household generates an average of 28 pounds of newspaper for garbage or recycling.
Assume the standard deviation is 2 pounds. If a household is selected at random, find the probability of its generating
    A. between 27 and 31 pounds per month
    B. more than 30.2 pounds per month

EXAMPLE 6–9
• To qualify for a police academy, candidates must score in the top 10% on a general abilities test. The test has a mean of 200 and a standard deviation of 20. Find the lowest possible score to qualify. Assume the test scores are normally distributed.

EXAMPLES
• Formula for finding X
  • When you must find the value of X, you can use the following formula:
    $$X = z \cdot \sigma + \mu$$
• 6–10
  • For a medical study, a researcher wishes to select people in the middle 60% of the population based on blood pressure. If the mean systolic blood pressure is 120 and the standard deviation is 8, find the upper and lower readings that would qualify people to participate in the study.

6.3 – CENTRAL LIMIT THEOREM
• Sampling distribution of sample means
  • A distribution using the means computed from all possible random samples of a specific size taken from a population
• Sampling error
  • The difference between the sample measure and the corresponding population measure, due to the fact that the sample is not a perfect representation of the population

PROPERTIES
• When all possible samples of a specific size are selected with replacement from a population, the distribution of the sample means for a variable has two important properties:

Properties of the Distribution of Sample Means
1. The mean of the sample means will be the same as the population mean
2.
The standard deviation of the sample means will be smaller than the standard deviation of the population, and it will be equal to the population standard deviation divided by the square root of the sample size

STANDARD ERROR OF THE MEAN
• If all possible samples of size n are taken with replacement from the same population, then the mean of the sample means equals
  $$\mu_{\bar{X}} = \mu$$
  and the standard deviation of the sample means equals
  $$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
• Standard error of the mean
  • The standard deviation of the sample means

CENTRAL LIMIT THEOREM
• The Central Limit Theorem
  • As the sample size n increases without limit, the shape of the distribution of the sample means taken with replacement from a population with mean $\mu$ and standard deviation $\sigma$ will approach a normal distribution. This distribution will have a mean $\mu$ and a standard deviation $\sigma / \sqrt{n}$

NEW FORMULA FOR SAMPLE MEANS
• $$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
• Two important things to remember when using the Central Limit Theorem
  1. When the original variable is normally distributed, the distribution of the sample means will be normally distributed for any sample size n
  2. When the distribution of the original variable might not be normal, a sample size of 30 or more is needed to use a normal distribution to approximate the distribution of the sample means. The larger the sample, the better the approximation will be

EXAMPLES
• 6–13
  • A.C. Nielsen reported that children between the ages of 2 and 5 watch an average of 25 hours of television per week. Assume the variable is normally distributed and the standard deviation is 3 hours. If 20 children between the ages of 2 and 5 are randomly selected, find the probability that the mean of the number of hours they watch television will be greater than 26.3 hours.

EXAMPLES
• 6–14
  • The average age of a vehicle registered in the United States is 8 years, or 96 months. Assume the standard deviation is 16 months. If a random sample of 36 vehicles is selected, find the probability that the mean of their ages is between 90 and 110 months.
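
The sample-mean formula from this section can be checked numerically. The following is a minimal sketch, not the textbook's method: it assumes Python's standard-library `statistics.NormalDist` as a stand-in for Table E, and the helper name `z_for_sample_mean` is made up for illustration. Because the software does not round z to two decimal places, the results may differ from a table lookup in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1); used in place of Table E.
STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_for_sample_mean(x_bar, mu, sigma, n):
    """Return z = (x_bar - mu) / (sigma / sqrt(n)) for a sample mean (hypothetical helper)."""
    standard_error = sigma / sqrt(n)  # standard error of the mean
    return (x_bar - mu) / standard_error


# Example 6-13: mu = 25 hours, sigma = 3 hours, n = 20, find P(sample mean > 26.3)
z_tv = z_for_sample_mean(26.3, mu=25, sigma=3, n=20)
p_tv = 1 - STD_NORMAL.cdf(z_tv)  # area to the right of z

# Example 6-14: mu = 96 months, sigma = 16 months, n = 36, find P(90 < sample mean < 110)
z_low = z_for_sample_mean(90, mu=96, sigma=16, n=36)
z_high = z_for_sample_mean(110, mu=96, sigma=16, n=36)
p_age = STD_NORMAL.cdf(z_high) - STD_NORMAL.cdf(z_low)  # area between the two z values

print(f"6-13: z = {z_tv:.2f}, P = {p_tv:.4f}")
print(f"6-14: z = {z_low:.2f} to {z_high:.2f}, P = {p_age:.4f}")
```

The three cases of the area procedure (left of z, right of z, between two z values) map onto `cdf(z)`, `1 - cdf(z)`, and a difference of two `cdf` calls.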
EXAMPLES
• 6–15
  • The average number of pounds of meat that a person consumes per year is 218.4 pounds. Assume that the standard deviation is 25 pounds and the distribution is approximately normal.
    a. Find the probability that a person selected at random consumes less than 224 pounds per year
    b. If a sample of 40 individuals is selected, find the probability that the mean of the sample will be less than 224 pounds per year

6.4 – NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION
• A normal approximation to a binomial distribution is best used when $n \cdot p \ge 5$ and $n \cdot q \ge 5$
• In addition, a correction for continuity may be used in the normal approximation
• Correction for continuity
  • A correction employed when a continuous distribution is used to approximate a discrete distribution
  • For any specific value of X, the boundaries of X in the binomial distribution must be used

SUMMARY OF NORMAL APPROXIMATION TO BINOMIAL DISTRIBUTION
  Binomial          Normal
  $P(X = a)$        $P(a - 0.5 < X < a + 0.5)$
  $P(X \ge a)$      $P(X > a - 0.5)$
  $P(X > a)$        $P(X > a + 0.5)$
  $P(X \le a)$      $P(X < a + 0.5)$
  $P(X < a)$        $P(X < a - 0.5)$
For all cases, $\mu = n \cdot p$, $\sigma = \sqrt{n \cdot p \cdot q}$, $n \cdot p \ge 5$, and $n \cdot q \ge 5$

PROCEDURES
Procedure for Normal Approximation to Binomial Distribution
1. Check to see whether the normal approximation can be used
2. Find the mean $\mu$ and the standard deviation $\sigma$
3. Write the problem in probability notation, using X
4. Rewrite the problem by using the continuity correction factor, and show the corresponding area under the normal distribution
5. Find the corresponding z values
6. Find the solution

EXAMPLES
• 6–16
  • A magazine reported that 6% of American drivers read the newspaper while driving. If 300 drivers are selected at random, find the probability that exactly 25 say they read the newspaper while driving.

EXAMPLES
• 6–17
  • Of the members of a bowling league, 10% are widowed. If 200 bowling league members are selected at random, find the probability that 10 or more will be widowed.
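
The six-step procedure above can be sketched in code for Examples 6–16 and 6–17. This is an illustrative sketch under stated assumptions: `binomial_as_normal` is a hypothetical helper name, and `statistics.NormalDist` from the Python standard library replaces the table lookup, so the probabilities may differ slightly from answers based on z values rounded to two decimals.

```python
from math import sqrt
from statistics import NormalDist


def binomial_as_normal(n, p):
    """Steps 1-2: check n*p >= 5 and n*q >= 5, then return Normal(mu = n*p, sigma = sqrt(n*p*q))."""
    q = 1 - p
    if n * p < 5 or n * q < 5:
        raise ValueError("normal approximation not recommended: need n*p >= 5 and n*q >= 5")
    return NormalDist(mu=n * p, sigma=sqrt(n * p * q))


# Example 6-16: n = 300, p = 0.06, find P(X = 25)
# Continuity correction: P(X = 25) becomes P(24.5 < X < 25.5)
drivers = binomial_as_normal(300, 0.06)
p_exactly_25 = drivers.cdf(25.5) - drivers.cdf(24.5)

# Example 6-17: n = 200, p = 0.10, find P(X >= 10)
# Continuity correction: P(X >= 10) becomes P(X > 9.5)
league = binomial_as_normal(200, 0.10)
p_at_least_10 = 1 - league.cdf(9.5)

print(f"6-16: mu = {drivers.mean:.2f}, sigma = {drivers.stdev:.2f}, P = {p_exactly_25:.4f}")
print(f"6-17: mu = {league.mean:.2f}, sigma = {league.stdev:.2f}, P = {p_at_least_10:.4f}")
```

Calling `cdf` on a `NormalDist` built with the problem's own mean and standard deviation is equivalent to converting to z first; steps 5 and 6 are folded into that single call.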
EXAMPLES
• 6–18
  • If a baseball player's batting average is 0.320 (32%), find the probability that the player will get at most 26 hits in 100 times at bat.
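
As a rough check on how well the approximation works, Example 6–18 can be computed both ways: with the continuity-corrected normal approximation and with the exact binomial sum. This comparison is an addition for illustration, not part of the chapter's procedure; it assumes only Python's standard library (`statistics.NormalDist` and `math.comb`).

```python
from math import comb, sqrt
from statistics import NormalDist

# Example 6-18: n = 100 at-bats, p = 0.32, find P(X <= 26)
n, p = 100, 0.32
q = 1 - p

# The approximation is allowed here because n*p = 32 and n*q = 68 are both at least 5.
mu = n * p
sigma = sqrt(n * p * q)

# Continuity correction: P(X <= 26) becomes P(X < 26.5) under the normal curve.
p_normal = NormalDist(mu, sigma).cdf(26.5)

# Exact binomial probability: sum of P(X = k) for k = 0, 1, ..., 26.
p_exact = sum(comb(n, k) * p**k * q ** (n - k) for k in range(27))

print(f"normal approximation: {p_normal:.4f}")
print(f"exact binomial:       {p_exact:.4f}")
print(f"absolute difference:  {abs(p_normal - p_exact):.4f}")
```

With n this large and p not too close to 0 or 1, the two results should agree closely, which is the practical reason for the $n \cdot p \ge 5$ and $n \cdot q \ge 5$ check.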