Normal Curve, Standard Deviation as Ruler, and Estimations about Populations
AP Statistics

Normal Model
• When we take a sample and get real data, we display it as a histogram.
• For example, the histogram to the right displays the distribution of 1000 IQs. The mean of the sample is about 100 and the standard deviation is approximately 15.

Normal Model
• This distribution tells us only about this single sample.
• In statistics, we want to gain information about a population.
• Under the right circumstances, the information about our sample can be used to gain information about the population.

Normal Model
• As long as the sample was well obtained, the histogram provides a good estimate of the shape, center, and spread of the population's distribution.
• This estimate of the population is illustrated with a MODEL.
• If the sample is unimodal and approximately symmetric, the model we use is called the NORMAL MODEL.

Normal Model
• The graph to the left is the Normal Model that represents the IQ scores of all adults.
• It has a mean of 100 (highest point) and a standard deviation of 15.
• This is called a Normal model. (The name "standard Normal model" is reserved for the Normal model with mean 0 and standard deviation 1.)

Normal Model
• The Normal curve is a model of reality, not reality itself.
• It is not perfect, but it will give us a good general idea of how the population looks.

Histogram (real data) vs. Normal Model (model of population)

                      Histogram        Normal Model
Represents            Real data        Model of population
Describes             Sample           Population
Values are called     Statistics       Parameters
Shape                 Histogram        Normal curve (or other)
Mean                  $\bar{y}$        $\mu$
Standard deviation    $s$              $\sigma$

Normal Model
• Is always defined by the mean and standard deviation: $N(\mu, \sigma)$

Normal Model Summary
• Perfectly symmetric and unimodal
• Continues infinitely in both directions.
  It is asymptotic to the horizontal axis as it approaches $\pm\infty$
• It is defined by only two parameters, mean and standard deviation
• Centered at the mean
• The "points of inflection," where the curve changes from curving downward to curving upward, occur at exactly $\mu \pm 1\sigma$
• The total area under the curve equals 1.

Nearly Normal Condition
• When we use the normal model, we are assuming that the data is normal (symmetric).
• Therefore, before you use a normal model to help in your analysis, you need to check that the data is basically normal. Remember, real-world data is not perfectly normal.
• Use: Nearly Normal Condition: The shape of the data's distribution is unimodal and symmetric

Nearly Normal Condition
Two ways to check the Nearly Normal Condition:
1. Make a histogram. Also, look at the mean and median.
2. Make a Normal probability plot. If the distribution of the data is roughly Normal, the plot is roughly a diagonal straight line.

Normal Probability Plot
[Figure: two Normal probability plots, one labeled "Normal" and one labeled "Not Normal"]

Normal Probability Plot (calculator)
• We are plotting the data along the y-axis in this example

The Standard Deviation as a Ruler
John recently scored a 113 on Test A. The scores on the test are distributed with a mean of 100 and a standard deviation of 10. Mary took a different test, Test B, and scored 263. The scores on her test are distributed with a mean of 250 and a standard deviation of 25. Which student did relatively better on his or her particular test?
(A) John did better on his test
(B) Mary did better on her test
(C) They both performed equally well
(D) It is impossible to tell since they did not take the same test
(E) It is impossible to tell since the number of students taking the test is unknown.

The Standard Deviation as a Ruler
• We need to have a level playing field: we need a way to make the numbers mean the same thing.
• We do that by using a z-score
• A z-score measures how many standard deviations a point is from the mean
• NO UNITS for a z-score
• $z = \frac{y - \bar{y}}{s}$

The Standard Deviation as a Ruler
• Standardizing data into z-scores does not change the shape of the data.
• Standardizing data into z-scores does change the center by making the mean 0.
• Standardizing data into z-scores does change the spread by making the standard deviation 1.
• Positive z-score: data point above the mean
• Negative z-score: data point below the mean

Standard Deviation as a Ruler: Examples
• The SATs have a distribution with a mean of 1500 and a standard deviation of 250. Suppose you score 1850. How many standard deviations away from the mean is your score?
• Suppose your friend took the ACT, whose scores are distributed with a mean of 20.8 and a standard deviation of 4.8. What score would your friend need in order to have done as well as you did on the SATs?

Estimations About Populations
• Many times, we use the Normal Model to make estimations or predictions about a population.
• Later in the course, we will use it to determine just how "unusual" something is.

Estimations About Populations
What percent of the population of adults would you expect to have an IQ between 70 and 130?

Estimations About Populations
The verbal section of the SAT test is approximately normally distributed with a mean of 500 and a standard deviation of 100. Approximately what percent of students will score between 400 and 600 on the verbal part of the exam?
The verbal section of the SAT test is approximately normally distributed with a mean of 500 and a standard deviation of 100. Approximately what percent of students will score above 700?

Estimations About Populations
The verbal section of the SAT test is approximately normally distributed with a mean of 500 and a standard deviation of 100.
Approximately what percent of students will score between 350 and 620?

Estimations About Populations
The results of a placement test for an exclusive private school are normally distributed, with a mean of 56 and a standard deviation of 12. Approximately what percent of students who take the test will score below 40?

Estimations About Populations
The verbal section of the SAT test is approximately normally distributed with a mean of 500 and a standard deviation of 100. What is the lowest score someone could receive to be in the top 10% of all scores?

Estimations About Populations
The results of a placement test for an exclusive private school are normally distributed, with a mean of 56 and a standard deviation of 12. The school will only accept students who score in the top 15% of all scores. What is the minimum score a student would need in order to be accepted?

Estimations About Populations
The verbal section of the SAT test is approximately normally distributed with a mean of 500 and a standard deviation of 100. What range of scores contains the middle 50% of the data?

Estimations About Populations
What do you need to show?
1. Check the Nearly Normal Condition
2. Draw the normal curve model with proper notation (use parameter notation)
3. Find the values you are looking for in the model and shade in the appropriate region
4. Convert to z-scores
5. Find the area in the shaded region (P-value): $P(350 < y < 620) = P(-1.5 < z < 1.2) = 0.818$
6. Interpret your results in context
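
Steps 4 and 5 of the checklist (convert to z-scores, then find the shaded area) can also be checked in software rather than with a table or calculator. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`. The helper names `z_score`, `area_between`, and `cutoff_for_top` are hypothetical and are not part of the original slides.

```python
from statistics import NormalDist


def z_score(y, mean, sd):
    """Number of standard deviations that y lies from the mean."""
    return (y - mean) / sd


def area_between(low, high, mean, sd):
    """Area under the N(mean, sd) curve between low and high."""
    model = NormalDist(mu=mean, sigma=sd)
    return model.cdf(high) - model.cdf(low)


def cutoff_for_top(fraction, mean, sd):
    """Lowest value that still falls in the top `fraction` of N(mean, sd)."""
    model = NormalDist(mu=mean, sigma=sd)
    return model.inv_cdf(1 - fraction)


# SAT verbal example from the slides: N(500, 100), scores between 350 and 620.
z_low = z_score(350, 500, 100)
z_high = z_score(620, 500, 100)
area = area_between(350, 620, 500, 100)
print(z_low, z_high, round(area, 3))  # -1.5 1.2 0.818
```

The same pattern applies to the "top 10%" and "top 15%" questions: `cutoff_for_top(0.10, 500, 100)` works backward from an area to a score. The software result still needs steps 1 to 3 and step 6 (checking the condition, drawing the model, and interpreting in context).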