AP STATS CH 9 Sampling Distributions                Name:_______________

AP Stats Section 9.1 Sampling Distributions
Objective: Define parameters & statistics, sampling variability, and bias; describe sampling distributions

Warm Up
The average grade on our final is 83.9% with an SD of 2.04%. Use 84 for simplicity.
A: What percent of students will be more than 4% away from the average?
B: What is the z-score of the student with a perfect exam?
C: What grade would put you in the 90th percentile? Can we use the normal distribution to do this? Should you redo 1?
D: Suppose we apply the curve y = 2 + 1.08x to the distribution. What would the new mean and variance be?
E: Describe the distribution of our grades.

Key Concept:
Define: parameter
Notation: mean of a population is ___________, proportion of a population is ______________.
Define: statistic
Notation: mean of a sample is________________, proportion of a sample is ________________.
**Statistics come from ___________________ and parameters come from _________________________.

1) A group of students is surveyed from the GNHS student body. The mean height of the 48 students was 68.31 inches. How would we describe the number 68.31 inches?
   (a) p = 68.31    (b) μ = 68.31    (c) p̂ = 68.31    (d) x̄ = 68.31
2) For a given sample or population, what range of values can p and p̂ take on? Explain why.
3) How much on average do American households earn? The government’s Population Survey contacted a sample of 113,146 households in March 2005. Their mean income in 2004 was x̄ = $60,528. The population that the poll wants to draw conclusions about is all 113 million U.S. households. Define, in words, x̄ and μ.
4) Is it safe to say the mean income of all 113 million U.S. households is approximately $60,528? Explain your reasoning.
5) A survey of 18 AP Stats students at GNHS resulted in 12 of the 18 stating they do NOT believe marijuana should be legalized.
   The survey was intended to draw conclusions about North AP Stats students’ opinions on the legalization of marijuana. Define, in words, p and p̂.
6) Suppose we took another sample of 18 North AP Stats students. Would you expect to get 12/18 students stating they do NOT believe marijuana should be legal? Explain your reasoning.
7) This concept is known as Sampling Variability. Different samples will produce different results.
8) If samples produce different results, then how do we know what the true value of the parameter is? In other words, how could we be more confident in our estimate of the mean income for U.S. households?
9) A sampling distribution of a mean is the distribution (histogram) of values taken on by the sample mean in all possible samples of the same size from the same population.
10) Consider a population of five people with heights 5, 5, 6, 6, and 7 feet. Now suppose we took samples of size 2 without replacement. Construct a sampling distribution of sample means for all 10 samples where n = 2.

1) A sampling distribution of a proportion is the distribution (histogram) of values taken on by p̂ in all possible samples of the same size from the same population.
2) Does a piece of buttered toast land butter side down more often than not? Let’s simulate the idea of flipping a piece of toast 20 times and count the number of buttered sides that land face down. Use the calculator command randBin(20,.5,10)/20 and store the results in L1.
   A: What does the command do?
   B: Use the data to create a sample proportion histogram.
   C: Using what you know about binomials, what is the expected proportion?

Visual of Bias and Variability:
3) A statistic used to estimate a parameter is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
4) The variability of a statistic is described by the spread of its sampling distribution. This spread is determined by the sampling design and the size of the sample.
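One way to check a hand-built answer to question 10 (the population with heights 5, 5, 6, 6, and 7 feet) is to list every sample by computer. The sketch below is only an illustration in Python, which this handout does not otherwise use; the variable names are made up for the example. It enumerates all 10 samples of size 2, tallies the sample means, and compares the mean of those sample means with the population mean, which is the comparison behind the definition of an unbiased statistic.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Population from question 10 (heights in feet).
heights = [5, 5, 6, 6, 7]

# Every sample of size 2 drawn without replacement.
# combinations() treats the two 5s (and the two 6s) as different people.
samples = list(combinations(heights, 2))

# Sample mean of each sample, kept as exact fractions to avoid rounding.
sample_means = [Fraction(a + b, 2) for a, b in samples]

# Sampling distribution: how many samples give each possible mean.
distribution = Counter(sample_means)

population_mean = Fraction(sum(heights), len(heights))
mean_of_sample_means = Fraction(sum(sample_means), len(sample_means))

print("number of samples:", len(samples))
for value in sorted(distribution):
    print(f"sample mean {float(value):.1f} occurs {distribution[value]} time(s)")
print("population mean:     ", float(population_mean))
print("mean of sample means:", float(mean_of_sample_means))
```

Both means come out to 5.8 feet, so for this small population the sample mean behaves as an unbiased estimator of the population mean.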
[Figure: four panels labeled High bias, low variability; Low bias, high variability; High bias, high variability; Low bias, low variability]

Section 9.2 Sample Proportions
Objective: Find mean & std. dev. of samp. distrib. for proportions; establish rules for normal approximation

Warm Ups
What are the following symbols?    x̄    p̂    p

Key Concept: You can only use the standard deviation formula when you have a large population.
1) Rule for Using the Standard Deviation for a Sample Proportion
   When N ≥ 10n:    σp̂ = √( p(1 − p) / n )        (N = size of population, n = sample size)
   In other words, if our sample (represented by n) was 10 people, then the population must consist of at least 100 people to use the standard deviation formula above.

Key Concept: To use the normal approximation you must not be near the endpoints.
2) These are the same rules as Normal Approximation for a Binomial
   1. np ≥ 10
   2. n(1 − p) ≥ 10, or nq ≥ 10

Key Concept:
3) Why is it beneficial for us to be able to use a Normal Approximation?

4) Example (modeled):
   - One way of checking the effect of undercoverage, nonresponse, and other sources of error in a sample survey is to compare the sample with known facts about the population. About 11% of American adults are black.
   - What would you expect the proportion p̂ of blacks in an SRS of 1500 to be?
   - Why would it be unlikely that it is exactly .11?
   - If a national sample contains only 9.2% blacks, should we suspect that the sampling procedure is somehow underrepresenting blacks? To answer this we will find the probability that a sample contains no more than 9.2% blacks when the population is 11% black.
   a) Conditions *Necessary on AP Exam*:
   b) Graph & z-score
   c) Interpretation:

5) Example (you try): A polling organization asks an SRS of 1500 first-year college students whether they applied for admission to any other college. In fact, 35% of all first-year students applied to colleges besides the one they are attending. What is the probability that the random sample will give a result within 2 percentage points of the true value?
   a)
   b)
   c)

AP Stats: Section 9.3 Central Limit Theorem
Objective: Observe/apply the CLT; construct sampling distributions from a skewed population
Purpose: Find the prob. of obtaining a sample mean; observe what happens to a sampling distribution as n changes

Distribution of x̄ from a non-normally distributed population.

Sampling Pennies
The distribution of the dates stamped on pennies is not normal. In fact, it is skewed to the left, since most of the pennies in circulation are the ones minted recently. In the bucket, there are over 1000 pennies. The histogram below shows the number of pennies in this population that were minted in a particular year. Also included are the mean and standard deviation of the mint dates.

Population: μ = 1990.02, σ = 11.798

Intro to Activity
1. Left column
   a. What is in the collection box?
   b. What is in the table?
   c. What is displayed in the graph?

Pennies Activity
2. Describe the population of pennies we have shown on page 1.
3. In this activity we will be constructing sampling distributions. Predict what you think we will observe about the sampling distributions’ center, shape, and spread, and how they will compare to the original distribution:
   a. Center –
   b. Shape –
   c. Spread –
4. Go to Mr. deGroh’s website. Under the Notes banner, open the Fathom file called “Fathom Penny Activity” from chapter 9 under the ch 9 docs banner.
5. Calculate the mean of the population of pennies. Right click on the histogram, then choose “plot value”. Next type “mean(” and then click OK. Calculate the standard deviation of the population by right clicking on the histogram again and selecting “plot value”, but this time type “s(” and then hit enter. Record the mean and standard deviation of the population below using proper notation.
6. Compare the parameters you just got from Fathom to the numbers and the distribution on page 1.
7. Use the center column for your responses to this prompt. Explain in context and detail what is in/represented in:
   a. The box called “Sample of Penny Population”
   b. The table called “Sample of Penny Population”
   c. The graph called “Sample of Penny Population”
8. Calculate the mean and standard deviation of this column. You do not have to record these values at this point.
9. Use the right column for your responses to this prompt. Explain in context and detail what is in/represented in:
   a. The box called “Measures from Sample of Penny Population”
   b. The table called “Measures from Sample of Penny Population”
   c. The graph called “Measures from Sample of Penny Population”
10. Calculate the mean and standard deviation of this column. You do not have to record these values at this point.
11. Right click on the collection box shown below and select “inspect collection”, then click on the tab “Sample” if the tab is not already selected. Change the cases from 2 to 4 and then click “sample more cases”. Describe in context what happened.
12. What does one of the dots in your dotplot represent?
13. Close the “inspection” window by clicking on the red x. Now right click on the collection box shown below and select “inspect collection”. Change the measures from 1 to 100. Next, click “collect more measures”. Describe in context what happened.
14. What does one of the dots in your dotplot represent?
15. Change the dotplots to histograms. Record the data in the box on the next page and then copy the histogram on your screen to the graph provided. Recall that you can see the total number of observations within a bin in your histogram by moving the mouse over the bin and then looking in the lower left-hand corner of the screen.

n = 4        mean = ________        standard deviation = ________
[Blank histogram grid; horizontal axis ticks: ‘55 ‘60 ‘65 ‘70 ‘75 ‘80 ‘85 ‘90 ‘95 ‘00 ‘04]

16. Describe what the graph above displays.
17. Describe the distribution.
18. Increase the sample size of pennies from 4 to 9 by right-clicking on the “sample of penny population” icon and selecting “inspect collection”. Next select the sample tab and change the number of cases from 4 to 9.
    Now close that window, right-click on the “measures from sample of penny population” collection box, and select “collect more measures”. Re-select the histogram graph to resize the graph. Transfer the histogram below.

n = 9        mean = ________        standard deviation = ________
[Blank histogram grid; horizontal axis ticks: ‘55 ‘60 ‘65 ‘70 ‘75 ‘80 ‘85 ‘90 ‘95 ‘00 ‘04]

19. Describe what the graph above displays.
20. Describe the distribution.
21. Increase the sample size of pennies from 9 to 16 by right-clicking on the “sample of penny population” icon and selecting “inspect collection”. Next select the sample tab and change the number of cases from 9 to 16. Now close that window, right-click on the “measures from sample of penny population” collection box, and select “collect more measures”. Re-select the histogram graph to resize the graph. Transfer the histogram below.

n = 16       mean = ________        standard deviation = ________
[Blank histogram grid; horizontal axis ticks: ‘55 ‘60 ‘65 ‘70 ‘75 ‘80 ‘85 ‘90 ‘95 ‘00 ‘04]

22. Describe what the graph above displays.
23. Describe the distribution.
24. Increase the sample size of pennies from 16 to 30 by right-clicking on the “sample of penny population” icon and selecting “inspect collection”. Next select the sample tab and change the number of cases from 16 to 30. Now close that window, right-click on the “measures from sample of penny population” collection box, and select “collect more measures”. Re-select the histogram graph to resize the graph. Transfer the histogram below.

n = 30       mean = ________        standard deviation = ________
[Blank histogram grid; horizontal axis ticks: ‘55 ‘60 ‘65 ‘70 ‘75 ‘80 ‘85 ‘90 ‘95 ‘00 ‘04]

25. Describe what the graph above displays.
26. Describe the distribution.
27. Increase the sample size of pennies from 30 to 50 by right-clicking on the “sample of penny population” icon and selecting “inspect collection”. Next select the sample tab and change the number of cases from 30 to 50. Now close that window, right-click on the “measures from sample of penny population” collection box, and select “collect more measures”. Re-select the histogram graph to resize the graph. Transfer the histogram below.

n = 50       mean = ________        standard deviation = ________
[Blank histogram grid; horizontal axis ticks: ‘55 ‘60 ‘65 ‘70 ‘75 ‘80 ‘85 ‘90 ‘95 ‘00 ‘04]

28. Describe what the graph above displays.
29. Describe the distribution.
30. How do the shapes of the sampling distributions compare to the population? Is this what you expected to see?
31. Look back at each of the distributions. Describe the changes you see in the shapes of the distributions and explain what this means.
32. Describe what happened to the values of the sample mean and standard deviation compared to the population parameters.

AP Stats Section 9.3 CLT and Sampling Means
1) The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population. If we take only a fixed number of samples, the distribution we find is only an approximation of the true sampling distribution. What were some of the issues with the penny activity?
   If the sample statistic is the sample mean, then the distribution is called the sampling distribution of sample means.
   [Figure: a population from which Sample 1 through Sample 6 are drawn, with sample means x̄1 through x̄6]
   The sampling distribution consists of the values of the sample means x̄1, x̄2, x̄3, x̄4, x̄5, x̄6.

2) Properties of Sampling Distributions of Sample Means
   a) The mean of the sampling distribution of x̄, written μx̄, is equal to the population mean.
         μx̄ = μ
   b) The standard deviation of the sample means, written σx̄, is equal to the population standard deviation divided by the square root of n.
         σx̄ = σ / √n
   The standard deviation of the sampling distribution of the sample means is called the standard error of the mean. Why is this different from the population’s standard deviation?

3) The Central Limit Theorem
   If a sample of size n ≥ 30 is taken from a population with a distribution of any shape that has mean μ and standard deviation σ, the sample means will have an approximately normal distribution.
   If the population itself is normally distributed, with mean μ and standard deviation σ, the sample means will have a normal distribution for any sample size n.
   [Figures: one dotplot of sample means x̄ for each of the two cases]
   In either case, the sampling distribution of sample means has a mean equal to the population mean.
         μx̄ = μ              Mean of the sample means
   The sampling distribution of sample means has a standard deviation equal to the population standard deviation divided by the square root of n.
         σx̄ = σ / √n         Standard deviation of the sample means
   This is also called the standard error of the mean.

Finding The Mean and Standard Error
4) A random sample of 38 fully grown magnolia bushes was selected from a larger population with a mean height of 8 feet and a standard deviation of 0.7 feet.
   a) Find the mean and standard error of the mean of the sampling distribution.
   b) Find the probability that the mean height of the 38 bushes is less than 7.8 feet.

Probability and the Normal Distribution
5) The average on a statistics test was 78 with a standard deviation of 8. If the test scores are normally distributed, find the probability that the mean score of 25 randomly selected students is between 75 and 79.

Probabilities of x and x̄
6) The population mean salary for auto mechanics is μ = $34,000 with a standard deviation of σ = $2,500. Find the probability that the mean salary for a randomly selected sample of 50 mechanics is greater than $35,000.
7) The population mean salary for auto mechanics is μ = $34,000 with a standard deviation of σ = $2,500. Find the probability that the salary of one randomly selected mechanic is greater than $35,000 if the population of auto mechanic salaries is approximately normally distributed.
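
Questions 6 and 7 use the same population but ask about different things: a sample mean x̄ in question 6 and a single value x in question 7. The sketch below contrasts the two calculations. It is only an illustration in Python using the standard-library NormalDist class, not the calculator method used in class, and the helper name prob_greater is made up for this example. Both calculations assume a normal model: question 6 relies on the Central Limit Theorem with n = 50, and question 7 relies on the stated assumption that salaries are approximately normal.

```python
from math import sqrt
from statistics import NormalDist

# Population values given in questions 6 and 7.
mu = 34_000      # population mean salary, in dollars
sigma = 2_500    # population standard deviation, in dollars

def prob_greater(value, mean, sd):
    """Return P(X > value) when X is modeled as Normal(mean, sd)."""
    return 1 - NormalDist(mean, sd).cdf(value)

# Question 6: the sample mean of n = 50 mechanics.
# Its spread is the standard error sigma / sqrt(n), not sigma.
n = 50
standard_error = sigma / sqrt(n)
p_sample_mean = prob_greater(35_000, mu, standard_error)

# Question 7: one mechanic, so the spread is the population sigma.
p_one_mechanic = prob_greater(35_000, mu, sigma)

print(f"standard error of the mean: {standard_error:.2f}")
print(f"P(sample mean > 35000), n = 50: {p_sample_mean:.4f}")
print(f"P(one salary > 35000):          {p_one_mechanic:.4f}")
```

The only change between the two calls is the standard deviation passed in. Dividing σ by √n shrinks the spread, which is why a sample mean above $35,000 is far less likely than a single salary above $35,000.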