Section 9.1-9.2 Sampling Distributions and Sample Proportions

Sampling Distribution

Sampling Distribution ~ distribution of values taken by the statistic in all possible samples of the same size from the same population

A sample distribution (the distribution of the values in one sample) is DIFFERENT from the sampling distribution.

Describing the sampling distribution: shape, center, spread, outliers

Parameter ~ describes population
Statistic ~ describes sample
Unbiased Statistic ~ mean of its sampling distribution is equal to the true value of the parameter being estimated

WHAT MAKES A STATISTIC A POOR ESTIMATOR OF A PARAMETER?

HIGH VARIABILITY: Small samples lead to a larger spread in the sampling distribution of the statistic, giving less certainty about the value of the true parameter.

HIGH BIAS: Poor sampling methods create unrepresentative samples, so the center of the sampling distribution is not equal to the true value of the parameter.

WHY? And what does that look like?

HOW DO WE AVOID HIGH BIAS???
USE APPROPRIATE SAMPLING PROCEDURES THAT WE LEARNED IN PREVIOUS CHAPTERS!!!

HOW DO WE AVOID HIGH VARIABILITY???
First, understand that sampling variability means the value of a statistic varies in repeated random sampling. The variability of a statistic is described by the spread of its sampling distribution, so to avoid high variability, use larger samples for a smaller spread. As long as the population is at least 10 times larger than the sample, the spread of the sampling distribution is approximately the same for any population size.

WHY DOES THE POPULATION SIZE NOT REALLY MATTER MUCH???
Even more, why does a sample of size 260 serve a population of 2600 just as well as a population of 26,000?
If the population is small, then outliers are going to have a greater impact on the sampling process by creating greater variability in the sampling distribution.
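One way to see the population-size point is to simulate it. The sketch below is an illustration, not part of the original notes: it assumes a population proportion of p = 0.5 (the notes do not give one), draws 2,000 simple random samples of size 260 from a population of 2,600 and from a population of 26,000, and compares the spread of the resulting sample proportions. The function name and the seed are hypothetical choices.

```python
import random
import statistics


def simulate_phat_sd(pop_size, p, n, reps, seed):
    """Estimate the SD of the sample proportion from repeated SRSs."""
    rng = random.Random(seed)
    # Build a 0/1 population in which a fraction p are "successes".
    successes = round(pop_size * p)
    population = [1] * successes + [0] * (pop_size - successes)
    # Each SRS is drawn without replacement, as in real sampling.
    phats = [sum(rng.sample(population, n)) / n for _ in range(reps)]
    return statistics.stdev(phats)


# Assumed value p = 0.5; sample size 260 as in the notes.
sd_small = simulate_phat_sd(2_600, 0.5, 260, reps=2_000, seed=1)
sd_large = simulate_phat_sd(26_000, 0.5, 260, reps=2_000, seed=1)

print(f"SD of p-hat, population  2,600: {sd_small:.4f}")
print(f"SD of p-hat, population 26,000: {sd_large:.4f}")
```

Under these assumptions the two spreads should come out close to each other, even though one population is ten times the size of the other.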
The size of the sample is what impacts the sampling variability, so a statistic from a sample of 260 Walton students is just as precise as a statistic from a sample of 260 from all East Cobb high school students. Of course, this is assuming one important fact. Which is? THE SAMPLES MUST BE RANDOM!

SECTION 9.2 Sample Proportions

The sample proportion p̂ is a statistic:

$\hat{p} = \dfrac{\text{number of successes}}{\text{total sample size}}$

Sampling distribution of a sample proportion: choose an SRS of size n from a large population with population proportion p having some characteristic of interest. Let p̂ be the proportion of the sample having that characteristic. Then:
1) the sampling distribution of p̂ is approximately normal, and is closer to a normal distribution when the sample size n is large
2) the mean of the sampling distribution is exactly p
3) the standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$

Rules of Thumb
1) Use the recipe for the standard deviation of p̂ only when the population is at least 10 times as large as the sample.
2) We will use the normal approximation to the sampling distribution of p̂ for values of n and p that satisfy np ≥ 10 and n(1 − p) ≥ 10.

Standard Deviation Behavior
What will make the size of the standard deviation, $\sqrt{p(1-p)/n}$, change?
If the sample size goes up, the standard deviation goes down. If the sample size goes down, the standard deviation goes up.
How would we cut the standard deviation in half? Multiply the sample size by 4. Because n sits under a square root, $\sqrt{p(1-p)/(4n)} = \tfrac{1}{2}\sqrt{p(1-p)/n}$.

EXAMPLE (pg. 477 # 9.15)
The Gallup Poll once asked a random sample of 1540 adults, “Do you happen to jog?” Suppose that in fact 15% of all adults jog.

a) Find the mean and standard deviation of the proportion p̂ of the sample who jog. (Assume the sample is an SRS.)
$\mu_{\hat{p}} = p = 0.15$ ; $\sigma_{\hat{p}} = \sqrt{\dfrac{(0.15)(0.85)}{1540}} \approx 0.0091$

b) Explain why you can use the formula for the standard deviation of p̂ in this setting.
The population (assumed to be all U.S. adults) is certainly more than 10 times larger than the sample (10 × 1540 = 15,400).
c) Check that you can use the normal approximation for the distribution of p̂.
np = (1540)(0.15) = 231 and n(1 − p) = (1540)(0.85) = 1309; these are both ≥ 10.

d) Find the probability that between 13% and 17% of the sample jog.
normalcdf(0.13, 0.17, 0.15, √(0.15*0.85/1540)) ≈ 0.9721

e) What sample size would be required to reduce the standard deviation of the sample proportion to one-half the value found in a)?
1540 × 4 = 6160
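The calculator steps in this example can be cross-checked in Python. The sketch below uses only the standard library; `statistics.NormalDist` stands in for the calculator's normalcdf, and the variable names are illustrative rather than taken from the textbook.

```python
import math
from statistics import NormalDist

n = 1540   # sample size from the example
p = 0.15   # assumed true proportion of adults who jog

# a) Mean and standard deviation of the sampling distribution of p-hat.
mean_phat = p
sd_phat = math.sqrt(p * (1 - p) / n)

# c) Conditions for the normal approximation: both counts at least 10.
normal_ok = n * p >= 10 and n * (1 - p) >= 10

# d) P(0.13 <= p-hat <= 0.17) under the normal approximation.
dist = NormalDist(mu=mean_phat, sigma=sd_phat)
prob = dist.cdf(0.17) - dist.cdf(0.13)

# e) Halving the standard deviation requires four times the sample size.
n_half = 4 * n
sd_half = math.sqrt(p * (1 - p) / n_half)

print(f"mean = {mean_phat}, sd = {sd_phat:.4f}")
print(f"normal approximation OK: {normal_ok}")
print(f"P(0.13 <= p-hat <= 0.17) = {prob:.4f}")
print(f"n for half the sd = {n_half}, new sd = {sd_half:.4f}")
```

The last line confirms part e) numerically: with n = 6160 the standard deviation is half the value from part a).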