7.2 Sampling Distribution of the Sample Mean

Objectives

To know the shape, mean, and standard error of the sampling distribution of the sample mean.
To use the sampling distribution to find probabilities.
To know the effect of a small sample size.

Shape, Center, and Spread of the Sampling Distribution of $\bar{x}$

The following symbols are used to describe the parameters of a sampling distribution:

                         Mean               Standard Deviation                           Size
Population Parameter     $\mu$              $\sigma$                                     $N$
Sample Statistic         $\bar{x}$          $s$                                          $n$
Sampling Distribution    $\mu_{\bar{x}}$    $\sigma_{\bar{x}}$ or SE (standard error)

The distribution of the sample means of a population has the following important properties:

Center

$$\mu_{\bar{x}} = \mu$$

This says the mean of the sampling distribution of the mean is the same as the population mean, regardless of the shape of the parent population and the size of the random samples. $\bar{x}$ is therefore an unbiased estimator of the population mean.

Spread

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

The standard deviation of the sampling distribution is also called the standard error of the mean. It equals the population standard deviation divided by the square root of the sample size, $n$. The larger the sample size, the smaller the standard error (spread) of the sampling distribution. This property can only be used when you randomly sample with replacement, or when you randomly sample without replacement and the sample size is less than 10% of the population size, which is most of the time. In the rare cases when it cannot be used, the following formula is used instead:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$

Shape

The shape of the sampling distribution will be approximately normal if the population from which the samples are taken is approximately normal. If the parent population is not normal, the shape of the sampling distribution becomes more normal as $n$ increases. (In practice, if $n \ge 30$ then the sampling distribution is assumed to be approximately normal.) This remarkable property is called the Central Limit Theorem (CLT).
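The center, spread, and shape properties above can be checked with a small simulation. The following is a minimal sketch, not part of the original notes: the exponential population, the seeds, the number of repetitions, and the helper name `simulate_sample_means` are all assumptions chosen for illustration. It samples with replacement, so the simple formula $\sigma_{\bar{x}} = \sigma / \sqrt{n}$ should apply.

```python
import math
import random
import statistics


def simulate_sample_means(population, n, reps, seed=0):
    """Return `reps` sample means, each from a random sample of size n
    drawn with replacement from `population` (hypothetical helper)."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(reps)]


# Assumed population: 16,000 right-skewed (exponential) values,
# chosen so the parent distribution is clearly not normal.
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1.0) for _ in range(16_000)]
mu = statistics.fmean(population)       # population mean
sigma = statistics.pstdev(population)   # population standard deviation

print("n  mean_of_means  sd_of_means  sigma/sqrt(n)")
for n in (2, 10, 30):
    means = simulate_sample_means(population, n, reps=5_000)
    print(
        n,
        round(statistics.fmean(means), 3),   # should be close to mu
        round(statistics.stdev(means), 3),   # should be close to sigma / sqrt(n)
        round(sigma / math.sqrt(n), 3),
    )
```

In each row the mean of the simulated sample means should sit close to the population mean, and their standard deviation should sit close to $\sigma / \sqrt{n}$. Plotting a histogram of `means` for each $n$ is one way to see the shape become more normal as $n$ grows.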
Central Limit Theorem: if the size $n$ of the sample is sufficiently large ($n \ge 30$), then the distribution of the sample means will approximate a normal distribution regardless of the shape of the population.

Figure: Population Distribution and Simulated Sampling Distribution of Sample Means for different sample sizes (panels: Uniform, Bimodal). Values shown in the figure:

Uniform population: $N = 16{,}000$, $\mu = 4.996$, $\sigma = 2.882$

    $n$     $\mu_{\bar{x}}$     $\sigma_{\bar{x}}$
    2       5.01                2.037
    10      5.015               0.911
    80      4.989               0.322

Bimodal population: $N = 16{,}000$, $\mu = 5.002$, $\sigma = 4.242$

    $n$     $\mu_{\bar{x}}$     $\sigma_{\bar{x}}$
    2       4.977               2.999
    3       4.946               2.449
    30      5.032               0.722

Without the central limit theorem, much of inferential statistics would not be possible. We would be unable to reliably estimate a parameter like the mean by using an average derived from a much smaller sample. This would all but shut down research in the social sciences and the evaluation of new drugs, since these depend on statistics. It would invalidate the use of polls and completely alter the nature of marketing research, not to mention politics.

Finding Probabilities Involving Sample Means

The above properties can be combined with the normal distribution to find probabilities involving sample means.

Example: P10 page 438

Last January 1, Jenny thought about buying individual stocks. Over the next year, the mean of the percentage increases in individual stock prices was 6.5% and the standard deviation of these percentage increases was 12.8%. The distribution of price increases is approximately normal.

a. If Jenny had picked one stock at random, what is the probability that it would have gone down in price?
b. If Jenny had picked four stocks at random, what is the probability that their mean percentage increase would be negative?
c. If Jenny had picked eight stocks at random, what is the probability that their mean percentage increase would be between 8% and 10%?
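One way to set up the three parts of the stock example is sketched below. It is an illustration, not the textbook's worked solution: it models the mean of $n$ stocks as normal with mean $\mu = 6.5$ and standard error $\sigma / \sqrt{n} = 12.8 / \sqrt{n}$, and it assumes the stocks are picked independently from a large population. The helper name `mean_distribution` is hypothetical; `NormalDist` comes from Python's standard `statistics` module.

```python
from math import sqrt
from statistics import NormalDist

MU = 6.5      # mean percentage increase (from the example)
SIGMA = 12.8  # standard deviation of percentage increases (from the example)


def mean_distribution(mu, sigma, n):
    """Normal model for the mean of n random draws:
    mean mu, standard error sigma / sqrt(n) (hypothetical helper)."""
    return NormalDist(mu, sigma / sqrt(n))


# a. One stock: P(x < 0). With n = 1 the standard error is just sigma.
p_a = mean_distribution(MU, SIGMA, 1).cdf(0)

# b. Four stocks: P(sample mean < 0), standard error 12.8 / sqrt(4) = 6.4.
p_b = mean_distribution(MU, SIGMA, 4).cdf(0)

# c. Eight stocks: P(8 < sample mean < 10), standard error 12.8 / sqrt(8).
dist_c = mean_distribution(MU, SIGMA, 8)
p_c = dist_c.cdf(10) - dist_c.cdf(8)

print(f"a. {p_a:.3f}   b. {p_b:.3f}   c. {p_c:.3f}")
```

Under these assumptions the three probabilities come out to roughly 0.31, 0.15, and 0.15. The drop from part a to part b shows the effect of the smaller standard error: a negative average over four stocks is much less likely than a single stock going down.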
Shape, Center, and Spread of the Sampling Distribution of the Sum of a Sample

Sometimes instead of a problem involving the average of a sample, you are dealing with the total number in the sample. For example, what is the probability that there are 30 or fewer children in a random sample of 20 families in the United States? You can convert this to an average number of children by dividing by 20, and then use the properties of the sampling distribution of the mean. A second way to do this is to use the properties of the sampling distribution of the sum of a sample.

Center

$$\mu_{\text{sum}} = n\mu$$

Spread

$$\sigma_{\text{sum}} = n\,\sigma_{\bar{x}} = n \cdot \frac{\sigma}{\sqrt{n}} = \sqrt{n}\,\sigma$$

Shape

The shape of the sampling distribution will be approximately normal if the population from which the samples are taken is approximately normal. If the parent population is not normal, the shape of the sampling distribution becomes more normal as $n$ increases.

Example: P13 page 439

The distribution of the number of motor vehicles per household in the United States is roughly symmetric, with mean 1.7 and standard deviation 1.0.

a. If you pick 15 households at random, what is the probability that they have at least 30 motor vehicles among them?
b. If you pick 20 households at random, what is the probability that they have between 25 and 30 motor vehicles among them?
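The motor-vehicle example can be set up with the sum properties as in the sketch below. It is an illustration under stated assumptions, not the textbook's solution: the total for $n$ households is modeled as normal with mean $n\mu$ and standard deviation $\sqrt{n}\,\sigma$, and no continuity correction is applied even though vehicle counts are whole numbers. The helper name `sum_distribution` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

MU = 1.7     # mean vehicles per household (from the example)
SIGMA = 1.0  # standard deviation of vehicles per household (from the example)


def sum_distribution(mu, sigma, n):
    """Normal model for the total of n random draws:
    mean n * mu, standard deviation sqrt(n) * sigma (hypothetical helper)."""
    return NormalDist(n * mu, sqrt(n) * sigma)


# a. 15 households: P(total >= 30), with mean 25.5 and SD sqrt(15).
total_15 = sum_distribution(MU, SIGMA, 15)
p_a = 1 - total_15.cdf(30)

# b. 20 households: P(25 <= total <= 30), with mean 34 and SD sqrt(20).
total_20 = sum_distribution(MU, SIGMA, 20)
p_b = total_20.cdf(30) - total_20.cdf(25)

print(f"a. {p_a:.3f}   b. {p_b:.3f}")
```

Under these assumptions the probabilities are roughly 0.12 and 0.16. Dividing the totals by $n$ and using the sampling distribution of the mean gives the same answers, because both approaches lead to the same z-scores.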