Chapter 11: Random Sampling and Sampling Distributions

If we draw n = 9 values (X) from a population that we know is normally distributed with a mean of 100 and a standard deviation of 15 (like IQs), what can we say about the distribution of the mean of these values? It is not surprising that the mean of these values will be close to the mean of the population. But what about its variability?

In general, the probability distribution of a statistic is called that statistic's sampling distribution. The distribution of sample means is therefore the sampling distribution of the mean. (See the demo)

If we repeatedly draw samples of size n from a normal distribution with mean $\mu_X$ and standard deviation $\sigma_X$, then the means of these samples will also be normally distributed, with mean

$$\mu_{\bar{X}} = \mu_X$$

and standard deviation

$$\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$$

Simulation: population mean 100.00, SD 15.00; the means of 5000 samples of size 9 have mean 99.87 and SD 5.004, close to the predicted $15/\sqrt{9} = 5$.

Central Limit Theorem: The sampling distribution of the mean tends toward a normal distribution even if the population is not normally distributed. The sampling distribution becomes more nearly normal as the sample size increases. The mean and standard deviation of the sampling distribution of the mean are still:

$$\mu_{\bar{X}} = \mu_X \qquad \text{and} \qquad \sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$$

Simulation: population mean 34.96, SD 48.36; the means of 5000 samples have mean 34.97 and SD 16.07.

Now that we know about the sampling distribution of the mean, we can use Table A (normal distribution) to calculate the probability of observing a sample mean at least as large as a given value.

Example: Suppose the IQ of the population is distributed normally with a mean of 100 and a standard deviation of 15.
If we draw 16 people at random from the population, what is the probability that the mean IQ of this sample will be greater than 107?

Answer: We know that the sampling distribution of the mean with n = 16 will have a mean and standard deviation of:

$$\mu_{\bar{X}} = \mu_X = 100, \qquad \sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}} = \frac{15}{\sqrt{16}} = 3.75$$

The z-score for 107 is therefore

$$z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{107 - 100}{3.75} = 1.867$$

[Figure: sampling distribution of the mean; x-axis: IQ, y-axis: relative frequency.]

From Table A, the area under the normal distribution above z = 1.86 is 0.0314 (the exact area above z = 1.867 is about 0.031). So there is roughly a 3% chance, well under 5%, of observing a sample mean greater than 107.

Example: (IQs again) With a sample size of 100, what is the probability of observing a mean IQ that is less than 99?

Answer: As before:

$$\mu_{\bar{X}} = \mu_X = 100, \qquad \sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}} = \frac{15}{\sqrt{100}} = 1.5$$

The z-score for a sample mean of 99 is:

$$z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{99 - 100}{1.5} = -0.667$$

[Figure: sampling distribution of the mean; x-axis: IQ, y-axis: relative frequency.]

The area below z = -0.667 is the same as the area above z = +0.667, which from Table A (z = 0.67) is 0.2514. So there is about a 25% chance of obtaining a sample mean more than one point below the population mean.

Example: Suppose the height of the population of men has a mean of 70 inches and a standard deviation of 2.8 inches. If we sample 25 men from the population, what is the mean height that corresponds to the 95th percentile point (P95)?

Answer: The sampling distribution of the mean has a mean and standard deviation of:

$$\mu_{\bar{X}} = \mu_X = 70, \qquad \sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}} = \frac{2.8}{\sqrt{25}} = 0.56$$

The z-score that cuts off the upper 5% of the normal distribution is 1.65. Solving the z-score formula for the sample mean:

$$z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}, \qquad \bar{X} = z\,\sigma_{\bar{X}} + \mu_{\bar{X}} = (1.65)(0.56) + 70 = 70.9$$

[Figure: sampling distribution of the mean; x-axis: height, y-axis: relative frequency.]

So there is a 5% chance of observing a sample mean of 70.9 inches or more.
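The worked examples above can also be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function names are hypothetical, and it uses the standard library's `statistics.NormalDist` (Python 3.8 or later) in place of Table A. Because it does not round z to two decimals, its answers differ slightly from the table lookups. The last function is an assumed stand-in for the class demo: it draws repeated samples from a normal population and returns their means.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the mean."""
    return sigma / math.sqrt(n)


def prob_mean_above(x_bar, mu, sigma, n):
    """P(sample mean > x_bar) for samples of size n."""
    return 1.0 - NormalDist(mu, standard_error(sigma, n)).cdf(x_bar)


def prob_mean_below(x_bar, mu, sigma, n):
    """P(sample mean < x_bar) for samples of size n."""
    return NormalDist(mu, standard_error(sigma, n)).cdf(x_bar)


def mean_at_percentile(p, mu, sigma, n):
    """Sample mean at cumulative proportion p (e.g. p=0.95 for P95)."""
    return NormalDist(mu, standard_error(sigma, n)).inv_cdf(p)


def simulate_sample_means(mu, sigma, n, n_samples=5000, seed=0):
    """Draw n_samples samples of size n from a normal population
    and return the list of their means (hypothetical demo stand-in)."""
    rng = random.Random(seed)
    return [
        fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    # IQ example, n = 16: about 0.031
    print(round(prob_mean_above(107, 100, 15, 16), 4))
    # IQ example, n = 100: about 0.25
    print(round(prob_mean_below(99, 100, 15, 100), 4))
    # Height example, P95 with n = 25: about 70.9 inches
    print(round(mean_at_percentile(0.95, 70, 2.8, 25), 1))
    # Simulated sampling distribution for n = 9: SD should be near 15/3 = 5
    means = simulate_sample_means(100, 15, 9)
    print(round(fmean(means), 2), round(stdev(means), 3))
```

The simulated mean and SD will vary with the seed, so they should only be expected to land near 100 and 5, not to reproduce the 99.87 and 5.004 quoted earlier.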