Lecture Notes

Psych 5500/6500
The Sampling Distribution of the Mean
Fall, 2008

Sampling Distribution of the Mean
The ‘sampling distribution of the mean’ (SDM): the population of all the sample means you could get if you sampled a certain number of scores from a certain population.

For Example
In a previous semester I asked the students to draw a sample from a deck of playing cards.

Original Population from Which the Sample Was Drawn
4 cards of each type (jacks counted as ‘11’, queens as ‘12’, kings as ‘13’). This is a graph of individual scores in the population (i.e. ‘Y’). The mean of the population of playing cards is μY=7 and its standard deviation is σY=3.74. Note that the population is not normally distributed, and that the exact values of μ and σ are known (not estimated).

Sample Means When N=4
The students were asked to sample four cards and find the mean of the sample. Not surprisingly, they obtained many different sample means.
Sample means when N=4: 6.75, 5.5, 7.25, 7.5, 6.25, 8, 9.75, 7, 9.5, 4.25, 3.75, 9, 5.25, 7.75, 7.5, 11.5, 2.25, 9.25, 3.5, 8.75, 5.5, 6, 7.25, 7.75, 9.75, 7, 9.5, 7.5, 6.5, 9.25, 7.25, 7.25, 9.5

SDM for N=4
This is a graph of the 33 sample means (rounded off to the nearest whole number). We are starting to see the shape of the sampling distribution of the mean when N=4. Note that the mean of the sample means looks to be around ‘7’, the mean of the original population. This is what it means for the sample mean to be an unbiased estimate of the population mean.

Sample Means When N=8
The students were then asked to sample eight cards and find the sample mean.
Sample means when N=8: 6.25, 8.25, 5.75, 5.38, 6.63, 7.5, 7.5, 9.13, 8.38, 5.63, 6.13, 5.88, 8.13, 7.75, 7.13, 4.7, 5.63, 6.63, 9.13, 5.88, 5.88, 5.13, 8.63, 6.13, 7.5, 9.13, 8.13, 7.63, 6.75, 7.88, 7.38, 7.50, 7.85

SDM for N=8
Again, this is a graph of the sample means, this time for N=8. And again, the mean of the sample means looks to be the same as the mean of the original population (7).
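The card exercise above can be imitated in software. What follows is a minimal Python sketch, not part of the original lecture. It assumes that aces count as 1 and number cards as their face value (the notes only give the values for jacks, queens and kings, but this assumption is consistent with the stated μY=7 and σY=3.74). The function names population_parameters and simulate_sample_means are hypothetical, chosen for illustration.

```python
import random
import statistics

# Assumption: aces count as 1 and number cards as their face value.
# The notes only state that jacks, queens and kings count as 11, 12, 13.
DECK = [value for value in range(1, 14) for _ in range(4)]


def population_parameters(deck):
    """Return the population mean and the population standard deviation."""
    return statistics.fmean(deck), statistics.pstdev(deck)


def simulate_sample_means(deck, n, num_samples, seed=0):
    """Draw num_samples samples of n cards each and return the sample means.

    Cards are drawn without replacement within a sample, as when dealing
    from a real deck; every sample starts again from the full deck.
    """
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(deck, n)) for _ in range(num_samples)]


if __name__ == "__main__":
    mu, sigma = population_parameters(DECK)
    print(f"population: mean={mu:.2f}, sd={sigma:.2f}")
    for n in (4, 8):
        means = simulate_sample_means(DECK, n, num_samples=10_000)
        print(
            f"N={n}: mean of sample means={statistics.fmean(means):.2f}, "
            f"sd of sample means={statistics.pstdev(means):.2f}"
        )
```

With many more samples than one class can draw by hand, the mean of the simulated sample means should settle close to 7, and the spread of the sample means should be visibly smaller for N=8 than for N=4.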
Comparisons
The next three slides show the three graphs. Note the following:
1. While the population from which we sampled was not normally distributed, the graphs of the sample means begin to look more like normal curves.
2. The variance of the sample means is less than the variance of the original population, and as N moves from 4 to 8 the variance of the sample means decreases (the sample mean is a ‘consistent’ estimate of the population mean).

Original Population from Which the Sample Was Drawn
This is a graph of individual scores (Y).

SDM for N=4
This is a graph of the sample means for N=4.

SDM for N=8
This is a graph of the sample means for N=8.

Short Cut
The preceding approach for finding the sampling distribution of the mean would actually require that we obtain an infinite number of sample means to arrive at a true picture of the population of sample means we could obtain if we sampled a certain number of scores from a certain population (i.e. the SDM). This is a good way to introduce the concept of the SDM, but we need a short cut for actually producing an SDM...

1) The Shape of the SDM
You can count on the SDM being normally distributed if either of the following two conditions is met.
1. The SDM will be normally distributed if the population you sampled from is normally distributed.
2. The SDM will be approximately normally distributed (even if the population you sampled from is not) if the N of your sample is large enough (Central Limit Theorem). Rule of thumb: N ≥ 30.

2) The Mean of the SDM
The mean of the population of sample means equals the mean of the population from which you sample (that is why the sample mean is an ‘unbiased’ estimate of the population mean).
$\mu_{\bar{Y}} = \mu_Y$

3) The Standard Deviation of the SDM
The standard deviation of the sample means is less than the standard deviation of the population from which you sampled (for N > 1), as the means will vary less than the scores do.
$\sigma_{\bar{Y}} = \frac{\sigma_Y}{\sqrt{N}}$
$\sigma_{\bar{Y}}$ is also known as the ‘standard error of the mean’. Can you figure out why it is called that?

Example: Original Population
Let’s say the population is normally distributed with $\mu_Y = 60$ and $\sigma_Y = 16$, which means that the SDM will be normally distributed as well.

SDM for N=4
$\mu_{\bar{Y}} = \mu_Y = 60$
$\sigma_{\bar{Y}} = \frac{\sigma_Y}{\sqrt{N}} = \frac{16}{\sqrt{4}} = 8$

SDM for N=64
$\mu_{\bar{Y}} = \mu_Y = 60$
$\sigma_{\bar{Y}} = \frac{\sigma_Y}{\sqrt{N}} = \frac{16}{\sqrt{64}} = 2$

Probability and the SDM
When the SDM is normally distributed we can answer certain types of questions. The following slides take us through a typical question from the homework assignment.

Question
We will begin by repeating a process learned in an earlier lecture. We are sampling from a population that is normally distributed with a mean of 55 and a standard deviation of 10. What is the probability of drawing a score from that population that is between 50 and 60?
$p(50 \le Y \le 60)$?

Original Population
Step 1: draw and label the population.

Original Population
Step 2: shade in the area of question.

Original Population
Step 3: compute the z scores and look up the area under the normal curve.
$z = \frac{50 - 55}{10} = -0.5$ and $z = \frac{60 - 55}{10} = 0.5$
The probability of obtaining a single score between 50 and 60 = .1915 + .1915 = .3830
p = .3830

Question
Now we are going to ask a new question. If we sample nine scores from that population, what is the probability of obtaining a sample mean that is between 50 and 60?
$p(50 \le \bar{Y} \le 60)$?

SDM for N=9
Step 1: draw the sampling distribution of the mean, which is the population of all the sample means we could get if we sample 9 scores from the original population. We know the SDM is normally distributed, its mean is the same as the mean of the population, and we can compute the standard deviation of the curve (the ‘standard error’): $\sigma_{\bar{Y}} = \frac{10}{\sqrt{9}} = 3.33$. Note that this is a population of sample means.

SDM for N=9
Step 2: shade in the area of question.

SDM & Standard Score
To figure out the shaded area of the normal curve we need to change the sample means of 50 and 60 to standard scores.
As always, the standard score will be the ‘raw’ score on the graph (this is a graph of sample means) minus the mean of the graph (the mean of the sample means), divided by the standard deviation of the graph (the standard deviation of the sample means, a.k.a. the ‘standard error’):
$z = \frac{\bar{Y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}}$

SDM for N=9
$z = \frac{\bar{Y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}} = \frac{60 - 55}{3.33} = 1.5$ and $z = \frac{50 - 55}{3.33} = -1.5$
Step 3: compute the z scores and look up the area under the normal curve.
The probability of obtaining a sample mean between 50 and 60 = .4332 + .4332 = .8664
p = .8664

Looking Back
When we sampled one score from a normal population that had μ=55 and σ=10, there was a 38.3% chance that the score would be within 5 of the population mean. When we sampled 9 scores from that population, there was an 86.64% chance that the sample mean would be within 5 of the population mean.

1-tail and 2-tail p values
We are very close to doing some statistical analyses to test specific hypotheses. The next step is to play with scenarios such as:
You sample 36 scores from a population that has a μ=80 and σ=12. For what value of the sample mean is there only a 5% chance that you would obtain a sample mean that is that far or farther above the population mean?
To set up the problem, first draw the population you will be sampling from, and then the SDM (the population of sample means for N=36). We don’t know whether the population is normally distributed; do we know whether the SDM is? (Recall the rule of thumb: N ≥ 30.)

Formulas
$z = \frac{\bar{Y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}}$ and $\bar{Y} = (z)(\sigma_{\bar{Y}}) + \mu_{\bar{Y}}$
What sample mean would be 1.65 standard deviations above the mean on this curve? Here $\sigma_{\bar{Y}} = \frac{12}{\sqrt{36}} = 2$, so
$\bar{Y} = (z)(\sigma_{\bar{Y}}) + \mu_{\bar{Y}} = (1.65)(2) + 80 = 83.3$

Conditional Probability
Let’s think of it as a conditional probability.
$p(\bar{Y} \ge 83.3 \mid \text{sampling 36 scores from a population with a } \mu \text{ of 80 and } \sigma \text{ of 12}) = .05$

Another Example
You sample 36 scores from a population that has a μ=80 and σ=12. For what value of the sample mean is there only a 5% chance that you would obtain a sample mean that is that far or farther below the population mean?
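Conditional Probability
The cutoff is now 1.65 standard errors below the mean: $\bar{Y} = (-1.65)(2) + 80 = 76.7$
$p(\bar{Y} \le 76.7 \mid \text{sampling 36 scores from a population with a } \mu \text{ of 80 and } \sigma \text{ of 12}) = .05$

Final Example
You sample 36 scores from a population that has a μ=80 and σ=12. For what values of the sample mean is there only a 5% chance that you would obtain a sample mean that is that far or farther away from the population mean (in either direction)?
For a normal curve the z scores that cut off a total of the 5% most extreme scores (in both directions) are z = −1.96 and z = +1.96, which give $\bar{Y} = (-1.96)(2) + 80 = 76.08$ and $\bar{Y} = (1.96)(2) + 80 = 83.92$.

Conditional Probability
$p(\bar{Y} \le 76.08 \text{ or } \bar{Y} \ge 83.92 \mid \text{sampling 36 scores from a population with a } \mu \text{ of 80 and } \sigma \text{ of 12}) = .05$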
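The table lookups in the worked questions above can be checked numerically. The following is a minimal Python sketch, not part of the original lecture, that uses the standard library's statistics.NormalDist in place of a printed z table. It assumes the SDM is normal, as in the examples. The function names (standard_error, prob_mean_between, one_tail_cutoff, two_tail_cutoffs) are hypothetical, chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard deviation of the SDM: sigma divided by the square root of N."""
    return sigma / sqrt(n)


def prob_mean_between(mu, sigma, n, low, high):
    """Probability that the mean of n scores falls between low and high.

    Assumes the SDM is normal. With n=1 this is the probability for a
    single score drawn from a normal population.
    """
    sdm = NormalDist(mu, standard_error(sigma, n))
    return sdm.cdf(high) - sdm.cdf(low)


def one_tail_cutoff(mu, sigma, n, tail_prob, upper=True):
    """Sample mean that cuts off tail_prob in the upper (or lower) tail."""
    sdm = NormalDist(mu, standard_error(sigma, n))
    return sdm.inv_cdf(1 - tail_prob) if upper else sdm.inv_cdf(tail_prob)


def two_tail_cutoffs(mu, sigma, n, total_prob):
    """Pair of sample means that cut off total_prob split across both tails."""
    sdm = NormalDist(mu, standard_error(sigma, n))
    return sdm.inv_cdf(total_prob / 2), sdm.inv_cdf(1 - total_prob / 2)


if __name__ == "__main__":
    # Population with mean 55 and standard deviation 10: one score vs. N=9.
    print(round(prob_mean_between(55, 10, 1, 50, 60), 4))
    print(round(prob_mean_between(55, 10, 9, 50, 60), 4))
    # Population with mean 80 and standard deviation 12, N=36.
    print(round(one_tail_cutoff(80, 12, 36, 0.05, upper=True), 2))
    print(round(one_tail_cutoff(80, 12, 36, 0.05, upper=False), 2))
    print(tuple(round(c, 2) for c in two_tail_cutoffs(80, 12, 36, 0.05)))
```

Because the code does not round z to two decimals, the one-tail cutoffs come out as 83.29 and 76.71 rather than the 83.3 and 76.7 obtained from z = 1.65. The two-tail cutoffs agree with 76.08 and 83.92 to two decimals.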
