Independent-Samples T-Test: In-class example

General issue: Is there anything a person with Alzheimer’s disease (AD) can do to improve their memory for their own personal history?

Specific research question: Will persons with AD who have been exposed on a regular basis to cues for their personal history rate their own memories as better than those of persons with AD who have not been exposed to these memory cues?

Experiment: Construct PowerPoint presentations that contain family photographs and audio recollections. After six weeks, give participants a measure of their perceived memory function.

Experimental Group (Memory Cues)    Control Group (No memory cues)
--------------------------------    ------------------------------
N1 = 10                             N2 = 10
M1 = 16.2                           M2 = 9.9
S1 = 2.49                           S2 = 2.33

M1 - M2 = 16.2 - 9.9 = 6.3

Is this a difference that occurred by chance, or is it there because the IV did something to make the people in group 1 different from the people in group 2?

H0: Persons with AD in the memory cues group do not have perceived memory scores which are significantly greater than the perceived memory scores of persons with AD in the control group. (The means for the two groups are different from each other by chance.)

H1: Persons with AD in the memory cues group have perceived memory scores which are significantly greater than the perceived memory scores of persons with AD in the control group. (The means are different from each other due to the effects of the independent variable.)

REMEMBER: The alternative hypothesis is the prediction of the researcher. The null hypothesis is the opposite of the prediction.

α = .05

This is a situation where there is no way of knowing for sure whether the null hypothesis is true or whether the alternative hypothesis is true. We will never be able to “prove” that either of them is true or false. However, we can know exactly what the odds are of getting a difference this large if the null hypothesis is true.
In other words, we can know exactly what the odds are that the means for the two groups would end up this different from each other just by chance. When we set our alpha level at .05, we’re saying that we’re only willing to reject the null hypothesis if we can show that there is less than a five percent chance of getting a difference this large when the null hypothesis is true.

How can we know if those odds are less than 5%? We can do this using a test called an independent-samples t-test.

The number we’re making our decision about is the difference between the two sample means. We can use SPSS to convert this number from raw score units into standard score units. The symbol for this standard score is t.

Just like with a value for chi-square, if we get a value for t of zero, it tells us that the number we get from our experiment (the difference between the means) is exactly equal to what we’d expect to get if the null hypothesis were true. If we get a difference between the means that’s a little bit different from zero, it’s pretty likely that it’s just different from zero by chance. The further t is from zero (in either direction), the less likely it is that the two sample means are that different from each other just by accident. At some point the value for t is far enough away from zero that the odds that it’s different from zero just by chance become less than 5%. That’s the point where we ought to reject the null hypothesis. The table of critical values for t tells us how far from zero a value for t has to be in order to reject the null hypothesis.

To know how to write a decision rule we need to think about the question we’re trying to answer. In this example the researcher predicted that the mean for group 1 will be greater than the mean for group 2. In other words, they’re predicting that the difference between the means will be in a particular direction. This is known as a one-tailed test.
With a one-tailed test, it only makes sense to reject the null hypothesis if the mean for group 1 really is greater than the mean for group 2. This means that we’re predicting that the difference between the means will be a positive number. In turn, this means that we’re predicting that the value for t will be a positive number.

Decision rule: If t ≥ +1.73, reject H0.

Now we just need to use SPSS to find the observed value for t. According to SPSS, the observed value for t is +5.85.

5.85 is greater than 1.73, so our decision is to reject the null hypothesis. Because we’ve rejected the null hypothesis, our conclusion is what we’ve already written as the alternative hypothesis. That makes our conclusion:

Persons with AD in the memory cues group have perceived memory scores which are significantly greater than the perceived memory scores of persons with AD in the control group, t (18) = 5.85, p < .05.

Now, let’s say that we phrased our question differently. Suppose that the researcher was only willing to bet that the mean for the group getting the memory cues would be different from the mean for the control group. Now the researcher’s prediction is that the mean for group 1 could be either greater than or less than the mean for group 2. This version of the test is called a two-tailed test.

The alternative hypothesis is the prediction of the researcher, so it is now:

Persons with AD in the memory cues group have perceived memory scores which are significantly different from the perceived memory scores of persons with AD in the control group.

The null hypothesis is now:

Persons with AD in the memory cues group do not have perceived memory scores which are significantly different from the perceived memory scores of persons with AD in the control group.

If these are the null and alternative hypotheses, then the value for t could be significant if it is far enough above or below zero.
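As a rough check on the one-tailed decision earlier in this example, the observed value for t can be approximated from the summary statistics alone. This is only a sketch, not SPSS’s procedure: it assumes the standard pooled-variance formula for an independent-samples t-test (the handout doesn’t spell out the formula), and the function and variable names are made up for illustration. The critical value of 1.73 is simply copied from the decision rule.

```python
import math

# Summary statistics from the handout.
n1, m1, s1 = 10, 16.2, 2.49   # memory cues group
n2, m2, s2 = 10, 9.9, 2.33    # control group


def independent_t(n1, m1, s1, n2, m2, s2):
    """Return (t, standard error of the difference, df).

    Assumes equal population variances (pooled-variance t-test).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Under H0 the expected difference between the means is zero.
    t = ((m1 - m2) - 0) / se_diff
    return t, se_diff, df


t_obs, se_diff, df = independent_t(n1, m1, s1, n2, m2, s2)

# One-tailed critical value for df = 18, alpha = .05, taken from the handout.
critical_one_tailed = 1.73
reject_h0 = t_obs >= critical_one_tailed

print(f"df = {df}, standard error = {se_diff:.2f}, t = {t_obs:.2f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Because this starts from rounded means and standard deviations, t comes out at about 5.84 rather than exactly the 5.85 that SPSS reports. The decision is the same either way.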
Because the value for t can now be significant in either direction, there are two critical values for t, and the decision rule looks like this:

If t ≤ -2.10 or if t ≥ +2.10, reject H0.

The observed value for t is the same (+5.85), so our decision is still to reject the null hypothesis. Our decision is to accept our new alternative hypothesis and conclude that:

Persons with AD in the memory cues group have perceived memory scores which are significantly different from the perceived memory scores of persons with AD in the control group, t (18) = 5.85, p < .05.

Deciding between a one-tailed and a two-tailed test

If the prediction of the researcher is that one sample mean will be different from the other one, the test is two-tailed.
If the prediction is that one mean will be greater than the other mean, the test is one-tailed.
If the prediction is that one mean will be less than the other one, the test is one-tailed.

Where the observed and critical values for t come from

Above, we used a t-test to show us whether or not the odds were less than 5% of getting a difference this large when the null hypothesis is true. But how can a t-test help us know whether these odds are less than 5%?

Statisticians can help us in this situation because they can show us what differences between means are likely to look like if we were to keep doing the same experiment over and over when the null hypothesis is true (i.e., when the memory cues don’t help with memory).

Pretend that you do the experiment and that you can know for sure that the independent variable doesn’t have an effect. Working with the memory cues doesn’t change people at all in terms of how good they perceive their memory to be. If this is the case, what would you guess the means for the two groups should look like in relation to each other? You’d expect them to be the same number. That’s because the experimenter didn’t do anything to make one group any different from the other group in terms of their memory function.

But do the means for the two groups have to be equal to each other?
No, the mean of a sample is just an estimate; it’s not the real thing. So even in a situation where two means should be the same number, you can’t really expect them to be. They’ll almost certainly be at least a little bit different from each other just by chance. So how far do the means have to be from each other for us to bet that they’re not that different by accident? That’s where the statisticians can help us.

Imagine you do the experiment once when the null hypothesis is true. There’s no reason, other than chance, for the two means to be different from each other, but let’s say that we get a mean of 11.0 for group 1 and a mean of 10.5 for group 2. That gives us a difference between the means of +.5. We can put that number where it goes on the scale of possible differences between the means. That’s one difference between the means that was collected when the null hypothesis is true.

Now pretend that you do the same experiment all over again. The null hypothesis is still true. This time the mean of group 1 is 8.5 and the mean of group 2 is 9.0. That makes the difference between the two means equal to -.5. Here’s where it goes.

Now pretend that we did the same experiment over and over. Hundreds of times. Thousands of times! We can now see how these differences between means pile up around zero. About half are bigger than zero, just by accident. About half are less than zero, just by accident. But the average of all of these numbers is equal to the number we’d expect it to be if the memory cues don’t work: zero.

Across thousands of different experiments, here’s what differences between means look like if the null hypothesis is true.

Here’s the difference between means we got from our experiment. Are we willing to believe that our difference between means belongs with these other ones or not?
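The pile of differences described above can be imitated with a small simulation. This is a sketch under stated assumptions, not part of the original example: it assumes perceived memory scores are normally distributed, and the population mean (10.0) and standard deviation (2.4) are hypothetical values chosen only to be in the same range as the sample statistics. The function name and the fixed random seed are likewise made up for illustration.

```python
import random
import statistics


def simulate_null_differences(n_experiments=10_000, n_per_group=10,
                              pop_mean=10.0, pop_sd=2.4, seed=0):
    """Repeat the experiment many times when the null hypothesis is true.

    Both groups are drawn from the SAME population, so any difference
    between their means is due to chance alone.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_experiments):
        group1 = [rng.gauss(pop_mean, pop_sd) for _ in range(n_per_group)]
        group2 = [rng.gauss(pop_mean, pop_sd) for _ in range(n_per_group)]
        diffs.append(statistics.fmean(group1) - statistics.fmean(group2))
    return diffs


diffs = simulate_null_differences()

average_diff = statistics.fmean(diffs)
share_above_zero = sum(d > 0 for d in diffs) / len(diffs)
# How many chance-only experiments produced a difference as big as ours (6.3)?
count_as_large = sum(d >= 6.3 for d in diffs)

print(f"Average difference between means: {average_diff:.3f}")
print(f"Share of differences above zero:  {share_above_zero:.3f}")
print(f"Experiments with a difference of 6.3 or more: {count_as_large}")
```

Under these assumptions the simulated differences average out very close to zero, about half land on each side of zero, and a difference as large as 6.3 essentially never turns up by chance.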
If we’re not willing to believe that our difference between means belongs with the others, we’re deciding that it must have been collected when the null hypothesis is false.

The name for this collection of differences between sample means is the sampling distribution of the difference between means. It’s a collection of a very large number of differences between means obtained when the null hypothesis is true. We get one difference between means from each hypothetical experiment we could do when the null hypothesis is true.

The way the test works, every number in this hypothetical collection is converted to a standard score. In other words, whenever we get a difference between means we convert it to a standard score.

You know what has to happen to convert any number into a standard score. Take that number and subtract the average of all the numbers. In this case the average of all the numbers is always going to be zero. This tells us how far one difference between the means is from zero.

Now divide this by the average amount that differences between means deviate from zero. This value is known as the standard error of the difference between means. From SPSS we learn that the standard error for this data set is 1.08.

Now we’ve got our standard score. And the symbol for this standard score is t.

t = (6.3 - 0)/1.08 = 5.83

The small gap between this and the 5.85 that SPSS reports is most likely due to rounding in the numbers shown here.

The critical value represents how far away from a standard score of zero you have to go to hit the start of the least likely 5% of numbers that make up the curve. With a one-tailed test, we put all 5% of that bet on one side of the curve. With a two-tailed test we split that 5% in half and put 2.5% on one side and 2.5% on the other side.

t=