P201 Lecture Notes 14: Two Population Tests

Prequel: Ways of conducting Research involving two populations
Suppose I’ve developed a new pain reliever that I want to compare with Tylenol. The research will involve
administration of the pain reliever, a waiting period for it to take effect, the administration of a standard pain
stimulus (forcing the participants to listen to a statistics lecture on sampling distributions), then administration
of an “Absence of Pain” questionnaire, with high scores indicating little pain felt. So higher scores are better.
The research can be conducted in three different ways.
Independent groups Design
Two separate (independent; not paired or matched) groups are used.
One group receives the new pain reliever - it's the experimental group.
The other receives the standard pain reliever - it's the control group.
Matched participants Design
[Figure: column of pretest scores: 78, 77, 75, 74, 73, 70, 68, 67, 60, 58, 57, 57, 56]
The groups consist of matched pairs - each person has a "matched" twin
in the other group.
Matching is performed with respect to one or more pretest variables related to
the dependent variable.
One group receives the new pain reliever - it's the experimental group.
The other receives the standard pain reliever - it's the control group.
Participants as their own controls Design
One group is identified.
Participants in the group are given the treatment at one time.
They're given the control condition at some other time.
Statistical analyses: looking ahead.
The independent groups design requires the independent groups t-test.
The matched participants and participants as their own controls designs are known
collectively as correlated groups designs. They require the correlated groups t-test or, as SPSS calls it,
the paired samples t-test (also called the dependent samples t-test).
Biderman’s P201 Handouts
Topic 13: Two Population Tests - 1
2/5/2016
Independent samples
Moving from one sample to two independent samples Module 19
One sample

\[
t = \frac{\text{Observed mean} - \text{Expected mean}}{S / \sqrt{N}}
  = \frac{\text{Observed mean} - \text{Expected mean}}{\sqrt{S^2 / N}}
\]

Two independent samples - equal sample sizes

\[
t = \frac{\text{Observed difference in means} - \text{Expected difference in means}}{\sqrt{\dfrac{S_1^2 + S_2^2}{N}}}
\]

\[
t = \frac{\text{Observed mean 1} - \text{Observed mean 2} - 0}{\sqrt{\dfrac{S_1^2 + S_2^2}{N}}}
\]

Two independent samples - unequal sample sizes

\[
t = \frac{\text{Observed mean 1} - \text{Observed mean 2} - 0}
         {\sqrt{\dfrac{(N_1 - 1) S_1^2 + (N_2 - 1) S_2^2}{N_1 - 1 + N_2 - 1}
                \left( \dfrac{1}{N_1} + \dfrac{1}{N_2} \right)}}
\]
The equal sample-sizes formula is simpler than the unequal sample sizes formula.
But since we NEVER compute t-statistics by hand, the distinction between them is irrelevant in the
computer age.
Since the unequal sample sizes formula yields the same number as the equal sample sizes formula when
sample sizes happen to be equal the computer programs that do the computations for us always use the
unequal sample sizes formula.
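As a quick check that the two formulas agree when the sample sizes are equal, here is a minimal Python sketch that evaluates both. The function names are illustrative only. The summary values (means 21.4 and 26.2, standard deviations 4.25609 and 4.42719, 15 scores per group) are borrowed from the fragrance example later in these notes.

```python
import math

def t_equal_n(mean1, mean2, var1, var2, n):
    """Equal-sample-sizes formula: both groups contain n scores."""
    return (mean1 - mean2 - 0) / math.sqrt((var1 + var2) / n)

def t_unequal_n(mean1, mean2, var1, var2, n1, n2):
    """Unequal-sample-sizes formula, built on the pooled variance."""
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 - 1 + n2 - 1)
    return (mean1 - mean2 - 0) / math.sqrt(pooled * (1 / n1 + 1 / n2))

# Variances are the squared standard deviations of the two groups.
var1, var2 = 4.25609 ** 2, 4.42719 ** 2

t_a = t_equal_n(21.4, 26.2, var1, var2, 15)
t_b = t_unequal_n(21.4, 26.2, var1, var2, 15, 15)
print(round(t_a, 3), round(t_b, 3))  # the two values agree
```

With equal N the pooled variance reduces to the simple average of the two variances, which is why the two results match.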
Independent Groups t-test: Overview
Situation
Two independent populations or research conditions.
Independent means that there has been no matching of individual participants from the two conditions.
Null Hypothesis: Population means are equal.
Symbols: µ1 = µ2
Alternative Hypothesis: Population means are not equal.
Symbols: µ1 ≠ µ2 or µ1 <> µ2
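Test Statistic: Independent Groups t-statistic

\[
t = \frac{\bar{X}_1 - \bar{X}_2}
         {\sqrt{\dfrac{(N_1 - 1) S_1^2 + (N_2 - 1) S_2^2}{N_1 - 1 + N_2 - 1}
                \left( \dfrac{1}{N_1} + \dfrac{1}{N_2} \right)}}
\]

Where:
X-bar1 and X-bar2 are the means of the two samples
S1² and S2² are the variances (standard deviations squared) of the samples
N1 and N2 are the sample sizes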
Distribution of t if the null hypothesis is true: T distribution with mean 0 and df = N1-1 + N2-1 = N1+N2-2
The denominator of the formula is called the standard error of mean differences.
\[
\frac{(N_1 - 1) S_1^2 + (N_2 - 1) S_2^2}{N_1 - 1 + N_2 - 1}
\]

is the pooled variance of the samples. It’s also called a weighted average.
The above t statistic is called the equal variances assumed t formula. It is used when it can be assumed
that the population variances are equal.
If you’re not sure whether or not the population variances are equal, SPSS prints an alternative t-statistic
that does not require the equal variances assumption. You can use whichever one you wish. More on that
later.
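To see why the pooled variance is called a weighted average, the short Python sketch below compares it with the simple average of the two sample variances. The function name and the numbers are hypothetical, chosen only for illustration.

```python
def pooled_variance(var1, n1, var2, n2):
    """Average of two sample variances, each weighted by its degrees of freedom."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 - 1 + n2 - 1)

# Hypothetical summary values, not taken from any example in these notes.
var1, n1 = 4.0, 10
var2, n2 = 10.0, 20

pooled = pooled_variance(var1, n1, var2, n2)
simple = (var1 + var2) / 2
print(round(pooled, 3), simple)
```

The larger sample gets the larger weight, so the pooled value sits closer to that sample's variance than the simple average does. When N1 = N2 the two averages coincide.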
Independent Groups t-test Example Problem
Based on the example given in Minium et al., p. 251. A student is interested in whether fragrances
enhance memory. He has participants read a passage from a text. Half the participants read the passage in
the presence of a pleasant but unfamiliar fragrance. The other half read the passage with no
experimenter-provided scent present. One week later, all participants are brought back to the lab, and are given
a test of their memory for facts from the passage they had read. The Scent group was given the test on a
sheet of paper scented with the same fragrance they had experienced when reading the passage. The other
group was given the test with no experimenter-provided scent. The interest was in a comparison of
performance of the two groups.
Start here on 10/30/12
The data are as follows . . .
id  scentgrp  recall
 1     0        30
 2     0        16
 3     0        22
 4     0        22
 5     0        22
 6     0        26
 7     0        25
 8     0        20
 9     0        16
10     0        23
11     0        20
12     0        21
13     0        26
14     0        16
15     0        16
16     1        25
17     1        29
18     1        23
19     1        22
20     1        32
21     1        24
22     1        26
23     1        21
24     1        26
25     1        25
26     1        20
27     1        25
28     1        27
29     1        36
30     1        32

Number of cases read: 30
Number of cases listed: 30

This is how data are supposed to be entered for the independent groups t-test.
Note that ALL scores are in the same column of the SPSS data editor.
For this test, we do NOT put the scores in two different columns.
To tell SPSS which group each score belongs to, we create a separate GROUP column and
put numbers (0 vs 1 or 1 vs 2) in it to identify the group.
The menu sequence for the independent groups t is
Analyze -> Compare Means -> Independent Samples T Test . . .
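For readers who want to see what the menu command computes, here is a minimal Python sketch that works from the same layout: one column of group codes and one column of scores. It uses only the standard library and the equal variances assumed formula. The variable names are illustrative, and it produces the t and df only, not the p-value.

```python
import math
from statistics import mean, variance

# One group-code column and one score column, as in the SPSS data editor.
scentgrp = [0] * 15 + [1] * 15
recall = [30, 16, 22, 22, 22, 26, 25, 20, 16, 23, 20, 21, 26, 16, 16,
          25, 29, 23, 22, 32, 24, 26, 21, 26, 25, 20, 25, 27, 36, 32]

# Use the group column to split the single score column into two samples.
no_scent = [score for g, score in zip(scentgrp, recall) if g == 0]
scent = [score for g, score in zip(scentgrp, recall) if g == 1]

n1, n2 = len(no_scent), len(scent)
pooled = ((n1 - 1) * variance(no_scent) + (n2 - 1) * variance(scent)) / (n1 + n2 - 2)
std_error = math.sqrt(pooled * (1 / n1 + 1 / n2))

t = (mean(no_scent) - mean(scent)) / std_error
df = n1 + n2 - 2
print(round(t, 3), df)
```

The difference is taken as group 0 minus group 1, so a negative t means the Scent group recalled more.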
The Hypothesis Testing Answer Sheet for the Independent Groups t-test
Describe the population or populations whose characteristics are being investigated.
Population of number of words recalled associated with a scent and
a population of number of words recalled that were not associated with a scent.

Formally state the
Null Hypothesis: Mean of scented score pop = mean of unscented score pop
Alternative Hypothesis: Mean of scented score pop ≠ mean of unscented score pop

Give the name and the formula of the test statistic that will be employed to test the null hypothesis.
Independent groups t statistic

What significance level will you use to separate "likely" values from "unlikely" values of the test statistic?
Hint: .05 is a popular choice.
0.05

What is the value of the test statistic computed from your data?
-3.027 (See below.)

What is the probability of a value as extreme as the above value if the null hypothesis were true, i.e., the p-value?
.005 (See “Sig.” field below.)

What is your conclusion? Do you reject or not reject the null hypothesis?
Reject the null.

What are the upper and lower limits of a 95% confidence interval appropriate for the problem? Present them in a sentence, with
standard interpretive language.
The probability is .95 that the interval -8.05 to -1.55 contains the difference in population means.
We can be 95% sure that the difference in population means is between -8.05 and -1.55.

State the implications of your conclusion for the problem you were asked to solve. That is, relate your statistical conclusion to the
problem.
Recall associated with scents is apparently better than recall with no scent associated with it.
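The confidence limits reported on the answer sheet can be reproduced by hand as mean difference plus or minus critical t times standard error. The sketch below is one way to do that in Python. The mean difference and standard error are the values SPSS prints for this example. The critical value 2.048 is an assumption taken from a standard t table (two-tailed .05, df = 28), not from the SPSS output.

```python
mean_difference = -4.8   # No scent mean minus Scent mean
std_error = 1.58565      # "Std. Error Difference" printed by SPSS
t_critical = 2.048       # assumed: two-tailed .05 critical t for df = 28, from a t table

lower = mean_difference - t_critical * std_error
upper = mean_difference + t_critical * std_error
print(round(lower, 2), round(upper, 2))
```

Because the whole interval lies below 0, it agrees with the decision to reject the null hypothesis of no difference.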
Analyze -> Compare Means -> Independent-Samples T Test
Group Statistics

        scentgrp        N    Mean      Std. Deviation   Std. Error Mean
recall  .00 No scent    15   21.4000   4.25609          1.09892
        1.00 Scent      15   26.2000   4.42719          1.14310

Independent Samples Test

Levene's Test for Equality of Variances (recall)
F       Sig.
.001    .978

t-test for Equality of Means (recall)
                              t        df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       -3.027   28       .005              -4.80000          1.58565
Equal variances not assumed   -3.027   27.957   .005              -4.80000          1.58565

95% Confidence Interval of the Difference (recall)
                              Lower      Upper
Equal variances assumed       -8.04806   -1.55194
Equal variances not assumed   -8.04828   -1.55172
Reading the independent groups t output.
Three tests of significance are presented in the table.
The first is a test that compares the variances of the two groups. It’s the F on the very left side of the table.
The result of this test determines which of the following two t-tests is to be used.

Decision tree
Is p-value for F <= .05?
  No  -> Interpret the “Equal Variances assumed” t
  Yes -> Interpret the “Equal Variances not assumed” t
We test for equality of variance first. If p > .05, we retain the hypothesis that the population variances
are equal, so we use the equal variances assumed t.
If p <= .05, we reject the null hypothesis of equal variances and use the equal variances not assumed t.
In this case, the p value for the variances test is .978 which is > .05, so we retain the hypothesis of equal
variances and use the equal variances assumed t.
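The decision rule can be written as a small Python function. The sketch below also computes the fractional df that goes with the equal variances not assumed t, under the assumption that SPSS uses the usual Welch-Satterthwaite adjustment for that row. Both function names are illustrative.

```python
def choose_t_row(levene_p, alpha=0.05):
    """Return the name of the output row to interpret, given the p-value of the F test."""
    if levene_p <= alpha:
        return "Equal variances not assumed"
    return "Equal variances assumed"

def welch_df(sd1, n1, sd2, n2):
    """Welch-Satterthwaite df (assumed to be what the 'not assumed' row reports)."""
    a = sd1 ** 2 / n1
    b = sd2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

row = choose_t_row(0.978)
adjusted_df = welch_df(4.25609, 15, 4.42719, 15)
print(row, round(adjusted_df, 3))
```

For the fragrance data the two sample variances are nearly equal, so the adjusted df is only slightly below 28 and the two rows of the output give the same t.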
Correlated Groups t-test: Overview
Situation
Two correlated research conditions.
Correlated means that either 1) participants have been matched or 2) each participant serves in both
conditions.
Null Hypothesis: Population means are equal.
Alternative Hypothesis: Population means are not equal.
Test Statistic: Correlated Groups t Statistic

\[
t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2 + S_2^2 - 2 r S_1 S_2}{N}}}
  = \frac{\bar{D}}{S_D / \sqrt{N}}
\]

Start here on Thursday, 11/1/12

where
X-bar1 = Mean of the sample from the first population
X-bar2 = Mean of the sample from the second population
S1 = Standard deviation of the sample from the first population.
S2 = Standard deviation of the sample from the second population.
N = Number of pairs.
r = Pearson Product Moment Correlation Coefficient between the pairs.
D-bar = The mean of the paired difference scores.
SD = The standard deviation of the paired difference scores.

The subtraction in the denominator of the correlated groups t makes the denominator smaller, which
makes the t bigger than the independent groups t for the same data.
This means that it’s more likely to be significant.
This statistic is more powerful to detect pop mean differences than is the independent groups t.
r must be positive for this to happen.
The correlated groups t statistic has a T distribution with degrees of freedom (df) = N-1.
The denominator of the formula is called the standard error of mean differences.
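The two forms of the correlated groups t (one built from the means, standard deviations, and r, the other from the paired difference scores) give the same number. The Python sketch below checks this on five made-up pairs and compares the result with the independent groups t computed from the same scores. The data and variable names are hypothetical, for illustration only.

```python
import math
from statistics import mean, stdev

# Made-up scores for five pairs; not data from these notes.
x1 = [10, 12, 14, 16, 18]
x2 = [11, 14, 15, 18, 20]
n = len(x1)

m1, m2 = mean(x1), mean(x2)
s1, s2 = stdev(x1), stdev(x2)

# Pearson r between the paired scores.
r = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / ((n - 1) * s1 * s2)

# Form 1: means, standard deviations, and r.
t_from_r = (m1 - m2) / math.sqrt((s1 ** 2 + s2 ** 2 - 2 * r * s1 * s2) / n)

# Form 2: mean and standard deviation of the paired difference scores.
d = [a - b for a, b in zip(x1, x2)]
t_from_d = mean(d) / (stdev(d) / math.sqrt(n))

# Independent groups t (equal-N formula) on the same scores, for comparison.
t_independent = (m1 - m2) / math.sqrt((s1 ** 2 + s2 ** 2) / n)

print(round(r, 3), round(t_from_r, 3), round(t_from_d, 3), round(t_independent, 3))
```

Here r is strongly positive, so the correlated groups t is much larger in absolute value than the independent groups t, even though the mean difference is the same in both.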
Correlated Groups t-test Example problem.
An I/O psychologist is interested in using scores on a personality test to predict job performance. But it’s
possible that personality test scores can be faked. In order to determine whether people actually fake
personality tests, she gives a Conscientiousness scale to a group of college students under instructions to
respond honestly. Then she gives them an equivalent scale under instructions that the 3 highest scorers on
the scale will receive a gift certificate to a local mall. The conscientiousness scale is embedded in a battery
of scales, so it’s not too obvious that the researcher is interested in scores on that scale specifically.
The hypothetical data for 30 participants is presented below . . .
id  hscore  iscore
 1   6.0     4.7
 2   4.5     5.1
 3   4.6     5.8
 4   5.0     4.9
 5   3.3     4.6
 6   4.1     4.7
 7   5.3     4.3
 8   3.3     3.8
 9   4.7     4.2
10   3.3     5.1
11   3.6     4.1
12   3.2     4.1
13   3.8     3.3
14   3.9     4.6
15   6.2     6.5
16   5.3     5.7
17   4.1     4.8
18   5.7     5.5
19   4.5     6.0
20   4.7     4.2
21   4.3     4.4
22   3.1     2.5
23   4.0     5.2
24   3.0     5.2
25   2.7     4.8
26   4.0     4.9
27   4.5     5.9
28   3.6     4.4
29   3.8     4.7
30   6.9     6.3

Number of cases read: 30
Number of cases listed: 30

For the correlated groups t-test, the scores must be put in different columns of the data
editor window.
Pairing of the scores must be maintained: each row has members of the same pair.
The menu sequence for the correlated groups t is
Analyze -> Compare means -> Paired Samples T test . . .
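As a rough illustration of what the paired samples procedure computes, the Python sketch below takes the two score columns as listed, forms the difference score for each row, and applies the difference-score form of the correlated groups t. It uses only the standard library, and the variable names are illustrative. The listed scores are shown to one decimal place, so the result may differ slightly from output computed on unrounded values.

```python
import math
from statistics import mean, stdev

# Two columns, one row per participant; row order preserves the pairing.
hscore = [6.0, 4.5, 4.6, 5.0, 3.3, 4.1, 5.3, 3.3, 4.7, 3.3,
          3.6, 3.2, 3.8, 3.9, 6.2, 5.3, 4.1, 5.7, 4.5, 4.7,
          4.3, 3.1, 4.0, 3.0, 2.7, 4.0, 4.5, 3.6, 3.8, 6.9]
iscore = [4.7, 5.1, 5.8, 4.9, 4.6, 4.7, 4.3, 3.8, 4.2, 5.1,
          4.1, 4.1, 3.3, 4.6, 6.5, 5.7, 4.8, 5.5, 6.0, 4.2,
          4.4, 2.5, 5.2, 5.2, 4.8, 4.9, 5.9, 4.4, 4.7, 6.3]

# Difference score for each pair: honest minus incentive.
differences = [h - i for h, i in zip(hscore, iscore)]
n = len(differences)

std_error = stdev(differences) / math.sqrt(n)
t = mean(differences) / std_error
df = n - 1
print(round(mean(differences), 2), round(t, 2), df)
```

The sign of t depends only on the order of subtraction. Honest minus incentive gives a negative t here, and incentive minus honest gives the same value with a positive sign.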
Describe the population or populations whose characteristics are being investigated.
Population of conscientiousness scores obtained under instructions to respond honestly and the population of
conscientiousness scores under incentive to fake positive.

Formally state the
Null Hypothesis: Mean of the H population C scores = Mean of I population
Alternative Hypothesis: Mean of the H population C scores <> Mean of I population

Give the name and the formula of the test statistic that will be employed to test the null hypothesis.
Correlated groups (dependent samples)(correlated samples) t statistic

What significance level will you use to separate "likely" values from "unlikely" values of the test statistic?
Hint: .05 is a popular choice.
.05

What is the value of the test statistic computed from your data?
3.123

What is the probability of a value as extreme as the above value if the null hypothesis were true, i.e., the p-value?
.004

What is your conclusion? Do you reject or not reject the null hypothesis?
Reject the null hypothesis

What are the upper and lower limits of a 95% confidence interval appropriate for the problem? Present them in a sentence, with
standard interpretive language.
The probability is .95 that the difference in the population means (incentive minus honest) is between .18 and .84.
We can be 95% confident that the difference in population means is between .18 and .84.

State the implications of your conclusion for the problem you were asked to solve. That is, relate your statistical conclusion to the
problem.
Apparently participants faked positively on the conscientiousness scale when given a modest incentive to do so.
T-Test

Paired Samples Statistics

                 Mean     N    Std. Deviation   Std. Error Mean
Pair 1  hscore   4.3052   30   1.01407          .18514
        iscore   4.8072   30   .85582           .15625

Paired Samples Correlations

                          N    Correlation   Sig.
Pair 1  hscore & iscore   30   .563          .001

Paired Samples Test

Paired Differences (Pair 1: hscore - iscore)
Mean      Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper
-.50196   .88483           .16155            -.83236        -.17155

t        df   Sig. (2-tailed)
-3.107   29   .004
Two sample t-tests example problems
1. A psychologist has devised a new method of teaching a foreign language. She chooses 30 persons who
have never spoken French and then places 15 of them in a regular college-level French class. The other 15
students are taught using her new method. The results are below. Set up and conduct the appropriate test.
The dependent variable is the no. of questions answered correctly on a standardized examination covering
knowledge of French.
        Old method   New method
Mean    55.3         54.3
S.D.    12.3         11.2
N       15           15
2. Suppose you have been put in charge of evaluating the design of the packaging for a new product your
firm is marketing. You select 12 persons and have them evaluate both the old design and the new design.
Six persons see the new design first. The other six see the old design first. The products are evaluated on a
variety of measures. Our interest here is on the responses on an overall, summary scale of favorability to
the product. The data are presented below:
Person:   1   2   3   4   5   6   7   8   9  10  11  12
Old      32  35  44  49  19  23  30  30  28  48  34  38
New      29  34  38  45  22  21  29  30  22  41  31  35
3. Suppose you're investigating the effects of temperature on performance on the job. You select two work
areas in which employees perform the same tasks. In one of the areas, you arrange to have the ambient
temperature set to 78° F. In the other area, you arrange to have it set equal to 70° F. In both areas, workers
wear fairly heavy protective clothing. The results are as follows. The dependent variable was a measure of
output on a scale of from 0 to 20.
70° F: 11 13 15 14 15 18 17 12 13 19 20 13 15
78° F: 10 9 8 12 15 16 13 14 18 12 13 11 10 6 13
4. In an attempt to assess the effect of placing police cars at key places on the interstate system, a researcher
puts a police car on the highway and records speeds just prior to motorists' seeing the car and just after. The
results are as follows. Before speeds are first.
65-67 63-58 70-60 72-72 55-55 57-56 45-48 79-59 60-57 58-54 59-60 68-62 64-58