
RMTD 404
Lecture 7
Two Independent Samples t Test
Probably the most common experimental design involving two groups is one in which participants are randomly assigned to treatment and control conditions.
Because such observations are likely to have minimal dependence between treatment and control participants (although some dependence is possible; people might know one another, or there may be unintentional similarities among subgroups within each condition), the observations are said to be independent between the two groups.
Hence, when we want to compare the means of two groups, do not know the population variance, and have observations from two independent samples, the appropriate statistic is the two independent samples t-test.
The relevant sampling distribution for this test is the sampling distribution of differences between two means ($\bar{X}_1 - \bar{X}_2$). If we were to create such a distribution, we would collect many pairs of samples (one from each group), compute the mean for each of the two samples, and record the difference of these means. The sampling distribution would be the distribution of a very large number of differences between the two means.
Obviously, our interest is in determining whether it is likely that the two samples came from different populations. Hence, the most common null hypothesis is that $\mu_{X_1} = \mu_{X_2}$, or that $\mu_{X_1} - \mu_{X_2} = 0$ for the two-tailed case. The null hypothesis says that the means of the two samples come from the same population, or that there is no difference between the means of the two samples in the population.
The variance of the sampling distribution of the differences between two means, on the other hand, is obtained from the variance sum law, which states that the variance of a sum or difference of two independent variables is equal to the sum of their variances.

$$\sigma^2_{X_1 \pm X_2} = \sigma^2_{X_1} + \sigma^2_{X_2}$$
Note that this is only true when the variables are independent.
This law does not apply to the standard error for the two matched (paired) samples t-test. In that case, the variance of the sum or difference equals the sum of the variances plus or minus two times the product of the correlation and the two standard deviations.

$$\sigma^2_{X_1 \pm X_2} = \sigma^2_{X_1} + \sigma^2_{X_2} \pm 2\rho_{X_1 X_2}\,\sigma_{X_1}\sigma_{X_2}$$
Applying the variance sum law to the sampling distribution of the differences between two means, we have the following.

$$\sigma^2_{\bar{X}_1 - \bar{X}_2} = \sigma^2_{\bar{X}_1} + \sigma^2_{\bar{X}_2} = \frac{\sigma^2_{X_1}}{N_1} + \frac{\sigma^2_{X_2}}{N_2}$$
The final point about the sampling distribution of the differences between two
means concerns its mean. The sum or difference of two independent normal
distributions is itself normally distributed with a mean equal to the sum or
difference in the means of the respective distributions. Hence, we know that
the sampling distribution of the difference between two means is normally
distributed with a mean equal to the difference between the population means (i.e., $\mu_{\bar{X}_1 - \bar{X}_2} = \mu_{X_1} - \mu_{X_2}$), given a large sample.
And, we can extend our t statistic in the same way we did in the previous
example. We are interested in comparing an observed statistic to the
parameter it estimates, and we know the standard error of that statistic.
Hence, we can write a t-test for two independent samples as follows.
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_{X_1} - \mu_{X_2})}{\sqrt{s^2_{\bar{X}_1} + s^2_{\bar{X}_2}}} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_{X_1} - \mu_{X_2})}{\sqrt{\dfrac{s^2_{X_1}}{N_1} + \dfrac{s^2_{X_2}}{N_2}}}$$
Incidentally, if we know the two population variances, we can use the z statistic
version of the test.
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_{X_1} - \mu_{X_2})}{\sqrt{\dfrac{\sigma^2_{X_1}}{N_1} + \dfrac{\sigma^2_{X_2}}{N_2}}}$$
Given that we typically test the null hypothesis that $\mu_{X_1} = \mu_{X_2}$, or $\mu_{X_1} - \mu_{X_2} = 0$, we can drop the $\mu_{X_1}$ and $\mu_{X_2}$ out of the t-statistic equation.

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s^2_{X_1}}{N_1} + \dfrac{s^2_{X_2}}{N_2}}}$$
It is important to note that this statistic is only valid when:
(a) the observations are independent between samples,
(b) the populations are normally distributed or the sample size is large enough to rely on the central limit theorem for normalizing the sampling distribution,
(c) the sample sizes are equal (N1 = N2), and
(d) the variances of the two populations are equal ($\sigma^2_{X_1} = \sigma^2_{X_2}$).
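To make this concrete, here is a minimal R sketch using simulated data for two equal-sized, independent groups; the variable names and the values passed to rnorm() are made up for illustration. Because the sample sizes are equal, the separate-variances standard error used here matches the pooled one, so the hand computation agrees with R's pooled-variance t test.

# Hypothetical data: two equal-sized independent samples
set.seed(1)
x1 <- rnorm(25, mean = 52, sd = 10)   # "treatment" scores (illustrative)
x2 <- rnorm(25, mean = 48, sd = 10)   # "control" scores (illustrative)

# t = (X1bar - X2bar) / sqrt(s1^2/N1 + s2^2/N2)
t_manual <- (mean(x1) - mean(x2)) /
  sqrt(var(x1) / length(x1) + var(x2) / length(x2))
t_manual

# With equal Ns this matches R's pooled-variance t test
t.test(x1, x2, var.equal = TRUE)$statistic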
Recall that the shape of the t distribution changes as sample size increases, and
we depict the shape of the distribution based on the concept of degrees of
freedom.
Note that for the two independent samples t-test, there are two sources of
degrees of freedom—one for the variance of each sample. Hence, the
degrees of freedom for the two independent samples t-test is simply the sum
of the degrees of freedom of the two variances that are being estimated.
$$df_1 = N_1 - 1 \qquad df_2 = N_2 - 1$$
$$df_{total} = df_1 + df_2 = (N_1 - 1) + (N_2 - 1) = N_1 + N_2 - 2$$
Graphically, here’s what we do:
[Figure: the null sampling distribution of $\bar{X}_1 - \bar{X}_2$, centered at $\mu_{\bar{X}_1 - \bar{X}_2} = \mu_{X_1} - \mu_{X_2} = 0$; the observed difference $\bar{X}_1 - \bar{X}_2$ is converted to $t = (\bar{X}_1 - \bar{X}_2)/s_{\bar{X}_1 - \bar{X}_2}$ and compared with $t_{CV}$ to obtain $p$.]
Example using SPSS:
IV: Gender (1 = Male; 2 = Female)
DV: Reading Achievement

Group Statistics (Follow-up Reading std score)

COMPOSITE SEX      N       Mean      Std. Deviation   Std. Error Mean
MALE               117     50.4083   10.37854         .95950
FEMALE             138     51.3812   9.03615          .76921
Independent Samples Test (Follow-up Reading std score)

Levene's Test for Equality of Variances:  F = 5.490, Sig. = .020

t-test for Equality of Means:
                                 t       df        Sig.         Mean         Std. Error   95% CI of the Difference
                                                   (2-tailed)   Difference   Difference   Lower        Upper
Equal variances assumed        -.800     253       .424         -.97287      1.21585      -3.36734     1.42160
Equal variances not assumed    -.791     231.910   .430         -.97287      1.22976      -3.39580     1.45006
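As a rough check on this output, a few lines of R can reproduce the t and standard error of the "Equal variances not assumed" row from the summary values in the Group Statistics table, using the separate-variances formula given earlier (a sketch; the object names are arbitrary).

# Summary statistics from the Group Statistics table above
m1 <- 50.4083; s1 <- 10.37854; n1 <- 117   # males
m2 <- 51.3812; s2 <- 9.03615;  n2 <- 138   # females

se_sep <- sqrt(s1^2 / n1 + s2^2 / n2)      # separate-variances standard error
se_sep                                     # about 1.22976
(m1 - m2) / se_sep                         # about -0.791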
When the sample sizes are unequal, we must pool the variances in the formula
for the standard error by weighting each variance by its degrees of freedom.
We do this weighting because the simple average of the two variances would
cause the smaller sample to have more influence per person than would be
the case for the larger sample. We call this weighted average the pooled
variance estimate.
$$s^2_p = \frac{(N_1 - 1)s^2_{X_1} + (N_2 - 1)s^2_{X_2}}{N_1 + N_2 - 2}$$

When $N_1 = N_2 = N$:

$$s^2_p = \frac{(N - 1)s^2_{X_1} + (N - 1)s^2_{X_2}}{2N - 2} = \frac{(N - 1)(s^2_{X_1} + s^2_{X_2})}{2(N - 1)} = \frac{s^2_{X_1} + s^2_{X_2}}{2}$$
Hence, the correction (pooling) makes no difference when the sample sizes are equal. As a result, you can always use the pooled (unequal sample size) formula, even when the sample sizes are equal.
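A quick numeric check of this algebra in R, using arbitrary made-up variances and a common sample size:

s1sq <- 9; s2sq <- 16; n <- 20                            # made-up values
pooled <- ((n - 1) * s1sq + (n - 1) * s2sq) / (2 * n - 2)
simple <- (s1sq + s2sq) / 2
c(pooled, simple)                                         # both equal 12.5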
We then substitute the pooled variance estimate into the t statistic.
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s^2_p}{N_1} + \dfrac{s^2_p}{N_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s^2_p\left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)}}$$
Note that all the pooled variance estimate accomplishes is weighting each variance by its degrees of freedom so that all observations in the study make an equal contribution to the magnitude of the standard error.
* SPSS always uses the pooled variance for its "Equal variances assumed" results, even when N1 = N2.
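For the gender example shown earlier, a short R sketch can recompute the pooled variance and recover the "Equal variances assumed" row of the SPSS output (object names are arbitrary; the numbers come from the Group Statistics table).

m1 <- 50.4083; s1 <- 10.37854; n1 <- 117
m2 <- 51.3812; s2 <- 9.03615;  n2 <- 138

sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)
se_pooled <- sqrt(sp2 * (1 / n1 + 1 / n2))
se_pooled               # about 1.21585
(m1 - m2) / se_pooled   # about -0.800, on n1 + n2 - 2 = 253 df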
Two Independent Samples t Test: Unequal $\sigma^2$
When the population variances are equal (a.k.a. homogeneous variances), the
pooled variance estimate allows us to average sample variances that have
different values to produce an estimate of the population variance.
But, when the population variances are unequal (a.k.a. heterogeneous
variances), we cannot simply average the variances because the distribution
of such a t statistic does not follow a t distribution.
The most common solution to this problem was developed by Welch and Satterthwaite. Their correction adjusts the degrees of freedom for the t-test via the following formula.

$$df' = \operatorname{int}\left[\frac{\left(\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}\right)^{2}}{\dfrac{\left(s_1^2/N_1\right)^{2}}{N_1 - 1} + \dfrac{\left(s_2^2/N_2\right)^{2}}{N_2 - 1}}\right]
\quad\text{or}\quad
df' = \operatorname{int}\left[\frac{\left(\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}\right)^{2}}{\dfrac{\left(s_1^2/N_1\right)^{2}}{N_1 + 1} + \dfrac{\left(s_2^2/N_2\right)^{2}}{N_2 + 1}} - 2\right]$$
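Applied to the gender example, the first form of the formula reproduces the degrees of freedom SPSS reports in the "Equal variances not assumed" row; note that SPSS prints the unrounded value (231.910) rather than taking the integer part. A small R sketch:

s1 <- 10.37854; n1 <- 117
s2 <- 9.03615;  n2 <- 138

v1 <- s1^2 / n1
v2 <- s2^2 / n2
(v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))   # about 231.9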
The Welch-Satterthwaite correction adjusts the degrees of freedom to a value between the degrees of freedom for the smaller sample and the degrees of freedom for the combined sample.
An obvious question is: how do you know whether to use this correction? The problem is that we never know whether the population variances are equal because we cannot observe those parameters. The answer is that we compare the two sample variances to determine the likelihood that they could have been produced by populations with the same variance, which is itself another hypothesis testing problem.
$$H_0: \sigma^2_1 = \sigma^2_2$$
In SPSS, we look at Levene's test to decide whether the two sample variances could have come from populations with the same variance.
If we reject this null hypothesis, we conclude that the sample variances could not have come from a common population, and we use the Satterthwaite correction for the t-test's degrees of freedom. If we retain the null hypothesis, we conclude that the sample variances could have come from a common population, and we use the equal variance degrees of freedom for the t-test.
This is the example we saw earlier:
Independent Samples Test (Follow-up Reading std score)

Levene's Test for Equality of Variances:  F = 5.490, Sig. = .020

t-test for Equality of Means:
                                 t       df        Sig.         Mean         Std. Error   95% CI of the Difference
                                                   (2-tailed)   Difference   Difference   Lower        Upper
Equal variances assumed        -.800     253       .424         -.97287      1.21585      -3.36734     1.42160
Equal variances not assumed    -.791     231.910   .430         -.97287      1.22976      -3.39580     1.45006
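For reference, here is a hedged sketch of how the same decisions might be made in R, using a made-up data frame (the variables score and sex are illustrative). Levene's test is available through the car package, and R's t.test() applies the Welch correction by default, so its default output corresponds to the "Equal variances not assumed" row.

# Hypothetical data frame; the values are simulated for illustration only
d <- data.frame(score = c(rnorm(117, 50.4, 10.4), rnorm(138, 51.4, 9.0)),
                sex   = factor(rep(c("MALE", "FEMALE"), c(117, 138))))

# Levene's test for equality of variances (requires the car package);
# center = mean mimics SPSS's version of the test
# install.packages("car")
car::leveneTest(score ~ sex, data = d, center = mean)

# t.test() uses the Welch correction by default; var.equal = TRUE gives the
# pooled-variance ("equal variances assumed") test
t.test(score ~ sex, data = d)
t.test(score ~ sex, data = d, var.equal = TRUE)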
To summarize, we’ve mentioned three assumptions that need to be considered
when using a t-test.
1. Normality of the sampling distribution of the differences. For equal sample sizes, violating this assumption has only a small impact. If the distributions are skewed, then serious problems arise unless the variances are similar.
2. Homogeneity of variance. For equal sample sizes, violating this assumption has only a small impact.
3. Equality of sample size. When sample sizes are unequal and variances are non-homogeneous, there are large differences between the assumed and true Type I error rates.
Confidence Intervals
We've discussed two approaches to interpreting a hypothesis test so far: (a) determine whether the p-value is smaller than the α that you've chosen, and (b) determine whether the observed statistic is more extreme than the critical value.
A third and slightly different approach to interpreting a hypothesis test is referred
to as constructing a confidence interval around a parameter estimate. Note
that the examples we have used so far all determine the degree to which a
point estimate of a parameter could have come from a particular null
distribution.
The notion of confidence intervals approaches the issue from a slightly different
angle. Instead of determining the likelihood that the observed statistic
could have come from a null population, we could construct an interval
around the point estimate that identifies the range of parameter values that
could have produced that statistic.
Let's look at an example of a confidence interval. Below is the formula for a t statistic.
$$t = \frac{\bar{X} - \mu_0}{s_{\bar{X}}} = \frac{\bar{X} - \mu_0}{s_X/\sqrt{N}}$$
Once we’ve collected data, we know the values of the sample mean, the sample
standard deviation, and the sample size.
The only unknowns are the parameter (μ0) and the t value. Typically, we solve for t by plugging in a null value for μ0. But what if we plugged in a value for t instead? Then we could solve for μ0. The best way to do this is to choose a critical value from the t distribution that corresponds to the α level of interest. Doing so allows us to identify the range of parameter values that could have produced the observed mean without leading us to reject the null hypothesis.
Graphically, here’s what I’m saying:
[Figure: a sampling distribution centered on $\bar{X}$ with cutoffs at $\bar{X} - t_{cv}s_{\bar{X}}$ and $\bar{X} + t_{cv}s_{\bar{X}}$, an area of $\alpha$ marked beyond each cutoff, and parameter limits $\mu_L$ and $\mu_U$ satisfying $\bar{X} = \mu_L + t_{cv}s_{\bar{X}} = \mu_U - t_{cv}s_{\bar{X}}$.]
Hence, we can create confidence intervals around an observed statistic (point
estimate) that would encompass parameter values that were likely to have
produced the statistic. Strictly speaking, we are not estimating the range of
population parameters because only a single population parameter produced
the observed statistic. Rather, we are estimating how much the observed
statistic could vary due to random variation, and hence, we are estimating
the range of parameter values that could have produced the observed
statistic, given its likely variation due to sampling error.
The general formula for creating such a confidence interval is:

$$CI_{(1-\alpha)\%}(\text{parameter}) = \text{statistic} \pm (\text{critical value}_{(1-\alpha)\%})(SE_{\text{statistic}})$$

This equation is read as: the $100(1-\alpha)\%$ confidence interval (CI) around the observed statistic equals the observed statistic plus and minus the product of the critical value associated with $\alpha$ and the standard error of the statistic. In the case of a mean when the population variance is unknown:

$$CI_{(1-\alpha)\%}(\mu) = \bar{X} \pm t_{cv_{(1-\alpha)\%}} \times SE_{\bar{X}}$$
An example: suppose we want to create a 95% confidence interval around an observed mean LUC GRE score of 565, where the standard deviation equals 75 and N = 300. We first obtain the appropriate critical t value for a two-tailed test with α = .05 (half of the alpha, .025, goes into each tail), which is approximately 1.96. The standard error of the mean equals 4.33:

$$SE_{\bar{X}} = \frac{75}{\sqrt{300}} = 4.33$$
Plugging the known values into the CI equation gives us:

$$CI_{95\%}(\mu) = 565 \pm 1.96 \times 4.33$$
$$LL = 556.51 \qquad UL = 573.49$$
Hence, we can say that population means ranging from 556.51 to 573.49 could have produced the observed mean of 565, 95% of the time, due to sampling error. We would write this as:

$$CI_{95\%}(\mu): 556.51 \le \mu \le 573.49$$
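The same interval in R; 1.96 is the z-based approximation used above, and qt() gives the exact t critical value for N - 1 = 299 degrees of freedom.

xbar <- 565; s <- 75; n <- 300
se <- s / sqrt(n)                          # 4.33
xbar + c(-1, 1) * 1.96 * se                # 556.51  573.49
xbar + c(-1, 1) * qt(.975, n - 1) * se     # slightly wider, using the exact t value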
We can draw similar confidence intervals around the difference between two means.

$$CI_{(1-\alpha)\%}(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm t_{cv} \times s_{\bar{X}_1 - \bar{X}_2}$$
Look at the CI in the example from SPSS output:
Independent Samples Test (Follow-up Reading std score)

Levene's Test for Equality of Variances:  F = 5.490, Sig. = .020

t-test for Equality of Means:
                                 t       df        Sig.         Mean         Std. Error   95% CI of the Difference
                                                   (2-tailed)   Difference   Difference   Lower        Upper
Equal variances assumed        -.800     253       .424         -.97287      1.21585      -3.36734     1.42160
Equal variances not assumed    -.791     231.910   .430         -.97287      1.22976      -3.39580     1.45006
Let's try this out (using 1.96 as an approximation to the t critical value):
CI.L = -.97287 - 1.22976*1.96
CI.L
[1] -3.38
CI.U = -.97287 + 1.22976*1.96
CI.U
[1] 1.44
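To reproduce SPSS's reported limits (-3.39580 and 1.45006) exactly, use the t critical value for the Welch degrees of freedom from the output rather than the 1.96 approximation:

tcv <- qt(.975, df = 231.910)            # t critical value for the Welch df
-0.97287 + c(-1, 1) * tcv * 1.22976      # approximately -3.396 and 1.450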
Some practice in R…
Review…