How to Perform and Interpret T Tests

Introduction
To compare a continuous outcome variable between two groups, the t test is often the most
appropriate statistical test. There are two types of t test: the two-sample t test and the
paired-sample t test. The appropriate test must be chosen according to whether the sample
data are independent or paired.
For the pooled (two-sample) test the data are independent random samples, so that every
observation is independent of every other observation, whereas in the paired test the
paired data may be dependent, frequently being observations on the same individual. The
two tests are described in detail below.
Independent-samples t test (two-sample t test):
This is used to compare the means of one variable for two groups of cases. As an
example, a practical application would be to find out the effect of a new drug on blood
pressure. Patients with high blood pressure would be randomly assigned to two groups,
a placebo group and a treatment group. The placebo group would receive conventional
treatment while the treatment group would receive a new drug that is expected to lower
blood pressure. After treatment for a couple of months, the two-sample t test is used to
compare the average blood pressure of the two groups. Note that each patient is measured
once and belongs to one group.
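Test: Suppose we have two independent random samples, $X_1, X_2, \ldots, X_m$ from a
$N(\mu_X, \sigma_X^2)$ distribution, and $Y_1, Y_2, \ldots, Y_n$ from a
$N(\mu_Y, \sigma_Y^2)$ distribution. We wish to make inferences about the difference
$\mu_X - \mu_Y$ in the population means. Write $\bar{X}$ and $\bar{Y}$ for the sample
means and $S_X^2$ and $S_Y^2$ for the (unbiased) sample variances. If we can assume the
unknown variances are equal, $\sigma_X^2 = \sigma_Y^2 = \sigma^2$ say, the common variance
can be estimated using
$$S_P^2 = \frac{(m-1)S_X^2 + (n-1)S_Y^2}{m+n-2}.$$
The resulting test statistic,
$$T_{\mathrm{pooled}} = \frac{\bar{X} - \bar{Y} - (\mu_X - \mu_Y)}{S_P\sqrt{1/m + 1/n}},$$
has the $t_{m+n-2}$ distribution. This is the test statistic for the pooled t-test.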
Example for the Independent-Samples T Test using SPSS
To illustrate this procedure, consider the data shown in Table 1 below. Twenty patients
suffering from high blood pressure were randomly selected and assigned to two separate
groups. One group, called the placebo group, was given conventional treatment and the
other group, called newdrug, was given a new drug. The aim was to investigate whether
the new drug reduces blood pressure.
Table 1: Blood pressure data
Placebo group    New drug group
71               90
79               95
69               67
98               120
91               89
85               92
89               100
75               82
78               79
80               85
Below is the SPSS T test output listing:
The output listing starts with a table of statistics for the two groups, followed by another
table showing the mean difference between the two groups and some other statistics. One
of the assumptions underlying the use of the t test is equality of variance, so the Levene
test for homogeneity (equality) of variance is included in the table. Provided the F value
is not significant (p > 0.05), the variances can be assumed to be homogeneous and the
Equal Variance line values for the t test can be used. If p < 0.05, then the equality of
variance assumption has been violated and the t test based on the separate variance
estimates (Unequal Variances) should be used.
In this case, the Levene test is not significant, so the t value calculated with the pooled
variance estimate (Equal Variance) is appropriate. With a 2-Tail Sig (i.e. p-value) of
0.130 (i.e. 13%), the difference between means is not significant at the 5% level.
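The pooled calculation can be reproduced by hand. The following is a minimal Python sketch, not the SPSS procedure itself; it assumes the Table 1 values are read as alternating (placebo, new drug) pairs, and the function name `pooled_t` is made up for illustration.

```python
import math
import statistics

# Table 1 values, read as alternating (placebo, new drug) pairs (an assumption).
placebo = [71, 79, 69, 98, 91, 85, 89, 75, 78, 80]
new_drug = [90, 95, 67, 120, 89, 92, 100, 82, 79, 85]


def pooled_t(x, y):
    """Return (t, df) for the equal-variance two-sample t test of equal means."""
    m, n = len(x), len(y)
    # statistics.variance uses the unbiased (n - 1) denominator.
    s2x, s2y = statistics.variance(x), statistics.variance(y)
    # Pooled estimate of the common variance.
    s2p = ((m - 1) * s2x + (n - 1) * s2y) / (m + n - 2)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(s2p * (1 / m + 1 / n))
    return t, m + n - 2


t_stat, df = pooled_t(placebo, new_drug)
print(f"t = {t_stat:.3f}, df = {df}")
```

Under that reading of the table, |t| is about 1.58 on 18 degrees of freedom, below the two-sided 5% table value of about 2.101, which agrees with the non-significant p-value of 0.130 reported above.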
Paired-samples t test (dependent t test):
This is used to compare the means of two variables for a single group. The procedure
computes the differences between the values of the two variables for each case and tests
whether the average difference differs from zero. For example, you may be interested in
evaluating the effectiveness of a mnemonic method on memory recall. Subjects are given a
passage from a book to read; a few days later, they are asked to reproduce the passage and
the number of words recalled is noted. Subjects are then sent to a mnemonic training
session. They are then asked to read and reproduce the passage again and the number of
words is noted. Thus each subject has two measures, often called before and after measures.
An alternative design for which this test is used is a matched-pairs or case-control study.
For example, in a blood pressure study, patients and controls might be matched by age,
that is, a 64-year-old patient with a 64-year-old control group member. Each record in the
data file will contain the response from the patient and also that of his or her matched
control subject.
Test: Suppose now we have observations $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$
occurring as independent pairs, as often arises in before-after situations (for example,
is a diet or medical treatment effective?). The X's and Y's may not be independent; they
are frequently observations on the same subject. However, the differences
$D_1 = X_1 - Y_1$, $D_2 = X_2 - Y_2$, ..., $D_n = X_n - Y_n$ are independent. If they can
also be assumed to be normally distributed with common mean and variance, so that the
$D_i$ are independent $N(\mu_D, \sigma_D^2)$, then inferences about $\mu_D$ can be based
on the test statistic
$$T_{\mathrm{paired}} = \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}},$$
where $S_D$ is the sample standard deviation of the differences. This statistic has the
$t_{n-1}$ distribution.
Example for the Paired-Samples T Test using SPSS
As mentioned above, the paired-samples t test is used to compare the means of two variables
for a single group. To illustrate this procedure, consider the data shown in Table 2 below.
Subjects were given a passage to read and asked to reproduce it at a later date. Subjects
were then sent to a mnemonic training session and, after the training, were given the same
passage and asked to reproduce it at a later date. The table shows the number of words
recalled by subjects before and after the mnemonic training session.
Table 2: Number of words recalled
Before mnemonic training    After mnemonic training
204                         223
393                         412
391                         402
265                         285
326                         353
220                         243
423                         443
342                         340
480                         582
464                         490
Below is the SPSS T test output listing:
The output listing starts with a table of statistics for the two variables (see below).
The next table from the output listing gives the correlation between the two variables,
which is 0.975.
The last table from the output listing contains the t-value (3.013) and the 2-tail p-value
(0.015). The 95% confidence interval for the mean difference, 6.60 to 46.40, is also shown
in the table. Since the p-value of 0.015 is less than 0.05, the difference between the
means is significant. In other words, the results suggest that the mnemonic training
session improves the subjects' memory recall.
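The paired calculation can be checked by hand. This is a minimal Python sketch, assuming the Table 2 values are read as (before, after) pairs; `paired_t` is an illustrative name, and the critical value 2.262 is taken from standard t tables (97.5th percentile, 9 degrees of freedom) rather than computed.

```python
import math
import statistics

# Table 2 values, read as (before, after) pairs for the ten subjects.
before = [204, 393, 391, 265, 326, 220, 423, 342, 480, 464]
after = [223, 412, 402, 285, 353, 243, 443, 340, 582, 490]

T_CRIT_9DF = 2.262  # assumed table value: 97.5th percentile of t with 9 df


def paired_t(x, y):
    """Return (mean difference, standard error, t, df) for differences y - x."""
    d = [b - a for a, b in zip(x, y)]
    n = len(d)
    d_bar = statistics.mean(d)
    se = statistics.stdev(d) / math.sqrt(n)  # S_D / sqrt(n)
    return d_bar, se, d_bar / se, n - 1


d_bar, se, t_stat, df = paired_t(before, after)
ci = (d_bar - T_CRIT_9DF * se, d_bar + T_CRIT_9DF * se)
print(f"mean difference = {d_bar:.2f}, t = {t_stat:.3f}, df = {df}")
print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f}")
```

With the data read this way, the sketch gives the t-value of 3.013 and the interval of 6.60 to 46.40 quoted from the SPSS listing.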
Assumptions underlying the use of t test
Before looking at the details of how to perform and interpret a t test, it is a good idea
to understand the assumptions underlying its use.
The two crucial assumptions that need to be checked before applying t tests are:
1. The outcome variable comes from a population with a normal distribution.
2. The variance of the outcome variable is the same in the groups.
To check the assumptions, one needs to go through a diagnostic procedure comprising a
collection of statistical procedures. The diagnostic tools include the box plot, the
histogram and the F test.
A box plot is very useful for seeing whether the groups are skewed. It also provides
information about outliers.
A histogram is the simplest way of checking whether the data come from a normal
distribution. However, a histogram is unlikely to look exactly normal, especially if the
sample size is small. The figure below shows a histogram of data with a normal
distribution.
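To test for equal variances, the folded F test is often used. The test procedure is
outlined below.
Consider testing
$$H_0: \sigma_1^2 = \sigma_2^2 \quad \text{against} \quad H_a: \sigma_1^2 \neq \sigma_2^2.$$
Let
$$s_{\max}^2 = \max(s_1^2, s_2^2), \qquad s_{\min}^2 = \min(s_1^2, s_2^2).$$
Define the test statistic
$$F = \frac{s_{\max}^2}{s_{\min}^2}.$$
Reject $H_0$ if $F > c$ for some critical value c.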
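As a hedged illustration of the folded F procedure, the sketch below applies it to the Table 1 blood pressure data (read as alternating placebo and new drug values, which is an assumption). The function name `folded_f` is made up, and the critical value 4.03 is the upper 2.5% point of the F(9, 9) distribution as given in standard tables, not something computed here.

```python
import statistics

# Table 1 values, read as alternating (placebo, new drug) pairs (an assumption).
placebo = [71, 79, 69, 98, 91, 85, 89, 75, 78, 80]
new_drug = [90, 95, 67, 120, 89, 92, 100, 82, 79, 85]


def folded_f(x, y):
    """Return (F, df_num, df_den) with the larger sample variance on top."""
    s2x, s2y = statistics.variance(x), statistics.variance(y)
    if s2x >= s2y:
        return s2x / s2y, len(x) - 1, len(y) - 1
    return s2y / s2x, len(y) - 1, len(x) - 1


C_CRIT = 4.03  # assumed table value: upper 2.5% point of F(9, 9)

f_stat, df_num, df_den = folded_f(placebo, new_drug)
reject_h0 = f_stat > C_CRIT
print(f"F = {f_stat:.3f} on ({df_num}, {df_den}) df, reject H0: {reject_h0}")
```

Because the larger variance is always placed in the numerator, a two-sided test at the 5% level compares F with the upper 2.5% point, which is why that table value is used in the sketch.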
In most statistics packages it is possible to check whether the variance of the outcome
variable is the same in each group. Some statistics packages will automatically test for
equality of variances within the t-test procedure and will give two versions of the t-test,
equal variances assumed and equal variances not assumed. If the test for equality of
variances is not statistically significant, then the variances can be assumed to be equal,
and so the equal variances version of the t-test may be used. Otherwise, the unequal
variances version of the t-test will be required.
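For the unequal variances version, packages commonly report Welch's statistic with the Welch-Satterthwaite approximation to the degrees of freedom. The sketch below assumes that form, since the text does not say which formula a given package uses, and it reuses the Table 1 data read as alternating placebo and new drug values. `welch_t` is an illustrative name.

```python
import math
import statistics

# Table 1 values, read as alternating (placebo, new drug) pairs (an assumption).
placebo = [71, 79, 69, 98, 91, 85, 89, 75, 78, 80]
new_drug = [90, 95, 67, 120, 89, 92, 100, 82, 79, 85]


def welch_t(x, y):
    """Return (t, df) for the two-sample t test without assuming equal variances."""
    m, n = len(x), len(y)
    a = statistics.variance(x) / m  # estimated variance of the mean of x
    b = statistics.variance(y) / n  # estimated variance of the mean of y
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (m - 1) + b ** 2 / (n - 1))
    return t, df


t_stat, df = welch_t(placebo, new_drug)
print(f"t = {t_stat:.3f}, approximate df = {df:.2f}")
```

With equal group sizes the t-value is the same as in the pooled test; only the degrees of freedom change, dropping from 18 to roughly 15.5 for these data.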
Although it is assumed that the data have been drawn from populations with a normal
distribution and equal variance, with moderate violations of these assumptions you can
still proceed to use the t test provided that:
1. The samples are not too small.
2. The samples do not contain outliers.
3. The samples are of equal or nearly equal size.
However, if the samples seriously violate the assumptions, then nonparametric tests
should be used. Nonparametric tests do not carry specific assumptions about population
distributions and variance.
Importance of the assumptions
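An incorrect assumption in a t test can lead to a false conclusion.
To see the relationship between the paired and pooled t tests, consider the sample means
and variances of the x and y values, and note that the sample covariance is
$$S_{XY} = r S_X S_Y,$$
where r is the sample correlation between the X and Y values, so that
$$S_D^2 = S_X^2 + S_Y^2 - 2 r S_X S_Y.$$
If m = n then $S_P^2 = (S_X^2 + S_Y^2)/2$ and, for testing that there is no difference in
the means,
$$T_{\mathrm{pooled}} = \frac{\bar{X} - \bar{Y}}{\sqrt{2 S_P^2/n}}, \qquad
T_{\mathrm{paired}} = \frac{\bar{X} - \bar{Y}}{\sqrt{S_D^2/n}}.$$
Substituting gives
$$T_{\mathrm{paired}} = T_{\mathrm{pooled}} \sqrt{\frac{S_X^2 + S_Y^2}{S_X^2 + S_Y^2 - 2 r S_X S_Y}}.$$
If the pooled test is valid, then so is the paired test. For if every $X_i$ is paired with
a Y, namely $Y_i$, then for all i, $X_i - Y_i$ is normal, the pairs are independent,
$E[X_i - Y_i] = \mu_X - \mu_Y = \mu_D$, say, and $\mathrm{var}(X_i - Y_i) = \sigma_D^2$,
say. The assumptions noted for the paired test are all satisfied.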
To show this relationship, an interesting example given by Rayner can help in
understanding why each test is preferred over the other in the appropriate circumstances.
The example is about the depth of muck (the rich, highly organic type of soil in the
Florida Everglades). In May 1972 several plots in the Everglades were staked out and
marked, and the depth of muck at each location was measured. This was repeated in October
1978. A portion of the data (measured in inches) is given below:
Plot    1972    1978
1       34.5    31.5
2       44.0    37.9
3       37.5    35.5
4       27.0    23.0
5       37.0    34.5
6       40.0    31.1
7       47.2    46.0
8       35.2    31.0
9       44.0    35.2
10      40.5    37.2
11      17.0    24.7
12      29.5    25.8
13      31.5    29.0
14      35.0    36.8
15      44.0    36.5
These data were first analyzed using a paired t-test to see whether there is sufficient
evidence to indicate a significant loss in the average muck depth between 1972 and 1978.
The same data were then analyzed using a pooled t-test, again giving a p-value, and the
result was compared with that of the paired t-test.
It was found that the p-value is less than 0.5% for the paired test, and between 2.5% and
5% for the pooled test. Based on these p-values, when testing at the 1% level the paired
test is significant while the pooled test is not. The reason for the inconsistency is that
for the pooled test the data are assumed to be independent random samples, so that every
observation is independent of every other observation, whereas in the paired test the
paired data may be dependent, frequently being observations on the same individual. The
key point is whether the data are dependent or not, and so whether the pairing is
appropriate.
If the 1972 values are regarded as the x values and the 1978 values as the y values, we
find $\bar{X} = 36.927$, $\bar{Y} = 33.047$, $S_X^2 = 40.889$ and $S_Y^2 = 35.517$. This
gives $S_P^2 = 38.203$ and $T_{\mathrm{pooled}} = 1.719$. Further calculation gives the
sample covariance $S_{XY} = 33.989$ and the sample correlation r = 0.892.
Substituting in the relationship above gives $T_{\mathrm{paired}} = 5.176$. This is
confirmed by direct calculation, for $\bar{D} = 3.880$ and $S_D = 2.903$.
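These figures can be checked from the quoted summary statistics alone. The sketch below is illustrative: it uses only the rounded summary values given in the text, and it assumes the usual identities $S_D^2 = S_X^2 + S_Y^2 - 2S_{XY}$ and, for equal sample sizes, $S_P^2 = (S_X^2 + S_Y^2)/2$. The variable names are made up.

```python
import math

# Summary statistics quoted in the text for the n = 15 plots.
n = 15
x_bar, y_bar = 36.927, 33.047   # 1972 and 1978 sample means
s2x, s2y = 40.889, 35.517       # sample variances
s_xy = 33.989                   # sample covariance

# Pooled test: equal sample sizes, so the pooled variance is a simple average.
s2p = (s2x + s2y) / 2
t_pooled = (x_bar - y_bar) / math.sqrt(2 * s2p / n)

# Paired test: variance of the differences from the variances and covariance.
s2d = s2x + s2y - 2 * s_xy
t_paired = (x_bar - y_bar) / math.sqrt(s2d / n)

# Sample correlation, and the factor linking the two statistics.
r = s_xy / math.sqrt(s2x * s2y)
ratio = math.sqrt((s2x + s2y) / s2d)

print(f"S_P^2 = {s2p:.3f}, t_pooled = {t_pooled:.3f}")
print(f"S_D = {math.sqrt(s2d):.3f}, t_paired = {t_paired:.3f}, r = {r:.3f}")
print(f"t_pooled * ratio = {t_pooled * ratio:.3f}")
```

The strong positive correlation makes the variance of the differences much smaller than the sum of the two variances, which is why the paired statistic is about three times the pooled one here.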
If the pooled test really is appropriate, then the sample correlation should be close to
zero, and numerically T_pooled and T_paired will be approximately equal.
The difference is that T_paired will be referred to n - 1 degrees of freedom, while
T_pooled will be referred to twice this number. The t tables show that a particular t
value may be significant at, say, the 5% level for 2(n - 1) degrees of freedom, but not
significant with (n - 1) degrees of freedom. This means that, when it is appropriate, the
pooled test will be more critical of the data than the paired test in that it will have
smaller p-values. The pooled test is then more likely than the paired test to detect
departures from the null hypothesis. In other words, it has more power.
If the data consist of some observations that are correlated and others that are
uncorrelated, there are three common methods for analyzing 'combined' data of this type.
The first is to perform the t-test for two independent samples, which assumes no
correlation among the observations under treatments 1 and 2, using all of the data. This
approach is often called the uncorrected t-test.
Another approach is to ignore the paired observations and perform the t-test for two
independent samples after deleting the correlated data. This approach is called the
unpaired t-test.
The third method is to ignore the treatment 1 and 2 observations that are independent and
perform the paired t-test after deleting the uncorrelated data. This approach is called
the paired t-test.
Unlike these three methods, Looney et al. proposed a method based on asymptotic results.
This method analyzes the 'combined' samples of correlated and uncorrelated data in a way
that makes use of all the data and takes into account the correlation between the paired
observations. The authors call this approach the corrected z-test.
Suppose there is a random sample of n1 subjects exposed to treatment 1 that is
independent of a random sample of n2 subjects exposed to treatment 2. Let
$X_1, \ldots, X_{n_1}$ and $Y_1, \ldots, Y_{n_2}$ denote the observed values for the
independent subjects exposed to treatment 1 and treatment 2, respectively.
Suppose also that there are n > 3 subjects for which there are paired observations under
treatments 1 and 2. Let $U_1, \ldots, U_n$ and $V_1, \ldots, V_n$ denote the observed
values for treatments 1 and 2, respectively, for the paired subjects. Thus, the
x-observations are independent of the y-, u- and v-observations; the y-observations are
independent of the x-, u- and v-observations; and the u- and v-observations are assumed
to be correlated.
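Let M1 denote the sample mean for all treatment 1 subjects; that is, the mean of all x- and
u-values combined, and let M2 denote the sample mean for all treatment 2 subjects; that
is, the mean of all y- and v-values combined. Let $S_1^2$ denote the sample variance for
all treatment 1 subjects, and let $S_2^2$ denote the sample variance for all treatment 2
subjects.
Thus
$$\mathrm{Var}(M_1 - M_2) = \frac{\sigma_1^2}{n_1 + n} + \frac{\sigma_2^2}{n_2 + n}
- \frac{2 n \sigma_{UV}}{(n_1 + n)(n_2 + n)},$$
where $\sigma_1^2$ denotes the population variance of the treatment 1 observations,
$\sigma_2^2$ denotes the population variance of the treatment 2 observations, and
$\sigma_{UV}$ denotes the population covariance of the paired observations, that is, the
u- and v-values.
Define
$$Z_{\mathrm{corr}} = \frac{M_1 - M_2}{\sqrt{\dfrac{S_1^2}{n_1 + n} + \dfrac{S_2^2}{n_2 + n}
- \dfrac{2 n S_{UV}}{(n_1 + n)(n_2 + n)}}},$$
where
$$S_{UV} = \frac{1}{n - 1} \sum_{i=1}^{n} (U_i - \bar{U})(V_i - \bar{V})$$
is the sample covariance of the paired observations.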
Under the null hypothesis, $Z_{\mathrm{corr}}$ has a limiting N(0, 1) distribution by the
central limit theorem and the consistency of the estimators $S_1^2$, $S_2^2$ and $S_{UV}$.
Therefore, the standard normal distribution is used to calculate an approximate p-value
for the observed value of $Z_{\mathrm{corr}}$.
Note that if n = 0, that is, there are no paired observations, then $Z_{\mathrm{corr}}$
reduces to the two-sample z-test. If n1 = n2 = 0, that is, all the observations are
paired, then $Z_{\mathrm{corr}}$ reduces to the paired z-test.
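The corrected z-test can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' code: it assumes the statistic is the difference in combined means divided by the square root of S1^2/(n1 + n) + S2^2/(n2 + n) - 2 n S_UV/((n1 + n)(n2 + n)), the function name `z_corr` is made up, and the data values are invented purely to exercise the function.

```python
import math
from statistics import NormalDist, mean, variance


def z_corr(x, y, u, v):
    """Corrected z statistic and two-sided p-value.

    x, y: independent samples under treatments 1 and 2.
    u, v: paired observations under treatments 1 and 2 (same length).
    """
    n1, n2, n = len(x), len(y), len(u)
    if len(v) != n:
        raise ValueError("u and v must have the same length")
    t1, t2 = list(x) + list(u), list(y) + list(v)  # all treatment 1 / 2 values
    m1, m2 = mean(t1), mean(t2)
    s1_sq, s2_sq = variance(t1), variance(t2)
    # Sample covariance of the pairs; taken as 0 when it cannot be estimated
    # (fewer than two pairs), which is an assumption of this sketch.
    if n >= 2:
        u_bar, v_bar = mean(u), mean(v)
        s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v)) / (n - 1)
    else:
        s_uv = 0.0
    var_diff = (s1_sq / (n1 + n) + s2_sq / (n2 + n)
                - 2 * n * s_uv / ((n1 + n) * (n2 + n)))
    z = (m1 - m2) / math.sqrt(var_diff)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Hypothetical data, invented for illustration only.
x = [5.1, 4.8, 6.0, 5.5]
y = [4.2, 4.9, 3.8, 4.4, 4.0]
u = [5.6, 4.9, 6.2, 5.0, 5.8]
v = [4.9, 4.1, 5.5, 4.6, 5.0]

z, p = z_corr(x, y, u, v)
print(f"Z_corr = {z:.3f}, approximate two-sided p = {p:.4f}")
```

Calling `z_corr([], [], u, v)` gives the paired z statistic and `z_corr(x, y, [], [])` gives the two-sample z statistic, matching the two limiting cases noted above.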