How to Perform T Tests

Introduction

To compare a continuous outcome variable between two groups, the t test is often the most appropriate statistical test. There are two types of t test: the two-sample t test and the paired-sample t test. The appropriate test must be chosen according to whether the sample data are independent or paired. For the two-sample (pooled) test the data are independent random samples, so that every observation is independent of every other observation, whereas in the paired test the paired data may be dependent, frequently being observations on the same individual. The two tests are described in detail as follows.

Independent-samples t test (two-sample t test): This is used to compare the means of one variable for two groups of cases. A practical application would be to find out the effect of a new drug on blood pressure. Patients with high blood pressure would be randomly assigned to two groups, a placebo group and a treatment group. The placebo group would receive conventional treatment while the treatment group would receive a new drug that is expected to lower blood pressure. After treatment for a couple of months, the two-sample t test is used to compare the average blood pressure of the two groups. Note that each patient is measured once and belongs to one group only.

Test: Suppose we have two independent random samples, $X_1, X_2, \ldots, X_m$ from a $N(\mu_X, \sigma_X^2)$ distribution and $Y_1, Y_2, \ldots, Y_n$ from a $N(\mu_Y, \sigma_Y^2)$ distribution. We wish to make inferences about the difference $\mu_X - \mu_Y$ in the population means. Write $\bar{X}$ and $\bar{Y}$ for the sample means and $S_X^2$ and $S_Y^2$ for the (unbiased) sample variances. If we can assume the unknown variances are equal, the common variance can be estimated using, say,

$$S_P^2 = \frac{(m-1)S_X^2 + (n-1)S_Y^2}{m+n-2}.$$

The resulting test statistic,

$$T_{\text{pooled}} = \frac{\bar{X} - \bar{Y}}{S_P\sqrt{\frac{1}{m} + \frac{1}{n}}},$$

has the $t_{m+n-2}$ distribution under the null hypothesis $\mu_X = \mu_Y$. This is the test statistic for the pooled t test.

Example for the Independent-Samples T Test using SPSS

To illustrate this procedure, consider the data shown in Table 1 below.
Twenty patients suffering from high blood pressure were randomly selected and assigned to two separate groups. One group, called the placebo group, was given conventional treatment and the other group, called the new drug group, was given a new drug. The aim was to investigate whether the new drug reduces blood pressure.

Table 1: Blood pressure data

Placebo group    New drug group
71               90
79               95
69               67
98               120
91               89
85               92
89               100
75               82
78               79
80               85

The SPSS T test output listing starts with a table of statistics for the two groups, followed by another table showing the mean difference between the two groups and some other statistics. One of the assumptions underlying the use of the t test is equality of variance, so the Levene test for homogeneity (equality) of variance is included in the table. Provided the F value is not significant (p > 0.05), the variances can be assumed to be homogeneous and the Equal Variance line of the t test results should be used. If p < 0.05, the equality of variance assumption has been violated and the t test based on the separate variance estimates (Unequal Variances) should be used. In this case, the Levene test is not significant, so the t value calculated with the pooled variance estimate (Equal Variance) is appropriate. With a 2-Tail Sig (i.e. p-value) of 0.130 (i.e. 13%), the difference between the means is not significant.

Paired-samples t test (dependent t test): This is used to compare the means of two variables for a single group. The procedure computes the differences between the values of the two variables for each case and tests whether the average difference differs from zero. For example, you may be interested in evaluating the effectiveness of a mnemonic method on memory recall. Subjects are given a passage from a book to read; a few days later, they are asked to reproduce the passage and the number of words is noted. Subjects are then sent to a mnemonic training session.
They are then asked to read and reproduce the passage again, and the number of words is noted. Thus each subject has two measures, often called before and after measures.

An alternative design for which this test is used is a matched-pairs or case-control study. For example, in a blood pressure study, treatment patients and controls might be matched by age, that is, a 64-year-old patient with a 64-year-old control group member. Each record in the data file will contain the response for the patient and also for his or her matched control subject.

Test: Suppose now we have observations $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ occurring as independent pairs, as often arises in before-after situations, for example when asking whether a diet or medical treatment is effective. The X's and Y's may not be independent; they are frequently observations on the same subject. However, the differences $D_1 = X_1 - Y_1, D_2 = X_2 - Y_2, \ldots, D_n = X_n - Y_n$ are independent. If they can also be assumed to be normally distributed with common mean $\mu_D$ and variance $\sigma_D^2$, so that the $D_i$ are independent $N(\mu_D, \sigma_D^2)$, then inferences about $\mu_D$ can be based on the test statistic

$$T_{\text{paired}} = \frac{\bar{D}}{S_D/\sqrt{n}},$$

which has the $t_{n-1}$ distribution under the null hypothesis $\mu_D = 0$, where $\bar{D}$ is the sample mean and $S_D$ is the sample standard deviation of the differences.

Example for the Paired-Samples T Test using SPSS

As mentioned above, the paired-samples t test is used to compare the means of two variables for a single group. To illustrate this procedure, consider the data shown in Table 2 below. Subjects were given a passage to read and asked to reproduce it at a later date. Subjects were then sent to a mnemonic training session and, after the training, were given the same passage and asked to reproduce it at a later date. The table shows the number of words recalled by subjects before and after the mnemonic training session.
Table 2: Number of words recalled

Before mnemonic training    After mnemonic training
204                         223
393                         412
391                         402
265                         285
326                         353
220                         243
423                         443
342                         340
480                         582
464                         490

The SPSS T test output listing starts with a table of statistics for the two variables. The next table gives the correlation between the two variables, which is 0.975. The last table contains the t-value (3.013) and the 2-tail p-value (0.015). The 95% confidence interval for the mean difference, 6.60 to 46.40, is also shown in that table. Since the p-value of 0.015 is less than 0.05, the difference between the means is significant. In other words, sending subjects to a mnemonic training session improves their memory recall.

Assumptions underlying the use of t test

Before performing and interpreting a t test, it is a good idea to understand the assumptions underlying its use. The two crucial assumptions that need to be checked before applying t tests are:
1. The outcome variable comes from a population with a normal distribution.
2. The variance of the outcome variable is the same in the groups.

To check the assumptions, one needs to go through a diagnostic procedure made up of a collection of statistical tools. The diagnostic tools include the box plot, the histogram and the F test. A box plot is very useful for finding out whether the groups are skewed, and it also provides information about outliers. A histogram is the simplest way of checking whether the data come from a normal distribution. However, a histogram is unlikely to look exactly normal, especially if the sample size is small. The figure below shows a histogram of data with a normal distribution.

To test for equal variances, the folded F test is often used. The test procedure is outlined below. Consider testing $H_0: \sigma_X^2 = \sigma_Y^2$ against $H_1: \sigma_X^2 \neq \sigma_Y^2$. Let $S_X^2$ and $S_Y^2$ be the sample variances of the two groups. Define the test statistic

$$F = \frac{\max(S_X^2, S_Y^2)}{\min(S_X^2, S_Y^2)}.$$

Reject $H_0$ if $F > c$ for some critical value $c$.
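As a minimal sketch of the folded F procedure outlined above, the following Python code applies it to the blood pressure data of Table 1. It assumes the table is read row by row with the placebo value first in each pair. The function name `folded_f` and the constant `CRITICAL_F_9_9` are illustrative, and the critical value 4.03 is assumed to be the tabulated upper 2.5% point of the F distribution with (9, 9) degrees of freedom, which gives a two-sided test at the 5% level.

```python
from statistics import variance

# Blood pressure readings from Table 1 (assumed column order: placebo, new drug)
placebo = [71, 79, 69, 98, 91, 85, 89, 75, 78, 80]
new_drug = [90, 95, 67, 120, 89, 92, 100, 82, 79, 85]


def folded_f(sample_a, sample_b):
    """Return (F, numerator df, denominator df) for the folded F test.

    The larger unbiased sample variance always goes in the numerator,
    so F is at least 1.
    """
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


f_stat, df_num, df_den = folded_f(placebo, new_drug)

# Assumption: tabulated upper 2.5% point of F(9, 9), i.e. a two-sided 5% test.
CRITICAL_F_9_9 = 4.03

reject_equal_variances = f_stat > CRITICAL_F_9_9
print(f"F = {f_stat:.3f} on ({df_num}, {df_den}) df")
print(f"Reject equal variances at the 5% level: {reject_equal_variances}")
```

Here F is about 2.33, below the critical value, so the equal-variance assumption is not rejected. This agrees with the non-significant Levene test reported for the same data, although the Levene test and the folded F test are different procedures and need not always agree.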

In most statistics packages it is possible to check whether the variance of the outcome variable is the same in each group. Some statistics packages will automatically test for equality of variances within the t test procedure and will give two versions of the t test, equal variances assumed and equal variances not assumed. If the test for equality of variances is not statistically significant, then the variances can be assumed to be equal, and so the equal variances version of the t test may be used. Otherwise, the unequal variances version of the t test will be required.

Although it is assumed that the data have been drawn from a population with a normal distribution and equal variance, with moderate violation of these assumptions you can still proceed to use the t test provided the following conditions hold:
1. The samples are not too small.
2. The samples do not contain outliers.
3. The samples are of equal or nearly equal size.

However, if the sample seriously violates the assumptions, then nonparametric tests should be used. Nonparametric tests do not carry specific assumptions about population distributions and variance.

Importance of the assumptions

An incorrect assumption in a t test can lead to a false conclusion. To see the relationship between the paired and pooled t tests, consider the sample means and variances of the X and Y values, and note that

$$S_D^2 = S_X^2 + S_Y^2 - 2 r S_X S_Y,$$

where $r$ is the sample correlation between the X and Y values, so that

$$T_{\text{paired}} = \frac{(\bar{X} - \bar{Y})\sqrt{n}}{\sqrt{S_X^2 + S_Y^2 - 2 r S_X S_Y}}.$$

If $m = n$ then $S_P^2 = (S_X^2 + S_Y^2)/2$ and

$$T_{\text{pooled}} = \frac{(\bar{X} - \bar{Y})\sqrt{n}}{\sqrt{S_X^2 + S_Y^2}}.$$

Substituting gives

$$T_{\text{paired}} = T_{\text{pooled}} \sqrt{\frac{S_X^2 + S_Y^2}{S_X^2 + S_Y^2 - 2 r S_X S_Y}}.$$

If the pooled test is valid, then so is the paired test. For if every $X_i$ is paired with a Y, namely $Y_i$, then for all $i$, $X_i - Y_i$ is normal, the pairs are independent, $E[X_i - Y_i] = \mu_X - \mu_Y = \mu_D$, say, and $\operatorname{var}(X_i - Y_i) = \sigma_D^2$, say. The assumptions noted for the paired test are therefore all satisfied. An example given by Rayner illustrates this relationship and helps to explain why each test is preferred over the other in the appropriate circumstances.
The example is about the depth of muck (the rich, highly organic type of soil in the Florida Everglades). In May 1972 several plots in the Everglades were staked out and marked, and the depth of muck at each location was measured. This was repeated in October 1978. A portion of the data (measured in inches) is given below:

Plot    1972    1978
1       34.5    31.5
2       44.0    37.9
3       37.5    35.5
4       27.0    23.0
5       37.0    34.5
6       40.0    31.1
7       47.2    46.0
8       35.2    31.0
9       44.0    35.2
10      40.5    37.2
11      27.0    24.7
12      29.5    25.8
13      31.5    29.0
14      35.0    36.8
15      44.0    36.5

These data were first analyzed using a paired t test to see whether there is sufficient evidence of a significant loss in the average muck depth between 1972 and 1978. The same data were then analyzed using a pooled t test, and the resulting p-value was compared with that of the paired t test. The p-value was found to be less than 0.5% for the paired test, and between 2.5% and 5% for the pooled test. Based on these p-values, if testing at the 1% level, the paired test is significant while the pooled test is not.

The reason for the inconsistency is that for the pooled test the data are independent random samples, so that every observation is independent of every other observation, whereas in the paired test the paired data may be dependent, frequently being observations on the same individual. The key point is whether the data are dependent or not, and so whether the pairing is appropriate.

If the 1972 values are regarded as the x values and the 1978 values as the y values, we find $\bar{x} = 36.927$, $\bar{y} = 33.047$, $S_X^2 = 40.889$ and $S_Y^2 = 35.517$. This gives $S_P^2 = 38.203$ and $t_{\text{pooled}} = 1.719$. Further calculation gives the sample covariance as 33.989 and the sample correlation as 0.892. Substituting in the relationship above gives $t_{\text{paired}} = 5.176$. This is confirmed by direct calculation, for $\bar{D} = 3.880$ and $S_D = 2.903$.
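The arithmetic in this example can be checked with a short Python sketch that works only from the summary statistics quoted above. It assumes equal sample sizes (n = 15) and that the relationship between the two statistics takes the form t_paired = t_pooled * sqrt((S_X^2 + S_Y^2) / (S_X^2 + S_Y^2 - 2 r S_X S_Y)); the variable names are illustrative.

```python
import math

# Summary statistics quoted in the text (1972 values = x, 1978 values = y)
n = 15
mean_x, mean_y = 36.927, 33.047
var_x, var_y = 40.889, 35.517
cov_xy = 33.989

# Pooled test: with equal sample sizes S_P^2 is the average of the two variances
var_pooled = (var_x + var_y) / 2
t_pooled = (mean_x - mean_y) / math.sqrt(var_pooled * 2 / n)

# Paired test: the variance of the differences subtracts twice the covariance
var_d = var_x + var_y - 2 * cov_xy
sd_d = math.sqrt(var_d)
t_paired = (mean_x - mean_y) / (sd_d / math.sqrt(n))

# Sample correlation and the factor linking the two statistics
r = cov_xy / math.sqrt(var_x * var_y)
factor = math.sqrt(
    (var_x + var_y) / (var_x + var_y - 2 * r * math.sqrt(var_x * var_y))
)

print(f"S_P^2 = {var_pooled:.3f}, t_pooled = {t_pooled:.3f}")
print(f"r = {r:.3f}, S_D = {sd_d:.3f}, t_paired = {t_paired:.3f}")
print(f"t_pooled * factor = {t_pooled * factor:.3f}")
```

With a correlation of about 0.89 the factor is roughly 3, which is why the paired statistic is about three times the pooled one for these data.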

If the pooled test really is appropriate, then the sample correlation should be close to zero, and numerically $t_{\text{pooled}}$ and $t_{\text{paired}}$ will be approximately equal. The problem is that $T_{\text{paired}}$ will be referred to $n - 1$ degrees of freedom, while $T_{\text{pooled}}$ will be referred to twice this number. The t tables show that a particular t value may be significant at, say, the 5% level for $2(n - 1)$ degrees of freedom, but not significant with $n - 1$ degrees of freedom. This means that, when both tests are valid, the pooled test will be more critical of the data than the paired test in that it will have smaller p-values. The pooled test is then more likely than the paired test to detect departures from the null hypothesis. In other words, it has more power.

If the data consist of some observations that are correlated and others that are uncorrelated, there are three common methods of analyzing 'combined' data of this type. The first is to perform the t test for two independent samples, which assumes no correlation among the observations under treatments 1 and 2, using all of the data. This approach is often called the uncorrected t test. The second is to ignore the paired observations and perform the t test for two independent samples after deleting the correlated data. This approach is called the unpaired t test. The third is to ignore the treatment 1 and 2 observations that are independent and perform the paired t test after deleting the uncorrelated data. This approach is called the paired t test.

Unlike these three methods, Looney et al. proposed a method using asymptotic results. This method analyzes the 'combined' samples of correlated and uncorrelated data in a way that makes use of all the data and takes into account the correlation between the paired observations. The authors call this approach the corrected z-test. Suppose there is a random sample of $n_1$ subjects exposed to treatment 1 that is independent of a random sample of $n_2$ subjects exposed to treatment 2.
Let $X_1, \ldots, X_{n_1}$ and $Y_1, \ldots, Y_{n_2}$ denote the observed values for the independent subjects exposed to treatment 1 and treatment 2, respectively. Suppose also that there are n > 3 subjects for which there are paired observations under treatments 1 and 2. Let $U_1, \ldots, U_n$ and $V_1, \ldots, V_n$ denote the observed values for treatments 1 and 2, respectively, for the paired subjects. Thus, the x-observations are independent of the y-, u- and v-observations; the y-observations are independent of the x-, u- and v-observations; and the u- and v-observations are assumed to be correlated.

Let $M_1$ denote the sample mean for all treatment 1 subjects, that is, the mean of all x- and u-values combined, and let $M_2$ denote the sample mean for all treatment 2 subjects, that is, the mean of all y- and v-values combined. Let $S_1^2$ denote the sample variance for all treatment 1 subjects, and let $S_2^2$ denote the sample variance for all treatment 2 subjects. Thus

$$\operatorname{Var}(M_1 - M_2) = \frac{\sigma_1^2}{n_1 + n} + \frac{\sigma_2^2}{n_2 + n} - \frac{2 n \sigma_{12}}{(n_1 + n)(n_2 + n)},$$

where $\sigma_1^2$ denotes the population variance of the treatment 1 observations, $\sigma_2^2$ denotes the population variance of the treatment 2 observations, and $\sigma_{12}$ denotes the population covariance of the paired observations, that is, the u- and v-values. Define

$$Z_{\text{corr}} = \frac{M_1 - M_2}{\sqrt{\dfrac{S_1^2}{n_1 + n} + \dfrac{S_2^2}{n_2 + n} - \dfrac{2 n S_{UV}}{(n_1 + n)(n_2 + n)}}},$$

where $S_{UV}$ is the sample covariance of the paired observations,

$$S_{UV} = \frac{1}{n - 1}\sum_{i=1}^{n} (U_i - \bar{U})(V_i - \bar{V}).$$

Under the null hypothesis, $Z_{\text{corr}}$ has a limiting N(0, 1) distribution by the central limit theorem and the consistency of the estimators $S_1^2$, $S_2^2$ and $S_{UV}$. Therefore, the standard normal distribution is used to calculate an approximate p-value for the observed value of $Z_{\text{corr}}$. Note that if n = 0, that is, there are no paired observations, then $Z_{\text{corr}}$ reduces to the two-sample z-test. If $n_1 = n_2 = 0$, that is, all the observations are paired, then $Z_{\text{corr}}$ reduces to the paired z-test.
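The corrected z-test described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: it assumes the variance of M1 - M2 is estimated by S1^2/(n1 + n) + S2^2/(n2 + n) - 2 n S_UV/((n1 + n)(n2 + n)), with S_UV the unbiased sample covariance of the paired values, and the function name `corrected_z` and its argument names are hypothetical. The usage lines reuse the data of Tables 1 and 2 only to check the two limiting cases.

```python
import math
from statistics import NormalDist, mean, variance


def corrected_z(x, y, u, v):
    """Corrected z statistic for combined independent and paired data.

    x, y: independent observations under treatments 1 and 2
    u, v: paired observations under treatments 1 and 2 (same length)
    Returns (z, two-sided p-value from the standard normal distribution).
    """
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    n1, n2, n = len(x), len(y), len(u)

    all_1 = list(x) + list(u)  # every treatment 1 observation
    all_2 = list(y) + list(v)  # every treatment 2 observation
    m1, m2 = mean(all_1), mean(all_2)
    s1_sq, s2_sq = variance(all_1), variance(all_2)

    # Sample covariance of the paired values; set to zero when there are
    # fewer than two pairs, since it cannot be estimated in that case.
    if n > 1:
        u_bar, v_bar = mean(u), mean(v)
        s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v)) / (n - 1)
    else:
        s_uv = 0.0

    var_diff = (
        s1_sq / (n1 + n)
        + s2_sq / (n2 + n)
        - 2 * n * s_uv / ((n1 + n) * (n2 + n))
    )
    z = (m1 - m2) / math.sqrt(var_diff)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Limiting case n1 = n2 = 0: all data paired (Table 2, after vs before)
before = [204, 393, 391, 265, 326, 220, 423, 342, 480, 464]
after = [223, 412, 402, 285, 353, 243, 443, 340, 582, 490]
z_paired, p_paired = corrected_z([], [], after, before)

# Limiting case n = 0: no paired data (Table 1, new drug vs placebo)
placebo = [71, 79, 69, 98, 91, 85, 89, 75, 78, 80]
new_drug = [90, 95, 67, 120, 89, 92, 100, 82, 79, 85]
z_indep, p_indep = corrected_z(new_drug, placebo, [], [])

print(f"all paired: z = {z_paired:.3f}")
print(f"no pairs:   z = {z_indep:.3f}")
```

In the all-paired case the statistic equals the mean difference divided by its standard error, so it coincides numerically with the paired t-value of 3.013 reported for Table 2; only the reference distribution (normal rather than t) differs, which matters for samples this small.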