t-test

Illustration Section 6.1 with R for data of Example 6.1
We would like to test whether an exam is too difficult, i.e. whether the location
of the underlying distribution F of the exam results is smaller than 6 or not.
> x=c(3.7,5.2,6.9,7.2,6.4,9.3,4.3,8.4,
+ 6.5,8.1,7.3,6.1,5.8) #exam results
Estimate location:
> mean(x)
[1] 6.553846
> median(x)
[1] 6.5
Based on these numbers the exam seems OK, but a more precise investigation is needed.
Two (equivalent) ways:
1) by performing a test:
2) by computing confidence interval.
First investigate shape of distribution with respect to normality, symmetry,
outliers to see which test/conf.int could be appropriate:
> par(mfrow=c(2,2))
> hist(x,prob=T)
> qqnorm(x)
> symplot(x)
> boxplot(x)
These plots show that the underlying distribution F could
very well be symmetric and even normal, and that there are no apparent
outliers. However, the number of observations is small, so we should not draw
too strong conclusions.
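The function symplot is not part of base R and is presumably supplied with the course material. As a rough sketch of what such a symmetry plot might show, the hypothetical helper symmetry_plot below (an assumption, not necessarily identical to symplot) plots the distances of the upper order statistics to the median against those of the lower order statistics; for a symmetric sample the points lie near the line y = x.

```r
# Hypothetical stand-in for the course function symplot (assumed behaviour).
symmetry_plot <- function(x) {
  xs <- sort(x)
  m  <- median(xs)
  k  <- floor(length(xs) / 2)
  lower <- m - xs[1:k]          # distances of the k smallest values below the median
  upper <- rev(xs)[1:k] - m     # distances of the k largest values above the median
  plot(lower, upper,
       xlab = "Distance below median", ylab = "Distance above median",
       main = "Symmetry plot")
  abline(0, 1)                  # reference line for perfect symmetry
  invisible(data.frame(lower = lower, upper = upper))
}

x <- c(3.7, 5.2, 6.9, 7.2, 6.4, 9.3, 4.3, 8.4, 6.5, 8.1, 7.3, 6.1, 5.8)
pts <- symmetry_plot(x)
```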
1) Performing a test:
H0: location smaller than 6
H1: location larger
t-test: parameter of location is mean μ
Test with level α=0.05
H0: μ ≤ 6
H1: μ > 6
#Compute values for test:
> t.test(x,mu=6,alt="g")
One Sample t-test
data: x
t = 1.2569, df = 12, p-value = 0.1164
alternative hypothesis: true mean is greater than 6
95 percent confidence interval:
 5.768463      Inf
sample estimates:
mean of x
6.553846
Conclusion: ??
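To see where the numbers in the t.test output come from, the following sketch recomputes the test statistic, the one-sided p-value and the lower confidence bound by hand. The variable names (t_stat, p_val, lower) are chosen here for illustration only.

```r
x <- c(3.7, 5.2, 6.9, 7.2, 6.4, 9.3, 4.3, 8.4, 6.5, 8.1, 7.3, 6.1, 5.8)
n  <- length(x)
se <- sd(x) / sqrt(n)                              # standard error of the mean

t_stat <- (mean(x) - 6) / se                       # test statistic for H0: mu <= 6
p_val  <- pt(t_stat, df = n - 1, lower.tail = FALSE)  # right-tail probability
lower  <- mean(x) - qt(0.95, df = n - 1) * se      # one-sided 95% lower bound

c(t = t_stat, p.value = p_val, lower.bound = lower)
```

These values should agree with t, p-value and the lower end of the confidence interval reported by t.test(x,mu=6,alt="g").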
Sign test: parameter of location is median m
Test with level α=0.05
H0: m ≤ 6
H1: m > 6
#Compute value of test statistic:
> sum(x>6)
[1] 9
#Check for values equal to 6:
> sum(x==6)
[1] 0
#no observations equal to 6
#Compute values for test:
> binom.test(9,length(x),alt="g") #which p?
Exact binomial test
data: 9 and length(x)
number of successes = 9, number of trials = 13, p-value = 0.1334
alternative hypothesis: true probability of success is greater than 0.5
95 percent confidence interval: #what is this?
 0.4273807 1.0000000
sample estimates:
probability of success
0.6923077
Conclusion: ??
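As a check on the binom.test output, the p-value of the sign test can be computed directly as a binomial tail probability. The sketch below assumes the default p = 0.5 of binom.test, which corresponds to the median being equal to 6 under H0; the variable names are illustrative. Note that the confidence interval printed by binom.test concerns the probability that an observation exceeds 6, not the median itself.

```r
x <- c(3.7, 5.2, 6.9, 7.2, 6.4, 9.3, 4.3, 8.4, 6.5, 8.1, 7.3, 6.1, 5.8)

s <- sum(x > 6)      # number of observations above 6 (test statistic)
n <- sum(x != 6)     # observations equal to 6 would be dropped

# P(S >= s) for S ~ Binomial(n, 0.5)
p_val <- pbinom(s - 1, size = n, prob = 0.5, lower.tail = FALSE)
p_val
```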
Signed rank test or symmetry test of Wilcoxon: parameter of location is
point of symmetry m
Test with level α=0.05
H0: m ≤ 6
H1: m > 6
#Compute values for test:
> wilcox.test(x,mu=6,alt="g")
Wilcoxon signed rank test
data: x
V = 64, p-value = 0.1082
alternative hypothesis: true location is greater than 6
Conclusion: ??
NB. R computes the statistic V+
For confidence interval include additional parameter conf.int=TRUE
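The statistic V+ is the sum of the ranks of the absolute differences |x - 6| that belong to positive differences. A minimal sketch of this computation, together with the exact p-value and the call with conf.int=TRUE, is given below; it assumes there are no ties and no differences equal to zero, which holds for these data. Variable names are illustrative.

```r
x <- c(3.7, 5.2, 6.9, 7.2, 6.4, 9.3, 4.3, 8.4, 6.5, 8.1, 7.3, 6.1, 5.8)

d <- x - 6
d <- d[d != 0]                 # zero differences would be discarded
r <- rank(abs(d))              # ranks of the absolute differences
v_plus <- sum(r[d > 0])        # sum of ranks of the positive differences

# Exact right-tail probability P(V+ >= v_plus) under H0
p_val <- psignrank(v_plus - 1, n = length(d), lower.tail = FALSE)

# One-sided confidence interval for the point of symmetry
ci <- wilcox.test(x, mu = 6, alternative = "greater", conf.int = TRUE)$conf.int

c(V = v_plus, p.value = p_val)
ci
```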
Which test to be preferred?
t-test: p-value = 0.12
sign test: p-value = 0.13
signed rank test: p-value = 0.11
In this case, for reasonable values of α the conclusions of the tests are the same
and the p-values are not very different. Based on the plots, the t-test and signed
rank test are probably the best here. However, in view of the small number of
observations, normality is perhaps too strong an assumption. Moreover, in the
context of the data (exam grades) normality is not very realistic. Hence, for
these data the result of the signed rank test should probably be trusted most.
2) Computing confidence interval
Based on test: the interval contains those values of the location for which H0 is
not rejected.
t.test and wilcox.test ( …., conf.int=T) give the interval corresponding to the test;
for the sign test you have to check step by step which values in H0 are not rejected.
Given a confidence interval, when is the exam OK?
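The step-by-step check for the sign test could be sketched as follows: for each candidate median m0 on a grid, test H0: m ≤ m0 against H1: m > m0 at level 0.05 and keep the values that are not rejected. The grid, its spacing and the variable names are assumptions made for illustration; the grid is shifted by 0.005 so that no candidate coincides with an observation.

```r
x <- c(3.7, 5.2, 6.9, 7.2, 6.4, 9.3, 4.3, 8.4, 6.5, 8.1, 7.3, 6.1, 5.8)
alpha <- 0.05

# Candidate medians; the offset avoids ties with the observations
grid <- seq(3, 10, by = 0.01) + 0.005

# One-sided sign test p-value for each candidate m0
p_values <- sapply(grid, function(m0) {
  binom.test(sum(x > m0), length(x), p = 0.5, alternative = "greater")$p.value
})

not_rejected <- grid[p_values > alpha]   # values belonging to the interval
lower_bound  <- min(not_rejected)
lower_bound
```

On this grid the smallest value that is not rejected lies just above 5.8, so the one-sided 95% interval for the median is roughly [5.8, ∞).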
Illustration for Section 6.2: checking significance level and power
i) 1000 times a sample of size 100 was generated from N(0,1), and we counted how
many times the 3 tests rejected H0: μ = 0 when testing at significance level α=0.05.
What do you expect?
ii)
a. 1000 times a sample of size 100 was generated from N(0.1,1), and we counted how
many times the 3 tests rejected H0: μ = 0.
What do you expect?
b. 1000 times a sample of size 100 was generated from N(0.2,1), and we counted how
many times the 3 tests rejected H0: μ = 0.
What do you expect?
i) 1000 times a sample of size 100 was generated from N(0,1), and we counted how
many times the 3 tests rejected H0: μ = 0 when testing at significance level α=0.05.
t-test  sign test  wilcoxon
  49       45         40     # number of times (out of 1000) the test rejected H0
  55       55         57     # (once more)
So H0 is rejected in about 5% of the cases when H0 is true.
All tests have approximately the correct significance level of 0.05!
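A simulation of this kind could be set up roughly as below. The notes do not show the code that was used, so this is only a sketch: the helper count_rejections, the seed and the one-sided alternative "greater" are assumptions (the latter is at least consistent with the counts reported in these notes). The argument shift is the mean of the normal distribution from which the samples are drawn.

```r
# Hypothetical helper: number of rejections (out of reps) of H0 for the
# t-test, sign test and Wilcoxon signed rank test, each at level alpha.
count_rejections <- function(shift, n = 100, reps = 1000, alpha = 0.05) {
  rejected <- replicate(reps, {
    y <- rnorm(n, mean = shift, sd = 1)
    p <- c(t_test   = t.test(y, mu = 0, alternative = "greater")$p.value,
           sign     = binom.test(sum(y > 0), n, p = 0.5,
                                 alternative = "greater")$p.value,
           wilcoxon = wilcox.test(y, mu = 0, alternative = "greater")$p.value)
    p <= alpha                      # TRUE where the test rejects H0
  })
  rowSums(rejected)                 # rejections per test
}

set.seed(1)                         # arbitrary seed, for reproducibility only
level_counts <- count_rejections(shift = 0, reps = 200)
level_counts
```

With shift = 0 the counts estimate the significance level; calling the function with shift = 0.1 or shift = 0.2 and reps = 1000 would estimate the power under those alternatives.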
ii)
a. 1000 times a sample of size 100 was generated from N(0.1,1), and we counted how
many times the 3 tests rejected H0: μ = 0.
t-test  sign test  wilcoxon
 259      174        256     # number of times (out of 1000) the test rejected H0
 272      175        261     # (once more)
So the t-test and Wilcoxon reject H0 more often than the sign test when H0 is not true.
The power of the t-test and Wilcoxon is better for the normal distribution;
the t-test is best, but Wilcoxon is almost as good.
This is according to the theory (see Table 6.1 for N(0,1)):
are(t-test, sign test) = π/2 = 1.57; are(wilcoxon, sign test) = 3/2 = 1.5; are(t-test, wilcoxon) = π/3 = 1.05
b. 1000 times a sample of size 100 was generated from N(0.2,1), and we counted how
many times the 3 tests rejected H0: μ = 0.
t-test  sign test  wilcoxon
 632      467        625     # number of times (out of 1000) the test rejected H0
 639      462        627     # (once more)
In this case the data came from a distribution that was shifted further away
from H0 than in a), namely with a shift of 0.2 instead of 0.1.
We see that the tests reject more often than in a). They can indeed distinguish
the alternative from H0 better than in a).
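For the t-test the simulated rejection counts can be compared with the theoretical power, which R computes with power.t.test. The sketch below assumes a one-sided test at level 0.05 with n = 100 and standard deviation 1; the one-sided alternative is an assumption, as the notes do not state which alternative was used in the simulation.

```r
shifts <- c(0.1, 0.2)

# Theoretical power of the one-sample, one-sided t-test for each shift
power <- sapply(shifts, function(delta) {
  power.t.test(n = 100, delta = delta, sd = 1, sig.level = 0.05,
               type = "one.sample", alternative = "one.sided")$power
})

# Expected number of rejections in 1000 simulated samples
data.frame(shift = shifts, power = power, expected = round(1000 * power))
```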