Exercise #4
In-service Master's Program, Year 2, Class A NA0C0003 程方麗
Application Activity with Confidence Intervals
1. 2 groups, scores on three tests
Confidence intervals for the mean differences
a. Aptitude test (37 points possible): (-1.57, .99)
b. Grammaticality judgment test (200 points possible): (-5.62, .76)
c. Phonemic distribution task (96 points possible): (-10.8, -.001)
Q: Are the groups statistically different from each other on any of the tests?
Q: What can you say about the precision of the estimates of mean difference?
Q: Which test do you think has the largest effect size?
Zero falls within the confidence intervals for the “Aptitude test” and the “Grammaticality
judgment test” but not within the interval for the “Phonemic distribution task”. Therefore,
there are no statistically significant differences between the two groups on the “Aptitude
test” or the “Grammaticality judgment test”, but there is a statistically significant
difference on the “Phonemic distribution task”, although only barely, since the upper
limit (-.001) lies just below zero. Taken as a whole, the interval for the Phonemic
distribution task also lies farthest from zero, implying that this task is likely to have
the largest effect size. However, it is also the widest of the three intervals, so its
mean difference is estimated with the least precision.
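The zero-in-interval check used above can be sketched in a few lines of Python. This is
only an illustration: the dictionary `intervals` and the helper `excludes_zero` are names
made up here, and the limits are the ones given in the exercise.

```python
# 95% confidence intervals for the mean differences, as given in the exercise.
intervals = {
    "Aptitude test": (-1.57, 0.99),
    "Grammaticality judgment test": (-5.62, 0.76),
    "Phonemic distribution task": (-10.8, -0.001),
}


def excludes_zero(lower, upper):
    """Return True when zero lies outside the interval (hypothetical helper)."""
    return lower > 0 or upper < 0


# Map each test to whether its interval excludes zero.
significant = {name: excludes_zero(lo, hi) for name, (lo, hi) in intervals.items()}

for name, flag in significant.items():
    print(f"{name}: {'zero excluded' if flag else 'zero included'}")
```

Only an interval that excludes zero indicates a statistically significant difference at
the corresponding alpha level.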
As for the precision of the estimates, the width of each interval can be expressed as a
proportion of the points possible: about .112 for the Phonemic distribution task (a
10.8-point interval divided by 96), about .032 for the Grammaticality judgment test (a
6.38-point interval divided by 200), and about .069 for the Aptitude test (a 2.56-point
interval divided by 37). The Phonemic distribution task has the widest interval relative
to its scale, so its mean difference is the least precisely estimated of the three. The
Grammaticality judgment test, with a narrower relative width (.032) than the Aptitude test
(.069), has the most precise estimate.
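The relative widths can be recomputed directly from the interval limits. The sketch below
is a minimal illustration; `tests` and `relative_width` are hypothetical names, and
dividing by the points possible is simply the scaling used in this answer, not a standard
statistic.

```python
# (lower limit, upper limit, points possible) for each test, from the exercise.
tests = {
    "Aptitude test": (-1.57, 0.99, 37),
    "Grammaticality judgment test": (-5.62, 0.76, 200),
    "Phonemic distribution task": (-10.8, -0.001, 96),
}


def relative_width(lower, upper, points_possible):
    """Interval width as a proportion of the test's scale (hypothetical helper)."""
    return (upper - lower) / points_possible


widths = {name: relative_width(*values) for name, values in tests.items()}

# Print from the narrowest (most precise) to the widest (least precise) interval.
for name, width in sorted(widths.items(), key=lambda item: item[1]):
    print(f"{name}: {width:.3f}")
```

The ordering from narrowest to widest is Grammaticality judgment test, Aptitude test,
Phonemic distribution task.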
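Effect size is suggested by how far the interval lies from zero, not by how wide it is.
The interval for the Phonemic distribution task is centered farthest from zero (about -5.4
of 96 points), so this task is likely to have the largest effect size.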
2. Mean differences between groups: PTP-NP = 3.42, OLP-NP = 3.38, and OLP-PTP = 0.14
Q: Are the groups statistically different from each other on any of the tests?
Q: What can you say about the precision of the estimates of mean difference?
Q: Which test has the largest effect size?
As Figure 4.3 shows, the confidence interval for PTP-NP falls roughly within the range
(-0.10~0.00), the confidence interval for OLP-NP lies within the range (-0.10~0.001), and
the confidence interval for OLP-PTP lies roughly within the range (-0.037~0.038). The
interval for PTP-NP lies to the left of zero with its upper limit at zero, meaning the
result does not reject the null hypothesis and no statistical difference exists.
Likewise, the interval for OLP-NP lies almost entirely to the left of zero, extending
only a very slight distance (0.001) beyond it. Both intervals therefore sit on one side
of zero and just reach it. As for OLP-PTP, zero lies almost in the middle of the
interval. In other words, the rejection regions on the left and right are almost the same,
each with an alpha of .025 for the 95% confidence interval. Taken together, the groups are
not statistically different from each other in any of the comparisons.
In terms of precision, the width of each interval can be compared with its mean
difference: the PTP-NP interval (-0.10~0.00) has a relative width of .292 (nearly 1 point
divided by 3.42); the OLP-NP interval (-0.10~0.001) has a relative width of .295 (nearly 1
point divided by 3.38); and the OLP-PTP interval (-0.037~0.038) has a relative width
of .536 (0.075 divided by 0.14). For the OLP-PTP pair, zero lies near the middle of the
interval, implying that no statistically significant difference exists between OLP and
PTP. In addition, this pair has a larger relative width (.536) than the other two pairs
(.292 for PTP-NP and .295 for OLP-NP), implying that the OLP-PTP difference is estimated
less precisely than the other two, which are estimated fairly precisely.
In terms of effect size, the greater the distance from zero, the bigger the effect size
is likely to be. Therefore, the PTP-NP and OLP-NP pairs are expected to have bigger effect
sizes than the OLP-PTP pair.
3. Variant 1a: DeKeyser (2000) found a statistical correlation between age of arrival and
scores on the grammaticality judgment test (r = -.62, n = 57, p < .001).
Variant 1b: Flege, Yeni-Komshian, and Liu (1999) found a statistical correlation between
age of arrival and pronunciation scores (r = -.89, n = 264, p < .001).
Variant 2a: DeKeyser (2000) found a correlation between age of arrival and scores on the
grammaticality judgment test (95% CI: -.76, -.42).
Variant 2b: Flege, Yeni-Komshian, and Liu (1999) found a statistical correlation between
age of arrival and pronunciation scores (95% CI: -.92, -.87).
Q: What do the confidence intervals tell you that P-values can’t?
The correlation in Variant 1b (r = -.89) is larger in magnitude than the correlation in
Variant 1a (r = -.62), implying that Variant 1b has a bigger effect size than Variant 1a.
However, the p-values provide no further information beyond the fact that both
correlations are statistically significant. We do not know how precisely each correlation
is estimated, or whether the data are sufficient to support practical application. For
that, the 95% CI needs to be considered.
Since Variant 2b has a narrower 95% CI (a range of 0.05) than Variant 2a (a range of
0.34), Variant 2b gives the more precise estimate, and it can be inferred that the
relationship in Variant 2b is more likely to make a practical difference in reality.
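How sample size feeds into these widths can be illustrated with Fisher's z transformation,
a standard approximation for a confidence interval around a correlation. The sketch below
assumes that method; the exercise does not say how the reported intervals were computed,
and `fisher_ci` is a name made up for illustration.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson correlation via Fisher's z (assumed method)."""
    z = math.atanh(r)              # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    lower = z - z_crit * se
    upper = z + z_crit * se
    return math.tanh(lower), math.tanh(upper)  # transform back to the r scale


# r and n as reported in Variants 1a and 1b.
dekeyser = fisher_ci(-0.62, 57)
flege = fisher_ci(-0.89, 264)

print("DeKeyser (2000):", tuple(round(x, 2) for x in dekeyser))
print("Flege et al. (1999):", tuple(round(x, 2) for x in flege))
```

Under this approximation the limits land within about .01 of the reported intervals, and
the larger sample (n = 264) yields the much narrower interval.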
A p-value reflects only whether a result is statistically significant, while a
confidence interval reflects not only whether the result is significant but also how
precise the estimate is and how large the effect is. By considering confidence intervals,
one can better judge whether the effect of a specific treatment on the sampled
participants is likely to generalize to populations outside the study. Confidence
intervals can therefore help us judge the practicability of a specific treatment. A
statistically significant p-value from a study with a limited sample is not, by itself,
sufficient grounds for believing that the treatment can be generalized or put into
practice if the results lack a precise estimate.