Chapter 5: Solutions to Supplementary Exercises

5.S.1

(a)

Source of variation   Sums of squares   Degrees of Freedom   Mean Squares   F
Between Species                   587                    2          293.5   7.9
Error                            1003                   27          37.15

Table 5.S.1.2: Full ANOVA table for the Impatiens species data
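As a quick arithmetic check, the mean squares and the F ratio can be recomputed from the sums of squares and degrees of freedom in the table. This is only a sketch; the variable names are illustrative and do not come from the course manual.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_between, df_between = 587.0, 2
ss_error, df_error = 1003.0, 27

# A mean square is a sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_error = ss_error / df_error

# The F statistic is the ratio of the two mean squares.
f_stat = ms_between / ms_error

print(f"MS between = {ms_between:.1f}")
print(f"MS error   = {ms_error:.2f}")
print(f"F          = {f_stat:.1f}")
```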

(b)

The null hypothesis is that all the long-run species means are the same. The alternative is that not all of them are the same. Symbolically:

$H_0: \mu_1 = \mu_2 = \mu_3$

$H_1$: Not all $\mu_i$ are the same, where $i$ labels the species.

If we choose a significance level of $\alpha = 0.05$, the critical F value (0.05, 2, 27) is 3.35; since the test statistic F=7.9 is much larger than this, we reject the null hypothesis and conclude that not all the long-run species means are the same.
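The critical value can be checked without tables, because an F distribution with 2 numerator degrees of freedom has a closed-form upper tail: P(F > x) = (1 + 2x/d)^(-d/2), where d is the error degrees of freedom. The sketch below relies on that special case only (it does not apply to other numerator degrees of freedom), and the function names are illustrative.

```python
def f_upper_tail(x, df_error):
    """P(F > x) for an F distribution with 2 and df_error degrees of freedom."""
    return (1.0 + 2.0 * x / df_error) ** (-df_error / 2.0)


def f_critical(alpha, df_error):
    """Value exceeded with probability alpha, found by inverting f_upper_tail."""
    return (df_error / 2.0) * (alpha ** (-2.0 / df_error) - 1.0)


alpha = 0.05
df_error = 27
f_stat = 7.9  # test statistic from the ANOVA table

crit = f_critical(alpha, df_error)
p_value = f_upper_tail(f_stat, df_error)

print(f"critical F = {crit:.2f}")
print(f"p-value    = {p_value:.4f}")
print("reject H0" if f_stat > crit else "do not reject H0")
```

The computed p-value is far below 0.05, which agrees with the decision reached by comparing F=7.9 with the critical value.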

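$$\mathrm{LSD} = t_{0.025,\,27}\sqrt{\frac{2\,\mathrm{MSE}}{n}}$$

where MSE is the error mean square and $n$ is the number of observations per species.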

Note that the critical t-value for 30 degrees of freedom is 2.04, so using a table that does not list 27 degrees of freedom would not give a very different answer. The studentised range value for comparing 3 means, using 30 for the error degrees of freedom, is 3.49.

Course Manual: PG Diploma in Statistics, TCD, Base Module

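$$\mathrm{HSD} = \left[\frac{q}{\sqrt{2}}\right]\sqrt{\frac{2\,\mathrm{MSE}}{n}}$$

where $q$ is the studentised range value.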

        Parviflora   Glandulifera   Balfourii
Mean    12           18.2           22.8
LSD                  ------------------------
HSD     -------------------------
                     ------------------------

The LSD analysis shows Parviflora to give a lower response than the other two species, which cannot be separated statistically from each other (as indicated by the underlining).

The HSD analysis suggests that Parviflora gives a lower response than Balfourii, but it cannot be separated statistically from Glandulifera and neither can Glandulifera be separated from Balfourii.
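The two sets of conclusions can be reproduced numerically. The sketch below uses the standard textbook forms LSD = t·sqrt(2·MSE/n) and HSD = q·sqrt(MSE/n), with the critical values quoted above for 30 error degrees of freedom. The group size n = 10 is an assumption, inferred from 27 error degrees of freedom with three equal-sized groups; it is not stated in this section.

```python
import math
from itertools import combinations

ms_error = 37.15  # error mean square from the ANOVA table
n = 10            # ASSUMPTION: 10 observations per species (27 error df, 3 equal groups)
t_crit = 2.04     # t critical value quoted in the text (30 df)
q_crit = 3.49     # studentised range value quoted in the text (3 means, 30 df)

means = {"Parviflora": 12.0, "Glandulifera": 18.2, "Balfourii": 22.8}

lsd = t_crit * math.sqrt(2 * ms_error / n)
hsd = q_crit * math.sqrt(ms_error / n)


def separated(threshold):
    """Pairs of species whose sample means differ by more than the threshold."""
    return {
        (a, b)
        for a, b in combinations(means, 2)
        if abs(means[a] - means[b]) > threshold
    }


print(f"LSD = {lsd:.2f}, separated pairs: {sorted(separated(lsd))}")
print(f"HSD = {hsd:.2f}, separated pairs: {sorted(separated(hsd))}")
```

Under these assumptions the LSD separates Parviflora from both other species, while the wider HSD separates only Parviflora from Balfourii.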

(c) Type I error means erroneously rejecting the null hypothesis. LSD is essentially the same as carrying out multiple t-tests, each with significance level $\alpha = 0.05$ (= probability of Type I error). This significance level applies to each test on its own, but the overall probability of Type I error for the family of comparisons (multiple tests) will be considerably higher. HSD protects against Type I error by specifying a significance level which applies to the family of comparisons.
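To give a rough sense of how much higher the family-wise rate can be, the sketch below treats the three pairwise tests as if they were independent. That is a simplifying assumption (the comparisons share data, so they are not independent), so the figure is illustrative only.

```python
alpha = 0.05
n_comparisons = 3  # three pairwise comparisons among three species

# ASSUMPTION: tests treated as independent, purely for illustration.
# Probability of at least one Type I error across the family of tests.
familywise = 1 - (1 - alpha) ** n_comparisons

# Bonferroni upper bound, which holds without the independence assumption.
bonferroni_bound = n_comparisons * alpha

print(f"per-test alpha        = {alpha:.3f}")
print(f"family-wise (indep.)  = {familywise:.3f}")
print(f"Bonferroni upper bound = {bonferroni_bound:.3f}")
```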

Type II error means accepting the null hypothesis (or, if you prefer, failing to reject it) when it is false and should be rejected. The widening of the intervals in going from LSD to HSD means that observed differences between sample means are less likely to be declared statistically significant, and therefore, there is a greater probability of Type II error when using HSD.

(d) The statistical model is discussed on pages 6, 7, while the validation of this model is illustrated separately for the different examples of Chapter 5 (e.g., page 12 for the laboratory comparison example).

5.S.2

Two-sample T for Speed-B vs Speed-A

          N   Mean  StDev  SE Mean
Speed-B  11  78.37   2.05     0.62
Speed-A  11  75.51   1.63     0.49

Difference = mu (Speed-B) - mu (Speed-A)

Estimate for difference: 2.86364

95% CI for difference: (1.21237, 4.51490)

T-Test of difference = 0 (vs not =): T-Value = 3.62  P-Value = 0.002  DF = 20

Both use Pooled StDev = 1.8565

Table 5.S.2.3: A Minitab analysis of the pellet process development data

One-way ANOVA: Yield versus Speed

Source  DF      SS     MS      F      P
Speed    1   45.10  45.10  13.09  0.002
Error   20   68.93   3.45
Total   21  114.03

Table 5.S.2.4: ANOVA analysis of the spheroniser data

Table 5.S.2.3 reproduces the two-sample analysis first shown on page 42 of Chapter 2; Table 5.S.2.4 shows the corresponding one-way ANOVA analysis of the same data.

Note the correspondences between the analyses. The two-sample test statistic is t=3.62; when this is squared it gives the F-statistic of the ANOVA table ($3.62^2 = 13.09$, apart from rounding error). Note that both outputs give the same p-value of 0.002. The critical t-value for a two-tailed t-test with 20 degrees of freedom and a significance level of 0.05 is 2.086, which when squared gives 4.351, the critical value for an F-test with 1 and 20 degrees of freedom at the same significance level.

Thus, there is a one-to-one correspondence between the calculated test statistics and also between their critical values: the tests are equivalent.
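The correspondence can be checked from the summary figures in the Minitab output. The sketch below recomputes the pooled t statistic and squares it; the variable names are illustrative, and the critical t-value is taken as quoted rather than computed.

```python
import math

# Summary values copied from the two-sample Minitab output.
n_a = n_b = 11
diff = 2.86364      # estimated difference in means (Speed-B minus Speed-A)
pooled_sd = 1.8565  # pooled standard deviation

# Standard error of the difference between two means, pooled variance.
se_diff = pooled_sd * math.sqrt(1 / n_a + 1 / n_b)
t_stat = diff / se_diff
f_from_t = t_stat ** 2

t_crit = 2.086      # two-tailed 5% critical value for 20 df, as quoted
f_crit_from_t = t_crit ** 2

print(f"t = {t_stat:.2f}, t squared = {f_from_t:.2f} (ANOVA F = 13.09)")
print(f"critical t squared = {f_crit_from_t:.3f}")
```

Squaring the unrounded t statistic recovers the ANOVA F of 13.09, which is why squaring the rounded 3.62 gives a slightly different figure.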

Since the two tests are equivalent, it does not matter which one we carry out.

However, once a difference is found it is best practice to report a confidence interval for the long-run difference. This follows naturally from the t-test, as the two calculations involve the same quantities. For this reason, we do not usually carry out ANOVA analyses when we have only two groups.
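To illustrate how the interval reuses the quantities from the t-test, this sketch rebuilds the 95% confidence interval from the Minitab summary values. The critical value 2.086 is the one quoted for 20 degrees of freedom; the names are illustrative.

```python
import math

# Summary values copied from the two-sample Minitab output.
diff = 2.86364      # estimated difference in means (Speed-B minus Speed-A)
pooled_sd = 1.8565  # pooled standard deviation
n_per_group = 11
t_crit = 2.086      # two-tailed 5% critical value for 20 df, as quoted

# The same standard error appears in the t statistic and in the interval.
se_diff = pooled_sd * math.sqrt(2 / n_per_group)
margin = t_crit * se_diff

ci_low = diff - margin
ci_high = diff + margin

print(f"95% CI for difference: ({ci_low:.3f}, {ci_high:.3f})")
```

The result matches the Minitab interval to about three decimal places, the small discrepancy coming from rounding in the quoted inputs.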
