# discussion4

```
Discussion 4
Stephanie Chan
February 10, 2015
1 Problem 18.17 Winding Speeds
In a completely randomized design to study the effect of the speed of winding
thread (1: slow; 2: normal; 3: fast; 4: maximum) onto 75-yard spools, 16
runs of 10,000 spools each were made at each of the four winding speeds.
The response variable is the number of thread breaks during the production
run. The results (in time order) are as follows.
 i \ j    1    2    3   ...   14   15   16
   1      4    3    2   ...    2    3    4
   2      7    6    4   ...    4    7    6
   3     12    6   14   ...   13   10   14
   4     17   15    7   ...   19    9   23
Since the responses are counts, the researcher was concerned about the
normality and equal variances assumptions of the ANOVA model (16.2).
1.1 Part a.
Obtain the fitted values and residuals for ANOVA model (16.2).
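## Problem 18.17
## Import the data
## set the directory to whatever directory you keep the data
setwd("~/Dropbox/school/sta106-2015/discussion4/")
## NOTE: the handout does not show the step that creates mydata.
## The file name below is an assumption; use the name of your own copy of the data.
mydata = read.table("CH18PR17.txt")
y = mydata[,1]   # response: number of thread breaks
x = mydata[,2]   # winding speed (1-4)
stripchart(y~x,pch=16,method="stack")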
## part(a)
x = as.factor(x)
fit = aov(y~x)
# fitted values
fitted(fit)
      1       2       3       4       5       6       7       8       9      10
 3.5625  3.5625  3.5625  3.5625  3.5625  3.5625  3.5625  3.5625  3.5625  3.5625
     11      12      13      14      15      16      17      18      19      20
 3.5625  3.5625  3.5625  3.5625  3.5625  3.5625  5.8750  5.8750  5.8750  5.8750
     21      22      23      24      25      26      27      28      29      30
 5.8750  5.8750  5.8750  5.8750  5.8750  5.8750  5.8750  5.8750  5.8750  5.8750
     31      32      33      34      35      36      37      38      39      40
 5.8750  5.8750 10.6875 10.6875 10.6875 10.6875 10.6875 10.6875 10.6875 10.6875
     41      42      43      44      45      46      47      48      49      50
10.6875 10.6875 10.6875 10.6875 10.6875 10.6875 10.6875 10.6875 16.5625 16.5625
     51      52      53      54      55      56      57      58      59      60
16.5625 16.5625 16.5625 16.5625 16.5625 16.5625 16.5625 16.5625 16.5625 16.5625
     61      62      63      64
16.5625 16.5625 16.5625 16.5625
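## Sketch (not in the original handout): in a one-way ANOVA the fitted value
## for every run is its treatment sample mean, and each residual is y minus it.
## Assumption: y_sub/x_sub hold only the six runs per speed printed in the data
## table (j = 1, 2, 3, 14, 15, 16), so these means differ from the full-data
## values above.
y_sub = c( 4,  3,  2,  2,  3,  4,
           7,  6,  4,  4,  7,  6,
          12,  6, 14, 13, 10, 14,
          17, 15,  7, 19,  9, 23)
x_sub = factor(rep(1:4, each = 6))
fit_sub = aov(y_sub ~ x_sub)
group_means = ave(y_sub, x_sub)   # each run's own treatment mean
all.equal(unname(fitted(fit_sub)), group_means)
all.equal(unname(residuals(fit_sub)), y_sub - group_means)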
# residuals
residuals(fit)
      1       2       3       4       5       6       7       8       9      10
 0.4375 -0.5625 -1.5625 -0.5625  0.4375  0.4375 -0.5625  2.4375  1.4375  0.4375
     11      12      13      14      15      16      17      18      19      20
-1.5625  0.4375  0.4375 -1.5625 -0.5625  0.4375  1.1250  0.1250 -1.8750  0.1250
     21      22      23      24      25      26      27      28      29      30
 1.1250 -3.8750  3.1250 -0.8750 -0.8750  3.1250 -2.8750  2.1250  0.1250 -1.8750
     31      32      33      34      35      36      37      38      39      40
 1.1250  0.1250  1.3125 -4.6875  3.3125  1.3125 -0.6875 -1.6875  1.3125  6.3125
     41      42      43      44      45      46      47      48      49      50
-3.6875 -4.6875  1.3125  0.3125 -4.6875  2.3125 -0.6875  3.3125  0.4375 -1.5625
     51      52      53      54      55      56      57      58      59      60
-9.5625  3.4375 -3.5625 -5.5625 -0.5625  8.4375 -5.5625  7.4375  1.4375  4.4375
     61      62      63      64
-0.5625  2.4375 -7.5625  6.4375
1.2 Part b
Prepare suitable residual plots to study whether or not the error variances
are equal for the four winding speeds. What are your findings?
## part (b) Residual plots
r = residuals(fit)
stripchart(r~x,method="stack",pch=16)
1.3 Part c
Test by means of the Brown-Forsythe test whether or not the treatment error
variances are equal; use α = .05. What is the p-value of the test? Are your
results consistent with the diagnosis in part b?
## part (c) Brown-Forsythe test
# install.packages("lawstat")
library(lawstat)
levene.test(y,x)
:
: modified robust Brown-Forsythe Levene-type test based on the absolute
: deviations from the median
:
: data: y
: Test Statistic = 9.5416, p-value = 3.04e-05
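## Sketch (not in the original handout): the Brown-Forsythe statistic is the
## one-way ANOVA F statistic computed on d = |y - treatment median|.
## Assumption: y_sub/x_sub hold only the six runs per speed printed in the data
## table (j = 1, 2, 3, 14, 15, 16), so the result differs from the output above.
y_sub = c( 4,  3,  2,  2,  3,  4,
           7,  6,  4,  4,  7,  6,
          12,  6, 14, 13, 10, 14,
          17, 15,  7, 19,  9, 23)
x_sub = factor(rep(1:4, each = 6))
d_sub = abs(y_sub - ave(y_sub, x_sub, FUN = median))   # absolute deviations from the median
bf_table = anova(lm(d_sub ~ x_sub))
bf_table   # "F value" and "Pr(>F)" play the role of the test statistic and p-value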
Test by means of the Hartley test whether or not the treatment error
variances are equal; use α = .05.
## Hartley Test
by(y,x,var)
x: 1
[1] 1.195833
------------------------------------------------------------
x: 2
[1] 3.983333
------------------------------------------------------------
x: 3
[1] 10.49583
------------------------------------------------------------
x: 4
[1] 28.92917
# install.packages("SuppDists")
library(SuppDists)
# H* = largest sample variance / smallest sample variance
Hstar = 28.929/1.195
# df = 16 - 1 = 15 per treatment, 4 treatments
Hcrit = qmaxFratio(0.95,15,4)
pval = 1-pmaxFratio(Hstar,15,4)
# can also use Table B.10
Hstar
Hcrit
pval
[1] 24.20837
[1] 3.998907
[1] 0.0003999325
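## Sketch (not in the original handout): H* taken from the unrounded by() output
## rather than the hand-typed 28.929/1.195, so the value differs slightly from
## the Hstar printed above. With the full data, tapply(y, x, var) would give the
## same four variances directly.
s2 = c(1.195833, 3.983333, 10.49583, 28.92917)   # variances copied from by(y,x,var)
Hstar_exact = max(s2)/min(s2)
Hstar_exact              # about 24.19
Hstar_exact > 3.998907   # compare with the Hcrit printed above; TRUE means reject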
1.4 Part d
For each winding speed, calculate Ȳi. and si. Examine the relations found
in the table on page 791 and determine the transformation that is most
appropriate here. What do you conclude?
## part (d)
means = by(y,x,mean)
sds = by(y,x,sd)
sds^2/means
sds/means
sds/means^2
> means = by(y,x,mean)
> means
x: 1
[1] 3.5625
------------------------------------------------------------
x: 2
[1] 5.875
------------------------------------------------------------
x: 3
[1] 10.6875
------------------------------------------------------------
x: 4
[1] 16.5625
> sds = by(y,x,sd)
> sds
x: 1
[1] 1.093542
------------------------------------------------------------
x: 2
[1] 1.995829
------------------------------------------------------------
x: 3
[1] 3.239727
------------------------------------------------------------
x: 4
[1] 5.378584
> sds^2/means
x: 1
[1] 0.3356725
------------------------------------------------------------
x: 2
[1] 0.6780142
------------------------------------------------------------
x: 3
[1] 0.9820663
------------------------------------------------------------
x: 4
[1] 1.746667
> sds/means
x: 1
[1] 0.3069591
------------------------------------------------------------
x: 2
[1] 0.3397156
------------------------------------------------------------
x: 3
[1] 0.3031324
------------------------------------------------------------
x: 4
[1] 0.3247447
> sds/means^2
x: 1
[1] 0.08616395
------------------------------------------------------------
x: 2
[1] 0.05782393
------------------------------------------------------------
x: 3
[1] 0.02836326
------------------------------------------------------------
x: 4
[1] 0.01960723
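## Sketch (not in the original handout): one way to read the three ratios is to
## compare how much each varies across the four speeds; the steadiest ratio
## points to its transformation. The max/min spread used here is an illustrative
## choice, not a rule from the textbook.
means_d = c(3.5625, 5.875, 10.6875, 16.5625)          # copied from the output above
sds_d   = c(1.093542, 1.995829, 3.239727, 5.378584)   # copied from the output above
ratios = rbind("sqrt(Y)" = sds_d^2/means_d,
               "log(Y)"  = sds_d/means_d,
               "1/Y"     = sds_d/means_d^2)
spread = apply(ratios, 1, function(r) max(r)/min(r))
spread                     # roughly 5.2, 1.1 and 4.4
names(which.min(spread))   # "log(Y)": si/Ybar_i. is nearly constant across speeds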
i                 si^2/Ȳi.    si/Ȳi.    si/Ȳi.^2
1                  0.3357     0.3070     0.0862
2                  0.6780     0.3397     0.0578
3                  0.9820     0.3031     0.0284
4                  1.7467     0.3247     0.0196
transformation       √Y       log Y       1/Y
1.5 Part e.
Use the Box-Cox procedure to find an appropriate power transformation of
Y. Evaluate SSE for the values of λ given in Table 18.6. Does λ = 0, a
logarithmic transformation, appear to be reasonable based on the Box-Cox
procedure?
## part(e)
library(MASS)
fit = aov(y~x)
boxcox(fit)
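## Sketch (not in the original handout): SSE over a grid of lambda values, using
## the standardized Box-Cox variable so that SSEs are comparable across lambda:
##   W = (Y^lambda - 1) / (lambda * K2^(lambda - 1))   for lambda != 0
##   W = K2 * log(Y)                                   for lambda == 0
## where K2 is the geometric mean of Y.
## Assumptions: the lambda grid is illustrative (Table 18.6 is not reproduced in
## the handout), boxcox_sse is a hypothetical helper name, and y_sub/x_sub hold
## only the six runs per speed printed in the data table.
y_sub = c( 4,  3,  2,  2,  3,  4,
           7,  6,  4,  4,  7,  6,
          12,  6, 14, 13, 10, 14,
          17, 15,  7, 19,  9, 23)
x_sub = factor(rep(1:4, each = 6))
boxcox_sse = function(lambda, y, x) {
  k2 = exp(mean(log(y)))   # geometric mean of the responses
  w = if (lambda == 0) k2*log(y) else (y^lambda - 1)/(lambda*k2^(lambda - 1))
  sum(residuals(aov(w ~ x))^2)   # SSE of the one-way ANOVA on W
}
lambdas = c(-1, -0.5, 0, 0.5, 1)
sse = sapply(lambdas, boxcox_sse, y = y_sub, x = x_sub)
data.frame(lambda = lambdas, SSE = sse)   # the smallest SSE marks the preferred lambda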
1.6 Extra
If you want to do the analysis with log-transformed data:
ynew = log(y)
stripchart(ynew~x,method="stack",pch=16)
fitnew = aov(ynew~x)
rnew = residuals(fitnew)
stripchart(rnew~x,method="stack",pch=16)
2 Interaction Plot
This is slightly modified from what I did in discussion today. Create a fake
data set with factor 1 having 2 levels and factor 2 having 3 levels. Use rnorm
to generate normally distributed random values.
# make data for interaction plot
set.seed(0) # only so we have the same random values each time
val = rnorm(6)
f1 = factor(c(1,1,1,2,2,2))
# equivalent to factor(rep(c(1,2),each=3))
f2 = factor(c(1,2,3,1,2,3))
# equivalent to factor(rep(c(1,2,3),times=2))
# help(rep) for more details
# check to make sure your data: val, f1, f2 all match properly
   f2
f1           1          2          3
  1  1.2629543 -0.3262334  1.3297993
  2  1.2724293  0.4146414 -1.5399500
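## Sketch (not in the original handout; the handout does not show which command
## printed the table above): one way to lay val out by factor level so you can
## confirm each value sits in the intended (f1, f2) cell.
set.seed(0)
val = rnorm(6)
f1 = factor(rep(c(1,2),each=3))
f2 = factor(rep(c(1,2,3),times=2))
# one value per cell, so mean() simply returns that value
check = tapply(val, list(f1 = f1, f2 = f2), mean)
check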
interaction.plot(f1,f2,val)
interaction.plot(f2,f1,val)
You can show the interaction from both sides by swapping the first two arguments.
```