Nicole Meyer

STAT 3910/4910
MIDTERM #1
Winter, 2007
Thanks to Nicole Meyer and Xingyan Bai
Question 1
a)
Obs  sex     actual  ideal  diff
  1  Male       215    190    25
  2  Female     155    135    20
  3  Male       195    155    40
  4  Female     145    130    15
  5  Female     110    100    10
  6  Male       155    170   -15
  7  Male       155    155     0
  8  Female     114    110     4
  9  Female     135    135     0
 10  Male       180    171     9
 11  Female     140    130    10
 12  Male       145    155   -10
 13  Male       220    200    20
 14  Female     132    120    12
 15  Male       208    190    18
 16  Female     135    130     5
 17  Male       180    165    15
 18  Male       155    145    10
 19  Female     152    135    17
 20  Female     126    120     6
 21  Male       155    170   -15
 22  Female     135    125    10
 23  Female     125    110    15
 24  Male       155    190   -35
 25  Male       160    150    10
 26  Female     130    105    25
 27  Male       150    150     0
 28  Male       185    185     0
 29  Male       200    190    10
 30  Male       180    200   -20
 31  Male       180    180     0
 32  Female     173    135    38
 33  Female     170    135    35
 34  Female     170    125    45
 35  Female     110    103     7
 36  Female     150    140    10
 37  Female     140    135     5
 38  Female     135    125    10
 39  Male       170    175    -5
 40  Male       170    170     0
 41  Female     116    116     0
 42  Female     160    155     5
 43  Female     122    120     2
 44  Female     140    125    15
 45  Female     122    122     0
 46  Female     155    135    20
 47  Female     120    115     5
 48  Female     135    130     5
 49  Male       195    185    10
 50  Male       180    170    10
 51  Male       185    175    10
Obs  sex     actual  ideal  diff
 52  Female     180    140    40
 53  Female     155    126    29
 54  Female     140    130    10
 55  Female     105    103     2
 56  Female     110    100    10
 57  Female     100     95     5
 58  Male       170    185   -15
 59  Male       200    180    20
 60  Female     125    115    10
 61  Female     130    125     5
 62  Female     138    130     8
 63  Female     135    120    15
 64  Female     131    125     6
 65  Female     109    110    -1
 66  Female     118    110     8
 67  Male       160    170   -10
 68  Female     118    125    -7
 69  Female     120    110    10
 70  Male       185    185     0
 71  Female     120    115     5
 72  Female     136    125    11
 73  Female     180    165    15
 74  Female     150    140    10
 75  Male       142    145    -3
 76  Female     130    125     5
 77  Female     118    115     3
 78  Female     112    108     4
 79  Male       135    135     0
 80  Female     166    140    26
 81  Female     102    100     2
 82  Male       185    175    10
 83  Female     175    155    20
 84  Male       190    165    25
 85  Male       220    195    25
 86  Female     130    120    10
 87  Male       135    135     0
 88  Male       163    178   -15
 89  Male       160    160     0
 90  Male       155    150     5
 91  Female     110    100    10
 92  Female     130    130     0
 93  Female     103    105    -2
 94  Female     135    125    10
 95  Female     105    125   -20
 96  Female     125    120     5
 97  Female     110    100    10
 98  Female     155    150     5
 99  Male       160    175   -15
100  Female     103    101     2
101  Female     121    118     3
102  Male       145    145     0
Obs  sex     actual  ideal  diff
103  Female      90     85     5
104  Female     133    125     8
105  Female     122    120     2
106  Female     126    120     6
107  Female     140    130    10
108  Male       190    200   -10
109  Female     140    125    15
110  Female     110    100    10
111  Male       117    123    -6
112  Male       175    200   -25
113  Male       175    170     5
114  Male       230    225     5
115  Female     140    130    10
116  Female     125    112    13
117  Female     113    113     0
118  Male       115    130   -15
119  Female     185    135    50
120  Female     150    125    25
121  Female     140    120    20
122  Female     120    110    10
123  Female     165    140    25
124  Female     130    120    10
125  Female     187    160    27
126  Female     150    120    30
127  Female     142    135     7
128  Male       168    163     5
129  Female     128    125     3
130  Female     110     99    11
131  Female     145    138     7
132  Female     135    125    10
133  Female     120    125    -5
134  Male       140    150   -10
135  Female     125    115    10
136  Male       190    175    15
137  Female     110    105     5
138  Female     135    125    10
139  Female     135    130     5
140  Male       190    190     0
141  Male       165    170    -5
142  Female     140    130    10
143  Female     145    130    15
144  Male       175    180    -5
145  Female     120    110    10
146  Female     138    140    -2
147  Female     125    115    10
148  Female     130    120    10
149  Male       208    190    18
150  Female     120    120     0
151  Female     160    145    15
152  Female     115    100    15
153  Female     145    130    15
Obs  sex     actual  ideal  diff
154  Male       235    200    35
155  Male       175    170     5
156  Female     135    125    10
157  Female     130    120    10
158  Male       190    190     0
159  Female     120    110    10
160  Female     145    125    20
161  Female     130    120    10
162  Female     111    111     0
163  Male       195    200    -5
164  Female      97    100    -3
165  Female     160    140    20
166  Female     139    135     4
167  Female     115    115     0
168  Male       187    185     2
169  Female     130    125     5
170  Male       203    203     0
171  Male       215    210     5
172  Male       150    150     0
173  Male       220    200    20
174  Female     175    120    55
175  Male       170    170     0
176  Female     140    100    40
177  Female     130    120    10
178  Male       180    180     0
179  Female     125    120     5
180  Male       158    150     8
181  Female     130    130     0
182  Female     118    118     0
b)
[Figure: scatter plot of student's ideal weight (pounds) against self-reported weight (pounds), with separate plotting symbols for Male and Female.]
From the scatter plot we can see that males generally have higher ideal and actual weights than
females. The relationship between ideal weight and actual weight is approximately linear for both
males and females, with about the same slope. For a given actual weight, males also show more
spread in ideal weight than females do.
c)
TEST NORMALITY OF ACTUAL WEIGHT OF COLLEGE STUDENTS

The UNIVARIATE Procedure
Variable: actual (Self-reported weight, pounds)

Tests for Normality

Test                  --Statistic---      -----p Value------
Shapiro-Wilk          W       0.959184    Pr < W      <0.0001
Kolmogorov-Smirnov    D       0.123175    Pr > D      <0.0100
Cramer-von Mises      W-Sq    0.420082    Pr > W-Sq   <0.0050
Anderson-Darling      A-Sq    2.375144    Pr > A-Sq   <0.0050

The null hypothesis for each of the four tests above is that the data are normally distributed.
At a significance level of .05, every test has a p-value below .01, so we reject the null
hypothesis and conclude that the actual weight of college students is not normally distributed.
The Q-Q plot below shows the same thing visually: the points do not fall on a straight line, and
the plot is roughly convex.
[Figure: normal Q-Q plot of self-reported weight (pounds) against normal quantiles.]
d)
[Figure: boxplots of the difference between actual and ideal weights, by sex.]
From the boxplots above, the mean difference between actual and ideal weight is higher for
females than for males. The differences for males, however, are more widely spread (a larger
box).
Question 2
a)
DATA ON LIGHTING AND MYOPIA

Light        Nearsight      Count
Darkness     No myopia        155
Darkness     Myopia            15
Darkness     High myopia        2
Nightlight   No myopia        153
Nightlight   Myopia            72
Nightlight   High myopia        7
Full Light   No myopia         34
Full Light   Myopia            36
Full Light   High myopia        5

b)
TWO-WAY TABLE FOR LIGHTING AND MYOPIA

The FREQ Procedure

Table of Light by Nearsight

Light        Nearsight

Frequency  | No myopia | Myopia | High myopia |  Total
-----------+-----------+--------+-------------+
Darkness   |       155 |     15 |           2 |    172
-----------+-----------+--------+-------------+
Nightlight |       153 |     72 |           7 |    232
-----------+-----------+--------+-------------+
Full Light |        34 |     36 |           5 |     75
-----------+-----------+--------+-------------+
Total              342      123            14      479

c)
TEST OF LIGHTING DISTRIBUTION

The FREQ Procedure

                                       Test    Cumulative   Cumulative
Light        Frequency   Percent    Percent     Frequency      Percent
----------------------------------------------------------------------
Darkness           172     35.91      35.00           172        35.91
Nightlight         232     48.43      50.00           404        84.34
Full Light          75     15.66      15.00           479       100.00

Chi-Square Test for Specified Proportions
-----------------------------------------
Chi-Square       0.4858
DF                    2
Pr > ChiSq       0.7843

Sample Size = 479

The null hypothesis of the chi-square test above is that the data follow the specified
distribution (35% Darkness, 50% Nightlight, 15% Full Light). With a p-value of .78 and a
significance level of .05, we fail to reject the null hypothesis, so the data are consistent
with the tested distribution. The observed percentages are all within 2 percentage points of
the tested percentages.
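The statistic in the output above can be reproduced by hand. The DATA step below is a minimal sketch of that check and is not part of the original program: the array and variable names (observed, testp, expected) are assumptions chosen for illustration, and the counts and tested proportions are copied from the PROC FREQ table.

```sas
/* Hand check of the chi-square test for specified proportions.      */
/* Names are illustrative; values are copied from the output above.  */
DATA _NULL_;
   ARRAY observed{3} _TEMPORARY_ (172 232 75);      /* Darkness, Nightlight, Full Light */
   ARRAY testp{3}    _TEMPORARY_ (0.35 0.50 0.15);  /* proportions given in TESTP=      */
   n = 172 + 232 + 75;
   chisq = 0;
   DO i = 1 TO 3;
      expected = n * testp{i};                      /* expected count under H0 */
      chisq = chisq + (observed{i} - expected)**2 / expected;
   END;
   df = 3 - 1;                                      /* categories minus one */
   pvalue = 1 - PROBCHI(chisq, df);
   PUT chisq= 8.4 df= pvalue= 8.4;
RUN;
```

The values written to the log should agree with the Chi-Square of 0.4858 and Pr > ChiSq of 0.7843 reported by PROC FREQ.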
d)
[Figure: side-by-side vertical bar chart of Count (sum) by Nearsight (No myopia, Myopia, High myopia), grouped by Light (Darkness, Nightlight, Full Light).]
From the side-by-side bar chart above, the Full Light group has the lowest relative frequency of
No myopia later in childhood, while the Darkness group has the highest. We therefore observe that
Nightlight and Full Light are associated with more myopia later in childhood, and the stronger
the light, the more frequent the myopia.
e) i)
TWO-WAY TABLE FOR DARKNESS & MYOPIA

The FREQ Procedure

Table of Darkness by Myopia

Darkness     Myopia

Frequency |
Percent   |
Row Pct   |
Col Pct   | No      | Yes     |  Total
----------+---------+---------+
Yes       |     155 |      17 |    172
          |   32.36 |    3.55 |  35.91
          |   90.12 |    9.88 |
          |   45.32 |   12.41 |
----------+---------+---------+
No        |     187 |     120 |    307
          |   39.04 |   25.05 |  64.09
          |   60.91 |   39.09 |
          |   54.68 |   87.59 |
----------+---------+---------+
Total           342       137      479
              71.40     28.60   100.00

e) ii)
Statistics for Table of Darkness by Myopia

Statistic                      DF       Value      Prob
------------------------------------------------------
Chi-Square                      1     46.0412    <.0001
Likelihood Ratio Chi-Square     1     51.6049    <.0001
Continuity Adj. Chi-Square      1     44.6222    <.0001
Mantel-Haenszel Chi-Square      1     45.9451    <.0001
Phi Coefficient                        0.3100
Contingency Coefficient                0.2961
Cramer's V                             0.3100

For the chi-square test of independence, the null hypothesis is that the two categorical
variables are independent. With a significance level of .05 and a p-value of <.0001, we reject
the null hypothesis and conclude that darkness and myopia are dependent.
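For a 2x2 table the Pearson chi-square has a closed form, so the reported value can be checked directly from the four cell counts. The DATA step below is a sketch under that assumption; the cell names a, b, c, d and the other variable names are illustrative and do not appear in the original program.

```sas
/* Hand check of the Pearson chi-square and phi for the 2x2 table.  */
/* Cell names a-d are illustrative; counts come from the table.     */
DATA _NULL_;
   a = 155; b = 17;     /* Darkness = Yes: Myopia No, Myopia Yes */
   c = 187; d = 120;    /* Darkness = No:  Myopia No, Myopia Yes */
   n = a + b + c + d;
   /* Shortcut formula: n(ad - bc)^2 / (product of the four margins) */
   chisq = n * (a*d - b*c)**2 / ((a + b) * (c + d) * (a + c) * (b + d));
   pvalue = 1 - PROBCHI(chisq, 1);
   phi = SQRT(chisq / n);
   PUT chisq= 8.4 pvalue= PVALUE6.4 phi= 8.4;
RUN;
```

The log values should match the Chi-Square of 46.0412 and the Phi Coefficient of 0.3100 in the output above.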
e) iii)
Estimates of the Relative Risk (Row1/Row2)

Type of Study                  Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)     5.8509        3.3732       10.1485
Cohort (Col1 Risk)            1.4794        1.3355        1.6389
Cohort (Col2 Risk)            0.2529        0.1577        0.4055

Sample Size = 479

The relative risk of myopia for children who slept in darkness compared with children who did
not is estimated at 0.253, with a 95% confidence interval of (0.158, 0.406). Because the whole
interval lies below 1, sleeping in darkness in infancy is associated with a significantly reduced
risk of myopia.
e) iv) Children who slept in darkness are 0.253 times as likely to develop myopia as children
who did not sleep in darkness.
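The Col2 relative risk and its confidence limits can also be recomputed from the cell counts. The sketch below assumes the usual large-sample interval on the log scale; the variable names (risk_dark, risk_light, se_log) are illustrative only.

```sas
/* Hand check of the Col2 (Myopia = Yes) relative risk and 95% limits. */
/* Assumes the large-sample log-scale interval; names are illustrative. */
DATA _NULL_;
   risk_dark  = 17 / 172;     /* myopia among children who slept in darkness */
   risk_light = 120 / 307;    /* myopia among children who did not           */
   rr = risk_dark / risk_light;
   se_log = SQRT(1/17 - 1/172 + 1/120 - 1/307);   /* std. error of log(rr) */
   z = PROBIT(0.975);                             /* about 1.96 */
   lower = EXP(LOG(rr) - z * se_log);
   upper = EXP(LOG(rr) + z * se_log);
   PUT rr= 8.4 lower= 8.4 upper= 8.4;
RUN;
```

The result should reproduce the Cohort (Col2 Risk) row: 0.2529 with limits 0.1577 and 0.4055.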
f)
PARTIAL DATA ON LIGHTING AND MYOPIA

Light        Nearsight      Count    Darkness
Darkness     Myopia            15    Yes
Darkness     High myopia        2    Yes
Nightlight   Myopia            72    No
Nightlight   High myopia        7    No
Full Light   Myopia            36    No
Full Light   High myopia        5    No

Table of Darkness by Nearsight

Darkness     Nearsight

Frequency |
Percent   |
Row Pct   |
Col Pct   | Myopia  | High myopia |  Total
----------+---------+-------------+
Yes       |      15 |           2 |     17
          |   10.95 |        1.46 |  12.41
          |   88.24 |       11.76 |
          |   12.20 |       14.29 |
----------+---------+-------------+
No        |     108 |          12 |    120
          |   78.83 |        8.76 |  87.59
          |   90.00 |       10.00 |
          |   87.80 |       85.71 |
----------+---------+-------------+
Total           123            14      137
              89.78         10.22   100.00

Estimates of the Relative Risk (Row1/Row2)

Type of Study                  Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)     0.8333        0.1697        4.0921
Cohort (Col1 Risk)            0.9804        0.8160        1.1779
Cohort (Col2 Risk)            1.1765        0.2878        4.8098

Sample Size = 137

Among children with myopia, those who slept in darkness are estimated to be 1.177 times as
likely to have high myopia as those who did not sleep in darkness. The 95% confidence interval
(0.288, 4.810) contains 1, the value that corresponds to equal risk, so sleeping in darkness
during infancy does not significantly change the risk of high myopia for children with myopia.
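The odds ratio row of the same output leads to the same conclusion. As a sketch, the DATA step below recomputes it with the large-sample interval on the log scale and flags whether the interval covers 1; the cell names and the contains_one flag are illustrative and not part of the original program.

```sas
/* Hand check of the odds ratio for Darkness by Nearsight (myopic children). */
/* Assumes the large-sample log-scale interval; names are illustrative.      */
DATA _NULL_;
   a = 15;  b = 2;      /* Darkness = Yes: Myopia, High myopia */
   c = 108; d = 12;     /* Darkness = No:  Myopia, High myopia */
   odds_ratio = (a * d) / (b * c);
   se_log = SQRT(1/a + 1/b + 1/c + 1/d);   /* std. error of log(odds ratio) */
   z = PROBIT(0.975);
   lower = EXP(LOG(odds_ratio) - z * se_log);
   upper = EXP(LOG(odds_ratio) + z * se_log);
   contains_one = (lower < 1) AND (upper > 1);   /* 1 = interval covers equal odds */
   PUT odds_ratio= 8.4 lower= 8.4 upper= 8.4 contains_one=;
RUN;
```

The values should agree with the Case-Control (Odds Ratio) row, 0.8333 with limits 0.1697 and 4.0921, an interval that also contains 1.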
SAS Code
/* ----------------------------------------------------------------Question 1
------------------------------------------------------------------*/
/* a */
DATA weight;
INFILE 'C:\Documents and Settings\Nicole Meyer\Desktop\tst1_qn1.txt';
INPUT sex$ actual ideal diff;
LABEL
sex = "Male or Female"
actual = "Self-reported weight, pounds"
ideal = "Student's ideal weight, pounds"
diff = "Difference between actual and ideal weights";
RUN;
PROC PRINT DATA=weight;
RUN;
/* b */
PROC GPLOT DATA=weight;
TITLE "SCATTER PLOT FOR IDEAL WEIGHT BY ACTUAL WEIGHT";
PLOT ideal*actual=sex;
SYMBOL1 V=circle COLOR=black;
SYMBOL2 V=square COLOR=black;
RUN;
/* c */
PROC UNIVARIATE DATA=weight NORMAL;
TITLE "TEST NORMALITY OF ACTUAL WEIGHT OF COLLEGE STUDENTS";
VAR actual;
HISTOGRAM actual / NORMAL;
QQPLOT actual;
RUN;
/* d */
PROC SORT DATA=weight;
BY sex;
RUN;
Proc Boxplot data = weight;
Title "Boxplot of Weight Difference by Sex";
Plot diff*sex;
RUN;
/* ----------------------------------------------------------------Question 2
------------------------------------------------------------------*/
/* a */
PROC FORMAT;
/* Character formats: the coded values are quoted as strings */
VALUE $Lightfmt
'1' = 'Darkness'
'2' = 'Nightlight'
'3' = 'Full Light';
VALUE $Nearsightfmt
'3' = 'No myopia'
'2' = 'Myopia'
'1' = 'High myopia';
RUN;
DATA children;
INPUT Light$ Nearsight$ Count;
/* -------------- e --------------- */
IF Light = '1' THEN Darkness = 'Yes';
ELSE Darkness = 'No';
IF Nearsight = '3' THEN Myopia = 'No ';
ELSE Myopia = 'Yes';
/* -------------------------------- */
FORMAT Light $Lightfmt.;
FORMAT Nearsight $Nearsightfmt.;
DATALINES;
1 3 155
1 2 15
1 1 2
2 3 153
2 2 72
2 1 7
3 3 34
3 2 36
3 1 5
;
RUN;
PROC PRINT DATA=children;
TITLE "DATA ON LIGHTING AND MYOPIA";
ID Light;
RUN;
/* b */
PROC FREQ DATA=children ORDER=DATA;
TITLE "TWO-WAY TABLE FOR LIGHTING AND MYOPIA";
WEIGHT Count;
TABLES Light*Nearsight / NOCOL NOROW NOPCT;
RUN;
/* c */
PROC FREQ DATA=children ORDER=DATA;
TITLE 'TEST OF LIGHTING DISTRIBUTION';
WEIGHT Count;
TABLES Light / TESTP=(35 50 15);
RUN;
/* d */
PROC GCHART DATA=children;
TITLE "Barchart for Night Lighting and Nearsightedness";
VBAR nearsight / group=light sumvar=count;
RUN;
/* e(i)(ii)(iii): CHISQ gives the tests of independence; MEASURES gives the relative risk estimates */
PROC FREQ DATA=children ORDER=DATA;
TITLE "TWO-WAY TABLE FOR DARKNESS & MYOPIA";
Weight Count;
TABLES Darkness*Myopia / CHISQ MEASURES ;
RUN;
/* f */
PROC FORMAT;
/* Nearsight is recoded for part f: 2 = Myopia, 3 = High myopia */
VALUE $Lightfmt
'1' = 'Darkness'
'2' = 'Nightlight'
'3' = 'Full Light';
VALUE $Nearsightfmt
'1' = 'No myopia'
'2' = 'Myopia'
'3' = 'High myopia';
RUN;
DATA MyopiaChildren;
INPUT Light$ Nearsight$ Count;
IF Light = '1' THEN Darkness = 'Yes';
ELSE Darkness = 'No';
FORMAT
Light $Lightfmt.;
FORMAT
Nearsight $Nearsightfmt.;
DATALINES;
1 2 15
1 3 2
2 2 72
2 3 7
3 2 36
3 3 5
;
RUN;
PROC PRINT DATA=MyopiaChildren;
TITLE "PARTIAL DATA ON LIGHTING AND MYOPIA";
ID Light;
RUN;
PROC FREQ DATA=MyopiaChildren ORDER=DATA;
Weight Count;
TABLES Darkness*Nearsight / CHISQ MEASURES ;
RUN;