STAT 3910/4910 MIDTERM #1 Winter, 2007
Thanks to Nicole Meyer and Xingyan Bai

Question 1

a)

Obs  sex     actual  ideal  diff
  1  Male       215    190    25
  2  Female     155    135    20
  3  Male       195    155    40
  4  Female     145    130    15
  5  Female     110    100    10
  6  Male       155    170   -15
  7  Male       155    155     0
  8  Female     114    110     4
  9  Female     135    135     0
 10  Male       180    171     9
 11  Female     140    130    10
 12  Male       145    155   -10
 13  Male       220    200    20
 14  Female     132    120    12
 15  Male       208    190    18
 16  Female     135    130     5
 17  Male       180    165    15
 18  Male       155    145    10
 19  Female     152    135    17
 20  Female     126    120     6
 21  Male       155    170   -15
 22  Female     135    125    10
 23  Female     125    110    15
 24  Male       155    190   -35
 25  Male       160    150    10
 26  Female     130    105    25
 27  Male       150    150     0
 28  Male       185    185     0
 29  Male       200    190    10
 30  Male       180    200   -20
 31  Male       180    180     0
 32  Female     173    135    38
 33  Female     170    135    35
 34  Female     170    125    45
 35  Female     110    103     7
 36  Female     150    140    10
 37  Female     140    135     5
 38  Female     135    125    10
 39  Male       170    175    -5
 40  Male       170    170     0
 41  Female     116    116     0
 42  Female     160    155     5
 43  Female     122    120     2
 44  Female     140    125    15
 45  Female     122    122     0
 46  Female     155    135    20
 47  Female     120    115     5
 48  Female     135    130     5
 49  Male       195    185    10
 50  Male       180    170    10
 51  Male       185    175    10
 52  Female     180    140    40
 53  Female     155    126    29
 54  Female     140    130    10
 55  Female     105    103     2
 56  Female     110    100    10
 57  Female     100     95     5
 58  Male       170    185   -15
 59  Male       200    180    20
 60  Female     125    115    10
 61  Female     130    125     5
 62  Female     138    130     8
 63  Female     135    120    15
 64  Female     131    125     6
 65  Female     109    110    -1
 66  Female     118    110     8
 67  Male       160    170   -10
 68  Female     118    125    -7
 69  Female     120    110    10
 70  Male       185    185     0
 71  Female     120    115     5
 72  Female     136    125    11
 73  Female     180    165    15
 74  Female     150    140    10
 75  Male       142    145    -3
 76  Female     130    125     5
 77  Female     118    115     3
 78  Female     112    108     4
 79  Male       135    135     0
 80  Female     166    140    26
 81  Female     102    100     2
 82  Male       185    175    10
 83  Female     175    155    20
 84  Male       190    165    25
 85  Male       220    195    25
 86  Female     130    120    10
 87  Male       135    135     0
 88  Male       163    178   -15
 89  Male       160    160     0
 90  Male       155    150     5
 91  Female     110    100    10
 92  Female     130    130     0
 93  Female     103    105    -2
 94  Female     135    125    10
 95  Female     105    125   -20
 96  Female     125    120     5
 97  Female     110    100    10
 98  Female     155    150     5
 99  Male       160    175   -15
100  Female     103    101     2
101  Female     121    118     3
102  Male       145    145     0
103  Female      90     85     5
104  Female     133    125     8
105  Female     122    120     2
106  Female     126    120     6
107  Female     140    130    10
108  Male       190    200   -10
109  Female     140    125    15
110  Female     110    100    10
111  Male       117    123    -6
112  Male       175    200   -25
113  Male       175    170     5
114  Male       230    225     5
115  Female     140    130    10
116  Female     125    112    13
117  Female     113    113     0
118  Male       115    130   -15
119  Female     185    135    50
120  Female     150    125    25
121  Female     140    120    20
122  Female     120    110    10
123  Female     165    140    25
124  Female     130    120    10
125  Female     187    160    27
126  Female     150    120    30
127  Female     142    135     7
128  Male       168    163     5
129  Female     128    125     3
130  Female     110     99    11
131  Female     145    138     7
132  Female     135    125    10
133  Female     120    125    -5
134  Male       140    150   -10
135  Female     125    115    10
136  Male       190    175    15
137  Female     110    105     5
138  Female     135    125    10
139  Female     135    130     5
140  Male       190    190     0
141  Male       165    170    -5
142  Female     140    130    10
143  Female     145    130    15
144  Male       175    180    -5
145  Female     120    110    10
146  Female     138    140    -2
147  Female     125    115    10
148  Female     130    120    10
149  Male       208    190    18
150  Female     120    120     0
151  Female     160    145    15
152  Female     115    100    15
153  Female     145    130    15
154  Male       235    200    35
155  Male       175    170     5
156  Female     135    125    10
157  Female     130    120    10
158  Male       190    190     0
159  Female     120    110    10
160  Female     145    125    20
161  Female     130    120    10
162  Female     111    111     0
163  Male       195    200    -5
164  Female      97    100    -3
165  Female     160    140    20
166  Female     139    135     4
167  Female     115    115     0
168  Male       187    185     2
169  Female     130    125     5
170  Male       203    203     0
171  Male       215    210     5
172  Male       150    150     0
173  Male       220    200    20
174  Female     175    120    55
175  Male       170    170     0
176  Female     140    100    40
177  Female     130    120    10
178  Male       180    180     0
179  Female     125    120     5
180  Male       158    150     8
181  Female     130    130     0
182  Female     118    118     0

b)

[Scatter plot: Student's ideal weight, pounds (vertical axis) against Self-reported weight, pounds (horizontal axis), with separate plotting symbols for Male and Female]

The scatter plot shows that males generally report higher actual and ideal weights than females. The relationship between ideal weight and actual weight is approximately linear for both males and females, with about the same slope. Males also show more spread in ideal weight for a given actual weight than females do.

c)

TEST NORMALITY OF ACTUAL WEIGHT OF COLLEGE STUDENTS

The UNIVARIATE Procedure
Variable: actual (Self-reported weight, pounds)

                    Tests for Normality

Test                  --Statistic---     -----p Value------
Shapiro-Wilk          W      0.959184    Pr < W       <0.0001
Kolmogorov-Smirnov    D      0.123175    Pr > D       <0.0100
Cramer-von Mises      W-Sq   0.420082    Pr > W-Sq    <0.0050
Anderson-Darling      A-Sq   2.375144    Pr > A-Sq    <0.0050

For all four tests, the null hypothesis is that the data are normally distributed. Every p-value is .01 or lower, so at a significance level of .05 we reject the null hypothesis and conclude that the actual weight of college students is not normally distributed. The Q-Q plot below shows the same thing visually: the points do not follow a straight line, and the plot is roughly convex.

[Q-Q plot: Self-reported weight, pounds (vertical axis) against Normal Quantiles (horizontal axis)]

d)

[Box plots of the difference between actual and ideal weights, by sex]

The box plots show that the mean difference between actual and ideal weight is higher for females than for males. The differences for males, however, are more spread out (a bigger box).
Question 2

a)

DATA ON LIGHTING AND MYOPIA

Light        Nearsight      Count
Darkness     No myopia        155
Darkness     Myopia            15
Darkness     High myopia        2
Nightlight   No myopia        153
Nightlight   Myopia            72
Nightlight   High myopia        7
Full Light   No myopia         34
Full Light   Myopia            36
Full Light   High myopia        5

b)

TWO-WAY TABLE FOR LIGHTING AND MYOPIA

The FREQ Procedure
Table of Light by Nearsight

Light        Nearsight
Frequency  | No myopia | Myopia | High myopia | Total
-----------+-----------+--------+-------------+------
Darkness   |       155 |     15 |           2 |   172
Nightlight |       153 |     72 |           7 |   232
Full Light |        34 |     36 |           5 |    75
-----------+-----------+--------+-------------+------
Total              342      123            14     479

c)

TEST OF LIGHTING DISTRIBUTION

The FREQ Procedure

                                     Test    Cumulative   Cumulative
Light        Frequency   Percent   Percent    Frequency      Percent
--------------------------------------------------------------------
Darkness           172     35.91     35.00          172        35.91
Nightlight         232     48.43     50.00          404        84.34
Full Light          75     15.66     15.00          479       100.00

Chi-Square Test for Specified Proportions
-----------------------------------------
Chi-Square          0.4858
DF                       2
Pr > ChiSq          0.7843

Sample Size = 479

The null hypothesis of the chi-square test above is that the data follow the specified distribution (35% Darkness, 50% Nightlight, 15% Full Light). With a p-value of .78 and a significance level of .05, we fail to reject the hypothesis that the data follow the tested distribution. The observed percentages are all within 2 percentage points of the tested percentages.
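The goodness-of-fit statistic in part c) can also be recomputed by hand as a check on the PROC FREQ output. The DATA step below is a minimal sketch, not part of the original solution: it takes the observed counts (172, 232, 75) and the tested proportions (35%, 50%, 15%) from the listing above, and the variable names (obs, p0, chisq, p_value) are chosen here for illustration only. It should agree with the reported chi-square of 0.4858 on 2 degrees of freedom.

```sas
/* Hand check of the chi-square test for specified proportions.   */
/* Variable names are illustrative, not from the original program. */
DATA _NULL_;
    ARRAY obs{3} _TEMPORARY_ (172 232 75);      /* observed counts        */
    ARRAY p0{3}  _TEMPORARY_ (0.35 0.50 0.15);  /* tested proportions     */
    n = 172 + 232 + 75;                         /* sample size, 479       */
    chisq = 0;
    DO i = 1 TO 3;
        expected = n * p0{i};                   /* expected count under H0 */
        chisq = chisq + (obs{i} - expected)**2 / expected;
    END;
    df = 3 - 1;                                 /* categories minus one   */
    p_value = 1 - PROBCHI(chisq, df);           /* upper-tail probability */
    PUT chisq= 8.4 df= p_value= 8.4;            /* results go to the log  */
RUN;
```

The expected counts are 167.65, 239.5 and 71.85, so no cell is small enough to put the chi-square approximation in doubt.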
d)

[Bar chart: Count (vertical axis) for each Nearsight category (High myopia, Myopia, No myopia), grouped by Light (Darkness, Full Light, Nightlight)]

The side-by-side bar chart shows that, of the three Light groups, Full Light has the lowest relative frequency of No myopia later in childhood (34 of 75, about 45%), while Darkness has the highest (155 of 172, about 90%). Nightlight falls in between (153 of 232, about 66%). Nightlight and Full Light are therefore associated with more myopia later in childhood, and the stronger the light, the higher the relative frequency of myopia.

e) i)

TWO-WAY TABLE FOR DARKNESS & MYOPIA

The FREQ Procedure
Table of Darkness by Myopia

Darkness     Myopia

Frequency |
Percent   |
Row Pct   |
Col Pct   |    No    |   Yes    |   Total
----------+----------+----------+
Yes       |     155  |      17  |     172
          |   32.36  |    3.55  |   35.91
          |   90.12  |    9.88  |
          |   45.32  |   12.41  |
----------+----------+----------+
No        |     187  |     120  |     307
          |   39.04  |   25.05  |   64.09
          |   60.91  |   39.09  |
          |   54.68  |   87.59  |
----------+----------+----------+
Total           342        137        479
              71.40      28.60     100.00

e) ii)

Statistics for Table of Darkness by Myopia

Statistic                      DF       Value      Prob
-------------------------------------------------------
Chi-Square                      1     46.0412    <.0001
Likelihood Ratio Chi-Square     1     51.6049    <.0001
Continuity Adj. Chi-Square      1     44.6222    <.0001
Mantel-Haenszel Chi-Square      1     45.9451    <.0001
Phi Coefficient                        0.3100
Contingency Coefficient                0.2961
Cramer's V                             0.3100

For the chi-square test of independence, the null hypothesis is that the two categorical variables are independent. With a significance level of .05 and a p-value of <.0001, we reject the null hypothesis. Therefore, darkness and myopia are dependent.
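For a 2x2 table, the Pearson chi-square in part e) ii) has a closed form, n(ad - bc)^2 divided by the product of the four marginal totals, and the phi coefficient is the square root of chi-square over n. The following DATA step is a sketch for checking the listing by hand. The cell counts come from the Darkness by Myopia table above, while the variable names a, b, c, d are labels introduced here, not names from the original program.

```sas
/* Hand check of the Pearson chi-square and phi for the 2x2 table. */
/* a, b = Darkness Yes row (No myopia, Myopia);                     */
/* c, d = Darkness No row  (No myopia, Myopia).                     */
DATA _NULL_;
    a = 155; b = 17;
    c = 187; d = 120;
    n = a + b + c + d;                                /* 479 */
    chisq = n * (a*d - b*c)**2
            / ((a + b) * (c + d) * (a + c) * (b + d));
    phi = SQRT(chisq / n);                            /* magnitude of phi */
    p_value = 1 - PROBCHI(chisq, 1);                  /* 1 degree of freedom */
    PUT chisq= 8.4 phi= 6.4 p_value=;                 /* results go to the log */
RUN;
```

The result should match the Chi-Square row (46.0412) and the Phi Coefficient (0.3100) in the table of statistics. The continuity-adjusted and likelihood-ratio versions use different formulas and are not reproduced by this sketch.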
e) iii)

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      5.8509        3.3732       10.1485
Cohort (Col1 Risk)             1.4794        1.3355        1.6389
Cohort (Col2 Risk)             0.2529        0.1577        0.4055

Sample Size = 479

Myopia = Yes is the second column of the table, so the relevant estimate is Cohort (Col2 Risk). The relative risk of myopia for children who slept in darkness, compared with children who did not, is estimated at 0.253 with a 95% confidence interval of (0.158, 0.406). The interval lies entirely below 1, so sleeping in darkness in infancy is associated with a significantly lower risk of myopia.

e) iv)

Children who slept in darkness are 0.253 times as likely to develop myopia as children who did not sleep in darkness.

f)

PARTIAL DATA ON LIGHTING AND MYOPIA

Light        Nearsight      Count   Darkness
Darkness     Myopia            15   Yes
Darkness     High myopia        2   Yes
Nightlight   Myopia            72   No
Nightlight   High myopia        7   No
Full Light   Myopia            36   No
Full Light   High myopia        5   No

Table of Darkness by Nearsight

Darkness     Nearsight

Frequency |
Percent   |
Row Pct   |
Col Pct   |  Myopia  |High myopia|   Total
----------+----------+-----------+
Yes       |      15  |        2  |      17
          |   10.95  |     1.46  |   12.41
          |   88.24  |    11.76  |
          |   12.20  |    14.29  |
----------+----------+-----------+
No        |     108  |       12  |     120
          |   78.83  |     8.76  |   87.59
          |   90.00  |    10.00  |
          |   87.80  |    85.71  |
----------+----------+-----------+
Total           123          14        137
              89.78       10.22     100.00

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      0.8333        0.1697        4.0921
Cohort (Col1 Risk)             0.9804        0.8160        1.1779
Cohort (Col2 Risk)             1.1765        0.2878        4.8098

Sample Size = 137

Among children with myopia, those who slept in darkness are an estimated 1.177 times as likely to have high myopia as those who did not sleep in darkness. The 95% confidence interval (0.288, 4.810) contains 1, the value that corresponds to equal risk, so sleeping in darkness during infancy does not significantly reduce the risk of high myopia for children with myopia.
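The Col2 relative risks quoted in parts e) iii) and f) can be reproduced from the cell counts. The sketch below assumes the usual log-scale Wald interval for a risk ratio, which gives limits consistent with the listings above. The data set name rr_check and its variable names are hypothetical and do not appear in the original program.

```sas
/* Hand check of the Col2 relative risks and 95% limits.           */
/* events1/n1 = Darkness Yes row, events2/n2 = Darkness No row.    */
/* rr_check and all variable names are illustrative only.          */
DATA rr_check;
    INPUT part $ events1 n1 events2 n2;
    risk1  = events1 / n1;                 /* risk in row 1            */
    risk2  = events2 / n2;                 /* risk in row 2            */
    rr     = risk1 / risk2;                /* relative risk, Row1/Row2 */
    /* standard error of log(rr), assumed Wald form */
    se_log = SQRT((1 - risk1)/events1 + (1 - risk2)/events2);
    z      = PROBIT(0.975);                /* about 1.96               */
    lower  = EXP(LOG(rr) - z*se_log);
    upper  = EXP(LOG(rr) + z*se_log);
    DATALINES;
e 17 172 120 307
f 2 17 12 120
;
RUN;

PROC PRINT DATA=rr_check NOOBS;
    VAR part rr lower upper;
    FORMAT rr lower upper 8.4;
RUN;
```

Row e uses myopia (17 of 172 against 120 of 307) and row f uses high myopia among children with myopia (2 of 17 against 12 of 120). The much wider interval in part f) reflects the small number of high-myopia cases.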
SAS Code

/* -----------------------------------------------------------------
   Question 1
   ----------------------------------------------------------------- */

/* a: read and print the weight data */
DATA weight;
    INFILE 'C:\Documents and Settings\Nicole Meyer\Desktop\tst1_qn1.txt';
    INPUT sex$ actual ideal diff;
    LABEL sex    = "Male or Female"
          actual = "Self-reported weight, pounds"
          ideal  = "Student's ideal weight, pounds"
          diff   = "Difference between actual and ideal weights";
RUN;

PROC PRINT DATA=weight;
RUN;

/* b: scatter plot of ideal by actual weight, one symbol per sex */
PROC GPLOT DATA=weight;
    TITLE "SCATTER PLOT FOR IDEAL WEIGHT BY ACTUAL WEIGHT";
    PLOT ideal*actual=sex;
    SYMBOL1 V=circle COLOR=black;
    SYMBOL2 V=square COLOR=black;
RUN;

/* c: normality tests, histogram and Q-Q plot for actual weight */
PROC UNIVARIATE DATA=weight NORMAL;
    TITLE "TEST NORMALITY OF ACTUAL WEIGHT OF COLLEGE STUDENTS";
    VAR actual;
    HISTOGRAM actual / NORMAL;
    QQPLOT actual;
RUN;

/* d: PROC BOXPLOT needs the data sorted by the group variable */
PROC SORT DATA=weight;
    BY sex;
RUN;

PROC BOXPLOT DATA=weight;
    TITLE "Boxplot of Weight Difference by Sex";
    PLOT diff*sex;
RUN;

/* -----------------------------------------------------------------
   Question 2
   ----------------------------------------------------------------- */

/* a: formats for the coded values (character codes, so quoted) */
PROC FORMAT;
    VALUE $Lightfmt     '1' = 'Darkness'
                        '2' = 'Nightlight'
                        '3' = 'Full Light';
    VALUE $Nearsightfmt '3' = 'No myopia'
                        '2' = 'Myopia'
                        '1' = 'High myopia';
RUN;

DATA children;
    INPUT Light$ Nearsight$ Count;
    /* -------------- e --------------- */
    /* collapse to two levels each for the 2x2 table */
    IF Light = '1' THEN Darkness = 'Yes';
    ELSE Darkness = 'No';
    IF Nearsight = '3' THEN Myopia = 'No ';
    ELSE Myopia = 'Yes';
    /* -------------------------------- */
    FORMAT Light $Lightfmt.;
    FORMAT Nearsight $Nearsightfmt.;
    DATALINES;
1 3 155
1 2 15
1 1 2
2 3 153
2 2 72
2 1 7
3 3 34
3 2 36
3 1 5
;
RUN;

PROC PRINT DATA=children;
    TITLE "DATA ON LIGHTING AND MYOPIA";
    ID Light;
RUN;

/* b: two-way table of counts only */
PROC FREQ DATA=children ORDER=DATA;
    TITLE "TWO-WAY TABLE FOR LIGHTING AND MYOPIA";
    WEIGHT Count;
    TABLES Light*Nearsight / NOCOL NOROW NOPCT;
RUN;

/* c: test the lighting distribution against 35% / 50% / 15% */
PROC FREQ DATA=children ORDER=DATA;
    TITLE 'TEST OF LIGHTING DISTRIBUTION';
    WEIGHT Count;
    TABLES Light / TESTP=(35 50 15);
RUN;

/* d: bar chart of counts by nearsightedness within lighting group */
PROC GCHART DATA=children;
    TITLE "Barchart for Night lighting and Nearsightness";
    VBAR nearsight / group=light sumvar=count;
RUN;

/* e(i)(ii)(iii): 2x2 table, chi-square tests, relative risk estimates */
PROC FREQ DATA=children ORDER=DATA;
    TITLE "TWO-WAY TABLE FOR DARKNESS & MYOPIA";
    WEIGHT Count;
    TABLES Darkness*Myopia / CHISQ MEASURES;
RUN;

/* f: children with myopia only; note the codes are redefined here */
PROC FORMAT;
    VALUE $Lightfmt     '1' = 'Darkness'
                        '2' = 'Nightlight'
                        '3' = 'Full Light';
    VALUE $Nearsightfmt '1' = 'No myopia'
                        '2' = 'Myopia'
                        '3' = 'High myopia';
RUN;

DATA MyopiaChildren;
    INPUT Light$ Nearsight$ Count;
    IF Light = '1' THEN Darkness = 'Yes';
    ELSE Darkness = 'No';
    FORMAT Light $Lightfmt.;
    FORMAT Nearsight $Nearsightfmt.;
    DATALINES;
1 2 15
1 3 2
2 2 72
2 3 7
3 2 36
3 3 5
;
RUN;

PROC PRINT DATA=MyopiaChildren;
    TITLE "PARTIAL DATA ON LIGHTING AND MYOPIA";
    ID Light;
RUN;

PROC FREQ DATA=MyopiaChildren ORDER=DATA;
    WEIGHT Count;
    TABLES Darkness*Nearsight / CHISQ MEASURES;
RUN;