Stats 252 Lab Assignment 3

Stats 252 Q3
Zeng, Yiye
Lab Assignment 3
Mark Prokopiuk
1034216
Question 1:
a)
[Figure: histogram of height; x-axis: height (88 to 162), y-axis: Count (0 to 60)]
Heights tend to fall toward the high end of the observed range. The mean height
looks to be around 139-141. The height varies quite a bit over the 31-day span.
Temperature and air pressure might be factors distorting the measurements.
b)
[Figure: histogram of duration; x-axis: duration (1.5 to 4.75), y-axis: Count (0 to 10)]
The distribution appears to have two clusters, each with its own average: one around 1.9 and the other around 4.4.
Question 2:
a)
[Figure: scatterplot "Duration, Interval"; x-axis: duration (1 to 5), y-axis: interval (40 to 120)]
c) The relationship is strong and approximately linear. There are many outlying points, but the
general trend is linear. The association is positive: as the duration of an eruption increases, so does the
interval between eruptions.
Question 3:
a)
Correlations
                                  interval    duration
interval  Pearson Correlation     1           .924(**)
          Sig. (2-tailed)                     .000
          N                       272         263
duration  Pearson Correlation     .924(**)    1
          Sig. (2-tailed)         .000
          N                       263         299
** Correlation is significant at the 0.01 level (2-tailed).
b) The sign and magnitude do agree with Question 2. The correlation is positive (.924) and very close
to 1.
Question 4:
a)
Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .924(a)    .854       .853                6.493
a Predictors: (Constant), duration

ANOVA(b)
Model 1       Sum of Squares   df    Mean Square   F          Sig.
Regression    64228.526        1     64228.526     1523.314   .000(a)
Residual      11004.721        261   42.164
Total         75233.247        262
a Predictors: (Constant), duration
b Dependent Variable: interval

Coefficients(a)
              Unstandardized Coefficients   Standardized Coefficients
Model 1       B         Std. Error          Beta                        t        Sig.
(Constant)    33.347    1.201                                           27.765   .000
duration      13.285    .340                .924                        39.030   .000
a Dependent Variable: interval
Model - μ(interval | duration) = β0 + β1(duration)
b)
Estimate - μ̂(interval | duration) = 33.347 + 13.285(duration)
The slope of the regression line is positive (13.285), indicating a positive relationship between interval and duration.
[Figure: scatterplot of interval (y-axis, 40 to 120) against duration (x-axis, 1 to 5) with the fitted regression line; R Sq Linear = 0.854]
The line fits the data reasonably well, although many points lie some distance from it.
c)
R Square measures the proportion of variation in the response that the regression explains. R² = 0.854
85.4 percent of the variation in interval can be explained by the regression of interval on
duration.
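As a quick check, the Model Summary figures can be recomputed from the sums of squares in the ANOVA table. The sketch below is only an illustration in Python (the lab itself used SPSS); the variable names are made up for this example, and the numbers are copied from the output in part a).

```python
import math

# Values copied from the SPSS ANOVA table in Question 4 a).
ss_regression = 64228.526
ss_residual = 11004.721
ss_total = 75233.247
n = 263  # complete (interval, duration) pairs; total df = n - 1 = 262

# R Square: share of the total variation explained by the regression.
r_square = ss_regression / ss_total

# Adjusted R Square: penalises for the single predictor in the model.
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - 2)

# Std. Error of the Estimate: square root of the residual mean square.
std_error_estimate = math.sqrt(ss_residual / (n - 2))

print(f"R Square          = {r_square:.3f}")
print(f"Adjusted R Square = {adj_r_square:.3f}")
print(f"Std. Error        = {std_error_estimate:.3f}")
```

Up to rounding, these should agree with the .854, .853 and 6.493 shown in the Model Summary.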
d)
33.347 + 13.285(duration) = 33.347 + 13.285(2) = 33.347 + 26.57 = 59.917
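The same arithmetic can be wrapped in a small function so other durations are easy to try. This is a minimal sketch; the function and constant names are hypothetical, and the coefficients are the ones from the Coefficients table.

```python
# Estimated coefficients from the SPSS Coefficients table in Question 4 a).
INTERCEPT = 33.347
SLOPE = 13.285


def predict_interval(duration):
    """Return the estimated mean interval for a given eruption duration."""
    return INTERCEPT + SLOPE * duration


print(round(predict_interval(2), 3))  # 59.917
```

The estimate is only meant for durations inside the observed range (roughly 1.5 to 4.75 on the histogram in Question 1 b).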
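e) H0: β1 = 0, Ha: β1 ≠ 0
t-stat = 13.285 / 0.340 ≈ 39.1 from the rounded table values; SPSS, working with unrounded values, reports t = 39.030
Sig. = .000 (p < 0.001), so there is extremely strong evidence against H0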
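For reference, each t statistic in the Coefficients table is the estimate divided by its standard error. The sketch below recomputes both from the rounded table values; the dictionary layout and names are just for illustration.

```python
# (B, Std. Error, reported t) copied from the SPSS Coefficients table.
coefficients = {
    "(Constant)": (33.347, 1.201, 27.765),
    "duration": (13.285, 0.340, 39.030),
}

t_stats = {}
for name, (b, std_error, reported_t) in coefficients.items():
    # Test statistic for H0: coefficient = 0.
    t_stats[name] = b / std_error
    print(f"{name}: t = {t_stats[name]:.3f} (SPSS reports {reported_t:.3f})")
```

Small differences from the SPSS values are expected, because the table shows B and Std. Error rounded to three decimals.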
SSR(EM) = 75233.247
SSR(SLR) = 11004.721
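The two residual sums of squares above can be combined into an extra-sum-of-squares F statistic. The sketch below assumes that SSR(EM) is the residual sum of squares for the equal-means (intercept-only) model and SSR(SLR) the one for the simple linear regression; that reading of the abbreviations and the variable names are assumptions, while the degrees of freedom come from the ANOVA table.

```python
import math

ssr_em = 75233.247   # residual SS, equal-means model (assumed meaning)
ssr_slr = 11004.721  # residual SS, simple linear regression
df_em, df_slr = 262, 261  # residual degrees of freedom for each model

# Drop in residual variation gained by adding duration to the model.
extra_ss = ssr_em - ssr_slr

# F = (extra SS / extra df) / (residual mean square of the larger model).
f_stat = (extra_ss / (df_em - df_slr)) / (ssr_slr / df_slr)

print(f"Extra sum of squares = {extra_ss:.3f}")
print(f"F = {f_stat:.3f}")
print(f"sqrt(F) = {math.sqrt(f_stat):.3f}")
```

With one predictor, F should match the ANOVA value (1523.314) and its square root should match the slope t statistic (39.030), up to rounding.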
f)
[Figure: scatterplot of Regression Standardized Residual (y-axis, -4 to 4) against Regression Standardized Predicted Value (x-axis, -2 to 2); dependent variable: interval]
The residuals fall into two clusters, each seemingly with its own average. They
do, however, appear to be scattered about the horizontal (zero) line.
g)
[Figure: Normal P-P Plot of Regression Standardized Residual; dependent variable: interval; axes: Observed Cum Prob and Expected Cum Prob, each 0.0 to 1.0]
The assumption of normality appears appropriate. The points in the above plot lie almost
entirely along the straight reference line.
h)
Question 5:
a)
[Figure: scatterplot "Interval and Duration*Height"; x-axis: DandH (200 to 800), y-axis: interval (40 to 120)]
b) The graph is almost identical to the plot of interval against duration alone. The
relationship is quite linear, so given a certain height and duration, the interval can be
predicted fairly easily. The association is positive, and there are some outliers.
c)
Model - μ(interval | duration*height) = β0 + β1(duration*height)
Estimate - μ̂(interval | duration*height) = 33.347 + .088(duration*height)
R² = 0.738
73.8 percent of the variation in interval can be explained by the regression of interval on duration*height.