Multiple Regression A single numerical response variable, Y. Multiple numerical explanatory

advertisement
Multiple Regression
A single numerical response
variable, Y.
 Multiple numerical explanatory
variables, X1, X2,…, Xk

1
Multiple Regression
Y  Y | x1 , x2 ,..., xk  
Y   0  1 x1   2 x2  ...   k xk  
2
Example
Y, Response – Effectiveness
score based on experienced
teachers’ evaluations.
 Explanatory – Test 1, Test 2,
Test 3, Test 4.

3
Student Eval
Teacher
Test1
Test2
Test3
1
2
3
4
5
6
489
423
507
467
340
524
81
68
80
107
43
129
151
156
165
149
134
163




434
76
141
23
Test4
46
46
76
56
49
72

54
44
45
55
43
49
50

58
4

JMP
Analyze – Fit Model
 Pick Role Variables



Y – EVAL
Construct Model Effects

Add – Test1, Test2, Test3, Test4
5
JMP

Analyze – Fit Model
Personality – Standard Least
Squares
 Emphasis – Minimal Report

6
Response EVAL
Summary of Fit
RSquare
RSquare Adj
Root Mean Square Error
Mean of Response
Observations (or Sum Wgts)
0.802861
0.759052
37.53627
444.4783
23
Analysis of Variance
Source
Model
Error
C. Total
DF
4
18
22
Sum of
Squares Mean Square
103286.25
25821.6
25361.49
1409.0
128647.74
F Ratio
18.3265
Prob > F
<.0001*
Parameter Estimates
Term
Intercept
Test1
Test2
Test3
Test4
Estim ate Std Error t Ratio Prob>|t|
-193.4994 125.3074
-1.54 0.1399
1.1158539 0.319746
3.49 0.0026*
2.243267 0.628449
3.57 0.0022*
-1.367001 0.563965
-2.42 0.0261*
6.0482387 1.202281
5.03 <.0001*
7
Prediction Equation

Predicted Evaluation = –193.50
+ 1.116*Test1 + 2.243*Test2
– 1.367*Test3 + 6.048*Test4
8
Conditions

The random error term,
 , is
Independent
 Identically distributed
 Normally distributed with
standard deviation,  .

9
Estimate of Error Variance, 
MS Error
2
SSError

df Error
 y  yˆ 


2
MS Error
MS Error
n  k  1
25361.49

 1409.0
18
10
Estimate of Error Std Dev, 

Root Mean Square Error
RMSE  MS Error
RMSE  1409.0  37.54
11
Multiple
2
R
SSModel
SSError
R 
 1
SSTotal
SSTotal
2
103286.25
R 
 0.802861
128647.74
2
12
Interpretation

80.3% of the variation in the
evaluation scores can be
explained by the model, i.e. the
relationship with the
explanatory variables.
13
Caution

Including additional explanatory
variables in a model can only
increase the value of R2, even if
those explanatory variables
have nothing to do with the
response variable.
14
Adjusted
2
R
MS Error
adjR  1 
MSTotal
2
adjR
2

25361.49 18
 1
 0.75905
128647.74 22
15
Test of Model Utility

Is there any explanatory
variable in the model that is
helping to explain significant
amounts of variation in the
response?
16
Step 1: Hypotheses
H 0 : 1   2  ...   k  0
H A : at least one parameter is not zero
17
Step 2: Test Statistic
MSModel
F
MSError
25821.6
F
 18.3265
1409.0
P  value  0.0001
18
Step 3: Decision

Reject the null hypothesis
because the P-value is so small.
19
Step 4: Conclusion
At least one of the Test 1, Test
2, Test 3 or Test 4 is providing
statistically significant
information about the
evaluation score.
 The model is useful. Maybe not
the best, but useful.

20
Alternative Form
R k 
2
F


 1 R2







n

k

1


0.802861
4  18.3265
F
0.197139
18
P  value  0.0001




21
Download