Oct. 22 Handout

Stat 501 Oct. 22
Some Chapter 7 Ideas
1. If x1 and x2 are not correlated (correlation = 0)
 • SSR(x1|x2) = SSR(x1) and SSR(x2|x1) = SSR(x2). That is, in a multiple regression, x1 and
x2 make independent contributions to reducing SSE.
 • The sample estimate of β1 is the same in the model E(y) = β0 + β1x1 as it is in the model
E(y) = β0 + β1x1 + β2x2. The sample estimate of β2 is the same in the model
E(y) = β0 + β2x2 as it is in the model E(y) = β0 + β1x1 + β2x2. In other words, in the
multiple regression, the coefficient multiplying x1 is estimated independently of x2 and
the coefficient multiplying x2 is estimated independently of x1.
2. If x1 and x2 are correlated (correlation ≠ 0)
 • In general, SSR(x1|x2) ≠ SSR(x1) and SSR(x2|x1) ≠ SSR(x2). That is, in a multiple
regression, x1 and x2 do not make independent contributions to reducing SSE.
 • The sample estimate of β1 is not the same in the model E(y) = β0 + β1x1 as it is in the
model E(y) = β0 + β1x1 + β2x2. The sample estimate of β2 is not the same in the
model E(y) = β0 + β2x2 as it is in the model E(y) = β0 + β1x1 + β2x2. In other words, in
the multiple regression the coefficient multiplying x1 is not estimated independently of x2
and the coefficient multiplying x2 is not estimated independently of x1.
3. Each t-test for an individual coefficient (other than the intercept) is essentially assessing the
significance of the size of SSR(this variable | other variables in the model). This is true whether
x1 and x2 are correlated or not.
4. The Sequential SS given by Minitab or other programs can be used to put together a general
linear F-test. Suppose a null hypothesis is that the β coefficients multiplying two particular
variables are both = 0. List those two variables last in the list of predictor variables when
specifying the model. The sum of the sequential SS for those two variables will equal
SSE(Reduced) – SSE(Full).
5. The SEQ SS is the only thing affected by the order of listing predictor variables. All other
aspects of the fit will be the same regardless of order. For example, MSE, R², and estimated
coefficients don’t depend upon order.
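Points 1 and 2 can be checked numerically. The sketch below is only an illustration: the data are made up (they are not the hospital data), and the helper names fit_one, fit_two and compare are hypothetical. It fits the one-predictor and two-predictor models by least squares on centered data and compares the estimate of β1 and the sums of squares SSR(x1) and SSR(x1|x2), once with an x2 whose sample correlation with x1 is exactly 0 and once with an x2 that is correlated with x1.

```python
def center(v):
    """Subtract the mean from every value."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_one(x, y):
    """Slope and SSR for the simple regression of y on x."""
    xc, yc = center(x), center(y)
    b = dot(xc, yc) / dot(xc, xc)
    return b, b * dot(xc, yc)

def fit_two(x1, x2, y):
    """Slopes and SSR for the regression of y on x1 and x2 (2x2 normal equations)."""
    a, b, yc = center(x1), center(x2), center(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, yc), dot(b, yc)
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2, b1 * s1y + b2 * s2y

def compare(x1, x2, y):
    """Estimate of beta1 with and without x2 in the model, plus SSR(x1) and SSR(x1|x2)."""
    b1_alone, ssr_x1 = fit_one(x1, y)
    _, ssr_x2 = fit_one(x2, y)
    b1_full, _, ssr_both = fit_two(x1, x2, y)
    return {
        "b1_alone": b1_alone,
        "b1_full": b1_full,
        "SSR(x1)": ssr_x1,
        "SSR(x1|x2)": ssr_both - ssr_x2,  # extra SS for x1 after x2
    }

# Made-up illustrative data (an assumption, not taken from the handout).
y = [3.1, 4.0, 5.2, 7.9, 2.8, 4.5, 5.0, 8.1]
x1 = [0, 0, 1, 1, 0, 0, 1, 1]
x2_uncorrelated = [0, 1, 0, 1, 0, 1, 0, 1]  # sample correlation with x1 is 0
x2_correlated = [0, 1, 1, 1, 0, 0, 1, 2]    # sample correlation with x1 is not 0

uncorrelated = compare(x1, x2_uncorrelated, y)
correlated = compare(x1, x2_correlated, y)
print("uncorrelated:", uncorrelated)
print("correlated:  ", correlated)
```

In the uncorrelated case the two estimates of β1 agree and SSR(x1|x2) matches SSR(x1) up to rounding; in the correlated case both pairs differ.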
An example illustrating point 4 (and point 2 as well) follows.
Example:
For the hospital infection risk data, y = infection risk, x1 = average length of stay, x2 = number of daily
bacterial cultures done, x3 = daily number of patients, x4 = number of beds in hospital and x5 = number
of nurses employed.
Output for Full Model
The regression equation is
InfctRsk = 0.841 + 0.247 Stay + 0.0525 Cultures - 0.00054 Census - 0.00039 Beds
+ 0.00291 Nurses
Analysis of Variance
Source           DF       SS      MS      F      P
Regression        5   98.086  19.617  20.32  0.000
Residual Error  107  103.294   0.965
Total           112  201.380

Source     DF  Seq SS
Stay        1  57.305
Cultures    1  33.397
Census      1   4.645
Beds        1   0.057
Nurses      1   2.681
Suppose we test
H0: β4 = β5 = 0
The “Seq SS” can be used to learn that
SSE(Reduced) – SSE(Full) = SSR(Beds, Nurses | Stay, Cultures, Census)
= 0.057 + 2.681 = 2.738.
And the F-statistic is F = (2.738/2)/0.965 = 1.42 with 2 and 107 df. (By the way, this won’t be significant.)
We can check this by estimating the reduced model that includes only the first three variables. The results
are below. You’ll see that SSE(Reduced) = 106.032. Above you can find that SSE(Full) = 103.294. The
difference is 106.032 − 103.294 = 2.738, the same value we got using the SEQ SS.
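The arithmetic of this F-test can be reproduced in a few lines. The sketch below uses only numbers printed in the two Minitab outputs; the variable names are illustrative. The p-value line is an addition that is not in the handout: it relies on the closed form P(F > f) = (1 + 2f/d)^(−d/2), which holds only when the numerator df is 2 (d is the error df of the full model), so it should not be reused for other tests.

```python
# Values copied from the Minitab output for the full and reduced models.
seq_ss_beds, seq_ss_nurses = 0.057, 2.681
sse_full, df_full = 103.294, 107
sse_reduced, df_reduced = 106.032, 109

# Two routes to the numerator sum of squares of the general linear F-test.
extra_ss_from_seq = seq_ss_beds + seq_ss_nurses   # sum of the last two Seq SS
extra_ss_from_sse = sse_reduced - sse_full        # SSE(Reduced) - SSE(Full)

num_df = df_reduced - df_full                     # 2 coefficients set to 0
mse_full = sse_full / df_full                     # about 0.965
f_stat = (extra_ss_from_seq / num_df) / mse_full

# Closed-form upper-tail probability, valid ONLY because num_df == 2.
p_value = (1 + num_df * f_stat / df_full) ** (-df_full / 2)

print(f"extra SS (Seq SS route): {extra_ss_from_seq:.3f}")
print(f"extra SS (SSE route):    {extra_ss_from_sse:.3f}")
print(f"F = {f_stat:.2f} on {num_df} and {df_full} df, p = {p_value:.3f}")
```

The p-value comes out well above 0.05, which agrees with the remark that the test is not significant.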
Output for Reduced Model
The regression equation is
InfctRsk = 1.07 + 0.218 Stay + 0.0568 Cultures + 0.00150 Census
Source           DF       SS      MS      F      P
Regression        3   95.347  31.782  32.67  0.000
Residual Error  109  106.032   0.973
Total           112  201.380
---------------------------------------------------------------------
NOTE: In the full model we could have listed Beds and Nurses in either order, so long as they
appear last in the list of variables. Here’s the result of listing them in the order Nurses, Beds.
Source     DF  Seq SS
Stay        1  57.305
Cultures    1  33.397
Census      1   4.645
Nurses      1   2.718
Beds        1   0.020
Adding the last two values gives 2.738, the same value we got above.
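A quick check of the two orderings, using only the Seq SS values printed above (the variable names are illustrative). It confirms that the individual Seq SS for Beds and Nurses change with the order, while their sum, the sum for the first three variables, and the overall total do not. Totals are compared with a small tolerance because the printed values are rounded to three decimals.

```python
# Seq SS as printed, in the two listing orders of the full model.
order_beds_nurses = {"Stay": 57.305, "Cultures": 33.397, "Census": 4.645,
                     "Beds": 0.057, "Nurses": 2.681}
order_nurses_beds = {"Stay": 57.305, "Cultures": 33.397, "Census": 4.645,
                     "Nurses": 2.718, "Beds": 0.020}

ssr_full = 98.086      # Regression SS, full model
ssr_reduced = 95.347   # Regression SS, reduced model (Stay, Cultures, Census)

def last_two(seq_ss):
    """Sum of the Seq SS for the two variables listed last."""
    return sum(list(seq_ss.values())[-2:])

def first_three(seq_ss):
    """Sum of the Seq SS for the three variables listed first."""
    return sum(list(seq_ss.values())[:3])

extra_a = last_two(order_beds_nurses)
extra_b = last_two(order_nurses_beds)
total_a = sum(order_beds_nurses.values())
total_b = sum(order_nurses_beds.values())

print(f"Beds + Nurses: {extra_a:.3f}   Nurses + Beds: {extra_b:.3f}")
print(f"Total Seq SS: {total_a:.3f} and {total_b:.3f}  (Regression SS {ssr_full})")
print(f"First three:  {first_three(order_beds_nurses):.3f}  (reduced Regression SS {ssr_reduced})")
```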