regression - inference for the mean

Inference for the regression mean
Least-Squares Estimator
According to the regression model, the mean of the variable $y$ for a given value of $x$, say $x = x_*$, is

$$E(y \mid x = x_*) = \mu_{y|x_*} = \beta_0 + \beta_1 x_*.$$

This mean is estimated by

$$\hat{y}_* = \hat{\mu}_{y|x_*} = \hat{\beta}_0 + \hat{\beta}_1 x_*.$$

With the usual assumptions of normality, constant variance, and random sampling, the sampling distribution of $\hat{y}_*$ can be calculated. It is a normally distributed, unbiased estimator with sampling variance

$$\sigma^2 \left[ \frac{1}{n} + \frac{(x_* - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right].$$

With this information, inference about the mean can be undertaken.

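The sampling variance can be checked with a short derivation, sketched here under the model's usual assumptions. The shorthand $S_{xx} = \sum (x_i - \bar{x})^2$ is introduced only for this sketch and is not part of the original notation. Since $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$, the estimator can be rewritten in terms of $\bar{y}$ and $\hat{\beta}_1$, which are uncorrelated:

```latex
\begin{align*}
\hat{y}_* &= \hat{\beta}_0 + \hat{\beta}_1 x_*
           = \bar{y} + \hat{\beta}_1 (x_* - \bar{x}), \\
\operatorname{var}(\bar{y}) &= \frac{\sigma^2}{n}, \qquad
\operatorname{var}(\hat{\beta}_1) = \frac{\sigma^2}{S_{xx}}, \qquad
\operatorname{cov}(\bar{y}, \hat{\beta}_1) = 0, \\
\operatorname{var}(\hat{y}_*) &= \frac{\sigma^2}{n}
           + (x_* - \bar{x})^2 \, \frac{\sigma^2}{S_{xx}}
           = \sigma^2 \left[ \frac{1}{n} + \frac{(x_* - \bar{x})^2}{S_{xx}} \right].
\end{align*}
```

The variance is smallest at the sample mean of $x$ and grows with the squared distance of the chosen $x$ value from that mean.
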
Testing
We can test the hypothesis $H_0: \mu_{y|x_*} = \mu_*$ with a t-test. Using the estimator $s^2$ for the unknown variance $\sigma^2$, we obtain

$$t = \frac{\hat{y}_* - \mu_*}{s_{\hat{y}_*}} \sim t(n-2), \quad \text{with} \quad s_{\hat{y}_*} = s \sqrt{\frac{1}{n} + \frac{(x_* - \bar{x})^2}{\sum (x_i - \bar{x})^2}},$$

and reject the null hypothesis according to the direction of the alternative hypothesis and the probability of type I error.
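
As a minimal sketch of this test, the Python below computes the t statistic using only the standard library. The function name `mean_t_statistic`, the toy data, and the hypothesized mean of 7.5 are illustrative assumptions and do not come from the notes. The statistic still has to be compared with a critical value from a t table or statistical software.

```python
import math

def mean_t_statistic(x, y, x_star, mu_star):
    """t statistic and degrees of freedom for H0: mean of y at x_star equals mu_star."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                  # least-squares slope
    b0 = y_bar - b1 * x_bar         # least-squares intercept
    y_hat = b0 + b1 * x_star        # estimated mean at x_star
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))    # residual standard deviation
    se_mean = s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / sxx)
    return (y_hat - mu_star) / se_mean, n - 2

# Hypothetical toy data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
t_stat, df = mean_t_statistic(x, y, x_star=4, mu_star=7.5)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```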
Confidence interval for $\mu_{y|x_*}$
A confidence interval for the mean is constructed in the usual manner, as

$$\hat{y}_* \pm t_{1-\alpha/2}(\mathrm{df} = n-2)\, s_{\hat{y}_*}.$$
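
The interval can be sketched in Python as follows. The function name `mean_confidence_interval` and the toy data are assumptions made for illustration. To keep the example within the standard library, the critical value is passed in by the caller: 3.182 is the tabulated 0.975 quantile of the t distribution with 3 degrees of freedom, which gives a 95% interval for these five observations.

```python
import math

def mean_confidence_interval(x, y, x_star, t_crit):
    """Confidence interval for the regression mean at x_star.

    t_crit is the t quantile for n - 2 degrees of freedom,
    supplied by the caller (for example, read from a t table).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                  # least-squares slope
    b0 = y_bar - b1 * x_bar         # least-squares intercept
    y_hat = b0 + b1 * x_star        # estimated mean at x_star
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)              # estimate of the error variance
    se_mean = math.sqrt(s2 * (1 / n + (x_star - x_bar) ** 2 / sxx))
    return y_hat - t_crit * se_mean, y_hat + t_crit * se_mean

# Hypothetical toy data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
ci_lower, ci_upper = mean_confidence_interval(x, y, x_star=4, t_crit=3.182)
print(f"95% CI for the mean at x = 4: ({ci_lower:.3f}, {ci_upper:.3f})")
```

If SciPy is available, `scipy.stats.t.ppf(0.975, df)` is one way to obtain the critical value instead of a table.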
Predictive interval for $y_*$
A predictive interval for $y$ given $x = x_*$ can be obtained by remembering that $y_* = \beta_0 + \beta_1 x_* + \varepsilon_*$, so that even with the mean known exactly, there is an irreducible variance component $\mathrm{var}(\varepsilon_*) = \sigma^2$ for predicting $y_*$. If we add this to the variance of the mean estimator, we obtain the variance

$$\sigma^2_{y_*} = \sigma^2 \left[ 1 + \frac{1}{n} + \frac{(x_* - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right].$$
Thus, a predictive interval can be constructed by substituting $s^2$ for the unknown $\sigma^2$, yielding

$$\hat{y}_* \pm t_{1-\alpha/2}(\mathrm{df} = n-2)\, s_{y_*}, \quad \text{where} \quad s^2_{y_*} = s^2 \left[ 1 + \frac{1}{n} + \frac{(x_* - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right].$$
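
A minimal Python sketch of the predictive interval follows. The only change from the interval for the mean is the extra 1 inside the bracket. The function name `prediction_interval` and the toy data are illustrative assumptions. The critical value 3.182 is the tabulated 0.975 quantile of the t distribution with 3 degrees of freedom, so the interval has 95% coverage for these five observations.

```python
import math

def prediction_interval(x, y, x_star, t_crit):
    """Predictive interval for a new observation of y at x_star.

    t_crit is the t quantile for n - 2 degrees of freedom,
    supplied by the caller (for example, read from a t table).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                  # least-squares slope
    b0 = y_bar - b1 * x_bar         # least-squares intercept
    y_hat = b0 + b1 * x_star        # point prediction at x_star
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)              # estimate of the error variance
    # The leading 1 accounts for the irreducible error of the new observation
    se_pred = math.sqrt(s2 * (1 + 1 / n + (x_star - x_bar) ** 2 / sxx))
    return y_hat - t_crit * se_pred, y_hat + t_crit * se_pred

# Hypothetical toy data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
pi_lower, pi_upper = prediction_interval(x, y, x_star=4, t_crit=3.182)
print(f"95% predictive interval at x = 4: ({pi_lower:.3f}, {pi_upper:.3f})")
```

For the same data, chosen $x$ value, and critical value, this interval is always wider than the confidence interval for the mean, because the bracket can never fall below 1.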