Chapter 1 Linear Regression with One Predictor Variable

許湘伶
Applied Linear Regression Models
(Kutner, Nachtsheim, Neter, Li)
hsuhl (NUK)
LR Chap 1
1 / 41
Regression analysis is a statistical methodology that utilizes the
relation between two or more quantitative variables so that a
response or outcome variable can be predicted from the other, or
others.
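Regression analysis is a statistical method for analyzing data. Its aim is to find out whether two or more variables are related, the direction and strength of that relation, and to build a mathematical model so that observing certain variables allows prediction of the variable the researcher is interested in. (Wiki)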
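Origin: the term "regression" was first used by Francis Galton. He studied the heights of parents and their children and found that, although the parents' height is passed on to the children, the children's heights tend to "regress toward the middle (that is, the population average)" (regression to the mean). The meaning of regression at that time, however, is not quite the same as its meaning today. (Wiki)
The "regression to the mean" phenomenon: children of very tall parents tend to be somewhat shorter than their parents, while children of very short parents tend to be taller than their parents. Heights are pulled from the tall and short extremes toward the average of all people. (統計 改變了世界)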
business
social science
biological sciences
engineering
chemical science
economics
management etc.
Relations between Variables
Functional Relation between Two Variables
functional relation vs. statistical relation
part-time work hours vs. pay
Y = dollar sales of a product sold at a fixed price
X = number of units sold
If the selling price is $2 per unit, Y = 2X
Figure: Example of Functional Relation (Y = f(X)); panel (a) shows the plot, panel (b) the data.
Statistical Relation between Two Variables
Unlike a functional relation, a statistical relation is not perfect: the observations generally do not fall directly on the curve of relationship.
Ex: Employees' performance evaluations
Y = year-end evaluation; X = midyear evaluation
What pattern can be observed in the figure?
What kind of relation exists between the midyear and year-end evaluations?
(a): scatter plot
(b): Plotting a line of relationship to describe the statistical
relation between X and Y
Figure: Curvilinear Statistical Relation between Age and Steroid Level in Healthy Females Aged 8 to 25.
Regression Models and Their Uses
Basic Concepts
A regression model is a formal means of expressing the two essential
ingredients of a statistical relation:
A tendency of the response variable Y to vary with the predictor
variable X in a systematic fashion.
A scattering of points around the curve of statistical relationship.
A regression model postulates that:
There is a probability distribution of Y for each level of X.
The means of these probability distributions vary in some systematic fashion with X.
Figure : Pictorial Representation of Regression Model
Construction of Regression Models
Y: the dependent or response variable
X: the independent, explanatory or predictor variable
Construction:
Selection of Predictor Variables: there may be more than one predictor variable
Functional Form of Regression Relation: linear, quadratic regression functions . . .
Scope of Model: the model is valid only over some interval or region of values of the predictor variables; the shape of the regression function outside that range is in serious doubt.
Extrapolation
Figure: panel (a) extrapolation; panel (b) the danger of extrapolation.
Uses
Three major purposes:
1. description
2. control
3. prediction
Causality
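Regression and Causality:
The existence of a statistical relation between Y and X does not imply that Y depends causally on X.
(Regression does not imply a causal relation in which the explanatory variable is the cause and the response variable is the effect.)
Correlation ≠ Causation:
Being related does not mean being causally related.
If X and Y are known to be correlated, a known X can be used to predict Y, but this does not mean that a change in X will necessarily produce a particular change in Y.
Smoking vs. cancer
Speed vs. number of car accidents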
Simple Linear Regression Model with Distribution of Error Terms
Unspecified
Statement of Model
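The simple linear regression model with one predictor variable:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1, \ldots, n$$
$Y_i$: the value of the response variable in the $i$th trial
$\beta_0, \beta_1$: parameters
$X_i$: a known constant; the value of the predictor variable in the $i$th trial
$\varepsilon_i$: random error term with $E\{\varepsilon_i\} = 0$ and $\sigma^2\{\varepsilon_i\} = \sigma^2$; the error terms are uncorrelated, $\sigma\{\varepsilon_i, \varepsilon_j\} = 0$ for all $i, j$ with $i \neq j$
The model is simple (one predictor variable), linear in the parameters, and linear in the predictor variable.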
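To make the model statement concrete, the following is a minimal Python sketch that generates responses from $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$. The parameter values, the predictor levels, the function name `simulate_responses`, and the uniform error distribution are all assumptions chosen for illustration; the model itself only requires errors with mean zero, constant variance, and no correlation.

```python
import random

def simulate_responses(x_levels, beta0, beta1, half_width, seed=0):
    """Draw Y_i = beta0 + beta1 * X_i + eps_i, eps_i ~ Uniform(-half_width, half_width)."""
    rng = random.Random(seed)
    responses = []
    for x in x_levels:
        # The uniform error has mean 0 and variance half_width**2 / 3 at every X level.
        eps = rng.uniform(-half_width, half_width)
        responses.append(beta0 + beta1 * x + eps)
    return responses

# Hypothetical values, not taken from the slides.
x_levels = [10, 20, 30, 40, 50]
beta0, beta1, half_width = 5.0, 2.0, 3.0

y = simulate_responses(x_levels, beta0, beta1, half_width)
for x, yi in zip(x_levels, y):
    mean_response = beta0 + beta1 * x  # value of the regression function at X
    print(f"X={x:>3}  E{{Y}}={mean_response:6.1f}  Y={yi:7.2f}  error={yi - mean_response:6.2f}")
```

Each printed row separates the systematic part (the mean response) from the random part (the error), which is the decomposition the model describes.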
Features
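1. $Y_i$ is the sum of two components: (1) the constant term $\beta_0 + \beta_1 X_i$ and (2) the random term $\varepsilon_i$.
2. Since $E\{\varepsilon_i\} = 0$:
$$E\{Y_i\} = E\{\beta_0 + \beta_1 X_i + \varepsilon_i\} = \beta_0 + \beta_1 X_i$$
3. The regression function:
$$E\{Y\} = \beta_0 + \beta_1 X$$
(The regression function relates the means of the probability distributions of Y for given X to the level of X.)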
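4. $Y_i$ in the $i$th trial exceeds or falls short of the value of the regression function by the error term amount $\varepsilon_i$.
5. Since $\sigma^2\{\varepsilon_i\} = \sigma^2$, the responses have the same constant variance:
$$\sigma^2\{Y_i\} = \sigma^2\{\beta_0 + \beta_1 X_i + \varepsilon_i\} = \sigma^2\{\varepsilon_i\} = \sigma^2$$
6. The error terms are assumed to be uncorrelated, so the responses $Y_i$ and $Y_j$ are uncorrelated as well.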
Figure : Illustration of Simple Linear Regression Model
Meaning of Regression Parameters
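Regression coefficients:
$\beta_0$ (intercept): the Y intercept of the regression line; when the scope of the model includes X = 0, it is the mean of the distribution of Y at X = 0
$\beta_1$ (slope): the change in the mean of the distribution of Y per unit increase in X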
Figure : Meaning of Parameters of Simple Linear Regression Model
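Alternative Versions of the Model
1. Original regression model:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1, \ldots, n$$
2. With a dummy variable $X_0 \equiv 1$:
$$Y_i = \beta_0 X_0 + \beta_1 X_i + \varepsilon_i$$
3. With the predictor expressed as a deviation from its mean $\bar{X}$:
$$\begin{aligned}
Y_i &= \beta_0 + \beta_1 X_i + \varepsilon_i \\
    &= \beta_0 + \beta_1 (X_i - \bar{X}) + \beta_1 \bar{X} + \varepsilon_i \\
    &= (\beta_0 + \beta_1 \bar{X}) + \beta_1 (X_i - \bar{X}) + \varepsilon_i \\
    &= \beta_0^* + \beta_1 (X_i - \bar{X}) + \varepsilon_i
\end{aligned}$$
where $\beta_0^* = \beta_0 + \beta_1 \bar{X}$.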
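The centered version of the model can be checked numerically. The sketch below, with hypothetical parameter values and predictor levels, computes $\beta_0^* = \beta_0 + \beta_1 \bar{X}$ and confirms that both forms give the same mean response at every $X_i$.

```python
# Hypothetical predictor levels and parameter values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
beta0, beta1 = 0.5, 2.0

x_bar = sum(x) / len(x)
beta0_star = beta0 + beta1 * x_bar  # intercept of the centered model

# Mean response under the original and the centered parameterization.
original = [beta0 + beta1 * xi for xi in x]
centered = [beta0_star + beta1 * (xi - x_bar) for xi in x]

for xi, m1, m2 in zip(x, original, centered):
    print(f"X={xi:.1f}  original={m1:.2f}  centered={m2:.2f}")
```

The slope is unchanged by centering; only the intercept is re-expressed, and it becomes the mean response at $X = \bar{X}$.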
Data for Regression Analysis
The regression parameters $\beta_0$, $\beta_1$ are unknown.
They are estimated from relevant data.
An analysis of the data is also relied on for developing a suitable regression model.
Figure : Typical Strategy for Regression Analysis
Estimation of Regression Function
Estimate: Method of Least Squares
Observations: $(X_i, Y_i)$, $i = 1, \ldots, n$
Deviation of $Y_i$ from its expected value: $Y_i - \beta_0 - \beta_1 X_i$
The least squares criterion:
$$Q = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2$$
What makes an estimator "good"?
The least squares estimators $b_0$, $b_1$ are the values of $\beta_0$, $\beta_1$ that minimize the criterion Q for the given sample observations.
How can the estimators $b_0$, $b_1$ be obtained?
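Set the partial derivatives of Q to zero at $(b_0, b_1)$:
$$\left.\frac{\partial Q}{\partial \beta_0}\right|_{b_0, b_1} = 0, \qquad \left.\frac{\partial Q}{\partial \beta_1}\right|_{b_0, b_1} = 0$$
$$\Rightarrow\; b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \qquad b_0 = \bar{Y} - b_1 \bar{X}$$
$\bar{X}$, $\bar{Y}$: the means of the $X_i$ and $Y_i$, respectively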
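The closed-form least squares estimators can be sketched in a few lines of Python. The data below are made-up illustrative numbers, and the helper names `least_squares` and `q_criterion` are hypothetical; the code only applies the formulas $b_1 = \sum (X_i - \bar{X})(Y_i - \bar{Y}) / \sum (X_i - \bar{X})^2$ and $b_0 = \bar{Y} - b_1 \bar{X}$, then compares Q at the estimates with Q at a few nearby points.

```python
def q_criterion(x, y, beta0, beta1):
    """Least squares criterion Q: sum of squared deviations Y_i - beta0 - beta1 * X_i."""
    return sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))

def least_squares(x, y):
    """Return (b0, b1) from the closed-form least squares formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Made-up data for illustration (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares(x, y)  # b0 is about 0.05, b1 is about 1.99
q_min = q_criterion(x, y, b0, b1)
print(f"b0={b0:.4f}  b1={b1:.4f}  Q={q_min:.4f}")

# Q evaluated at a few perturbed parameter values, for comparison with q_min.
perturbed = [q_criterion(x, y, b0 + d0, b1 + d1)
             for d0, d1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]]
print("Q at perturbed values:", [round(q, 4) for q in perturbed])
```

Every perturbed value of Q is larger than the value at $(b_0, b_1)$, which is what minimizing the criterion means.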
Data on Lot Size and Work Hours
Example: refrigeration equipment manufacturing
$X_i$: lot size ($i = 1, \ldots, 25$)
$Y_i$: labor work hours
Least squares estimates: $b_1 = 3.5702$; $b_0 = 62.37$
Normal equations
$$\sum Y_i = n b_0 + b_1 \sum X_i$$
$$\sum X_i Y_i = b_0 \sum X_i + b_1 \sum X_i^2$$
A test of the second partial derivatives shows that the least squares estimators $b_0$ and $b_1$ give a minimum of Q.
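The normal equations form a 2 × 2 linear system in $b_0$ and $b_1$, so they can be solved directly. The sketch below uses Cramer's rule on made-up data; the function name `solve_normal_equations` is hypothetical, and the point is only to show that solving $\sum Y_i = n b_0 + b_1 \sum X_i$ and $\sum X_i Y_i = b_0 \sum X_i + b_1 \sum X_i^2$ gives the least squares estimates.

```python
def solve_normal_equations(x, y):
    """Solve the two normal equations for (b0, b1) by Cramer's rule."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # System:  n * b0     + sum_x  * b1 = sum_y
    #          sum_x * b0 + sum_x2 * b1 = sum_xy
    det = n * sum_x2 - sum_x ** 2
    b0 = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b1 = (n * sum_xy - sum_x * sum_y) / det
    return b0, b1

# Made-up data for illustration (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = solve_normal_equations(x, y)
print(f"b0={b0:.4f}  b1={b1:.4f}")

# Plug the solution back into both normal equations; both gaps should be near zero.
gap1 = sum(y) - (len(x) * b0 + b1 * sum(x))
gap2 = sum(xi * yi for xi, yi in zip(x, y)) - (b0 * sum(x) + b1 * sum(xi * xi for xi in x))
print(f"gap in equation 1: {gap1:.2e}, gap in equation 2: {gap2:.2e}")
```

The determinant is nonzero whenever the $X_i$ are not all equal, which is the same condition that makes $\sum (X_i - \bar{X})^2 > 0$ in the closed-form expression for $b_1$.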
Property of Least Squares Estimators (Chap 2)
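Gauss-Markov theorem: under the conditions of the regression model, the least squares estimators $b_0$ and $b_1$ are unbiased and have minimum variance among all unbiased linear estimators.
Unbiased:
$$E\{b_0\} = \beta_0; \qquad E\{b_1\} = \beta_1$$
More precise:
The estimators $b_0$ and $b_1$ are more precise than any other estimators belonging to the class of unbiased estimators that are linear functions of the observations $Y_1, \ldots, Y_n$.
Linear function of $Y_i$:
$$b_1 = \sum k_i Y_i, \qquad k_i = \frac{X_i - \bar{X}}{\sum (X_i - \bar{X})^2}$$
$b_0$, $b_1$ are linear estimators.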
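The statement that $b_1$ is a linear function of the $Y_i$ can be illustrated numerically. The sketch below assumes the weights $k_i = (X_i - \bar{X}) / \sum (X_i - \bar{X})^2$, which follow from the closed-form expression for $b_1$, and uses made-up data.

```python
# Made-up data for illustration (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Weights k_i depend only on the X values, which are treated as known constants.
k = [(xi - x_bar) / sxx for xi in x]

# b1 written as a linear combination of the responses.
b1_linear = sum(ki * yi for ki, yi in zip(k, y))

# b1 from the usual ratio formula, for comparison.
b1_ratio = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

print("k =", [round(ki, 3) for ki in k])
print(f"b1 (linear form) = {b1_linear:.4f},  b1 (ratio form) = {b1_ratio:.4f}")
print(f"sum k_i = {sum(k):.2e},  sum k_i X_i = {sum(ki * xi for ki, xi in zip(k, x)):.4f}")
```

The two identities printed on the last line, $\sum k_i = 0$ and $\sum k_i X_i = 1$, are the ones used when showing that $E\{b_1\} = \beta_1$.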
Point Estimation of Mean Response
Response: a value of the response variable Y
Mean response: $E\{Y\}$
Estimated regression function:
$$\hat{Y} = b_0 + b_1 X$$
($\hat{Y}$: the value of the estimated regression function at level X of the predictor variable)
$\hat{Y}$ is an unbiased estimator of $E\{Y\}$.
Fitted value $\hat{Y}_i$:
$$\hat{Y}_i = b_0 + b_1 X_i, \quad i = 1, \ldots, n$$
Residuals
Residual $e_i$:
$$e_i = Y_i - \hat{Y}_i = Y_i - (b_0 + b_1 X_i)$$
The residual is the vertical deviation of $Y_i$ from the fitted value $\hat{Y}_i$ on the estimated regression line; it is known.
Model error term $\varepsilon_i$:
$$\varepsilon_i = Y_i - E\{Y_i\}$$
The error term is the vertical deviation of $Y_i$ from the unknown true regression line; it is unknown.
Properties of Fitted Regression Line
The sum of the residuals is zero:
$$\sum_{i=1}^{n} e_i = 0$$
(Rounding errors may be present.)
The sum of the squared residuals, $\sum_{i=1}^{n} e_i^2$, is a minimum, because the criterion Q to be minimized equals $\sum_{i=1}^{n} e_i^2$ when $b_0$, $b_1$ are used for estimating $\beta_0$, $\beta_1$.
The sum of the observed values $Y_i$ equals the sum of the fitted values $\hat{Y}_i$:
$$\sum_{i=1}^{n} Y_i = \sum_{i=1}^{n} \hat{Y}_i$$
The sum of the weighted residuals is zero when the residual in the $i$th trial is weighted by the level of the predictor variable in the $i$th trial:
$$\sum_{i=1}^{n} X_i e_i = 0$$
The sum of the weighted residuals is zero when the residual in the $i$th trial is weighted by the fitted value of the response variable for the $i$th trial:
$$\sum_{i=1}^{n} \hat{Y}_i e_i = 0$$
The regression line always goes through the point $(\bar{X}, \bar{Y})$.
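The properties of the fitted regression line can be verified on any data set. The following sketch uses made-up data and a hypothetical helper `least_squares`; each printed quantity should be zero up to rounding error.

```python
def least_squares(x, y):
    """Return (b0, b1) from the closed-form least squares formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1

# Made-up data for illustration (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares(x, y)
fitted = [b0 + b1 * xi for xi in x]              # fitted values
resid = [yi - fi for yi, fi in zip(y, fitted)]   # residuals e_i

sum_e = sum(resid)                                         # sum of residuals
sum_y_gap = sum(y) - sum(fitted)                           # sum Y_i minus sum of fitted values
sum_xe = sum(xi * ei for xi, ei in zip(x, resid))          # residuals weighted by X_i
sum_fe = sum(fi * ei for fi, ei in zip(fitted, resid))     # residuals weighted by fitted values
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
line_gap = (b0 + b1 * x_bar) - y_bar                       # line evaluated at X-bar, minus Y-bar

for name, value in [("sum e_i", sum_e), ("sum Y_i - sum fitted", sum_y_gap),
                    ("sum X_i e_i", sum_xe), ("sum fitted_i e_i", sum_fe),
                    ("line at X-bar minus Y-bar", line_gap)]:
    print(f"{name}: {value:.2e}")
```

Values on the order of 1e-15 rather than exactly zero are the rounding errors mentioned alongside the first property.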
Estimation of $\sigma^2$
$\sigma^2\{Y_i\} = \sigma^2$
The error sum of squares or residual sum of squares, SSE:
$$SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} e_i^2$$
The residual sum of squares SSE has $n - 2$ degrees of freedom. (Two degrees of freedom are lost because $\beta_0$ and $\beta_1$ are estimated by $b_0$ and $b_1$ in obtaining $\hat{Y}_i$.)
$E\{SSE\} = (n - 2)\sigma^2$ (this needs to be proved)
The error mean square or residual mean square, MSE:
$$MSE = \frac{SSE}{n-2} = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2} = \frac{\sum e_i^2}{n-2}$$
MSE is an unbiased estimator of $\sigma^2$:
$$E\{MSE\} = \sigma^2$$
An estimate of $\sigma$: $\sqrt{MSE}$
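As a small worked example, the sketch below computes SSE, MSE, and $\sqrt{MSE}$ for made-up data; the helper name `least_squares` is hypothetical and the numbers are for illustration only.

```python
import math

def least_squares(x, y):
    """Return (b0, b1) from the closed-form least squares formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1

# Made-up data for illustration (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
b0, b1 = least_squares(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sse = sum(e ** 2 for e in residuals)  # error sum of squares
mse = sse / (n - 2)                   # error mean square, n - 2 degrees of freedom
s = math.sqrt(mse)                    # estimate of sigma

print(f"SSE={sse:.4f}  MSE={mse:.4f}  sqrt(MSE)={s:.4f}")
```

The divisor is $n - 2$ rather than $n$ because two parameters were estimated before the residuals could be computed.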
Normal Error Regression Model
The normal error regression model:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
$Y_i$: the observed response in the $i$th trial
$X_i$: a known constant
$\beta_0, \beta_1$: parameters
$\varepsilon_i$, $i = 1, \ldots, n$: independent $N(0, \sigma^2)$
What are the properties of the normal distribution?
The parameters $\beta_0$, $\beta_1$ and $\sigma^2$ can be estimated by the method of maximum likelihood (MLE).
The method of maximum likelihood chooses as the maximum likelihood estimate the parameter value for which the likelihood is largest.
Two methods for finding the MLE:
a systematic numerical search
use of an analytical solution
Ex: for a sample from a normal distribution with mean µ, the maximum likelihood estimator of µ is the sample mean $\bar{Y}$.
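The two ways of finding a maximum likelihood estimate can be compared on a toy problem. The sketch below assumes a normal sample with known standard deviation; the sample values, the value of `sigma`, and the search grid are all hypothetical choices made for illustration.

```python
import math

def log_likelihood(mu, ys, sigma):
    """Log-likelihood of mean mu for independent N(mu, sigma^2) observations ys."""
    n = len(ys)
    return (-n * math.log(math.sqrt(2 * math.pi) * sigma)
            - sum((yi - mu) ** 2 for yi in ys) / (2 * sigma ** 2))

# Hypothetical sample and an assumed known standard deviation.
ys = [250.0, 265.0, 259.0]
sigma = 10.0

# (1) Systematic numerical search: evaluate the likelihood on a grid of candidates.
grid = [230 + 0.1 * j for j in range(601)]  # 230.0, 230.1, ..., 290.0
mu_search = max(grid, key=lambda mu: log_likelihood(mu, ys, sigma))

# (2) Analytical solution: the sample mean.
mu_analytic = sum(ys) / len(ys)

print(f"grid search: {mu_search:.1f}   analytical: {mu_analytic:.1f}")
```

The grid search can only be as accurate as its step size, which is one reason the analytical solution is preferred when it is available.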
σ = 2.5; β0 = 0; β1 = 0.5
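The density of an observation $Y_i$ for the normal error regression model ($E\{Y_i\} = \beta_0 + \beta_1 X_i$; $\sigma^2\{Y_i\} = \sigma^2$):
$$f_i = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2}\left(\frac{Y_i - \beta_0 - \beta_1 X_i}{\sigma}\right)^2\right]$$
The likelihood function for n observations $Y_1, \ldots, Y_n$:
$$\begin{aligned}
L(\beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} f_i
  &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2}\left(\frac{Y_i - \beta_0 - \beta_1 X_i}{\sigma}\right)^2\right] \\
  &= \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left[-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2\right]
\end{aligned}$$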
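The MLE of $\sigma^2$, $\hat{\sigma}^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 / n$, is biased. It is related to the unbiased estimator MSE by
$$MSE = \frac{n}{n-2}\,\hat{\sigma}^2$$
Ex: $\hat{\beta}_0 = b_0 = 2.81$; $\hat{\beta}_1 = b_1 = 0.177$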
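The relation between the maximum likelihood and least squares results can be checked numerically. The sketch below uses made-up data and hypothetical helper names; it evaluates the log of the likelihood $L(\beta_0, \beta_1, \sigma^2)$ of the normal error model, takes $\hat{\sigma}^2 = SSE / n$ as the MLE of $\sigma^2$, and compares it with MSE.

```python
import math

def least_squares(x, y):
    """Return (b0, b1) from the closed-form least squares formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1

def log_likelihood(beta0, beta1, sigma2, x, y):
    """Log of L = (2*pi*sigma2)^(-n/2) * exp(-sum of squared deviations / (2*sigma2))."""
    n = len(x)
    ss = sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)

# Made-up data for illustration (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
b0, b1 = least_squares(x, y)
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

sigma2_hat = sse / n   # maximum likelihood estimate of sigma^2 (biased)
mse = sse / (n - 2)    # unbiased estimate of sigma^2

ll_at_mle = log_likelihood(b0, b1, sigma2_hat, x, y)
ll_shifted = log_likelihood(b0 + 0.1, b1, sigma2_hat, x, y)  # intercept moved off b0
ll_at_mse = log_likelihood(b0, b1, mse, x, y)                # variance set to MSE instead

print(f"sigma2_hat={sigma2_hat:.4f}  MSE={mse:.4f}  n/(n-2)*sigma2_hat={n / (n - 2) * sigma2_hat:.4f}")
print(f"log-likelihood: at MLE={ll_at_mle:.4f}, shifted b0={ll_shifted:.4f}, sigma2=MSE={ll_at_mse:.4f}")
```

The log-likelihood is highest at $(b_0, b_1, \hat{\sigma}^2)$, while MSE differs from $\hat{\sigma}^2$ only by the factor $n/(n-2)$, which matters most for small samples.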