The method of least squares

CIS 2033 based on
Dekking et al. A Modern Introduction to Probability and Statistics. 2007
Slides by Kier Heilman
Instructor Longin Jan Latecki
C22: The Method of Least Squares
22.1 – Least Squares
Consider the random variables Yi = α + βxi + Ui for i = 1, 2, ..., n,
where the random variables U1, U2, …, Un have zero expectation and
variance σ².
Method of Least Squares: Choose values for α and β such that
$$S(\alpha, \beta) = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$$
is minimal.
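As a small illustration of the criterion, the sketch below evaluates S(α, β) for two candidate lines. The data values and the function name `sum_of_squares` are invented for this example and do not come from the slides.

```python
def sum_of_squares(alpha, beta, xs, ys):
    """Return S(alpha, beta) = sum over i of (y_i - alpha - beta * x_i)^2."""
    return sum((y - alpha - beta * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]

s_close = sum_of_squares(0.0, 1.9, xs, ys)   # a line that follows the points closely
s_far = sum_of_squares(5.0, -1.0, xs, ys)    # a line that runs against the trend

print(s_close, s_far)
```

The line that tracks the points gives a much smaller value of S, which is the sense in which the method prefers it.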
22.1 – Regression
Figure: The observed value yi corresponding to xi, and the value α + βxi on
the regression line y = α + βx.
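22.1 – Estimation
Setting the partial derivatives of S(α, β) with respect to α and β equal
to zero gives the following two simultaneous equations for the estimates
of α and β:
$$n\alpha + \beta \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$$
$$\alpha \sum_{i=1}^{n} x_i + \beta \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$$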
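The calculus step behind these two equations can be sketched as follows, assuming S(α, β) is the sum of squares defined in the Least Squares section. Each partial derivative is set to zero and the result is divided by −2 and rearranged.

```latex
\begin{align*}
\frac{\partial S}{\partial \alpha}
  &= -2 \sum_{i=1}^{n} (y_i - \alpha - \beta x_i) = 0
  &&\Longrightarrow\quad
  n\alpha + \beta \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i, \\
\frac{\partial S}{\partial \beta}
  &= -2 \sum_{i=1}^{n} x_i \,(y_i - \alpha - \beta x_i) = 0
  &&\Longrightarrow\quad
  \alpha \sum_{i=1}^{n} x_i + \beta \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i.
\end{align*}
```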
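22.1 – Estimation
Solving these two equations for α and β gives the least squares
estimates:
$$\hat{\beta} = \frac{n \sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{n \sum x_i^2 - \left(\sum x_i\right)^2} \quad \text{(slope)}$$
$$\hat{\alpha} = \bar{y}_n - \hat{\beta}\,\bar{x}_n \quad \text{(intercept)}$$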
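The slope and intercept formulas translate directly into code. The following is a minimal sketch in plain Python. The function name `least_squares_fit` and the data are assumptions made for illustration, not part of the slides.

```python
def least_squares_fit(xs, ys):
    """Return (alpha_hat, beta_hat) from the closed-form least squares formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        # All x_i are identical, so no unique slope exists.
        raise ValueError("slope is undefined when all x values are equal")

    beta_hat = (n * sum_xy - sum_x * sum_y) / denominator   # slope
    alpha_hat = sum_y / n - beta_hat * sum_x / n            # intercept
    return alpha_hat, beta_hat


# Hypothetical data, invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]

alpha_hat, beta_hat = least_squares_fit(xs, ys)
print(alpha_hat, beta_hat)
```

For this data the sums are Σx = 10, Σy = 19, Σxy = 57 and Σx² = 30, so the slope is (4·57 − 10·19) / (4·30 − 10²) = 38/20 = 1.9 and the intercept is 4.75 − 1.9·2.5 = 0.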
22.1 – Least Squares Estimators are Unbiased
The least squares estimators $\hat{\alpha}$ and $\hat{\beta}$ are unbiased estimators for α and β.
For the simple linear regression model, the random variable
$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} (Y_i - \hat{\alpha} - \hat{\beta} x_i)^2$$
is an unbiased estimator for σ².
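The variance estimator can be computed from the fitted line by dividing the sum of squared deviations by n − 2. The sketch below assumes hypothetical data and helper names (`fit_line`, `sigma2_unbiased`) chosen only for this example.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope from the closed-form formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    beta_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    alpha_hat = sum_y / n - beta_hat * sum_x / n
    return alpha_hat, beta_hat


def sigma2_unbiased(xs, ys):
    """Estimate sigma^2 as the sum of squared deviations divided by n - 2."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than two points to divide by n - 2")
    alpha_hat, beta_hat = fit_line(xs, ys)
    squared = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))
    return squared / (n - 2)


# Hypothetical data, invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]

estimate = sigma2_unbiased(xs, ys)
print(estimate)
```

With this data the fitted line is y = 1.9x, the squared deviations sum to 0.70, and the estimate is 0.70 / 2 = 0.35, up to floating-point rounding.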
22.2 – Residuals
Residual: The vertical distance between the ith point and the estimated
regression line:
$$r_i = y_i - \hat{\alpha} - \hat{\beta} x_i$$
The sum of the residuals is zero:
$$\sum_{i=1}^{n} r_i = 0$$
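The zero-sum property can be checked numerically. In this sketch the data and the helper `fit_line` are hypothetical. In floating-point arithmetic the sum may come out as a tiny nonzero number rather than exactly zero.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope from the closed-form formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    beta_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    alpha_hat = sum_y / n - beta_hat * sum_x / n
    return alpha_hat, beta_hat


# Hypothetical data, invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]

alpha_hat, beta_hat = fit_line(xs, ys)

# r_i = y_i - alpha_hat - beta_hat * x_i for each observed point.
residuals = [y - alpha_hat - beta_hat * x for x, y in zip(xs, ys)]
residual_sum = sum(residuals)

print(residuals)
print(residual_sum)
```

The zero sum follows from the first of the two simultaneous equations, which says that n·α̂ + β̂·Σxi equals Σyi.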
22.2 – Heteroscedasticity
Homoscedasticity: The assumption that the Ui (and therefore the Yi) all
have equal variance.
Heteroscedasticity: The situation where this assumption fails. For
instance, heteroscedasticity occurs when Yi with a large expected value
have a larger variance than those with small expected values.
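To make the example concrete, the following sketch simulates observations whose standard deviation grows in proportion to their expected value. The parameter values, the 10% proportionality factor, the normal distribution, and the function name `simulate` are all assumptions made for this illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

ALPHA, BETA = 1.0, 2.0   # hypothetical true intercept and slope
RELATIVE_SD = 0.1        # assumed: standard deviation is 10% of the expected value


def simulate(x, size):
    """Draw `size` values of Y at a fixed x, with spread growing with E[Y]."""
    mean = ALPHA + BETA * x
    return [random.gauss(mean, RELATIVE_SD * mean) for _ in range(size)]


low = simulate(1.0, 2000)    # expected value 3, standard deviation 0.3
high = simulate(10.0, 2000)  # expected value 21, standard deviation 2.1

var_low = statistics.variance(low)
var_high = statistics.variance(high)

print(var_low, var_high)
```

The sample variance at the larger x is far greater than at the smaller x, which is the pattern the slide describes.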
22.3 – Relation with Maximum Likelihood
What are the maximum likelihood estimates for α and β?
To apply the method of least squares, no assumption is needed about the
type of distribution of the Ui. If the type of distribution of the Ui is
known, the maximum likelihood principle can be applied. Consider, for
instance, the classical situation where the Ui are independent with an
N(0, σ²) distribution.
Under this assumption, Yi has an N(α + βxi, σ²) distribution, so its
probability density function is
$$f_i(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y - \alpha - \beta x_i)^2}{2\sigma^2}\right)$$
22.3 – Maximum Likelihood
For fixed σ > 0, the loglikelihood ℓ(α, β, σ) attains its maximum when
$$\sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$$
is minimal. Hence, when the random variables Ui are independent with an
N(0, σ²) distribution, the maximum likelihood principle and the least
squares method return the same estimators for α and β.
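For reference, the loglikelihood can be written out as follows, assuming independent Ui with an N(0, σ²) distribution so that each Yi has a normal density with mean α + βxi. Only the last term depends on α and β, and it enters with a negative sign, which is why maximizing ℓ for fixed σ amounts to minimizing the sum of squares.

```latex
\begin{align*}
\ell(\alpha, \beta, \sigma)
  &= \sum_{i=1}^{n} \ln\!\left[
       \frac{1}{\sigma\sqrt{2\pi}}
       \exp\!\left(-\frac{(y_i - \alpha - \beta x_i)^2}{2\sigma^2}\right)
     \right] \\
  &= -n \ln \sigma - \frac{n}{2} \ln(2\pi)
     - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2 .
\end{align*}
```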
The maximum likelihood estimator for σ² is:
$$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{\alpha} - \hat{\beta} x_i)^2$$
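The maximum likelihood estimator divides the same sum of squares by n rather than n − 2, so it is always the smaller of the two. The sketch below compares the two on hypothetical data. The helper name `residual_sum_of_squares` is an assumption for this example.

```python
def residual_sum_of_squares(xs, ys):
    """Sum of squared deviations from the least squares line."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    beta_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    alpha_hat = sum_y / n - beta_hat * sum_x / n
    return sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]

n = len(xs)
rss = residual_sum_of_squares(xs, ys)

sigma2_ml = rss / n              # maximum likelihood estimate (divide by n)
sigma2_unbiased = rss / (n - 2)  # unbiased estimate (divide by n - 2)

print(sigma2_ml, sigma2_unbiased)
```

The two estimates are related by a factor of (n − 2)/n, so the difference matters for small samples and shrinks as n grows.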