Chapter 5

AAEC 4302
ADVANCED STATISTICAL METHODS IN AGRICULTURAL RESEARCH
Part II: Theory and Estimation of Regression Models
Chapter 5: Simple Regression Theory
[Figure: Planted Acres (vertical axis) versus Previous Year Price (horizontal axis). Population line: E[Y] = B0 + B1X. Each observation satisfies Yi = E[Yi] + ui, where E[Yi] = B0 + B1Xi and ui is the error term.]
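[Figure: Planted Acres (vertical axis) versus Previous Year Price (horizontal axis), showing the population line E[Y] = B0 + B1X and the estimated line $\hat{Y} = \hat{B}_0 + \hat{B}_1 X$. Each observation satisfies $Y_i = \hat{Y}_i + e_i$, where $\hat{Y}_i = \hat{B}_0 + \hat{B}_1 X_i$; ui is the deviation from the population line and ei is the residual from the estimated line.]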
[Figure: Planted Acres (vertical axis) versus Previous Year Price (horizontal axis), showing the estimated line $\hat{Y} = \hat{B}_0 + \hat{B}_1 X$ and the residuals ei of the individual observations.]
The Ordinary Least Squares (OLS) Method
• In the Ordinary Least Squares (OLS) method,
the criterion for estimating β0 and β1 is to make
the sum of the squared residuals (SSR) of
the fitted regression line as small as possible
i.e.:
$$\text{Minimize SSR} = \text{minimize} \sum_{i=1}^{n} e_i^2 = \text{minimize} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \text{minimize} \sum_{i=1}^{n} (Y_i - \hat{B}_0 - \hat{B}_1 X_i)^2$$
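The criterion can be illustrated with a brute-force search. This is a rough sketch only: the data are invented for illustration (they are not the course data), and the grid ranges and step sizes are arbitrary assumptions. It evaluates the SSR for many candidate lines and keeps the one with the smallest value.

```python
# Hypothetical data, invented for illustration only (not from the course).
# X: previous-year price (cents/pound), Y: planted acres (millions).
X = [50, 55, 60, 65, 70, 75, 80]
Y = [11.0, 12.2, 12.5, 13.4, 14.3, 14.8, 15.9]


def ssr(b0, b1, xs, ys):
    """Sum of squared residuals for the candidate line Y-hat = b0 + b1 * X."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Coarse grid of candidate intercepts (2.00 to 5.00) and slopes (0.100 to 0.200).
# The ranges and steps are assumptions chosen to bracket the answer for these data.
best_ssr, best_b0, best_b1 = None, None, None
for i in range(200, 501):
    b0 = i / 100
    for j in range(100, 201):
        b1 = j / 1000
        value = ssr(b0, b1, X, Y)
        if best_ssr is None or value < best_ssr:
            best_ssr, best_b0, best_b1 = value, b0, b1

print("grid minimum at b0 =", best_b0, "b1 =", best_b1, "SSR =", round(best_ssr, 4))
```

A grid search is only a teaching device here; the closed-form OLS formulas give the exact minimizer without any search.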
The Ordinary Least Squares (OLS) Method
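• The OLS estimators (formulas) are:
$$\hat{B}_1 = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{\sum (X_i - \bar{X}) Y_i}{\sum (X_i - \bar{X})^2} \qquad (5.12)$$
$$\hat{B}_0 = \bar{Y} - \hat{B}_1 \bar{X} \qquad (5.13)$$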
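As a quick check of formulas (5.12) and (5.13), the sketch below computes the slope in both the raw-sums form and the deviations-from-means form and confirms that they agree. The data are hypothetical, invented only for this illustration.

```python
# Hypothetical data, invented for illustration only (not from the course).
X = [50, 55, 60, 65, 70, 75, 80]                 # previous-year price
Y = [11.0, 12.2, 12.5, 13.4, 14.3, 14.8, 15.9]   # planted acres (millions)
n = len(X)

# Raw-sums form of (5.12)
sum_x = sum(X)
sum_y = sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)
b1_raw = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Deviations-from-means form of (5.12)
x_bar = sum_x / n
y_bar = sum_y / n
b1_dev = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
          / sum((x - x_bar) ** 2 for x in X))

# Intercept from (5.13)
b0 = y_bar - b1_dev * x_bar

print("slope (raw sums):  ", b1_raw)
print("slope (deviations):", b1_dev)
print("intercept:         ", b0)
```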
The Ordinary Least Squares (OLS) Method
• The regression line estimated using the OLS method has the following key properties:
1. $\sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (Y_i - \hat{B}_0 - \hat{B}_1 X_i) = 0$ (i.e. the sum of its residuals is zero)
2. It always passes through the point of means $(\bar{X}, \bar{Y})$
3. The residual values (ei’s) are not correlated with the values of the independent variable (Xi’s)
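The three properties can be checked numerically. The sketch below fits a line to hypothetical data (invented for illustration) and verifies each property up to floating-point rounding; the variable names are illustrative choices, not notation from the course.

```python
# Hypothetical data, invented for illustration only (not from the course).
X = [50, 55, 60, 65, 70, 75, 80]
Y = [11.0, 12.2, 12.5, 13.4, 14.3, 14.8, 15.9]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar

residuals = [y - (b0 + b1 * x) for x, y in zip(X, Y)]

# Property 1: the residuals sum to zero
sum_of_residuals = sum(residuals)

# Property 2: the fitted value at X-bar equals Y-bar
fitted_at_mean = b0 + b1 * x_bar

# Property 3: zero sample covariance between the residuals and X
# (the mean residual is zero, so this sum is n - 1 times the covariance)
cross_product = sum((x - x_bar) * e for x, e in zip(X, residuals))

print("sum of residuals:       ", sum_of_residuals)
print("fitted at X-bar - Y-bar:", fitted_at_mean - y_bar)
print("sum (X - X-bar) * e:    ", cross_product)
```

All three printed values should be zero apart from rounding error on the order of 1e-14.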
Interpretation of the Regression Model
• Assume, for example, that the estimated or fitted regression equation is:
$\hat{Y}_i = 3.7 + 0.15 X_i$
or
Yi = 3.7 + 0.15Xi + ei
Interpretation of the Regression Model
Yi = 3.7 + 0.15Xi + ei
• The value of $\hat{B}_1 = 0.15$ indicates that if the cotton price received by farmers this year increases by 1 cent/pound (i.e. ΔX = 1), then this year’s cotton acreage is predicted to increase by 0.15 million acres (150,000 acres).
Interpretation of the Regression Model
Yi = 3.7 + 0.15Xi + ei
• The value of $\hat{B}_0 = 3.7$ indicates that if the average cotton price received by farmers were zero (i.e. X = 0), the cotton acreage planted this year is predicted to be 3.7 million (3,700,000) acres. Sometimes the intercept makes no practical sense.
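Both interpretations can be read off directly from predictions made with the fitted equation. In this small sketch the coefficients 3.7 and 0.15 come from the example above; the price of 60 cents/pound and the function name `predict_acres` are assumptions made only for illustration.

```python
B0_HAT = 3.7    # estimated intercept (million acres), from the example
B1_HAT = 0.15   # estimated slope (million acres per cent/pound), from the example


def predict_acres(price):
    """Predicted cotton acreage (million acres) at a given price (cents/pound)."""
    return B0_HAT + B1_HAT * price


# Slope: the change in the prediction when the price rises by 1 cent/pound.
# The starting price of 60 is a hypothetical value.
change_per_cent = predict_acres(61) - predict_acres(60)

# Intercept: the prediction at a price of zero
acres_at_zero_price = predict_acres(0)

print("change for a 1 cent/pound increase:", round(change_per_cent, 2))
print("prediction at a price of zero:     ", acres_at_zero_price)
```

Because the model is linear, the change per cent/pound is the same whichever starting price is chosen.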
Measures of Goodness of Fit
• There are two statistics (formulas) that quantify how well the estimated regression line fits the data:
1. The standard error of the regression (SER), sometimes called the standard error of the estimate
2. R² - coefficient of determination
Measures of Goodness of Fit
• The SER differs slightly from the standard deviation of the ei’s (by the degrees of freedom):
$$S_e = \sqrt{\frac{\sum_{i=1}^{n} (e_i - \bar{e})^2}{n - 1}}$$
$$\text{SER} = \sqrt{\frac{\sum_{i=1}^{n} e_i^2}{n - 2}} \qquad (5.20)$$
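The difference between the two quantities is easy to see numerically. The sketch below uses hypothetical data (invented for illustration) and computes both the sample standard deviation of the residuals (divisor n - 1) and the SER (divisor n - 2).

```python
import math

# Hypothetical data, invented for illustration only (not from the course).
X = [50, 55, 60, 65, 70, 75, 80]
Y = [11.0, 12.2, 12.5, 13.4, 14.3, 14.8, 15.9]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar
residuals = [y - (b0 + b1 * x) for x, y in zip(X, Y)]

# Sample standard deviation of the residuals: divisor n - 1
e_bar = sum(residuals) / n
s_e = math.sqrt(sum((e - e_bar) ** 2 for e in residuals) / (n - 1))

# Standard error of the regression: divisor n - 2,
# because two coefficients were estimated
ser = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print("S_e =", round(s_e, 4))
print("SER =", round(ser, 4))
```

Since the OLS residuals have mean zero, the two numerators coincide and the SER is always slightly larger than S_e.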
Measures of Goodness of Fit: The R²
$$R^2 = 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}$$
• The ratio $\sum_{i=1}^{n} e_i^2 \big/ \sum_{i=1}^{n} (Y_i - \bar{Y})^2$ measures the proportion of the total variation in Y not explained by the model (i.e. by X)
• Thus, the R² measures the proportion of the total variation in Y that is explained by the model (i.e. by X)
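A short numerical sketch of the R² calculation follows, again on hypothetical data invented for illustration. It also computes the explained variation directly to show that the two shares add up to one.

```python
# Hypothetical data, invented for illustration only (not from the course).
X = [50, 55, 60, 65, 70, 75, 80]
Y = [11.0, 12.2, 12.5, 13.4, 14.3, 14.8, 15.9]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * x for x in X]

ssr = sum((y - f) ** 2 for y, f in zip(Y, fitted))   # unexplained variation
tss = sum((y - y_bar) ** 2 for y in Y)               # total variation in Y
ess = sum((f - y_bar) ** 2 for f in fitted)          # explained variation

unexplained_share = ssr / tss
r_squared = 1 - unexplained_share

print("share not explained:", round(unexplained_share, 4))
print("R-squared:          ", round(r_squared, 4))
```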
Properties of the OLS Estimators
• The Gauss-Markov Theorem states the properties of the OLS estimators, i.e. of $\hat{B}_0$ and $\hat{B}_1$:
They are unbiased: $E[\hat{B}_0] = B_0$ and $E[\hat{B}_1] = B_1$
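Unbiasedness can be illustrated with a small simulation. This is a sketch under stated assumptions: the "true" population values B0 = 3.7 and B1 = 0.15, the error standard deviation of 0.5, the X values, and the number of replications are all made up for the illustration. Normal errors are used only for convenience; unbiasedness itself does not require normality.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Assumed population values, chosen only for this illustration
TRUE_B0, TRUE_B1, SIGMA = 3.7, 0.15, 0.5
X = [50, 55, 60, 65, 70, 75, 80]   # hypothetical fixed regressor values


def ols(xs, ys):
    """Return (b0_hat, b1_hat) from the closed-form OLS formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b1 * x_bar, b1


reps = 2000
b0_draws, b1_draws = [], []
for _ in range(reps):
    # Draw a new sample from the population model Yi = B0 + B1*Xi + ui
    ys = [TRUE_B0 + TRUE_B1 * x + random.gauss(0, SIGMA) for x in X]
    b0_hat, b1_hat = ols(X, ys)
    b0_draws.append(b0_hat)
    b1_draws.append(b1_hat)

mean_b0 = sum(b0_draws) / reps
mean_b1 = sum(b1_draws) / reps
print("average intercept estimate:", round(mean_b0, 3), "(true value", TRUE_B0, ")")
print("average slope estimate:    ", round(mean_b1, 4), "(true value", TRUE_B1, ")")
```

Individual estimates vary from sample to sample, but their averages across many samples settle near the population values.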
Properties of the OLS Estimators
And if the dependent variable Y (and thus the error term of the population regression model, ui) has a normal distribution, the OLS estimators have the minimum variance among all unbiased estimators, not only among the linear ones
Properties of the OLS Estimators
• BLUE – Best Linear Unbiased Estimator
• Unbiased => bias of $\hat{\beta}_j$ = $E(\hat{\beta}_j) - \beta_j = 0$
• Best Unbiased => minimum variance & unbiased
• Linear => the estimator is a linear function of the observed Yi’s
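The "linear" part can be made concrete: each OLS coefficient is a weighted sum of the Yi’s, with weights that depend only on the Xi’s. The sketch below uses hypothetical data (invented for illustration), and the weight names `w` and `v` are illustrative choices rather than course notation.

```python
# Hypothetical data, invented for illustration only (not from the course).
X = [50, 55, 60, 65, 70, 75, 80]
Y = [11.0, 12.2, 12.5, 13.4, 14.3, 14.8, 15.9]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n
s_xx = sum((x - x_bar) ** 2 for x in X)

# Weights that depend only on the X values
w = [(x - x_bar) / s_xx for x in X]     # weights for the slope
v = [1 / n - x_bar * wi for wi in w]    # weights for the intercept

# Each estimator written as a linear combination of the Y values
b1_linear = sum(wi * y for wi, y in zip(w, Y))
b0_linear = sum(vi * y for vi, y in zip(v, Y))

# Usual closed-form formulas, for comparison
b1_formula = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / s_xx
b0_formula = y_bar - b1_formula * x_bar

print("slope:    ", b1_linear, "vs", b1_formula)
print("intercept:", b0_linear, "vs", b0_formula)
```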