Econometrics

Contents

1 Regression Topic
  1.1 Linear Transformation
  1.2 Binary Variable
  1.3 Relaxing The Assumption
  1.4 Heteroskedasticity
  1.5 Non-Constant Term
  1.6 Serial Correlation
  1.7 Omitted Variables
  1.8 Multicollinearity and Non-Linear effect

2 Other Econometric Models
  2.1 Binary dependent variable regression - LPM
  2.2 Binary dependent variable regression - Logistic
  2.3 Binary dependent variable regression - Probit Model
  2.4 Instrumental variable regression
  2.5 Panel data regression
  2.6 Causal inference and interaction effect

3 Time series
  3.1 Lag Operator
  3.2 AR model
  3.3 MA model
  3.4 Out of sample forecasting
  3.5 Non-stationary time series
  3.6 Martingale series

4 Additional propositions
  4.1 Omitted variables - redundancy
  4.2 Heteroskedasticity - HC standard error
  4.3 Sample forecasting
*All rights reserved.
*Thanks to
PhD. Cheng-Che, Hsu
PhD. Shu-Sheng, Chen
PhD. Shi-Hsun, Hsu
PhD. Chung-Min, Kuan
1 Regression Topic
1.1 Linear Transformation
Basic Linear Transformation
Xi* = aXi + b , Yi* = cYi + d
then
β̂* = (c/a) β̂ , α̂* = cα̂ + d − (bc/a) β̂
proof
X̄* = aX̄ + b , Ȳ* = cȲ + d
β̂* = S_{X*Y*} / S²_{X*} = (ac S_{XY}) / (a² S²_X) = (c/a) β̂
α̂* = Ȳ* − β̂* X̄* = cȲ + d − (c/a) β̂ (aX̄ + b) = cα̂ + d − (bc/a) β̂
• Σ_{i=1}^n (ûi*)² = c² Σ_{i=1}^n (ûi)² , i.e. SSE* = c² × SSE
Demean
Yi∗ = (Yi − Ȳ ) , Xi∗ = (Xi − X̄)
then
α̂* = 0 , β̂* = β̂
proof
Yi* + Ȳ = α̂ + β̂(Xi* + X̄) ⇒ Yi* = (α̂ + β̂X̄ − Ȳ) + β̂ Xi* = β̂ Xi*
that is
α̂ + β̂ X̄ = Ȳ
Standardize
Yi* = (Yi − Ȳ)/S_Y , Xi* = (Xi − X̄)/S_X
then
α̂* = 0 , β̂* = r_XY
proof
Yi* S_Y + Ȳ = α̂ + β̂(Xi* S_X + X̄) , Yi* S_Y = (α̂ + β̂X̄ − Ȳ) + β̂ Xi* S_X
Yi* = (α̂ + β̂X̄ − Ȳ)/S_Y + β̂ (S_X/S_Y) Xi* = r_XY Xi*
that is
α̂ + β̂X̄ = Ȳ , β̂ (S_X/S_Y) = r_XY
• Linear transformation does not change the coefficient of correlation:
R²* = R² , r²_{X*Y*} = r²_{XY}
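The standardization result can be checked numerically. The following is a minimal sketch with simulated data (all names and numbers are illustrative, not from the notes): regressing the standardized Y on the standardized X gives a slope equal to the sample correlation.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.5 + 0.8 * x + rng.normal(size=200)

# Standardize both variables (population-style moments, ddof=0).
xs = (x - x.mean()) / x.std()
ys = (y - y.mean()) / y.std()

# OLS slope of ys on xs (means are zero, so no intercept term is needed)
beta_star = np.sum(xs * ys) / np.sum(xs ** 2)
r_xy = np.corrcoef(x, y)[0, 1]
print(beta_star, r_xy)   # identical up to floating-point error
```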
1.2 Binary Variable
Estimation by analogy (easy to use)
<proof>
Y = α + βX + u
α = E(Y|X = 0) , α + β = E(Y|X = 1)
β = E(Y|X = 1) − E(Y|X = 0)
Estimating by analogy, E(Y|X = 0) ≈ Ȳ0 and E(Y|X = 1) ≈ Ȳ1 , so α̂ = Ȳ0 , β̂ = Ȳ1 − Ȳ0
Equivalence test
The t-test on β̂ is equivalent to the two-sample t-test, where S²_p is the pooled sample variance:
φ* = (β̂ − 0)/SE(β̂) = [Ȳ1 − Ȳ0 − (µ1 − µ0)] / √( S²_p (1/n0 + 1/n1) )
SE(β̂) = √( σ̂² / Σ_{i=1}^n (Xi − X̄)² ) = √( S²_p (1/n0 + 1/n1) )
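The analogy can also be verified directly: regressing Y on a 0/1 dummy reproduces the two group means. A small sketch with simulated data (variable names and values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.integers(0, 2, size=500)             # binary regressor
y = 2.0 + 1.0 * x + rng.normal(size=500)

X = np.column_stack([np.ones(len(x)), x])    # design matrix [1, X]
alpha_hat, beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

y0_bar, y1_bar = y[x == 0].mean(), y[x == 1].mean()
print(alpha_hat, y0_bar)            # alpha_hat equals the mean of the X = 0 group
print(beta_hat, y1_bar - y0_bar)    # beta_hat equals the difference in group means
```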
1.3 Relaxing The Assumption
• Drop two assumptions of the Gauss-Markov theorem:
  1. X is a random variable (not fixed in repeated samples)
  2. ui need not follow a normal distribution
• Three objects of interest: E(β̂|X), Var(β̂|X), plim_{n→∞} β̂
E(β̂|X) = β = E(β̂) , Var(β̂|X) = σ² / Σ_{i=1}^n (Xi − X̄)²
<proof>
β̂ = Σ_{i=1}^n di ui + β , where Σ_{i=1}^n di = 0 , Σ_{i=1}^n di Xi = 1 , Σ_{i=1}^n di² = 1 / Σ_{i=1}^n (Xi − X̄)²
E(β̂|X) = E(β + Σ_{i=1}^n di ui | X) = β + Σ_{i=1}^n di E(ui|X) = β
Var(β̂|X) = Var(β + Σ_{i=1}^n di ui | X) = Σ_{i=1}^n di² Var(ui|X) = Σ_{i=1}^n di² σ² = σ² / Σ_{i=1}^n (Xi − X̄)²
plim_{n→∞} β̂ = β + plim_{n→∞} [Σ_{i=1}^n (Xi − X̄)ui / n] / [Σ_{i=1}^n (Xi − X̄)² / n]
             = β + E[(X − µ_X) u] / Var(X) = β + Cov(X, u) / Var(X) = β
• When E(u|X) = 0 , the OLS estimator is unbiased and consistent.
• When E(u|X) = k (a non-zero constant) , Cov(X, u) = Cov(X, E(u|X)) = Cov(X, k) = 0 , so β̂ is still consistent.
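The two claims above can be illustrated with a small Monte Carlo sketch (simulated data; the sample sizes and error distribution are arbitrary choices for illustration): with random X and non-normal errors satisfying E(u|X) = 0, the OLS slope is centered on β in small samples and converges to β as n grows.

```python
import numpy as np

rng = np.random.default_rng(2)
beta = 0.7

def ols_slope(n):
    x = rng.normal(size=n)                 # X is random, not fixed
    u = rng.standard_t(df=5, size=n)       # non-normal errors with E(u|X) = 0
    y = 1.0 + beta * x + u
    return np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

# Unbiasedness: the average slope over many small samples is close to beta.
print(np.mean([ols_slope(30) for _ in range(5000)]))
# Consistency: a single large sample gives a slope close to beta.
print(ols_slope(200_000))
```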
1.4 Heteroskedasticity
Definition
The conditional variance of the error term is a function of X:
Var(ui|Xi) = σ² h(Xi)
• The OLS estimator is no longer BLUE.
• The usual t-tests and F-tests are not valid.
• Although the OLS estimator is not BLUE, it is still unbiased and consistent.
Test for heteroskedastic variance - Breusch-Pagan test
Estimate the auxiliary regression
û²i = δ0 + δ1 X1i + δ2 X2i + ... + δk Xki + vi
1. H0 : Var(u|X) = σ² against H1 : Var(u|X) ≠ σ²
2. φ = nR² ∼ χ²(k) asymptotically, where R² is from the auxiliary regression
3. Given α, RR = {φ | φ > χ²_α(k)}
4. When φ* > χ²_α(k), reject H0 and conclude that Var(u|X) ≠ σ²
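In practice the auxiliary regression is rarely run by hand. A sketch using statsmodels' implementation of the Breusch-Pagan test (the data are simulated and purely illustrative):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(3)
x = rng.uniform(1, 5, size=400)
u = rng.normal(scale=0.5 * x, size=400)     # Var(u|X) grows with X
y = 1.0 + 2.0 * x + u

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

# LM statistic corresponds to n * R^2 from regressing squared residuals on X.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, X)
print(lm_stat, lm_pvalue)   # a small p-value leads to rejecting homoskedasticity
```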
HC standard error
SE(β̂)_HC = √[ (n/(n−2)) Σ_{i=1}^n (Xi − X̄)² û²i / ( Σ_{i=1}^n (Xi − X̄)² )² ]
• Usually SE(β̂)_HC > SE(β̂) , so with HC standard errors the test statistic is usually
smaller, and significance and power are weaker.
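statsmodels can report heteroskedasticity-consistent standard errors directly. A brief sketch on simulated heteroskedastic data (names and numbers are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.uniform(1, 5, size=400)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5 * x, size=400)
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()                  # classical standard errors
hc = sm.OLS(y, X).fit(cov_type="HC1")     # heteroskedasticity-consistent errors
print(ols.bse)   # [SE(alpha), SE(beta)] under homoskedasticity
print(hc.bse)    # HC1 ("robust") standard errors, typically larger here
```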
WLS - Weighted Least Squares
Regression model
Yi = α + βXi + ui , Var(ui|Xi) = h(Xi) σ²
Divide the equation through by √h(Xi):
Yi/√h(Xi) = α/√h(Xi) + β Xi/√h(Xi) + ui/√h(Xi)
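Dividing through by √h(Xi) is exactly what weighted least squares does with weights proportional to 1/h(Xi). A sketch assuming h(X) = X² is known (the functional form and data are illustrative assumptions):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.uniform(1, 5, size=400)
h = x ** 2                                     # assumed known variance function
y = 1.0 + 2.0 * x + rng.normal(scale=np.sqrt(h), size=400)
X = sm.add_constant(x)

# WLS with weights 1 / h(X) corresponds to the transformed regression above.
wls = sm.WLS(y, X, weights=1.0 / h).fit()
ols = sm.OLS(y, X).fit()
print(wls.params, wls.bse)
print(ols.params, ols.bse)   # same target coefficients, but less efficient
```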
1.5 Non-Constant Term
Regression model with no intercept
Yi = βXi + ui
β̂ = Σ_{i=1}^n Xi Yi / Σ_{i=1}^n Xi²
proof of unbiasedness
β̂ = Σ_{i=1}^n Xi Yi / Σ_{i=1}^n Xi² = Σ_{i=1}^n di Yi = Σ_{i=1}^n di (βXi + ui) = β Σ_{i=1}^n di Xi + Σ_{i=1}^n di ui
E(β̂) = E(β Σ_{i=1}^n di Xi + Σ_{i=1}^n di ui) = β + Σ_{i=1}^n di E(ui) = β ... unbiased
Note: di = Xi / Σ_{j=1}^n Xj² , so Σ_{i=1}^n di = Σ_{i=1}^n Xi / Σ_{i=1}^n Xi² ≠ 0
• So-called "regression with zero intercept": the fitted line must pass through (0, 0).
• Characteristic
SST ≠ SSR + SSE , because Σ_{i=1}^n ûi ≠ 0
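A short numerical check of that characteristic (simulated data, for illustration only): with the intercept forced to zero the residuals need not sum to zero, so the SST = SSR + SSE decomposition fails.

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(1, 5, size=100)
y = 3.0 + 2.0 * x + rng.normal(size=100)     # true model has an intercept

beta_hat = np.sum(x * y) / np.sum(x ** 2)    # regression through the origin
resid = y - beta_hat * x

print(resid.sum())                            # not (close to) zero
sst = np.sum((y - y.mean()) ** 2)
ssr = np.sum((beta_hat * x - y.mean()) ** 2)
sse = np.sum(resid ** 2)
print(sst, ssr + sse)                         # the decomposition does not hold
```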
1.6 Serial Correlation
Cause and effects
Yt = α + βXt + ut , Cov(ut, ut−s) ≠ 0
• The OLS estimator is no longer BLUE, that is, it is inefficient.
• t-tests and F-tests are invalid.
• The OLS estimator is still unbiased and consistent.
Durbin-Watson test
Tests whether the error terms follow an AR(1) process
AR(1) : ut = ρ ut−1 + εt
procedure
1. H0 : ρ = 0 against H1 : ρ ≠ 0
2. φd = Σ_{t=2}^T (ût − ût−1)² / Σ_{t=1}^T û²t ≃ 2(1 − ρ̂)
3. RR = {φd | φd < dL or φd > 4 − dL} , 0 < dL < dU < 2
4. When φd < dL or φd > 4 − dL ⇒ reject H0
   When dU < φd < 4 − dU ⇒ do not reject H0
   When dL < φd < dU or 4 − dU < φd < 4 − dL ⇒ undefined (inconclusive)
disadvantages and limitations
• The Durbin-Watson test can only detect AR(1) serial correlation.
• The test has an undefined (inconclusive) region.
• The regressors cannot include lagged dependent variables Yt−p , p ≥ 1.
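The statistic is available directly in statsmodels. A sketch on simulated AR(1) errors (the parameter values and data are illustrative, not from the notes):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(7)
T = 300
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):                     # AR(1) errors with rho = 0.6
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1.0 + 0.5 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
dw = durbin_watson(res.resid)
print(dw)              # well below 2, consistent with positive serial correlation
print(2 * (1 - 0.6))   # rough benchmark from the approximation 2(1 - rho)
```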
1.7 Omitted Variables
characteristic
• The omitted variable satisfies Cov(X, Z) ≠ 0 and γ ≠ 0.
• If the model omits such a variable, the regressor is correlated with the error term (endogeneity).
Model Y = α + βX + γZ + u
SLR : Ŷ = α̂ + β̂X ... Total Effect
MLR : Ỹ = α̃ + β̃X + γ̃Z ... Direct Effect
AR : Ẑ = θ̂ + δ̂X ... Indirect Effect (auxiliary regression)
β̂ = β̃ + δ̂ γ̃
According to the WLLN & CMT
β̂ = [Σ_{i=1}^n (Xi − X̄)(Yi − Ȳ)/n] / [Σ_{i=1}^n (Xi − X̄)²/n] →p Cov(X, Y)/Var(X) = Cov(X, α + βX + γZ + u)/Var(X) = β + γ Cov(X, Z)/Var(X)
• γCov(X, Z) > 0 → β̂ is biased upward
• γCov(X, Z) = 0 → β̂ is unbiased
• γCov(X, Z) < 0 → β̂ is biased downward
Bias
β̂ = Σ_{i=1}^n (Xi − X̄)(Yi − Ȳ) / Σ_{i=1}^n (Xi − X̄)² = Σ_{i=1}^n di Yi = Σ_{i=1}^n di (α + βXi + γZi + ui) = β + γ Σ_{i=1}^n di Zi + Σ_{i=1}^n di ui
Σ_{i=1}^n di Zi = Σ_{i=1}^n (Xi − X̄)(Zi − Z̄) / Σ_{i=1}^n (Xi − X̄)² → given Xi, Zi , Σ_{i=1}^n di Zi is a constant
E(β̂|Xi, Zi) = E(β + γ Σ_{i=1}^n di Zi + Σ_{i=1}^n di ui) = β + γ S_XZ / S²_X
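The decomposition β̂ = β̃ + δ̂γ̃ can be checked numerically. A minimal sketch with simulated data (all coefficients and names are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 5000
x = rng.normal(size=n)
z = 0.8 * x + rng.normal(size=n)          # Cov(X, Z) != 0
y = 1.0 + 2.0 * x + 1.5 * z + rng.normal(size=n)

slr = sm.OLS(y, sm.add_constant(x)).fit()                        # total effect
mlr = sm.OLS(y, sm.add_constant(np.column_stack([x, z]))).fit()  # direct effect
aux = sm.OLS(z, sm.add_constant(x)).fit()                        # indirect path

beta_hat = slr.params[1]
beta_tilde, gamma_tilde = mlr.params[1], mlr.params[2]
delta_hat = aux.params[1]
print(beta_hat, beta_tilde + delta_hat * gamma_tilde)   # equal up to rounding
```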
1.8 Multicollinearity and Non-Linear effect
How to detect high multicollinearity?
• Other coefficient estimates change substantially when an explanatory variable is added or removed.
• The F-test is significant but the individual t-tests are not.
• VIF_j = 1/(1 − R²_j) ; when VIF_j > 10 , high multicollinearity exists in the model (see the sketch below).
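A sketch of computing VIFs with statsmodels on simulated, nearly collinear regressors (variable names and coefficients are illustrative assumptions):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(9)
n = 500
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)    # nearly collinear with x1
x3 = rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2, x3]))
# VIF_j = 1 / (1 - R_j^2); column 0 is the constant, so start at column 1.
for j in range(1, X.shape[1]):
    print(j, variance_inflation_factor(X, j))   # x1 and x2 show very large VIFs
```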
Linear effect & linear in parameters
• Linear effect
  ∂E(Y|X)/∂X = constant
• Linear in parameters
  ∂E(Y|X)/∂β is not a function of β
Polynomial regression model
• Quadratic regression model
  Y = β0 + β1 X + β2 X² + u ⇒ ∂E(Y|X)/∂X = β1 + 2β2 X (not constant)
• Reciprocal regression model
  Y = α + β(1/X) + u ⇒ ∂E(Y|X)/∂X = −β(1/X²) (not constant)
• Other models
  Y = αX^β e^u ⇒ ln Y = ln α + β ln X + u ⇒ ∂E(ln Y| ln X)/∂ ln X = β (constant)
  Y = c − e^{−(α+βX)} + u ⇒ cannot be estimated by OLS (not linear in parameters)
Logarithmic regression models
• Log-Linear model
  log Y = α1 + β1 X + u → when X changes by 1 unit, ΔY/Y changes by β1 (Y changes by about 100β1 %)
• Linear-Log model
  Y = α2 + β2 log X + u → when X changes by 100%, Y changes by β2 units
• Log-Log model
  log Y = α3 + β3 log X + u → when X changes by 100%, Y changes by 100β3 %
  β3 = (ΔY/Y)/(ΔX/X) = (ΔY/ΔX)(X/Y) = E_YX , the elasticity of Y with respect to X
2 Other Econometric Models
2.1 Binary dependent variable regression - LPM
Introduction
p = E(Yi|Xi) = P(Yi = 1|Xi) = G(Xi β)
Y is a binary variable
Y ∼ Bernoulli(p) , P(Y = 1) = p = E(Y) , Var(Y) = p(1 − p)
Let Yi|Xi ∼ Bernoulli[G(α + βXi)]
E(Yi|Xi) = G(α + βXi) , Var(Yi|Xi) = G(α + βXi)[1 − G(α + βXi)]
• Heteroskedasticity exists, so this regression is not efficient (not BLUE), but the estimator is still unbiased and consistent.
Linear Probability Model (LPM)
P(Yi = 1|Xi) = G(α + βXi) = α + βXi
• Heteroskedasticity exists in this model, so we need to use SE_HC for testing.
Disadvantages of the LPM
• It is not appropriate to assume a linear effect on the probability.
• Under heteroskedasticity the OLS coefficients are not efficient, and the usual test statistics do not follow the t-distribution.
  ◦ To increase the efficiency of the estimator, we can use WLS or GLS:
Yi = α + βXi + ui , ui = Yi − E(Yi|Xi) = Yi − α − βXi
Var(ui|Xi) = (α + βXi)[1 − (α + βXi)] = h(Xi)
Let vi = ui/√h(Xi) → Yi/√h(Xi) = α/√h(Xi) + β Xi/√h(Xi) + ui/√h(Xi)
Target function = Σ_{i=1}^n [Yi − (α̂ + β̂Xi)]² / {(α̂ + β̂Xi)[1 − (α̂ + β̂Xi)]}
• The fitted probability may be > 1 or < 0.
Maximum Likelihood method
w ∼ Bernoulli(p) , f(w) = p^w (1 − p)^{1−w} , w = 0, 1
yi is binary:
P(yi = 1|xi) = G(xi β)
f(yi|xi) = G(xi β)^{yi} [1 − G(xi β)]^{1−yi} , yi = 0, 1
L(β) = Π_{i=1}^n f(yi|xi) = Π_{i=1}^n G(xi β)^{yi} [1 − G(xi β)]^{1−yi}
ln L(β) = Σ_{i=1}^n ln{ G(xi β)^{yi} [1 − G(xi β)]^{1−yi} } = Σ_{i=1}^n { yi ln G(xi β) + (1 − yi) ln[1 − G(xi β)] }
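The log-likelihood above can be maximized numerically. A minimal sketch for the logistic choice of G, using scipy's general-purpose optimizer on simulated data (this is an illustrative implementation, not the notes' own routine):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit          # logistic CDF G(z) = 1 / (1 + exp(-z))

rng = np.random.default_rng(10)
n = 2000
x = rng.normal(size=n)
p_true = expit(-0.5 + 1.2 * x)
y = rng.binomial(1, p_true)

def neg_loglik(theta):
    alpha, beta = theta
    p = expit(alpha + beta * x)
    p = np.clip(p, 1e-12, 1 - 1e-12)     # avoid log(0)
    # ln L = sum[ y ln G + (1 - y) ln(1 - G) ], maximized by minimizing its negative
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

res = minimize(neg_loglik, x0=np.zeros(2), method="BFGS")
print(res.x)        # ML estimates of (alpha, beta), close to (-0.5, 1.2)
```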
2.2 Binary dependent variable regression - Logistic
P(Yi = 1|Xi) = G(α + βXi) = 1 / (1 + e^{−(α+βXi)})
• The fitted value of the LPM, α + βXi , lies in (−∞, ∞), but P(Yi = 1|Xi) ∈ [0, 1], so we transform the probability so that its range is (−∞, ∞):
p ∈ [0, 1] ⇒ p/(1 − p) ∈ [0, ∞) ⇒ ln[p/(1 − p)] ∈ (−∞, ∞) ⇒ ln[p/(1 − p)] = α + βXi ⇒ p = 1/(1 + e^{−(α+βXi)})
• p/(1 − p) is called the odds.
• Maximum Likelihood method
g(z) = ∂G(z)/∂z = e^{−z} / (1 + e^{−z})²
G(z)[1 − G(z)] = [1/(1 + e^{−z})][1 − 1/(1 + e^{−z})] = e^{−z}/(1 + e^{−z})² = g(z)
2.3 Binary dependent variable regression - Probit Model
P(Yi = 1|Xi) = Φ(α + βXi)
• The Probit model assumes there is a latent variable
Yi*|Xi = α + βXi + ui
Let ui ∼ N(0, 1) ⇒ Yi*|Xi ∼ N(α + βXi , 1)
Yi* cannot be observed, but we can observe the binary variable Yi:
Yi|Xi = 1 if Yi*|Xi > 0 , Yi|Xi = 0 if Yi*|Xi < 0
P(Yi = 1|Xi) = P(Yi*|Xi > 0) = P[Yi* − (α + βXi) > −(α + βXi)] = P[Z > −(α + βXi)]
= P(Z < α + βXi) = Φ(α + βXi)
Because Φ(α + βXi) ∈ (0, 1) ⇒ G(α + βXi) = Φ(α + βXi) = p
∂p/∂x = (∂p/∂z)(∂z/∂x) = g(z) × β = β × ϕ(α + βx)
• Goodness-of-fit analysis
McFadden's pseudo-R² = 1 − ln L̂_Full / ln L̂_Intercept
ln L̂_Full = the model's maximized log-likelihood , ln L̂_Intercept = the intercept-only model's maximized log-likelihood
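A sketch fitting both binary-response models with statsmodels and reading off the quantities discussed above (simulated latent-variable data; `prsquared` is statsmodels' McFadden pseudo-R²):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 2000
x = rng.normal(size=n)
y = ((-0.5 + 1.2 * x + rng.normal(size=n)) > 0).astype(int)   # latent-variable DGP

X = sm.add_constant(x)
logit = sm.Logit(y, X).fit(disp=0)
probit = sm.Probit(y, X).fit(disp=0)

print(probit.params)                   # (alpha, beta) of the probit index
print(probit.get_margeff().summary())  # average marginal effect, beta * phi(alpha + beta*x)
print(probit.prsquared)                # McFadden's pseudo R^2
print(logit.prsquared)
```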
2.4 Instrumental variable regression
Endogeneity
• In empirical research we cannot accept an estimator that is biased and inconsistent, but inefficiency is acceptable.
  ◦ When Cov(X, u) ≠ 0:
plim_{n→∞} β̂ = Cov(X, Y)/Var(X) = Cov(X, α + βX + u)/Var(X) = [Cov(X, α) + βVar(X) + Cov(X, u)]/Var(X)
             = β + Cov(X, u)/Var(X) ≠ β ... inconsistent
  ◦ Factors that cause endogeneity:
    1. Omitted variables
    2. Measurement error
    3. Simultaneity
• Omitted variables
Y = α + βX + γZ + u , Cov(X, u) = Cov(Z, u) = 0 , let ε = γZ + u
Cov(X, ε) = Cov(X, γZ + u) = γCov(X, Z) + Cov(X, u) = γCov(X, Z) ≠ 0
• Measurement error
Error in X:
X* = observed variable , X = true variable (cannot be observed)
Let X* = X + v ⇒ Y = α + βX + u = α + β(X* − v) + u = α + βX* + (u − βv)
Let ε = u − βv
Cov(X*, ε) = Cov(X + v, u − βv) = Cov(X, u) − βCov(X, v) + Cov(u, v) − βVar(v) = −βVar(v) ≠ 0
(when u and v are uncorrelated with X and with each other)
Error in Y:
Y* = observed variable , Y = true variable , let Y* = Y + v
Y* − v = α + βX + u ⇒ Y* = α + βX + (u + v) = α + βX + ε
Cov(X, ε) = Cov(X, u + v) = Cov(X, u) + Cov(X, v) = 0 ... OLS estimators are still consistent and unbiased
Var(ε) = Var(u + v) = Var(u) + Var(v) > Var(u) ... SE(β̂) will be larger
If X̃ is the best predictor of X and v = X − X̃ with Cov(X̃, v) = 0:
  ◦ the OLS estimator is still consistent and unbiased.
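A quick simulation of the error-in-X case under the classical assumptions (simulated data; the variances are illustrative choices): the OLS slope on the observed X* is biased toward zero, the so-called attenuation bias, by the factor Var(X)/(Var(X) + Var(v)).

```python
import numpy as np

rng = np.random.default_rng(12)
n = 100_000
x = rng.normal(size=n)                 # true regressor (unobserved)
v = rng.normal(scale=1.0, size=n)      # measurement error, independent of x and u
x_star = x + v                         # what we actually observe
y = 1.0 + 2.0 * x + rng.normal(size=n)

beta_true_x = np.cov(x, y)[0, 1] / np.var(x, ddof=1)            # close to 2.0
beta_obs_x = np.cov(x_star, y)[0, 1] / np.var(x_star, ddof=1)   # close to 2 * 1/(1+1) = 1.0
print(beta_true_x, beta_obs_x)
```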
Two-stage least squares (2SLS)
2.5 Panel data regression
2.6 Causal inference and interaction effect
3 Time series
3.1 Lag Operator
3.2 AR model
3.3 MA model
3.4 Out of sample forecasting
3.5 Non-stationary time series
3.6 Martingale series
4 Additional propositions
4.1 Omitted variables - redundancy
4.2 Heteroskedasticity - HC standard error
4.3 Sample forecasting