ECMT1020 Introduction to Econometrics
Semester 1, 2024
Formulas and statistical tables
Two random variables. Let $X$ and $Y$ be two random variables with expected values $\mu_X$ and $\mu_Y$, and variances $\sigma_X^2$ and $\sigma_Y^2$.

1. Covariance:
$$\sigma_{XY} := \mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - E(X)E(Y).$$

2. Correlation coefficient:
$$\rho_{XY} := \mathrm{Corr}(X, Y) = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}.$$
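The identity $\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y)$ can be checked by simulation. A minimal NumPy sketch, with an arbitrary pair of correlated variables chosen purely for illustration:

```python
import numpy as np

# Simulation check of Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)] = E(XY) - E(X)E(Y).
# The distributions, coefficient 0.5, and sample size are illustrative choices.
rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
y = 0.5 * x + rng.normal(size=1_000_000)  # correlated with x by construction

cov_direct = np.mean((x - x.mean()) * (y - y.mean()))
cov_identity = np.mean(x * y) - x.mean() * y.mean()
corr = cov_direct / (x.std() * y.std())

print(cov_direct, cov_identity)  # both close to the true value 0.5
print(corr)                      # close to 0.5 / sqrt(1.25) ~ 0.447
```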
Expected value, variance, covariance rules. In the following, $b$ is any constant and $X, Y, V, W$ are any random variables.

1. $E(X + Y) = E(X) + E(Y)$.
2. $E(b) = b$.
3. $E(bX) = bE(X)$.
4. $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$.
5. $\mathrm{Var}(b) = 0$.
6. $\mathrm{Var}(bX) = b^2\,\mathrm{Var}(X)$.
7. $\mathrm{Cov}(X, Y) = \mathrm{Cov}(Y, X)$.
8. $\mathrm{Cov}(X, V + W) = \mathrm{Cov}(X, V) + \mathrm{Cov}(X, W)$.
9. $\mathrm{Cov}(X, bY) = b\,\mathrm{Cov}(X, Y)$.
10. $\mathrm{Cov}(X, b) = 0$.
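Rule 4 is the one most often misremembered. A quick simulation check, again with illustrative values only:

```python
import numpy as np

# Check of rule 4, Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), using two
# deliberately correlated variables. With the same divisor (ddof=0) on both
# sides, the identity holds exactly for sample moments, not just on average.
rng = np.random.default_rng(1)
x = rng.normal(size=500_000)
y = 0.8 * x + rng.normal(size=500_000)

lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y, ddof=0)[0, 1]
print(lhs, rhs)  # identical up to floating-point error
```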
Estimators. Let $X$ and $Y$ be two random variables. Let $\{X_1, \dots, X_n\}$ be a sample of $X$, and $\{Y_1, \dots, Y_n\}$ be a sample of $Y$. Below are the commonly used sample estimators.

1. Sample mean:
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i.$$

2. Sample variance:
$$\hat{\sigma}_X^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2.$$

3. Sample covariance:
$$\hat{\sigma}_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}).$$

4. Sample correlation:
$$\hat{\rho}_{XY} = \frac{\hat{\sigma}_{XY}}{\sqrt{\hat{\sigma}_X^2 \hat{\sigma}_Y^2}} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}.$$
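A minimal sketch of these estimators in NumPy, written out explicitly and checked against the built-ins; the data are simulated with arbitrary illustrative parameters. Note that `ddof=1` gives the $n-1$ divisor used above:

```python
import numpy as np

# Explicit sample estimators, compared with NumPy's built-in versions.
rng = np.random.default_rng(2)
x = rng.normal(loc=2.0, scale=3.0, size=200)
y = rng.normal(loc=1.0, scale=2.0, size=200)

n = len(x)
xbar = x.sum() / n                                      # sample mean
var_x = ((x - xbar) ** 2).sum() / (n - 1)               # sample variance
cov_xy = ((x - xbar) * (y - y.mean())).sum() / (n - 1)  # sample covariance
rho_xy = cov_xy / np.sqrt(var_x * y.var(ddof=1))        # sample correlation

assert np.isclose(var_x, x.var(ddof=1))
assert np.isclose(cov_xy, np.cov(x, y, ddof=1)[0, 1])
assert np.isclose(rho_xy, np.corrcoef(x, y)[0, 1])
```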
Hypothesis tests of a normal sample. Let $\{X_1, \dots, X_n\}$ be a random sample of $X$, which follows a normal distribution with mean $\mu$ and variance $\sigma^2$. We would like to test $H_0: \mu = \mu_0$ for some $\mu_0$.

1. If $\sigma^2$ is known, then we use a $z$ statistic:
$$z = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu_0}{\sigma_X / \sqrt{n}},$$
which follows the standard normal distribution under the null hypothesis.

2. If $\sigma^2$ is unknown (meaning it needs to be estimated), then we use a $t$ statistic:
$$t = \frac{\bar{X} - \mu_0}{\hat{\sigma}_{\bar{X}}} = \frac{\bar{X} - \mu_0}{\hat{\sigma}_X / \sqrt{n}},$$
which follows the $t$ distribution with $n - 1$ degrees of freedom under the null hypothesis.
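A sketch of both cases on simulated data; the sample, the "known" $\sigma$, and $\mu_0$ below are made-up illustrative values:

```python
import numpy as np
from scipy import stats

# z test (sigma known) and t test (sigma estimated) for H0: mu = mu0.
rng = np.random.default_rng(3)
x = rng.normal(loc=5.0, scale=2.0, size=30)
mu0 = 5.0
n = len(x)

# Case 1: sigma^2 known (here we pretend sigma = 2 is known).
z = (x.mean() - mu0) / (2.0 / np.sqrt(n))
p_z = 2 * (1 - stats.norm.cdf(abs(z)))

# Case 2: sigma^2 unknown, replaced by the sample standard deviation.
t = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))
p_t = 2 * (1 - stats.t.cdf(abs(t), df=n - 1))

print(z, p_z)
print(t, p_t)  # matches stats.ttest_1samp(x, mu0)
```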
Simple regression analysis. Consider a simple regression model $Y = \beta_1 + \beta_2 X + u$ which satisfies the CLRM assumptions. We fit the regression by the OLS procedure using a random sample of $(X, Y)$ with $n$ observations.
1. OLS estimators:
$$\hat{\beta}_2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \qquad \hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}.$$

2. The variances of $\hat{\beta}_1$ and $\hat{\beta}_2$:
$$\sigma_{\hat{\beta}_1}^2 = \sigma_u^2 \left( \frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right), \qquad \sigma_{\hat{\beta}_2}^2 = \frac{\sigma_u^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}.$$
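These formulas can be applied directly to data. A minimal sketch on simulated data, with made-up true coefficients; $\sigma_u^2$ is replaced by its unbiased estimate $\mathrm{RSS}/(n - k)$ with $k = 2$:

```python
import numpy as np

# OLS estimators and their estimated variances for Y = beta1 + beta2*X + u.
rng = np.random.default_rng(4)
n = 100
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.normal(scale=2.0, size=n)

b2 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b1 = y.mean() - b2 * x.mean()

# Estimated variances, with sigma_u^2 replaced by RSS / (n - 2).
resid = y - b1 - b2 * x
s2_u = (resid ** 2).sum() / (n - 2)
var_b1 = s2_u * (1 / n + x.mean() ** 2 / ((x - x.mean()) ** 2).sum())
var_b2 = s2_u / ((x - x.mean()) ** 2).sum()

print(b1, b2)                            # close to 1.0 and 0.5
print(np.sqrt(var_b1), np.sqrt(var_b2))  # standard errors
```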
Multiple regression analysis. Consider a multiple regression model with $k - 1$ explanatory variables
$$Y = \beta_1 + \beta_2 X_2 + \cdots + \beta_k X_k + u,$$
which satisfies the CLRM assumptions. Given a sample of $n$ observations, the fitted regression is $\hat{Y} = \hat{\beta}_1 + \hat{\beta}_2 X_2 + \cdots + \hat{\beta}_k X_k$. Note: simple regression is a special case of multiple regression with $k = 2$.
1. Goodness of fit:
$$R^2 = \frac{\mathrm{ESS}}{\mathrm{TSS}}.$$
Adjusted $R^2$:
$$\bar{R}^2 = 1 - \frac{\mathrm{RSS}/(n - k)}{\mathrm{TSS}/(n - 1)}.$$
2. $t$ statistic for testing $H_0: \beta_j = \beta_j^0$:
$$t = \frac{\hat{\beta}_j - \beta_j^0}{\mathrm{s.e.}(\hat{\beta}_j)} \sim t_{n-k},$$
for $j = 1, 2, \dots, k$, where $t_{n-k}$ denotes the $t$ distribution with $n - k$ degrees of freedom.
3. $F$ statistic for testing the joint explanatory power of the regression model:
$$F(k - 1, n - k) = \frac{\mathrm{ESS}/(k - 1)}{\mathrm{RSS}/(n - k)} = \frac{R^2/(k - 1)}{(1 - R^2)/(n - k)} \sim F_{k-1, n-k},$$
where $F_{k-1, n-k}$ denotes the $F$ distribution with degrees of freedom $k - 1$ and $n - k$.
4. “Generalized” $F$ statistic for testing a general null hypothesis of linear restrictions on the parameters (restricted model against unrestricted model):
$$F(\text{extra DF}, \text{DF remaining}) = \frac{\text{improvement in fit}/\text{extra DF}}{\text{RSS remaining}/\text{DF remaining}}.$$
5. For a regression model with two explanatory variables, $Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u$, the standard error for the OLS estimator $\hat{\beta}_2$ is
$$\mathrm{s.e.}(\hat{\beta}_2) = \sqrt{\frac{\hat{\sigma}_u^2}{\sum_{i=1}^{n} (X_{2i} - \bar{X}_2)^2} \times \frac{1}{1 - \hat{\rho}_{X_2, X_3}^2}},$$
where $\hat{\sigma}_u^2 = \frac{1}{n-k} \sum_{i=1}^{n} \hat{u}_i^2$ is the unbiased estimator for $\sigma_u^2$, and $\hat{\rho}_{X_2, X_3}$ is the sample correlation between $X_2$ and $X_3$.
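Items 1–5 can be checked together on a simulated two-regressor model ($k = 3$). The sketch below computes $R^2$, adjusted $R^2$, the $t$ statistics, and the $F$ statistic, and verifies the standard-error formula in item 5 against the matrix form $\hat{\sigma}_u^2 (X'X)^{-1}$; the data and true coefficients are illustrative assumptions:

```python
import numpy as np

# Y = beta1 + beta2*X2 + beta3*X3 + u with correlated regressors.
rng = np.random.default_rng(5)
n, k = 200, 3
x2 = rng.normal(size=n)
x3 = 0.6 * x2 + rng.normal(size=n)           # correlated with x2
y = 1.0 + 2.0 * x2 - 1.0 * x3 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x2, x3])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # OLS
resid = y - X @ beta_hat

tss = ((y - y.mean()) ** 2).sum()
rss = (resid ** 2).sum()
ess = tss - rss
r2 = ess / tss
r2_adj = 1 - (rss / (n - k)) / (tss / (n - 1))

# t statistics for H0: beta_j = 0, using the matrix covariance estimate.
s2_u = rss / (n - k)
cov_beta = s2_u * np.linalg.inv(X.T @ X)
t_stats = beta_hat / np.sqrt(np.diag(cov_beta))

# F statistic for joint explanatory power.
F = (ess / (k - 1)) / (rss / (n - k))

# Item 5: s.e.(beta2_hat) via the variance-inflation form agrees exactly.
rho23 = np.corrcoef(x2, x3)[0, 1]
se_b2 = np.sqrt(s2_u / ((x2 - x2.mean()) ** 2).sum() / (1 - rho23 ** 2))
assert np.isclose(se_b2, np.sqrt(cov_beta[1, 1]))

print(r2, r2_adj, t_stats, F)
```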
Heteroskedasticity. Consider a linear regression with $k$ parameters and sample size $n$.

1. Test statistic of the Goldfeld-Quandt test:
$$F(n^* - k, n^* - k) = \frac{\mathrm{RSS}_2}{\mathrm{RSS}_1},$$
where $n^*$ is the size of the first and the last subsamples, and $\mathrm{RSS}_1$ and $\mathrm{RSS}_2$ are the RSS of the subregressions using the first and last subsamples, respectively.
2. White test statistic for heteroskedasticity:
$$W = nR^2 \sim \chi^2_{k-1},$$
where $R^2$ is the $R^2$ from the auxiliary regression of the squared OLS residuals on the $k - 1$ explanatory variables.
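A sketch of both tests on a simulated simple regression ($k = 2$) with heteroskedasticity built in. The subsample size and the form of the error variance are illustrative assumptions, and the White auxiliary regression follows the simplified $\chi^2_{k-1}$ form above:

```python
import numpy as np
from scipy import stats

# Simulated regression whose error standard deviation grows with x,
# so heteroskedasticity is present by construction.
rng = np.random.default_rng(6)
n = 120
x = np.sort(rng.uniform(1, 10, size=n))      # sorted by the suspect variable
y = 1.0 + 0.5 * x + rng.normal(scale=0.3 * x)

def rss_of_fit(xs, ys):
    """RSS from regressing ys on a constant and xs."""
    X = np.column_stack([np.ones(len(xs)), xs])
    b = np.linalg.lstsq(X, ys, rcond=None)[0]
    return ((ys - X @ b) ** 2).sum()

# Goldfeld-Quandt: compare RSS from the first and last n* = 45 observations
# (dropping the middle quarter of the sample; n* is an illustrative choice).
n_star = 45
rss1 = rss_of_fit(x[:n_star], y[:n_star])
rss2 = rss_of_fit(x[-n_star:], y[-n_star:])
gq = rss2 / rss1
p_gq = 1 - stats.f.cdf(gq, n_star - 2, n_star - 2)

# White test, simplified form: regress squared residuals on the regressors
# and compare W = n * R^2 against chi-squared with k - 1 = 1 df.
X = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
e2 = (y - X @ b) ** 2
g = np.linalg.lstsq(X, e2, rcond=None)[0]
r2_aux = 1 - ((e2 - X @ g) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()
W = n * r2_aux
p_w = 1 - stats.chi2.cdf(W, df=1)

print(gq, p_gq)  # large F, small p: reject homoskedasticity
print(W, p_w)
```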