Econ107 Applied Econometrics
Topic 4: Hypothesis Testing
(Studenmund, Chapter 5)
I. Statistical Inference: Review
Statistical inference “... draws conclusions from (or makes inferences about) a
population from a random sample taken from that population.”
A population is the 'universe', i.e., the complete set of observations.
A sample is a subset of a given population.
For example, we could compute the average income for Singapore households
during the last year:
$$\bar{I} = \frac{1}{3000}\sum_{i=1}^{3000} I_i$$
where $I_i$ is household income. $\bar{I}$ is a sample statistic for the population parameter $E(I)$, the average income of all households in this country.
There are 2 steps in this process: Estimation and Hypothesis Testing.
The idea is that there is some underlying distribution to our random variable,
household income.
Although it is unlikely to be the case, assume that income is normally distributed:

$$I \sim N(\mu_I, \sigma_I^2)$$

with population mean $\mu_I$ and variance $\sigma_I^2$.

If this is the case, then the 'point estimate' $\bar{I}$ is distributed:

$$\bar{I} \sim N\!\left(\mu_I, \frac{\sigma_I^2}{n}\right)$$

where $n$ is the sample size.
We can turn this into a 'standardized' normal distribution by subtracting the population mean and dividing by the standard deviation of the estimator:

$$Z = \frac{\bar{I} - \mu_I}{\sigma_I/\sqrt{n}} \sim N(0, 1)$$
Of course, we don’t know the population standard deviation σI, but we can replace
it with the sample standard deviation $S$:

$$\hat{\sigma} = S = \sqrt{\frac{\sum_{i=1}^{n} (I_i - \bar{I})^2}{n-1}}$$
and rewrite the earlier expression:
$$t = \frac{\bar{I} - \mu_I}{S/\sqrt{n}}$$
which follows a t distribution with (n-1) degrees of freedom.
We can look up the critical t values in the Table for the t-distribution:
$$\text{Prob}(-2.576 \le t \le 2.576) = 0.99$$
Substitute in the earlier expression for t and rearrange terms:
$$\text{Prob}\!\left(\bar{I} - \frac{2.576\,S}{\sqrt{n}} \;\le\; \mu_I \;\le\; \bar{I} + \frac{2.576\,S}{\sqrt{n}}\right) = 0.99$$
This is the Confidence Interval around the unknown population parameter $\mu_I$. Interpretation: there is a 99% probability that this 'random interval' contains the true $\mu_I$. Confidence level = 100% minus the significance level.
Suppose $\bar{I} = 40$ (i.e., the average household receives $40,000 per year) and the sample standard deviation is 10. Then, using $2.576 \times 10 = 25.76$ and $\sqrt{3000} \approx 54.772$:

$$40 - \frac{25.76}{54.772} \;\le\; \mu_I \;\le\; 40 + \frac{25.76}{54.772}$$

$$39.530 \;\le\; \mu_I \;\le\; 40.470$$
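
A minimal Python sketch (added for illustration, not part of the original notes; it assumes numpy and scipy are available) reproduces this interval from the sample statistics:

```python
import numpy as np
from scipy import stats

# Sample statistics from the example above (income in $'000s)
n, mean_income, s = 3000, 40.0, 10.0

# 99% confidence interval: two-sided critical value from t(n-1)
alpha = 0.01
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)   # ~2.577 for df = 2999

half_width = t_crit * s / np.sqrt(n)
print(f"99% CI: [{mean_income - half_width:.3f}, {mean_income + half_width:.3f}]")
# -> approximately [39.53, 40.47], matching the interval above
```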
Properties of Point Estimators
• Linearity. An estimator is linear if it is a linear function of the observations in the sample.

$$\bar{I} = \frac{1}{3000}\sum_{i=1}^{3000} I_i = \frac{1}{3000}(I_1 + I_2 + \dots + I_n)$$
• Unbiasedness. An estimator is unbiased if in repeated samples the mean value of the estimators is equal to the true parameter value.

$$E(\bar{I}) = \mu_I$$
• Efficiency. This has to do with the 'precision' of our estimator. An estimator is efficient if it has the smallest variance among all linear unbiased estimators. Such an estimator is BLUE (the Best Linear Unbiased Estimator).
• Consistency. An estimator is 'consistent' if it approaches the true parameter value as the sample size gets larger.
Note that the properties of unbiasedness and consistency are conceptually very
different. Unbiasedness can hold for any sample size. Consistency is strictly a
large-sample property.
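
To make the distinction concrete, here is a small illustrative Python simulation (an addition to the notes, with an assumed normal population) of both properties for the sample mean:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 40.0, 10.0          # hypothetical population parameters

# Unbiasedness: across many repeated samples of a fixed (small) size,
# the average of the sample means is very close to the true mu.
sample_means = rng.normal(mu, sigma, size=(10_000, 30)).mean(axis=1)
print("mean of 10,000 sample means:", sample_means.mean())

# Consistency: a single sample mean approaches mu as n grows.
for n in (10, 1_000, 100_000):
    print(n, rng.normal(mu, sigma, size=n).mean())
```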
Suppose we want to ‘test the hypothesis’ that the true mean income of households
is equal to some value. For example, we might test the null hypothesis that it’s
equal to $41,000.
$$H_0: \mu_I = 41.0$$
$$H_1: \mu_I \neq 41.0$$
This is known as a two-sided alternative hypothesis. An example of a one-sided
alternative hypothesis is that this true parameter is greater than $41K.
II. Three Typical Approaches to Hypothesis Testing: Review.
1. The Interval Approach.
We computed earlier the 99% confidence interval for μI.
$$39.530 \;\le\; \mu_I \;\le\; 40.470$$
Therefore, if a hypothesized value does not lie within this interval, we can reject it. The confidence interval is also known as the Acceptance Region; a value not within this interval lies in the Rejection Region.
We might also want to say something about the different 'types' of mistakes that one can make in these hypothesis tests.
• 'Type I' Error. Rejecting a null hypothesis when it's true. We are not always going to be right:

Type I Error = Prob(Rejecting $H_0$ | $H_0$ is True) = $\alpha$

where this is a 'conditional probability', and $\alpha$ here is .01, the 'significance level' chosen.
• 'Type II' Error. Accepting a null hypothesis when it's false. Suppose our null had been that the mean household income was $40,250. This clearly lies within the confidence interval, so we would not reject the null. Yet the truth might be that it is $41,000.

Type II Error = Prob(Accepting $H_0$ | $H_0$ is False) = $\beta$
This probability has been traditionally assigned the Greek letter β (do not confuse
this β with β’s used in the regression model). One minus this value is called the
Power of the Test. The classical approach is to set α to be a small number and
minimise β (or maximise the power of the test given the confidence level).
2. The Significance Test Approach.
We know that the statistic computed from the sample follows a t distribution with
n-1 degrees of freedom.
$$t = \frac{\bar{I} - \mu_I}{S/\sqrt{n}}$$
Let’s take the earlier example. Suppose we want to know whether or not μI is
equal to 41.
$$t = \frac{40 - 41}{10/\sqrt{3000}} \approx -5.477$$
Can we reject the null? That depends on the significance level. Suppose again that we set $\alpha = .01$. Again, it's a two-sided test.
$$\text{Prob}(|t| \ge 2.576) = 0.01$$
Clearly, the absolute value of our computed t exceeds this critical value. We reject
the null that the true mean household income is $41,000.
What about another null, for example that the population mean is equal to $40,250?
Compute the t statistic as:
$$t = \frac{40 - 40.25}{10/\sqrt{3000}} \approx -1.369$$
Clearly, here we’d be unable to reject this null.
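
Both t statistics can be checked with a short Python sketch (an added illustration using the sample statistics above):

```python
import numpy as np

n, mean_income, s = 3000, 40.0, 10.0
se = s / np.sqrt(n)                 # estimated standard error of the mean

for null in (41.0, 40.25):
    t = (mean_income - null) / se
    print(f"H0: mu = {null}: t = {t:.3f}")
# -> t = -5.477 and t = -1.369; only the first exceeds the 1%
#    critical value (2.576) in absolute value.
```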
In econometrics, more often than not, we test the null that a parameter is equal to zero; in other words, whether or not an estimated coefficient is significantly different from zero. The terminology is that the estimated coefficient is 'statistically significant'. In the context of this example, this is written:

$$t = \frac{\bar{I}}{S/\sqrt{n}}$$

where the numerator is the estimated coefficient and the denominator is the estimated standard error.
3. The ‘P Value’
The problem with classical hypothesis testing is that one has to choose a particular
significance level. This is quite arbitrary. Conventionally, α is set equal to 0.10,
0.05 and/or 0.01.
The way around this is to compute the Exact Significance Level or P Value. This
is the largest significance level at which the null cannot be rejected (or the lowest
significance level at which the null can be rejected). Consider the earlier example.
Instead of specifying α and looking up the critical value, you plug in the computed
t statistic as the critical value and ‘look up’ the corresponding α.
$$\text{Prob}(|t| > 5.477) < .0001$$

In this case, 5.477 is significant at better than a .01% level.
A more typical result might be the following:
$$\text{Prob}(|t| > 2.196) = .0355$$
This says that this test is significant at a 3.55% level. At 5%, the null is rejected,
but at the 1% level, we cannot reject the null.
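
For reference, the exact p-value can be computed with scipy rather than read from a table. The sketch below (an added illustration) uses the degrees of freedom from the income example:

```python
from scipy import stats

df = 2999            # n - 1 from the income example

# Exact two-sided p-value for a computed t statistic; the p-value
# depends on the degrees of freedom as well as on t.
t = 5.477
p = 2 * stats.t.sf(abs(t), df)
print(f"p-value = {p:.2e}")   # far smaller than any conventional alpha
```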
Inference about the population variance:

The sample variance $S^2$ is an unbiased, consistent and efficient point estimator for $\sigma^2$. The statistic

$$\frac{(n-1)\,S^2}{\sigma^2}$$

has a distribution called Chi-square with $(n-1)$ degrees of freedom, if the population is normally distributed.
Inference about the ratio of two population variances:

The parameter to be tested is $\sigma_1^2/\sigma_2^2$. The statistic used is

$$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$

which has a distribution called F with degrees of freedom $n_1 - 1$ and $n_2 - 1$, if the two populations are normally distributed.
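
Under $H_0: \sigma_1^2 = \sigma_2^2$, the ratio $S_1^2/S_2^2$ follows this F distribution. As an added illustration (with made-up normal samples, not data from the notes), the variance-ratio test can be carried out as follows:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x1 = rng.normal(0.0, 2.0, size=25)   # two hypothetical normal samples
x2 = rng.normal(0.0, 2.0, size=30)

s1, s2 = x1.var(ddof=1), x2.var(ddof=1)

# Under H0: sigma1^2 = sigma2^2, S1^2/S2^2 ~ F(n1 - 1, n2 - 1)
F = s1 / s2
p = 2 * min(stats.f.sf(F, 24, 29), stats.f.cdf(F, 24, 29))
print(f"F = {F:.3f}, two-sided p-value = {p:.3f}")
```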
III. Hypothesis Testing in Linear Regression Models
Suppose we want to test the hypothesis in linear regression models,
$$H_0: \beta_1 = 0$$
$$H_1: \beta_1 \neq 0$$
Why is the test important? How to test it? Note that we typically put the value(s)
that we do not expect in the null hypothesis and put the values that we expect to
be true in the alternative hypothesis.
We note that

$$\hat{\beta}_1 \sim N\!\left(\beta_1, \text{Var}(\hat{\beta}_1)\right)$$

where $\text{Var}(\hat{\beta}_1) = \dfrac{\sigma^2}{\sum x_i^2}$.
So by standardisation, we have

$$\frac{\hat{\beta}_1 - \beta_1}{\sigma(\hat{\beta}_1)} \sim N(0, 1)$$
We can use this sampling distribution to make statistical inferences. However, when the standard deviation of the OLS estimator is unknown, this sampling distribution is not applicable. We can substitute $\sigma$ with its estimator and get a new statistic (with a new sampling distribution):
$$t = \frac{\hat{\beta}_1 - \beta_1}{s(\hat{\beta}_1)} = \frac{(\hat{\beta}_1 - \beta_1)\sqrt{\sum x_i^2}}{\hat{\sigma}} \sim t(n-K-1)$$

where $s(\hat{\beta}_1) = \dfrac{\hat{\sigma}}{\sqrt{\sum x_i^2}}$ (in the SLRM) and K is the number of independent variables. This is a t distribution with $n - K - 1$ degrees of freedom.
We’ve gone through the ‘general mechanics’ already, so let’s use a specific
numerical example to see how we would proceed. To illustrate, suppose we want
to put a ‘confidence interval’ around  1 and suppose we had a cross section of 10
households, and we wanted to estimate our linear consumption function. We get
the following:
$$\hat{C}_i = \underset{(6.414)}{24.455} + \underset{(.036)}{.509}\,DI_i$$

$$n = 10 \qquad df = 8 \qquad \hat{\sigma}^2 = 42.159 \qquad R^2 = .962$$
We want to place a confidence interval around the estimated slope coefficient. We need to choose a confidence level; suppose 95% ($\alpha = .05$), two-tailed. Looking it up in the table, the critical value is 2.306. Now recall the general expression for the confidence interval:
$$\text{Prob}\!\left[\hat{\beta}_1 - t_{\alpha/2}\,s(\hat{\beta}_1) \;\le\; \beta_1 \;\le\; \hat{\beta}_1 + t_{\alpha/2}\,s(\hat{\beta}_1)\right] = 1 - \alpha$$

Plugging in the values and estimates from above we get:

$$.509 - (2.306)(.036) \;\le\; \beta_1 \;\le\; .509 + (2.306)(.036)$$

$$.4268 \;\le\; \beta_1 \;\le\; .5914$$

(the endpoints reflect a standard error carried to more decimal places than the rounded .036 shown above).
We might also want to construct the 95% confidence interval for $\beta_0$. You should get the following:

$$9.664 \;\le\; \beta_0 \;\le\; 39.245$$
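
Both intervals can be reproduced with a short Python sketch (an added illustration using the rounded estimates above):

```python
from scipy import stats

# Estimates from the consumption-function example above
b1, se_b1 = 0.509, 0.036
b0, se_b0 = 24.455, 6.414
df = 8

t_crit = stats.t.ppf(0.975, df)     # 2.306 for df = 8
for name, b, se in (("beta1", b1, se_b1), ("beta0", b0, se_b0)):
    print(f"95% CI for {name}: [{b - t_crit*se:.3f}, {b + t_crit*se:.3f}]")
# Small differences from the notes reflect rounding of the standard errors.
```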
Now suppose I want to perform a two-sided test.
$$H_0: \beta_1 = 0.3$$
$$H_1: \beta_1 \neq 0.3$$
We want to know whether or not the true MPC is 0.3, where the alternative
hypothesis is that it’s something other than 0.3. Compute the t statistic from the
general formula above:
$$t = \frac{.509 - .3}{.036} = 5.806$$
which is significant at better than a 5% level (critical value given above as 2.306) and at the 1% level (critical value 3.355). Reject the null.
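
The same test in a Python sketch (an added illustration):

```python
from scipy import stats

b1, se_b1, df = 0.509, 0.036, 8

# Two-sided test of H0: beta1 = 0.3
t = (b1 - 0.3) / se_b1
p = 2 * stats.t.sf(abs(t), df)
print(f"t = {t:.3f}, p-value = {p:.4f}")   # t ~ 5.81: reject even at 1%
```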
[Figure: the density function for the t variable]
Now suppose I want to perform a one-sided test.
$$H_0: \beta_1 \le 0.4$$
$$H_1: \beta_1 > 0.4$$
We want to know whether or not the true MPC is less than or equal to 0.4, where
the alternative hypothesis is that it’s greater than 0.4.
Compute the t statistic:
$$t = \frac{.509 - .4}{.036} = 3.028$$
but note that the formula and procedure are identical to the two-sided test; only the rejection region changes, with all 5% of the area in the upper tail.
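
A sketch of the one-sided test (an added illustration; note the one-tailed critical value):

```python
from scipy import stats

b1, se_b1, df = 0.509, 0.036, 8

# One-sided test of H0: beta1 <= 0.4 against H1: beta1 > 0.4
t = (b1 - 0.4) / se_b1
t_crit = stats.t.ppf(0.95, df)   # all 5% of the area in the upper tail
p = stats.t.sf(t, df)            # upper-tail p-value
print(f"t = {t:.3f}, 5% critical value = {t_crit:.3f}, p = {p:.4f}")
```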
IV. Do NOT Overuse the t-Test
1. The t-test does not tell us whether the model is economically valid. Examples: regressing stock prices on the intensity of dog barking, or regressing the consumer price index in Singapore on the rainfall in the UK.
2. The t-test does not test the importance of an independent variable.
V. Further Use of t-Test: Testing a Linear Restriction
In this example, begin with a Cobb-Douglas production function:

$$Y_i = \alpha L_i^{\beta_1} K_i^{\beta_2} e^{\epsilon_i}$$

where:

• $Y_i$ = Output.
• $L_i$ = Labour input.
• $K_i$ = Capital input.

The natural logs of the variables can be written in linear form:

$$\ln Y_i = \beta_0 + \beta_1 \ln L_i + \beta_2 \ln K_i + \epsilon_i$$

where $\beta_0 = \ln \alpha$.
An example of a linear restriction would be the following:

$$H_0: \beta_1 + \beta_2 = 1$$

This is a test of 'Constant Returns to Scale'. Use the following t-test:

$$t = \frac{(\hat{\beta}_1 + \hat{\beta}_2) - (\beta_1 + \beta_2)}{SE(\hat{\beta}_1 + \hat{\beta}_2)} = \frac{(\hat{\beta}_1 + \hat{\beta}_2) - 1}{\sqrt{\widehat{Var}(\hat{\beta}_1) + \widehat{Var}(\hat{\beta}_2) + 2\,\widehat{Cov}(\hat{\beta}_1, \hat{\beta}_2)}}$$

Compute this t statistic. If its absolute value exceeds the critical value, reject $H_0$.
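
A minimal sketch of this restriction test (an added illustration: the coefficient and variance/covariance numbers below are hypothetical placeholders, not results from the notes):

```python
import numpy as np

# Hypothetical estimates for the log Cobb-Douglas regression; in practice
# these come from the fitted model (e.g. statsmodels' results.params and
# results.cov_params()).
b1, b2 = 0.70, 0.45
var_b1, var_b2, cov_b12 = 0.010, 0.008, -0.004

# t statistic for H0: beta1 + beta2 = 1 (constant returns to scale)
se_sum = np.sqrt(var_b1 + var_b2 + 2 * cov_b12)
t = ((b1 + b2) - 1.0) / se_sum
print(f"t = {t:.3f}")   # compare |t| with the t(n-K-1) critical value
```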
VI. Testing for Overall Insignificance of a MLR
In a MLR with K independent variables, you want to test:

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_K = 0$$

$H_1$: at least one of these $\beta$'s is not zero.

The t-test can't be used to test the overall significance of a regression model. In particular, we can't simply take the product of the individual tests: the same data are used to estimate all the coefficients, so the estimates are not independent of one another. It's possible for the coefficients to be individually equal to zero, and yet jointly they might be different from zero. Although $R^2$ or $\bar{R}^2$ measure the overall degree of fit of an equation, they are not a formal test.
Begin with Analysis of Variance (ANOVA). Typical table (shown here for K = 2 regressors):

Source   SS                             df           MSS
ESS      β̂₁Σyᵢx₁ᵢ + β̂₂Σyᵢx₂ᵢ           K            (β̂₁Σyᵢx₁ᵢ + β̂₂Σyᵢx₂ᵢ)/K
RSS      Σeᵢ²                           n − K − 1    Σeᵢ²/(n − K − 1)
TSS      Σyᵢ²                           n − 1
When $\epsilon_i$ is normally distributed, we can construct the following variable:

$$F = \frac{ESS/df_{ESS}}{RSS/df_{RSS}} = \frac{(\hat{\beta}_1\sum y_i x_{1i} + \hat{\beta}_2\sum y_i x_{2i})/K}{\sum e_i^2/(n-K-1)} = \frac{(n-K-1)\,R^2}{K\,(1-R^2)}$$

This is the ratio of two chi-square distributed variables, each divided by its degrees of freedom. It has an F distribution with K and $n - K - 1$ degrees of freedom.
If the computed F statistic exceeds some critical value, we reject $H_0$ that the slope coefficients are simultaneously equal to zero.

Note that the F statistic is closely related to $R^2$. If $R^2 = 0$, then $F = 0$; if $R^2 = 1$, then $F \to \infty$.
Example: Woody’s.
2
Yˆ i = 102,192  9075 N i + 0.355Pi  1.288I i , R 0.618, n  33
F=
29 ( .618 )
 15.64
3(1 - .618 )
Since this exceeds the critical value at a 5 percent significance level (2.93), we can reject $H_0$ that all slope coefficients are simultaneously equal to zero, i.e., that the model is overall insignificant.
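
The F statistic and its exact p-value can be reproduced as follows (an added illustration using the figures above):

```python
from scipy import stats

# Woody's example: R^2 = 0.618, n = 33, K = 3 slope coefficients
R2, n, K = 0.618, 33, 3

F = (n - K - 1) * R2 / (K * (1 - R2))
p = stats.f.sf(F, K, n - K - 1)
print(f"F = {F:.2f}, p-value = {p:.1e}")   # ~15.6 > 2.93: reject H0
```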
VII. Further Use of F-Test: Assessing the Marginal Contribution of Regressors

Practical question: How do we know if other explanatory variables should be added to our regression? How do we assess their 'marginal contribution'? Theory is often too weak. What do the data tell us?
Suppose we have a random sample of 100 workers. We first obtain these results:
$$\widehat{\ln W}_i = .673 + \underset{(.013)}{.107}\,S_i, \qquad R^2 = .405$$

where $W_i$ is the wage rate, and $S_i$ is the number of years of schooling or education completed (standard error in parentheses).
I want to know whether 'labour market experience' ($L_i$) should be added as a quadratic expression in this regression.

This involves 'polynomial regressions'. We couldn't discuss these earlier because they require more than one independent variable. This is a 'second-degree' polynomial (it includes X and X²). A 'third-degree' polynomial includes X, X² and X³.
We obtain these results:

$$\widehat{\ln W}_i = -.078 + \underset{(.016)}{.118}\,S_i + \underset{(.026)}{.054}\,L_i - \underset{(.001)}{.001}\,L_i^2, \qquad R^2 = .433$$

This means that the wage function is 'concave' from below in terms of log wages and experience (holding education constant): the negative coefficient on $L_i^2$ means log wages rise with experience at a decreasing rate.
No real estimation issues here; the model is still linear in the parameters. However, $L$ and $L^2$ will tend to be highly correlated.
To assess the incremental contribution of both $L_i$ and $L_i^2$, we use the following F test (i.e., $H_0: \beta_2 = \beta_3 = 0$):

$$F = \frac{(RSS_R - RSS_{UR})/m}{RSS_{UR}/(n-K-1)} = \frac{(R^2_{UR} - R^2_R)/m}{(1 - R^2_{UR})/(n-K-1)}$$

where

K = number of slope coefficients in the unrestricted regression;
$R^2_{UR}$ = coefficient of determination in the 'unrestricted' regression;
$R^2_R$ = coefficient of determination in the 'restricted' regression;
m = number of linear restrictions (m = 2 in this example).
$$F = \frac{(.433 - .405)/2}{(1 - .433)/96} = \frac{.014}{.00591} \approx 2.369$$
At a 5% significance level, the critical value is 3.10. Since our F statistic is less than this value, we can't reject $H_0$: experience and experience squared have an insignificant effect in our regression, and there is no statistical evidence for their inclusion.
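
A sketch reproducing this test (an added illustration using the $R^2$ values above):

```python
from scipy import stats

# Restricted (schooling only) vs unrestricted (adds L and L^2) results
R2_r, R2_ur = 0.405, 0.433
n, K, m = 100, 3, 2      # K slopes in the unrestricted model, m restrictions

F = ((R2_ur - R2_r) / m) / ((1 - R2_ur) / (n - K - 1))
crit = stats.f.ppf(0.95, m, n - K - 1)
print(f"F = {F:.3f}, 5% critical value = {crit:.2f}")  # 2.37 < ~3.09: can't reject
```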
Note that this F-test can be used to deal with a null hypothesis that contains
multiple hypotheses or a single hypothesis about a group of coefficients.
VIII. Questions for Discussion: Q5.11
IX. Computing Exercise: Q5.16