EC 2030
THE UNIVERSITY OF WARWICK
Summer Examinations 2001/2002
Economic Statistics and Econometrics
Time Allowed: 3 Hours, plus 15 minutes reading time during which notes may be made (on
the question paper) BUT NO ANSWERS MAY BE BEGUN.
Answer ALL questions in SECTION A, ANY THREE questions from SECTION B and ONE
question from SECTION C. Section A carries 28 marks in total and each of the other questions
is worth 18 marks.
Statistical Tables and a Formula Sheet are provided.
Approved pocket calculators are allowed.
Read carefully the instructions on the answer book provided and make sure that the particulars
required are entered on each answer book.
Section A
1. Define the following:
(a) power of a test.
(b) significance level.
2. Two independent random samples are denoted A and B. In group A, 26 observations
yielded a sample mean of 10 and a sample standard deviation of 3. In group B, a sample
of 22 observations yielded a sample mean of 13 and a sample standard deviation of 4. At the
5% significance level, test the hypothesis of no difference in the two means, assuming
that the underlying distributions are normal.
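As a rough check on the arithmetic in Question 2, the following minimal Python sketch computes a pooled-variance two-sample t statistic from the summary figures given. It assumes equal population variances (the pooled form). The critical value 2.013 is an approximate table value for 46 degrees of freedom, not something computed here, and the function name `pooled_t` is purely illustrative.

```python
import math

def pooled_t(n_a, mean_a, sd_a, n_b, mean_b, sd_b):
    """Pooled-variance two-sample t statistic (assumes equal population variances)."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df

# Summary statistics for groups A and B as stated in Question 2.
t_stat, df = pooled_t(26, 10, 3, 22, 13, 4)

# Assumed input: approximate two-sided 5% critical value for 46 d.f., read from tables.
t_crit = 2.013
reject_null = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f} on {df} d.f.; reject H0 at 5%: {reject_null}")
```

If equal variances were not assumed, a Welch-type statistic with separate variances would be the natural alternative; the pooled form is used here only because it matches the usual formula-sheet version.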
3. Calculate the approximate power of the t-test calculated in question 2, given that
$\mu_A - \mu_B = -3.0$. What is the Type I error for this test?
4. For the following regression model, estimated using annual data from 1940 to 1999,
$Y_t = 0.021 + \underset{(0.038)}{0.920}\,X_t + e_t$
(standard error in parentheses)
(a) Test at the 5% significance level that the slope coefficient is unity.
(b) What is the forecast for year 2000, given that $X_{2000} = 4.1$?
(c) A dummy variable $D_t$ is added to the equation, taking value unity in 2000 and
zero otherwise. The regression is re-estimated over the period 1940 to 2000
using the data point $X_{2000} = 4.1$, but data for $Y_{2000}$ is unavailable so the
researcher simply inputs a zero value for $Y_{2000}$. Interpret the coefficient on $D_t$.
5. Consider the following regression model:
$Y_t = \beta_1 + \beta_2 X_{1t} + \beta_3 X_{2t} + \varepsilon_t$   (1)
Rearranging equation (1) we get:
$Y_t - X_{2t} = \alpha_1 + \alpha_2 (X_{1t} - X_{2t}) + \alpha_3 X_{2t} + \varepsilon_t$   (2)
(a) Express each of the parameters $\alpha_1, \alpha_2, \alpha_3$ as a function of $\beta_1, \beta_2, \beta_3$.
(b) Hence or otherwise show how one can test the hypothesis $\beta_2 + \beta_3 = 1$ using
only equation (2).
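One way to see the link between the two parameterisations in Question 5 is to subtract $X_{2t}$ from both sides of equation (1) and regroup the regressors. The sketch below assumes the coefficients of equation (1) are written $\beta_1, \beta_2, \beta_3$ and those of equation (2) $\alpha_1, \alpha_2, \alpha_3$; these symbol names are an assumption made for illustration.

```latex
\begin{align*}
Y_t - X_{2t} &= \beta_1 + \beta_2 X_{1t} + (\beta_3 - 1) X_{2t} + \varepsilon_t \\
             &= \beta_1 + \beta_2 (X_{1t} - X_{2t}) + (\beta_2 + \beta_3 - 1) X_{2t} + \varepsilon_t
\end{align*}
```

Matching terms with equation (2) under these assumptions gives $\alpha_1 = \beta_1$, $\alpha_2 = \beta_2$ and $\alpha_3 = \beta_2 + \beta_3 - 1$, so a restriction that the two slope coefficients of equation (1) sum to one would correspond to $\alpha_3 = 0$, which an ordinary t-test on the $X_{2t}$ coefficient in equation (2) could address.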
6. Estimating a model by OLS for imports (m) against relative prices (domestic
prices/foreign prices) (p/p*), and real GDP (y) for the UK using quarterly data over the
period 1978:1 to 1997:4 yielded the results:
$\ln(m_t) = \underset{(0.042)}{0.045} + \underset{(0.125)}{0.5}\,\ln(p_t/p_t^*) + \underset{(0.201)}{0.6}\,\ln(y_t) + e_t$   (1)
$R^2 = 0.962$, SSE = 0.147, ln = natural log
(standard errors in parentheses)
(a) Calculate the power of the test (at the 5% significance level) that the coefficient
on the variable ln(y) is zero, given that the true coefficient is 0.3.
(b) The variable $\ln(p_t/p_t^*) \times D_t$, where $D_t = 1$ for $t$ = 1994:1 to 1997:4 and
$D_t = 0$ otherwise, is added to equation (1), which is then re-estimated. Interpret the
coefficients of this new equation.
7. Explain what the implications on the properties of the OLS estimators are for each
of the following:
(a) Omitted relevant variables.
(b) An outlier in one of the explanatory variables.
Section B
8. (a) The West Anglia Great Northern Railway (WAGN) and Chiltern Railways
(Chiltern) have both attempted to prevent recent rail maintenance work leading
to an increase in rail journey time. A random sample of 100 rail journeys was
taken for each company. For WAGN, the average journey time rose by 3
minutes, with a standard deviation of 15.5 minutes, whereas for Chiltern the average
journey time rose by 1.3 minutes, with a standard deviation of 16.1 minutes.
At the 5% significance level:
(i) Separately test whether WAGN and Chiltern have been successful in
preventing increased journey times. Write a sentence summarising the
findings of these tests.
(ii) Test whether WAGN has been more successful than Chiltern in
preventing an increase in journey time.
(b) In 1994 two American economists claimed that male catering staff earned more
than females performing the same job. A random sample of 50 female catering
staff was found to earn an average of $16,000, with a standard deviation of
$2,600, while a sample of 40 male catering staff was found to earn an average of
$17,300, with a standard deviation of $3,000.
(i) What is the power of a test at the 5% level of the American economists'
claim, given that male catering staff are truly paid $100 more than
female catering staff?
(ii) Diagrammatically represent the power and significance level of the test
conducted in (b)(i).
(iii) At the 10% level, is there a significant difference between the two groups
in the variance of earnings?
9. The following estimated equation was obtained by Ordinary Least Squares using
quarterly data for the period 1963:3-1992:4 inclusive:
y_t = 0.010  0.209 w_t  0.189 x_t  0.187 z_t + e_t
(0.072) (0.058) (0.088)
Regression Sum of Squares (SSR) = 0.0035, Error Sum of Squares (SSE) = 0.0157
(standard errors are given in parentheses)
(a) Test whether each of the coefficients is significantly different from zero at
both the 5% and 1% significance levels.
(b) Calculate (i) the coefficient of determination ($R^2$) and (ii) the standard error
($\hat{\sigma}$) of the regression.
(c) Test the significance of the regression.
(d) Test the hypothesis that the coefficient on $w_t$ is equal and opposite to that on
$z_t$, given that the covariance between the coefficients is -0.0015. Why might
the researcher want to impose this restriction?
(e) Given that $z_t = w_t^3$, test the hypothesis that the marginal response of $y_t$ to $w_t$ is
zero at $w_t = 1$, given that the covariance between the coefficients is -0.0015.
(f) Without assuming any particular relationship between the variables, calculate
the SSE of the following restricted version of the model:
$y_t = \beta_0 + \beta_1 w_t + \beta_2 x_t + e_t$.
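The goodness-of-fit quantities asked for in Question 9(b) and (c) follow directly from the two sums of squares. The minimal Python sketch below works them through under two stated assumptions: the sample size is obtained by counting quarters from 1963:3 to 1992:4 inclusive, and the model has four estimated parameters (a constant plus three slopes). The helper name `quarters_inclusive` is illustrative only.

```python
import math

def quarters_inclusive(start_year, start_q, end_year, end_q):
    """Number of quarterly observations between two dates, both ends included."""
    return (end_year - start_year) * 4 + (end_q - start_q) + 1

n = quarters_inclusive(1963, 3, 1992, 4)  # sample size implied by the stated period
k = 4                                     # assumed: constant plus three slope coefficients
ssr, sse = 0.0035, 0.0157                 # regression and error sums of squares as given

sst = ssr + sse                           # total sum of squares
r_squared = ssr / sst                     # coefficient of determination
sigma_hat = math.sqrt(sse / (n - k))      # standard error of the regression
f_stat = (ssr / (k - 1)) / (sse / (n - k))  # F statistic for overall significance

print(f"n = {n}, R^2 = {r_squared:.4f}, sigma_hat = {sigma_hat:.5f}, F = {f_stat:.3f}")
```

The F statistic would then be compared with a tabulated critical value on (k - 1, n - k) degrees of freedom, which is left to the statistical tables rather than computed here.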
10. A researcher collected seasonally adjusted quarterly data for the UK over the period
1979:1 to 2001:4 on the three variables
Lm = Natural log of real broad money (M3)
Ly = Natural log of real Gross Domestic Product (GDP)
r = 3 month Treasury bill rate
Table 1 contains the results, as reported by PcGive, for a money demand equation for
the UK using quarterly data over the period 1980:1-1999:4.
(i) Provide answers for the 6 spaces, marked ??(x) in Table 1.
(ii) Using the results of Table 1 and Figure 1, explain how you would proceed from
this initial general regression model to a more appropriate and parsimonious
model.
(iii) Explain what other tests you might think of using to test the validity of your
preferred model.
TABLE 1
EQ(1) Modelling Lm by OLS
Variable      Coefficient    t-value    t-prob
Constant           1.9304      1.961     0.050
Ly                 0.1642      1.462     ??(a)
Ly_1               0.2287      1.872     0.061
Ly_2               0.2056      1.553     0.120
Ly_3              -0.1056      0.337     ??(b)
r                 -0.0131     -1.351     0.177
r_1               -0.0123     -1.352     0.176
r_2                0.0098      0.961     0.337
r_3               -0.003      -0.303     0.762
Lm_1               0.4532      2.997     0.003
Lm_2              -0.1921     -1.296     0.195
Lm_3               0.0765      0.617     0.537
sigma                 ??(c)        RSS                 0.02886
R^2                   ??(d)        F(11,68) = ??(e) [0.000]**
log-likelihood        -112.858     DW                  2.16
no. of observations   80           no. of parameters   12
mean(y)               4.4986       var(y)              0.00201
AR 1-4:        F(4,64)  = 1.0382  [0.394]
ARCH 4:        F(4,60)  = 2.283   [0.071]
Normality:     Chi^2(2) = 15.197  [??(f)]***
Hetero Test:   F(22,45) = 1.2019  [0.294]
RESET:         F(1,67)  = 0.03373 [0.855]
Figure 1: [Four-panel PcGive graphic: Lm × Fitted; scaled residuals, r:Lm (scaled); residual density, r:Lm, against N(0,1); residual autocorrelation function, ACF-r:Lm]
11. A model is specified as
$y_t = \alpha + \beta_1 x_{1,t} + \beta_2 x_{1,t-1} + \beta_3 x_{2,t-1} + \beta_4 x_{2,t-2} + \beta_5 y_{t-2} + \varepsilon_t$   (1)
Estimating this model using quarterly data over the period 1983:1-2000:4 resulted in
y_t = 0.424  0.624 x_{1,t}  0.361 x_{1,t-1}  0.618 x_{2,t-1}  0.381 x_{2,t-2}
     (0.201) (0.168)        (0.202)          (0.189)          (0.229)
       0.576 y_{t-2} + e_t
      (0.152)
$R^2 = 0.705$, Error sum of squares (SSE) = 39.172
(Standard errors in parentheses)
(a) Calculate the response in y to a unit increase in x1 and x2, (i) contemporaneously,
(ii) after 1 period, (iii) after 2 periods, (iv) after 3 periods, (v) in the long run.
(b) Test the hypothesis that the long run response of y to x1 is $-0.5$, given that
$\mathrm{cov}(b_1, b_2) = 0.020$, $\mathrm{cov}(b_1, b_5) = 0.01$ and $\mathrm{cov}(b_2, b_5) = 0.009$.
(c) Write out the restricted model that you would estimate in PcGive if you wanted
to impose the two hypotheses that the long-run coefficient on x1 is $-1$ and the
long-run coefficient on x2 is zero. How many parameters would you estimate in
the restricted model?
(d) The Durbin-Watson statistic for the estimated equation (1) is 1.316. Based on
this information what do you conclude about the model (1)?
(e) Describe the regressions you would run in order to construct the AR 1-4 test
reported by PcGive. The p-value on the resultant F-statistic was 0.02. In light of
this information and that in (d) how would you suggest modifying equation (1)?
12. An equation to determine happiness from a random sample of 600 employed individuals
is estimated based on the following model
$\ln(\mathrm{Happy}_i) = \alpha + \beta_1 \ln(\mathrm{Income}_i) + \beta_2 \mathrm{Female}_i + \beta_3 \mathrm{Married}_i + \beta_4 \mathrm{SOCI}_i$
$\qquad + \beta_5 \mathrm{SOCII}_i + \beta_6 \mathrm{SOCIII}_i + \beta_7 \mathrm{SOCIV}_i + \beta_8 \mathrm{Educ}_i + \beta_9 \mathrm{Age}_i + \varepsilon_i$   (1)
where,
Happy = Happiness score (out of 100)
Income = Annual income (£000s)
Educ = Number of years in education
Age = Age
Female_i = 1 if female, 0 if male
Married_i = 1 if married, 0 otherwise
SOCI_i = 1 if Professional, 0 otherwise
SOCII_i = 1 if Managerial, 0 otherwise
SOCIII_i = 1 if Skilled, 0 otherwise
SOCIV_i = 1 if Semi-Skilled, 0 otherwise
SOCV_i = 1 if Unskilled, 0 otherwise
Estimating equation (1) by OLS resulted in an error sum of squares (SSE) of 6.37 and a
regression sum of squares (SSR) of 2.72.
(a) Excluding the variables SOCI, SOCII, SOCIII and SOCIV from equation (1) and
estimating the resultant equation by OLS, yielded an SSR of 2.48. Test the joint
significance of these variables.
(b) The Standard Industrial Classification (SIC) defines nine single digit industry
types. Explain how you would include variable(s) to control for the different
industry types in equation (1). Estimating equation (1) by OLS with the inclusion
of the industry variable(s) the SSE fell to 6.08. Test the significance of this
(these) variable(s).
(c) Equation (1) is re-estimated separately for the 278 males and the 322 females in
our sample. Estimating these two models yielded SSEs of 3.13 and 3.02,
respectively. Comparing this model with equation (1), carefully explain what
null hypothesis you are testing, and test this hypothesis.
(d) Adding three interacted dummy variables to equation (1), (Female)×(Educ),
(Female)×(Age) and (Female)×ln(Income), the SSE fell to 6.18. Test the joint
significance of these additional variables.
(e) Compare the model estimated in (d) with the two models estimated in (c). Write
out the restrictions you must impose on the model described in (c) to produce the
model in (d) and test the restrictions.
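Several parts of Question 12 compare a restricted and an unrestricted regression through their error sums of squares. The minimal Python sketch below shows that F statistic for parts (a) and (d) only. It assumes equation (1) has ten estimated parameters (a constant plus nine slopes) and, for part (a), that the total sum of squares is unchanged when regressors are dropped, so the restricted SSE can be recovered as SST minus the restricted SSR. The function name `f_statistic` is illustrative.

```python
def f_statistic(sse_restricted, sse_unrestricted, n_restrictions, df_unrestricted):
    """F statistic for linear restrictions, built from restricted and unrestricted SSEs."""
    numerator = (sse_restricted - sse_unrestricted) / n_restrictions
    denominator = sse_unrestricted / df_unrestricted
    return numerator / denominator

n = 600                       # sample size stated in the question
k = 10                        # assumed: constant plus nine slope coefficients in equation (1)
sse_full, ssr_full = 6.37, 2.72
sst = sse_full + ssr_full     # total sum of squares, the same for every model of ln(Happy)

# Part (a): dropping the four social-class dummies leaves SSR = 2.48.
sse_without_soc = sst - 2.48
f_part_a = f_statistic(sse_without_soc, sse_full, 4, n - k)

# Part (d): adding three Female interactions lowers the SSE to 6.18,
# so equation (1) is now the restricted model.
f_part_d = f_statistic(sse_full, 6.18, 3, n - (k + 3))

print(f"F for (a) = {f_part_a:.3f}, F for (d) = {f_part_d:.3f}")
```

Each statistic would be compared with a tabulated F critical value on the corresponding degrees of freedom; parts (b), (c) and (e) follow the same pattern once the restricted and unrestricted models, and their parameter counts, have been identified.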
13. Consider the following model:
$y_t = \alpha + \beta_1 \ln(x_{1t}) + \beta_2 x_{2t} + u_t, \quad t = 1, \ldots, n$   (1)
(a) Interpret the coefficients in equation (1).
(b) Interpret the coefficient on the variable $\ln(x_{1t})$ in (1) if this variable is multiplied
by 100.
Discuss the consequences of estimating equation (1) by OLS, in terms of the OLS
estimators, $b_1$ and $b_2$, and the standard errors of these estimators, in each of the following
circumstances:
(c) $u_t$ is serially correlated, when $x_{2t} = y_{t-1}$.
(d) $x_{2t} = [\ln(x_{1t})]^2$
(e) $x_{2t}$ is unobservable and is therefore excluded from the estimated equation.
(f) $V(u_t) = \sigma_1^2$ for $t = 1, \ldots, n_1$ and $V(u_t) = \sigma_2^2$ for $t = n_1 + 1, \ldots, n$.
How could you improve upon OLS in this case?
(g) $x_{2t} = 0$ for $t = 1, \ldots, 16, 18, \ldots, n$ and $x_{2t} = 1$ for $t = 17$.
Section C
14. Blundell et al. (2000) show that average earnings for both males and females who went
to university are markedly greater than those of individuals who did not attend university.
Discuss this statement in the light of the statistical evidence on the returns to a university
degree.
15. Econometric models enable us to test competing economic theories against actual
data. Economic models which prove to be data-inconsistent can then be discarded and
more appropriate models developed. Discuss.
(End)