Econometric Methods - University of Kent

EC821
Econometric Methods
2013/14
SCHOOL OF ECONOMICS
EC821 Econometric Methods
Staff
Module convenor: Dr Yu Zhu
Office: Keynes B1.05
Email: y.zhu-5@kent.ac.uk
Teaching Assistant: Ivan Mendieta-Munoz (iim3@kent.ac.uk)
Teaching information
Teaching period: Autumn Term
Teaching pattern: One two-hour lecture/seminar per week and a one-hour computer practical per week
Hours of study
Contact hours: 33
Private study hours: 117
Total study hours: 150
Assessment
Task
Weighting
Test/exam date
Class Test
20%
Computer Based Coursework Project
20%
N/A
Exam
60%
May/June 2014
Coursework
submission date
N/A
N/A
Coursework submission policy
All coursework must be submitted by the deadline stipulated by the module convenor, as listed
above, to the School of Economics General Office, Mg.14 Keynes College. All coursework should be
accompanied by a completed cover sheet.
No extensions to submission deadlines are granted. If you miss the deadline and submit the
coursework late, you must also submit a concessions form for late submission, available from the
Social Sciences Faculty Office, www.kent.ac.uk/socsci/studying/undergrad/concessions.htm
UNIVERSITY OF KENT
SCHOOL OF ECONOMICS
EC821
ECONOMETRIC METHODS
MODULE DOCUMENTS
September 2013
Module Convener: Yu Zhu
Notes: This document contains the basic module materials. Additional handouts may be distributed at the
appropriate lectures, seminar classes and computer practicals respectively. The aim is to make the module
materials easily accessible to participants. If you wish to print parts of this document you can do so from the pdf
file. Note that you can print several pages on one A4 sheet by selecting a suitable option on the print menu.
Module Documents
Contents
Module outline, syllabus and reading
Coursework Assessment
Class Planner
Class Exercises
Computer Practicals
2012/13 Class Test
2011/12 Class Test
2010/11 Class Test
2012/13 Exam Paper
2011/12 Exam Paper
2010/11 Exam Paper
Statistical Tables
Notes
1 MODULE OUTLINE
Introduction
This module aims to study basic single equation econometric techniques in an intuitive and practical way to
develop your understanding and ability to apply econometric methods.
You will develop an understanding of the conventional linear regression model and the problems associated with
the application of regression methods to economic modelling. The module is concerned with the application of
econometric methods, with little emphasis on the mathematical aspects of the subject (which may be studied in
other modules). The microcomputer software package STATA will be used for practical work throughout this
module, both as a means of providing realistic applications of the theory developed in lectures and to give you
experience in the use of such software as a preparation for your own empirical research.
No previous knowledge of computing or econometrics is required.
Aims
The module aims:
• to develop students' understanding of, and ability to apply, quantitative economic methods
• to follow an intuitive approach through practical examples and practical classes, using STATA
• to give participants the ability to critically evaluate empirical literature
• to contribute to the students' ability to carry out empirical research
Learning Outcomes
By the end of the module participants should be able to:
• understand the nature of economic and econometric models
• apply least squares estimation methods using STATA
• perform and interpret the results of specification tests
• evaluate model adequacy using diagnostic tests and other criteria
• understand simultaneous equation methods
• undertake unsupervised practical work using STATA
• interpret the empirical economic research of others and critically evaluate the empirical literature
• analyse and report in writing on their own and others’ empirical economic results.
Skills
This module contributes substantially to subject specific skills acquired across all MSc programmes. Empirical
evaluation of economic models is crucial to the study and application of economics. By the end of this course you
should acquire the skills and understanding to read and evaluate the empirical literature in economics and to carry
out your own empirical research.
As regards general and transferable skills, the module will develop or reinforce students’ skills in a number of
different areas. In addition to technical and research skills, they will:
• develop their ability to utilize modern computing resources to access and acquire data from the Internet
(and other available sources) and utilize standard Office based PC software (currently Microsoft) to
generate written reports and undertake oral presentations
• acquire the ability to undertake modelling of economic behaviour and use statistical software
• develop and reinforce skills in numeracy and problem solving from the interpretation and manipulation of
empirical economic models
• improve their skills in communication and team work in making group presentations in class
• present economic arguments orally as well as in written form
This module also contributes to most of the intellectual and transferable skills of the MSc programmes.
If you need help in study skills you may ask for advice from the lecturer or get assistance from the Student
Learning Advisory Service.
The Economics Graduate Handbook gives information on support available through the Student Learning
Advisory Service, which is part of the Unit for the Enhancement of Learning and Teaching, and through the
English Language Unit. You should read this handbook carefully and make full use of these services. All students
should visit the Student Advisory Service to see what it offers in terms of advice and literature on essay writing,
examination preparation, time management etc.
Module Administration
Module Convener: Yu ZHU, Keynes B1.05, x 7438, email yz5@kent.ac.uk
Timetable:
Lecture/seminar: Tuesday 11 am - 1 pm, KS23
Computer practical: Monday 11.05am – 11.55am, KSA1
Consultation hours: Tuesday 3-4 pm and Thursday 4-5 pm
Teaching Methods
There will be a two-hour lecture/seminar session (22 hours in total) and one computer practical (11 hours in total)
per week. The lectures introduce the module material and provide an overview of the principles of basic
econometric methods. Applications of these techniques are conducted in computing workshops using simulated or
real world data. Seminars will be used to facilitate discussion of computer and class exercises and for student
presentations. The seminar programme improves the analytical abilities of students, their understanding of the
module material and their communication skills. The seminars also give students the opportunity to show their
understanding of the module material and ask questions about topics they are not sure about. Advice and feedback
on seminar communication skills are also given. The lectures and computer workshops are designed to improve
the analytical and problem solving skills of students, and develop their ability to apply their knowledge and
understanding of econometric issues to simulated and real world data. Throughout the module, emphasis is put on
the need for students to improve their own learning skills and academic performance. This is achieved through
feedback on student work and academic guidance on private study.
The lecture is on Tuesdays, 11am-1pm. Normally about 1.5 hours will be devoted to the lecture material, the
remaining time used to discuss computer and class exercises and for student presentations. The computer sessions
are on Monday, 11.05 – 11.55 in the Terminal Room KSA1.
You are also expected to see the lecturer out of class hours if you have any difficulties with the material or
exercises and if you have any other problems relating to the module.
Study Methods
An effective way to study this subject is to regularly attempt questions supplied as part of the module materials,
from textbooks or from past examination papers and to read what are often only relatively short sections of the
textbooks.
You will be given weekly exercises, some of which will be based on the output from the computer sessions. It is
extremely important to attempt these exercises prior to the class at which solutions are discussed to test your
understanding of the material as the module progresses. The module is cumulative, in the sense that understanding
of later parts of the syllabus depends on a thorough grasp of earlier material. The exercises enable you to
test your grasp of the concepts and give a guide to areas in which to consult the texts or the lecturer.
You should devote 10 hours per week in the Autumn term to this module. This means that, in addition to teaching
hours, you should spend around 7 hours a week on private study during term time. Including the examination term, you should devote
around 150 hours to this module. A substantial proportion of your study time can be spent on the class and
computer exercises, problems from the textbooks and associated reading.
The solutions to exercises will be discussed in the Tuesday lecture/seminar session and if you have any difficulty
in completing exercises please see the lecturer for help and clarification as soon as possible. All staff have
consultation hours during which they are available to see students - the times are posted on their office
doors and on the economics web pages. You may also e-mail the lecturer with simple questions.
Computer Practicals
In the computing practical classes you will estimate models which illustrate and develop the lecture material, and
you will gain experience in the use of microcomputers and econometric software. The results from these practicals
will be discussed in the lecture/class sessions, so you should bring your printed results to the lectures. A few of the
exercises will be from the recommended textbook so that you have access to the textbook discussion of the topic.
Although the module uses STATA, there are other programmes available on the networked computers which are
useful for both professional and vocational purposes. In particular you should become familiar with Word, a
word-processing programme, since this will be used for writing essays and your dissertation. Introductory documents are
available from the reception desk in the Computer Laboratory and the Computing staff run introductory courses
during the year which you may attend. Also familiarity with a spreadsheet programme (for example Excel) is often
expected by employers in the private and public sectors, and for data entry and manipulation such programmes can
be extremely useful both in this module and in your dissertation work. Again introductory courses and
documentation are available and the lecturer can offer assistance.
During the module you will also be introduced to the World Wide Web and the sources of information on it of
particular interest to economists. This includes economic databases (such as the Penn World Tables) which might
be useful for your dissertation.
Assessment
The final mark for the present module is made up of 40% of the coursework plus 60% of the exam mark. The
coursework is in two equally weighted parts; the first is based on a class test in week 8 which tests students’ use
and knowledge of the basic single-equation econometrics part of the module. The computer-based coursework
project to be submitted by the end of the Autumn Term assesses the writing, modelling, literature, computing,
interpretation and empirical research development learning outcomes. The two-hour examination consists of two
questions from a choice of six. The exam is designed to test and develop the non-computing and non-oral skills
and learning outcomes identified earlier.
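As an illustration of how the weights above combine, the following minimal sketch computes an overall mark from three component marks. It uses Python purely as a calculator (the module itself uses STATA); the function name and the example marks are hypothetical, and only the 20%/20%/60% weights come from this document.

```python
# Weights stated in the Assessment section:
# coursework 40% (two equally weighted parts), exam 60%.
COURSEWORK_WEIGHT = 0.40
EXAM_WEIGHT = 0.60


def final_mark(class_test, project, exam):
    """Return the overall module mark from three component marks out of 100."""
    coursework = (class_test + project) / 2  # the two parts are equally weighted
    return COURSEWORK_WEIGHT * coursework + EXAM_WEIGHT * exam


# Hypothetical marks, for illustration only.
example = final_mark(class_test=60, project=70, exam=65)
print(f"Final mark: {example:.1f}")
```

Each coursework component therefore contributes 20% of the final mark, so a weak class test can be offset by the project or the exam but not ignored.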
The word limit for the computer-based coursework project is 1,500 words, plus an appendix up to 5 pages long
containing summary statistics and estimation results. The work should be submitted to the Economics General
Office no later than 12.00 on Friday 24th January 2014. In fairness to those who meet the due date and time, no
work will be accepted after this time and a zero grade will be recorded unless there are acceptable, documented
medical or other reasons for late submission. You are advised to begin your work for the assignment well before
the end of term.
Reading
The core text for this module is:
• Wooldridge, J.M., 2013, Introductory Econometrics – A Modern Approach, South-Western, 5th
edition (International Edition).
All students should either buy a copy or ensure they have easy access to it since the module will follow the text
quite closely.
In addition, you will find the following book very useful, especially for the computing classes:
• Baum, C.F., 2006, Introduction to Modern Econometrics Using STATA, STATA Press, ISBN-10: 1-59718-013-0.
The syllabus for the module is also covered adequately by many textbooks, of which the following are suitable.
You may like to refer to one or more of these for some topics. Guidance will be given in lectures. References
might also be made to journal articles which both illustrate the material and link to other modules. Multiple copies
of all texts are in the library, some in the short loan collection.
• Kennedy, P., 2008, A Guide to Econometrics, 6th edition, John Wiley and Sons Ltd.
• Mukherjee, C., White, H. and M. Wuyts, 1997, Econometrics and Data Analysis for Developing
Countries, Taylor & Francis Book Ltd, paperback.
• Gujarati, D., 2003, Basic Econometrics, 4th edition, McGraw-Hill.
• Dougherty, C., 2002, Introduction to Econometrics, 2nd edition, Oxford University Press.
Although there are multiple copies of all the above books (some are of earlier editions) in the Library, if you have
any difficulty obtaining the reading, either from the library or the bookshop, please let the lecturer know
immediately.
Problems
We hope you find your study of this module interesting and productive. If you have any problems
or suggestions to make about the subject matter, the organisation of the module or any other
issues, the lecturer would like to hear from you. Alternatively, you can talk in confidence to your
Director of Studies or your Staff/Student Liaison Committee representatives.
SYLLABUS
The references given for each topic are alternatives and it is not essential to read more than one
reference, although you may find it helpful to do so. More detailed section and page references to the core
texts will be given during lectures.
1. The Linear Regression Model
1.1 The "Classical" Assumptions
1.2 Estimators and their properties
1.3 Simple linear regression
1.3.1 OLS Estimators
1.3.2 Predicted Values and Residuals
1.3.3 Interpretation of OLS Estimators
1.3.4 Goodness of Fit
1.3.5 Elasticities
1.3.6 Some Non-linear Functions and Elasticities
1.4 Multiple regression
1.4.1 Introduction: 3-variable regression model
1.4.2 Interpretation of coefficients of multiple regression
1.5 Recovering estimation results and presenting regression estimates
1.6 Properties of Ordinary Least Squares
1.7 Inference
1.7.1 Standard errors and t-ratios
1.7.2 Hypothesis testing: some practical aspects
1.7.3 Tests of linear restrictions - F-tests
Reading
• Wooldridge (2009), Chapters 2, 3.1-3.2 & 4
• Baum, Chapters 2, 3 & 4
• Kennedy, Chapters 2, 3 & 4
• Mukherjee, Chapters 4 & 5
• Gujarati, Chapters 2, 3, 4, 5, 7 and 8
• Dougherty, Chapters 2, 3, 5 & 6 (section 5)
2. Extensions of the Linear Regression Model
2.1 Dummy Variables
2.1.1 Qualitative and seasonal dummy variables
2.1.2 Slope dummies
2.2 Omitted variable bias (underfitting)
2.3 Non-linear models
2.4 Multicollinearity
Reading
• Wooldridge (2009), Chapters 6, 7, 3.3-3.5
• Baum, Chapters 5, 7.1-7.3
• Kennedy, Chapters 15, 5, 6 & 12
• Mukherjee, Chapter 6
• Gujarati, Chapter 9, Chapter 6, Chapter 13 (Sections 1-5)
• Dougherty, Chapters 4, 5 (section 5), 6 (sections 1-4), 9
3. Failure of Classical Assumptions
3.1 Autocorrelation
3.2 Heteroscedasticity
3.3 Non-normality
3.4 Misspecification and diagnostic tests
Reading
• Wooldridge (2009), Chapters 10, 12, 8, 9
• Baum, Chapters 6, 7.4
• Kennedy, Chapters 7-10
• Mukherjee, Chapters 7 & 11
• Gujarati, Chapters 11, 12 and 13
• Dougherty, Chapter 7
Class Test (Tuesday, Week 8)
4. Instrumental Variable Estimation
4.1. The IV estimator (with a Single Regressor and A Single Instrument)
4.2. The General IV Regression Model
4.3. Errors in Variables
4.4. Testing for Errors in Variables or Exogeneity (the Hausman Test)
4.5 Checking Instrument Validity
4.6. Where Do Valid Instruments Come From?
Reading
• Wooldridge (2009), Chapter 15
• Baum, Chapter 8
• Kennedy, Chapter 9
• Mukherjee, Chapters 13 & 14
• Gujarati, Chapters 18-20
• Dougherty, Chapter 10
5. Simultaneous Equation Models
5.1. The Seemingly Unrelated Regressions (SUR) Models
5.2. Simultaneous-equation Models
5.3. The Simultaneous-equation Bias
5.4. The Identification Problem
5.5. The Estimation of Structural Equations
Reading
• Wooldridge (2009), Chapters 16 & 15
• Baum, Chapter 8
• Kennedy, Chapter 11
• Mukherjee, Chapters 13 & 14
• Gujarati, Chapters 18-20
• Dougherty, Chapter 10
2 COURSEWORK ASSESSMENT
The assessment exercise is in two parts, each contributing 20% to the 40% coursework contribution.
1. The first part is a class test in Week 8 (see the Class Planner on the following page) in which you
will answer questions of a similar type to an examination question. The aim is to give some
practice in answering quantitative questions under exam conditions, as well as testing subject
specific knowledge and skills as stated in the module outline. The work will be marked and
returned by Week 10.
2. The second part is a small empirical project. You will be given a dataset. You will be expected to
select and estimate a model, interpret the results and evaluate the adequacy of your model. The
work will be assessed on the quality of your interpretation and evaluation of your chosen model,
not on how good the results are (e.g. the size of R2). However we do expect more than a simple
bivariate static regression. Chapter 19 of Wooldridge (2013) offers a nice guide on how to carry
out an empirical project.
The word limit for the computer-based project is 1,500 words, plus an appendix up to 5 pages long
containing summary statistics and estimation results. The work should be submitted to the
Economics General Office no later than 12.00 on Friday 24th January 2014. In fairness to those
who meet the due date and time, no work will be accepted after this time and a zero grade will be
recorded unless there are acceptable, documented medical or other reasons for late submission.
3 CLASS PLANNER
Each week you will complete exercises and other questions for the lecture/seminar classes. Over the
course of the term you will make at least one (group) presentation. You may make a note of what you
have to do each week on the following timetable grid. To allow flexibility, the lecturer will set the
tasks as term progresses, usually announcing in the lecture what is to be done for the following week
(and emailing a reminder).
Week   Seminar work and presentations
1      No EC821 teaching (intensive math course).
2
3
4
5
6
7
8      CLASS TEST
9
10
11
12     COMPUTER-BASED COURSEWORK PROJECT
4 CLASS EXERCISES
These exercises and those from the Computer Exercises are for class discussion. These may be supplemented by
questions from the textbooks and from past class test or examination questions. You will be asked to attempt
specific questions each week for discussion in the next class. Make a note of what you are required to do on the
Class Planner. Do not be discouraged if you cannot complete some exercises since it is normally the case that
students have difficulty in doing so at the first attempt.
If you are unable to complete some exercises, do see the lecturer for help after you have read the relevant
sections of the textbooks, seek clarification in class or discuss solutions with fellow students. The aim is to help
you test your understanding, to guide you in your reading and to provide practice in the type of questions you
may expect in the examination.
Q1.
a) Suppose you are asked to conduct a study to determine whether deficiency in English leads to lower
wages for immigrants.
Suppose you are given observational data on a large sample of the working-age immigrants in the UK
with information on their native language (state or private) and country of birth. Would you expect a
positive or negative correlation between English deficiency and wages?
b) Would a negative correlation necessarily mean that not being a native English speaker causes lower
wages? Explain.
c) If you could conduct any experiment you want, what would you do? Be specific.
Q2.
a) What is meant by the statement that an estimator is unbiased?
b) What is the difference between an endogenous variable and an exogenous variable?
c) Under what assumptions is ordinary least squares an unbiased estimator?
Q3.
Estimates of a model for the demand for Maltese exports (data from Mukherjee et al.) are given below.
LE = log of Maltese exports (US, current prices)
LWD = log of a measure of world demand
LP = log of the price of Maltese exports
Ordinary Least Squares Estimation
*******************************************************************************
Dependent variable is LE
27 observations used for estimation from 1963 to 1989
*******************************************************************************
Regressor      Coefficient    Standard Error    T-Ratio[Prob]
INT            -3.6341        2.7185            -1.3368[.194]
LWD            .41156         .59068            .69676[.493]
LP             1.4300         .28199            5.0710[.000]
*******************************************************************************
R-Squared                     .88289
R-Bar-Squared                 .87313
S.E. of Regression            .29674
F-stat. F( 2, 24)             90.4691[.000]
Mean of Dependent Variable    5.2197
S.D. of Dependent Variable    .83311
Residual Sum of Squares       2.1133
Equation Log-likelihood       -3.9190
Akaike Info. Criterion        -6.9190
Schwarz Bayesian Criterion    -8.8627
DW-statistic                  .50400
a) Interpret the coefficients of LWD and LP. Comment on their signs.
b) Test the hypothesis that the true parameter of LWD is equal to one.
c) Test the significance of LWD.
d) Test the hypothesis that the true parameter of LP is zero.
e) Interpret the value of R-Squared.
f) Examine the graph below. Does it look as though the classical assumptions are met?
[Figure: Plot of Residuals and Two Standard Error Bands. Vertical axis: residuals, from -0.8 to 0.6; horizontal axis: Years, 1963 to 1989.]
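For parts b) to d) of Q3, the t-ratios can be formed directly from the reported coefficients and standard errors. The sketch below is one way to set up the arithmetic; it uses Python as a calculator rather than STATA, and the helper name t_stat is an illustrative choice, not something taken from the module materials. The degrees of freedom are 27 observations minus 3 estimated parameters.

```python
# Coefficients and standard errors copied from the Q3 output.
coef_lwd, se_lwd = 0.41156, 0.59068
coef_lp, se_lp = 1.4300, 0.28199
degrees_of_freedom = 27 - 3  # observations minus estimated parameters


def t_stat(estimate, std_err, hypothesised=0.0):
    """t-ratio for H0: true parameter equals the hypothesised value."""
    return (estimate - hypothesised) / std_err


t_lwd_one = t_stat(coef_lwd, se_lwd, hypothesised=1.0)  # part b)
t_lwd_zero = t_stat(coef_lwd, se_lwd)                   # part c)
t_lp_zero = t_stat(coef_lp, se_lp)                      # part d)

print(f"df = {degrees_of_freedom}")
print(f"H0: beta_LWD = 1  ->  t = {t_lwd_one:.4f}")
print(f"H0: beta_LWD = 0  ->  t = {t_lwd_zero:.4f}")
print(f"H0: beta_LP  = 0  ->  t = {t_lp_zero:.4f}")
```

Each statistic is compared in absolute value with the two-sided 5% critical value of the t distribution with 24 degrees of freedom (about 2.06 from the statistical tables). The last two values should reproduce the T-Ratio column of the output up to rounding.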
Q4.
A model of the demand for wine uses the following variables:
LW = the natural logarithm of sales of wine
LY = the natural logarithm of real per capita income
LP = the natural logarithm of the price of wine
S2, S3 and S4 are seasonal dummy variables for quarters 2,3 and 4 respectively.
The data set has 44 quarterly observations for the period 1980 quarter 1 to 1990 quarter 4.
a) Interpret the coefficient estimates for Model I. Which coefficients are significantly different from zero
at a 5% significance level? Interpret the value of R2 for this model.
b) Model I was estimated for the period 1980Q1 to 1985Q4 and the Chow test gave a value of 6.9. What
does this show?
c) Three seasonal dummy variables were added to give Model II.
(i) Interpret the coefficients of these variables. In which quarter are wine sales estimated to be
highest?
(ii) Test the joint significance of the dummy variables.
TABLE 1: RESULTS FOR MODEL I
Ordinary Least Squares Estimation
******************************************************************************
Dependent variable is LW
44 observations used for estimation from 80Q1 to 90Q4
******************************************************************************
Regressor      Coefficient    Standard Error    T-Ratio[Prob]
C              2.1170         .12078            17.5277[.000]
LY             .56078         .10049            5.5805[.000]
LP             -.031500       .27157            -.11599[.908]
******************************************************************************
R-Squared                     .46870
F-statistic F( 2, 41)         18.0845[.000]
R-Bar-Squared                 .44278
S.E. of Regression            .067881
Residual Sum of Squares       .18892
Mean of Dependent Variable    2.7720
S.D. of Dependent Variable    .090935
Maximum of Log-likelihood     57.4804
DW-statistic                  1.8717
******************************************************************************
TABLE 2: RESULTS FOR MODEL II
Ordinary Least Squares Estimation
******************************************************************************
Dependent variable is LW
44 observations used for estimation from 80Q1 to 90Q4
******************************************************************************
Regressor      Coefficient    Standard Error    T-Ratio[Prob]
C              2.3563         .074266           31.7275[.000]
LY             .42385         .064480           6.5733[.000]
LP             -.13756        .12735            -1.0801[.287]
S2             -.12468        .012853           -9.7006[.000]
S3             -.14616        .012890           -11.3387[.000]
S4             -.034041       .017282           -1.9697[.056]
******************************************************************************
R-Squared                     .90323
F-statistic F( 5, 38)         70.9349[.000]
R-Bar-Squared                 .89049
S.E. of Regression            .030092
Residual Sum of Squares       .034410
Mean of Dependent Variable    2.7720
S.D. of Dependent Variable    .090935
Maximum of Log-likelihood     94.9458
DW-statistic                  1.3238
******************************************************************************
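For Q4 c)(ii), the joint significance of the three seasonal dummies can be checked with an F-test that compares the residual sums of squares of Model I (restricted) and Model II (unrestricted). The sketch below shows one way to set up the calculation; Python is used only as a calculator, and the function name f_test_restrictions is an illustrative assumption.

```python
def f_test_restrictions(rss_restricted, rss_unrestricted, n_restrictions, df_unrestricted):
    """F-statistic for H0: the excluded regressors have zero coefficients."""
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / df_unrestricted
    return numerator / denominator


# Values copied from Tables 1 and 2.
rss_model_1 = 0.18892    # restricted model: no seasonal dummies
rss_model_2 = 0.034410   # unrestricted model: S2, S3 and S4 included
n_dummies = 3
df_model_2 = 44 - 6      # observations minus parameters in Model II

f_dummies = f_test_restrictions(rss_model_1, rss_model_2, n_dummies, df_model_2)
print(f"F({n_dummies}, {df_model_2}) = {f_dummies:.2f}")
```

The statistic is compared with the 5% critical value of F(3, 38), which lies between the tabulated values for 30 and 40 denominator degrees of freedom (roughly 2.85).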
Q5.
The following Stata output is based on a random sample of male graduates (i.e. with at least a first
degree) in the UK aged 25-55 and in full-time employment.
Model A
      Source |       SS       df       MS              Number of obs =     533
-------------+------------------------------           F(  5,   527) =   14.97
       Model |   12.530607     5  2.50612139           Prob > F      =  0.0000
    Residual |  88.2395447   527  .167437466           R-squared     =  0.1243
-------------+------------------------------           Adj R-squared =  0.1160
       Total |  100.770152   532  .189417578           Root MSE      =  .40919

------------------------------------------------------------------------------
    lrhrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1039071    .020438     5.08   0.000     .0637571     .144057
       agesq |  -.0011493    .000259    -4.44   0.000     -.001658   -.0006405
    highrdeg |   .0842088   .0415832     2.03   0.043     .0025195     .165898
      london |   .1327832    .060486     2.20   0.029     .0139599    .2516066
          se |   .0988441   .0507766     1.95   0.052    -.0009052    .1985935
       _cons |   .5586906   .3897368     1.43   0.152    -.2069378    1.324319
------------------------------------------------------------------------------
where lrhrwage is the natural logarithm of real hourly wage, age is age and agesq is age squared, highrdeg
equals one if the respondent holds a higher degree and zero otherwise, london and se are indicators for living in
London and the Southeast region (excluding London) respectively.
a) What is the interpretation of the coefficient for the term _cons in the Stata output? Provide an
interpretation of the coefficient on highrdeg. Which region in the UK has the lowest expected wage for
male graduates?
b) Briefly comment on the statistical significance of each regressor. Are the regressors statistically
significant jointly?
c) What is the expected log real hourly wage of a 40-year old male graduate who has a higher degree and
lives in London? At what age is his wage expected to peak?
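For Q5 c), the prediction is the sum of each Model A coefficient multiplied by the corresponding regressor value, and the turning point of the quadratic in age is where the derivative with respect to age is zero, i.e. age* = -b_age / (2 * b_agesq). The sketch below sets this up with Python as a calculator; the dictionary and function names are illustrative assumptions.

```python
# Coefficients copied from the Model A output.
b = {
    "_cons": 0.5586906,
    "age": 0.1039071,
    "agesq": -0.0011493,
    "highrdeg": 0.0842088,
    "london": 0.1327832,
    "se": 0.0988441,
}


def predicted_log_wage(age, highrdeg, london, se):
    """Fitted value of lrhrwage for the given characteristics."""
    return (b["_cons"] + b["age"] * age + b["agesq"] * age ** 2
            + b["highrdeg"] * highrdeg + b["london"] * london + b["se"] * se)


# A 40-year-old with a higher degree living in London (so se = 0).
log_wage_40 = predicted_log_wage(age=40, highrdeg=1, london=1, se=0)

# Age at which the fitted age-earnings profile peaks.
peak_age = -b["age"] / (2 * b["agesq"])

print(f"Predicted log real hourly wage: {log_wage_40:.4f}")
print(f"Age at which the profile peaks: {peak_age:.1f}")
```

Because the coefficient on agesq is negative, the turning point is a maximum rather than a minimum.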
Q6.
The diagnostic tests for the model of question 5 are given below (NB: Q5 & Q6 are adapted from
Class Test of 2011/12).
a)
. estat ovtest
Ramsey RESET test using powers of the fitted values of lrhrwage
       Ho:  model has no omitted variables
                 F(3, 524) =      2.35
                  Prob > F =      0.0720
. estat hettest
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of lrhrwage
         chi2(1)      =     6.82
         Prob > chi2  =   0.0090
b) In Model B below, the region dummies are left out. Compare the goodness-of-fit of the two models. Use a
formal statistical test to determine whether the region dummies are jointly significant. [15]
Model B
      Source |       SS       df       MS              Number of obs =     533
-------------+------------------------------           F(  3,   529) =   22.18
       Model |  11.2606921     3  3.75356404           Prob > F      =  0.0000
    Residual |  89.5094595   529  .169205028           R-squared     =  0.1117
-------------+------------------------------           Adj R-squared =  0.1067
       Total |  100.770152   532  .189417578           Root MSE      =  .41135

------------------------------------------------------------------------------
    lrhrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1012471   .0204911     4.94   0.000     .0609933     .141501
       agesq |  -.0011168   .0002597    -4.30   0.000     -.001627   -.0006066
    highrdeg |   .0773654   .0416706     1.86   0.064    -.0044948    .1592255
       _cons |   .6397865   .3901326     1.64   0.102    -.1266128    1.406186
------------------------------------------------------------------------------
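For Q6 b), one formal test of the region dummies compares the residual sums of squares of Model B (restricted, without london and se) and Model A (unrestricted). The sketch below sets up that F-test with Python as a calculator; the variable names are illustrative assumptions, and the numbers are copied from the two Stata outputs.

```python
# Residual sums of squares and degrees of freedom from the Stata output.
rss_model_b = 89.5094595   # restricted: region dummies left out
rss_model_a = 88.2395447   # unrestricted: london and se included
n_restrictions = 2         # two dummies are excluded in Model B
df_model_a = 527           # residual degrees of freedom of Model A

f_region = ((rss_model_b - rss_model_a) / n_restrictions) / (rss_model_a / df_model_a)
print(f"F({n_restrictions}, {df_model_a}) = {f_region:.2f}")

# Goodness of fit: adjusted R-squared penalises the extra regressors.
adj_r2_a, adj_r2_b = 0.1160, 0.1067
print(f"Adjusted R-squared: Model A = {adj_r2_a}, Model B = {adj_r2_b}")
```

The statistic is compared with the 5% critical value of F(2, 527), which is about 3.0 for such a large number of denominator degrees of freedom.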
Q7.
Discuss the implications of a structural break for least squares estimation when pooling survey data
from two different years into a larger sample. Show that a test for the joint significance of the intercept
and slope dummies in the pooled specification is equivalent to the Chow test.
Q8.
A simple consumption function model has been estimated using annual UK data for the period 1959 to
1987 inclusive. The dependent variable is real consumer expenditures (rcons). The explanatory
variable is real personal disposable income (rpdi).
a) Interpret the coefficient of real personal disposable income. What is the corresponding income
elasticity (evaluated at the sample mean)?
b) Test the individual significance of the variable(s) and the joint significance of the model. What is the
relationship between these two measures in this simple model?
c) This model was then estimated for the periods 1959-1970 and 1971-1987 separately (see STATA
results). Is there evidence of a structural break?
. sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        year |        29        1973    8.514693       1959       1987
       rcons |        29    170.2025    32.26553    118.547     238.46
        rpdi |        29     188.588    36.92147    124.964    252.185
       pcons |        29    .4405449    .3301454    .137464    1.08375
For the sample as a whole
. reg rcons rpdi
      Source |       SS       df       MS              Number of obs =      29
-------------+------------------------------           F(  1,    27) = 1726.23
       Model |   28700.884     1   28700.884           Prob > F      =  0.0000
    Residual |  448.912005    27  16.6263706           R-squared     =  0.9846
-------------+------------------------------           Adj R-squared =  0.9840
       Total |   29149.796    28  1041.06414           Root MSE      =  4.0775

------------------------------------------------------------------------------
       rcons |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        rpdi |   .8671409   .0208709    41.55   0.000     .8243174    .9099644
       _cons |   6.670154   4.008166     1.66   0.108    -1.553924    14.89423
------------------------------------------------------------------------------
For the period 1959-1970
. reg rcons rpdi if year<=1970
      Source |       SS       df       MS              Number of obs =      12
-------------+------------------------------           F(  1,    10) =  946.54
       Model |  1661.60442     1  1661.60442           Prob > F      =  0.0000
    Residual |  17.5544905    10  1.75544905           R-squared     =  0.9895
-------------+------------------------------           Adj R-squared =  0.9885
       Total |  1679.15891    11   152.65081           Root MSE      =  1.3249

------------------------------------------------------------------------------
       rcons |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        rpdi |   .8408455   .0273304    30.77   0.000     .7799495    .9017415
       _cons |   11.39298   4.148099     2.75   0.021      2.15044    20.63552
------------------------------------------------------------------------------
For the period 1971-1987
. reg rcons rpdi if year>=1971
      Source |       SS       df       MS              Number of obs =      17
-------------+------------------------------           F(  1,    15) =  269.71
       Model |  6495.60804     1  6495.60804           Prob > F      =  0.0000
    Residual |  361.249078    15  24.0832719           R-squared     =  0.9473
-------------+------------------------------           Adj R-squared =  0.9438
       Total |  6856.85712    16   428.55357           Root MSE      =  4.9075

------------------------------------------------------------------------------
       rcons |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        rpdi |   .9567709    .058258    16.42   0.000     .8325968    1.080945
       _cons |  -13.13152   12.58361    -1.04   0.313    -39.95285    13.68981
------------------------------------------------------------------------------
Q9.
a) What is the implication of pure autocorrelation?
b) What distinguishes Durbin’s h test from the Durbin-Watson d test?
c) What are the advantages of the Lagrange Multiplier (LM) test over the traditional Durbin-Watson test?
Q10.
The following Stata output is based on an OLS regression over the quarterly period 1963Q1 to
1977Q4 (n=60).
xi: reg lrcons lry i.season if _n<=60
i.season          _Iseason_0-3        (naturally coded; _Iseason_0 omitted)

      Source |       SS       df       MS              Number of obs =      60
-------------+------------------------------           F(  4,    55) =  551.40
       Model |  .694516705     4  .173629176           Prob > F      =  0.0000
    Residual |   .01731884    55  .000314888           R-squared     =  0.9757
-------------+------------------------------           Adj R-squared =  0.9739
       Total |  .711835545    59  .012065009           Root MSE      =  .01775

------------------------------------------------------------------------------
      lrcons |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lry |   .9082043   .0207827    43.70   0.000     .8665549    .9498538
          S1 |  -.0626869   .0065651    -9.55   0.000    -.0758436   -.0495303
          S2 |   -.039448   .0064995    -6.07   0.000    -.0524733   -.0264227
          S3 |  -.0249351   .0064869    -3.84   0.000    -.0379352   -.0119351
       _cons |   .9362644   .2274578     4.12   0.000     .4804287      1.3921
------------------------------------------------------------------------------
where lrcons and lry are the log of the real total consumer spending and the log of real personal disposable
income respectively. Sj denotes the dummy variable for quarter j (j=1, 2, 3). The Residual Sum of Squares
(RSS) is equal to 0.01732.
a) Provide an interpretation of the coefficient for lry. Is it statistically significant? How would you calculate the
marginal propensity to consume?
b) Provide an interpretation of the coefficients for S1-S3. In which quarter is total consumer spending highest?
Test for the overall significance of the sample regression.
c) This model was then estimated for the (post-sample) period 1978Q1-1987Q4 (n=40) before the two samples
were pooled together (i.e. 1963Q1-1987Q4, n=100). The resulting residual sums of squares are 0.01679 and
0.04788 respectively. Is there evidence of a structural break?
d) Comment on the Durbin-Watson test reported below (NB: the number of regressors k in the Durbin-Watson
table excludes the constant term). How would you test for autocorrelation if the lagged value of lrcons was
included as a regressor?
. estat dwatson
Durbin-Watson d-statistic (5, 60) = 1.737572
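The arithmetic behind a Chow test of the kind asked for in part (c) can be checked with Stata's display command. This is only an illustrative sketch: it assumes the same five-parameter specification (lry, three seasonal dummies and the constant) is fitted to both sub-periods, so that k = 5 restrictions are tested with 100 - 2×5 = 90 denominator degrees of freedom. The scalar names are arbitrary.

```stata
* Chow test sketch using the RSS values quoted in Q10 (scalar names are illustrative)
scalar rss1  = 0.01732      // 1963Q1-1977Q4, n = 60
scalar rss2  = 0.01679      // 1978Q1-1987Q4, n = 40
scalar rssp  = 0.04788      // pooled sample 1963Q1-1987Q4, n = 100
scalar k     = 5            // parameters per regression (assumed identical specification)
scalar df2   = 100 - 2*k    // denominator degrees of freedom

* F = [(RSS_pooled - RSS_1 - RSS_2)/k] / [(RSS_1 + RSS_2)/(n - 2k)]
scalar chowF = ((rssp - (rss1 + rss2))/k) / ((rss1 + rss2)/df2)

display "Chow F statistic  = " %6.3f chowF
display "5% critical value = " %6.3f invFtail(k, df2, 0.05)
display "p-value           = " %6.4f Ftail(k, df2, chowF)
```

The null hypothesis of no structural break is rejected when the statistic exceeds the critical value. The same template applies to any pair of sub-samples once the three RSS values, k and n are replaced.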
Q11.
a) Further diagnostic tests for the model in Question 10 are undertaken. Comment on the appropriateness of the
following diagnostic tests and discuss the implications.
. estat bgodfrey, lags(4)
Breusch-Godfrey LM test for autocorrelation
---------------------------------------------------------------------------
    lags(p)  |          chi2               df                 Prob > chi2
-------------+-------------------------------------------------------------
       4     |         11.001               4                   0.0266
---------------------------------------------------------------------------
                        H0: no serial correlation
. estat hettest
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of lrcons

         chi2(1)      =     0.19
         Prob > chi2  =   0.6663

. estat ovtest
Ramsey RESET test using powers of the fitted values of lrcons
       Ho:  model has no omitted variables
                 F(3, 52) =      1.02
                 Prob > F =      0.3926
b) An alternative to the log-log specification used in Question 10 is a linear regression relating real consumption
to real income. In your opinion, how should you choose between these two alternative specifications?
Q12.
Briefly explain the implications of the following problems and describe how you would deal with
them:
a) Near-perfect multicollinearity
b) Omission of relevant variables (underfitting)
Q13.
The following Stata output is based on an OLS estimation of returns to a degree relative to 2 or more A
Levels using a random sample of male employees in England from the 1996 UK Quarterly Labour
Force Survey.
Model A: 1996 Sample
. reg logwage degree exp expsq if lfsyear==1996

      Source |       SS       df       MS              Number of obs =     950
-------------+------------------------------           F(  3,   946) =   55.32
       Model |  33.7828767     3  11.2609589           Prob > F      =  0.0000
    Residual |  192.560253   946  .203552065           R-squared     =  0.1493
-------------+------------------------------           Adj R-squared =  0.1466
       Total |   226.34313   949  .238506986           Root MSE      =  .45117

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      degree |   .2885419   .0318528     9.06   0.000     .2260316    .3510521
         exp |   .0867526   .0127547     6.80   0.000     .0617218    .1117833
       expsq |  -.0010543   .0001783    -5.91   0.000    -.0014043   -.0007044
       _cons |   .7414799   .2195963     3.38   0.001     .3105276    1.172432
------------------------------------------------------------------------------
where logwage is the natural logarithm of real hourly wage, degree is equal to one if the respondent has a
degree and zero if he/she only has two or more A Levels, exp is years of potential working experience and
expsq is exp squared. The Residual Sum of Squares (RSS) is equal to 192.560.
a) What is the interpretation of the constant term (_cons in the Stata output)? Provide an interpretation of the
coefficients for degree, exp and expsq. Are these three slope coefficients statistically significant individually?
Are they statistically significant jointly? How well does the model fit the data? [30%]
b) A second model is estimated using the 2006 UK Quarterly Labour Force Survey (in Model B below). The
resulting RSS is equal to 163.906. Comment on the estimates and compare them with the corresponding figures
for 1996. [25%]
Model B: 2006 Sample
. reg logwage degree exp expsq if lfsyear==2006

      Source |       SS       df       MS              Number of obs =     798
-------------+------------------------------           F(  3,   794) =   64.63
       Model |  40.0258288     3  13.3419429           Prob > F      =  0.0000
    Residual |  163.905563   794   .20643018           R-squared     =  0.1963
-------------+------------------------------           Adj R-squared =  0.1932
       Total |  203.931392   797  .255873767           Root MSE      =  .45435

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      degree |   .3804167   .0340784    11.16   0.000     .3135222    .4473112
         exp |   .0924649   .0135285     6.83   0.000     .0659091    .1190208
       expsq |  -.0011228   .0001794    -6.26   0.000     -.001475   -.0007707
       _cons |   .6150071   .2430529     2.53   0.012     .1379049    1.092109
------------------------------------------------------------------------------
c) A third model is estimated by pooling data from 1996 and 2006. The resulting RSS is equal to 358.223.
Discuss whether this new model is justified. [25%]
Model C: Pooled Sample
. reg logwage degree exp expsq

      Source |       SS       df       MS              Number of obs =    1748
-------------+------------------------------           F(  3,  1744) =  118.73
       Model |  73.1637007     3  24.3879002           Prob > F      =  0.0000
    Residual |  358.222973  1744  .205403081           R-squared     =  0.1696
-------------+------------------------------           Adj R-squared =  0.1682
       Total |  431.386674  1747  .246929979           Root MSE      =  .45321

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      degree |     .33036   .0232567    14.20   0.000      .284746    .3759741
         exp |   .0893459    .009078     9.84   0.000     .0715411    .1071508
       expsq |  -.0010824   .0001236    -8.76   0.000    -.0013247     -.00084
       _cons |   .6794776   .1597029     4.25   0.000     .3662484    .9927069
------------------------------------------------------------------------------
d) Comment on the following diagnostic tests and discuss their implications. [20%]
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of logwage

         chi2(1)      =    18.01
         Prob > chi2  =   0.0000

Ramsey RESET test using powers of the fitted values of logwage
       Ho:  model has no omitted variables
               F(3, 1741) =      0.85
                 Prob > F =      0.4663
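When interpreting a dummy variable and a quadratic term in a log-wage equation such as the one in Q13, two small calculations are often useful: the exact percentage effect implied by the dummy coefficient, and the level of experience at which the quadratic profile peaks. The sketch below applies both to the Model A (1996) estimates; the scalar names are illustrative, and the evaluation point of 10 years is an arbitrary choice.

```stata
* Model A (1996) estimates copied from the regression output in Q13
scalar b_degree = .2885419
scalar b_exp    = .0867526
scalar b_expsq  = -.0010543

* Exact percentage wage premium implied by a dummy in a log-wage equation
scalar premium = 100*(exp(b_degree) - 1)

* Experience level at which the quadratic log-wage profile reaches its maximum
scalar peak = -b_exp/(2*b_expsq)

* Approximate percentage return to one more year of experience at exp = 10 (arbitrary point)
scalar ret10 = 100*(b_exp + 2*b_expsq*10)

display "Degree premium (%)          = " %5.1f premium
display "Peak experience (years)     = " %5.1f peak
display "Return to exp at 10 yrs (%) = " %5.2f ret10
```

Replacing the three coefficients with the Model B values gives the corresponding 2006 figures for comparison.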
Q14.
Briefly discuss the following:
a) Structural break and the Chow-test.
b) The general-to-specific modelling approach.
c) Omitted variable bias and the RESET test.
5 COMPUTER PRACTICALS
The purpose of these exercises is to help you understand the material delivered in the lectures by
actually doing econometrics. Specialized software packages have been written to help one do
econometrics. We will use the package STATA 12.
The first computer class introduces you to Stata. The remainder involves undertaking class based
exercises that provide “hands-on” implementation of econometric techniques reviewed in the formal
lecture programme. Answering the exercises involves writing a program in Stata, running it and
commenting on the results. Skeleton Stata programs (known as do files) for the respective exercises
can be found on Moodle under the directory “Computer Practicals”. You may not have time to
complete the exercises in class, in which case you should do so in your own time. The results of each
exercise will be presented by groups of students and discussed by the class in a seminar.
The data files are located in the directory “Computer Practicals” on Moodle.
For a review of Stata features and commands, consult the webbook titled “Introduction to Stata 8”
available on Moodle. A complete set of STATA 9 documentation (12 volumes), together with books
on programming with STATA, are available in the library. Additional resources on STATA, such as
FAQs, e-tutorials, webbooks and even movies can be accessed through StataCorp’s website at
http://www.stata.com/. You will also find a full set of STATA 7 reference manuals in the Economics
General Office.
COMPUTER EXERCISE I(A): Getting Started with Stata
The data file is auto.dta.
1) Start STATA
Log on to Moodle, then go to the directory Computer Practicals. Double click on the data file auto.dta,
which should start Stata.
2) Save windowing preferences
• Adjust window sizes and location
• Click Prefs (on the menu bar) → Save Windowing Preferences
3) Familiarize yourself with the various windows (Review, Variables, Results, Command and
Data Editor) in STATA.
4) Open log file
• Click File → Log → Begin, and then select output folder "Z:\ec821\" (you need to click
the New Folder button to create the new folder ec821 if it does not already exist); then
type filename comp1a and save as type log. Alternatively
• Type log using "Z:\EC821\comp1a.log", replace (if the folder ec821 already exists)
5) Describe Data
• Click Data → Describe Data; alternatively
• Type describe in the Command Window
6) Summary Statistics
• Click Statistics → Summaries, tables & tests → summary statistics → summary statistics;
alternatively
• Type summarize (or simply su) in the Command Window
7) Graphs
• Click Graphics → Simple Graphs → Scatter Plot (X variable: weight, Y variable: price);
alternatively
• Type graph twoway scatter price weight in the Command Window
8) Save data
• Click File → Save As …; alternatively
• Type save "Z:\ec821\autonew" in the Command Window
9) Exit
• Click File → Exit; alternatively
• Type exit in the Command Window
10) Repeat this exercise with the help of the handout
11) Load the do file exer1a.do
• Click Window → Do-file Editor
o Click File → Open
12) Modify the do file as you wish (especially the file path in the log command)
13) Do/Run the do file by clicking the appropriate icons
14) Exit STATA and check the data file and log file you have just saved in your home folder.
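For orientation, a do-file covering steps 4 to 8 might look roughly like the sketch below. This is a hypothetical illustration, not the exer1a.do supplied on Moodle. It assumes that the auto dataset shipped with Stata (loaded with sysuse) matches auto.dta on Moodle, and it writes to the current working directory rather than to Z:\ec821.

```stata
* Hypothetical sketch of a do-file for steps 4-8 (not the exer1a.do from Moodle)
capture log close                     // close any log left open by an earlier run
log using "comp1a.log", replace       // log file in the current working directory

sysuse auto, clear                    // auto dataset shipped with Stata (assumed equivalent)
describe                              // variable names, types and labels
summarize                             // means, standard deviations, minima and maxima
graph twoway scatter price weight     // price on the Y axis, weight on the X axis

save "autonew", replace               // saved in the current working directory
log close
```

Running the file with the Do icon executes every line in order, so the log records exactly the commands and output that the point-and-click route would have produced.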
COMPUTER EXERCISE 1(B):
To reinforce what you have learnt in this exercise, you should attempt this supplementary exercise on
your own after class.
Work in pairs if you can. One of you should work with census5.dta while the other with hsng.dta, both
from the same data directory. Produce summary statistics and scatter diagrams. Save the data for
future use. Discuss your problems and findings with your mates.
COMPUTER EXERCISE 2: AN EXPERIMENT
In this exercise you will each use the same values for the independent variable (X) and create some
observations for the dependent variable (Y) by adding a random disturbance (u) to the deterministic
part of the model. The parameter values are known, but you will then use the constructed data to
estimate the parameters by Ordinary Least Squares (OLS) and compare the estimated values with the
true values.
1. LOAD STATA
• Click Start → Programs → Central Software → STATA; alternatively
• Log on to Moodle, then go to the directory Computer Practicals. Double click on the data
file auto.dta, which should start Stata. Then type clear in the Command Window
2. INVOKE STATA’S SPREADSHEET-LIKE DATA EDITOR
• Click Windows → Data Editor, or click the Data Editor button directly, or type the
command edit
• Type the values 10, 20, 30, 40, 50 in the column that STATA automatically names var1
• Assign a more informative variable name by double-clicking on the column heading of
var1
• Type a new name x in the resulting dialogue box
• Create a variable label that contains a brief description, such as indep var
• Click OK to close the dialogue box
• Click X (top right corner of the Data Editor window) to close the Editor
• Check you have created the right dataset (with 5 observations and 1 variable) by using
describe and list
3. CREATING RANDOM SAMPLE
Type the following commands (the text after // is commentary and should be omitted when typing in the
Command Window):
set seed xxxx
where xxxx is any positive integer, which specifies the initial value of the random number (RN) seed used by
the uniform() function. Explicitly setting the seed number makes it possible to later reproduce the same
“random” numbers.
generate randnum = uniform()    // creates uniformly distributed RNs over the interval [0,1)
generate v = invnorm(randnum)   // transforms them into standard normal draws, v~N(0,1)
generate u = 5*v                // u~N(0, 25), i.e. mean zero and standard deviation 5
generate y = 100 + 0.7*x + u    // values for the dependent variable Y with known intercept (100) and slope (0.7)
Click the Data Editor button to check the new variables (or type list in the Command Window).
Graphs
• Click Graphics → Simple Graphs → Scatter Plot (X variable: x, Y variable: y);
alternatively
• Type graph twoway scatter y x in the Command Window.
4. ESTIMATE A MODEL
• Click Statistics → Linear regression and related → Linear Regression (Dependent
variable: y, Independent variable: x); alternatively
• Type regress y x in the Command Window
to estimate the model $Y_t = \alpha + \beta X_t + u_t$
5. SAVE YOUR RESULTS TO AN OUTPUT FILE
• Highlight the summary statistics and regression results
• Right click and choose Copy text
• Paste the text into a word processor, such as Word or Notepad, in which it may be edited
and printed. You may want to choose a reduced font size - for example 8pt - to fit the
results to a page width.
• Save the file as Z:\login initials\EC821\compex2.doc.
Alternatively type in the command window:
log using "Z:\Login initials\EC821\compex2.log", replace
list
sum
reg
log close
to write the results to a log file (reg typed without arguments redisplays the most recent regression results).
There is no need to save the data since it will not be used again.
Print your output file and bring it to the next lecture.
DISCUSSION
1. This is a simple Monte Carlo exercise. The data generating process is known and satisfies all
the classical assumptions. The model used is
$Y_t = \alpha + \beta X_t + u_t$, t = 1,2,...,5
where the X values are constant (fixed in repeated sampling), the parameters are known (100
and 0.7) and values for the random disturbance are generated by the random normal
command, giving an independent random sample of 5 values from a normal distribution with
mean zero and standard deviation 5.
2. The objective is to calculate many estimates of a parameter and to construct an empirical
sampling distribution for that parameter. We may then be able to judge the accuracy of the
OLS estimator since the true value of the parameter is known.
3. For the OLS estimator the properties of the estimator can be derived without the necessity for
a Monte Carlo experiment - the nature of the sampling distribution is known. We will
compare our empirical sampling distribution with this "theoretical" distribution in the next
lecture.
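The idea of building an empirical sampling distribution can also be sketched within a single Stata session using the simulate prefix, rather than by pooling results across students. The program name mcslope, the seed and the number of replications below are arbitrary choices, and rnormal() is used in place of the invnorm(uniform()) construction from step 3.

```stata
clear all                               // also drops any earlier definition of mcslope
set seed 12345                          // arbitrary seed, for reproducibility only

* One replication: rebuild the 5-observation sample and return the OLS slope
program define mcslope, rclass
    drop _all
    set obs 5
    generate x = 10*_n                  // x = 10, 20, 30, 40, 50
    generate u = 5*rnormal()            // u ~ N(0, 25), standard deviation 5
    generate y = 100 + 0.7*x + u        // true intercept 100, true slope 0.7
    regress y x
    return scalar b = _b[x]
end

* Repeat the experiment 1000 times, keeping the slope estimate from each replication
simulate b = r(b), reps(1000) nodots: mcslope

summarize b                             // compare the mean with 0.7
histogram b, normal                     // empirical sampling distribution with normal overlay
```

With these X values the sum of squared deviations of X is 1000, so the theoretical standard deviation of the slope estimator is 5/sqrt(1000), roughly 0.158; the standard deviation reported by summarize can be compared with that figure.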
QUESTIONS C2
1. Compare your estimate of $\beta$, the slope parameter, with that of another student. Explain why
the two sets of results are different.
2. Compare your estimate of $\beta$, the slope parameter, with the true value. Explain why your
estimate differs from the true value.
3. Suppose 100 students performed this exercise and found a 95% confidence interval for $\beta$. On
average, how many times would the confidence interval not include the true value?
COMPUTER EXERCISE 3: A simple consumption function
Introduction
The aim of this exercise is to estimate a simple consumption function for the U.K. using annual data.
The data is in a file named comp3.dta. The variables you will use are as follows:
rcons = Real consumption expenditure
rpdi = Real personal disposable income
A. ESTIMATING A MODEL BY ORDINARY LEAST SQUARES
1. Load STATA
2. Load the datafile comp3.dta
3. Declare the data to be a time series
• Type: tsset year
4. Generate new variables:
gen apc=rcons/rpdi
{Creates a new variable - what is its interpretation?}
tsline apc, xlab(1959 1962 to 1987)
{Shows a plot of APC over time - how has it changed?}
5. Estimate a model by Ordinary Least Squares.
The dependent variable is rcons.
The independent variable is rpdi.
Save the results in the log file comp3.log
• log using "Z:\login initial\EC821\comp3.log", replace
6. Generate the fitted values and residuals.
• predict rconshat
{creates the fitted values}
• gen rconsres = rcons - rconshat
{creates the residuals}
• tsline rcons rconshat rconsres
{plots the actual and fitted values, as well as the residuals}
B. ESTIMATE THE MODEL IN LOGS
• gen lrcons = log(rcons)
{The new variable is the natural log of RCONS}
• gen lrpdi = log(rpdi)
{The new variable is the natural log of RPDI}
Estimate a model by Ordinary Least Squares with lrcons as the dependent variable and lrpdi
as the independent variable.
Also try plotting the actual and fitted values.
• Save the results in the log file comp3.log
log using "Z:\login initial\EC821\comp3.log", append
NB: the “append” option specifies that results are to be appended onto the end of an already existing
file.
C. SAVE THE DATA TO YOUR HOME FOLDER
Use File → Save to save the data in a file called comp3out.dta
• This saves the original data and any new variables, such as lrcons
Print your results (log file) before exiting STATA.
N.B. The RESULTS are in the file comp3.log and the DATA in a special STATA datafile
comp3out.dta.
QUESTIONS C3
1. For the linear model
(a) Test the hypothesis that the marginal propensity to consume (mpc) is equal to 0.7.
(b) Examine the residuals. Do they suggest a failure of any of the basic assumptions?
(c) Suggest at least one possible explanation for the failure of the basic assumptions.
(d) Interpret the value of $R^2$ for the linear model.
2. For the log-log model (also known confusingly as the log-linear model in some textbooks)
(a) Interpret the coefficients of the log model. What is the mpc for this model?
(b) Test the hypothesis that the elasticity of consumption with respect to income is unity.
(c) Explain what $R^2$ shows for the log model.
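A hypothesis such as mpc = 0.7 is tested with a t statistic of the form (estimate - hypothesised value)/standard error. The sketch below illustrates the arithmetic using the 1959-1987 regression of rcons on rpdi printed in the class exercises; that comp3.dta reproduces exactly those estimates is an assumption, and the scalar names are illustrative.

```stata
* Estimates from the printed 1959-1987 regression of rcons on rpdi (assumed to match comp3.dta)
scalar b_mpc  = .8671409      // estimated slope, i.e. the mpc in the linear model
scalar se_mpc = .0208709      // its standard error
scalar dfres  = 27            // residual degrees of freedom: 29 observations - 2 parameters

scalar tstat = (b_mpc - 0.7)/se_mpc

display "t statistic       = " %6.2f tstat
display "two-sided p-value = " %6.4f 2*ttail(dfres, abs(tstat))
display "5% critical value = " %6.3f invttail(dfres, 0.025)
```

Immediately after running regress rcons rpdi, the command test rpdi = 0.7 reports the equivalent F test (the square of this t statistic). The unit-elasticity hypothesis in the log model can be handled in the same way with test lrpdi = 1.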
COMPUTER EXERCISE 4: Production Function
The file prodfun from Mukherjee, C., White, H. and M. Wuyts, 1997 (MWW), which is in the
module folder, is to be used to illustrate tests of linear restrictions and the Chow test.
The data is cross-section data for a developing country for two manufacturing sectors. The variables
are:
LQ = log(Output);
LK = log(capital stock);
LN = log(employment)
D = 0 if from manufacturing sector A and = 1 if from manufacturing sector B (a dummy
variable)
_______________________________________________________
• The first part of the exercise is to estimate a Cobb-Douglas production function (equation A) and
to compare this with a similar function with constant returns to scale assumed (equation B).
• The second part of the exercise is to allow the parameters to differ between the two manufacturing
sectors, so the general Cobb-Douglas function is estimated separately for each sector (equations C
and D).
• The final part is to use slope and intercept dummy variables as an alternative way of obtaining
distinct estimates for each sector as in C and D (equation E).
Obtain OLS estimates of the following equations (NB for equations B and E you need to create some
new variables before estimation, e.g. LQCR=LQ-LN; LKCR=LK-LN; LKD=LK*D; LND=LN*D)
A. $LQ_i = \beta_1 + \beta_2 LN_i + \beta_3 LK_i + \varepsilon_i$
B. $(LQ_i - LN_i) = \beta_1 + \beta_3 (LK_i - LN_i) + \varepsilon_i$
The first 42 observations are for sector A and observations 43 to 83 are for sector B, so estimate
separately for the two sectors by simply setting the sample appropriately:
C. Estimate Equation A for observations 1 to 42.
D. Estimate Equation A for observations 43 to 83.
To use dummy variables to obtain C and D in a single equation (create the necessary new variables
first) estimate:
E. $LQ_i = \beta_1 + \beta_2 LN_i + \beta_3 LK_i + \beta_4 D_i + \beta_5 (LN_i \times D_i) + \beta_6 (LK_i \times D_i) + \varepsilon_i$
QUESTIONS C4 (Reading MWW pages 229-233 might be useful):
1. Interpret the parameters of equation A and estimate the returns to scale parameter.
2. Show that equation B can be derived from A by assuming constant returns to scale.
3. Test the validity of the constant returns to scale restriction.
4. Using residual sums of squares from A, C and D perform a Chow test for identical parameters
for each of the two sectors.
5. Show that equation E gives the same coefficients for each sector as C and D.
6. Use the residual sums of squares from A and E to perform the test of common parameters for
the two sectors. Compare this with the test in question 4.
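The two tests in this exercise can also be run with Stata's test command after estimation. The sketch below uses simulated data as a stand-in for prodfun (every number in the data-generating step is hypothetical and chosen only so that the code runs), with variable names following the definitions given above.

```stata
* Simulated stand-in for the prodfun data (hypothetical values, for syntax only)
clear
set seed 2013
set obs 83
generate D  = (_n > 42)                  // 0 for observations 1-42, 1 for 43-83
generate LN = rnormal(4, 1)
generate LK = rnormal(5, 1)
generate LQ = 1 + 0.6*LN + 0.4*LK + 0.2*D + rnormal(0, 0.3)

* Equation A, followed by a test of constant returns to scale
regress LQ LN LK
test LN + LK = 1

* Equation E: intercept and slope dummies for sector B
generate LND = LN*D
generate LKD = LK*D
regress LQ LN LK D LND LKD
test D LND LKD                           // joint test that the two sectors share all parameters
```

The joint test on D, LND and LKD is the dummy-variable form of the Chow test, so its F statistic should coincide with the one computed by hand from the residual sums of squares of equations A, C and D.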
COMPUTER EXERCISE 5: An Aid Model
Use the file aidsav.dta in the module folder, which has 1987 cross-section data for 66 developing
countries on savings (S), aid (A) and income per capita (Y, all in $US) from Mukherjee, C., White, H.
and M. Wuyts, 1997 (MWW) to replicate the results in the MWW textbook, pages 209-211.
Check the data for missing values and outliers. Is there a good reason to exclude some countries with
very high income per capita from the sample?
Estimate the following models (N.B: You need to create the new variables, S/Y etc., in the processing
screen):
(A) $\left(\frac{S}{Y}\right)_i = \beta_1 + \beta_2 \left(\frac{A}{Y}\right)_i + \varepsilon_i$
(B) $\left(\frac{S}{Y}\right)_i = \beta_1 + \beta_2 \left(\frac{A}{Y}\right)_i + \beta_3 \left(\frac{1}{Y}\right)_i + \varepsilon_i$
QUESTIONS C5
Answer the following questions (Reading MWW pages 208/211 might be useful)
1. Describe your sample selection criterion.
2. Interpret the coefficients from both models
3. Compare the estimates of $\beta_2$ from the two models. Why might the estimate from (A) be
biased?
4. Determine whether 1/Y should be included as in model (B), both on theoretical grounds and
statistically (using a significance test).
5. How can you improve the model? (Hints: functional form and dummy variables)
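Creating the ratio variables and estimating models (A) and (B) takes only a few commands. The sketch below runs on simulated data standing in for aidsav.dta; the data-generating numbers are hypothetical and the names SY, AY and invY are illustrative choices, not names taken from the dataset.

```stata
* Simulated stand-in for aidsav.dta (hypothetical values, for syntax only)
clear
set seed 1987
set obs 66
generate Y = exp(rnormal(6, 1))                  // income per capita
generate A = 0.2*Y*runiform()                    // aid
generate S = Y*(0.1 - 0.5*A/Y + 20/Y + rnormal(0, 0.05))   // savings

* New variables needed for the two models
generate SY   = S/Y
generate AY   = A/Y
generate invY = 1/Y

regress SY AY             // model (A)
regress SY AY invY        // model (B)
test invY                 // significance test for the 1/Y term
```

With a single restriction, the F statistic from test invY is the square of the t statistic reported for invY in the model (B) output.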
COMPUTER EXERCISE 6: Diagnostic Tests
You are asked to estimate and carry out diagnostic tests for a cross-sectional data set. Please save
relevant results, print them and bring them to lectures. We will discuss the results and the answers to
the questions in lectures.
1. Load the data file comp6.dta into STATA.
2. The data is from Stewart and is for 24 grouped observations from the UK Family Expenditure
Survey on total household expenditure (EXTOTAL), the number of children in the household
(NCHILD), household expenditure on food (EXFOOD) and the number of households in
each group (NFAM).
3. Estimate the models (A) to (C) below.
N.B. YOU HAVE TO CREATE THE LOG VARIABLES.
(A) $LEXFOOD_i = \beta_0 + \beta_1 LEXTOTAL_i + u_i$
(B) $LEXFOOD_i = \beta_0 + \beta_1 LEXTOTAL_i + \beta_2 NCHILD_i + u_i$
where the L prefix indicates the logarithm of the corresponding variable: for example
LEXFOOD = log(EXFOOD).
(C) $EXFOOD_i = \beta_0 + \beta_1 EXTOTAL_i + \beta_2 NCHILD_i + u_i$
QUESTIONS C6
1. Interpret the coefficients of models (A) and (B) and test their individual significance.
Compare the LEXTOTAL coefficients in the two models.
2. Explain why “omitted variable bias” may affect the results from (A).
3. Use specification plots (rvfplot or rvpplot) to find any patterns in the residuals in
each of the three models.
4. Using the diagnostic tests, test for heteroscedasticity and functional form
misspecification for each of the models. Is there evidence that a linear function (i.e.
Model (C)) is inappropriate?
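The residual plots and diagnostic tests mentioned in these questions are all post-estimation commands that follow regress. As a sketch of the sequence, the example below uses the auto dataset shipped with Stata in place of comp6.dta, so the variables and the regression itself are purely illustrative.

```stata
* Illustration on Stata's shipped auto data (not the comp6.dta exercise data)
sysuse auto, clear
generate lprice  = ln(price)
generate lweight = ln(weight)

regress lprice lweight foreign

rvfplot, yline(0)             // residuals against fitted values
rvpplot lweight, yline(0)     // residuals against a single regressor
estat hettest                 // Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
estat ovtest                  // Ramsey RESET test for functional form
```

Each estat command refers to the most recent regression, so the block has to be repeated after estimating each of models (A), (B) and (C).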
COMPUTER EXERCISE 7: Dynamic Models
No new features in STATA are used, apart from the specification of lagged variables.
• The data is in the file comp7.dta
• The variables are:
RCONS = real consumers’ expenditure (billion, 1985 prices).
RPDI = real personal disposable income (billion, 1985 prices).
RLIQ = real liquid assets of the personal sector (billion, 1985 prices).
Declare the data to be a quarterly time series:
. gen time = quarterly(date, "yq")
. format time %tq
. tsset time, quarterly
Estimate the following models and save the results, including the diagnostic tests.
(A) $LRCONS_t = \beta_0 + \beta_1 LRPDI_t + \beta_2 LRLIQ_t + u_t$
where $LRCONS_t = \log(RCONS_t)$ etc.
(B) $LRCONS_t = \beta_0 + \beta_1 LRPDI_t + \beta_2 LRLIQ_t + \beta_3 LRCONS_{t-1} + u_t$
(C) $LRCONS_t = \beta_0 + \beta_1 LRPDI_t + \beta_2 LRLIQ_t + \beta_3 LRCONS_{t-4} + \beta_4 LRPDI_{t-4} + u_t$
N.B. Lagged variables can be entered directly using the STATA lag operator (e.g.
L4.LRCONS = $LRCONS_{t-4}$), without having to create them in the command window.
QUESTIONS C7
1. Which of the coefficients in model (B) are significantly different from zero?
2. Carry out tests for autocorrelation, heteroscedasticity and functional form for each of
the models.
3. What are the consequences for the OLS estimates of the results of the diagnostic tests
for model (B)?
4. Calculate estimated short-run and long-run income elasticities for models (B) and (C).
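In a model with a lagged dependent variable, the short-run effect of a regressor is its own coefficient, and the long-run effect is that coefficient divided by one minus the coefficient on the lagged dependent variable. The sketch below shows one way to obtain the long-run figure with a standard error using nlcom, on simulated data; the variable names y and x and all numbers in the data-generating step are hypothetical.

```stata
* Simulated partial-adjustment model (hypothetical values, for syntax only)
clear
set seed 821
set obs 120
generate t = _n
tsset t
generate x = rnormal()
generate y = 0
replace  y = 0.5 + 0.3*x + 0.6*y[_n-1] + rnormal(0, 0.1) in 2/l   // built up recursively

regress y x L.y                       // same structure as model (B): one lag of the dependent variable

* Long-run effect of x = short-run coefficient / (1 - coefficient on the lag)
nlcom _b[x] / (1 - _b[L.y])

* Autocorrelation tests that remain valid with a lagged dependent variable
estat bgodfrey, lags(4)
estat durbinalt
```

By construction the true long-run effect in this simulation is 0.3/(1 - 0.6) = 0.75. For a structure like model (C), the numerator would be the sum of the current and lagged income coefficients.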
COMPUTER EXERCISE 7B: General-to-Specific Approach
This brief exercise illustrates the principle of the general-to-specific approach. The dataset to be used
is on the network server in the file comp7b.dta.
(a) Estimate the following autoregressive model for income:
$\log Y_t = \beta_0 + \beta_1 \log Y_{t-1} + \beta_2 \log Y_{t-2} + \beta_3 \log Y_{t-3} + \beta_4 \log Y_{t-4} + u_t$
You will need to transform the original variables into the natural log form before generating lags. The
Stata lag operator could be helpful here. For instance,
gen lrpdi_2 = L2.lrpdi
generates the second lag of lrpdi, i.e. $lrpdi_{t-2}$.
(b) Eliminate the least significant lag and re-estimate. Repeat this process until all remaining lags
are significant (at 5%). Test the final specification against the original model to see if the
restrictions imposed are jointly valid (NB: the STATA command sw performs both forward
and backward stepwise estimation).
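The elimination procedure in (b) and the final joint test can be sketched as follows on a simulated AR(1) series. The series, the seed and the variable names are hypothetical, and the set of lags tested jointly (lags 2 to 4) is chosen only because the simulated process has one lag by construction; with real data the lags to test are whichever ones were eliminated.

```stata
* Simulated AR(1) series in logs (hypothetical values, for syntax only)
clear
set seed 7
set obs 200
generate t = _n
tsset t
generate ly = 0
replace  ly = 0.2 + 0.8*ly[_n-1] + rnormal(0, 0.05) in 2/l

* Create the four lags as ordinary variables
forvalues j = 1/4 {
    generate ly_`j' = L`j'.ly
}

* General model with four lags, then a joint F test of the candidate exclusions
regress ly ly_1 ly_2 ly_3 ly_4
test ly_2 ly_3 ly_4

* Automated backward elimination at the 5% level
stepwise, pr(0.05): regress ly ly_1 ly_2 ly_3 ly_4
```

Running test on the general model compares the restricted and unrestricted specifications on the same estimation sample.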
COMPUTER EXERCISE 8: Instrumental Variable Estimation
I will illustrate how to use ivregress (2sls) with a classical study of male wages (Griliches JPE
1976). Griliches models log real wage as a function of:
s: years of schooling;
expr: years of experience;
rns: South dummy;
smsa: urban/rural dummy;
tenure: years of tenure
and a set of year dummies since the data are a set of pooled cross sections. The suspected endogenous
variable is iq (the worker’s IQ score), which is believed to contain measurement error.
Load the dataset comp8.dta into Stata.
1) Estimate Two-Stage Least Squares (2SLS), instrumenting iq with med, kww, age and mrt (mother’s
level of education, the score on another standardized test, own age and own marital status), and test for
over-identifying restrictions:
ivregress 2sls lw s expr tenure rns smsa _Iyear* (iq=med kww age mrt), first
2) Rerun 2SLS, but only using med, kww as instruments while treating mrt as exogenous, and test
again for over-identifying restrictions.
ivregress 2sls lw s expr tenure rns smsa _Iyear* mrt (iq=med kww), first
3) Carry out the Hausman test for endogeneity in IV estimation.
quietly ivregress 2sls lw s expr tenure rns smsa _Iyear* mrt (iq=med kww)
estimates store iv
quietly reg lw s expr tenure rns smsa _Iyear* mrt iq
estimates store ols
hausman iv ols, constant sigmamore
QUESTIONS C8
1. Compare your IV (2SLS) and OLS estimates. Comment on the differences in coefficients.
2. Are you convinced that your instruments are both relevant and exogenous?
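Instrument relevance and exogeneity can be examined with the post-estimation commands that follow ivregress. The sketch below shows the sequence on simulated data with one endogenous regressor and two valid instruments; all variable names and numbers are hypothetical, and only the command sequence carries over to the comp8.dta exercise.

```stata
* Simulated IV setting (hypothetical values, for syntax only)
clear
set seed 1976
set obs 500
generate z1 = rnormal()                           // instrument 1
generate z2 = rnormal()                           // instrument 2
generate v  = rnormal()                           // unobservable that makes x endogenous
generate x  = 0.5*z1 + 0.5*z2 + v
generate y  = 1 + 0.8*x + 0.7*v + rnormal()

ivregress 2sls y (x = z1 z2), first

estat firststage        // instrument relevance: first-stage F and partial R-squared
estat overid            // tests of over-identifying restrictions (Sargan and Basmann)
estat endogenous        // Durbin and Wu-Hausman tests of endogeneity
```

estat overid is available only when there are more instruments than endogenous regressors, which holds both here and in the two specifications of the exercise.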
COMPUTER EXERCISE 9: The Computer-based coursework Project
The main purpose of this exercise is to familiarize yourself with the dataset to be used in the project,
which is a 20% random sample of working-age men in England from the UK Quarterly Labour Force
Survey (QLFS). You should make an attempt to obtain a consistent estimator when one or more of
your regressors are correlated with the error term, using the instrumental variable (IV) approach.
The length of the computer-based coursework project is 1,500 words, plus an appendix of up to 5 pages
containing summary statistics and main estimation results. The work should be submitted to the
Economics General Office no later than 12.00 on Friday 24th January 2014. In fairness to those who
meet the due date and time, no work will be accepted after this time and a zero grade will be recorded
unless there are acceptable, documented medical or other reasons for late submission. You are advised
to begin your work for the assignment well before the end of term.
The computer-based coursework project assesses the writing, modelling, literature, computing,
interpretation and empirical research development learning outcomes. You are expected to select and
estimate your own model, interpret the results and evaluate the adequacy of your model. You are not
expected to undertake a comprehensive search for an adequate model. The work will be marked on the
quality of your interpretation and evaluation of your chosen model, not necessarily on the success of
finding a valid instrument.
Here are a few general tips:
• Motivate your paper with a brief literature review
• Include a data section with summary statistics for the key variables
• Check for outliers and inconsistencies before running regressions
• Run diagnostic tests after regression to assess the validity of the empirical model
• Interpret the empirical findings and discuss the policy implications if necessary
• Carry out sensitivity (robustness) checks if possible
• Summarize your findings
• Don’t forget your references
Chapter 17 of Wooldridge (2009) offers a nice guide on how to carry out an empirical project.
* 1) Create and save a 50% personalized random sample using a unique seed number
*    (such as your date of birth)
use samp821, clear
set seed xxxxxx   // e.g. 880301 if you were born on the 1st March 1988
sample 50, by(lfsyear nvqequiv)   // create a 50% random sample within each by group
save sample50, replace
* 2) Check for outliers and inconsistencies
tab lfsyear nvqequiv
count if logwage==.   // find the number of observations with missing real hourly wages
codebook logwage
egen lgwgpc1 = pctile(logwage), p(1) by(nvqequiv)
egen lgwgpc99 = pctile(logwage), p(99) by(nvqequiv)
table lfsyear nvqequiv, c(mean logwage median logwage mean lgwgpc1 mean lgwgpc99) format(%4.2f)
keep if logwage>=lgwgpc1 & logwage<=lgwgpc99   // drop the top and bottom 1% wages
table lfsyear nvqequiv, c(mean logwage) format(%4.2f) row col
table nvqequiv highqvoc, c(mean logwage) format(%4.2f) row col
gen age_sq = age_^2   // create the quadratic term for age_
tab nvqlv2 anyqual
* 3) Summary statistics for the key variables
su logwage nvqlv2 anyqual married cohab age_ age_sq nonwhite lim_dis lfsyear london se
* 4) OLS estimation and diagnostic tests
* 4a) treating qualifications as continuous
xi: reg logwage nvqequiv married cohab age_ age_sq nonwhite lim_dis i.lfsyear london se if nvqequiv>=0 & nvqequiv<=2
* 4b) treating qualifications as categorical or binary
xi: reg logwage i.nvqequiv married cohab age_ age_sq nonwhite lim_dis i.lfsyear london se
xi: reg logwage nvqlv2 married cohab age_ age_sq nonwhite lim_dis i.lfsyear london se
xi: reg logwage anyqual married cohab age_ age_sq nonwhite lim_dis i.lfsyear london se
* 5) Simple IV using either nvqlv2 or anyqual as the education measure
xi: ivregress 2sls logwage (nvqlv2=rosla) married cohab age_ age_sq nonwhite lim_dis i.lfsyear london se, first
xi: ivregress 2sls logwage (anyqual=rosla) married cohab age_ age_sq nonwhite lim_dis i.lfsyear london se, first
exit
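Step 4 of the do-file mentions diagnostic tests but lists no commands for them. The lines below are a minimal sketch of what could be added before `exit`, reusing the variable names from the do-file and the `estat hettest` and `estat ovtest` commands that appear in the class tests. The choice of which regression to test, and the use of `estat firststage` and `estat endogenous` after the IV step, are illustrative assumptions rather than part of the original exercise.

```stata
* Sketch only: diagnostic tests after one of the OLS regressions in step 4b
xi: reg logwage nvqlv2 married cohab age_ age_sq nonwhite lim_dis i.lfsyear london se
estat hettest      // Breusch-Pagan / Cook-Weisberg test; Ho: constant variance
estat ovtest       // Ramsey RESET test; Ho: model has no omitted variables

* Sketch only: checks after the first IV regression in step 5
xi: ivregress 2sls logwage (nvqlv2=rosla) married cohab age_ age_sq nonwhite lim_dis i.lfsyear london se, first
estat firststage   // first-stage statistics: is rosla a weak instrument?
estat endogenous   // Durbin and Wu-Hausman tests; Ho: nvqlv2 is exogenous
```

A small p-value from `estat hettest` would suggest re-running the regressions with heteroskedasticity-robust standard errors, for example by adding `vce(robust)` as an option.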
COMPUTER EXERCISE 10: Simultaneous Equation Systems
The data used for this exercise is simeq.dta.
The Klein Model Number 1 is a very simple, highly aggregated linear model for the US economy in
the inter-war period. While it is not necessarily an accurate model, it is useful for pedagogical
purposes. The model consists of three behavioural equations and five identities:
C_t  = α_0 + α_1 W_t + α_2 Π_t + α_3 Π_{t-1} + ε_{1t}          (1)
I_t  = β_0 + β_1 Π_t + β_2 Π_{t-1} + β_3 K_{t-1} + ε_{2t}      (2)
W1_t = γ_0 + γ_1 E_t + γ_2 E_{t-1} + γ_3 t + ε_{3t}            (3)
Y_t + T_t = C_t + I_t + G_t      Total product identity          (4)
Y_t = Π_t + W_t                  Income                          (5)
K_t = I_t + K_{t-1}              Capital stock dynamics          (6)
W_t = W1_t + W2_t                Wage bill                       (7)
E_t = Y_t + T_t - W2_t           Private sector product          (8)
where
C = consumers’ expenditure
I = net investment
Π = profits
K = capital stock
E = private sector product
G = Government expenditure
W = Wages
W1 = Private sector wages
W2 = Government sector wages
Y = income
T = taxes
t = time trend
ε_{1t}, ε_{2t}, ε_{3t} are serially uncorrelated error terms.
The three behavioural equations are:
• a consumption function (1), which allows for different propensities to consume from wage and profit income and allows for simple dynamics by including a lagged profits term;
• an investment function (2), a cash-flow type equation typical of much early US econometric work on investment, in which investment is related to current and past profits and the beginning-of-year (i.e. end of previous year) capital stock;
• an equation determining the private sector wage bill (3) as a function of current and lagged private sector product and a trend effect to capture productivity growth.
The five identities close the model. Klein specifies that the variables C_t, I_t, W1_t, Y_t, Π_t, K_t, W_t and E_t
are endogenous, and the remaining variables are exogenous.
1. Establish the degree of identification of each of the three behavioural equations (1), (2) and (3).
2. The dataset to be used in this exercise is contained in the file simeq.dta on the network server in the usual directory. This contains the required data for estimation of the above model - note that KLAG is the capital stock already lagged by one period. Estimate each behavioural equation by OLS and interpret and appraise your results.
3. Write down the reduced form equations for W_t, Π_t and E_t (in general terms - do not try to specify the reduced form parameters in terms of the structural coefficients!). Estimate the reduced form equations by OLS and re-estimate the behavioural structural equations using the fitted values of W_t, Π_t and E_t as appropriate. Compare and contrast these estimates with the OLS estimates obtained above.
4. Finally replicate these indirect 2SLS estimates by estimating each of equations (1), (2), and (3) directly by two stage least squares.
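Steps 3 and 4 can be sketched in Stata for the consumption function (1). Apart from KLAG, which is named in the exercise, every variable name below (C, W, P, PLAG, ELAG, G, T, W2, TREND) is a hypothetical placeholder; the actual names in simeq.dta should be checked with `describe` first. The instrument list is the set of exogenous and predetermined variables implied by the model.

```stata
* Variable names other than KLAG are hypothetical placeholders
use simeq, clear

* Step 3 by hand for equation (1):
* first stage - reduced forms for the endogenous regressors W and P
reg W G T W2 TREND PLAG KLAG ELAG
predict What, xb
reg P G T W2 TREND PLAG KLAG ELAG
predict Phat, xb
* second stage - replace W and P by their fitted values
reg C What Phat PLAG

* Step 4: direct 2SLS; PLAG is an included exogenous regressor,
* the remaining predetermined variables are the excluded instruments
ivregress 2sls C PLAG (W P = G T W2 TREND KLAG ELAG)
```

Provided both approaches use the same estimation sample, the coefficient estimates should coincide. The standard errors from the manual second stage are not valid, because its residuals are computed from the fitted values rather than the actual values of W and P; those reported by `ivregress` are the ones to use.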
QUESTIONS C10
1. Comment on the econometric specification of the model.
School of Economics, University of Kent
Class Test for EC821 Econometric Methods,
13th November, 2012
There are TWO sections. Candidates should answer the question in Section A and one of the
two questions in Section B. The percentage of marks is given in square brackets.
Section A
Q1 [60 marks]:
The Stata output for Model A in the Appendix is based on a random sample of non-UK born
female immigrants in Great Britain who have obtained post-compulsory qualifications in the
UK.
a) What is the expected log real hourly wage of a 50 year-old native English speaker, who has a degree and lives in Scotland? At what age is her wage expected to peak? [10]
b) Interpret the coefficients for degree and eal respectively. [10]
c) Are the regressors statistically significant individually? Are they statistically significant jointly? Comment on the goodness-of-fit of the model. [10]
d) Comment on the following diagnostic tests for Model A. [10]
i)
. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of logwage

         chi2(1)      =     2.03
         Prob > chi2  =   0.1545

ii)
. estat ovtest

Ramsey RESET test using powers of the fitted values of logwage
       Ho:  model has no omitted variables
                 F(3, 395) =      0.40
                  Prob > F =      0.7550
e) In Model B in the Appendix, four new interaction terms of lse with the regressors are
added (e.g. lse_age is defined as lse*age). Explain the rationale for including these extra
regressors. Compare the goodness-of-fit of the two models. Use a formal statistical test to
determine whether the inclusion of the additional regressors in Model B is justified. [20]
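One standard choice of formal test for part e) is the F-test for the joint significance of the four added interaction terms, with Model A as the restricted model and Model B as the unrestricted one. The sketch below uses the residual sums of squares reported in the Appendix; the symbols RSS_R, RSS_U and q are introduced here for illustration only.

```latex
F = \frac{(RSS_R - RSS_U)/q}{RSS_U/(n-k-1)}
  = \frac{(77.4233 - 76.4809)/4}{76.4809/394}
  \approx \frac{0.2356}{0.1941}
  \approx 1.21
```

Under the null hypothesis that all four interaction coefficients are zero, the statistic follows an F(4, 394) distribution, whose 5% critical value is approximately 2.4. On these figures the null would not be rejected.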
Section B
Q2 [40 marks]:
Briefly explain the following terms:
a) Omitted variable bias (OVB) [20]
b) Cochrane-Orcutt Estimator [20]
Q3 [40 marks]:
Discuss the usefulness of the difference-in-difference (DID) approach in policy
(programme) evaluation. Use an example to illustrate if necessary.
End
Estimation Results Appendix
Model A
      Source |       SS       df       MS              Number of obs =     404
-------------+------------------------------           F(  5,   398) =   24.79
       Model |  24.1105589     5  4.82211177           Prob > F      =  0.0000
    Residual |  77.4233283   398  .194530976           R-squared     =  0.2375
-------------+------------------------------           Adj R-squared =  0.2279
       Total |  101.533887   403  .251945129           Root MSE      =  .44106

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0764121   .0177832     4.30   0.000     .0414513    .1113728
       agesq |   -.000843   .0002216    -3.80   0.000    -.0012788   -.0004073
      degree |   .3894523   .0454262     8.57   0.000     .3001471    .4787575
         eal |  -.0915466    .044975    -2.04   0.042    -.1799649   -.0031283
         lse |    .143367   .0461615     3.11   0.002     .0526161    .2341178
       _cons |   .5539594   .3430593     1.61   0.107    -.1204754    1.228394
------------------------------------------------------------------------------
Note: logwage is the natural logarithm of real hourly wage, age is age and agesq is age
squared, degree equals one if the respondent holds any degree and zero otherwise, eal equals
one if the respondent is a non-native English speaker and zero otherwise, and lse is an
indicator for living in the Southeast region (including London) of England.
Model B
      Source |       SS       df       MS              Number of obs =     404
-------------+------------------------------           F(  9,   394) =   14.34
       Model |  25.0530159     9  2.78366843           Prob > F      =  0.0000
    Residual |  76.4808712   394  .194113886           R-squared     =  0.2467
-------------+------------------------------           Adj R-squared =  0.2295
       Total |  101.533887   403  .251945129           Root MSE      =  .44058

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |    .078254   .0295438     2.65   0.008     .0201708    .1363371
       agesq |  -.0008332   .0003643    -2.29   0.023    -.0015493    -.000117
      degree |   .5073835   .0787403     6.44   0.000     .3525798    .6621872
         eal |  -.0165463   .0767522    -0.22   0.829    -.1674415    .1343489
         lse |   .4374799   .7164851     0.61   0.542    -.9711321    1.846092
     lse_age |  -.0046711   .0371022    -0.13   0.900    -.0776142    .0682719
   lse_agesq |   .0000115   .0004602     0.03   0.980    -.0008932    .0009163
  lse_degree |  -.1794237   .0964716    -1.86   0.064    -.3690872    .0102398
     lse_eal |  -.1149829   .0947365    -1.21   0.226    -.3012351    .0712693
       _cons |   .3823179   .5750189     0.66   0.507     -.748171    1.512807
------------------------------------------------------------------------------
Note: logwage, age, agesq, degree, eal and lse are defined as above. lse_age=lse*age,
lse_agesq=lse*agesq, lse_degree=lse*degree and lse_eal=lse*eal.
School of Economics, University of Kent
Class Test for EC821 Econometric Methods,
15th November, 2011
There are TWO sections. Candidates should answer the question in Section A and one of the
two questions in Section B. The percentage of marks is given in square brackets.
Section A
Q1 [60 marks]:
The following Stata output is based on a random sample of male graduates (i.e. with at least a
first degree) in the UK aged 25-55 and in full-time employment.
Model A
      Source |       SS       df       MS              Number of obs =     533
-------------+------------------------------           F(  5,   527) =   14.97
       Model |   12.530607     5  2.50612139           Prob > F      =  0.0000
    Residual |  88.2395447   527  .167437466           R-squared     =  0.1243
-------------+------------------------------           Adj R-squared =  0.1160
       Total |  100.770152   532  .189417578           Root MSE      =  .40919

------------------------------------------------------------------------------
    lrhrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1039071    .020438     5.08   0.000     .0637571     .144057
       agesq |  -.0011493    .000259    -4.44   0.000     -.001658   -.0006405
    highrdeg |   .0842088   .0415832     2.03   0.043     .0025195     .165898
      london |   .1327832    .060486     2.20   0.029     .0139599    .2516066
          se |   .0988441   .0507766     1.95   0.052    -.0009052    .1985935
       _cons |   .5586906   .3897368     1.43   0.152    -.2069378    1.324319
------------------------------------------------------------------------------
where lrhrwage is the natural logarithm of real hourly wage, age is age and agesq is age
squared, highrdeg equals one if the respondent holds a higher degree and zero otherwise,
london and se are indicators for living in London and the Southeast region (excluding
London) respectively.
a) What is the interpretation of the coefficient for the term _cons in the Stata output? Provide an interpretation of the coefficient on highrdeg. Which region in the UK has the lowest expected wage for male graduates? [15]
b) Briefly comment on the statistical significance of each regressor. Are the regressors statistically significant jointly? [10]
c) What is the expected log real hourly wage of a 40-year old male graduate who has a higher degree and lives in London? At what age is his wage expected to peak? [10]
d) Comment on the following diagnostic tests for Model A. [10]
i)
. estat ovtest

Ramsey RESET test using powers of the fitted values of lrhrwage
       Ho:  model has no omitted variables
                 F(3, 524) =      2.35
                  Prob > F =      0.0720

ii)
. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of lrhrwage

         chi2(1)      =     6.82
         Prob > chi2  =   0.0090
e) In Model B below, the region dummies are left out. Compare the goodness-of-fit of the
two models. Use a formal statistical test to determine whether the region dummies are jointly
significant. [15]
Model B
      Source |       SS       df       MS              Number of obs =     533
-------------+------------------------------           F(  3,   529) =   22.18
       Model |  11.2606921     3  3.75356404           Prob > F      =  0.0000
    Residual |  89.5094595   529  .169205028           R-squared     =  0.1117
-------------+------------------------------           Adj R-squared =  0.1067
       Total |  100.770152   532  .189417578           Root MSE      =  .41135

------------------------------------------------------------------------------
    lrhrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1012471   .0204911     4.94   0.000     .0609933     .141501
       agesq |  -.0011168   .0002597    -4.30   0.000     -.001627   -.0006066
    highrdeg |   .0773654   .0416706     1.86   0.064    -.0044948    .1592255
       _cons |   .6397865   .3901326     1.64   0.102    -.1266128    1.406186
------------------------------------------------------------------------------
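The joint test of the region dummies asked for in part e) can be computed by hand from the two Residual rows. The lines below are a sketch using Stata as a calculator; the scalar names are illustrative, and the RSS values and degrees of freedom are copied from the Model A and Model B output.

```stata
* Sketch: F-test of Ho "the coefficients on london and se are both zero"
scalar rss_r = 89.5094595    // Residual SS, Model B (restricted)
scalar rss_u = 88.2395447    // Residual SS, Model A (unrestricted)
scalar q     = 2             // number of restrictions
scalar df_u  = 527           // residual degrees of freedom in Model A
scalar Fstat = ((rss_r - rss_u)/q) / (rss_u/df_u)
display "F(2, 527)         = " Fstat
display "p-value           = " Ftail(q, df_u, Fstat)
display "5% critical value = " invFtail(q, df_u, 0.05)
```

The statistic works out at roughly 3.79. With access to the underlying data, the same test could be obtained by typing `test london se` immediately after the Model A regression.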
Section B
Q2 [40 marks]:
Briefly explain the following terms:
a) Near-perfect multi-collinearity [20]
b) Linear Probability Model (LPM) [20]
Q3 [40 marks]:
Discuss the consequences of underfitting a regression model (omitting relevant variables).
Explain how you can test for this potential problem using a formal test.
End
School of Economics, University of Kent
Class Test for EC821 Econometric Methods,
16th November, 2010
There are TWO sections. Candidates should answer the question in Section A and one of the
two questions in Section B. The percentage of marks is given in square brackets.
Section A
Q1 [60 marks]:
The following Stata output is based on a 10% random sample of male employees aged 25-59
in the 2000 UK Quarterly Labour Force Survey (n=495).
Model A: All Employees
      Source |       SS       df       MS              Number of obs =     495
-------------+------------------------------           F(  3,   491) =   64.10
       Model |  42.7914008     3  14.2638003           Prob > F      =  0.0000
    Residual |  109.259875   491  .222525203           R-squared     =  0.2814
-------------+------------------------------           Adj R-squared =  0.2770
       Total |  152.051275   494  .307796104           Root MSE      =  .47173

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         edu |   .1005878   .0077948    12.90   0.000     .0852725    .1159032
         age |   .1139737   .0225003     5.07   0.000      .069765    .1581824
       agesq |   -.001263   .0002707    -4.67   0.000    -.0017948   -.0007311
       _cons |  -1.793552   .4758444    -3.77   0.000    -2.728494    -.858609
------------------------------------------------------------------------------
logwage is the natural logarithm of real hourly wage, edu is age left full-time continuous
education, age is age and agesq is age squared. The Residual Sum of Squares (RSS) is equal
to 109.260 (keeping 3 decimal places).
a) What is the interpretation of the constant term (_cons in the Stata output)? Provide an
interpretation of the coefficient for edu and comment on its statistical significance. What is
the effect of age on log wages? What is the expected log real hourly wage of a 30-year old
who left full-time continuous education at age 18? Are the regressors statistically significant
jointly? [20]
b) Comment on the following diagnostic tests for Model A. [15]
i)
. predict res1, res
. sktest res1

                    Skewness/Kurtosis tests for Normality
                                                         ------- joint ------
    Variable |    Obs   Pr(Skewness)   Pr(Kurtosis)  adj chi2(2)    Prob>chi2
-------------+---------------------------------------------------------------
        res1 |    495      0.1706         0.0030         9.88         0.0072

ii)
. estat ovtest

Ramsey RESET test using powers of the fitted values of logwage
       Ho:  model has no omitted variables
                 F(3, 488) =      0.37
                  Prob > F =      0.7742

iii)
. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of logwage

         chi2(1)      =     0.31
         Prob > chi2  =   0.5797
c) A researcher argues that separate regressions should be run for people who are members of
the trade-union (n1=146) and those who are not (n2=349), on the grounds that the trade
unions have a direct impact on wages through collective wage bargaining. The resulting
residuals sum of squares (RSS) are 16.955 and 89.099 for members and non-members
respectively. On the basis of the available evidence, do you think that the pooling of the two
sub-samples as in Model A is still justified? [15]
d) Comment on the view that actually it would be more interesting to test whether, after
allowing for an intercept difference, the slopes for union members and non-members are still
the same. [10]
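The pooling question in part c) is usually addressed with a Chow test. The sketch below plugs in the residual sums of squares given in the question, with k = 4 parameters per regression (constant, edu, age, agesq) and n = 495; the subscripts P, 1 and 2 for the pooled, member and non-member regressions are notation introduced here.

```latex
F = \frac{\left[RSS_P - (RSS_1 + RSS_2)\right]/k}{(RSS_1 + RSS_2)/(n - 2k)}
  = \frac{\left[109.260 - (16.955 + 89.099)\right]/4}{(16.955 + 89.099)/(495 - 8)}
  = \frac{3.206/4}{106.054/487}
  \approx 3.68
```

Under the null of identical coefficients in the two sub-samples the statistic follows F(4, 487), with a 5% critical value of approximately 2.4, so these figures would point towards rejecting the pooled specification. The variant in part d) would compare a pooled model that includes a union dummy with the two separate regressions, giving 3 restrictions instead of 4; the RSS needed for that comparison is not reported here.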
Section B
Q2 [40 marks]:
Briefly explain the following terms:
a) autocorrelation and the Cochrane-Orcutt estimator [20]
b) the Difference-in-differences (DID) estimator [20]
Q3 [40 marks]:
Explain the consequences of failure of the homoskedasticity assumption. Discuss the
advantages and disadvantages of the following two approaches to deal with the problem:
a) Computing heteroskedasticity-robust statistics;
b) Using Weighted Least Squares (WLS) method.
End
UNIVERSITY OF KENT
EC821/13
FACULTY OF SOCIAL SCIENCES
LEVEL M EXAMINATION
SCHOOL OF ECONOMICS
ECONOMETRIC METHODS
Day, date : time
(exam is 2 hours long)
There are SIX questions, three in Section A and three in Section B. All questions carry equal
weight. Candidates should answer TWO questions, ONE from SECTION A and ONE from
SECTION B.
Statistical tables are attached to the paper. Approved calculators may be used.
A percentage breakdown of marks within each question is given as a guide to candidates in
their allocation of time.
SECTION A
Answer ONE question from this section
1 A researcher investigates the pay penalty of being a non-native English speaker for UK male
immigrants with some UK qualifications.
Model A:
      Source |       SS       df       MS              Number of obs =     601
-------------+------------------------------           F(  5,   595) =   33.04
       Model |  35.9785618     5  7.19571236           Prob > F      =  0.0000
    Residual |  129.589686   595  .217797791           R-squared     =  0.2173
-------------+------------------------------           Adj R-squared =  0.2107
       Total |  165.568248   600  .275947079           Root MSE      =  .46669

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      degree |    0.42988    0.03854    11.16   0.000      0.35419     0.50556
         age |    0.02173    0.02207     0.98   0.325     -0.02163     0.06508
       agesq |   -0.00014    0.00028    -0.50   0.616     -0.00069     0.00041
    londonse |    0.12710    0.03860     3.29   0.001      0.05128     0.20291
   nonnative |   -0.13854    0.04140    -3.35   0.001     -0.21986    -0.05723
       _cons |    1.63717    0.42583     3.84   0.000      0.80086     2.47347
------------------------------------------------------------------------------
where logwage denotes log real hourly wage, degree is a dummy of holding a degree level
qualification, age is the age of the immigrant and agesq is the quadratic term, londonse is a
dummy for living in Southeast England including London, and nonnative is equal to one if the
immigrant is not a native English speaker.
a) What is the expected log real hourly wage of a 30-year old graduate who is a non-native English speaker and lives in London? At what age is his wage expected to peak? What is the wage penalty of not being a native English speaker? (15%)
b) Are the slope coefficients statistically significant individually and jointly? Would you drop the age variables? Comment on the goodness-of-fit of the regression model. (15%)
c) Comment on the following diagnostic tests. (20%)

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of logwage

         chi2(1)      =     3.55
         Prob > chi2  =   0.0597

Ramsey RESET test using powers of the fitted values of logwage
       Ho:  model has no omitted variables
                 F(3, 592) =      0.33
                  Prob > F =      0.8008
d) In Model B below, controls for being non-white (nonwhite) and being born in a developing
country (dvlpng) are added to the model. Comment on the estimates and compare them with their
counterparts in Model A. Is the inclusion of these extra regressors justified?
(25%)
Model B:
      Source |       SS       df       MS              Number of obs =     601
-------------+------------------------------           F(  7,   593) =   24.70
       Model |  37.3716564     7  5.33880806           Prob > F      =  0.0000
    Residual |  128.196591   593  .216183122           R-squared     =  0.2257
-------------+------------------------------           Adj R-squared =  0.2166
       Total |  165.568248   600  .275947079           Root MSE      =  .46495

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      degree |    0.43454    0.03850    11.29   0.000      0.35893     0.51015
         age |    0.02071    0.02200     0.94   0.347     -0.02249     0.06391
       agesq |   -0.00013    0.00028    -0.45   0.652     -0.00067     0.00042
    londonse |    0.13386    0.03861     3.47   0.001      0.05802     0.20969
   nonnative |   -0.13474    0.04128    -3.26   0.001     -0.21581    -0.05367
    nonwhite |   -0.08542    0.06365    -1.34   0.180     -0.21044     0.03959
      dvlpng |   -0.06895    0.04359    -1.58   0.114     -0.15455     0.01666
       _cons |    1.76811    0.42789     4.13   0.000      0.92774     2.60847
------------------------------------------------------------------------------
e) Discuss whether you could regress a binary indicator for working (1=working, 0=not working)
on the same set of right-hand-side regressors as in Model B. What specific econometric issues
would arise from such a model?
(25%)
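Regressing a binary outcome on these regressors by OLS gives a linear probability model. The sketch below illustrates two of its well-known issues, heteroskedastic errors and fitted probabilities that can fall outside the unit interval. The variable `working` is hypothetical (it does not appear in the exam output), and the sketch assumes access to the underlying data.

```stata
* Hypothetical sketch: "working" is an assumed 0/1 variable (1 = working)
* LPM errors are heteroskedastic by construction, hence robust standard errors
reg working degree age agesq londonse nonnative nonwhite dvlpng, vce(robust)

* Fitted values are predicted probabilities; count those outside [0, 1]
predict phat, xb
count if (phat < 0 | phat > 1) & !missing(phat)
```

A probit or logit model (`probit` or `logit` with the same variable list) is a common alternative that keeps predicted probabilities inside the unit interval.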
2 Write short essays on TWO of the following:
a) The Weighted Least Squares (WLS). (50%)
b) The Cochrane-Orcutt procedure. (50%)
c) The Chow test. (50%)
3 Discuss the implications of underfitting a model and the strategies one can adopt to deal with the problem in empirical work.
SECTION B
Answer ONE question from this section
4. A researcher investigates the relationship between weekly earnings and (weekly) overtime hours for working mothers with dependent children.
lnY = β10 + β11overtime + β12edu + β13age + β14public + β15union + u1     (1)
overtime = β20 + β21lnY + β22edu + β23age + β24ageyngkid + u2             (2)
where lnY is log real weekly earnings, overtime is weekly overtime hours, edu is years of education, age is the age of the respondent, public is a dummy for working in the public sector, union is a dummy for being a member of a trade union and ageyngkid is the age of the youngest child.
a) Why are OLS estimates for both equations biased? (20%)
b) Under what conditions are these two equations identified? (20%)
c) Write down the reduced-form for log weekly earnings, i.e. eq. (1). (20%)
d) Briefly describe how you would solve this simultaneous equation system. (20%)
e) Suppose it turns out that both public sector jobs and trade union membership have a direct effect on overtime hours. What problem does this pose for the identification of the system? (20%)
5. Write short essays on TWO of the following:
a) Weak Instruments. (50%)
b) The Order Condition. (50%)
c) Recursive Systems. (50%)
6. Explain why Two Stage Least Squares (2SLS) can be used to estimate causal relationships when Ordinary Least Squares (OLS) fails. Use examples to illustrate if necessary.
END
UNIVERSITY OF KENT
EC821/12
FACULTY OF SOCIAL SCIENCES
LEVEL M EXAMINATION
SCHOOL OF ECONOMICS
ECONOMETRIC METHODS
Day, date : time
(exam is 2 hours long)
There are SIX questions, three in Section A and three in Section B. All questions carry equal
weight. Candidates should answer TWO questions, ONE from SECTION A and ONE from
SECTION B.
Statistical tables are attached to the paper. Approved calculators may be used.
A percentage breakdown of marks within each question is given as a guide to candidates in
their allocation of time.
SECTION A
Answer ONE question from this section
1 A researcher investigates the part-time pay penalty (PTPP) for women in the UK. The
following Stata output is based on an OLS regression using a sample of prime-aged female
graduates working as employees in Southeast England from the 2010 UK Quarterly Labour
Force Survey (QLFS).
Model A: Graduate sample
      Source |       SS       df       MS              Number of obs =     559
-------------+------------------------------           F(  4,   554) =   14.05
       Model |  9.50228844     4  2.37557211           Prob > F      =  0.0000
    Residual |  93.7019943   554  .169137174           R-squared     =  0.0921
-------------+------------------------------           Adj R-squared =  0.0855
       Total |  103.204283   558  .184953912           Root MSE      =  .41126

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1239444   .0206581     6.00   0.000     .0833667    .1645221
       agesq |  -.0014951   .0002662    -5.62   0.000    -.0020181   -.0009722
      london |   .1289722   .0363192     3.55   0.000      .057632    .2003124
    parttime |  -.0970247   .0414972    -2.34   0.020    -.1785357   -.0155137
       _cons |   .3604125   .3837088     0.94   0.348    -.3932894    1.114114
------------------------------------------------------------------------------
where logwage denotes log real hourly wage, age is the age of the graduate and agesq is the
quadratic term, london is a dummy for living in London and parttime is equal to one if the
respondent works less than 30 hours a week.
a) What is the interpretation of the intercept term? Interpret the slope coefficients. Are the slope coefficients statistically significant individually and jointly? Comment on the goodness-of-fit of the regression model. (30%)
b) Comment on the following diagnostic tests. (20%)

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of logwage

         chi2(1)      =     0.86
         Prob > chi2  =   0.3548

. estat ovtest

Ramsey RESET test using powers of the fitted values of logwage
       Ho:  model has no omitted variables
                 F(3, 551) =      1.11
                  Prob > F =      0.3460
c) In Model B below, analysis was conducted using a sample of women whose highest
qualifications are 2 or more A Levels but otherwise have the same characteristics as the sample in
Model A. Comment on the estimates and compare them with their counterparts in Model A.
(10%)
Model B: Non-graduate sample
      Source |       SS       df       MS              Number of obs =     213
-------------+------------------------------           F(  4,   208) =    5.76
       Model |  4.49693739     4  1.12423435           Prob > F      =  0.0002
    Residual |  40.6018868   208  .195201379           R-squared     =  0.0997
-------------+------------------------------           Adj R-squared =  0.0824
       Total |  45.0988242   212  .212730303           Root MSE      =  .44182

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0982575   .0363186     2.71   0.007     .0266578    .1698573
       agesq |  -.0012374   .0004578    -2.70   0.007    -.0021399   -.0003349
      london |    .074544   .0654488     1.14   0.256    -.0544841    .2035721
    parttime |  -.2692822   .0675418    -3.99   0.000    -.4024364    -.136128
       _cons |   .7528719   .6898957     1.09   0.276    -.6072124    2.112956
------------------------------------------------------------------------------
d) Model C pooled graduates and non-graduates to increase the sample size. Is the pooling
justified?
(20%)
Model C: Pooled sample
      Source |       SS       df       MS              Number of obs =     772
-------------+------------------------------           F(  4,   767) =   15.77
       Model |  12.0149052     4   3.0037263           Prob > F      =  0.0000
    Residual |  146.116448   767  .190503844           R-squared     =  0.0760
-------------+------------------------------           Adj R-squared =  0.0712
       Total |  158.131353   771  .205099032           Root MSE      =  .43647

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1103622   .0185349     5.95   0.000     .0739771    .1467474
       agesq |  -.0013458   .0002375    -5.67   0.000    -.0018121   -.0008796
      london |   .1226312   .0330094     3.72   0.000     .0578318    .1874306
    parttime |  -.1591381   .0367086    -4.34   0.000    -.2311994   -.0870768
       _cons |   .5935165    .346235     1.71   0.087    -.0861641    1.273197
------------------------------------------------------------------------------
e) Discuss how you would modify Model C above to allow for differential part-time pay penalty
for women with different education levels. How would you select your final model specification,
to avoid either over-fitting or under-fitting?
(20%)
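One way to approach part e) is to add a graduate dummy and its interaction with parttime to the pooled regression. The sketch below assumes access to the pooled data; the variable `graduate` and the interaction name `grad_pt` are hypothetical and do not appear in the exam output.

```stata
* Hypothetical sketch: "graduate" is an assumed 0/1 indicator
* (1 = graduate sample of Model A, 0 = non-graduate sample of Model B)
gen grad_pt = graduate*parttime
reg logwage age agesq london parttime graduate grad_pt

test grad_pt            // Ho: the part-time pay penalty is the same for both groups
test graduate grad_pt   // Ho: education shifts neither the intercept nor the penalty
```

In this specification the coefficient on parttime is the penalty for non-graduates, and the sum of the parttime and grad_pt coefficients is the penalty for graduates. The joint F-tests give one route from the general model back to Model C if the added terms turn out to be insignificant.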
2 Write short essays on TWO of the following:
a) The Linear Probability Model (LPM). (50%)
b) The Omitted Variable Bias (OVB). (50%)
c) The Durbin-Watson (DW) test for autocorrelation. (50%)
3 Discuss how natural experiments can be exploited to uncover the treatment effect of government policies. Use examples to illustrate if necessary.
SECTION B
Answer ONE question from this section
4. A researcher investigates the relationship between wage and house ownership for male employees:
lnW = β10 + β11ownhouse + β12edu + β13age + β14age2 + β15union + u1      (1)
ownhouse = β20 + β21lnW + β22edu + β23age + β24age2 + β25anykid + u2     (2)
where lnW is log real hourly wage, ownhouse is a dummy for owning a house (either outright or with a mortgage), age is the age of the respondent, edu is years of education, union is a dummy for being a member of a trade union and anykid is a dummy for having any children.
a) Why are OLS estimates for the wage equation, i.e. equation (1), biased? (20%)
b) Under what conditions are these two equations identified? (20%)
c) Write down the reduced-form for house ownership, i.e. eq. (2). (20%)
d) Briefly describe how you would solve this simultaneous equation system. (20%)
e) Suppose it turns out that having any children has a direct effect on wages. What problem does this pose for the identification of the system? (20%)
5. Write short essays on TWO of the following:
a) The Two Stage Least Squares (2SLS) method. (50%)
b) Structural Equations and Reduced Form (RF). (50%)
c) Indirect Least Squares (ILS) Estimator. (50%)
6. Discuss the identification problem in estimating simultaneous equation models.
END
UNIVERSITY OF KENT
EC821/11
FACULTY OF SOCIAL SCIENCES
LEVEL M EXAMINATION
SCHOOL OF ECONOMICS
ECONOMETRIC METHODS
Day, date : time
(exam is 2 hours long)
There are SIX questions, three in Section A and three in Section B. All questions carry equal
weight. Candidates should answer TWO questions, ONE from SECTION A and ONE from
SECTION B.
Statistical tables are attached to the paper. Approved calculators may be used.
A percentage breakdown of marks within each question is given as a guide to candidates in
their allocation of time.
SECTION A
Answer ONE question from this section
1
A researcher is interested in the economic returns to a Master’s degree. The following
Stata output is based on an OLS regression using a sample of prime-aged male graduates
working as employees in England and Wales from the 2008 UK Quarterly Labour Force
Survey (QLFS).
Model A
      Source |       SS       df       MS              Number of obs =    1520
-------------+------------------------------           F(  3,  1516) =   58.26
       Model |  37.2972485     3  12.4324162           Prob > F      =  0.0000
    Residual |  323.485383  1516   .21338086           R-squared     =  0.1034
-------------+------------------------------           Adj R-squared =  0.1016
       Total |  360.782632  1519  .237513253           Root MSE      =  .46193

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        age_ |   .1067626   .0139003     7.68   0.000     .0794967    .1340284
      age_sq |  -.0011569   .0001744    -6.63   0.000     -.001499   -.0008148
      master |   .1012331   .0287821     3.52   0.000     .0447761    .1576901
       _cons |   .4399224    .267999     1.64   0.101    -.0857656    .9656105
------------------------------------------------------------------------------
where logwage denotes log real hourly wage, age_ is the age of the graduate and age_sq is the
quadratic term, master is a dummy for having a Master’s degree.
a) What is the interpretation of the intercept term? Interpret the slope coefficients. Are the slope coefficients statistically significant individually and jointly? Comment on the goodness-of-fit of the regression model. (30%)
b) Comment on the following diagnostic tests. (20%)

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of logwage

         chi2(1)      =    15.29
         Prob > chi2  =   0.0001

. estat ovtest

Ramsey RESET test using powers of the fitted values of logwage
       Ho:  model has no omitted variables
                F(3, 1513) =      2.54
                  Prob > F =      0.0552
c) In an extended model, dummies for broad undergraduate subjects studied are included: LEM
(Law, Economics and Management), COMB (Combined subjects), and OSSAH (Other Social
Sciences, Arts and Humanities), with STEM (Science, Technology, Engineering and Maths)
omitted. Comment on the coefficients of these dummy variables. Which subjects give the highest
returns and which give the lowest? Is the inclusion of the degree subject dummies justified?
(30%)
Model B
      Source |       SS       df       MS              Number of obs =    1520
-------------+------------------------------           F(  6,  1513) =   36.07
       Model |  45.1531913     6  7.52553189           Prob > F      =  0.0000
    Residual |   315.62944  1513  .208611659           R-squared     =  0.1252
-------------+------------------------------           Adj R-squared =  0.1217
       Total |  360.782632  1519  .237513253           Root MSE      =  .45674

------------------------------------------------------------------------------
     logwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        age_ |   .1052694   .0137467     7.66   0.000     .0783047    .1322341
      age_sq |  -.0011343   .0001725    -6.58   0.000    -.0014726   -.0007959
      master |   .0862536     .02863     3.01   0.003      .030095    .1424122
         LEM |    .054434   .0322357     1.69   0.091    -.0087975    .1176654
        COMB |  -.0222508   .0337596    -0.66   0.510    -.0884714    .0439698
       OSSAH |  -.1654217   .0320012    -5.17   0.000    -.2281932   -.1026502
       _cons |   .4902699   .2654278     1.85   0.065    -.0303756    1.010915
------------------------------------------------------------------------------
d) How would you modify Model B above to allow for differential returns to a Master’s degree for
graduates in different undergraduate subjects? How would you choose your final model
specification between Model B above and the more general specification?
(20%)
2. Write short essays on TWO of the following:
a) The Difference-in-differences (DID) estimator.
(50%)
b) The Cochrane-Orcutt Regression.
(50%)
e) The Chow test.
(50%)
3. Discuss the implications of underfitting and overfitting a regression model.
SECTION B
Answer ONE question from this section
4. A researcher investigates the relationship between wages and job tenure for married men:
lnW = β₁₀ + β₁₁ tenure + β₁₂ edu + β₁₃ age + β₁₄ age² + β₁₅ union + u₁    (1)
tenure = β₂₀ + β₂₁ lnW + β₂₂ edu + β₂₃ age + β₂₄ age² + β₂₅ dist + u₂    (2)
where lnW is log hourly wage, tenure is job tenure (years with the current employer),
edu is years of education, age is the age of the employee, union is a dummy for being
a union member and dist is travel-to-work distance.
a) Why are OLS estimates of both equations biased in general?
(20%)
b) Write down the reduced form for the job tenure equation (i.e. equation (2)).
(20%)
c) Discuss the Order Conditions of both equations.
(20%)
d) Explain in detail how you would solve this simultaneous equation system.
(20%)
e) Another researcher argues that equation (2) is misspecified, as union status should
have a direct effect on job tenure. What problem does this pose for the identification of
the system?
(20%)
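The counting behind the order condition in parts (c) and (e) can be sketched as follows. This is a minimal illustration, assuming that edu, age, age², union and dist are all treated as exogenous and that only lnW and tenure are endogenous; the function and variable names are hypothetical. The order condition is necessary but not sufficient, so the rank condition would still need checking.

```python
def order_condition(excluded_exogenous, included_endogenous_rhs):
    """Classify an equation by the order condition.

    Compares the number of exogenous variables excluded from the equation
    with the number of endogenous variables on its right-hand side.
    """
    k_excluded = len(excluded_exogenous)
    g_included = len(included_endogenous_rhs)
    if k_excluded < g_included:
        return "not identified"
    if k_excluded == g_included:
        return "exactly identified"
    return "over-identified"


# All exogenous variables in the system (assumed exogenous for this sketch)
exogenous = {"edu", "age", "age2", "union", "dist"}

# Exogenous regressors appearing in each equation as written in the question
eq1_exog = {"edu", "age", "age2", "union"}   # wage equation (1)
eq2_exog = {"edu", "age", "age2", "dist"}    # tenure equation (2)

eq1_status = order_condition(exogenous - eq1_exog, {"tenure"})
eq2_status = order_condition(exogenous - eq2_exog, {"lnW"})

# Part (e): union also enters the tenure equation, so equation (2)
# no longer excludes any exogenous variable
eq2_status_with_union = order_condition(exogenous - (eq2_exog | {"union"}), {"lnW"})

print("Equation (1):", eq1_status)
print("Equation (2):", eq2_status)
print("Equation (2) with union added:", eq2_status_with_union)
```

Under these assumptions each equation excludes exactly one exogenous variable (dist from (1), union from (2)), while adding union to equation (2) leaves the tenure equation with no exclusion restriction.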
5. Write short essays on TWO of the following:
d) The Hausman test for endogeneity.
(50%)
e) The Order Conditions.
(50%)
f) Recursive System.
(50%)
6. Discuss the popularity of the Two Stage Least Squares (2SLS) method in applied microeconometrics.
END
EC821 QUANTITATIVE ECONOMICS - STATISTICAL TABLES
TABLE 1: Areas for the standard normal distribution N(0,1)
The table shows the area under the standard normal distribution, N(0,1), between 0 and z.
For example, P(0 < z < 1.40) = 0.4192

  z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
 0.0  0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
 0.1  0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
 0.2  0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
 0.3  0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
 0.4  0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
 0.5  0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
 0.6  0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
 0.7  0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
 0.8  0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
 0.9  0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
 1.0  0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
 1.1  0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
 1.2  0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
 1.3  0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
 1.4  0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
 1.5  0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
 1.6  0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
 1.7  0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
 1.8  0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
 1.9  0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
 2.0  0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
 2.1  0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
 2.2  0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
 2.3  0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
 2.4  0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
 2.5  0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
 2.6  0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
 2.7  0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
 2.8  0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
 2.9  0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
 3.0  0.4987
 3.5  0.4997
 4.0  0.4999
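Entries of Table 1 can be reproduced from the error function, since P(0 < Z < z) = Φ(z) − 0.5 = 0.5·erf(z/√2). The following is a small Python sketch under that identity; the function name is illustrative.

```python
import math


def area_0_to_z(z):
    """P(0 < Z < z) for Z ~ N(0,1), via Phi(z) - 0.5 = 0.5 * erf(z / sqrt(2))."""
    return 0.5 * math.erf(z / math.sqrt(2.0))


# The worked example quoted with the table: P(0 < z < 1.40)
example_area = round(area_0_to_z(1.40), 4)
print(example_area)  # 0.4192

# Central probability between -1.96 and 1.96 is twice the tabulated area
central_95 = 2 * area_0_to_z(1.96)
print(round(central_95, 3))
```

Doubling a tabulated area gives the probability of falling within ±z of the mean, which is how the table is typically used for two-sided tests.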
TABLE 2: Areas for Student’s t distribution
The table shows the critical value of the t distribution with v degrees of freedom and total area in the two tails of the distribution.
For example, if T has a t-distribution with 3 degrees of freedom, then P(-3.182 < T < 3.182) = 0.95

                Two-tail Probability
  df     0.20     0.10     0.05     0.02     0.01
   1    3.078    6.314   12.706   31.821   63.656
   2    1.886    2.920    4.303    6.965    9.925
   3    1.638    2.353    3.182    4.541    5.841
   4    1.533    2.132    2.776    3.747    4.604
   5    1.476    2.015    2.571    3.365    4.032
   6    1.440    1.943    2.447    3.143    3.707
   7    1.415    1.895    2.365    2.998    3.499
   8    1.397    1.860    2.306    2.896    3.355
   9    1.383    1.833    2.262    2.821    3.250
  10    1.372    1.812    2.228    2.764    3.169
  11    1.363    1.796    2.201    2.718    3.106
  12    1.356    1.782    2.179    2.681    3.055
  13    1.350    1.771    2.160    2.650    3.012
  14    1.345    1.761    2.145    2.624    2.977
  15    1.341    1.753    2.131    2.602    2.947
  16    1.337    1.746    2.120    2.583    2.921
  17    1.333    1.740    2.110    2.567    2.898
  18    1.330    1.734    2.101    2.552    2.878
  19    1.328    1.729    2.093    2.539    2.861
  20    1.325    1.725    2.086    2.528    2.845
  21    1.323    1.721    2.080    2.518    2.831
  22    1.321    1.717    2.074    2.508    2.819
  23    1.319    1.714    2.069    2.500    2.807
  24    1.318    1.711    2.064    2.492    2.797
  25    1.316    1.708    2.060    2.485    2.787
  26    1.315    1.706    2.056    2.479    2.779
  27    1.314    1.703    2.052    2.473    2.771
  28    1.313    1.701    2.048    2.467    2.763
  29    1.311    1.699    2.045    2.462    2.756
  30    1.310    1.697    2.042    2.457    2.750
  40    1.303    1.684    2.021    2.423    2.704
  60    1.296    1.671    2.000    2.390    2.660
  80    1.292    1.664    1.990    2.374    2.639
 100    1.290    1.660    1.984    2.364    2.626
   ∞    1.282    1.645    1.96     2.33     2.575
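The final row of Table 2 is the large-sample limit, where the t distribution coincides with the standard normal. A short Python sketch, using only the standard library and an illustrative function name, recovers those limiting critical values from the normal quantile function. The exact value for a two-tail probability of 0.01 is about 2.5758, which the table prints as 2.575.

```python
from statistics import NormalDist


def large_sample_critical_value(two_tail_prob):
    """Critical value c with P(|Z| > c) = two_tail_prob for Z ~ N(0,1)."""
    return NormalDist().inv_cdf(1.0 - two_tail_prob / 2.0)


for alpha in (0.20, 0.10, 0.05, 0.02, 0.01):
    print(alpha, round(large_sample_critical_value(alpha), 3))
```

For small degrees of freedom the t critical values are noticeably larger than these limits, so the normal approximation should only be relied on when the sample is large.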
TABLE 3: Chi-squared (χ²) Distribution

                   Right-tail Probability
  df     0.50     0.20     0.10     0.05     0.02     0.01
   1    0.455    1.642    2.706    3.841    5.412    6.635
   2    1.386    3.219    4.605    5.991    7.824    9.210
   3    2.366    4.642    6.251    7.815    9.837   11.345
   4    3.357    5.989    7.779    9.488   11.668   13.277
   5    4.351    7.289    9.236   11.070   13.388   15.086
   6    5.348    8.558   10.645   12.592   15.033   16.812
   7    6.346    9.803   12.017   14.067   16.622   18.475
   8    7.344   11.030   13.362   15.507   18.168   20.090
   9    8.343   12.242   14.684   16.919   19.679   21.666
  10    9.342   13.442   15.987   18.307   21.161   23.209
  11   10.341   14.631   17.275   19.675   22.618   24.725
  12   11.340   15.812   18.549   21.026   24.054   26.217
  13   12.340   16.985   19.812   22.362   25.472   27.688
  14   13.339   18.151   21.064   23.685   26.873   29.141
  15   14.339   19.311   22.307   24.996   28.259   30.578
  16   15.338   20.465   23.542   26.296   29.633   32.000
  17   16.338   21.615   24.769   27.587   30.995   33.409
  18   17.338   22.760   25.989   28.869   32.346   34.805
  19   18.338   23.900   27.204   30.144   33.687   36.191
  20   19.337   25.038   28.412   31.410   35.020   37.566
  25   24.337   30.675   34.382   37.652   41.566   44.314
  30   29.336   36.250   40.256   43.773   47.962   50.892
  40   39.335   47.269   51.805   55.758   60.436   63.691
  50   49.335   58.164   63.167   67.505   72.613   76.154
  60   59.335   68.972   74.397   79.082   84.580   88.379
  70   69.334   79.715   85.527   90.531   96.388   100.42
  80   79.334   90.405   96.578   101.88   108.07   112.33
  90   89.334   101.05   107.56   113.14   119.65   124.12
 100   99.334   111.67   118.50   124.34   131.14   135.81
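The first two rows of Table 3 have closed forms that make a convenient sanity check. With one degree of freedom the critical value is the square of a standard normal quantile, and with two degrees of freedom the distribution is exponential with mean 2, so the critical value is −2·ln(p). The Python sketch below uses illustrative function names and only the standard library.

```python
import math
from statistics import NormalDist


def chi2_critical_df1(right_tail_prob):
    """Critical value for chi-squared(1): square of the two-sided normal quantile."""
    z = NormalDist().inv_cdf(1.0 - right_tail_prob / 2.0)
    return z * z


def chi2_critical_df2(right_tail_prob):
    """Critical value for chi-squared(2), using P(X > x) = exp(-x / 2)."""
    return -2.0 * math.log(right_tail_prob)


for p in (0.50, 0.20, 0.10, 0.05, 0.02, 0.01):
    print(p, round(chi2_critical_df1(p), 3), round(chi2_critical_df2(p), 3))
```

Rows for higher degrees of freedom have no comparably simple formula and are normally obtained from a statistical package or from the table itself.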
TABLE 4 - Durbin-Watson Table
Durbin-Watson d Statistic: dL and dU, 5% Significance Level

          k=1           k=2           k=3           k=4           k=5
   N    dL    dU      dL    dU      dL    dU      dL    dU      dL    dU
  15   1.08  1.36    0.95  1.54    0.82  1.75    0.69  1.97    0.56  2.21
  16   1.10  1.37    0.98  1.54    0.86  1.73    0.74  1.93    0.62  2.15
  17   1.13  1.38    1.02  1.54    0.90  1.71    0.78  1.90    0.67  2.10
  18   1.16  1.39    1.05  1.53    0.93  1.69    0.82  1.87    0.71  2.06
  19   1.18  1.40    1.08  1.53    0.97  1.68    0.86  1.85    0.75  2.02
  20   1.20  1.41    1.10  1.54    1.00  1.68    0.90  1.83    0.79  1.99
  21   1.22  1.42    1.13  1.54    1.03  1.67    0.93  1.81    0.83  1.96
  22   1.24  1.43    1.15  1.54    1.05  1.66    0.96  1.80    0.86  1.94
  23   1.26  1.44    1.17  1.54    1.08  1.66    0.99  1.79    0.90  1.92
  24   1.27  1.45    1.19  1.55    1.10  1.66    1.01  1.78    0.93  1.90
  25   1.29  1.45    1.21  1.55    1.12  1.66    1.04  1.77    0.95  1.89
  26   1.30  1.46    1.22  1.55    1.14  1.65    1.06  1.76    0.98  1.88
  27   1.32  1.47    1.24  1.56    1.16  1.65    1.08  1.76    1.01  1.86
  28   1.33  1.48    1.26  1.56    1.18  1.65    1.10  1.75    1.03  1.85
  29   1.34  1.48    1.27  1.56    1.20  1.65    1.12  1.74    1.05  1.84
  30   1.35  1.49    1.28  1.57    1.21  1.65    1.14  1.74    1.07  1.83
  31   1.36  1.50    1.30  1.57    1.23  1.65    1.16  1.74    1.09  1.83
  32   1.37  1.50    1.31  1.57    1.24  1.65    1.18  1.73    1.11  1.82
  33   1.38  1.51    1.32  1.58    1.26  1.65    1.19  1.73    1.13  1.81
  34   1.39  1.51    1.33  1.58    1.27  1.65    1.21  1.73    1.15  1.81
  35   1.40  1.52    1.34  1.58    1.28  1.65    1.22  1.73    1.16  1.80
  36   1.41  1.52    1.35  1.59    1.29  1.65    1.24  1.72    1.18  1.80
  37   1.42  1.53    1.36  1.59    1.31  1.66    1.25  1.72    1.19  1.80
  38   1.43  1.54    1.37  1.59    1.32  1.66    1.26  1.72    1.21  1.79
  39   1.43  1.54    1.38  1.60    1.33  1.66    1.27  1.72    1.22  1.79
  40   1.44  1.54    1.39  1.60    1.34  1.66    1.29  1.72    1.23  1.79
  45   1.48  1.57    1.43  1.62    1.38  1.67    1.34  1.72    1.29  1.78
  50   1.50  1.59    1.46  1.63    1.42  1.67    1.38  1.72    1.34  1.77
  55   1.53  1.60    1.49  1.64    1.45  1.68    1.41  1.72    1.38  1.77
  60   1.55  1.62    1.51  1.65    1.48  1.69    1.44  1.73    1.41  1.77
  65   1.57  1.63    1.54  1.66    1.50  1.70    1.47  1.73    1.44  1.77
  70   1.58  1.64    1.55  1.67    1.52  1.70    1.49  1.74    1.46  1.77
  75   1.60  1.65    1.57  1.68    1.54  1.71    1.51  1.74    1.49  1.77
  80   1.61  1.66    1.59  1.69    1.56  1.72    1.53  1.74    1.51  1.77
  85   1.62  1.67    1.60  1.70    1.57  1.72    1.55  1.75    1.52  1.77
  90   1.63  1.68    1.61  1.70    1.59  1.73    1.57  1.75    1.54  1.78
  95   1.64  1.69    1.62  1.71    1.60  1.73    1.58  1.75    1.56  1.78
 100   1.65  1.69    1.63  1.72    1.61  1.74    1.59  1.76    1.57  1.78
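The bounds in Table 4 are used by comparing a computed Durbin-Watson statistic d with dL and dU. The sketch below encodes the usual textbook decision regions in Python; the function name is illustrative and the d values in the example are hypothetical, chosen only to show each region for N = 20 and k = 1 (dL = 1.20, dU = 1.41).

```python
def durbin_watson_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic using the tabulated bounds.

    Regions (d lies between 0 and 4):
      d < dL                 evidence of positive autocorrelation
      dL <= d <= dU          inconclusive
      dU < d < 4 - dU        no evidence of first-order autocorrelation
      4 - dU <= d <= 4 - dL  inconclusive
      d > 4 - dL             evidence of negative autocorrelation
    """
    if not 0.0 <= d <= 4.0:
        raise ValueError("the Durbin-Watson statistic must lie between 0 and 4")
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4.0 - d_upper:
        return "no evidence of autocorrelation"
    if d <= 4.0 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"


# Bounds for N = 20, k = 1 taken from Table 4; the d values are hypothetical
d_l, d_u = 1.20, 1.41
for d in (0.95, 1.30, 2.05, 2.70, 3.10):
    print(d, durbin_watson_decision(d, d_l, d_u))
```

When d falls in an inconclusive region, an alternative such as the Breusch-Godfrey test is often used instead.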
TABLE 5A
F-distribution: 5% critical values
Rows: Degrees of Freedom for Denominator. Columns: Degrees of Freedom for Numerator.

  df     1     2     3     4     5     6     7     8     9    10    15    20    25    30    40    60    80   100
   1 161.4 199.5 215.7 224.6 230.2 234.0 236.8 238.9 240.5 241.9 245.9 248.0 249.3 250.1 251.1 252.2 252.7 253.0
   2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40 19.43 19.45 19.46 19.46 19.47 19.48 19.48 19.49
   3 10.13  9.55  9.28  9.12  9.01  8.94  8.89  8.85  8.81  8.79  8.70  8.66  8.63  8.62  8.59  8.57  8.56  8.55
   4  7.71  6.94  6.59  6.39  6.26  6.16  6.09  6.04  6.00  5.96  5.86  5.80  5.77  5.75  5.72  5.69  5.67  5.66
   5  6.61  5.79  5.41  5.19  5.05  4.95  4.88  4.82  4.77  4.74  4.62  4.56  4.52  4.50  4.46  4.43  4.41  4.41
   6  5.99  5.14  4.76  4.53  4.39  4.28  4.21  4.15  4.10  4.06  3.94  3.87  3.83  3.81  3.77  3.74  3.72  3.71
   7  5.59  4.74  4.35  4.12  3.97  3.87  3.79  3.73  3.68  3.64  3.51  3.44  3.40  3.38  3.34  3.30  3.29  3.27
   8  5.32  4.46  4.07  3.84  3.69  3.58  3.50  3.44  3.39  3.35  3.22  3.15  3.11  3.08  3.04  3.01  2.99  2.97
   9  5.12  4.26  3.86  3.63  3.48  3.37  3.29  3.23  3.18  3.14  3.01  2.94  2.89  2.86  2.83  2.79  2.77  2.76
  10  4.96  4.10  3.71  3.48  3.33  3.22  3.14  3.07  3.02  2.98  2.85  2.77  2.73  2.70  2.66  2.62  2.60  2.59
  11  4.84  3.98  3.59  3.36  3.20  3.09  3.01  2.95  2.90  2.85  2.72  2.65  2.60  2.57  2.53  2.49  2.47  2.46
  12  4.75  3.89  3.49  3.26  3.11  3.00  2.91  2.85  2.80  2.75  2.62  2.54  2.50  2.47  2.43  2.38  2.36  2.35
  13  4.67  3.81  3.41  3.18  3.03  2.92  2.83  2.77  2.71  2.67  2.53  2.46  2.41  2.38  2.34  2.30  2.27  2.26
  14  4.60  3.74  3.34  3.11  2.96  2.85  2.76  2.70  2.65  2.60  2.46  2.39  2.34  2.31  2.27  2.22  2.20  2.19
  15  4.54  3.68  3.29  3.06  2.90  2.79  2.71  2.64  2.59  2.54  2.40  2.33  2.28  2.25  2.20  2.16  2.14  2.12
  16  4.49  3.63  3.24  3.01  2.85  2.74  2.66  2.59  2.54  2.49  2.35  2.28  2.23  2.19  2.15  2.11  2.08  2.07
  17  4.45  3.59  3.20  2.96  2.81  2.70  2.61  2.55  2.49  2.45  2.31  2.23  2.18  2.15  2.10  2.06  2.03  2.02
  18  4.41  3.55  3.16  2.93  2.77  2.66  2.58  2.51  2.46  2.41  2.27  2.19  2.14  2.11  2.06  2.02  1.99  1.98
  19  4.38  3.52  3.13  2.90  2.74  2.63  2.54  2.48  2.42  2.38  2.23  2.16  2.11  2.07  2.03  1.98  1.96  1.94
  20  4.35  3.49  3.10  2.87  2.71  2.60  2.51  2.45  2.39  2.35  2.20  2.12  2.07  2.04  1.99  1.95  1.92  1.91
  21  4.32  3.47  3.07  2.84  2.68  2.57  2.49  2.42  2.37  2.32  2.18  2.10  2.05  2.01  1.96  1.92  1.89  1.88
  22  4.30  3.44  3.05  2.82  2.66  2.55  2.46  2.40  2.34  2.30  2.15  2.07  2.02  1.98  1.94  1.89  1.86  1.85
  23  4.28  3.42  3.03  2.80  2.64  2.53  2.44  2.37  2.32  2.27  2.13  2.05  2.00  1.96  1.91  1.86  1.84  1.82
  24  4.26  3.40  3.01  2.78  2.62  2.51  2.42  2.36  2.30  2.25  2.11  2.03  1.97  1.94  1.89  1.84  1.82  1.80
  25  4.24  3.39  2.99  2.76  2.60  2.49  2.40  2.34  2.28  2.24  2.09  2.01  1.96  1.92  1.87  1.82  1.80  1.78
  26  4.23  3.37  2.98  2.74  2.59  2.47  2.39  2.32  2.27  2.22  2.07  1.99  1.94  1.90  1.85  1.80  1.78  1.76
  27  4.21  3.35  2.96  2.73  2.57  2.46  2.37  2.31  2.25  2.20  2.06  1.97  1.92  1.88  1.84  1.79  1.76  1.74
  28  4.20  3.34  2.95  2.71  2.56  2.45  2.36  2.29  2.24  2.19  2.04  1.96  1.91  1.87  1.82  1.77  1.74  1.73
  29  4.18  3.33  2.93  2.70  2.55  2.43  2.35  2.28  2.22  2.18  2.03  1.94  1.89  1.85  1.81  1.75  1.73  1.71
  30  4.17  3.32  2.92  2.69  2.53  2.42  2.33  2.27  2.21  2.16  2.01  1.93  1.88  1.84  1.79  1.74  1.71  1.70
  40  4.08  3.23  2.84  2.61  2.45  2.34  2.25  2.18  2.12  2.08  1.92  1.84  1.78  1.74  1.69  1.64  1.61  1.59
  60  4.00  3.15  2.76  2.53  2.37  2.25  2.17  2.10  2.04  1.99  1.84  1.75  1.69  1.65  1.59  1.53  1.50  1.48
 100  3.94  3.09  2.70  2.46  2.31  2.19  2.10  2.03  1.97  1.93  1.77  1.68  1.62  1.57  1.52  1.45  1.41  1.39
 120  3.92  3.07  2.68  2.45  2.29  2.18  2.09  2.02  1.96  1.91  1.75  1.66  1.60  1.55  1.50  1.43  1.39  1.37
TABLE 5B
F-distribution: 1% critical values
Rows: Degrees of Freedom for Denominator. Columns: Degrees of Freedom for Numerator.

  df     1     2     3     4     5     6     7     8     9    10    15    20    25    30    40    60    80   100
   1  4052  5000  5402  5625  5763  5859  5927  5980  6023  6054  6156  6209  6240  6261  6287  6312  6326  6334
   2 98.51 99.00 99.17 99.24 99.30 99.33 99.37 99.37 99.40 99.40 99.43 99.46 99.46 99.46 99.47 99.49 99.49 99.49
   3 34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.34 27.23 26.87 26.69 26.58 26.50 26.41 26.32 26.27 26.24
   4 21.20 18.00 16.70 15.98 15.52 15.21 14.98 14.80 14.66 14.55 14.20 14.02 13.91 13.84 13.75 13.65 13.61 13.58
   5 16.26 13.27 12.06 11.39 10.97 10.67 10.46 10.29 10.16 10.05  9.72  9.55  9.45  9.38  9.29  9.20  9.16  9.13
   6 13.75 10.92  9.78  9.15  8.75  8.47  8.26  8.10  7.98  7.87  7.56  7.40  7.30  7.23  7.14  7.06  7.01  6.99
   7 12.25  9.55  8.45  7.85  7.46  7.19  6.99  6.84  6.72  6.62  6.31  6.16  6.06  5.99  5.91  5.82  5.78  5.75
   8 11.26  8.65  7.59  7.01  6.63  6.37  6.18  6.03  5.91  5.81  5.52  5.36  5.26  5.20  5.12  5.03  4.99  4.96
   9 10.56  8.02  6.99  6.42  6.06  5.80  5.61  5.47  5.35  5.26  4.96  4.81  4.71  4.65  4.57  4.48  4.44  4.42
  10 10.04  7.56  6.55  5.99  5.64  5.39  5.20  5.06  4.94  4.85  4.56  4.41  4.31  4.25  4.17  4.08  4.04  4.01
  11  9.65  7.21  6.22  5.67  5.32  5.07  4.89  4.74  4.63  4.54  4.25  4.10  4.01  3.94  3.86  3.78  3.73  3.71
  12  9.33  6.93  5.95  5.41  5.06  4.82  4.64  4.50  4.39  4.30  4.01  3.86  3.76  3.70  3.62  3.54  3.49  3.47
  13  9.07  6.70  5.74  5.21  4.86  4.62  4.44  4.30  4.19  4.10  3.82  3.66  3.57  3.51  3.43  3.34  3.30  3.27
  14  8.86  6.52  5.56  5.04  4.69  4.46  4.28  4.14  4.03  3.94  3.66  3.51  3.41  3.35  3.27  3.18  3.14  3.11
  15  8.68  6.36  5.42  4.89  4.56  4.32  4.14  4.00  3.89  3.80  3.52  3.37  3.28  3.21  3.13  3.05  3.00  2.98
  16  8.53  6.23  5.29  4.77  4.44  4.20  4.03  3.89  3.78  3.69  3.41  3.26  3.17  3.10  3.02  2.93  2.89  2.86
  17  8.40  6.11  5.19  4.67  4.34  4.10  3.93  3.79  3.68  3.59  3.31  3.16  3.07  3.00  2.92  2.83  2.79  2.76
  18  8.29  6.01  5.09  4.58  4.25  4.01  3.84  3.71  3.60  3.51  3.23  3.08  2.98  2.92  2.84  2.75  2.70  2.68
  19  8.18  5.93  5.01  4.50  4.17  3.94  3.77  3.63  3.52  3.43  3.15  3.00  2.91  2.84  2.76  2.67  2.63  2.60
  20  8.10  5.85  4.94  4.43  4.10  3.87  3.70  3.56  3.46  3.37  3.09  2.94  2.84  2.78  2.69  2.61  2.56  2.54
  21  8.02  5.78  4.87  4.37  4.04  3.81  3.64  3.51  3.40  3.31  3.03  2.88  2.79  2.72  2.64  2.55  2.50  2.48
  22  7.95  5.72  4.82  4.31  3.99  3.76  3.59  3.45  3.35  3.26  2.98  2.83  2.73  2.67  2.58  2.50  2.45  2.42
  23  7.88  5.66  4.76  4.26  3.94  3.71  3.54  3.41  3.30  3.21  2.93  2.78  2.69  2.62  2.54  2.45  2.40  2.37
  24  7.82  5.61  4.72  4.22  3.90  3.67  3.50  3.36  3.26  3.17  2.89  2.74  2.64  2.58  2.49  2.40  2.36  2.33
  25  7.77  5.57  4.68  4.18  3.85  3.63  3.46  3.32  3.22  3.13  2.85  2.70  2.60  2.54  2.45  2.36  2.32  2.29
  26  7.72  5.53  4.64  4.14  3.82  3.59  3.42  3.29  3.18  3.09  2.82  2.66  2.57  2.50  2.42  2.33  2.28  2.25
  27  7.68  5.49  4.60  4.11  3.78  3.56  3.39  3.26  3.15  3.06  2.78  2.63  2.54  2.47  2.38  2.29  2.25  2.22
  28  7.64  5.45  4.57  4.07  3.75  3.53  3.36  3.23  3.12  3.03  2.75  2.60  2.51  2.44  2.35  2.26  2.22  2.19
  29  7.60  5.42  4.54  4.04  3.73  3.50  3.33  3.20  3.09  3.00  2.73  2.57  2.48  2.41  2.33  2.23  2.19  2.16
  30  7.56  5.39  4.51  4.02  3.70  3.47  3.30  3.17  3.07  2.98  2.70  2.55  2.45  2.39  2.30  2.21  2.16  2.13
  40  7.31  5.18  4.31  3.83  3.51  3.29  3.12  2.99  2.89  2.80  2.52  2.37  2.27  2.20  2.11  2.02  1.97  1.94
  60  7.08  4.98  4.13  3.65  3.34  3.12  2.95  2.82  2.72  2.63  2.35  2.20  2.10  2.03  1.94  1.84  1.78  1.75
 100  6.90  4.82  3.98  3.51  3.21  2.99  2.82  2.69  2.59  2.50  2.22  2.07  1.97  1.89  1.80  1.69  1.63  1.60
 120  6.85  4.79  3.95  3.48  3.17  2.96  2.79  2.66  2.56  2.47  2.19  2.03  1.93  1.86  1.76  1.66  1.60  1.56