Econometrics II, Summer 2004, CERGE-EI, Daniel Munich
Instructor’s notes
Applied econometrics (based on PK, Ch.1, 21)
1) Why?
a) Empirical thesis at CERGE-EI is more common
b) Use of econometrics with M.A. only in business
c) Expansion of empirical econometrics due to technological progress
in recent decades: PCs, speed, memory, survey and population
databases
d) Hunt for causal, not only statistical, relationships
2) Empirical econometrics
a) Theory
i) ≠ empirical
ii) Y=f(X) deterministic model of an economist
iii) technique-oriented, not problem-oriented
iv) Standard solutions to standard problems
b) Empirical
i) Y=f(X) + ε stochastic model of an econometrician
ii) Econometrics is much easier w/o data
iii) Attempts to overcome problems of imperfect data using
standard solutions
• errors, mistakes, definitions
• endogeneity, lack of controlled experiments
iv) Why error term?
• omission of non-systematic factors
• omission of systematic influence
• measurement error
• randomness of human behavior
3) General principles of empirical econometrics
a) Use common sense and/or economic theory
i) rate vs. rate, real vs. real, trend vs. trend, per capita vs. per
capita
ii) correlation ≠ causality
b) Know the context
i) History, institutions, data gathering, instructions, variable
definitions, preliminary data cleaning, rounding, etc.
c) Inspect the data
i) Skipping this step -> wasting time in later research
ii) Summary statistics
• Format types
• Summary statistics: positive, negative, zero, missing,
min/max
• Graphing: scatter plots, histograms, trends (technology!)
• Causes of missing data: rejection, wrong coding, top-coding
d) Keep it simple
i) Bottom-up or specific-to-general approach
• Testing is biased if model is not complete
ii) Top-down or general-to-specific approach
• Less biased
• But infinitely many candidate variables and functional forms,
with limited data
iii) Compromise
• Expand simple model whenever it fails a misspecification
test
e) Results
i) Expected sign of coefficients; possible causes of an unexpected sign:
• Omitted variable negatively correlated with included
variable
• Multicollinearity -> high variance -> possible negative
values
• Endogeneity bias: ALMP: U=f(-M) or M=g(+U)?
• Selection bias: impact of retraining on earnings (self-selection)
• Outliers
• Lack of identification (moves along demand or supply
curve?)
ii) Have a plausible interpretation of unexpected results
iii) Significance of important variables
iv) Magnitude of coefficients
v) Sensitivity to
• Functional form
• Included variables
• Sample/period
f) Data mining – pros & cons
i) Bad side: tailoring specification to get desired results
ii) Good side: discover regularities to inform economic theory
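
The first cause of an unexpected sign listed under e) i), an omitted variable negatively correlated with an included one, can be illustrated by a small simulation. This is only a sketch under made-up assumptions: the data-generating process, the coefficients (1.0, 2.0, -0.8), the seed and the helper `slope` are hypothetical and not part of the course material.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

random.seed(0)  # hypothetical seed, only for reproducibility
n = 20000

x1 = [random.gauss(0, 1) for _ in range(n)]
# x2 is negatively correlated with x1 by construction
x2 = [-0.8 * a + random.gauss(0, 1) for a in x1]
# True model: both variables have a positive effect on y
y = [1.0 * a + 2.0 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Short regression that omits x2
b_short = slope(x1, y)

# The omitted-variable formula predicts 1.0 + 2.0 * (-0.8) = -0.6,
# so the estimated sign flips although the true effect is +1.0
print("slope of y on x1 with x2 omitted:", round(b_short, 2))
```

The true effect of x1 is positive, yet the short regression reports a negative slope because x1 also picks up the effect of the omitted x2.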
4) Practical hints (we will learn)
a) Log variables
i) if % change makes more sense
ii) Logging variables can eliminate heteroskedasticity
iii) Logging zero (or negative) observations is impossible
iv) Remember that reading a dummy's coefficient as a % impact on a
logged dependent variable is approximate only
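
A short numerical sketch of point iv): in a model ln(Y) = a + b*D + ..., the exact percentage effect of switching the dummy D from 0 to 1 is 100*(exp(b) - 1), while 100*b is the usual approximation. The function name `exact_pct_effect` and the example coefficients are hypothetical.

```python
import math

def exact_pct_effect(b):
    """Exact % change in Y when a dummy switches 0 -> 1 in ln(Y) = a + b*D + ..."""
    return 100.0 * (math.exp(b) - 1.0)

# Compare the naive reading (100*b) with the exact effect
for b in (0.05, 0.20, 0.50):
    print(f"b={b:.2f}  approx={100 * b:.1f}%  exact={exact_pct_effect(b):.1f}%")
```

The approximation is close for small coefficients and deteriorates as b grows: for b = 0.50 the exact effect is about 65%, not 50%.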
b) Recognize trade-off between bias and efficiency
c) Multicollinearity
i) Does not create a bias
ii) Solution requires more information
d) Remember that heteroskedasticity-consistent estimation leaves the
OLS coefficients unchanged. Only the V-C matrix and std. errors differ.
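
Point d) can be checked by hand for a one-regressor model. The sketch below computes the OLS slope once and then two standard errors for it: the classical one and White's heteroskedasticity-consistent one (the HC0 variant, chosen here as an assumption; the notes do not say which variant is meant). The simulated data, seed and variable names are hypothetical.

```python
import random

random.seed(1)  # hypothetical seed, only for reproducibility
n = 2000
x = [random.uniform(1, 5) for _ in range(n)]
# Error s.d. grows with x, so the data are heteroskedastic by construction
y = [2.0 + 0.5 * xi + random.gauss(0, xi) for xi in x]

mx, my = sum(x) / n, sum(y) / n
dx = [xi - mx for xi in x]
sxx = sum(d * d for d in dx)

b1 = sum(d * (yi - my) for d, yi in zip(dx, y)) / sxx  # OLS slope
b0 = my - b1 * mx                                       # OLS intercept
e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]         # OLS residuals

# Classical s.e.: assumes one common error variance
se_classical = (sum(ei * ei for ei in e) / (n - 2) / sxx) ** 0.5
# White (HC0) s.e.: each squared residual is weighted by its own
# squared deviation of x from the mean
se_hc0 = (sum(d * d * ei * ei for d, ei in zip(dx, e)) / sxx ** 2) ** 0.5

print(f"slope={b1:.3f}  classical se={se_classical:.4f}  HC0 se={se_hc0:.4f}")
```

There is a single slope estimate; only the standard error attached to it changes with the choice of variance estimator.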
e) Do not forget to consider interactions of variables.
f) Do not use a linear form if the dependent variable measures a
fraction. Possible only if values are far enough from 0.
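
One common alternative to the linear form for a fraction is to model its log-odds; this is an assumption about what could replace the linear form, not something the notes prescribe. The sketch below shows the transform and its inverse (helper names are hypothetical): any fitted value on the log-odds scale maps back into (0, 1), whereas a linear fit for the fraction itself can predict values outside that range.

```python
import math

def log_odds(p):
    """Map a fraction strictly between 0 and 1 to the real line."""
    return math.log(p / (1.0 - p))

def inverse_log_odds(z):
    """Map any real number back to a fraction strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Fitted values on the log-odds scale always map back inside (0, 1)
for z in (-6.0, -1.0, 0.0, 1.0, 6.0):
    print(f"z={z:+.1f} -> fraction={inverse_log_odds(z):.4f}")
```

As with logging, the transform is undefined for observations equal to exactly 0 or 1.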
g) Use ordered explanatory variables carefully (schooling level,
children).
h) Do not forget about possible endogeneity.
i) Bias is not sacred. Some bias can buy efficiency!
j) Be able to predict direction of measurement error bias
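
For the textbook case of classical measurement error in a single regressor, the direction in point j) is toward zero (attenuation): the slope converges to beta * var(x) / (var(x) + var(noise)). The simulation below is a sketch under that assumption only; the variances, true slope, seed and helper `slope` are hypothetical.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

random.seed(2)  # hypothetical seed, only for reproducibility
n = 20000

x_true = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 * xt + random.gauss(0, 1) for xt in x_true]  # true slope is 1.0
# Observed regressor = truth + noise with the same variance as the truth
x_obs = [xt + random.gauss(0, 1) for xt in x_true]

b_true = slope(x_true, y)  # regression on the correctly measured regressor
b_obs = slope(x_obs, y)    # regression on the mismeasured regressor

# Attenuation formula predicts 1.0 * 1 / (1 + 1) = 0.5 for b_obs
print("slope with true x:    ", round(b_true, 2))
print("slope with mismeasured x:", round(b_obs, 2))
```

With several regressors, or with measurement error correlated with other variables, the direction of the bias is no longer this simple.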
k) R²
i) has no meaning if intercept is omitted
ii) adjusted R² is much better than R²
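
A small sketch of why the adjusted measure is preferred: it penalizes additional regressors, so a tiny gain in the raw fit from many extra variables can lower it. The formula used is the standard 1 - (1 - R²)(n - 1)/(n - k - 1), with k counting regressors excluding the intercept; the function name and the example numbers are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Small model versus a larger model with a marginally higher raw fit
small = adjusted_r2(0.30, n=50, k=2)
large = adjusted_r2(0.31, n=50, k=10)
print(f"small model: {small:.3f}   large model: {large:.3f}")
```

Here the larger model has the higher raw value (0.31 against 0.30) but the lower adjusted value.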
l) Dropping observations
i) Outliers should be inspected before being omitted
ii) Try to understand missing data
• Selection problem
• Omit or predict?
m) Pre-testing should be done at a higher significance level: 25%
instead of 5%
n) Always check sensitivity
o) Reporting
i) Admit problems and deficiencies (and learn from them)
ii) Failure to confirm a theory or find a significant effect is also a
valuable finding (there is selection bias toward significant results).
iii) Make plausible assumptions to overcome problems.
iv) Finiteness and focus of output
p) Practice
i) Programs
• Fine-tune at the command line, save steps in a DO file
• Number versions
• Store dta, do, logs separately
• Comments in DO file
• Understand commands, test commands on simple data
ii) Check, check, check outcomes of all your steps
iii) Give variables meaningful names