Part 2A: Endogeneity [ 1/73]
Econometric Analysis of Panel Data
William Greene
Department of Economics
Stern School of Business

Part 2A: Endogeneity [ 2/73]
Endogeneity
y = Xβ + ε
Definition: E[ε|x] ≠ 0
Why not?
  Omitted variables
  Unobserved heterogeneity (equivalent to omitted variables)
  Measurement error on the RHS (equivalent to omitted variables)
  Structural aspects of the model
  Endogenous sampling and attrition
  Simultaneity (?)

Part 2A: Endogeneity [ 3/73]
Aggregate Data and Multinomial Choice: The Model of Berry, Levinsohn and Pakes

Part 2A: Endogeneity [ 4/73]
Theoretical Foundation
Consumer market for J differentiated brands of a good
  j = 1,…,Jt brands or types
  i = 1,…,N consumers
  t = 1,…,T "markets" (like panel data)
Consumer i's utility for brand j (in market t) depends on
  p = price
  x = observable attributes
  f = unobserved attributes
  w = unobserved heterogeneity across consumers
  ε = idiosyncratic aspects of consumer preferences
Observed data consist of aggregate choices, prices and features of the brands.

Part 2A: Endogeneity [ 5/73]
BLP Automobile Market
[Table: number of models Jt in each market t]

Part 2A: Endogeneity [ 6/73]
Random Utility Model
Utility: Uijt = U(wi, pjt, xjt, fjt | θ), i = 1,…,(large) N, j = 1,…,J
  wi = individual heterogeneity; time (market) invariant. w has a continuous distribution across the population.
  pjt, xjt, fjt = price, observed attributes, unobserved features of brand j; all may vary through time (across markets).
Revealed Preference: Choice j provides maximum utility.
Across the population, given market t, set of prices pt and features (Xt, ft), there is a set of values of wi that induces choice j, for each j = 1,…,Jt; then, sj(pt, Xt, ft | θ) is the market share of brand j in market t.
There is an outside good that attracts a nonnegligible market share, j = 0.
Therefore, $\sum_{j=1}^{J_t} s_j(p_t, X_t, f_t \mid \theta) < 1$.

Part 2A: Endogeneity [ 7/73]
Endogenous Prices: Demand side
Uij = U(wi, pj, xj, fj | θ) = xj'βi – αpj + fj + εij
fj is unobserved features of model j.
Utility responds to the unobserved fj.
Price pj is partly determined by features fj.
In a choice model based on observables, price is correlated with the unobservables that determine the observed choices.

Part 2A: Endogeneity [ 8/73]

Part 2A: Endogeneity [ 9/73]
Instrumental Variable Estimation
One "problem" variable – the "last" one
$y_{it} = \beta_1 x_{1it} + \beta_2 x_{2it} + \dots + \beta_K x_{Kit} + \varepsilon_{it}$
$E[\varepsilon_{it} \mid x_{Kit}] \neq 0$ (0 for all others)
There exists a variable $z_{it}$ such that
Relevance:
  $E[x_{Kit} \mid x_{1it}, x_{2it}, \dots, x_{K-1,it}, z_{it}] = g(x_{1it}, x_{2it}, \dots, x_{K-1,it}, z_{it})$
  In the presence of the other variables, $z_{it}$ "explains" $x_{Kit}$.
  A projection interpretation: In the projection
  $x_{Kit} = \theta_1 x_{1it} + \theta_2 x_{2it} + \dots + \theta_{K-1} x_{K-1,it} + \theta_K z_{it}$, $\theta_K \neq 0$.
Exogeneity:
  $E[\varepsilon_{it} \mid x_{1it}, x_{2it}, \dots, x_{K-1,it}, z_{it}] = 0$
  In the presence of the other variables, $z_{it}$ and $\varepsilon_{it}$ are uncorrelated.

Part 2A: Endogeneity [ 10/73]
The General Problem
$y = X_1\beta + X_2\delta + \varepsilon$
$\mathrm{Cov}(X_1, \varepsilon) = 0$, $K_1$ variables
$\mathrm{Cov}(X_2, \varepsilon) \neq 0$, $K_2$ variables
$X_2$ is endogenous.
OLS regression of y on $(X_1, X_2)$ cannot estimate $(\beta, \delta)$ consistently. Some other estimator is needed.
Additional structure: $X_2 = Z\Pi + V$ where $\mathrm{Cov}(V, \varepsilon) \neq 0$ but $\mathrm{Cov}(Z, \varepsilon) = 0$.
An instrumental variable (IV) estimator based on $(X_1, X_2, Z)$ may be able to estimate $(\beta, \delta)$ consistently.

Part 2A: Endogeneity [ 11/73]
Instrumental Variables
Framework: $y = X\beta + \varepsilon$, K variables in X.
There exists a set of K variables, Z, such that
  $\mathrm{plim}(Z'X/n) \neq 0$ but $\mathrm{plim}(Z'\varepsilon/n) = 0$.
The variables in Z are called instrumental variables.
An alternative (to least squares) estimator of $\beta$ is
  $b_{IV} = (Z'X)^{-1}Z'y \sim \mathrm{Cov}(Z,y) / \mathrm{Cov}(Z,X)$
We consider the following:
  Why use this estimator?
  What are its properties compared to least squares?
We will also examine an important application.

Part 2A: Endogeneity [ 12/73]
The First IV Study: Natural Experiment
(Snow, J., On the Mode of Communication of Cholera, 1855) http://www.ph.ucla.edu/epi/snow/snowbook3.html
London Cholera epidemic, ca 1853-4
Cholera = f(Water Purity, u) + ε.
'Causal' effect of water purity on cholera?
Purity = f(cholera prone environment (poor, garbage in streets, rodents, etc.)). Regression does not work.
Two London water companies: Lambeth; Southwark & Vauxhall
[Map: River Thames, main sewage discharge]
Paul Grootendorst: A Review of Instrumental Variables Estimation of Treatment Effects… http://individual.utoronto.ca/grootendorst/pdf/IV_Paper_Sept6_2007.pdf
A review of instrumental variables estimation in the applied health sciences. Health Services and Outcomes Research Methodology 2007; 7(3-4):159-179.

Part 2A: Endogeneity [ 13/73]
Investigation Using an Instrumental Variable
Theory: Cholera = $\beta_0 + \beta_1$ BadWater + Other Factors
Model (stylized): $C = \beta_0 + \beta_1 B + \varepsilon$
  (C = 0/1 = no/yes) (B = 0/1 = good/bad) ($\varepsilon$ = other factors)
Interesting measure of causal effect of bad water: $\beta_1$
Endogeneity Problem: Cholera prone environment u affects B and $\varepsilon$. Interpret this to say $B(u)$ and $\varepsilon(u)$ are correlated because of u.
Confounding Effect: $E[C \mid B] \neq \beta_0 + \beta_1 B$ because $E[\varepsilon \mid B] \neq 0$.
  $E[C \mid B=1] = \beta_0 + \beta_1 + E[\varepsilon \mid B=1]$
  $E[C \mid B=0] = \beta_0 + E[\varepsilon \mid B=0]$
  $E[C \mid B=1] - E[C \mid B=0] = \beta_1 + \{E[\varepsilon \mid B=1] - E[\varepsilon \mid B=0]\}$
Conclusion: Comparing cholera rates of those with bad water (measurable) to those with good water, P(C|B=1) - P(C|B=0), does not reveal the water effect.

Part 2A: Endogeneity [ 14/73]
Instrumental Variable:
  L = 1 if water supplied by Lambeth
  L = 0 if water supplied by Southwark/Vauxhall
Relevant? Is $E[B \mid L=1] \neq E[B \mid L=0]$? That is Snow's theory, that the water supply is partly the culprit, and because of their location, Lambeth provided purer water than Southwark.
Exogenous? Is $E[\varepsilon \mid L=1] - E[\varepsilon \mid L=0] = 0$? Water supply is randomly supplied to houses. Homeowners do not even know which supplier is providing their water. "Assignment is random."
Using the IV in $E[C \mid L] = \beta_0 + \beta_1 E[B \mid L] + E[\varepsilon \mid L]$:
  $E[C \mid L=1] = \beta_0 + \beta_1 E[B \mid L=1] + E[\varepsilon \mid L=1]$
  $E[C \mid L=0] = \beta_0 + \beta_1 E[B \mid L=0] + E[\varepsilon \mid L=0]$
Estimating Equation:
  $E[C \mid L=1] - E[C \mid L=0] = \beta_1 \{E[B \mid L=1] - E[B \mid L=0]\} + \{E[\varepsilon \mid L=1] - E[\varepsilon \mid L=0]\}$
  (the last term is zero because L is exogenous)

Part 2A: Endogeneity [ 15/73]
IV Estimator:
  $E[C \mid L=1] - E[C \mid L=0] = \beta_1 \{E[B \mid L=1] - E[B \mid L=0]\}$
  $\beta_1 = \dfrac{E[C \mid L=1] - E[C \mid L=0]}{E[B \mid L=1] - E[B \mid L=0]}$
  (Note: nonzero denominator is the relevance condition.)
Operational:
  P(C|L=1) = Proportion of observations supplied by Lambeth that have Cholera
  P(C|L=0) = Proportion of observations supplied by Southwark that have Cholera
  P(B|L=1) = Proportion of observations supplied by Lambeth with Bad Water
  P(B|L=0) = Proportion of observations supplied by Southwark with Bad Water
Estimate:
  $b_1 = \dfrac{P(C \mid L=1) - P(C \mid L=0)}{P(B \mid L=1) - P(B \mid L=0)} = \dfrac{\mathrm{Cov}(C, L)}{\mathrm{Cov}(B, L)}$ (broadly)

Part 2A: Endogeneity [ 16/73]
Cornwell and Rupert Data
Cornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 Years
Variables in the file are
  EXP   = work experience
  WKS   = weeks worked
  OCC   = occupation, 1 if blue collar
  IND   = 1 if manufacturing industry
  SOUTH = 1 if resides in south
  SMSA  = 1 if resides in a city (SMSA)
  MS    = 1 if married
  FEM   = 1 if female
  UNION = 1 if wage set by union contract
  ED    = years of education
  LWAGE = log of wage = dependent variable in regressions
These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988, pp. 149-155. See Baltagi, page 122 for further analysis. The data were downloaded from the website for Baltagi's text.

Part 2A: Endogeneity [ 17/73]
Specification: Quadratic Effect of Experience

Part 2A: Endogeneity [ 18/73]
The Effect of Education on LWAGE
$LWAGE = \beta_1 + \beta_2 EDUC + \beta_3 EXP + \beta_4 EXP^2 + \dots + \varepsilon$
What is ε? Ability, Motivation,... + everything else
EDUC = f(GENDER, SMSA, SOUTH, Ability, Motivation,...)

Part 2A: Endogeneity [ 19/73]
What Influences LWAGE?
$LWAGE = \beta_1 + \beta_2 EDUC(X, \text{Ability}, \text{Motivation}, \dots) + \beta_3 EXP + \beta_4 EXP^2 + \dots + \varepsilon(\text{Ability}, \text{Motivation})$
Increased Ability is associated with increases in EDUC(X, Ability, Motivation,...) and ε(Ability, Motivation).
What looks like an effect due to an increase in EDUC may be an increase in Ability. The estimate of $\beta_2$ picks up the effect of EDUC and the hidden effect of Ability.

Part 2A: Endogeneity [ 20/73]
An Exogenous Influence
$LWAGE = \beta_1 + \beta_2 EDUC(X, Z, \text{Ability}, \text{Motivation}, \dots) + \beta_3 EXP + \beta_4 EXP^2 + \dots + \varepsilon(\text{Ability}, \text{Motivation})$
Increased Z is associated with increases in EDUC(X, Z, Ability, Motivation,...) and not ε(Ability, Motivation).
An effect due to an increase in Z works only through the increase in EDUC. The estimate of $\beta_2$ picks up the effect of EDUC only.
Z is an Instrumental Variable.

Part 2A: Endogeneity [ 21/73]
Instrumental Variables
Structure:
  LWAGE(ED, EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION)
  ED(MS, FEM)
  The ED equation explains the endogeneity.
Reduced Form:
  LWAGE[ED(MS, FEM), EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION]

Part 2A: Endogeneity [ 22/73]
Two Stage Least Squares Strategy
Reduced Form: LWAGE[ED(MS, FEM, X), EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION]
Strategy:
  (1) Purge ED of the influence of everything but MS, FEM (and the other variables). Predict ED using all exogenous information in the sample (X and Z).
  (2) Regress LWAGE on this prediction of ED and everything else.
Standard errors must be adjusted for the predicted ED.

Part 2A: Endogeneity [ 23/73]
OLS

Part 2A: Endogeneity [ 24/73]
Weak Instruments?
The weird results for the coefficient on ED may be due to the instruments, MS and FEM, being dummy variables. There is not much variation in these variables and not much covariation with the other variables.

Part 2A: Endogeneity [ 25/73]
The Source of the Endogeneity
LWAGE = f(ED, EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION) + ε
ED = f(MS, FEM, EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION) + u

Part 2A: Endogeneity [ 26/73]
Can We Remove the Endogeneity?
LWAGE = f(ED, EXP, EXPSQ, WKS, OCC, SOUTH, SMSA, UNION) + u + ε
Strategy:
  Estimate u.
  Add u to the equation.
  ED is correlated with u + ε because it is correlated with u.
  ED is uncorrelated with ε when u is in the equation.

Part 2A: Endogeneity [ 27/73]
Auxiliary Regression for ED to Obtain Residuals
[Regression output; annotations: IVs, Exog. Vars]

Part 2A: Endogeneity [ 28/73]
OLS with Residual (Control Function) Added
2SLS

Part 2A: Endogeneity [ 29/73]
A Warning About Control Functions
Sum of squares is not computed correctly because U is in the regression.
A general result. Control function estimators usually require a fix to the estimated covariance matrix for the estimator.

Part 2A: Endogeneity [ 30/73]
IV Estimators
* Consistent
  $b_{IV} = (Z'X)^{-1}Z'y = (Z'X/n)^{-1}(Z'X/n)\beta + (Z'X/n)^{-1}Z'\varepsilon/n = \beta + (Z'X/n)^{-1}Z'\varepsilon/n \rightarrow \beta$
* Asymptotically normal (same approach to proof as for OLS)
* Inefficient – to be shown.
* By construction, the IV estimator is consistent. We have an estimator that is consistent when least squares is not.

Part 2A: Endogeneity [ 31/73]
IV Estimation
Why use an IV estimator? Suppose that X and ε are correlated. Then least squares is neither unbiased nor consistent.
Recall the proof of consistency of least squares:
  $b = \beta + (X'X/n)^{-1}(X'\varepsilon/n)$.
  $\mathrm{plim}\, b = \beta$ requires $\mathrm{plim}(X'\varepsilon/n) = 0$. If this does not hold, the estimator is inconsistent.

Part 2A: Endogeneity [ 32/73]
A Popular Misconception
If only one variable in X is correlated with ε, the other coefficients are consistently estimated. False.
Suppose only the first variable is correlated with ε.
Under the assumptions, $\mathrm{plim}(X'\varepsilon/n) = (\sigma_1, 0, \dots, 0)'$. Then
  $\mathrm{plim}\, b - \beta = \mathrm{plim}(X'X/n)^{-1} (\sigma_1, 0, \dots, 0)' = \sigma_1 (q^{11}, q^{21}, \dots, q^{K1})'$
  $= \sigma_1$ times the first column of $Q^{-1}$, where $Q = \mathrm{plim}(X'X/n)$.
The problem is "smeared" over the other coefficients.
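The "smearing" result can be checked numerically. The following is a minimal sketch on simulated data, not taken from the slides: the design (two regressors, the coefficients 0.7 and 0.5, the sample size, and all variable names) is an assumption chosen so that the population moments are easy to write down. Only x1 is correlated with ε, yet the coefficient on the exogenous x2 is also inconsistent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical design: x1 is endogenous, x2 is exogenous but correlated with x1.
eps = rng.normal(size=n)
x2 = rng.normal(size=n)
x1 = 0.7 * x2 + 0.5 * eps + rng.normal(size=n)

X = np.column_stack([x1, x2])
beta = np.array([1.0, 1.0])
y = X @ beta + eps

# Least squares: b = (X'X)^{-1} X'y
b = np.linalg.solve(X.T @ X, X.T @ y)

# Population moments implied by the assumed design.
sigma1 = 0.5                     # Cov(x1, eps); Cov(x2, eps) = 0
Q = np.array([[1.74, 0.70],      # Var(x1) = 0.49 + 0.25 + 1.00
              [0.70, 1.00]])     # Cov(x1, x2) = 0.7, Var(x2) = 1
predicted_bias = sigma1 * np.linalg.inv(Q)[:, 0]   # sigma1 * first column of Q^{-1}

print("b - beta      :", b - beta)
print("predicted bias:", predicted_bias)
```

Under these assumptions the predicted inconsistency is (0.40, -0.28): the second element is nonzero even though x2 is uncorrelated with ε, because x2 is correlated with x1.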
Part 2A: Endogeneity [ 33/73]
Consistency and Asymptotic Normality of the IV Estimator
$\hat\beta = (Z'X)^{-1}Z'y = \beta + (Z'X)^{-1}Z'\varepsilon$
$\mathrm{plim}(\hat\beta - \beta) = \mathrm{plim}\,(Z'X/N)^{-1}(Z'\varepsilon/N) = Q_{ZX}^{-1}\,\mathbf{0}$
$\mathrm{Asy.Var}[\hat\beta - \beta] = Q_{ZX}^{-1}\,\mathrm{Asy.Var}[Z'\varepsilon/N]\,Q_{XZ}^{-1} = (1/N)\,Q_{ZX}^{-1}[\mathrm{plim}\, Z'\Omega Z/N]\,Q_{XZ}^{-1} = (1/N)\,Q_{ZX}^{-1}\Omega_{ZZ}Q_{XZ}^{-1}$
(Assuming "well behaved" data)
$\sqrt{N}(\hat\beta - \beta) = (Z'X/N)^{-1}\sqrt{N}\,\dfrac{\sum_{i,t} z_{it}\varepsilon_{it}}{N} = (Z'X/N)^{-1}\sqrt{N}\,\bar{w}$
Invoke Slutsky and Lindeberg-Feller for $\sqrt{N}\,\bar{w}$ to assert asymptotic normality.
Estimate $\mathrm{Asy.Var}[\hat\beta]$ with
  $\dfrac{\sum_{i,t}(y_{it} - x_{it}'\hat\beta)^2}{N \text{ or } (N-K)}\,[Z'X]^{-1}[Z'Z][X'Z]^{-1}$

Part 2A: Endogeneity [ 34/73]
Asymptotic Covariance Matrix of $b_{IV}$
$b_{IV} - \beta = (Z'X)^{-1}Z'\varepsilon$
$(b_{IV} - \beta)(b_{IV} - \beta)' = (Z'X)^{-1}Z'\varepsilon\varepsilon'Z(X'Z)^{-1}$
$E[(b_{IV} - \beta)(b_{IV} - \beta)' \mid X, Z] = \sigma^2 (Z'X)^{-1}Z'Z(X'Z)^{-1}$

Part 2A: Endogeneity [ 35/73]
Asymptotic Efficiency
Asymptotic efficiency of the IV estimator. The variance is larger than that of LS. (A large sample type of Gauss-Markov result is at work.)
  (1) It's a moot point. LS is inconsistent.
  (2) Mean squared error is uncertain: MSE[estimator|β] = Variance + square of bias.
IV may be better or worse. Depends on the data.

Part 2A: Endogeneity [ 36/73]
Two Stage Least Squares
How to use an "excess" of instrumental variables
  (1) X is K variables. Some (at least one) of the K variables in X are correlated with ε.
  (2) Z is M > K variables. Some of the variables in Z are also in X, some are not. None of the variables in Z are correlated with ε.
  (3) Which K variables to use to compute Z'X and Z'y?

Part 2A: Endogeneity [ 37/73]
Choosing the Instruments
Choose K randomly?
Choose the included Xs and the remainder randomly?
Use all of them? How?
A theorem: (Brundy and Jorgenson, ca. 1972) There is a most efficient way to construct the IV estimator from this subset:
  (1) For each column (variable) in X, compute the predictions of that variable using all the columns of Z.
  (2) Linearly regress y on these K predictions.
This is two stage least squares.

Part 2A: Endogeneity [ 38/73]
Algebraic Equivalence
Two stage least squares is equivalent to
  (1) each variable in X that is also in Z is replaced by itself.
  (2) Variables in X that are not in Z are replaced by predictions of that X using
      All other variables in X that are not correlated with ε
      All the variables in Z that are not in X.

Part 2A: Endogeneity [ 39/73]
2SLS Algebra
$\hat{X} = Z(Z'Z)^{-1}Z'X$
$b_{2SLS} = (\hat{X}'\hat{X})^{-1}\hat{X}'y$
But, $Z(Z'Z)^{-1}Z'X = (I - M_Z)X$ and $(I - M_Z)$ is idempotent.
$\hat{X}'\hat{X} = X'(I - M_Z)(I - M_Z)X = X'(I - M_Z)X$, so
$b_{2SLS} = (\hat{X}'X)^{-1}\hat{X}'y$ = a real IV estimator by the definition.
Note, $\mathrm{plim}(\hat{X}'\varepsilon/n) = 0$ since columns of $\hat{X}$ are linear combinations of the columns of Z, all of which are uncorrelated with ε.
$b_{2SLS} = [X'(I - M_Z)X]^{-1}X'(I - M_Z)y$

Part 2A: Endogeneity [ 40/73]
Asymptotic Covariance Matrix for 2SLS
General Result for Instrumental Variable Estimation:
  $E[(b_{IV} - \beta)(b_{IV} - \beta)' \mid X, Z] = \sigma^2 (Z'X)^{-1}Z'Z(X'Z)^{-1}$
Specialize for 2SLS, using $Z = \hat{X} = (I - M_Z)X$:
  $E[(b_{2SLS} - \beta)(b_{2SLS} - \beta)' \mid X, Z] = \sigma^2 (\hat{X}'X)^{-1}\hat{X}'\hat{X}(X'\hat{X})^{-1}$
  $= \sigma^2 (\hat{X}'\hat{X})^{-1}\hat{X}'\hat{X}(\hat{X}'\hat{X})^{-1}$
  $= \sigma^2 (\hat{X}'\hat{X})^{-1}$

Part 2A: Endogeneity [ 41/73]
2SLS Has Larger Variance than LS
A comparison to OLS:
  $\mathrm{Asy.Var}[2SLS] = \sigma^2 (\hat{X}'\hat{X})^{-1}$
  Neglecting the inconsistency, $\mathrm{Asy.Var}[LS] = \sigma^2 (X'X)^{-1}$
  (This is the variance of LS around its mean, not β.)
$\mathrm{Asy.Var}[2SLS] \geq \mathrm{Asy.Var}[LS]$ in the matrix sense.
Compare inverses:
  $\{\mathrm{Asy.Var}[LS]\}^{-1} - \{\mathrm{Asy.Var}[2SLS]\}^{-1} = (1/\sigma^2)[X'X - \hat{X}'\hat{X}]$
  $= (1/\sigma^2)[X'X - X'(I - M_Z)X] = (1/\sigma^2)[X'M_Z X]$
This matrix is nonnegative definite. (Not positive definite as it might have some rows and columns which are zero.)
Implication for "precision" of 2SLS.
The problem of "Weak Instruments"

Part 2A: Endogeneity [ 42/73]
On Sat, May 3, 2014 at 4:48 PM, … wrote:
Dear Professor Greene,
I am giving an Econometrics course in Brazil and we are using your great textbook. I got a question which I think only you can help me.
In our last class, I did a formal proof that var(beta_hat_OLS) is lower or equal than var(beta_hat_2SLS), under homoscedasticity. We know this assertive is also valid under heteroscedasticity, but a graduate student asked me the proof (which is my problem). Do you know where can I find it?

Part 2A: Endogeneity [ 43/73]

Part 2A: Endogeneity [ 44/73]

Part 2A: Endogeneity [ 45/73]

Part 2A: Endogeneity [ 46/73]
Estimating σ²
Estimating the asymptotic covariance matrix: a caution about estimating σ².
Since the regression is computed by regressing y on $\hat{x}$, one might use
  $\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{x}_i'b_{2SLS})^2$ (uses $\hat{x}$)
This is inconsistent. Use
  $\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - x_i'b_{2SLS})^2$ (uses $x$)
(Degrees of freedom correction is optional. Conventional, but not necessary.)

Part 2A: Endogeneity [ 47/73]
Robust estimation of VC
Counterpart to the White estimator allows heteroscedasticity:
  $\mathrm{Est.Asy.Var}[\hat\beta] = (\hat{X}'\hat{X})^{-1}\left[\sum_{i,t}(y_{it} - x_{it}'\hat\beta)^2\,\hat{x}_{it}\hat{x}_{it}'\right](\hat{X}'\hat{X})^{-1}$
  "Predicted" X: $\hat{x}_{it}$ in the outer products and in $\hat{X}'\hat{X}$
  "Actual" X: $x_{it}$ in the residual

Part 2A: Endogeneity [ 48/73]
2SLS vs. Robust Standard Errors
+--------------------------------------------------+
| Robust Standard Errors                           |
+---------+--------------+----------------+--------+
|Variable | Coefficient  | Standard Error |b/St.Er.|
+---------+--------------+----------------+--------+
 B_1        45.4842872      4.02597121       11.298
 B_2          .05354484      .01264923        4.233
 B_3         -.00169664      .00029006       -5.849
 B_4          .01294854      .05757179         .225
 B_5          .38537223      .07065602        5.454
 B_6          .36777247      .06472185        5.682
 B_7          .95530115      .08681261       11.000
+--------------------------------------------------+
| 2SLS Standard Errors                             |
+---------+--------------+----------------+--------+
|Variable | Coefficient  | Standard Error |b/St.Er.|
+---------+--------------+----------------+--------+
 B_1        45.4842872       .36908158      123.236
 B_2          .05354484      .03139904        1.705
 B_3         -.00169664      .00069138       -2.454
 B_4          .01294854      .16266435         .080
 B_5          .38537223      .17645815        2.184
 B_6          .36777247      .17284574        2.128
 B_7          .95530115      .20846241        4.583

Part 2A: Endogeneity [ 49/73]
Endogeneity Test?
(Hausman)
          Exogenous                 Endogenous
  OLS     Consistent, Efficient     Inconsistent
  2SLS    Consistent, Inefficient   Consistent
Base a test on $d = b_{2SLS} - b_{OLS}$
Use a Wald statistic, $d'[\mathrm{Var}(d)]^{-1}d$
What to use for the variance matrix?
Hausman: $V_{2SLS} - V_{OLS}$

Part 2A: Endogeneity [ 50/73]
Hausman Test

Part 2A: Endogeneity [ 51/73]
Hausman Test: One at a Time?

Part 2A: Endogeneity [ 52/73]
Endogeneity Test: Wu
Considerable complication in Hausman test (text, pp. 234-237)
Simplification: Wu test.
Regress y on X and $\hat{X}$ estimated for the endogenous part of X. Then use an ordinary Wald test.

Part 2A: Endogeneity [ 53/73]
Wu Test
Since this is 2SLS using a control function, the standard errors should have been adjusted to carry out this test. (The sum of squares is too small.)

Part 2A: Endogeneity [ 54/73]
Regression Based Endogeneity Test
An easy t test. (Wooldridge 2010, p. 127)
$y_{it} = x_{it}'\delta + \gamma q_{it} + \varepsilon_{it}$
Z = a set of M instruments.
Write $q = Z\pi + v$. Can be estimated by ordinary least squares.
Endogeneity concerns correlation between v and ε.
Add $\hat{v} = q - z'\hat\pi$ to the equation and use OLS:
  $y_{it} = x_{it}'\delta + \gamma q_{it} + \rho\,\hat{v}_{it} + \{\text{error}\}$
Simple t test on whether ρ equals 0.
Even easier, algebraically identical (Wu, 1973): add $\hat{q}$ to the equation and do the same test.

Part 2A: Endogeneity [ 55/73]
Testing Endogeneity of WKS
(1) Regress WKS on 1, EXP, EXPSQ, OCC, SOUTH, SMSA, MS.
    U = residual, WKSHAT = prediction
(2) Regress LWAGE on 1, EXP, EXPSQ, OCC, SOUTH, SMSA, WKS, U or WKSHAT
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Constant   -9.97734299      .75652186      -13.188    .0000
 EXP          .01833440      .00259373        7.069    .0000   19.8537815
 EXPSQ       -.799491D-04    .603484D-04     -1.325    .1852   514.405042
 OCC         -.28885529      .01222533      -23.628    .0000    .51116447
 SOUTH       -.26279891      .01439561      -18.255    .0000    .29027611
 SMSA         .03616514      .01369743        2.640    .0083    .65378151
 WKS          .35314170      .01638709       21.550    .0000   46.8115246
 U           -.34960141      .01642842      -21.280    .0000   -.341879D-14
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Constant   -9.97734299      .75652186      -13.188    .0000
 EXP          .01833440      .00259373        7.069    .0000   19.8537815
 EXPSQ       -.799491D-04    .603484D-04     -1.325    .1852   514.405042
 OCC         -.28885529      .01222533      -23.628    .0000    .51116447
 SOUTH       -.26279891      .01439561      -18.255    .0000    .29027611
 SMSA         .03616514      .01369743        2.640    .0083    .65378151
 WKS          .00354028      .00116459        3.040    .0024   46.8115246
 WKSHAT       .34960141      .01642842       21.280    .0000   46.8115246

Part 2A: Endogeneity [ 56/73]
General Test for Endogeneity
Extending the Wu test:
  $y = X_1\beta_1 + X_2\beta_2 + \varepsilon$
There are $M \geq K_2$ instruments Z.
To carry out the test, compute $\hat{X}_2$ column by column with regressions on Z.
Compute by OLS the extended model
  $y = X_1\beta_1 + X_2\beta_2 + \hat{X}_2\gamma + \text{error}$
Use an F test to test $H_0: \gamma = 0$.

Part 2A: Endogeneity [ 57/73]
Alternative to Hausman's Formula?
H test requires the difference between an efficient and an inefficient estimator.
Any way to compare any two competing estimators even if neither is efficient?
Bootstrap?
(Maybe)

Part 2A: Endogeneity [ 58/73]

Part 2A: Endogeneity [ 59/73]
Weak Instruments
Symptom: The relevance condition, plim Z'X/n not zero, is close to being violated.
Detection:
  Standard F test in the regression of $x_k$ on Z. F < 10 suggests a problem.
  F statistic based on 2SLS – see text p. 351.
Remedy:
  Not much – most of the discussion is about the condition, not what to do about it.
  Use LIML? Requires a normality assumption. Probably not too restrictive.

Part 2A: Endogeneity [ 60/73]
Weak Instruments (cont.)
$\mathrm{plim}\,\hat\beta = \beta + [\mathrm{Cov}(Z, X)]^{-1}\mathrm{Cov}(Z, \varepsilon)$
If Cov(Z, ε) is "small" but nonzero, small Cov(Z, X) may hugely magnify the effect.
IV is not only inefficient, it may be very badly biased by "weak" instruments.
Solutions? Can one "test" for weak instruments?

Part 2A: Endogeneity [ 61/73]
Weak Instruments
Which is better?
  LS is inconsistent, but probably has smaller variance
  LS may be more precise
  IV is consistent, but probably has larger variance
$\mathrm{Asy.Var}[\hat\beta] = Q_{ZX}^{-1}\Omega_{ZZ}Q_{XZ}^{-1}$
$Q_{XZ}^{-1}$ may be large. (Compared to what?)
Strange results with "small" $Q_{ZX}$:
  IV estimator tends to resemble OLS (bias) (not a function of sample size).
  Contradictory result. Suppose z is perfectly correlated with x. IV MUST be the same as OLS.

Part 2A: Endogeneity [ 62/73]
A study of moral hazard
Riphahn, Wambach, Million: "Incentive Effects in the Demand for Healthcare," Journal of Applied Econometrics, 2003
Did the presence of the ADDON insurance influence the demand for health care – doctor visits and hospital visits?
For a simple example, we examine the PUBLIC insurance (89%) instead of ADDON insurance (2%).

Part 2A: Endogeneity [ 63/73]
Application: Health Care Panel Data
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Data downloaded from Journal of Applied Econometrics archive. This is an unbalanced panel with 7,293 individuals. The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. This is a large data set.
There are altogether 27,326 observations. The number of observations per individual ranges from 1 to 7. (Frequencies are: 1=1525, 2=1079, 3=825, 4=926, 5=1051, 6=1000, 7=887.)
  DOCTOR   = 1(Number of doctor visits > 0)
  HOSPITAL = 1(Number of hospital visits > 0)
  HSAT     = health satisfaction, coded 0 (low) - 10 (high)
  DOCVIS   = number of doctor visits in last three months
  HOSPVIS  = number of hospital visits in last calendar year
  PUBLIC   = insured in public health insurance = 1; otherwise = 0
  ADDON    = insured by add-on insurance = 1; otherwise = 0
  HHNINC   = household nominal monthly net income in German marks / 10000 (4 observations with income = 0 were dropped)
  HHKIDS   = children under age 16 in the household = 1; otherwise = 0
  EDUC     = years of schooling
  AGE      = age in years
  MARRIED  = marital status

Part 2A: Endogeneity [ 64/73]
Evidence of Moral Hazard?

Part 2A: Endogeneity [ 65/73]
Regression Study

Part 2A: Endogeneity [ 66/73]
Endogenous Dummy Variable
Doctor Visits = f(Age, Educ, Health, Presence of Insurance, Other unobservables)
Insurance = f(Expected Doctor Visits, Other unobservables)

Part 2A: Endogeneity [ 67/73]
Approaches
(Parametric) Control Function: Build a structural model for the two variables (Heckman)
(Semiparametric) Instrumental Variable: Create an instrumental variable for the dummy variable (Barnow/Cain/Goldberger, Angrist, current generation of researchers)
(?) Propensity Score Matching (Heckman et al., Becker/Ichino, many recent researchers)

Part 2A: Endogeneity [ 68/73]
Heckman's Control Function Approach
Y = xβ + δT + E[ε|T] + {ε - E[ε|T]}
λ = E[ε|T], computed from a model for whether T = 0 or 1
Magnitude = 11.1200 is nonsensical in this context.

Part 2A: Endogeneity [ 69/73]
Instrumental Variable Approach
Construct a prediction for T using only the exogenous information.
Use 2SLS using this instrumental variable.
Magnitude = 23.9012 is also nonsensical in this context.
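The mechanics of the instrumental variable approach for an endogenous dummy variable can be sketched on simulated data. This is an illustration only: it does not use the health care data and does not reproduce the magnitudes quoted above. The data generating process, the instrument z, the linear first stage, and every variable name below are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical data generating process (not the health care data).
x = rng.normal(size=n)                     # exogenous covariate
z = rng.normal(size=n)                     # instrument, excluded from the outcome equation
v = rng.normal(size=n)                     # unobservable that drives treatment
eps = 0.6 * v + 0.8 * rng.normal(size=n)   # outcome disturbance, correlated with v

T = (z + v > 0).astype(float)              # endogenous dummy variable
delta = 1.0                                # assumed true treatment coefficient
y = 1.0 + 0.5 * x + delta * T + eps

X = np.column_stack([np.ones(n), x, T])    # regressors, T endogenous
Z = np.column_stack([np.ones(n), x, z])    # exogenous information

# OLS for comparison: inconsistent because T is correlated with eps.
b_ols = np.linalg.solve(X.T @ X, X.T @ y)

# First stage: linear prediction of every column of X from Z.
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)

# Second stage in IV form: b = (X_hat'X)^{-1} X_hat'y.
b_2sls = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

# Residual variance uses the actual X, not the predicted X_hat.
resid = y - X @ b_2sls
s2 = resid @ resid / n

print("OLS  coefficient on T:", b_ols[2])
print("2SLS coefficient on T:", b_2sls[2])
print("sigma^2 estimate     :", s2)
```

In this design OLS overstates the coefficient on T because the unobservable that raises the probability of treatment also raises the outcome, while 2SLS recovers a value close to the assumed 1.0.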
Part 2A: Endogeneity [ 70/73]
Propensity Score Matching
Create a model for T that produces probabilities for T=1: "Propensity Scores"
Find people with the same propensity score – some with T=1, some with T=0
Compare number of doctor visits of those with T=1 to those with T=0.

Part 2A: Endogeneity [ 71/73]
Treatment Effect
Earnings and Education: Effect of an additional year of schooling
Estimating Average and Local Average Treatment Effects of Education when Compulsory Schooling Laws Really Matter
Philip Oreopoulos
AER, 96, 1, 2006, 152-175

Part 2A: Endogeneity [ 72/73]
Treatment Effects and Natural Experiments

Part 2A: Endogeneity [ 73/73]
How do panel data fit into this?
We can use the usual models.
Observations are surely correlated.
We can use far more elaborate models.
We can study effects through time.
The same individual is observed more than once.
Unobserved heterogeneity that appears in the disturbance in a cross section remains persistent across observations (on the same 'unit').
Procedures must be adjusted.
Dynamic effects are likely to be present.

Part 2A: Endogeneity [ 74/73]
Appendix: Structure and Regression

Part 2A: Endogeneity [ 75/73]
Least Squares Revisited
$b = (X'X)^{-1}X'y = \beta + (X'X)^{-1}X'\varepsilon$
$\mathrm{plim}\, b = \beta + \mathrm{plim}\,(X'X/N)^{-1}\,\mathrm{plim}(X'\varepsilon/N) = \beta + Q_{XX}^{-1}\gamma$
$b - \mathrm{plim}\, b = \beta + (X'X)^{-1}X'\varepsilon - (\beta + Q_{XX}^{-1}\gamma) = (X'X/N)^{-1}(X'\varepsilon/N) - Q_{XX}^{-1}\gamma$
$\mathrm{Asy.Var}[b - Q_{XX}^{-1}\gamma] = Q_{XX}^{-1}\,\mathrm{Asy.Var}[X'\varepsilon/N]\,Q_{XX}^{-1} = (1/N)\,Q_{XX}^{-1}[\mathrm{plim}\, X'\Omega X/N]\,Q_{XX}^{-1} = (1/N)\,Q_{XX}^{-1}\Omega_{XX}Q_{XX}^{-1}$
(Assuming "well behaved" data)
$\sqrt{N}(b - \mathrm{plim}\, b) = (X'X/N)^{-1}\sqrt{N}\,\dfrac{\sum_{i,t} x_{it}\varepsilon_{it}}{N} - \sqrt{N}\,Q_{XX}^{-1}\gamma = (X'X/N)^{-1}\sqrt{N}\,(\bar{w} - \mathrm{plim}\,\bar{w})$
Invoke Slutsky and Lindeberg-Feller for $\sqrt{N}\,(\bar{w} - \mathrm{plim}\,\bar{w})$ to assert asymptotic normality.
b is also asymptotically normally distributed, but inconsistent.
Part 2A: Endogeneity [ 76/73]
Inference with IV Estimators
(1) Wald Statistics:
  $(R\hat\beta - q)'\{R\,\mathrm{Est.Asy.Var}[\hat\beta]\,R'\}^{-1}(R\hat\beta - q)$
  (E.g., the usual 't-statistics')
(2) A type of F statistic:
  Compute $SSUA = (y - X\hat\beta_U)'(y - X\hat\beta_U)$ without restrictions
  Compute $SSR = (y - \hat{X}\hat\beta_R)'(y - \hat{X}\hat\beta_R)$ with restrictions
  Compute $SSU = (y - \hat{X}\hat\beta_U)'(y - \hat{X}\hat\beta_U)$ without restrictions
  $F = \dfrac{(SSR - SSU)/J}{SSUA/(N-K)} \sim F[J, N-K]$

Part 2A: Endogeneity [ 77/73]
Comparing OLS and IV
Least squares b:
  $\mathrm{plim}\, b = \beta + Q_{XX}^{-1}\gamma$
  $\mathrm{Asy.Var}[b] = Q_{XX}^{-1}\Omega_{XX}Q_{XX}^{-1}$
  Precision = Mean squared error$[b \mid \beta]$ = $\mathrm{Asy.Var}[b] + [\mathrm{plim}(b - \beta)][\mathrm{plim}(b - \beta)]'$
  $= Q_{XX}^{-1}\Omega_{XX}Q_{XX}^{-1} + Q_{XX}^{-1}\gamma\gamma'Q_{XX}^{-1}$
Instrumental variables $\hat\beta$:
  $\mathrm{plim}\,\hat\beta = \beta$
  $\mathrm{Asy.Var}[\hat\beta] = Q_{ZX}^{-1}\Omega_{ZZ}Q_{XZ}^{-1}$
  Precision = Mean squared error$[\hat\beta \mid \beta]$ $= Q_{ZX}^{-1}\Omega_{ZZ}Q_{XZ}^{-1}$

Part 2A: Endogeneity [ 78/73]
Testing for Endogeneity(?)
A test for endogeneity? Consider one variable:
  $y_{it} = x_{it}'\delta + \gamma q_{it} + \varepsilon_{it}$ ==> $y = X\beta + \varepsilon$
q may be endogenous. Z = a set of M instruments.
(1) OLS: $b = (X'X)^{-1}X'y$. Inconsistent if $\mathrm{Cov}[q, \varepsilon] \neq 0$. Consistent and efficient if $\mathrm{Cov}[q, \varepsilon] = 0$.
(2) 2SLS: $\hat\beta = (\hat{X}'\hat{X})^{-1}\hat{X}'y$. Always consistent. Inefficient if $\mathrm{Cov}[q, \varepsilon] = 0$.
Hausman test?
  $[\hat\beta - b]'\{\hat\sigma^2_{2SLS}(\hat{X}'\hat{X})^{-1} - \hat\sigma^2_{OLS}(X'X)^{-1}\}^{-1}[\hat\beta - b]$
(a) Need to use the same estimator of σ²:
  $\dfrac{1}{\hat\sigma^2_{2SLS}}[\hat\beta - b]'\{(\hat{X}'\hat{X})^{-1} - (X'X)^{-1}\}^{-1}[\hat\beta - b]$
(b) Even with this fix, the resulting matrix is singular.

Part 2A: Endogeneity [ 79/73]
Structure vs. Regression
Reduced Form vs. Structural Model
Simultaneous equations origin:
  Q(d) = a0 + a1P + a2I + e(d) (demand)
  Q(s) = b0 + b1P + b2R + e(s) (supply)
  Q(.) = Q(d) = Q(s)
What is the effect of a change in I on Q(.)? (Not a regression)
Reduced form: Q = c0 + c1I + c2R + v. (Regression)
Modern concepts of structure vs. regression: The search for causal effects.
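The link between the structural demand and supply equations and the reduced form can be worked out explicitly. The derivation below is a sketch that assumes $b_1 \neq a_1$, so that an equilibrium price exists; the expressions for $c_0$, $c_1$, $c_2$ and $v$ are implied by the two structural equations and are not stated on the slide.

```latex
% Equilibrium: Q(d) = Q(s)
a_0 + a_1 P + a_2 I + e_d = b_0 + b_1 P + b_2 R + e_s
\;\Longrightarrow\;
P = \frac{(a_0 - b_0) + a_2 I - b_2 R + (e_d - e_s)}{b_1 - a_1}

% Substitute P into the supply equation to obtain the reduced form for Q
Q = c_0 + c_1 I + c_2 R + v, \qquad
c_0 = \frac{a_0 b_1 - a_1 b_0}{b_1 - a_1}, \quad
c_1 = \frac{a_2 b_1}{b_1 - a_1}, \quad
c_2 = \frac{-a_1 b_2}{b_1 - a_1}, \quad
v = \frac{b_1 e_d - a_1 e_s}{b_1 - a_1}
```

Under these assumptions the effect of a change in I on equilibrium Q is $c_1 = a_2 b_1/(b_1 - a_1)$, a mixture of demand and supply parameters rather than the structural coefficient $a_2$ alone.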
Part 2A: Endogeneity [ 80/73]
Implications
The structure is the theory.
The regression is the conditional mean.
  There is always a conditional mean.
  It may not equal the structure.
  It may be linear in the same variables.
What is the implication for least squares estimation?
  LS estimates regressions.
  LS does not necessarily estimate structures.
  Structures may not be estimable – they may not be identified.

Part 2A: Endogeneity [ 81/73]
Structure and Regression
Simultaneity? What if E[ε|x] ≠ 0?
$y = \beta x + \varepsilon$, $x = \delta y + u$. $\mathrm{Cov}[x, \varepsilon] \neq 0$.
$\beta x$ is not the regression? What is the regression?
Reduced form: Assume ε and u are uncorrelated.
  $y = [\beta/(1 - \beta\delta)]\,u + [1/(1 - \beta\delta)]\,\varepsilon$
  $x = [1/(1 - \beta\delta)]\,u + [\delta/(1 - \beta\delta)]\,\varepsilon$
  $\mathrm{Cov}[x, y]/\mathrm{Var}[x] = [\beta\sigma_u^2 + \delta\sigma_\varepsilon^2]\,/\,[\sigma_u^2 + \delta^2\sigma_\varepsilon^2] = \beta w + (1 - w)(1/\delta)$
  where $w = \sigma_u^2 / [\sigma_u^2 + \delta^2\sigma_\varepsilon^2]$
The regression is $y = \gamma x + v$, where $E[v \mid x] = 0$.

Part 2A: Endogeneity [ 82/73]
Structure vs. Regression
Supply = a + b*Price + c*Capacity
Demand = A + B*Price + C*Income
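The simultaneity result for y = βx + ε, x = δy + u can be checked by simulation. The sketch below generates data from the reduced form and compares the least squares slope with βw + (1 - w)(1/δ). The parameter values (β = 0.5, δ = -1, unit variances), the sample size, and the variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Assumed structural parameters and disturbance standard deviations.
beta, delta = 0.5, -1.0
sigma_u, sigma_e = 1.0, 1.0

u = sigma_u * rng.normal(size=n)
e = sigma_e * rng.normal(size=n)   # uncorrelated with u

# Reduced form implied by y = beta*x + e and x = delta*y + u.
x = (u + delta * e) / (1 - beta * delta)
y = (beta * u + e) / (1 - beta * delta)

# Slope of the regression of y on x: Cov[x, y] / Var[x].
slope_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Slope implied by the weighted average formula.
w = sigma_u**2 / (sigma_u**2 + delta**2 * sigma_e**2)
slope_formula = beta * w + (1 - w) / delta

print("least squares slope:", slope_ols)
print("formula            :", slope_formula)
```

With these values w = 0.5 and the formula gives -0.25, so least squares estimates a regression slope that differs in sign from the structural β = 0.5.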