Empirical Example
Walter Sosa Escudero (wsosa@udesa.edu.ar)
Universidad de San Andres - UNLP

Panel Data Models
In this exercise, we will replicate the results in “Estimating the Economic Model of Crime with Panel Data”, by C. Cornwell and W. Trumbull (1994).

The Data
• Cornwell and Trumbull estimate an economic model of crime.
• Panel dataset of North Carolina counties.
• They use single and simultaneous equations panel data estimators to address two sources of endogeneity: unobserved heterogeneity and conventional simultaneity.
• The data are county level, so there is a relatively low level of aggregation.

Model and Alternative Estimators
• The basic assumption is: an individual's participation in the criminal sector depends on the relative monetary return to illegal activities and on the degree to which the criminal justice system is able to affect the probabilities of apprehension and punishment.
• Cornwell and Trumbull specify the following crime equation:
  $R_{it} = X'_{it}\beta + P'_{it}\gamma + \alpha_i + e_{it}, \quad i = 1,\dots,N, \quad t = 1,\dots,T$   (1)
• where:
  – $R_{it}$ is the crime rate.
  – $X'_{it}$ contains variables which control for the relative return to legal opportunities (wcon, wtuc, wtrd, wfir, wser, wmfg, wfed, wsta, wloc, polpc, urban, density, west, central, pctymle, pctmin).
  – $P'_{it}$ contains a set of deterrent variables (prbarr, prbconv, prbpris, avgsen).
  – $\alpha_i$ are fixed effects (they may be correlated with $(X'_{it}, P'_{it})$).
  – $e_{it}$ are typical disturbance terms.

Summary of variables
• Dependent variable: crime rate (R)
• Probability of arrest (PA)
• Probability of conviction (PC)
• Probability of prison (PP)
• Sanction severity (S)

• The “between” transformation of (1) is:
  $\bar{R}_{i} = \bar{X}'_{i}\beta + \bar{P}'_{i}\gamma + \alpha_i + \bar{e}_{i}$   (2)
  The data are expressed in county means: $\bar{R}_{i} = \frac{1}{T}\sum_{t} R_{it}$.
• The “within” transformation of (1) is:
  $\tilde{R}_{it} = \tilde{X}'_{it}\beta + \tilde{P}'_{it}\gamma + \tilde{e}_{it}$   (3)
  The data are in deviations from county means: $\tilde{R}_{it} = R_{it} - \bar{R}_{i}$. Equation (3) does NOT depend on the county effects.
• The authors adopt a log-linear specification so that their estimated coefficients are interpretable as elasticities.
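The between and within transformations in (2) and (3) can be sketched with NumPy as follows. This is only an illustration on a made-up two-county panel: the function names, the toy numbers and the slope of -0.4 are assumptions for the example, not values from Cornwell and Trumbull's data.

```python
import numpy as np

def between_transform(values, groups):
    """County means of each column, as in equation (2)."""
    values = np.asarray(values, dtype=float)       # shape (n_obs, n_vars)
    labels, inverse = np.unique(np.asarray(groups), return_inverse=True)
    sums = np.zeros((labels.size, values.shape[1]))
    np.add.at(sums, inverse, values)               # per-county column sums
    counts = np.bincount(inverse).astype(float)    # observations per county
    return labels, sums / counts[:, None]

def within_transform(values, groups):
    """Deviations from county means, as in equation (3)."""
    values = np.asarray(values, dtype=float)
    _, inverse = np.unique(np.asarray(groups), return_inverse=True)
    _, means = between_transform(values, groups)
    return values - means[inverse]

# Hypothetical toy panel: 2 counties, 3 years, log-linear model with
# county effects alpha_i and a true elasticity of -0.4 (assumed values).
county = np.array([0, 0, 0, 1, 1, 1])
log_prbarr = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
alpha = np.array([10.0, 10.0, 10.0, -5.0, -5.0, -5.0])
log_crime = alpha - 0.4 * log_prbarr

data = np.column_stack([log_crime, log_prbarr])
_, means = between_transform(data, county)
dev = within_transform(data, county)

# Between estimator: OLS on county means (with an intercept).
between_X = np.column_stack([np.ones(len(means)), means[:, 1]])
between_slope = np.linalg.lstsq(between_X, means[:, 0], rcond=None)[0][1]

# Within estimator: OLS on deviations from county means (no intercept).
within_slope = np.linalg.lstsq(dev[:, [1]], dev[:, 0], rcond=None)[0][0]

print("between slope:", between_slope)
print("within slope:", within_slope)
```

In this toy panel the county effects are correlated with the regressor, so the between slope is far from the assumed elasticity, while the within slope recovers it because the transformation removes the county effects.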
“Between” Model:
• Equation (2) leads to cross-section estimators which neglect unobserved county heterogeneity.
• If unobserved characteristics are correlated with $(X'_{it}, P'_{it})$, such a procedure will produce inconsistent estimates.

“Within” Model:
• By using (3), both sources of endogeneity may be addressed.
• If the only problem is correlation between $(X'_{it}, P'_{it})$ and unobserved heterogeneity, then consistent estimation is possible by performing least squares on (3).
• Conventional simultaneity can be accounted for by applying 2SLS to (3).

“Between” Model
Balanced panel: N = 90, T = 7.
F test of joint significance: it rejects the null.
• With the exception of PP, the elements of $\hat{\gamma}$ have the correct NEGATIVE signs.
• Only the estimated coefficients of PA and PC are statistically significant at the usual significance levels.
• The estimated arrest and conviction elasticities are, respectively, -0.65 and -0.53.
• For the rest of the variables, only lpolpc, ldensity, west, central and lpctmin are statistically significant at 5%.
• For example, if the number of police per capita increases by 1%, the crime rate increases by 0.36%.
• The “between” estimator is consistent only if $(X'_{it}, P'_{it})$ is orthogonal to both $\alpha_i$ and $e_{it}$.
• The “within” estimator is a simple solution when the orthogonality condition fails, that is, when $(X'_{it}, P'_{it})$ is correlated with unobserved heterogeneity.

Fixed Effects Estimation
Balanced panel: N = 90, T = 7.
F test of joint significance: it rejects the null.
The region and urban dummies and the percentage minority variable do not vary over time, so they are eliminated by the within transformation.
Fixed effects test: it rejects the null, so the fixed effects are statistically significant.
• Now, the estimated coefficient of PP has the correct (negative) sign and is statistically significant.
• The within estimate of the deterrent effect of S is small and statistically insignificant.
• Conditioning on the county effects causes the (absolute value of the) estimated deterrent elasticities associated with PA and PC to decrease by 41% and 43%, respectively.

  Variable        Between coefficient   Within coefficient   % Variation
  lprbarr  (PA)   -0.6475095            -0.3849533           -41
  lprbconv (PC)   -0.5282029            -0.3006001           -43
  lprbpris (PP)    0.2965068            -0.1913185           -35
  lavgsen  (S)    -0.235888              0.0261132           -89

  (% Variation is the change in the absolute value of the coefficient.)

• In the Fixed Effects model, both sources of endogeneity may be addressed.
• First, if the only problem is correlation between $(X'_{it}, P'_{it})$ and unobserved heterogeneity, then consistent estimation is possible by performing OLS on (3). This within estimator can be viewed as an instrumental variables estimator with instruments (deviations from means) that are orthogonal to the effects by construction.
• Second, conventional simultaneity can be accounted for by using 2SLS to estimate (3).
• If the county-specific constant terms are randomly distributed across counties, we can estimate a Random Effects Model.
• In order to estimate a Random Effects Model, it is necessary to assume that the explanatory variables are uncorrelated with the county-specific term.
• A Hausman test can be constructed to compare the FE and RE estimates.

Hausman Test
• It rejects the null, so there are systematic differences between the FE and RE coefficients. In that case:
• RE estimators: INCONSISTENT
• FE estimators: CONSISTENT

Random Effects and Serial Correlation
• Bera-Yoon-Sosa Escudero (2001):
  – The BP test for random effects implicitly assumes no autocorrelation.
  – The presence of serial correlation confuses the BP test, leading it to reject H0 even though H0 is correct.
  – The same thing happens with the autocorrelation test when random effects are present.
  – BYS: modified tests.
• Joint Test Baltagi-Li (1991):
  – Test for the joint null of no autocorrelation and no random effects (low power, less informative).
• Sosa Escudero (2001):
  – Joint test for random effects and positive serial correlation (one-sided, one-directional).
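As a rough illustration of the unadjusted BP test mentioned above, the standard Breusch-Pagan LM statistic for random effects in a balanced panel can be computed from pooled OLS residuals. The sketch below assumes the residuals are already arranged as an N x T array; the function name is hypothetical, and this is the version that assumes no serial correlation, not the adjusted statistic.

```python
import numpy as np

def breusch_pagan_lm(residuals):
    """Unadjusted Breusch-Pagan LM statistic for random effects.

    residuals: (N, T) array of pooled OLS residuals from a balanced panel,
    one row per county and one column per period.
    """
    e = np.asarray(residuals, dtype=float)
    n, t = e.shape
    # Ratio of squared county residual sums to the total sum of squares.
    ratio = (e.sum(axis=1) ** 2).sum() / (e ** 2).sum()
    return n * t / (2.0 * (t - 1)) * (ratio - 1.0) ** 2

# Hypothetical residuals with a strong county-specific component.
toy_residuals = np.array([[1.0, 1.0],
                          [-1.0, -1.0]])
print(breusch_pagan_lm(toy_residuals))
```

Under the null of no random effects (and no serial correlation), the statistic is asymptotically chi-squared with one degree of freedom, so large values lead to rejection.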
Results of the Tests
• In the Random Effects tests, the null is $H_0: \sigma_\alpha^2 = 0$ in the Random Effects model.
• The test rejects this null, so the OLS estimators are NOT BLUE.
• In the Serial Correlation tests, the null is $H_0: \rho = 0$.
• The test rejects this null.
In all tests, we reject the null.
• Note that the statistics decrease in all the adjusted versions of the tests:
  – LM test for random effects, assuming no serial correlation: 672.89.
  – Adjusted LM test for random effects, which works even under serial correlation: 340.20.
  – LM test for first order serial correlation, assuming no random effects: 375.04.
  – Adjusted test for first order serial correlation, which works even under random effects: 42.36.
  – LM joint test for random effects and serial correlation: 715.24.
This joint test rejects the joint null, but is NOT informative about the direction of the misspecification.

Instrumental Variables
• Conventional simultaneity may exist between the crime rate, the probability of arrest and the number of police per capita.
• Counties experiencing rising crime rates, holding police resources constant, would see probabilities of arrest fall.
• But increases in crime may motivate a county to increase policing resources, which would increase the probability of arrest.
• Now, we also allow for the possibility that PA and the number of police per capita may be correlated with $e_{it}$.
• Applying 2SLS to the “within” model, we address simultaneity as well as unobserved heterogeneity.
• We need at least two exogenous instruments (uncorrelated with $e_{it}$ and the effects).
• We use as instruments a mix of different offense types and per capita tax revenue.

2SLS with Fixed Effects
• PA, PC and PP are NOT statistically significant.
• Only lwfed and lwloc are statistically significant.
• The fixed effects are statistically significant.

2SLS Applied to the Between Model
• PA and PC are statistically significant and have the correct signs.
• PP is NOT statistically significant.
• The estimated coefficient of PA is 30% lower when 2SLS is applied to the “between” model than when 2SLS is applied to the “within” model.
• The statistical consequences of neglecting unobserved heterogeneity in our sample are serious whether single or simultaneous equations estimators are used!
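The idea of applying 2SLS to the within model can be sketched as follows: demean the dependent variable, the regressors and the instruments by county, then run the two stages on the demeaned data. This is a minimal sketch on simulated data, not the authors' code; the function names, the data-generating process and the coefficient of -0.4 are all assumptions for illustration, and standard errors are omitted.

```python
import numpy as np

def demean_by_group(a, groups):
    """Subtract group (county) means from each column of a 2-D array."""
    a = np.asarray(a, dtype=float)
    _, inverse = np.unique(np.asarray(groups), return_inverse=True)
    counts = np.bincount(inverse).astype(float)
    sums = np.zeros((counts.size, a.shape[1]))
    np.add.at(sums, inverse, a)
    return a - (sums / counts[:, None])[inverse]

def within_2sls(y, X, Z, groups):
    """2SLS on within-transformed data.

    y: (n,) outcome; X: (n, k) regressors, possibly endogenous;
    Z: (n, m) instruments, including any exogenous regressors, with m >= k.
    """
    y_t = demean_by_group(np.asarray(y, dtype=float).reshape(-1, 1), groups)
    X_t = demean_by_group(X, groups)
    Z_t = demean_by_group(Z, groups)
    # First stage: project the demeaned regressors on the demeaned instruments.
    first_stage, *_ = np.linalg.lstsq(Z_t, X_t, rcond=None)
    X_hat = Z_t @ first_stage
    # Second stage: OLS of the demeaned outcome on the fitted regressors.
    beta, *_ = np.linalg.lstsq(X_hat, y_t, rcond=None)
    return beta.ravel()

# Simulated panel with the same dimensions as the slides (N = 90, T = 7).
rng = np.random.default_rng(0)
n_counties, n_years = 90, 7
groups = np.repeat(np.arange(n_counties), n_years)
alpha = rng.normal(size=n_counties)[groups]   # county effects
z = rng.normal(size=groups.size)              # hypothetical instrument
u = rng.normal(size=groups.size)              # structural disturbance
x = z + alpha + 0.8 * u + rng.normal(size=groups.size)  # endogenous regressor
y = -0.4 * x + alpha + u                      # assumed true coefficient: -0.4

beta_iv = within_2sls(y, x.reshape(-1, 1), z.reshape(-1, 1), groups)
print("within 2SLS slope:", beta_iv[0])
```

Here the regressor is correlated with both the county effects and the disturbance, so the within transformation handles the first problem and the instrument handles the second.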