Greene-TrueRandomEffects-APPC2014 - NYU Stern

1/76 True Random Effects in Stochastic Frontier Models
William Greene, New York University

2/76 Agenda
- Skew normality (Adelchi Azzalini)
- Stochastic frontier model
- Panel data: time varying and time invariant inefficiency models
- Panel data: true random effects models
- Maximum simulated likelihood estimation
- Applications of true random effects
  - Persistent and transient inefficiency in Swiss railroads
  - A panel data sample selection corrected stochastic frontier model
  - Spatial effects in a stochastic frontier model
http://people.stern.nyu.edu/wgreene/appc2014.pdf

3/76 Skew Normality

4/76 The Stochastic Frontier Model
$$\ln y_i = \alpha + \beta' x_i + v_i - u_i, \quad v_i \sim N[0, \sigma_v^2], \quad u_i = |U_i|, \; U_i \sim N[0, \sigma_u^2], \quad \varepsilon_i = v_i - u_i = v_i - |U_i|$$
Convenient parameterization (notation):
$$\varepsilon_i = \sigma_v V_i - \sigma_u |U_i| = \sigma_v N[0,1]_i - \sigma_u |N[0,1]|$$

5/76 Log Likelihood
$$\lambda = \frac{\sigma_u}{\sigma_v}, \quad \sigma = \sqrt{\sigma_u^2 + \sigma_v^2}$$
$$\log L(\alpha, \beta, \sigma, \lambda) = \sum_{i=1}^{N} \left[ \log\frac{2}{\sigma} + \log\phi\!\left(\frac{y_i - \alpha - \beta' x_i}{\sigma}\right) + \log\Phi\!\left(\frac{-\lambda (y_i - \alpha - \beta' x_i)}{\sigma}\right) \right]$$
Skew normal density:
$$\log L = \sum_{i=1}^{N} \log\left[ \frac{2}{\sigma}\, \phi\!\left(\frac{\varepsilon_i}{\sigma}\right) \Phi\!\left(\frac{-\lambda \varepsilon_i}{\sigma}\right) \right]$$

6/76 Birnbaum (1950) Wrote About Skew Normality
Effect of Linear Truncation on a Multinormal Population

7/76 Weinstein (1964) Found f(ε)
Query 2: The Sum of Values from a Normal and a Truncated Normal Distribution
See also Nelson (Technometrics, 1964), Roberts (JASA, 1966)

8/76 O'Hagan and Leonard (1976) Found Something Like f(ε)
Resembles f(ε)
Bayes Estimation Subject to Uncertainty About Parameter Constraints

9/76 ALS (1977) Discovered How to Make Great Use of f(ε)
See also Forsund and Hjalmarsson (1974), Battese and Corra (1976), Poirier, … Timmer, … several others.
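The log likelihood on the "Log Likelihood" slide can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function names `log_sf_density` and `sf_loglik` are hypothetical, and the normal cdf is computed from `math.erfc` to avoid any dependency beyond NumPy.

```python
import math
import numpy as np

# Elementwise complementary error function (stdlib), used to build Phi.
_erfc = np.vectorize(math.erfc, otypes=[float])

def log_sf_density(eps, sigma, lam):
    """Log of the skew normal density (2/sigma) * phi(eps/sigma) * Phi(-lam*eps/sigma)."""
    z = np.asarray(eps, dtype=float) / sigma
    log_phi = -0.5 * z**2 - 0.5 * math.log(2.0 * math.pi)
    # Phi(a) = 0.5 * erfc(-a / sqrt(2)); here a = -lam * z.
    log_Phi = np.log(0.5 * _erfc(lam * z / math.sqrt(2.0)))
    return math.log(2.0 / sigma) + log_phi + log_Phi

def sf_loglik(alpha, beta, sigma, lam, y, X):
    """Normal-half normal log likelihood; y is (N,), X is (N, K) (assumed layout)."""
    eps = y - alpha - X @ beta
    return float(np.sum(log_sf_density(eps, sigma, lam)))

# Sanity check: the density should integrate to one over a wide grid.
grid = np.linspace(-8.0, 8.0, 16001)
dx = grid[1] - grid[0]
print(np.exp(log_sf_density(grid, 1.0, 1.5)).sum() * dx)
```

Working in logs keeps the sum stable when some residuals sit far in the tail; a maximizer would be applied to `sf_loglik` with sigma and lambda constrained to be positive.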
10/76 Azzalini (1985) Figured Out f(ε) and Noticed the Connection to ALS
The standard skew normal distribution: $f(\varepsilon) = 2\phi(\varepsilon)\Phi(\lambda\varepsilon)$

11/76 © 2014 http://azzalini.stat.unipd.it/SN/

12/76 http://azzalini.stat.unipd.it/SN/abstracts.html#sn99
ALS

13/76 A Useful FAQ About the Skew Normal
How to generate pseudo random draws on ε:
1. Draw U, V from independent N[0,1].
2. $\varepsilon = \sigma_v V + \sigma_u |U|$

14/76 Random Number Generator
For a particular desired σ and λ, use
$$\sigma_u = \sqrt{\frac{\sigma^2 \lambda^2}{1 + \lambda^2}} \quad \text{and} \quad \sigma_v = \sqrt{\frac{\sigma^2}{1 + \lambda^2}}, \quad \text{so that } \sigma^2 = \sigma_v^2 + \sigma_u^2.$$
Then $\varepsilon = \sigma_v N(0,1) - \sigma_u |N(0,1)|$.

15/76 How Many Applications of SF Are There?

16/76 W. D. Walls (2006) On Skewness in the Movies
Cites Azzalini. $2\phi(z)\Phi(\lambda z)$

17/76 SNARCH Model for Financial Crises (2013)
“The skew-normal distribution developed by Sahu et al. (2003)…” Does not know Azzalini.

18/76 A Skew Normal Mixed Logit Model (2010)
Mixed logit model:
$$\text{Prob}(\text{Choice}_i = j) = \frac{\exp(\beta_i' x_{ij})}{\sum_{j=1}^{J} \exp(\beta_i' x_{ij})}$$
Random parameters: $\beta_{ik} = \beta + \sigma w_{ik}$
Asymmetric (skewed) parameter distribution: $w_{ik} = v_{ik} + \lambda |U_{ik}| \sim SN(0, \sigma, \lambda)$
Greene (2010, knows Azzalini and ALS), Bhat (2011, knows not Azzalini … or ALS)

19/76 Skew Normal Applications
- Foundation: An Entire Field
  - Stochastic Frontier Model
- Occasional Modeling Strategy
  - Culture: Skewed Distribution of Movie Revenues
  - Finance: Crisis and Contagion
  - Choice Modeling: The Mixed Logit Model
- How can these people find each other?
- Where else do applications appear?

20/76 Stochastic Frontier

21/76 The Cross Section Departure Point: 1977
Aigner et al. (ALS) Stochastic Frontier Model
$$y_i = \alpha + \beta' x_i + v_i - u_i, \quad v_i \sim N[0, \sigma_v^2], \quad u_i = |U_i| \text{ and } U_i \sim N[0, \sigma_u^2]$$
Jondrow et al. (JLMS) Inefficiency Estimator
$$\hat u_i = E[u_i \mid \varepsilon_i] = \frac{\sigma\lambda}{1 + \lambda^2} \left[ \mu_i + \frac{\phi(\mu_i)}{\Phi(\mu_i)} \right]$$
$$\varepsilon_i = v_i - u_i, \quad \lambda = \frac{\sigma_u}{\sigma_v}, \quad \sigma = \sqrt{\sigma_v^2 + \sigma_u^2}, \quad \mu_i = \frac{-\lambda \varepsilon_i}{\sigma}$$

22/76 The Panel Data Models Appear: 1981
Pitt and Lee Random Effects Approach: 1981
$$y_{it} = \alpha + \beta' x_{it} + v_{it} - u_i \quad (u_i \text{ is time fixed})$$
$$v_{it} \sim N[0, \sigma_v^2], \quad u_i = |U_i| \text{ and } U_i \sim N[0, \sigma_u^2], \quad \varepsilon_{it} = v_{it} - u_i$$
Counterpart to Jondrow et al. (1982):
$$\hat u_i = E[u_i \mid \varepsilon_{i1}, \dots, \varepsilon_{iT}] = \mu_i^* + \sigma_* \left[ \frac{\phi(\mu_i^* / \sigma_*)}{\Phi(\mu_i^* / \sigma_*)} \right]$$
$$\mu_i^* = -\frac{\lambda T}{1 + \lambda T}\, \bar\varepsilon_i, \quad \sigma_*^2 = \frac{\sigma_u^2}{1 + \lambda T}, \quad \text{where on this slide } \lambda = \frac{\sigma_u^2}{\sigma_v^2}$$

23/76 Reinterpreting the Within Estimator: 1984
Schmidt and Sickles Fixed Effects Approach: 1984
$$y_{it} = \alpha_i + \beta' x_{it} + v_{it} \quad (\alpha_i \text{ is time fixed})$$
$v_{it} \sim N[0, \sigma_v^2]$; $\alpha_i$ is semiparametrically specified: fixed mean, constant variance.
Counterpart to Jondrow et al. (1982): $\hat u_i = \max_i(\hat\alpha_i) - \hat\alpha_i$
(The cost of the semiparametric specification is the location of the inefficiency distribution. The authors also revisit Pitt and Lee to demonstrate.)

24/76 Misgivings About Time Fixed Inefficiency: 1990-
Cornwell, Schmidt and Sickles (1990): $\alpha_{it} = \theta_{0i} + \theta_{1i} t + \theta_{2i} t^2$
Kumbhakar (1990): $u_{it} = [1 + \exp(bt + ct^2)]^{-1} |U_i|$
Battese and Coelli (1992, 1995): $u_{it} = \exp[-\eta(t - T)] |U_i|$, $u_{it} = \exp[g(t, T, z_{it})] |U_i|$
Cuesta (2000): $u_{it} = \exp[-\eta_i (t - T)] |U_i|$, $u_{it} = \exp[g_i(t, T, z_{it})] |U_i|$

25/76 Are the systematically time varying models more like time fixed or freely time varying?
A pooled model: $y_{it} = \alpha + \beta' x_{it} + v_{it} - u_{it}$
Battese and Coelli (1992): $u_{it} = \exp[-\eta(t - T)] |U_i|$
Pitt and Lee (1981): $y_{it} = \alpha + \beta' x_{it} + v_{it} - |U_i|$
Where is Battese and Coelli? Closer to the pooled model or to Pitt and Lee?
Greene (2004): Much closer to the Pitt and Lee model.

26/76 In these models with time varying inefficiency,
$$y_{it} = \alpha + \beta' x_{it} + v_{it} - g_i(t, z_{it}) |U_i|, \quad v_{it} \sim N[0, \sigma_v^2] \text{ and } U_i \sim N[0, \sigma_u^2],$$
where does unobserved time invariant heterogeneity end up? In the inefficiency! Even with the extensions.
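The "Random Number Generator" recipe and the JLMS estimator from the slides above can be checked against each other by simulation. The sketch below is only illustrative: the function names, the seed, and the sample size are assumptions, not part of the original deck.

```python
import math
import numpy as np

_erfc = np.vectorize(math.erfc, otypes=[float])

def draw_sf_errors(n, sigma, lam, rng):
    """Draw eps = sigma_v*N(0,1) - sigma_u*|N(0,1)| for a desired (sigma, lam)."""
    sigma_u = sigma * lam / math.sqrt(1.0 + lam**2)
    sigma_v = sigma / math.sqrt(1.0 + lam**2)
    u = sigma_u * np.abs(rng.standard_normal(n))
    v = sigma_v * rng.standard_normal(n)
    return v - u, u

def jlms(eps, sigma, lam):
    """JLMS estimator E[u | eps] for the normal-half normal model."""
    mu = -lam * np.asarray(eps, dtype=float) / sigma
    pdf = np.exp(-0.5 * mu**2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * _erfc(-mu / math.sqrt(2.0))  # Phi(mu)
    return sigma * lam / (1.0 + lam**2) * (mu + pdf / cdf)

rng = np.random.default_rng(12345)  # arbitrary seed for the illustration
eps, u = draw_sf_errors(200_000, sigma=1.0, lam=1.5, rng=rng)
u_hat = jlms(eps, sigma=1.0, lam=1.5)
# By iterated expectations the two means should be close.
print(u.mean(), u_hat.mean())
```

Note that the estimates are far less dispersed than the true u, since E[u | ε] averages over the noise v that cannot be separated from u for a single observation.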
27/76 Skepticism About Time Varying Inefficiency Models: Greene (2004)

28/76 True Random Effects

29/76 True Random and Fixed Effects: 2004
True Random and Fixed Effects Approach: 2004
$$y_{it} = \alpha_i + \beta' x_{it} + v_{it} - u_{it}$$
Time varying: $v_{it} \sim N[0, \sigma_v^2]$, $u_{it} = |U_{it}|$ and $U_{it} \sim N[0, \sigma_u^2]$
Time fixed: $\alpha_i$ = unobserved time invariant heterogeneity, not unobserved time invariant inefficiency
Jondrow et al. (JLMS) Inefficiency Estimator
$$\hat u_{it} = E[u_{it} \mid \varepsilon_{it}] = \frac{\sigma\lambda}{1 + \lambda^2} \left[ \mu_{it} + \frac{\phi(\mu_{it})}{\Phi(\mu_{it})} \right]$$
$$\varepsilon_{it} = v_{it} - u_{it}, \quad \lambda = \frac{\sigma_u}{\sigma_v}, \quad \sigma = \sqrt{\sigma_v^2 + \sigma_u^2}, \quad \mu_{it} = \frac{-\lambda \varepsilon_{it}}{\sigma}$$

30/76 Estimation of TFE and TRE Models: 2004
True Fixed Effects: MLE
$$y_{it} = \alpha_i + \beta' x_{it} + v_{it} - u_{it}, \quad v_{it} \sim N[0, \sigma_v^2], \quad u_{it} = |U_{it}| \text{ and } U_{it} \sim N[0, \sigma_u^2]$$
$\alpha_i$ = unobserved time invariant heterogeneity, not unobserved time invariant inefficiency
Just add firm dummy variables to the SF model (!)
True Random Effects: Maximum Simulated Likelihood (RPM)
$$y_{it} = (\alpha + w_i) + \beta' x_{it} + v_{it} - u_{it}, \quad v_{it} \sim N[0, \sigma_v^2], \quad u_{it} = |U_{it}| \text{ and } U_{it} \sim N[0, \sigma_u^2], \quad w_i \sim N[0, \sigma_w^2]$$
$\alpha_i = \alpha + w_i$ = unobserved time invariant heterogeneity, not unobserved time invariant inefficiency
Random parameters stochastic frontier model

31/76 Log likelihood function for stochastic frontier model
$$\log L(\alpha, \beta, \sigma, \lambda) = \sum_{i=1}^{N} \left[ \log\frac{2}{\sigma} + \log\phi\!\left(\frac{y_i - \alpha - \beta' x_i}{\sigma}\right) + \log\Phi\!\left(\frac{-\lambda (y_i - \alpha - \beta' x_i)}{\sigma}\right) \right]$$

32/76 Simulated log likelihood function for stochastic frontier model with a time invariant random constant term (TRE model)
$$\log L_S(\alpha, \beta, \sigma, \lambda, \sigma_w) = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \frac{2}{\sigma}\, \phi\!\left(\frac{y_{it} - (\alpha + \sigma_w w_{ir}) - \beta' x_{it}}{\sigma}\right) \Phi\!\left(\frac{-\lambda \left(y_{it} - (\alpha + \sigma_w w_{ir}) - \beta' x_{it}\right)}{\sigma}\right)$$
$w_{ir}$ = draws from N[0,1].

33/76 The Most Famous Frontier Study Ever

34/76 The Famous WHO Model
- logCOMP = α + β1 logPerCapitaHealthExpenditure + β2 logYearsEduc + β3 log²YearsEduc + ε
- ε = v - u
- Schmidt/Sickles FEM
- 191 countries, 140 of them observed 1993-1997.
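The simulated log likelihood for the TRE model (slide 32) averages, over R draws of the random constant, the product over t of the skew normal densities. A minimal sketch follows, assuming a balanced panel stored as arrays; the array layout and the function names are assumptions made for the illustration, and pseudo random draws stand in for whatever draws an application would use.

```python
import math
import numpy as np

_erfc = np.vectorize(math.erfc, otypes=[float])

def log_sf_density(eps, sigma, lam):
    """Log of (2/sigma) * phi(eps/sigma) * Phi(-lam*eps/sigma)."""
    z = np.asarray(eps, dtype=float) / sigma
    return (math.log(2.0 / sigma) - 0.5 * z**2 - 0.5 * math.log(2.0 * math.pi)
            + np.log(0.5 * _erfc(lam * z / math.sqrt(2.0))))

def tre_simulated_loglik(alpha, beta, sigma, lam, sigma_w, y, X, W):
    """y: (N, T), X: (N, T, K), W: (N, R) draws from N[0,1] (assumed layout)."""
    xb = X @ beta                                                  # (N, T)
    # Residual for every firm i, draw r, period t: shape (N, R, T).
    eps = (y - alpha - xb)[:, None, :] - sigma_w * W[:, :, None]
    log_prod = log_sf_density(eps, sigma, lam).sum(axis=2)         # (N, R)
    # log of the mean over draws, computed with a log-sum-exp shift.
    m = log_prod.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(log_prod - m).mean(axis=1))))

rng = np.random.default_rng(0)  # arbitrary seed
N, T, K, R = 8, 4, 2, 25
X = rng.standard_normal((N, T, K))
beta = np.array([0.3, -0.2])
y = 1.0 + X @ beta + 0.2 * rng.standard_normal((N, T)) - 0.3 * np.abs(rng.standard_normal((N, T)))
W = rng.standard_normal((N, R))
print(tre_simulated_loglik(1.0, beta, 0.4, 1.5, 0.5, y, X, W))
```

Setting `sigma_w = 0` removes the random constant, so the function should then return the pooled log likelihood of slide 31 exactly; that is a convenient check on the array bookkeeping.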
35/76 The Notorious WHO Results: 37

36/76 August 12, 2012: 37
No, it doesn’t.

37/76 Huffington Post, April 17, 2014

38/76 we are #37

39/76 Greene, W., Distinguishing Between Heterogeneity and Inefficiency: Stochastic Frontier Analysis of the World Health Organization’s Panel Data on National Health Care Systems, Health Economics, 13, 2004, pp. 959-980.

40/76
x = 1, log Exp, log Ed, log² Ed
z = log PopDen, log PerCapitaGDP, GovtEff, VoxPopuli, OECD, GINI

41/76 Three Extensions of the True Random Effects Model

42/76 Generalized True Random Effects Model
Generalized True Random Effects Stochastic Frontier Model
$$y_{it} = \alpha + A_i - B_i + \beta' x_{it} + v_{it} - u_{it}$$
Transient random components $v_{it} - u_{it}$: time varying normal - half normal SF
Persistent random components $A_i - B_i$: time fixed normal - half normal SF

43/76 A Stochastic Frontier Model with Short-Run and Long-Run Inefficiency: Colombi, R., Kumbhakar, S., Martini, G., Vittadini, G., University of Bergamo, WP, 2011, JPA 2014, forthcoming.
Tsionas, G. and Kumbhakar, S., Firm Heterogeneity, Persistent and Transient Technical Inefficiency: A Generalized True Random Effects Model, Journal of Applied Econometrics. Published online, November, 2012.
Extremely involved Bayesian MCMC procedure. Efficiency components estimated by data augmentation.

44/76 Generalized True Random Effects Stochastic Frontier Model
$$y_{it} = (\alpha + \sigma_w w_i - \theta |e_i|) + \beta' x_{it} + v_{it} - u_{it}$$
Time varying, transient random components: $v_{it} \sim N[0, \sigma_v^2]$, $u_{it} = |U_{it}|$ and $U_{it} \sim N[0, \sigma_u^2]$
Time invariant random components: $w_i \sim N[0,1]$, $e_i \sim N[0,1]$
The random constant term in this model has a closed skew normal distribution, instead of the usual normal distribution.

45/76 Estimating Efficiency in the CSN Model
Moment Generating Function for the Multivariate CSN Distribution
$$E[\exp(t' u_i) \mid y_i] = \frac{\Phi_{T+1}(R r_i + \Lambda t, \Lambda)}{\Phi_{T+1}(R r_i, \Lambda)} \exp\left( t' R r_i + \tfrac{1}{2} t' \Lambda t \right)$$
$\Phi_{T+1}(\dots, \Lambda)$ = multivariate normal cdf. Parts defined in Colombi et al. Computed using GHK simulator.
$u_i = (e_i, u_{i1}, \dots, u_{iT})'$, and $t$ is in turn each of the unit vectors $(1, 0, \dots, 0)'$, $(0, 1, \dots, 0)'$, ..., $(0, 0, \dots, 1)'$.

46/76 Estimating the GTRE Model

47/76 Colombi et al. Classical Maximum Likelihood Estimator
$$\log L = \sum_{i=1}^{N} \left[ \log\phi_T(y_i - X_i\beta - 1_T\alpha,\; \Sigma + AVA') + \log\Phi_q\big(R(y_i - X_i\beta - 1_T\alpha),\; \Lambda\big) \right] + nq \log 2$$
$\phi_T(\dots)$ = T-variate normal pdf.
$\Phi_q(\dots, \Lambda)$ = (T+1)-variate multivariate normal integral.
Very time consuming and complicated.
“From the sampling theory perspective, the application of the model is computationally prohibitive when T is large. This is because the likelihood function depends on a (T+1)-dimensional integral of the normal distribution.” [Tsionas and Kumbhakar (2012, p. 6)]

48/76 Kumbhakar, Lien, Hardaker, Technical Efficiency in Competing Panel Data Models: A Study of Norwegian Grain Farming, JPA, published online, September, 2012.
Three steps based on GLS:
(1) RE/FGLS to estimate (α, β)
(2) Decompose time varying residuals using MoM and SF.
(3) Decompose estimates of time invariant residuals.
49/76 Maximum Simulated Full Information log likelihood function for the "generalized true random effects stochastic frontier model"
$$\log L_S(\alpha, \beta, \lambda, \sigma, \sigma_w, \theta) = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \frac{2}{\sigma}\, \phi\!\left(\frac{y_{it} - (\alpha + \sigma_w w_{ir} - \theta |U_{ir}|) - \beta' x_{it}}{\sigma}\right) \Phi\!\left(\frac{-\lambda \left(y_{it} - (\alpha + \sigma_w w_{ir} - \theta |U_{ir}|) - \beta' x_{it}\right)}{\sigma}\right)$$
$w_{ir}$ = draws from N[0,1]; $|U_{ir}|$ = absolute values of draws from N[0,1].

50/76 WHO Results: 2014
x = 1, log Exp, log Ed, log² Ed
z = log PopDen, log PerCapitaGDP, GovtEff, VoxPopuli, OECD, GINI
$\varepsilon_{it} = A_i - B_i + v_{it} - u_{it}$

51/76

52/76 Empirical application
Cost Efficiency of Swiss Railway Companies

53/76 Model Specification
TC = f(Y1, Y2, PL, PC, PE, N, NS, dt)
TC: Total costs
Y1: Passenger-km
Y2: Ton-km
PL: Price of labor (wage per FTE)
PC: Price of capital (capital costs / total number of seats)
PE: Price of electricity
N: Network length
NS: Number of stations
dt: Time dummies

54/76 Data
- 50 railway companies
- Period 1985 to 1997
- Unbalanced panel with number of periods (Ti) varying from 1 to 13 and with 45 companies with 12 or 13 years, resulting in 605 observations
- Data source: Swiss federal transport office
- Data set available at http://people.stern.nyu.edu/wgreene/
- Data set used in: Farsi, Filippini, Greene (2005), Efficiency and measurement in network industries: application to the Swiss railway companies, Journal of Regulatory Economics

55/76

56/76

57/76 Cost Efficiency Estimates

58/76 Correlations

59/76 MSL Estimation

60/76 Why is the MSL method so computationally efficient compared to classical FIML and Bayesian MCMC for this model?
- Conditioned on the persistent effects, the group observations are independent.
- The joint conditional distribution is simple and easy to compute, in closed form.
- The full likelihood is obtained by integrating over only one dimension. (This was discovered by Butler and Moffitt in 1982.)
- Neither of the other methods takes advantage of this result.
  Both integrate over T+1 dimensions.

61/76

62/76 Equivalent Log Likelihood - Identical Outcome
One Dimensional Integration over δi
T+1 Dimensional Integration over Rei.

63/76 Simulated [over (w,h)] Log Likelihood
$$\log L_S = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} G_i(\delta_{ir} \mid \alpha, \beta, \sigma, \lambda, \sigma_w, \sigma_h)$$
Very Fast: with T=13, one minute or so

64/76 Also Simulated Log Likelihood
GHK simulator is used to approximate the T+1 variate normal integrals.
Very Slow: huge amount of unnecessary computation.

65/76 Computation of the GTRE Model is Actually Fast and Easy
247 farms, 6 years. 100 Halton draws. Computation time: 35 seconds including computing efficiencies.

66/76 Simulation Variance

67/76 Does the simulation chatter degrade the econometric efficiency of the MSL estimator?
- Hajivassiliou, V., “Some practical issues in maximum simulated likelihood,” Simulation-based Inference in Econometrics: Methods and Applications, Mariano, R., Weeks, M. and Schuermann, T., Cambridge University Press, 2008
- Speculated that Asy.Var[estimator] = V + (1/R)C
- The contribution of the chatter would be of second or third order. R is typically in the hundreds or thousands.
- No other evidence on this subject.

68/76 An Experiment
Pooled Spanish Dairy Farms Data
- Stochastic frontier using FIML.
- Random constant term linear regression with constant term equal to $\alpha - \sigma_u |w|$, $w \sim N[0,1]$. This is equivalent to the stochastic frontier model.
- Maximum simulated likelihood
- 500 random draws for the simulation for the base case. Uses Mersenne Twister for the RNG.
- 50 repetitions of estimation based on 500 random draws to suggest variation due to simulation chatter.

69/76 $\hat\sigma_v = 0.10371$, $\hat\sigma_u = 0.15573$

70/76 Simulation Noise in Standard Errors of Coefficients
Chatter: .00543 .00590 .00042 .00119

71/76 Quasi-Monte Carlo Integration Based on Halton Sequences
Coverage of the unit interval is the objective, not randomness of the set of draws.
Halton sequences --- Markov chain
p = a prime number, r = the sequence of integers, decomposed as
$$r = \sum_{i=0}^{I} b_i p^i, \quad H(r \mid p) = \sum_{i=0}^{I} b_i p^{-i-1}, \quad r = r_1, \dots \text{ (e.g., 10, 11, 12, ...)}$$
For example, using base p = 5, the integer r = 37 has b0 = 2, b1 = 2, and b2 = 1 (37 = 1×5² + 2×5¹ + 2×5⁰). Then H(37|5) = 2×5⁻¹ + 2×5⁻² + 1×5⁻³ = 0.488.

72/76 Is It Really Simulation?
- Halton or Sobol sequences are not random
- Far more stable than random draws, by a factor of about 10.
- There is no simulation chatter
- View the same as numerical quadrature
- There may be some approximation error. How would we know?

73/76 Halton Sequences

74/76 Haltonized Log Likelihood
$$\text{Log}L(\alpha, \beta, \lambda, \sigma, \sigma_w, \sigma_h) = \sum_{i=1}^{N} \log \int_{\delta_i} \prod_{t=1}^{T} \frac{2}{\sigma}\, \phi\!\left(\frac{y_{it} - \alpha - \beta' x_{it} - \delta_i}{\sigma}\right) \Phi\!\left(\frac{-\lambda (y_{it} - \alpha - \beta' x_{it} - \delta_i)}{\sigma}\right) f(\delta_i)\, d\delta_i$$
$$\text{Log}L_S(\alpha, \beta, \lambda, \sigma, \sigma_w, \sigma_h) = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \frac{2}{\sigma}\, \phi\!\left(\frac{y_{it} - \alpha - \beta' x_{it} - \delta_{ir}}{\sigma}\right) \Phi\!\left(\frac{-\lambda (y_{it} - \alpha - \beta' x_{it} - \delta_{ir})}{\sigma}\right)$$
$$\delta_{ir} = \sigma_w W_{ir} - \sigma_h |H_{ir}|$$
$$W_{ir} = \Phi^{-1}(\text{Halton}[\text{prime}(w), r + \text{burn in}]), \quad H_{ir} = \Phi^{-1}(\text{Halton}[\text{prime}(h), r + \text{burn in}])$$

75/76 Summary
- The skew normal distribution
- Two useful models for panel data (and one potentially useful model pending development)
  - Extension of TRE model that allows both transient and persistent random variation and inefficiency
  - Sample selection corrected stochastic frontier
  - Spatial autocorrelation stochastic frontier model
- Methods: Maximum simulated likelihood as an alternative to received brute force methods
  - Simpler
  - Faster
  - Accurate
  - Simulation “chatter” is a red herring: use Halton sequences

76/76
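The Halton construction above is short enough to write out. The sketch below uses only the Python standard library; the function names and the default burn-in of 10 are illustrative assumptions (the slides give "10, 11, 12, ..." only as an example starting point).

```python
from statistics import NormalDist

def halton(r, p):
    """Radical inverse of the integer r in prime base p: sum of b_i * p**(-i-1)."""
    h, weight = 0.0, 1.0 / p
    while r > 0:
        r, b = divmod(r, p)   # b is the next base-p digit b_i
        h += b * weight
        weight /= p
    return h

def halton_normal_draws(n, p, burn_in=10):
    """Map n Halton points (after a burn-in) to N[0,1] draws via the inverse cdf."""
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf(halton(r, p)) for r in range(burn_in + 1, burn_in + n + 1)]

print(round(halton(37, 5), 3))  # 0.488, the worked example from the slide
```

Using a different prime for each simulated dimension (for example one for W and another for H in the Haltonized log likelihood) keeps the two sets of draws from moving in lockstep.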
Sample Selection

77/76 TECHNICAL EFFICIENCY ANALYSIS CORRECTING FOR BIASES FROM OBSERVED AND UNOBSERVED VARIABLES: AN APPLICATION TO A NATURAL RESOURCE MANAGEMENT PROJECT
Empirical Economics: Volume 43, Issue 1 (2012), Pages 55-72
Boris Bravo-Ureta, University of Connecticut
Daniel Solis, University of Miami
William Greene, New York University

78/76 The MARENA Program in Honduras
- Several programs have been implemented to address resource degradation while also seeking to improve productivity and managerial performance and to reduce poverty (and in some cases make up for lack of public support).
- One such effort is the Programa Multifase de Manejo de Recursos Naturales en Cuencas Prioritarias, or MARENA, in Honduras, focusing on small scale hillside farmers.

79/76 Expected Impact Evaluation

80/76 Methods
- A matched group of beneficiaries and control farmers is determined using Propensity Score Matching techniques to mitigate biases that would stem from selection on observed variables.
- In addition, we deal with possible self-selection on unobserved variables using a selectivity correction model for stochastic frontiers introduced by Greene (2010).

81/76 A Sample Selected SF Model
$d_i = 1[\gamma' z_i + h_i > 0]$, $h_i \sim N[0, 1^2]$
$y_i = \alpha + \beta' x_i + \varepsilon_i$, $\varepsilon_i \sim N[0, \sigma_\varepsilon^2]$
$(y_i, x_i)$ observed only when $d_i = 1$.
$\varepsilon_i = v_i - u_i$
$u_i = \sigma_u |U_i|$ where $U_i \sim N[0, 1^2]$
$v_i = \sigma_v V_i$ where $V_i \sim N[0, 1^2]$
$(h_i, v_i) \sim N_2[(0,1), (1, \rho\sigma_v, \sigma_v^2)]$

82/76 Simulated logL for the Standard SF Model
$$f(y_i \mid x_i, |U_i|) = \frac{\exp\left[-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_i|)^2 / \sigma_v^2\right]}{\sigma_v \sqrt{2\pi}}$$
$$f(y_i \mid x_i) = \int_{|U_i|} \frac{\exp\left[-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_i|)^2 / \sigma_v^2\right]}{\sigma_v \sqrt{2\pi}}\, p(|U_i|)\, d|U_i|$$
$$p(|U_i|) = \frac{2 \exp\left[-\tfrac{1}{2} |U_i|^2\right]}{\sqrt{2\pi}}, \quad |U_i| \ge 0.$$
(Half normal)
$$f(y_i \mid x_i) \approx \frac{1}{R} \sum_{r=1}^{R} \frac{\exp\left[-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|)^2 / \sigma_v^2\right]}{\sigma_v \sqrt{2\pi}}$$
$$\log L_S(\alpha, \beta, \sigma_u, \sigma_v) = \sum_{i=1}^{N} \log \left\{ \frac{1}{R} \sum_{r=1}^{R} \frac{\exp\left[-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|)^2 / \sigma_v^2\right]}{\sigma_v \sqrt{2\pi}} \right\}$$
This is simply a linear regression with a random constant term, $\alpha_i = \alpha - \sigma_u |U_i|$.

83/76 Likelihood For a Sample Selected SF Model
$$f(y_i \mid x_i, d_i, z_i, |U_i|) = d_i \left[ \frac{\exp\left(-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_i|)^2 / \sigma_v^2\right)}{\sigma_v \sqrt{2\pi}}\, \Phi\!\left( \frac{\rho (y_i - \alpha - \beta' x_i + \sigma_u |U_i|) / \sigma_v + \gamma' z_i}{\sqrt{1 - \rho^2}} \right) \right] + (1 - d_i)\, \Phi(-\gamma' z_i)$$
$$f(y_i \mid x_i, d_i, z_i) = \int_{|U_i|} f(y_i \mid x_i, d_i, z_i, |U_i|)\, f(|U_i|)\, d|U_i|$$

84/76 Simulated Log Likelihood for a Selectivity Corrected Stochastic Frontier Model
The simulation is over the inefficiency term.
$$\log L_S(\alpha, \beta, \sigma_u, \sigma_v, \gamma, \rho) = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \left\{ d_i \left[ \frac{\exp\left(-\tfrac{1}{2} (y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|)^2 / \sigma_v^2\right)}{\sigma_v \sqrt{2\pi}}\, \Phi\!\left( \frac{\rho (y_i - \alpha - \beta' x_i + \sigma_u |U_{ir}|) / \sigma_v + \gamma' z_i}{\sqrt{1 - \rho^2}} \right) \right] + (1 - d_i)\, \Phi(-\gamma' z_i) \right\}$$

85/76 JLMS Estimator of ui
$$\hat f_{ir} = \frac{\exp\left(-\tfrac{1}{2} (y_i - \hat\alpha - \hat\beta' x_i + \hat\sigma_u |U_{ir}|)^2 / \hat\sigma_v^2\right)}{\hat\sigma_v \sqrt{2\pi}}\, \Phi\!\left( \frac{\hat\rho (y_i - \hat\alpha - \hat\beta' x_i + \hat\sigma_u |U_{ir}|) / \hat\sigma_v + a_i}{\sqrt{1 - \hat\rho^2}} \right)$$
$$\hat A_i = \frac{1}{R} \sum_{r=1}^{R} (\hat\sigma_u |U_{ir}|)\, \hat f_{ir}, \quad \hat B_i = \frac{1}{R} \sum_{r=1}^{R} \hat f_{ir}$$
$$\hat u_i = \text{Estimator of } E[u_i \mid \varepsilon_i] = \frac{\hat A_i}{\hat B_i} = \sum_{r=1}^{R} \hat g_{ir}\, \hat\sigma_u |U_{ir}|, \quad \text{where } \hat g_{ir} = \frac{\hat f_{ir}}{\sum_{r=1}^{R} \hat f_{ir}}, \quad \sum_{r=1}^{R} \hat g_{ir} = 1$$

86/76 Closed Form for the Selection Model
- The selection model can be estimated without simulation
- “The stochastic frontier model with correction for sample selection revisited.” Lai, Hung-pin. Forthcoming, JPA
- Based on closed skew normal distribution
- Similar to Maddala’s 1982 result for the linear selection model. See slide 42.
- Not more computationally efficient.
- Statistical properties identical.
- Suggested possibility that simulation chatter is an element of inefficiency in the maximum simulated likelihood estimator.

87/76 Closed Form vs. Simulation
Spanish Dairy Farms: Selection based on being farm #1-125. 6 periods
The theory works.

88/76 Variables Used in the Analysis
Production
Participation

89/76 Findings from the First Wave

90/76 A Panel Data Model
- Selection takes place only at the baseline.
- There is no attrition.
Sample selector: $d_{i0} = 1[\gamma' z_{i0} + h_{i0} > 0]$
Stochastic frontier: $y_{it} = \alpha + w_i + \beta' x_{it} + v_{it} - u_{it}, \; t = 0, 1, \dots$
Selection effect is exerted on $w_i$; $\text{Corr}(h_{i0}, w_i) = \rho$
$$P(y_{it}, d_{i0}) = P(d_{i0})\, P(y_{it} \mid d_{i0})$$
Conditioned on the selection ($h_{i0}$), observations are independent:
$$P(y_{i0}, y_{i1}, \dots, y_{iT} \mid d_{i0}) = \prod_{t=0}^{T} P(y_{it} \mid d_{i0})$$
I.e., the selection is acting like a permanent random effect.
$$P(y_{i0}, y_{i1}, \dots, y_{iT}, d_{i0}) = P(d_{i0}) \prod_{t=0}^{T} P(y_{it} \mid d_{i0})$$

91/76 Simulated Log Likelihood
$$\log L_{S,C}(\alpha, \beta, \sigma_u, \sigma_v, \rho) = \sum_{d_i = 1} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=0}^{T} \left[ \frac{\exp\left(-\tfrac{1}{2} (y_{it} - \alpha - \beta' x_{it} + \sigma_u |U_{itr}|)^2 / \sigma_v^2\right)}{\sigma_v \sqrt{2\pi}}\, \Phi\!\left( \frac{\rho (y_{it} - \alpha - \beta' x_{it} + \sigma_u |U_{itr}|) / \sigma_v + a_{i0}}{\sqrt{1 - \rho^2}} \right) \right]$$

92/76 Main Empirical Conclusions from Waves 0 and 1
- Benefit group is more efficient in both years
- The gap is wider in the second year
- Both means increase from year 0 to year 1
- Both variances decline from year 0 to year 1

93/76

94/76 Spatial Autocorrelation

95/76 True Random Spatial Effects
- Spatial Stochastic Frontier Models: Accounting for Unobserved Local Determinants of Inefficiency: A.M.Schmidt, A.R.B.Morris, S.M.Helfand, T.C.O.Fonseca, Journal of Productivity Analysis, 31, 2009, pp. 101-112
- Simply redefines the random effect to be a ‘region effect.’ Just a reinterpretation of the ‘group.’ No spatial decay with distance.
- True REM does not “perform” as well as several other specifications. (“Performance” has nothing to do with the frontier model.)

96/76
