Supplemental Material

The data used in our analysis form a panel covering a number of periods (T) and individuals (N, the states). Such data may not be stationary, as found in McCarl, Villavicencio and Wu (2008). We therefore first test for unit roots (see Section 1 below) and find evidence of nonstationarity (see the test results in Table S1), which must be addressed. The concept of co-integration provides a powerful tool for coping with nonstationarity in regressions (Hamilton, 1994). Co-integration refers to the phenomenon that the error term of a regression over nonstationary variables can itself be stationary, which allows meaningful causal relationships to be inferred from regressions over nonstationary time series. We therefore next test for co-integration. In particular, we test whether a linear combination of the variables in our models is stationary, in which case the variables are co-integrated and an error correction model is needed (see Section 2, Engle and Granger (1987), and Greene (2003) for details). Our tests support co-integration within our proposed model, indicating that we can do statistical inference over the regression results and that we need to use an error correction model.

We then use an error correction model to fit the proposed model. An error correction model is a dynamical system in which the deviation of the current state from its long-run relationship is fed into its short-run dynamics (Engle and Granger, 1987). This family of models is useful for estimating both short-term and long-term effects of one time series on another, especially when dealing with integrated data.

Lastly, we examined the sensitivity of our results using alternative specifications.
Since one focus of this paper is to examine the impact of research investment on agricultural productivity, we re-estimate our models with a simplified parameterization of research funding that excludes the funding shares. We also experimented with an alternative specification of regions, dividing the Mountain region into Mountain-North and Mountain-South and the Pacific region into Pacific-North and Pacific-South. The results reported in the text are quantitatively similar to these alternative estimates.

1 Panel Unit Root and Cointegration Tests

Unit root tests could be applied to the data for each individual state. However, we assume that the data for all states exhibit similar structural and inter-temporal relationships, and thus conduct the tests taking the panel structure of our data into account explicitly. We use three unit root tests:

Levin, Lin and Chu (LLC) Test
The Levin, Lin and Chu (2002) test examines the null hypothesis that each time series in the panel contains a unit root against the alternative that the time series are stationary. The test is similar to an Augmented Dickey-Fuller (ADF) test but is applied in a panel framework. It assumes independence across cross sections, which does not necessarily hold, and that either all cross sections have a unit root or none do, which is very restrictive.

Im, Pesaran and Shin (IPS) Test
Im, Pesaran and Shin (2003) propose an alternative testing procedure that averages the individual unit root test statistics. The null hypothesis is that each series in the panel has a unit root, $H_0: \rho_i = 0$ for all $i$, where $\rho_i$ is the coefficient on the lagged level in the ADF regression for cross section $i$. The alternative hypothesis states that some individual series have unit roots while others are stationary.

Breitung Test
Breitung (2000) suggests a test statistic with greater power. The test involves performing a pooled regression on the error terms and then testing $H_0: \rho = 0$ using the t-statistic.

The results from the application of the unit root tests are presented in Table S1.
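The averaging idea behind the IPS test can be illustrated with a minimal sketch. It is not the procedure used to produce Table S1. It assumes a Dickey-Fuller regression with an intercept and no lag augmentation, works on simulated data, and stops at the average t-statistic (the "t-bar"). The standardized W-statistic additionally requires the tabulated moments in Im, Pesaran and Shin (2003), which are not reproduced here. The function names (`df_t_stat`, `t_bar`, `simulate_ar1`) and all parameter values are hypothetical.

```python
import math
import random


def df_t_stat(y):
    """t-statistic on rho in the Dickey-Fuller regression
    dy_t = alpha + rho * y_{t-1} + e_t (assumption: no lag augmentation)."""
    lagged = y[:-1]
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(dy)
    mean_x = sum(lagged) / n
    mean_d = sum(dy) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, dy))
    rho = sxy / sxx
    alpha = mean_d - rho * mean_x
    rss = sum((d - alpha - rho * x) ** 2 for x, d in zip(lagged, dy))
    se_rho = math.sqrt(rss / (n - 2) / sxx)
    return rho / se_rho


def t_bar(panel):
    """Average of the individual t-statistics (the IPS averaging step only)."""
    stats = [df_t_stat(series) for series in panel]
    return sum(stats) / len(stats)


def simulate_ar1(n_periods, coef, rng):
    """y_t = coef * y_{t-1} + e_t; coef = 1 gives a random walk (unit root)."""
    y = [0.0]
    for _ in range(n_periods):
        y.append(coef * y[-1] + rng.gauss(0.0, 1.0))
    return y


# Hypothetical panels: 20 cross sections, 100 periods each.
rng = random.Random(0)
unit_root_panel = [simulate_ar1(100, 1.0, rng) for _ in range(20)]
stationary_panel = [simulate_ar1(100, 0.5, rng) for _ in range(20)]

print("t-bar, unit-root panel: ", round(t_bar(unit_root_panel), 2))
print("t-bar, stationary panel:", round(t_bar(stationary_panel), 2))
```

In this sketch the stationary panel should give a much more negative t-bar than the unit-root panel, which is the direction in which the IPS test rejects its null.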
The results in Table S1 indicate that several series are nonstationary: for these series the unit-root null hypothesis cannot be rejected.

2 Addressing Panel Cointegration

Next we provide the technical details of the panel cointegration tests used. Throughout this section, the regressor $X$ includes all variables specified in formula (5) in the text; for instance, it includes temperature and its square.

Following Greene (2003), consider a simplified model in which two nonstationary variables $y_t$ and $z_t$ are cointegrated, with cointegrating vector $[1, -\theta]$. Then $\Delta y_t$, $\Delta z_t$, and $(y_t - \theta z_t)$ are stationary, where $\Delta$ represents the first difference. Therefore, the error correction model (ECM)

$$\Delta y_t = \gamma \Delta z_t + \lambda (y_{t-1} - \theta z_{t-1}) + \varepsilon_t \tag{1}$$

describes the variation in $y_t$ around its long-run trend in terms of the variation in $z_t$ around its long-run trend and the error correction $(y_t - \theta z_t)$, which is the equilibrium error in the model of cointegration. This model is stable because the variables involved are stationary. The same assumption that we make to establish cointegration implies (and is implied by) the existence of an ECM, as shown by the Granger representation theorem (see Hamilton, 1994).

In the more general framework of a multivariate and heterogeneous panel model, the error correction equation can be expressed as

$$\Delta y_{it} = \phi_i (y_{i,t-1} - \theta_i' X_{it}) + \sum_{j=1}^{p-1} \lambda_{ij} \Delta y_{i,t-j} + \sum_{j=0}^{q-1} \delta_{ij}' \Delta X_{i,t-j} + \mu_i + \varepsilon_{it} \tag{2}$$

where the parameter $\phi_i$ is the error-correcting speed of adjustment term. It is expected that $\phi_i < 0$, in which case there is evidence of cointegration. The vector $\theta_i$ captures the long-run relationship between the variables, and the other estimated parameters $(\lambda_{ij}, \delta_{ij})$ characterize the short-run dynamics of the variables involved.

Pesaran, Shin and Smith (1999) proposed a pooled-mean-group (PMG) estimator that combines both pooling and averaging: the estimator allows the intercept, short-run coefficients, and error variances to differ across individuals but constrains the long-run coefficients to be equal across individuals.
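The mechanics of the single-equation ECM in (1) can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses the Engle-Granger two-step approach (a levels regression followed by a regression in differences on the lagged residual), not the PMG maximum likelihood estimator, and it works on simulated data with hypothetical parameter values ($\gamma = 0.5$, $\lambda = -0.3$, $\theta = 2$). The function names `simulate_ecm` and `two_step_ecm` are likewise hypothetical.

```python
import random


def simulate_ecm(n_periods, gamma, lam, theta, seed=0):
    """Simulate equation (1): z_t is a random walk and
    dy_t = gamma * dz_t + lam * (y_{t-1} - theta * z_{t-1}) + eps_t."""
    rng = random.Random(seed)
    y, z = [0.0], [0.0]
    for _ in range(n_periods):
        dz = rng.gauss(0.0, 1.0)
        eps = rng.gauss(0.0, 1.0)
        dy = gamma * dz + lam * (y[-1] - theta * z[-1]) + eps
        z.append(z[-1] + dz)
        y.append(y[-1] + dy)
    return y, z


def two_step_ecm(y, z):
    """Engle-Granger two-step estimates (theta_hat, gamma_hat, lam_hat)."""
    n = len(y)
    # Step 1: long-run relation, OLS of y_t on an intercept and z_t.
    mean_y, mean_z = sum(y) / n, sum(z) / n
    szz = sum((zi - mean_z) ** 2 for zi in z)
    szy = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
    theta_hat = szy / szz
    intercept = mean_y - theta_hat * mean_z
    u = [yi - intercept - theta_hat * zi for yi, zi in zip(y, z)]

    # Step 2: short-run dynamics, OLS of dy_t on dz_t and u_{t-1}
    # (no intercept), solved from the 2x2 normal equations.
    dy = [y[t] - y[t - 1] for t in range(1, n)]
    dz = [z[t] - z[t - 1] for t in range(1, n)]
    u_lag = u[:-1]
    s11 = sum(a * a for a in dz)
    s12 = sum(a * b for a, b in zip(dz, u_lag))
    s22 = sum(b * b for b in u_lag)
    r1 = sum(a * d for a, d in zip(dz, dy))
    r2 = sum(b * d for b, d in zip(u_lag, dy))
    det = s11 * s22 - s12 * s12
    gamma_hat = (r1 * s22 - r2 * s12) / det
    lam_hat = (s11 * r2 - s12 * r1) / det
    return theta_hat, gamma_hat, lam_hat


y, z = simulate_ecm(5000, gamma=0.5, lam=-0.3, theta=2.0, seed=1)
theta_hat, gamma_hat, lam_hat = two_step_ecm(y, z)
print("theta_hat:", round(theta_hat, 3))
print("gamma_hat:", round(gamma_hat, 3))
print("lam_hat:  ", round(lam_hat, 3))
```

A negative estimate of the adjustment coefficient means that deviations from the long-run relation are corrected over time, which is the single-equation analogue of the condition $\phi_i < 0$ in (2).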
Since model (2) is nonlinear in the parameters, Pesaran, Shin and Smith (1999) developed a maximum likelihood method to estimate them. The log likelihood function takes the form

$$l_T(\theta, \phi, \sigma) = -\frac{T}{2} \sum_{i=1}^{N} \ln(2\pi\sigma_i^2) - \frac{1}{2} \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \{\Delta y_i - \phi_i \xi_i(\theta)\}' H_i \{\Delta y_i - \phi_i \xi_i(\theta)\} \tag{3}$$

where $\xi_i(\theta) = y_{i,t-1} - X_i \theta_i$, $H_i = I_T - W_i (W_i' W_i)^{-1} W_i'$, $I_T$ is an identity matrix of order $T$, and $W_i = (\Delta y_{i,t-1}, \ldots, \Delta y_{i,t-p+1}, \Delta X_i, \Delta X_{i,t-1}, \ldots, \Delta X_{i,t-q+1})$. The estimators can be computed using the usual Newton-Raphson algorithm, which needs first and second derivatives of the likelihood function, or an iterative "back substitution" algorithm, which requires only first derivatives. More details are given in Pesaran, Shin and Smith (1999).

Three different cointegration tests were used: the Kao, Pedroni and Westerlund tests, as described below.

Kao Tests
This is a residual-based test of the Dickey-Fuller (DF) kind. It is based on testing whether the residuals of the panel estimation are stationary. Kao (1999) proposed DF and ADF unit root tests on the residuals $e_{it}$ as tests of the null of no cointegration. The DF test is applied to the fixed effect residuals.

Pedroni Tests
Pedroni (1999) proposed several tests and critical values for the null hypothesis of no cointegration in panels. The tests allow not only the dynamics and fixed effects but also the cointegrating vector to differ across members of the panel. They are applied to the regression residuals from the hypothesized cointegrating regression.

Westerlund Tests
Westerlund (2007) proposes four panel tests of the null hypothesis of no cointegration that are based on structural rather than residual dynamics. This structural kind of test does not impose any common factor restriction,¹ which is a main reason for the loss of power of residual-based cointegration tests.
However, the Westerlund tests are more restrictive than Pedroni's residual-based tests in the sense that the former do not allow endogenous regressors in the model.

Test results
The results of our co-integration tests are given in Table S2 below. Most of the test statistics reject the null hypothesis of no co-integration, lending support to our use of panel error correction models in our estimations.

¹ Kremers, Ericsson and Dolado (1992) define the common factor restriction as the requirement, implicit in residual-based tests, that the long-run cointegrating vector for the variables in levels equal the short-run adjustment process for the variables in differences.

3 Elasticities for research investments

The model is expressed in double-logarithmic form, so that the estimated coefficient $\beta_i$ represents the elasticity of TFP with respect to the variables of interest (RPUBSPILL, EXT, RPRI). The funding shares (SFF and GR) are multiplied by the public agricultural research capital (RPUB), so that the elasticity of TFP with respect to RPUB depends on the funding composition:

$$\partial \ln(TFP) / \partial \ln(RPUB) = \beta_2 + \beta_3 SFF + \beta_4 (SFF)^2 + \beta_5 GR + \beta_6 (GR)^2. \tag{4}$$

Similarly, the effect on TFP of a one percent change in SFF (or GR) is not constant and can include nonlinear impacts of funding composition:

$$\partial \ln(TFP) / \partial \ln(SFF) = (\beta_3 + 2\beta_4 SFF) \ln RPUB \tag{5}$$

$$\partial \ln(TFP) / \partial \ln(GR) = (\beta_5 + 2\beta_6 GR) \ln RPUB. \tag{6}$$

4 Alternative specifications

To investigate the stability of our results, we estimated the model using alternative constructions of the research funding variables and of the regions. Specifically, we remove the funding and grant shares and further divide the Mountain and Pacific regions into a north sub-region and a south sub-region. The estimation results are reported in Table S3. The overall results are similar to those reported in the text.

References
Engle, R.F. and C.W.J.
Granger, “Co-Integration and Error Correction: Representation, Estimation, and Testing”, Econometrica, Vol. 55, No. 2 (Mar., 1987), pp. 251-276.

Table S1. Panel Unit Root Test: Summary
Sample: 1970 1999
Cross Sections: 48
Levin, Lin & Chu and Breitung tests: Null: Unit root (assumes common unit root process)
Im, Pesaran and Shin test: Null: Unit root (assumes individual unit root process)

                              Individual effects           Individual effects & linear trends
                              Level                        Level
Test                          Statistic  P-value   Obs.    Statistic  P-value   Obs.
LTFP
  Levin, Lin & Chu t*              1.34    0.909   1329       -11.87    0.000   1367
  Breitung t-stat                  2.70    0.997   1281        -1.28    0.100   1319
  Im, Pesaran and Shin W-stat      7.52    1.000   1329       -12.62    0.000   1367
LRPUB
  Levin, Lin & Chu t*             -8.34    0.000   1257         0.94    0.827   1265
  Breitung t-stat                  1.57    0.941   1209        -8.01    0.000   1217
  Im, Pesaran and Shin W-stat      0.10    0.542   1257        -7.39    0.000   1265
LRPUB SF
  Levin, Lin & Chu t*             -1.40    0.080   1353        -3.94    0.000   1350
  Breitung t-stat                 -1.06    0.145   1305        -3.86    0.000   1302
  Im, Pesaran and Shin W-stat     -2.45    0.007   1353        -5.43    0.000   1350
LRPUB SF2
  Levin, Lin & Chu t*             -1.45    0.073   1354        -4.57    0.000   1356
  Breitung t-stat                 -1.58    0.057   1306        -3.73    0.000   1308
  Im, Pesaran and Shin W-stat     -2.85    0.002   1354        -5.89    0.000   1356
LRPUB GR
  Levin, Lin & Chu t*             -0.63    0.265   1371        -2.38    0.009   1361
  Breitung t-stat                 -2.25    0.012   1323         1.50    0.933   1313
  Im, Pesaran and Shin W-stat     -0.45    0.326   1371        -2.38    0.009   1361
LRPUB GR2
  Levin, Lin & Chu t*             -1.13    0.130   1357        -2.60    0.005   1352
  Breitung t-stat                 -1.97    0.025   1309         0.41    0.658   1304
  Im, Pesaran and Shin W-stat      0.78    0.783   1357        -1.99    0.024   1352
LEXT
  Levin, Lin & Chu t*             -8.57    0.000   1369        -7.52    0.000   1365
  Breitung t-stat                 -2.17    0.015   1321        -0.55    0.292   1317
  Im, Pesaran and Shin W-stat     -4.62    0.000   1369        -8.37    0.000   1365
LRPUBSPILL
  Levin, Lin & Chu t*             -6.87    0.000   1288        11.88    1.000   1281
  Breitung t-stat                  3.96    1.000   1240       -10.45    0.000   1233
  Im, Pesaran and Shin W-stat      3.03    0.999   1288         0.26    0.601   1281

Table S1 continued
Columns: Individual effects (Level: Statistic, P-value, Obs.); Individual effects & linear trends (Level: Statistic, P-value, Obs.)
0.000 0.000 0.000 1338 1290 1338 -24.92 0.45 -25.92 0.000 0.675 0.000 1344 1296 1344
LTEMP: Levin, Lin & Chu t* -24.45; Breitung t-stat -22.90; Im, Pesaran and Shin W-stat -21.21
0.000 0.000 0.000 1373 1325 1373 -23.78 2.65 -20.56 0.000 0.996 0.000 1356 1308 1356
LPREC: Levin, Lin & Chu t* -30.46; Breitung t-stat -18.68; Im, Pesaran and Shin W-stat -28.49
0.000 0.000 0.000 1372 1324 1372 -26.37 -3.56 -24.58 0.000 0.000 0.000 1366 1318 1366
LINTENS: Levin, Lin & Chu t* -28.00; Breitung t-stat -19.79; Im, Pesaran and Shin W-stat -28.65
0.000 0.000 0.000 1385 1337 1385 -24.51 -7.43 -26.71 0.000 0.000 0.000 1377 1329 1377
LPRI: Levin, Lin & Chu t* -27.50; Breitung t-stat -26.01; Im, Pesaran and Shin W-stat -26.05
** Probabilities for Fisher tests are computed using an asymptotic Chi-square distribution. All other tests assume asymptotic normality.

Table S2. Cointegration Test: Summary
Sample: 1970 1999
Cross Sections: 48

Pedroni cointegration tests
                    Constant               Constant & Trend
Test                Statistic  P-value     Statistic  P-value
panel v-stat            -0.82    0.205         -3.76    0.000
panel rho-stat          -4.60    0.000         -2.45    0.007
panel pp-stat          -20.10    0.000        -23.80    0.000
panel adf-stat          -9.88    0.000         -9.69    0.000
group rho-stat          -2.22    0.013         -0.03    0.489
group pp-stat          -22.28    0.000        -26.89    0.000
group adf-stat          -8.24    0.000         -9.12    0.000
**All reported values are distributed N(0,1) under null of unit root or no cointegration.
**Panel stats are unweighted by long run variances.

Kao cointegration tests
                    Constant               Constant & Trend
Test                Statistic  P-value     Statistic  P-value
DFrho                  -31.88    0.000        -33.94    0.000
DFt                    -17.59    0.000        -18.64    0.000
**Stats are distributed N(0,1) under null of no cointegration.

Westerlund cointegration tests
Lags: 1 – 2
Leads: 0 – 1
Average AIC selected lag length: 1.98
Average AIC selected lead length: 0.96
             Constant                        Constant & Trend
Statistic    Value   Z-value  P-value       Value   Z-value  P-value
Gt           -4.06    -11.71    0.000       -4.23    -10.39    0.000
Ga           -0.24     11.50    1.000       -0.13     13.81    1.000
Pt          -22.25     -6.80    0.000      -25.95     -7.75    0.000
Pa           -2.56      6.16    1.000       -1.99      9.57    1.000
**Z-values are distributed N(0,1) under null of no cointegration.

Table S3. Regression results under alternative region specifications
Dependent Variable: ln (Ag. TFP)
Columns: Model A (Coef., P>|z|); Model B (Coef., P>|z|)

ln (Public Ag. Research Capital) ln (Public Extension Capital) ln (Public Ag. Research Capital Spilling) ln (Private Ag. Research Capital) Trend
0.0941 0.002 0.002 -0.0235 0.168 0.0929 0.0243 0.4937 0.000 0.000 -0.1358 0.0029 0.004 0.348
ln (Temperature) × D1 -0.2497 0.027
ln (Temperature) × D2 -0.0536 0.807
ln (Temperature) × D3 -0.0200 0.877
ln (Temperature) × D4 -0.0320 0.843
ln (Temperature) × D5 ln (Temperature) × D6 ln (Temperature) × D7 ln (Temperature) × D6_1 ln (Temperature) × D6_2 ln (Temperature) × D7_1 ln (Temperature) × D7_2 ln Total Precipitation ln Precipitation Intensity Intercept
-0.4812 -0.1129 0.0161 0.065 0.497 0.966 0.4953 0.1347 0.0029 0.2498 0.0543 0.0192 0.0312 0.4811 0.0288 0.2587 0.1891 0.904 1.1413 0.0352 0.0255 0.283 0.019 0.0349 0.020 -0.0246 0.080 0.154 0.004 0.345 0.027 0.805 0.881 0.847 0.065 0.272 0.628 0.070

Notes: Model A: estimation with alternative funding parameterization. Model B: estimation with alternative funding parameterization and alternative region definition. In Model A, D6 and D7 are dummy variables for the Mountain and Pacific regions. In Model B, D6_1 and D6_2 refer to the Mountain-North and Mountain-South regions, and D7_1 and D7_2 refer to the Pacific-North and Pacific-South regions.