Beyond OLS: A non-technical guide to econometric methods

The workhorse method in quantitative social science has long been Ordinary Least Squares (OLS). Its popularity is due to several reasons. The first is that, given a set of assumptions which we will come back to later, it yields the best linear unbiased estimators (BLUE), where "best" refers to the estimator with the lowest variance. Second, the estimators based on OLS are relatively easy to compute. Third, they are relatively straightforward to interpret. With the advent of powerful statistical software packages, the second argument is no longer as crucial as it once was. Therefore, with the recognition that the restrictive assumptions underlying the OLS estimators as BLUE often do not hold, social scientists are increasingly using more complicated techniques when analyzing data. In this short paper, I will first review the crucial assumptions underlying the OLS-as-BLUE result. Then, I will look at some more advanced techniques used by social scientists, in non-technical language, and explain why they are superior to OLS in different circumstances. This is only a cursory text aimed at providing an initial presentation of these techniques and the rationale underlying them. Those who are interested in learning more about these techniques, or who are contemplating using them in their own analysis, can consult, in increasing order of technical difficulty, Kennedy (2003), Gujarati (2003) or Greene (2003). At some point, learning more about these techniques requires a certain degree of mathematical knowledge. However, I believe it is possible to comprehend the core of these techniques without resorting to overly technical or mathematical language. My aim is to give readers enough understanding to read and understand empirical social science papers and books that use these techniques, and perhaps even to critically assess such works.
Readers are assumed to have basic knowledge of OLS and elementary statistics. (Some of the material in this paper is taken from an earlier working paper; see Knutsen, 2008.)

When is OLS BLUE?

Peter Kennedy (2003:48-9) lists the five main assumptions underlying the so-called Classical Linear Regression Model, under which OLS gives the best linear unbiased estimators. Let me re-list these assumptions and the possible violations as identified by Kennedy:

A1) The dependent variable can be calculated as a linear function of a set of specific independent variables and an error term.

This assumption is crucial if the estimators are to be interpreted as decent "guesses" at the effects of the independent variables on the dependent variable. Violations of this assumption lead to what is generally known as "specification errors". One should always approach quantitative empirical studies in the social sciences with the question "is the regression equation specified correctly?" One particular type of specification error is excluding relevant regressors. This is crucial, for example, when investigating the effect of one particular independent variable, let's say democracy, on a dependent variable, let's say growth. If one important variable, let's say level of income, is missing from the regression equation, one risks facing omitted variable bias. The estimated effect of democracy can then be systematically over- or understated, because income level affects both democracy and economic growth. The democracy coefficient will pick up some of the effect on economic growth that is really due to income. Identifying all the right "control variables" is a crucial task, and disputes over proper control variables can be found everywhere in the social sciences. Another variety of this specification error is including irrelevant controls.
If one for example wants to estimate the total, and not only the "direct", effect of democracy on growth, one should not include variables that are theoretically expected to be intermediate variables. That is, one should not include variables through which democracy affects growth. One example could be a specific type of policy, A. If one controls for policy A, one controls away the part of democracy's effect on growth that is due to democracies being more likely to push through policy A, and one therefore does not estimate the total effect of democracy on growth. Another specification error that can be committed is assuming a linear relationship when the relationship really is non-linear. In many instances, variables are not related in a fashion that is close to linearity. However, transformations of variables can often be made that allow an analyst to stay within an OLS-based framework. If one suspects a U-shaped or inverted U-shaped relationship between two variables, one can square the independent variable before entering it into the regression model. If one suspects that the effect of an increase in the independent variable is larger at lower levels of the independent variable, one can log-transform the independent variable. The effect of an independent variable might also depend on the specific values taken by other variables, or be different in different parts of the sample. Interaction terms and delineations of the sample are two suggested ways to investigate such matters.

A2) The expected value of the disturbance term is zero.

If this property is violated, there will be bias in the intercept, because OLS as a procedure forces the average of the error terms to be zero. Note that the assumption is about the underlying structure of the world, while the OLS procedure follows this assumption and forces the estimated error terms into a corresponding structure.
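The omitted variable problem discussed under A1) can be illustrated with a small simulation. All variable names and coefficients below are invented for the sketch; the only point is that leaving the confounder out of the regression distorts the coefficient on the variable of interest.

```python
import numpy as np

# Hypothetical illustration of omitted variable bias: "income" (the
# confounder) raises both "democracy" and "growth". All names and
# coefficients are invented for the sketch; the true democracy effect
# is set to 1.0.
rng = np.random.default_rng(0)
n = 5_000
income = rng.normal(size=n)
democracy = 0.8 * income + rng.normal(size=n)        # confounder drives x
growth = 1.0 * democracy + 0.5 * income + rng.normal(size=n)

def ols(y, *regressors):
    """OLS coefficients via the normal equations (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.solve(X.T @ X, X.T @ y)

b_short = ols(growth, democracy)          # income omitted: biased upward
b_long = ols(growth, democracy, income)   # income included: unbiased

print(f"democracy coefficient, income omitted:  {b_short[1]:.2f}")
print(f"democracy coefficient, income included: {b_long[1]:.2f}")
```

With income omitted, the democracy coefficient absorbs part of income's effect on growth; once the confounder is included, the estimate returns to the true value.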
A3) The disturbance terms have the same variance and are not correlated. (Notice that "disturbance term" refers to the real-world deviations from the real-world linear relationship, whereas "error term" refers to the estimated deviation from the estimated linear relationship.)

Violations of these two properties result in two problems that are very commonly associated with quantitative, empirical studies, namely heteroskedasticity and autocorrelation. Heteroskedasticity implies that the disturbances do not have the same variance, and the variances are often a function of specific independent variables. There might, to take one example, be larger disturbances for poor countries than for rich countries in a given model of economic growth. Autocorrelation implies that the disturbance terms are systematically correlated in one way or another: the unexplained part of a country's growth in year t is, for instance, correlated with the unexplained part of its growth in year t+1. In these cases, OLS is not BLUE, and more efficient techniques (with smaller variance in the coefficient estimates) can be utilized.

A4) Observations on the independent variables can be considered fixed in repeated samples.

The strictest interpretation of this assumption is that values on the independent variables are manipulated by the researcher, as in experiments. However, a less strict form of exogeneity of the independent variables is sufficient for OLS to function properly: the independent variables should not be affected by the dependent variable. In the language of causality, the dependent variable must be an effect and not a cause of the independent variable. In many cases, two variables can be both causes and effects of each other. If so, OLS regressions will give biased results.
For example, if high GDP levels increase average investment rates and high investment rates increase GDP levels, an OLS equation where investment rate is the independent variable and GDP the dependent variable will systematically overstate the effect of investment rates on GDP. Another problem related to assumption A4 is measurement error. If there is unsystematic measurement error in the dependent variable, we still get unbiased OLS estimates, but the estimates will have larger standard errors. If, however, we have unsystematic measurement errors in the independent variable, we get biased estimates. In a bivariate regression, such measurement errors will tend to draw the coefficients towards zero, a bias known as attenuation bias. Readers are encouraged to take their favorite social science research question and think about how systematic measurement errors might bias estimated relationships.

A5) There are more observations than independent variables, and there are no exact linear relationships between independent variables.

Violation of the first part of the assumption points to the "degrees of freedom" problem, and violation of the second to the "perfect multicollinearity" problem. I will not dig deep into these important issues here, but for a treatment of how these problems might also affect qualitative social science research, see King et al. (1994). Let me give one example of the degrees of freedom problem, where we have more independent variables than observations: if we have two countries, where one experienced revolution and the other did not, and these countries differ in degree of development, institutional structure and cultural background, it is impossible to discern which of these three variables was crucial to the existence of revolution in one country and its non-existence in the other. If we face negative degrees of freedom or perfect multicollinearity, a software package will refuse to calculate results.
However, one should note that even approximations to such situations will create problems for inference. Few degrees of freedom or high multicollinearity will tend to produce large standard errors for the coefficient estimates, thereby reducing the chance of obtaining significant coefficients. When it comes to multicollinearity: if, for example, a high level of literacy and a high degree of urbanization are strongly correlated with each other and with the probability of democratization, it is hard to discern what particular effects the two variables have on democratization. See Kennedy (2003) for a nice treatment of these issues.

Pooled data: Combining cross-section and time-series information

As social scientists have increasingly recognized, restricting the information one uses when drawing inferences to cross-sectional snapshots, or to cross-sectional averages over long time periods, is often unnecessary. Analysis of cross-sectional data is therefore often substituted by analysis of data with a pooled cross-section time-series (PCSTS) structure. The different cross-sectional units have observations for several time periods, and this vastly increases the amount of information available for inference. We can then draw on both cross-sectional and temporal variation when making inferences. However, when one incorporates a time structure in the data, the problem of autocorrelated disturbance terms flies directly in the analyst's face. OLS is therefore no longer appropriate, and one must switch to PCSTS methods. Moreover, even with a pure cross-sectional data structure, OLS analysis often encounters the other problem related to A3), namely heteroskedasticity. Readers might know that this problem can be reduced by applying Weighted Least Squares (WLS) instead of OLS in cross-section studies. Fortunately, PCSTS methods can also deal with heteroskedasticity.
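The efficiency gain from WLS mentioned above can be sketched in a short simulation. Everything here (the skedastic function, the coefficients) is invented for illustration, and the error variances are assumed known, which real applications would have to estimate.

```python
import numpy as np

# Hypothetical sketch of Weighted Least Squares under heteroskedasticity:
# the disturbance standard deviation grows with the regressor (think
# noisier growth data for some countries). Both estimators are unbiased
# here, but WLS, which down-weights the noisy observations, is more
# efficient. The skedastic function is assumed known for the sketch.
rng = np.random.default_rng(1)
n, reps = 200, 500
x = np.linspace(1.0, 5.0, n)
sigma = 0.5 * x                          # disturbance sd rises with x

ols_slopes, wls_slopes = [], []
for _ in range(reps):
    y = 2.0 + 1.0 * x + rng.normal(scale=sigma)    # true slope is 1.0
    X = np.column_stack([np.ones(n), x])
    # OLS ignores the unequal variances.
    ols_slopes.append(np.linalg.solve(X.T @ X, X.T @ y)[1])
    # WLS: divide each observation by sigma_i, then run OLS.
    Xw, yw = X / sigma[:, None], y / sigma
    wls_slopes.append(np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)[1])

print(f"spread of OLS slope estimates: {np.std(ols_slopes):.3f}")
print(f"spread of WLS slope estimates: {np.std(wls_slopes):.3f}")
```

Across repeated samples both estimators are centered on the true slope, but the WLS estimates are less dispersed: that is what "more efficient" means in practice.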
OLS with Panel Corrected Standard Errors

There are several varieties of PCSTS, but I will focus on only one, namely OLS with Panel Corrected Standard Errors (PCSE). According to Beck and Katz (1995), OLS with PCSE is the most appropriate version of PCSTS for data sets with relatively many cross-sectional units and relatively short time series. This is the situation for most data sets used in political economy research, and I will therefore not dwell on the other varieties of PCSTS. OLS with PCSE allows us to estimate coefficients even when we face unbalanced panels (time series that are not equally long for all cross-section units). Luckily for those reading or doing quantitative social science research, OLS with PCSE, as the name indicates, builds on the familiar OLS framework. Learning OLS with PCSE therefore does not require too much effort. The method is available in several software packages, for example STATA. Essentially, the calculation of estimates is based on an OLS procedure. However, the technique can take into account that disturbances in period t can be autocorrelated (within panels or generally) with the disturbances in period t-1. It can also take into account that disturbances might have different variances between different panels (that is, they are heteroskedastic). Finally, it can deal with the problem of a disturbance term for one cross-section unit at time t being correlated with the disturbances of other cross-section units at time t. This latter phenomenon is called contemporaneous correlation. Let me provide an example: if we run a model of economic growth, the procedure takes into account that the disturbance term for Germany in year t (unexplained growth; perhaps extraordinarily low growth because of a recession) is correlated with the unexplained growth in Germany in year t-1 and with unexplained growth in France in year t. The procedure also incorporates the possibility that Germany might have lower variation in its disturbance term than France.
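For the curious, the core of the panel-corrected standard error computation can be sketched for a balanced panel. This is a deliberately simplified illustration with simulated data, not the full Beck-Katz procedure: it corrects only for contemporaneous correlation and panel heteroskedasticity, not for autocorrelation, and all names and numbers are invented.

```python
import numpy as np

# Minimal sketch of panel-corrected standard errors (PCSE) for a
# balanced panel with simulated data: estimate by OLS, use the panel
# residuals to estimate the contemporaneous covariance between units,
# and correct the standard errors with it. (No autocorrelation
# correction here; a simplifying assumption of this sketch.)
rng = np.random.default_rng(2)
N, T = 10, 40                       # cross-section units, time periods

# Both the regressor and the disturbance contain a common period shock,
# so disturbances are contemporaneously correlated across units.
common_x = rng.normal(size=T)
x = 0.7 * common_x + rng.normal(size=(N, T))
common_e = rng.normal(size=T)
e = 0.8 * common_e + rng.normal(size=(N, T))
y = 1.0 + 2.0 * x + e               # true slope is 2.0

X = np.column_stack([np.ones(N * T), x.ravel()])   # stacked unit by unit
yv = y.ravel()
beta = np.linalg.solve(X.T @ X, X.T @ yv)          # plain OLS estimates
resid = (yv - X @ beta).reshape(N, T)

XtX_inv = np.linalg.inv(X.T @ X)
Sigma = resid @ resid.T / T          # N x N contemporaneous covariance
Omega = np.kron(Sigma, np.eye(T))    # full NT x NT error covariance
pcse = np.sqrt(np.diag(XtX_inv @ (X.T @ Omega @ X) @ XtX_inv))
naive = np.sqrt(np.diag(XtX_inv) * resid.var())    # uncorrected OLS se

print(f"slope {beta[1]:.2f}; naive se {naive[1]:.4f}; PCSE {pcse[1]:.4f}")
```

With disturbances correlated across units, the corrected standard error for the slope comes out larger than the naive OLS one: the coefficients themselves are unchanged, but PCSE is more honest about the uncertainty around them.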
These features of OLS with PCSE mitigate the problems related to A3), which would plague "ordinary" OLS. Moreover, the interpretation of the coefficients in OLS with PCSE is exactly the same as that of OLS coefficients: an increase in x1 by one unit, holding all other independent variables constant, increases the predicted y by β1. Therefore, there is not much more to be said here about OLS with PCSE. Interested readers are encouraged to check out Beck and Katz (1995) and Sayrs (1989).

Panel data methods: Fixed Effects and Random Effects

We now move on to two techniques that are much used in contemporary social science research, namely Fixed Effects and Random Effects. These two techniques resemble each other closely, and we start with Fixed Effects, which is easier to grasp. We remain in a panel data structure, with cross-section units observed at many time points. The main objective is of course still to estimate or test the effects of particular independent variables on the dependent variable. However, different assumptions about "how the world looks" oblige us to use different methodologies. In OLS with PCSE, it can be somewhat simplistically said that differences in X going together with observed differences in Y were used for inference regardless of whether the differences were observed along the time dimension within a unit or between two units at the same or different time points. Let us concretize by assuming that we have a model where democracy affects economic growth, which we want to investigate empirically. In the OLS with PCSE set-up, the fact that Afghanistan had a low level of democracy and low economic growth in 1987, and that Norway had a high level of democracy and a high level of growth in 2003, was used as information for inference. The same went for information relying on comparisons of Norway in 1850 and 2003.
We do of course control for other variables, but the main point is that both cross-sectional and temporal variation is used as information for drawing inferences. What if there are non-observed country-specific factors, not included in the regression framework, that determine both the rate of growth and the degree of democracy in a country? More generally, what if there are non-observed factors specific to each cross-section unit that affect both the independent and dependent variables? In that case, OLS with PCSE is inappropriate, since we should have controlled for such cross-section unit specific effects. If we believe that we have not identified all variables that are relevant to the analysis, and that we therefore have such cross-section unit specific effects, one solution is to run a so-called Fixed Effects regression. This analysis incorporates dummy variables for all the cross-section units. Thereby, going back to our example again, Fixed Effects will infer the effect of democracy on economic growth only from variation within nations along the time dimension, as they become more or less democratic. In this sense, Fixed Effects analysis is a very restrictive analysis, since it does not allow us to infer anything about causal effects from cross-national variation. We still investigate the effect of democracy on growth, but we are not allowed to use information from the Afghanistan-Norway comparison, for example. The main benefit is that we reduce the possibility of omitted variable bias: we remove the possibility that unidentified variables correlated with national features are driving our results and thereby biasing our estimates. In Fixed Effects, we can also incorporate dummies for different time periods, thereby reducing the possibility of time-specific effects driving results. One can run Fixed Effects with dummies only on cross-section units, only on time periods, or on both.
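The Fixed Effects logic, inferring only from within-unit variation over time, can be sketched with simulated data. Subtracting each unit's own mean over time (the "within" transformation) is numerically equivalent to including a dummy per unit; the country trait and coefficients below are invented for the illustration.

```python
import numpy as np

# Sketch of the Fixed Effects logic on simulated data: each "country"
# has an unobserved fixed trait that raises both its democracy score
# and its growth. Pooled OLS is then biased, while the within
# (demeaned) estimator, equivalent to a dummy per country, is not.
rng = np.random.default_rng(3)
N, T = 50, 20
trait = rng.normal(size=(N, 1))              # unobserved country effect
democracy = trait + rng.normal(size=(N, T))
growth = 0.5 * democracy + 2.0 * trait + rng.normal(size=(N, T))

def slope(x, y):
    """Bivariate OLS slope."""
    x, y = x - x.mean(), y - y.mean()
    return (x * y).sum() / (x * x).sum()

b_pooled = slope(democracy, growth)          # cross-country variation leaks in
# Within transformation: subtract each country's own mean over time,
# which is exactly what country dummies would absorb.
dem_w = democracy - democracy.mean(axis=1, keepdims=True)
gro_w = growth - growth.mean(axis=1, keepdims=True)
b_fe = slope(dem_w, gro_w)

print(f"pooled OLS slope:    {b_pooled:.2f}")
print(f"fixed effects slope: {b_fe:.2f}")
```

The pooled estimate is pushed away from the true effect (0.5 in this sketch) by the unobserved country trait, while the within estimator, using only temporal variation, recovers it.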
The choice depends on our assumptions about the workings of the world in our particular research question. However, we risk wasting a lot of information when using Fixed Effects to draw inferences. What if the difference between growth rates in Afghanistan and Norway is partly due to the fact that Norway is more democratic? In that case, in our quest to reduce omitted variable bias, we risk wasting valuable information. Fixed Effects risks "throwing the baby out with the bath water" (Beck and Katz, 2001). This contributes to reduced efficiency in the Fixed Effects estimators. Since they do not use all relevant information, the estimators tend to have larger standard errors than techniques that utilize more information. It is therefore more likely that we commit type II errors, by failing to identify relevant effects. Fixed Effects analysis assumes that each cross-section unit has its own specific intercept (the specific dummy variable coefficient plus the common intercept) in the regression. Random Effects analysis moderates this assumption. Random Effects, like Fixed Effects, creates a different intercept for each cross-section unit, "but it interprets these differing intercepts in a novel way. This procedure views the different intercepts as having been drawn from a bowl of possible intercepts, so they may be interpreted as random … and treated as though they were part of the error term" (Kennedy, 2003:304). Under the assumption that the intercepts are truly randomly drawn, that is, that they are uncorrelated with the independent variables, Random Effects gives more efficient estimates than Fixed Effects; the coefficients will have smaller standard errors. However, Random Effects will be biased if the error term is correlated with any of the independent variables. This last point can be hard to grasp.
It is sufficient for the reader to know that Random Effects also controls for country-specific effects, but is more efficient than Fixed Effects under certain assumptions (the country-specific effects are not correlated with the independent variables). In practice, this can often lead to Random Effects finding significant effects where Fixed Effects does not. However, if the country-specific effects are highly correlated with certain independent variables, Random Effects might be biased, and one should use Fixed Effects. Both Fixed Effects and Random Effects can be calculated in STATA, and there are several varieties of both techniques. The estimation procedure can for example rely on GLS or Maximum Likelihood (if you want to know more about these general estimation procedures, you can look them up in any econometrics textbook, but the differences are of little relevance to us here). One can also incorporate the possibilities of heteroskedasticity or autocorrelation into the estimation procedure.

The endogeneity problem: What if Y affects X? 2SLS!

The issue of reverse causality permeates many studies in the social sciences, thereby rendering A4) false. In the study of the economic effects of democracy, for example, this problem comes to the fore. Economic factors are likely to influence political organization, and a correlation between democracy and economic growth cannot readily be attributed to the causal effect of democracy on growth. Lagging the independent variable could be seen as one way to deal with reverse causality, by exploiting the temporal sequence of cause and effect. However, there exist other and more solid statistical solutions. One proposed solution, very often used in the econometric literature, is to find so-called instrumental variables, or instruments, for endogenous independent variables. There are two requirements for a variable to be a valid and "decent" instrument for an endogenous independent variable.
First, the instrument should be correlated with the independent variable. If the correlation is low, we will often face very large standard errors for the estimated coefficients when using instrumental variable analysis. Second, an instrument should not be directly related to the dependent variable; that is, the instrument should be correlated with the dependent variable only through the independent variable it instruments for. The intuition behind the procedure is that we utilize only the "exogenous" part of the variation in the independent variable, the part related to the exogenous instrument, and thereby get a better estimate of the causal effect of the independent variable on the dependent variable. If this second condition is not satisfied, the resulting estimates will not be consistent.

[Figure 2: Causal structure underlying Instrumental Variable analysis. The instrument(s) affect the endogenous independent variable, which in turn affects the dependent variable.]

A common technique based on the use of instrumental variables is Two Stage Least Squares (2SLS). More than one instrument can be incorporated in a 2SLS analysis. Let us exemplify such analysis by once again turning to the question of whether democracy increases economic growth rates. We now recognize the problem that democracy might be endogenous to growth, and we need to find a proper exogenous instrument for democracy. The procedure is to first use OLS on an equation where democracy (the endogenous independent variable in the original regression) is the dependent variable, and the instrument(s) and the control variables from the original regression are entered as right-hand-side variables. We then take the predicted, instead of the actual, democracy values from this first regression and enter them into the original regression equation. We thus do regression analysis in two stages, where democracy is the dependent variable in the first stage and economic growth is the dependent variable in the second stage.
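The two-stage procedure just described can be sketched with simulated data. The instrument, the endogenous regressor and all coefficients below are invented for the illustration, and control variables are left out for brevity; note also that the standard errors from a naive two-step computation like this would be wrong, which is why one should use a dedicated 2SLS routine in practice.

```python
import numpy as np

# Illustrative 2SLS on simulated data. "Democracy" is endogenous
# (correlated with the growth disturbance u), while the instrument z
# shifts democracy but affects growth only through it. All names and
# coefficients are invented for the sketch.
rng = np.random.default_rng(4)
n = 10_000
z = rng.normal(size=n)                   # exogenous instrument
u = rng.normal(size=n)                   # growth disturbance
democracy = 0.8 * z + 0.6 * u + rng.normal(size=n)   # endogenous regressor
growth = 1.0 * democracy + u             # true effect of democracy is 1.0

def ols(y, *regressors):
    """Return the design matrix and OLS coefficients (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return X, np.linalg.solve(X.T @ X, X.T @ y)

_, b_ols = ols(growth, democracy)        # naive OLS: inconsistent here
X1, b1 = ols(democracy, z)               # stage 1: regress x on instrument
dem_hat = X1 @ b1                        # fitted ("exogenous") democracy
_, b_2sls = ols(growth, dem_hat)         # stage 2: regress y on fitted x

print(f"OLS slope:  {b_ols[1]:.2f}")
print(f"2SLS slope: {b_2sls[1]:.2f}")
```

Naive OLS overstates the effect because democracy is correlated with the disturbance, while the two-stage estimate, which uses only the instrument-driven variation in democracy, lands close to the true value.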
The instruments enter the regression equation only in the first stage, but the regular control variables are used in both stages. The 2SLS procedure does not give us unbiased estimates, but it gives us consistent estimates. Consistency implies that the estimator converges towards the true value as the number of observations increases (roughly speaking, asymptotic unbiasedness). This should make us wary of relying on 2SLS in small samples. 2SLS can be used on cross-sectional data, but it also has panel data versions. Both of these estimation procedures are incorporated in the STATA package. One "problem" with 2SLS is that we tend to get relatively large standard errors for the coefficient of the endogenous independent variable, especially if its correlation with the instrument is low. It is therefore often difficult to find significant 2SLS results. Another problem is that it is very difficult to find truly exogenous instruments that are not directly related to the dependent variable. I encourage the reader to look up Acemoglu et al. (2001), who utilize settler mortality among colonists in former colonies as an instrument for institutional structure in their study of how institutions affect economic development. Their main point is that settler mortality levels decades ago have no direct link to the level of development today. Settler mortality is however related to institutional structure today, since it affected the probability of colonizers settling down in colonies and building institutional structures. Because of institutional inertia, these historical institutional structures affect the nature of institutions today. The instrument is therefore correlated with the independent variable of interest, institutions, but not directly linked to the dependent variable, development.
It can therefore readily be used in a 2SLS framework, and the estimated effect found, it is argued, is not open to the attack that the coefficient is due to development affecting institutions.

Non-linearity: Matching

Regression-based techniques assume linear effects, as clearly stated in A1), and this can be problematic when investigating particular social science research questions. Imposing a linearity restriction might be too strong an assumption, leading to too crude an estimation procedure. Recently, there has been some interest in so-called matching techniques among researchers studying political-economic topics. Persson and Tabellini (2003) use matching in their study of the economic effects of different forms of constitutions and electoral systems. Matching is a so-called non-parametric estimation technique, in which we relax assumptions of functional form. We do not have to make an initial assumption about whether the relationship is linear or has any other particular functional form, and we do not have to assume that the effect is independent of the values of contextual variables. As is so often the case in econometric work, however, relaxing strict assumptions carries a cost in terms of reduced efficiency; that is, we tend to get relatively large standard errors for the estimates. This is analogous to the situation when we move from OLS-based PCSTS to the more robust but less efficient 2SLS technique (there is no free lunch, as economists like to point out). Matching techniques draw on experimental logic: "The central idea in matching is to approach the evaluation of causal effects as one would in an experiment. If we are willing to make a conditional independence assumption, we can largely re-create the conditions of a randomized experiment, even though we have access only to observational data" (Persson and Tabellini, 2003:138).
The main underlying idea is that we split our independent variables into two groups: the control variables, and the treatment variable whose effect we are interested in investigating. Further, we need to dichotomize the treatment variable and assume so-called conditional independence; that is, we need to assume that, conditional on the control variables, selection on the treatment variable is unrelated to the dependent variable. If specific units self-select to a certain value on the treatment variable, this will pose trouble for inference. Matching is based on the underlying idea that we should compare the most similar units, for example the most similar countries. In this sense, the logic reflects not only that of experiments, but also that of the Most Similar Systems logic utilized in small-n studies in comparative politics (John Stuart Mill's "Method of Difference"). We make "local" comparisons over units that are relatively similar on all variables but the treatment variable we are interested in investigating (for example political regime, dichotomized into democracy and dictatorship). We then estimate the effect of the treatment variable from how the matched units differ on the dependent variable. As Persson and Tabellini (2003:139) put it, we try to find "twins" or a "close set of close relatives" for each observation, but these most similar countries need to differ on the treatment variable, for example degree of democracy, as in the previous examples of studies on how democracy affects economic growth. The estimated effect of democracy is computed for each of the pairwise comparisons made, and we then calculate an average of these effects to get our final (generalized) estimate.
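The pairwise logic described above can be sketched as one-nearest-neighbour matching with replacement on simulated data. The covariates, the selection rule and the treatment effect are all invented for the illustration, and real applications would typically match on more covariates or a propensity score and adjust the standard errors.

```python
import numpy as np

# Sketch of one-nearest-neighbour matching with replacement on
# simulated data: a binary "democracy" treatment whose assignment
# depends on an observed covariate, and an effect estimated by
# comparing each treated unit with its closest untreated "twin".
# All names and numbers are invented for the illustration.
rng = np.random.default_rng(5)
n = 2_000
covariates = rng.normal(size=(n, 2))           # e.g. income, urbanization
# Selection on observables: units with high income are more often treated.
treated = (covariates[:, 0] + rng.normal(size=n)) > 0
outcome = covariates @ np.array([1.0, 0.5]) + 2.0 * treated + rng.normal(size=n)

treated_idx = np.flatnonzero(treated)
control_idx = np.flatnonzero(~treated)
effects = []
for i in treated_idx:
    # Euclidean distance from treated unit i to every control unit.
    d = np.linalg.norm(covariates[control_idx] - covariates[i], axis=1)
    twin = control_idx[np.argmin(d)]           # closest control ("twin")
    effects.append(outcome[i] - outcome[twin])

att = float(np.mean(effects))   # average treatment effect on the treated
print(f"matching estimate of the treatment effect: {att:.2f}")
```

Each treated unit is compared only with its most similar untreated counterpart, and the pairwise differences are averaged, which in this sketch recovers the treatment effect of 2 without assuming any functional form for the outcome equation.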
"Matching allows us to draw inferences from local comparisons only: as we compare countries with similar values of X [characteristics in terms of values on the control variables], we rely on counterfactuals that are not very different from the factuals observed" (Persson and Tabellini, 2003:139). One example of a good pairwise match in the democracy-growth example might be Benin and Togo. Both have relatively similar values on potential control variables such as colonizer (France), location (West Africa), level of development, etc. However, Benin can be classified as democratic, and Togo as dictatorial, on the treatment variable. There are several different versions of the matching procedure, and there are specifications that need to be made before one can start estimating. The literature on the use of matching in the social sciences is growing rapidly, and I will not survey the different techniques, since the point here is to show the general logic. Matching can for example be performed with or without replacement. Replacement indicates that the same unit can be used as a match several times; Benin can for example be used as a match for both Togo and Guinea. Packages for matching estimation can be downloaded from the internet and used in STATA; one example is the package related to the "nnmatch" command. One crucial specification that must be made initially is the number of "similar" experiences we want to compare with when estimating effects of the treatment variable; we can use one or more than one match for each unit. A second specification is whether we want to adjust for possible biases in specific ways (Abadie and Imbens, 2002). A third is how we want to calculate the standard errors. When it comes to the first, using several cases as matches of course increases the amount of information one bases inferences upon, but we then risk comparing a unit with other units that are relatively dissimilar to it.
When it comes to bias adjustment, one bias-adjustment procedure has been specified by Abadie and Imbens (2002), but several others are available. When it comes to standard errors, there is often good reason to believe that they are heteroskedastic, and it is therefore in many instances recommended to use "robust standard errors". STATA calculates such robust standard errors by running through a second matching process, in which matches are made between observations that have similar values on the treatment variable. The resulting standard errors are heteroskedasticity-consistent.

Conclusion

A move from cross-sectional to pooled cross-sectional time-series data increases the amount of information one can use when drawing inferences. Quantitative researchers in the social sciences therefore increasingly use such data structures. One problem for students and researchers educated only in OLS, or alternatively in WLS, is that these techniques run into serious problems under such a data structure, with autocorrelation being a main scourge. Fortunately, there are techniques closely resembling the logic of OLS that can be used in such situations. One simple extension is OLS with Panel Corrected Standard Errors, where estimators are calculated on the basis of both cross-sectional and temporal variation, with for example country-year as the unit of analysis. However, if there are non-observed country-specific effects that strongly drive results, analysts are encouraged to switch to Fixed Effects, or alternatively the more lenient Random Effects. Fixed Effects incorporates dummies for each cross-section unit, and should be beloved by those who claim that each country is so special that inter-country comparisons cannot be used for inference. This is however a very strong claim, often bordering on nihilism (Beck and Katz, 2001), and Fixed Effects might therefore waste a lot of valuable information.
Endogeneity is a general problem in the social sciences, and I sketched a procedure constructed for dealing with endogenous independent variables, namely 2SLS. 2SLS yields consistent estimates, but the standard errors are generally large. Moreover, finding proper instruments is a very difficult task, and this could be one of the reasons 2SLS has not diffused more widely into political science. Matching draws on experimental logic, and this type of analysis allows analysts to avoid assuming linearity of effects. Every unit is compared with one or more similar units that differ on the treatment variable of interest, and treatment effects are estimated and finally averaged into an average treatment effect. These two latter techniques are arguably more complex than OLS, and in order to understand them properly, interested readers are encouraged to dig into the literature on these techniques. 2SLS is widely used in economics, and matching is more widely used in psychology, medicine and biology. However, these techniques are superior to simpler techniques in several situations, and my guess is that we will see more widespread use of 2SLS and matching in political science in the years to come.

Literature

Abadie, Alberto and Guido Imbens (2002). "Simple and Bias-Corrected Matching Estimators for Average Treatment Effects". National Bureau of Economic Research, Cambridge, MA. Technical Working Paper 283.
Acemoglu, Daron, Simon Johnson and James A. Robinson (2001). "The Colonial Origins of Comparative Development: An Empirical Investigation". American Economic Review 91: 1369-1401.
Beck, Nathaniel and Jonathan N. Katz (1995). "What to Do (and Not to Do) with Time-Series Cross-Section Data". The American Political Science Review 89: 634-647.
Beck, Nathaniel and Jonathan N. Katz (2001). "Throwing Out the Baby with the Bath Water: A Comment on Green, Kim, and Yoon". International Organization 55(2): 487-495.
Greene, William H. (2003). Econometric Analysis. 5th edition. Upper Saddle River: Prentice Hall.
Gujarati, Damodar N. (2003). Basic Econometrics. 4th edition. New York: McGraw-Hill.
Kennedy, Peter (2003). A Guide to Econometrics. 5th edition. Cambridge, MA: The MIT Press.
King, Gary, Robert O. Keohane and Sidney Verba (1994). Designing Social Inquiry. Princeton: Princeton University Press.
Knutsen, Carl Henrik (2008). "The Economic Effects of Democracy: An Empirical Analysis". Oslo: University of Oslo, Department of Political Science. Working Paper.
Persson, Torsten and Guido Tabellini (2003). The Economic Effects of Constitutions. Cambridge, MA: The MIT Press.
Sayrs, Lois W. (1989). Pooled Time Series Analysis. Quantitative Applications in the Social Sciences Series No. 70. Sage University Paper.