GEE Models of Judicial Behavior

Christopher J. W. Zorn
Department of Political Science
Emory University
Atlanta, GA 30322
czorn@emory.edu

Version 1.0: April 1, 1998

Abstract

The assumption of independent observations in judicial decision making flies in the face of our theoretical understanding of the topic. In particular, two characteristics of judicial decision making on collegial courts introduce heterogeneity into successive decisions: individual variation in the extent to which different jurists maintain consistency in their voting behavior over time, and the ability of one judge or justice to influence another in their decisions. This paper addresses these issues by framing judicial behavior in a time-series cross-section context and using the recently developed technique of generalized estimating equations (GEE) to estimate models of that behavior. Because the GEE approach allows for flexible estimation of the conditional correlation matrix within cross-sectional observations, it permits the researcher to explicitly model interjustice influence or over-time dependence in judicial decisions. I utilize this approach to examine two issues in judicial decision making: latent interjustice influence in civil rights and liberties cases during the Burger Court, and temporal consistency in Supreme Court voting in habeas corpus decisions in the postwar era. GEE estimators are shown to be comparable to more conventional pooled and TSCS techniques in estimating variable effects, but have the additional benefit of providing empirical estimates of time- and panel-based heterogeneity in judicial behavior.

Paper prepared for presentation at the Annual Meeting of the Midwest Political Science Association, April 23-25, 1998, Chicago, Illinois. Thanks to Erik Naft for providing some of the data used here, and to Neal Beck, Greg Caldeira, Mike Giles, Tom Walker and John Wright for helpful comments and discussions.
This is a preliminary version; please cite extensively without the author’s permission.

1. Introduction

Fifteen years have passed since Gibson’s (1983) call for an “integrated” theory of judicial decision making. A central concern of Gibson’s plea was the need to focus on individual decision makers, including models which test theories from other levels of analysis at the individual level. Since that time, scholars of judicial behavior have responded with an increasingly complex array of models designed to incorporate background and socialization variables, attitudes, roles, fact patterns and precedent, and institutional and strategic considerations into explanations of judicial activity. But while the theoretical richness of this literature has grown immensely during this period, little progress has occurred in the way in which we incorporate these advances into our empirical work.

This lag in the development of models of judicial behavior becomes clear when we examine the modus operandi of the archetypical judicial behavior study. Such studies often consider data on a particular area of the law (e.g. search and seizure, obscenity) and posit a model of the decision process based on one or more of the set of factors outlined above. The variable of interest, the decision, is more often than not dichotomous, and the unit of analysis is, variably, the decision of the Court or the vote of the various judges or justices in those cases. A probit or logit model is estimated via MLE, probabilities are compared, and conclusions are drawn. Implicit in this formulation, however, is the assumption that the observations are conditionally independent, a claim with important implications, both statistical and substantive, for the conclusions we draw.

I argue here that this assumption of conditional independence flies in the face of our knowledge of judicial behavior.
This is true for at least two reasons: the influence of consistency in decision making, and the impact of informal interjustice influence on judicial decision making; I examine these two issues here. More generally, our understanding of judicial politics implies a wide range of sources of heterogeneity in these observations, all of which have a potentially critical influence on our understanding of decision making. Moreover, these factors are of greatest concern in modeling individual-level decisions, arguably the most fruitful ground for analyzing judicial politics.

The plan of the paper is as follows. In section 2, I motivate the discussion by examining these two important sources of heterogeneity in judicial decisions, and investigating the implications of those influences for specification and estimation of judicial decision making models. Section 3 introduces a potential solution to those problems, the technique of generalized estimating equation (GEE) models for time-series, cross-sectional data, and discusses its statistical properties. I then apply these models to the two applications discussed. In section 4, I examine interjustice influence in a simple attitudinal model of Supreme Court decision making in civil rights and liberties cases during the seventh Burger Court (OT1981-1985). Section 5 presents an analysis of the importance of over-time voting consistency in a more complex, “integrated” model of merits decisions in Supreme Court cases involving habeas corpus. In both applications, the GEE results are compared and contrasted with those which obtain from commonly-used models which fail to account for the heterogeneity in question. Section 6 closes the paper with a discussion of the general question of cross-observation dependence and suggestions for additional applications of the model, both within and beyond the area of judicial politics.

2. Heterogeneity in Judicial Decision Making

There are at least two compelling arguments refuting the notion that individual-level decisions on the Supreme Court are homogeneous. The first I term interjustice influence: the impact of the informal, and largely unobserved, bargaining, persuasion, and even coercion among the Court’s members. On a small, collegial decision making body such as the Court, we would expect such influences to be substantial, and previous work in this area has found that justices do influence one another to a large degree (e.g., Murphy 1964). Biographical and anecdotal accounts of the workings of the Court point to the effectiveness of some justices in bringing their brethren around to their own way of thinking (e.g. Mason 1956, 1964). Among these, Justice William Brennan stands out: Brennan’s powers of persuasion have achieved close to legendary status (e.g. Baum 1995, 175-6; Hopkins 1991).

More empirical and quantitative studies have also documented the impact of interjustice influence on Supreme Court decision making. Spaeth and Altfield’s (1985) study of special opinions on the Warren and Burger Courts, for example, found that while the impact of such influence was not particularly strong during those periods and most such instances involved justices of similar ideology, a majority of influence relationships were unidirectional, indicating that “justices do differ in their power of persuading others of the correctness of their views” (Spaeth and Altfield 1985, 82; see also Altfield and Spaeth 1984). More recently, a series of studies by Maltzman and Wahlbeck (1996a,b; Wahlbeck, Spriggs and Maltzman 1998) have begun to investigate these internal dynamics in more depth, focusing on strategic influences on such activity as opinion assignment and opinion coalitions.

A second factor advising against homogeneity is the effect of individual justices’ consistency on their decision making.
Justices of the Supreme Court, as a rule, exhibit consistent voting patterns over the course of their careers. Baum, for example, finds that, while individual voting change “played a surprising large role” in the changing policies of the Vinson, Warren and Burger Courts, membership change made the largest contribution to overall policy change (1992, 21-2; see also Baum 1988). This general consistency in individual voting behavior can be ascribed to a number of sources: stability in justices’ attitudes and perspectives on the law, a normative desire on the part of the justices to maintain and support positions that they have taken in the past,[1] and the influence of past decisions (i.e., precedent) on the justices’ current behavior. Irrespective of its source, however, such consistency implies that decisions in subsequent cases are unlikely to be independent. Nonetheless, informed modeling decisions can advance our understanding of the reasons for this dependence.

[1] Segal and Spaeth, for example, point out Justice Harlan II’s tendency to “never ... accede to any statement supportive of a majority opinion from which he dissented” (1996, n.1).

As an example, consider the impact of precedent on justices’ voting decisions. The importance of precedent in Supreme Court decision making is currently a hotly debated topic (see e.g. Brenner and Stier 1996; Brisbin 1996; Knight and Epstein 1996; Segal and Spaeth 1996; Songer and Lindquist 1996). A major issue of operationalization in this debate concerns the observational equivalence of voting according to precedent or ideology in cases where the two are in agreement; this equivalence makes untangling these separate effects difficult. In principle, however, one could begin to address this problem by considering the extent to which justices’ behavior was consistent across similar cases over time after controlling for the influence of ideology.
I take a first step in this direction in one of the applications below.

From these discussions, we can discern three different possible sources of heterogeneity in judicial decision making. First, unobserved factors which vary from one justice to another I denote within-justice effects. Included in this category would be variations in behavior resulting from background, socialization, and personal characteristics which change little or not at all over time. The impact of these variables on decision making will thus vary across justices, but will, ceteris paribus, remain relatively constant for a particular justice across different cases. A second source of heterogeneity consists of influence relationships among the justices of the Court; I label these across-justice effects. Bargaining, pressure and coercion among the justices undoubtedly influence the actions they take, and thus also the observed patterns of the votes and the decisions of the Court. Finally, there are over-time influences on the justices’ decisions; these include such factors as the Court’s agenda and precedent. These influences are case-specific; they are common across justices, varying instead over the cases the justices hear. Moreover, they include as special cases such previously studied variables as case facts and legal influences.

The “trick” to dealing with heterogeneity in judicial decision making is addressing the different sources of heterogeneity simultaneously. For simplicity, I begin with an abstract, general model of Supreme Court decision making at the individual level. Concerns relating to interjustice influence and consistency are also present at more aggregated levels of analysis, e.g. the decision of the Court, or the proportion of decisions made in a certain direction by a particular justice or Court in a given year.
Similarly, these sources of heterogeneity will also be influential, albeit to different degrees, in lower courts: while interjudge forces will typically be absent in trial courts, for example, matters relating to precedent and consistency clearly play an even greater role there (Baum 1986). By focusing on individual decisions in the Supreme Court, however, I offer the greatest degree of comparability to previous work; the models here readily generalize to other settings as well.

Consider a binary decision process, where a justice must decide between two possible outcomes; these may be a vote to grant or deny certiorari, a vote to affirm or reverse on the merits, or any other such decision. Denote the outcome of that decision for a specific justice $j$ in a given case $i$ as $Y_{ij}$. We might expect the probability of a given outcome on $Y_{ij}$ to vary according to some function of a vector of $k$ independent variables; these factors may vary over cases (e.g. case facts), justices (e.g. ideology), or both. We may write this general model as:

$$\mathrm{Prob}(Y_{ij}) = f(X_{ijk}\beta_k) \tag{1}$$

As is generally the case in models which vary over more than one unit of analysis (e.g. those involving time-series cross-sectional data), the simplest model one can estimate also involves making the greatest number of assumptions about the data-generating process. If, conditional on the independent variables $X$, the impact of those independent variables on the decision is assumed to be constant across both justices and cases, and if the observations analyzed are assumed to be independent both across $i$ and $j$ (i.e., across cases and justices), then an estimate of the impact of $X$ on $Y$ can be obtained by simply “pooling” the observations and estimating a standard logit or probit model. This approach has been widely used in empirical studies of judicial decision making, both of the U.S. Supreme Court (e.g. Maltzman and Wahlbeck 1996b; Segal and Spaeth 1993) and of other judicial fora (e.g.
Brace and Hall 1993, 1997;[2] Songer and Davis 1990; Songer and Haire 1992). The assumptions on which it is based are stringent; in particular, the assumptions of uniform behavior and no dependence within or across either cases or justices are, for the reasons stated previously and a host of others, highly unrealistic. I therefore consider in turn a number of alternatives which have been used to address the issue of heterogeneity in this and other contexts.

[2] Brace and Hall include separate dummies for varying time periods, and may in one sense therefore be considered a variation on a fixed-effects specification.

Consider first within-justice heterogeneity; that is, variations among justices in their behavior in the same cases. One possibility is that, while all justices respond in similar ways to the same stimuli, each varies in his or her initial predisposition to behave in a certain way. That is, there is variation in the model’s intercept terms for different justices. Formally, we may write:

$$\mathrm{Prob}(Y_{ij}) = f(X_{ijk}\beta_k + \alpha_j) \tag{2}$$

where the $\alpha$’s are separate intercepts for each justice. This is the standard fixed-effects model; justice-specific heterogeneity in the intercepts is conditioned out of the model via separate estimates of $\alpha$ and $\beta$. Like the general pooled model, the fixed-effects model has received some use in the judicial politics literature (e.g. Hall 1992),[3] although its application is not as widespread as other types.[4]

A more commonly found generalization allows both intercept and slope coefficients to vary over justices or time points:

$$\mathrm{Prob}(Y_{ij}) = f(X_{ijk}\beta_{jk}) \tag{3}$$

Here justices respond differently to the case stimuli presented to them, and the model explicitly allows for estimation of those differences. Maintaining our assumption of no conditional correlation across cases, this model amounts to simply running separate regressions for each justice in question, a technique widely used in the judicial politics literature (e.g.
Boucher and Segal 1995; Caldeira and Wright 1996; Flemming and Wood 1997; Segal 1986). Such models have the advantage of computational simplicity and explicit treatment of differences. At the same time, they are not especially parsimonious, and comparisons of coefficient values are not advised.[5]

[3] Hall’s analysis incorporates fixed effects for time-points, rather than justices, but the two approaches are conceptually similar.

[4] The relative absence of fixed-effects models from judicial politics research is somewhat surprising, at least at the Supreme Court level. The fact that such models often have relatively fixed (and small) numbers of justices and large (and increasing) numbers of cases would seem to make fixed-effects specifications a natural choice.

While these approaches represent intuitive ways of getting at variation within different justices, a common trait of all these models is their failure to allow for dependencies across justices. Because they maintain the assumption of the independence of votes, none consider the possibility of conditional correlation across the votes of different justices. Thus if, holding constant the effect of the independent variables, justices engage in logrolling, persuasion, coercion, or other forms of interpersonal influence, this information is ignored by the model, with a corresponding loss in efficiency of the estimates. One potential solution to this problem lies in “seemingly-unrelated” regression models (e.g. Zellner 1962), where separate equations are estimated with the assumption of a correlated error structure. However, few of these models have been modified to address the discrete choice variables typically encountered in judicial decision making (but see King 1989).

The issue of case-specific heterogeneity raises a separate set of concerns. Even assuming we have adequate measures for case differences (e.g. case facts, lower court disposition, etc.), other factors contribute to cross-case heterogeneity.
In particular, the foregoing discussions of consistency and precedent imply that, among the decisions of a particular justice $j$, we might expect that cases will not be serially independent. In addition to case-specific fixed- and random-effects models, a natural way of incorporating these effects is through a simple dynamic specification:

$$\mathrm{Prob}(Y_{ij}) = f(X_{ijk}\beta_k + Y_{i-1,j}\,\rho) \tag{4}$$

Here, the probability of a particular decision is a function of the independent variables and the immediately previous decision.[6] Equation (4) represents the general AR(1) form of this model. As is the case for justice-specific heterogeneity, one could further generalize the nature of the relationship by allowing the impact of the lagged endogenous variable to vary by justice (i.e., $\rho = \rho_j$).

[5] Other models, less widely used in judicial politics but popular elsewhere, include random-effects models, which treat individual effects as products of a unit-specific mean effect and a stochastic component (e.g. Hsiao 1996), and random-coefficient models.

[6] See Beck and Katz (1996) for a discussion of the use of lagged endogenous variables in time-series cross-sectional models generally. Beck and Tucker (1997) argue that such techniques are of more limited utility in models with binary dependent variables (BTSCS), preferring instead a modified event-history approach using period-specific dummy variables. In contrast, Beck and Katz (1997) note that, while lagged dependent variables are an imperfect correction for serially correlated errors in BTSCS models, they are appropriate when the model specification calls for endogeneity. I address these issues further in the applications, below.

Addressing both justice-specific and case-specific heterogeneity would seem to suggest a combination of these models. So, for example, one might estimate separate models for each justice’s response to case-specific variables, while including a dynamic component for the effect of precedent and allowing the errors to be correlated. This piecemeal approach to model estimation is, at best, questionable on substantive grounds, and at worst incorrect and misleading. Instead, I suggest an alternative, unified approach to addressing these sources of heterogeneity below.

3. Generalized Estimating Equation Models: An Overview[7]

I now introduce the generalized estimating equations (GEE) approach to time-series cross-sectional models. The method of GEE is a marginal, or population-averaged, approach to estimation. Population-averaged models differ from the more common cluster-specific approaches described above. Cluster-specific approaches model the probability distribution of the dependent variable as a function of the covariates and a cluster-specific parameter. This latter term may be either estimated concurrently with the model (as in the fixed-effects approach) or be assumed to follow some stochastic distribution (as in the random-effects specification). Population-averaged approaches, by contrast, model the marginal (or population-averaged) expectation of the dependent variable as a function of the covariates.[8] These differences have ramifications for the interpretation of the estimated parameters, as I discuss below.

[7] This section draws extensively on the presentation of Zeger and Liang (1986) and Fitzmaurice et al. (1993).

[8] For a good discussion of this distinction, see Neuhaus et al. (1991).

The GEE model has its roots in the quasi-likelihood methods introduced by Wedderburn (1974) and developed and extended by McCullagh and Nelder (1983, 1989).[9] While standard maximum likelihood analysis requires that we specify the conditional distribution of the dependent variable, quasi-likelihood requires only that we postulate the relationship between the expected value of the dependent variable and the covariates and between the mean and variance of the dependent variable.
This generalized linear models (GLM) approach has received widespread use in cross-sectional analysis.

[9] An excellent summary of this approach can be found in Heyde (1997).

Following notation similar to that of Zeger and Liang (1986), I consider a model of observations on a dependent variable $Y_{it}$ and $k$ covariates $X_{it}$, where $i$ indexes the $N$ observations $i = 1, 2, \ldots, N$ and $t$ indexes the $T$ time points $t = 1, 2, \ldots, T$. Let $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{iT})'$ denote the corresponding column vector of observations on the dependent variable for observation $i$, and $X_i$ indicate the $T \times k$ matrix of covariates for observation $i$.[10] Writing $E(Y_i) = \mu_i$, we define a function $h$ which specifies the relationship between $Y_i$ and $X_i$:

$$\mu_i = h(X_i \beta) \tag{5}$$

where $\beta$ is a $k \times 1$ vector of parameters, and the inverse of $h$ is known as the “link” function. Likewise, the variance $V_i$ of $Y_i$ is specified as a function $g$ of the mean. In the cross-sectional case (i.e., $T = 1$), we write:

$$V_i = \frac{g(\mu_i)}{\phi} \tag{6}$$

where $\phi$ is a scale parameter typically set to one for discrete distributions of $Y_i$ and to chi-squared divided by the residual degrees of freedom for continuous $Y_i$. The quasi-likelihood estimate of $\beta$ is then the solution to a set of $k$ “quasi-score” differential equations:

$$Q_k(\beta) = \sum_{i=1}^{N} \left(\frac{\partial \mu_i}{\partial \beta_k}\right)' V_i^{-1} (Y_i - \mu_i) = 0 \tag{7}$$

Note that $E[Q(\beta)] = 0$ and $\mathrm{Cov}[Q(\beta)] = (\partial \mu_i / \partial \beta_k)' V_i^{-1} (\partial \mu_i / \partial \beta_k)$. The function $Q(\beta)$ thus behaves like the derivative of a log-likelihood (i.e., a score function); estimation is typically via a generalized weighted least-squares approach.

[10] For notational simplicity, I assume here that $T_i = T_{i'}$ for all $i \neq i'$; i.e. that the “panels” are balanced. This need not, however, be the case for estimation of the models presented here; balanced panels are, however, necessary for likelihood-based “mixed parameter” models (Fitzmaurice et al. 1993).

For cases where $T > 1$, some provision must be made for dependence across $t$. Liang and Zeger’s (1986) solution was to specify a $T \times T$ matrix $R_i(\alpha)$ of the “working” correlations across $t$ for a given $Y_i$.
While $R_i(\alpha)$ can thus vary across observations, it is assumed to be fully specified by the vector of unknown parameters $\alpha$, which have a structure determined by the investigator and which are constant across observations. This correlation matrix then enters the variance term of equation (7):

$$V_i = \frac{A_i^{1/2} R_i(\alpha) A_i^{1/2}}{\phi} \tag{8}$$

where the $A_i$ are $T \times T$ diagonal variance matrices of $Y_i$ with $g(\mu_{it})$ as the $t$th diagonal element. Substitution of (8) into (7) yields the GEE estimator; from this discussion, it is clear that the GEE is an extension of the GLM approach, and that the former reduces to the latter when $T = 1$. Since the $V_i$’s are functions of both $\alpha$ and $\beta$, estimation typically is accomplished by an iterative procedure (e.g. iteratively reweighted least squares). We can reexpress (7) as a function of $\beta$ alone by substituting a consistent estimate of $\alpha$, and a consistent estimate of $\phi$, into (7). We then solve for $\beta$ and calculate standardized residuals, which are in turn used to consistently estimate $\alpha$ and $\phi$. These two steps are iterated until the estimates reach convergence.

The GEE model set out here has a number of attractive properties for applied researchers. Because the first two terms of (7) are not dependent on $Y_i$, the score equations converge to zero (and thus have consistent roots) so long as $E[Y_i - \mu_i] = 0$. Assuming that the link function $h^{-1}(\cdot)$ is correct, GEE estimates of $\beta$ will thus be consistent in $N$ (Liang and Zeger 1986).
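The two-step iteration described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the estimator used for the results in this paper: it assumes a logit link, an intercept plus a single covariate, an exchangeable working correlation with a scalar correlation `rho`, a scale parameter fixed at one, and simulated (hypothetical) data. The function names and the simulated parameter values are invented for the example, and the correlation is updated with a moment estimator based on cross-products of Pearson residuals, in the spirit of Liang and Zeger (1986).

```python
import math
import random


def expit(z):
    """Inverse logit link: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n_clusters=200, T=4, seed=1):
    """Hypothetical data: T binary outcomes per cluster sharing a common shock."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        u = rng.gauss(0.0, 1.0)                  # shared shock induces correlation
        xs = [rng.random() for _ in range(T)]    # one covariate per outcome
        ys = [1.0 if rng.random() < expit(-0.5 + x + u) else 0.0 for x in xs]
        data.append((xs, ys))
    return data


def score_and_info(data, beta, rho):
    """Quasi-score vector and information matrix under an exchangeable R."""
    U = [0.0, 0.0]
    H = [[0.0, 0.0], [0.0, 0.0]]
    for xs, ys in data:
        T = len(xs)
        mu = [expit(beta[0] + beta[1] * x) for x in xs]
        sd = [math.sqrt(m * (1.0 - m)) for m in mu]   # diagonal of A^(1/2)
        c = rho / (1.0 + (T - 1) * rho)               # closed-form inverse of R

        def v_inv(vec):
            """Apply V^(-1) = A^(-1/2) R^(-1) A^(-1/2) to a length-T vector."""
            z = [v / s for v, s in zip(vec, sd)]
            total = sum(z)
            w = [(zi - c * total) / (1.0 - rho) for zi in z]
            return [wi / s for wi, s in zip(w, sd)]

        resid = [y - m for y, m in zip(ys, mu)]
        # Columns of D = d(mu)/d(beta): intercept column, then covariate column.
        D = [[s * s for s in sd], [s * s * x for s, x in zip(sd, xs)]]
        Vr = v_inv(resid)
        VD = [v_inv(col) for col in D]
        for a in range(2):
            U[a] += sum(d * v for d, v in zip(D[a], Vr))
            for b in range(2):
                H[a][b] += sum(d * v for d, v in zip(D[a], VD[b]))
    return U, H


def estimate_rho(data, beta, p=2):
    """Moment estimator: mean cross-product of Pearson residuals (scale = 1)."""
    num, pairs = 0.0, 0
    for xs, ys in data:
        mu = [expit(beta[0] + beta[1] * x) for x in xs]
        e = [(y - m) / math.sqrt(m * (1.0 - m)) for y, m in zip(ys, mu)]
        for s in range(len(e)):
            for t in range(s + 1, len(e)):
                num += e[s] * e[t]
                pairs += 1
    return num / (pairs - p)


def fit_gee(data, n_iter=50):
    """Alternate a scoring step for beta with a moment update for rho."""
    beta, rho = [0.0, 0.0], 0.0
    for _ in range(n_iter):
        U, H = score_and_info(data, beta, rho)
        det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
        step0 = (H[1][1] * U[0] - H[0][1] * U[1]) / det
        step1 = (H[0][0] * U[1] - H[1][0] * U[0]) / det
        beta = [beta[0] + step0, beta[1] + step1]
        rho = estimate_rho(data, beta)
    return beta, rho


data = simulate()
beta_hat, rho_hat = fit_gee(data)
print(beta_hat, rho_hat)
```

Because the exchangeable matrix has a closed-form inverse, the sketch needs no matrix library; other working structures would require a general T×T solve in place of `v_inv`.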
Moreover, $N^{1/2}(\hat{\beta}_{GEE} - \beta)$ is asymptotically multivariate normal, and the covariance matrix of the estimates can be consistently estimated by the inverse of the Hessian, evaluated at the values of $\beta$ and $\alpha$ produced by (7).[11] Most important, the asymptotic consistency of the GEE estimates of $\beta$ holds even in the presence of misspecification of the “working” correlation structure $\alpha$; thus, the GEE offers the potential of providing asymptotically unbiased estimates of the regression parameters of primary interest even in cases where the nature of the time dependence is unknown.[12]

[11] Zeger and Liang (1986) note that in most cases it is possible to estimate $\beta$ and $\alpha$ without estimating $\phi$ directly, provided that the elements of the correlation matrix $R$ be multiples of the parameters $\alpha$.

[12] Note, however, that the consistency of the variance estimate for the GEE does depend on proper specification of the working correlation structure. In such cases, Liang and Zeger (1986) propose a “robust” estimate of the variance-covariance matrix of $\hat{\beta}_{GEE}$, analogous to that derived by White (1980), which is consistent even under misspecification of the correlation matrix. For reasons of brevity, I reserve treatment of robust standard errors in the GEE context for a future paper.

An additional advantage of the GEE approach is the broad range of options available for specifying time dependence. Fitzmaurice et al. (1993) discuss four common specifications of the “working” correlation matrix $R_i(\alpha)$ for observations $Y_{is}$ and $Y_{it}$:

1. $R_i(\alpha) = I$, a $T \times T$ identity matrix. This “working independence” assumption is equivalent to assuming no over-time correlation, and yields estimates equivalent to those from simple “pooled” models. No estimate of $\alpha$ is obtained.

2. $R_i(\alpha) = \rho$ if $t \neq s$. This is the “exchangeable” correlation structure; values of $Y_i$ are assumed to covary equally across all time points. This assumption corresponds to “random effects” models which estimate an over-time correlation parameter. In this specification, $\alpha$ is a scalar, which is estimated in the model.

3. $R_i(\alpha) = \rho^{|t-s|}$ if $t \neq s$, an autoregressive (AR(1)) specification. The within-observation correlation over time is an exponential function of the lag length; as is typically the case in AR models, we assume $|\rho| < 1.0$. In an autoregressive specification, $\alpha$ is a vector $(1, \rho, \rho^2, \ldots)'$ which is the same across observations.

4. $R_i(\alpha) = \alpha_{st}$, an unstructured or “pairwise” correlation structure. That is, no constraints are placed on the correlations across time points; instead, they are estimated without restriction from the data. $\alpha$ in this context is a $T \times T$ matrix containing the $T(T - 1)/2$ pairwise correlations for all possible combinations of time points.

In addition to these, a number of other specifications of the working correlation matrix are possible, including stationary and nonstationary models of varying orders. Alternatively, the researcher may specify $R_i(\alpha)$ explicitly; this option is especially valuable for testing the robustness of estimates to the correlation specification.

More recently, Zhao and Prentice (1991) have extended the GEE model to allow parameterization of the working correlation matrix as a function of a set of covariates, $\alpha = f(Z_{it})$, and to allow joint estimation of $\beta$ and $\alpha$. Provided that the specification of both the mean and the correlation structure are correct, this “GEE2” approach permits more efficient estimation of the parameters of interest, as well as allowing the possibility of substantial substantive insight into the determinants of over-time correlation.
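The four working correlation structures listed above are easy to write down explicitly. The following Python sketch builds each as a plain list of lists; the function names are hypothetical, and the correlation values passed in are illustrative inputs rather than estimates from any data.

```python
def independence(T):
    """Working independence: the T x T identity matrix."""
    return [[1.0 if s == t else 0.0 for t in range(T)] for s in range(T)]


def exchangeable(T, rho):
    """Exchangeable: a single correlation rho for every pair with s != t."""
    return [[1.0 if s == t else rho for t in range(T)] for s in range(T)]


def ar1(T, rho):
    """AR(1): correlation rho ** |t - s|, decaying with the lag length."""
    return [[rho ** abs(t - s) for t in range(T)] for s in range(T)]


def unstructured(T, alpha):
    """Unstructured: alpha maps each pair (s, t), s < t, to its own correlation."""
    R = independence(T)
    for (s, t), value in alpha.items():
        R[s][t] = value
        R[t][s] = value  # keep the matrix symmetric
    return R


# Illustrative values only.
for row in ar1(3, 0.5):
    print(row)
```

The unstructured case takes T(T - 1)/2 separate values, which is why it is the most flexible and the most demanding of the four specifications.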
The primary drawback of the GEE2 approach is that, because of the simultaneous estimation process, consistent estimates of $\beta$ depend on proper specification of the correlation matrix; thus, the GEE2 estimator lacks the robustness of the GEE in estimating the regression parameters in the presence of time dependence of unknown form.

The potential advantages of the GEE approach for estimating individual-level models of judicial decision making are several.[13] First, these models allow for explicit incorporation and modeling of interdependencies across cases and justices through specification of the working correlation matrix. At the same time, the parameter estimates obtained through application of these models are robust to misspecification of those interdependencies, an important trait since our knowledge of the nature of those relationships is imperfect at best. Moreover, these models are capable of handling unbalanced data (for example, that due to recusals), and are relatively robust in the presence of such missing data.[14]

[13] For two recent applications of GEE models to judicial politics, see Caldeira et al. (1996, 1997).

[14] A thorough treatment of missing data in these and similar models can be found in Fitzmaurice et al. (1993).

In the following two sections I examine GEE models of judicial behavior, comparing empirical estimates of factors in judicial decision making obtained from GEE models with those of more commonly-used models for pooled data. In Section 4, I consider a specifically “attitudinal” model, focused on the impact of judicial ideology on decision making in civil rights and liberties cases. There, I focus on the issue of interjustice influence, showing how GEE models may be used to both control for and estimate these influences within a model of decision making. Section 5 turns to an “integrated” model of decision making in habeas corpus decisions; this model is similar to a wide range of such studies in the recent judicial behavior literature.
There the emphasis is on the impact of consistency: that is, justice-specific, over-time dependence in judicial decisions.

4. Interjustice Influence and Judicial Decision Making

I begin the empirical part of the paper with an analysis of influence relationships among justices of the Supreme Court. The data considered are the civil rights and liberties decisions of the seventh Burger Court (OT1981-1985); data are all such cases decided by the Court between the appointment of Justice O’Connor and the retirement of Chief Justice Burger, and are drawn from Harold Spaeth’s Supreme Court Judicial Database (Spaeth 1997). Excluding missing data, the number of such cases is 767; with an average of 8.57 votes per case, this yields a total of 6572 individual votes for analysis. The dependent variable is the justice’s vote in civil rights and liberties decisions, coded 0 for a liberal outcome and 1 for a conservative one.

To facilitate model comparisons, I examine a simple, pure attitudinal model of Supreme Court decision making (e.g. Segal and Spaeth 1993). The model posits that justices’ votes are determined by their political ideologies. I therefore include as the sole independent variable modified Segal/Cover (1989) measures of judicial ideology (Epstein and Mershon 1996). These range from 0.0 (for the most conservative justices) to 1.0 (for the most liberal), with the expectation that this measure will exhibit a significant negative impact on the justices’ votes. Model comparison results are presented in Table 1.

For comparison purposes, I begin the analysis by examining a simple pooled model, where votes are assumed to be independent across both justices and cases. Comparison is facilitated by the fact that this pooled model is equivalent to a GEE specification with an independent working correlation structure.
Consistent with expectations, judicial ideology has a substantively large and statistically significant effect on voting, and the model as a whole is a considerable improvement over the null model. The estimated probability of a conservative vote by the most conservative justice is 0.83; this value declines to 0.29 for the most liberal members of the Court. This simple model predicts 74.22 percent of the votes correctly, compared to the null of 63.04 percent, for a reduction in error of 30.3 percent.

As noted above, a simple way of controlling for justice-specific heterogeneity is to allow for each justice to have a separate intercept term. This amounts to estimating a fixed-effects model, with separate dummy variables for each justice save one; unconditional fixed-effects probit is an asymptotically consistent estimator in T (Hsiao 1996). Results of estimating this fixed-effects specification are presented in column two of Table 1. Chief Justice Burger was excluded from the analysis and is thus the category for comparison. Moreover, because Justices Marshall and Brennan, and Justices Burger and Blackmun, have identical ideology scores, their fixed effects cannot be separated; thus, these pairs of justices “share” the same intercept.

The fixed-effects model also exhibits a substantial improvement over the null; moreover, a likelihood-ratio test indicates that it is also a substantial improvement over the pooled model presented in the first column (χ²(7) = 412.42, p < .001). Again, the impact of ideology is large and statistically significant; assuming the fixed effects to be zero, the probability of the most liberal justice voting conservatively is 0.23, a value which increases to 0.88 for the most conservative member of the Court.
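The predicted probabilities, the reduction in error, and the likelihood-ratio statistic quoted above can be checked directly against Table 1. The following is a minimal sketch, not the code used for the paper: it assumes the standard probit link, takes the coefficients and log-likelihoods as reported in Table 1, and uses hypothetical function names chosen only for illustration.

```python
import math


def probit_prob(constant, slope, ideology):
    """P(conservative vote) under a probit model: Phi(constant + slope * ideology)."""
    z = constant + slope * ideology
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# (constant, ideology coefficient) as reported in Table 1.
MODELS = {
    "pooled": (0.949, -1.511),
    "fixed_effects": (1.163, -1.916),  # fixed effects set to zero, as in the text
}


def endpoint_probs(name):
    """Probabilities for the most conservative (0.0) and most liberal (1.0) justice."""
    constant, slope = MODELS[name]
    return probit_prob(constant, slope, 0.0), probit_prob(constant, slope, 1.0)


def reduction_in_error(pct_correct, pct_null):
    """Proportional reduction in error, in percent."""
    return 100.0 * (pct_correct - pct_null) / (100.0 - pct_null)


def lr_statistic(lnl_restricted, lnl_unrestricted):
    """Likelihood-ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (lnl_unrestricted - lnl_restricted)


if __name__ == "__main__":
    for name in MODELS:
        conservative, liberal = endpoint_probs(name)
        print(f"{name}: {conservative:.2f} (most conservative), {liberal:.2f} (most liberal)")
    print(f"reduction in error: {reduction_in_error(74.22, 63.04):.1f} percent")
    print(f"LR statistic: {lr_statistic(-3814.85, -3608.64):.2f}")
```

Run as a script, this reproduces the endpoint probabilities and the likelihood-ratio statistic given in the text, and a reduction in error of about 30 percent, which agrees with the reported figure up to rounding of the inputs.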
Five of the seven estimated fixed effects (not shown) are also statistically significant, with Justices Blackmun, Powell and Stevens significantly more likely to cast a liberal vote and White and O’Connor more likely to cast a conservative one. At the same time, the fixed-effects model fails to improve on the predictive ability of the pooled model; this is due to the fact that this particular fixed-effects model lacks explanatory variables which vary over cases.

Both the pooled and fixed-effects models are unable to capture any interjustice dynamics. Accordingly, I next turn to two GEE models where cross-justice influence may be estimated. Treating each case as a single observation, and each justice’s vote as a separate realization on that observation, allows the GEE specification to account for within-case correlations among the justices. The first such model uses the exchangeable correlation structure; that is, it assumes that the correlation between justices is equal for all pairs of justices, and estimates a single scalar value for that correlation. The second model allows for an unstructured correlation structure; pairwise correlations between justices are allowed to vary, and are estimated simultaneously with, and conditional on, the other model parameters. These results are presented in columns three and four of Table 1.

In comparing the GEE models to their more commonly used counterparts, two differences are immediately apparent. First, the overall values of the coefficients are generally smaller; this is particularly true for the pairwise model, where both the intercept and the coefficient for the ideology variable are between twenty and thirty percent smaller than the other model estimates. This is due to two factors. First, as lucidly demonstrated by Neuhaus et al. (1991), estimated covariate effects for population-averaged models such as the GEE will generally be smaller than those estimated by cluster-specific approaches.
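The attenuation of population-averaged coefficients can be illustrated in one special case where it has a closed form. The sketch below is an illustration under stated assumptions rather than the general result of Neuhaus et al. (1991): it assumes a probit model with a normally distributed cluster-specific intercept of standard deviation sigma, for which averaging over the intercept shrinks the linear predictor by a factor of 1/sqrt(1 + sigma^2). The function names and the numerical values of the linear predictor and sigma are hypothetical.

```python
import math


def phi_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def marginal_prob(linear_predictor, sigma, grid=4001, width=8.0):
    """Average Phi(linear_predictor + sigma * u) over u ~ N(0, 1) by the trapezoid rule."""
    step = 2.0 * width / (grid - 1)
    total = 0.0
    for k in range(grid):
        u = -width + k * step
        weight = 0.5 if k in (0, grid - 1) else 1.0
        total += weight * phi_cdf(linear_predictor + sigma * u) * normal_pdf(u)
    return total * step


def attenuated(value_cluster_specific, sigma):
    """Population-averaged counterpart of a cluster-specific probit quantity."""
    return value_cluster_specific / math.sqrt(1.0 + sigma ** 2)


if __name__ == "__main__":
    xb, sigma = 1.0, 1.0  # hypothetical values, for illustration only
    print(marginal_prob(xb, sigma))         # numerical average over clusters
    print(phi_cdf(attenuated(xb, sigma)))   # closed form with the shrunken predictor
```

The two printed values agree to numerical precision, and the shrinkage factor is below one whenever sigma is positive, which is the sense in which population-averaged effects are smaller than cluster-specific ones in this setting.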
Second, recall that the GEE approach accounts for the effect of interjustice influence on decision making. To the extent that that influence occurs between justices of similar ideological dispositions, the estimated coefficients on ideology in the GEE will be attenuated relative to those in models where this effect is not estimated. Put differently, because both factors have the same effect on voting, simple pooled and fixed-effects models conflate influence among ideologically similar justices with the influence of that ideology itself; GEE models are better able to separate those effects.

The result of this decrease in the parameter estimates is a reduced impact on the predicted probabilities generated by the GEE models. These differences are illustrated graphically in Figure 1. While the change in probability associated with moving from most conservative to most liberal is 0.51 for the exchangeable GEE, that difference is a reduced 0.29 in the model estimated with a pairwise correlation structure.

A second noticeable difference between the GEE and previous models is the reduced predictive ability of the two models. Both models predict 62.26 percent of the cases correctly, a decrease in predictive power relative to even the null model. This lack of predictive ability arises from the inability of the models to accurately categorize the predictions across both types of outcomes, rather than from any systematic over- or underprediction of a single category. The marginal percentages for liberal and conservative votes are 37.0 and 63.0 percent, respectively; the two cluster-specific models predict 21.2 and 78.8 percent in each respective marginal category, while the GEE models come closer to the true marginals with 43.9 and 56.1 percent.

The GEE models also provide estimates of the extent of interjustice influence, conditional on the estimated parameter values.
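As a rough illustration of the working correlation structures compared in this section, the sketch below builds an exchangeable matrix for a nine-member court and counts the correlation parameters each structure leaves free. The function names are hypothetical, and the value 0.484 is simply the exchangeable estimate reported in Table 1, used here as an input rather than estimated.

```python
def exchangeable_corr(n_members, rho):
    """Working correlation matrix with one common off-diagonal correlation."""
    return [
        [1.0 if i == j else rho for j in range(n_members)]
        for i in range(n_members)
    ]


def free_correlations(n_members, structure):
    """Number of correlation parameters estimated under each structure."""
    if structure == "independent":
        return 0
    if structure == "exchangeable":
        return 1
    if structure == "unstructured":
        return n_members * (n_members - 1) // 2
    raise ValueError(f"unknown structure: {structure}")


if __name__ == "__main__":
    matrix = exchangeable_corr(9, 0.484)
    print(matrix[0])
    for structure in ("independent", "exchangeable", "unstructured"):
        print(structure, free_correlations(9, structure))
```

With nine justices per case, the unstructured (pairwise) specification estimates one correlation for each of the 36 distinct pairs, whereas the exchangeable specification summarizes all of them with a single value.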
In the exchangeable specification, the estimated rho is 0.484; this can be interpreted as the average (in the sense of “typical”) interjustice correlation in voting, conditional on the effect of ideology. Thus, justices appear to exercise some nontrivial degree of influence on each other’s decisions. Because the model is relatively simple, however, it is likely that a good deal of this correlation is due to institutional, contextual, and case-specific factors which are excluded from the model; one would expect this average correlation to be lower in a more fully specified model of judicial behavior.

In the model with unstructured, pairwise correlations, the estimated interjustice effects are presented with Table 1. The results of this estimation are largely consistent with the conventional wisdom regarding patterns of influence relationships on the Burger Court. Thus, Justices Marshall and Brennan exhibit high, positive correlations, as do the conservative bloc of Rehnquist, O’Connor, White and Burger. We also find high correlations among the Court’s moderates: Justices Blackmun, Stevens, and also Powell. Perhaps most surprising is that only one correlation, that between Justices Marshall and Rehnquist, is negative, and that only slightly so. This again can be attributed to the fact that the estimates are conditional on the model parameters, which in turn points to the impact of model specification: presumably, a good deal of the interjustice correlation is due to case facts and other variables which render the justices’ decisions more highly correlated than they would be were those factors explicit in the model.

The results of this initial model comparison are suggestive: GEE approaches offer at least the potential for greater understanding of judicial behavior, and in particular for extracting additional information out of individual-level data on judicial decision making.
The extent to which this potential is realized, however, depends critically on, among other things, correct model specification. Particularly in the GEE context, where the estimates of this influence are conditional on the other model estimates, complete and proper model specification is critical to obtaining useful estimates of the auxiliary parameters. To assess the usefulness of the GEE approach in a more complete model, as well as to examine its utility in dealing with temporal heterogeneity, I turn to a second application in judicial behavior.

5. Decision Making Models with Individual Heterogeneity

As noted above, the importance of precedent in Supreme Court decision making has been the subject of considerable dispute. What has not been recognized is that the debate is of both substantive and methodological importance; to the extent that precedent implies consistency in consecutive votes of the justices or decisions of the Court, it suggests a particular form of longitudinal dependence. That dependence must in turn be considered if we are accurately to model judicial behavior. In this section of the paper I conceptualize consistency and precedent at the individual level: that is, the extent to which a justice’s voting behavior is consistent with his or her previous decisions in similar cases.

The idea of consistency, particularly as it applies to precedent, must necessarily be evaluated within a series of similar cases. Here I embed my examination of precedent in a more general “integrated” model of judicial decision making in the area of habeas corpus. Petitions for habeas corpus “challenge the constitutionality of a person’s detention and request release” (Flango 1994, 1); these claims are thus premised on the assertion that a constitutional violation took place prior to or during the defendant’s trial, conviction or sentencing.
I posit a model of Supreme Court decision making in habeas corpus cases, based on both legal and attitudinal factors.[15] Legal factors center around the constitutional basis for the claim, and include those based on ineffective counsel and trial court error, as well as the Fourth, Fifth, Sixth and Eighth Amendments; general Due Process claims relating to the Fourteenth Amendment are excluded, and constitute the baseline category for comparison. Extralegal and attitudinal considerations include the presence of the U.S. as a party to the case, the presence of a per curiam decision, the presence of multiple (i.e., second and subsequent) petitions for relief, and the ideological disposition of the justice casting the vote, captured by rescaled Segal/Cover scores (Epstein and Mershon 1996). All other variables are coded 0 in the absence of the indicated characteristic and 1 for cases where it is present.

[15] For a full exposition of the model, see Naft and Zorn (1998).

The data consist of all habeas corpus decisions handed down by the Warren, Burger and Rehnquist Courts during the 1953-1995 Terms. There were 109 such cases; disaggregating by individual justice’s votes and excluding missing data, this leaves 961 observations on which to base our analysis. The dependent variable is the reported vote on the merits, coded 0 if the vote was in favor of the habeas corpus claimant and 1 if the justice voted against the claimant. In contrast to Section 4, here I treat the justice as the primary unit of analysis (i.e., the “i”), and consider cases as repeated observations on the decisions of the justices. In the context of the GEE approach, this allows me to estimate the within-justice correlation across different cases, controlling for the effects of the (generally case-specific) independent variables.

Table 2 presents results for five separate models of the vote in habeas corpus cases.
For purposes of comparison, I again begin by estimating a “pooled” probit model of justices’ votes, treating all such votes as conditionally independent; as noted, this is exactly equivalent to the GEE with an assumed “independent” correlation structure. Despite the possibility of correlation across cases, Liang and Zeger (1986) show that this pooled model will produce estimates which are consistent and asymptotically normal. Because it fails to account for interdependence over time, however, the estimates of the standard errors will generally be inconsistent: standard errors of the fixed-over-time covariates will tend to be underestimated, while those of time-varying covariates will be overestimated.

Results of the pooled probit model indicate that only one of the legal variables exerts a substantial impact on justices’ votes: claims based on an assertion of ineffective representation at the trial level are significantly more likely to receive a vote favorable to the claimant than the baseline case. In contrast, all of the attitudinal factors save one attain statistical significance. Both the presence of the U.S. as a party and multiple petitions decrease the probability of a pro-claimant vote. Justice ideology is the largest influence on the vote: liberal justices are very much more inclined to cast a pro-claimant vote than are their conservative counterparts. Finally, the model does a fairly good job of predicting the votes of the justices, resulting in a nearly 30 percent reduction in error from a null model.

For comparison purposes, I also present the results of estimating the same equation with two additional, commonly used models for dealing with time-series cross-sectional data. As noted previously, the fixed-effects model addresses justice-specific heterogeneity by estimating separate intercepts for each justice.
The results for the fixed-effects model are generally similar to those of the pooled model: most variables exhibit slightly greater effects on the vote once individual-justice variations are controlled for, while standard errors also generally increase as a result of the fewer degrees of freedom. A likelihood-ratio test indicates that inclusion of the fixed effects improves the model significantly (χ²(22) = 129.55, p < .01), and its predictive power is also substantially better than the pooled model alone. On the other hand, the results also highlight a potential weakness of the fixed-effects specification: in the absence of variables which vary across observations, fixed effects cannot distinguish individual effects. Thus, as with the previous analysis, the fixed-effects model treats justices with the same ideology score as equivalent.

Column three of Table 2 presents estimates from a random effects model, where justice-specific effects follow a specific random distribution and are uncorrelated with the independent variables (Butler and Moffitt 1982). Estimation is accomplished through Gauss-Hermite quadrature with six support points; results were not generally sensitive to variations in the number of points used. The findings are broadly similar to those of the fixed-effects estimator, with similar coefficients and nearly identical standard errors. The exceptions are the variable for justice ideology, which is estimated more precisely than in the fixed-effects case, and predictive fit, where the model performs slightly worse than the simple pooled model. The significant, positive estimate of rho indicates that we may confidently reject the null hypothesis that the disturbances “within” a single justice are uncorrelated.

None of the models presented thus far grapples with the main question of interest: the issue of time dependence due to the impact of judicial consistency and legal precedent.
The finding in the random effects specification that errors for a particular justice are correlated goes to the general issue of independence in the justices’ votes, but tells us little about the form that consistency takes. Accordingly, I specify two alternative models which explicitly model the impact of a justice’s previous votes on his or her current vote. The first is a simple pooled probit model, with the inclusion of a variable for the justice’s vote in the most recent habeas corpus case he or she decided. This model is far from perfect; substantively, it does not differentiate among claims involving different areas of the law, while statistically it suffers from all the drawbacks associated with dynamic BTSCS specifications outlined in Beck and Tucker (1997). Nonetheless, the model has intuitive appeal, and will provide a point of comparison for the GEE specification.

Results of a dynamic pooled probit specification appear in column four of Table 2. The first case for each justice was dropped because of missing data on the previous decision variable; estimates are therefore based on 916 observations. The findings are broadly similar to previous models, both in terms of the coefficients and their standard errors. Likewise, the addition of the dynamic term increases the predictive ability of the model over the earlier pooled model by an appreciable amount. Most important, we see a significant, positive impact of the prior decision on the justices’ voting behavior; holding other variables constant at their means, a previously conservative vote increases the probability of a conservative vote in the present case by 0.24.[16] By this initial estimate, then, evidence of the impact of consistency (in the form of prior votes) in the area of habeas corpus decisions is apparent.

Finally, I estimate a GEE model of decision making on the same data.
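The previous-vote variable used in the dynamic pooled probit described above can be constructed along the following lines. This is a minimal sketch under assumed data conventions, not the procedure used for the paper: the record keys "justice", "order" and "vote", the name "prev_vote", and the sample rows are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records: one row per justice-case vote, with 'order'
# giving the chronological position of the case.
SAMPLE = [
    {"justice": "A", "order": 1, "vote": 1},
    {"justice": "A", "order": 2, "vote": 0},
    {"justice": "A", "order": 3, "vote": 0},
    {"justice": "B", "order": 2, "vote": 1},
    {"justice": "B", "order": 3, "vote": 1},
]


def add_lagged_vote(records):
    """Attach each justice's vote in his or her preceding case as 'prev_vote'.

    A justice's first case has no preceding vote and is dropped.
    """
    by_justice = defaultdict(list)
    for record in records:
        by_justice[record["justice"]].append(record)

    rows = []
    for justice_records in by_justice.values():
        ordered = sorted(justice_records, key=lambda r: r["order"])
        for previous, current in zip(ordered, ordered[1:]):
            row = dict(current)
            row["prev_vote"] = previous["vote"]
            rows.append(row)
    return rows


if __name__ == "__main__":
    for row in add_lagged_vote(SAMPLE):
        print(row)
```

Because each justice loses exactly one observation, the estimation sample shrinks by the number of distinct justices, which is why the dynamic specification is based on fewer observations than the other models.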
As suggested above, we expect that justices’ decisions in older cases will have less impact on current decisions than do more recent ones. With this in mind, and in the interest of parsimony, I specify the working within-observation correlation matrix to follow an AR(1) format: for a particular justice, the correlation between two decisions at times t and s (t>s) is estimated as ρ^|t-s|, and is the same across justices. These results are presented in column five of Table 2.

[16] Including an interaction of this variable with justice ideology resulted in a negative and insignificant coefficient (p = .17, two-tailed) and no substantial changes in the remaining coefficients. Previous votes do not, at least in this preliminary investigation, appear to vary systematically in their impact on voting across justices with differing ideological positions.

Coefficient estimates for the GEE specification are for the most part similar to those in previous models, generally tracking most closely with those derived from the simple pooled estimator. The variable for Eighth Amendment claims is marginally significant (p<.05, one-tailed) in the GEE model, providing weak confirmation for the idea that such claims tend to be “last gasp” efforts of litigants with no other more hopeful avenue for an appeal. Standard errors for the coefficients are slightly smaller than for previous models, with the exception of the variable for ideology; this fact conforms with Liang and Zeger’s (1986) general statement that cluster-specific pooled models overestimate the variability in time-varying variables (here, all variables save ideology) while underestimating the variance of time-stationary variables.
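The AR(1) working correlation structure specified above has a simple form that can be written out directly. The sketch below is illustrative only: the function name is hypothetical, and the value of rho is an arbitrary illustrative number rather than the estimate from Table 2.

```python
def ar1_corr(n_obs, rho):
    """AR(1) working correlation: entry (t, s) equals rho ** |t - s|."""
    return [[rho ** abs(t - s) for s in range(n_obs)] for t in range(n_obs)]


if __name__ == "__main__":
    # rho = 0.5 is an arbitrary illustrative value, not an estimate.
    for row in ar1_corr(4, 0.5):
        print(row)
```

A single parameter governs the whole matrix, and the implied correlation between a justice's votes decays geometrically with the number of intervening cases, which is what makes the structure parsimonious.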
The empirical estimate of ρ is close to the coefficient estimated in the dynamic model;[17] successive cases correlate conditionally at 0.34, those two cases apart at 0.16, and so on, suggesting that the impact of a particular habeas corpus vote on similar votes in the future has a relatively short memory, and dies out quickly as new cases come along. It is especially important to note that this effect is conditional on the estimated values of the coefficients, and vice-versa. Thus, to the extent that individual-level consistency in voting due to ideological stability is controlled for in the model, the significant, positive estimate of rho measures the average overall degree of consistency stemming from other factors. While precedent may be only one of these potential factors, these results are at least suggestive that justices may place some emphasis on stare decisis in making their decisions.

[17] As noted above, no estimated standard errors may be obtained for this parameter.

Finally, as was the case in the more parsimonious model investigated in Section 4, the GEE consistently underperforms its more commonly used counterparts in predicting voting outcomes. Unlike the previous analysis, however, the model does represent a real improvement over the null, and the extent of difference between the GEE and the others is of smaller magnitude. This suggests that model specification plays a crucial part in the predictive ability of estimates obtained via GEE estimation. This is not surprising; as noted before, the estimates of the within-observation correlations in the GEE are conditional on the parameter estimates. Thus, to the extent that the model is well specified, the auxiliary parameters will also be better estimated. Intuitively, careful model specification allows the GEE estimator to more accurately parse out the effects on the dependent variable between the coefficients on the independent variable(s) and the parameters of the working correlation matrix.

6. Conclusion

Judges, and especially justices of the U.S. Supreme Court, do not make their decisions in a vacuum. Over and above the wide range of factors investigators typically include in their models of judicial behavior, a host of other influences lead one to the inescapable conclusion that votes, judgments, and other decisions on collegial courts are far from independent. That this fact is clearly implied by much of what we know about what courts do is unsurprising; more surprising is the near-absence of any concerted efforts at accounting for, let alone modeling, the nature of that dependence.

This paper represents an initial step in that direction. Here I suggest that a class of generalized estimating equation models developed for investigating panel data in medical research can be applied to address these issues. In particular, the ability of GEE models to provide empirical estimates of conditional cross-observation correlations within a specific unit of observation can and should be exploited by researchers in judicial politics to address these questions of decision heterogeneity.

From the two analyses presented, the potential usefulness and limitations of the GEE model for studying judicial behavior begin to become apparent. By explicitly incorporating substantively driven assumptions about over-time and cross-justice dependency in justices’ decision making, the GEE model makes more complete, theoretically informed use of the information contained in the repeated “observations” that constitute justices’ decisions in a sequence of cases. This added information offers the possibility of both more accurate and precise estimation of the variability of our coefficient estimates, as well as a data-based, empirical measure of the magnitude of such factors’ impacts on justices’ decisions.
Moreover, the GEE’s ability to provide robust, consistent estimates in the face of uncertainty over the proper specification of the working correlation matrix suggests that these models may be particularly useful when the nature of the dependency under investigation is unknown.

At the same time, my results also point out limitations of the GEE approach. For example, GEE models did not, as a rule, offer the same level of predictive success as their alternatives examined here, though the differences were attenuated by more fully specified models. This does suggest, however, the importance of model specification in using the GEE approach. In particular, since the estimates of the “nuisance” correlation parameters are conditional on the estimated coefficients for the independent variables, problems with specification will also potentially impact upon the conclusions one draws about intra-observation dependence. This means that, even more than is usually the case, researchers must be confident in their model specification before drawing any hard conclusions from the GEE’s auxiliary parameters.

A related issue stems from the GEE’s inability to account for heterogeneity along more than one “dimension” at a time. In the analyses presented here, for example, I focus in turn on interjustice influence and temporal dependence, but do not address the two issues simultaneously. While estimating both temporal and spatial effects is beyond the scope of the GEE, the model remains a useful tool. For one, it remains an improvement over other “pooled” models by allowing greater exploitation of information along one of those axes. Moreover, to the extent that heterogeneity in the observations remains due to the impact of the other, the GEE allows for the calculation of “robust” (heteroscedasticity-consistent, e.g. White 1980) standard errors for the model estimates.
Thus the researcher may decide which specific problem he or she wishes to evaluate explicitly in the GEE context, then employ robust standard errors to mitigate the effects of heterogeneity due to the remaining source(s) of variability.

With these caveats, the GEE clearly offers the potential for expanded inquiry into the area of judicial decision making. Moreover, the method shows promise for application in other areas of political science as well. For example, in the field of international relations, dyadic conflict data often exhibit many of the same general varieties of heterogeneity as the data discussed here, and other researchers have already begun to investigate the applicability of the GEE in that context (e.g. Beck and Katz 1997). More broadly, as the availability of time-series cross-section data in political science increases, GEE methods offer an attractive alternative for dealing with issues of dependence in data analysis.

Bibliography

Altfield, Michael F. and Harold J. Spaeth. 1984. Measuring Influence on the U.S. Supreme Court. Jurimetrics Journal 24(2):236-247.
Barnhart, Huiman X. and John M. Williamson. 1997. Goodness-of-fit Tests for GEE Modeling with Binary Responses. Biometrika forthcoming.
Baum, Lawrence. 1986. American Courts. Boston: Houghton-Mifflin.
Baum, Lawrence. 1988. Measuring Policy Change on the U.S. Supreme Court. American Political Science Review 82(September):905-912.
Baum, Lawrence. 1992. Membership Change and Collective Voting Change in the United States Supreme Court. Journal of Politics 54(February):3-24.
Baum, Lawrence. 1995. The Supreme Court. Fifth Edition. Washington, D.C.: Congressional Quarterly.
Beck, Nathaniel. 1996. Reporting Heteroskedasticity Consistent Standard Errors. The Political Methodologist 7(2):4-6.
Beck, Nathaniel and Jonathan Katz. 1997. The Analysis of Binary Time-Series Cross-Section Data and/or The Democratic Peace. Paper presented at the Annual Meeting of the Political Methodology Society, Columbus, OH.
Beck, Nathaniel and Richard Tucker. 1996. Conflicts in Time and Space. Paper presented at the Annual Meeting of the American Political Science Association, San Francisco, CA.
Boucher, Robert L., Jr., and Jeffrey A. Segal. 1995. Supreme Court Justices as Strategic Decision-Makers: Aggressive Grants and Defensive Denials on the Vinson Court. Journal of Politics 57(August):812-823.
Brace, Paul R. and Melinda Gann Hall. 1993. Integrated Models of Judicial Dissent. Journal of Politics 55(November):914-935.
Brace, Paul R. and Melinda Gann Hall. 1997. The Interplay of Preferences, Case Facts, Context, and Rules in the Politics of Judicial Choice. Journal of Politics 59(November):1206-31.
Brenner, Saul and Marc Steir. 1996. Retesting Segal and Spaeth’s Stare Decisis Model. American Journal of Political Science 40(November):1036-48.
Brisbin, Richard A. 1996. Slaying the Dragon: Segal, Spaeth and the Function of Law in Supreme Court Decision Making. American Journal of Political Science 40(November):1004-17.
Butler, J. S. and Robert Moffitt. 1982. A Computationally Efficient Quadrature Procedure for the One-Factor Multinomial Probit Model. Econometrica 50(May):761-64.
Caldeira, Gregory A. and John R. Wright. 1996. Nine Little Law Firms: Agenda-Building on the Supreme Court. Manuscript: Ohio State University.
Caldeira, Gregory A., John R. Wright and Christopher J. W. Zorn. 1996. Strategic Voting and Gatekeeping in the Supreme Court. Paper presented at the First Annual Conference on the Scientific Study of Law and Courts, St. Louis, MO.
Caldeira, Gregory A., John R. Wright and Christopher J. W. Zorn. 1997. Sophisticated Judicial Behavior: Agenda-Setting Via the Discuss List. Paper presented at the Annual Meeting of the American Political Science Association, Washington, D.C.
Epstein, Lee, and Carol Mershon. 1996. Measuring Political Preferences. American Journal of Political Science 40(February):261-94.
Fitzmaurice, Garrett M., Nan M. Laird, and Andrea G. Rotnitzky. 1993. Regression Models for Discrete Longitudinal Responses. Statistical Science 8(3):284-309.
Flango, Victor E. 1994. Habeas Corpus in State and Federal Courts. Williamsburg, VA: National Center for State Courts.
Flemming, Roy B. and B. Dan Wood. 1997. The Public and the Supreme Court: Individual Justice Responsiveness to American Policy Moods. American Journal of Political Science 41(April):468-98.
Gibson, James L. 1983. From Simplicity to Complexity: The Development of Theory in the Study of Judicial Behavior. Political Behavior 5(1):7-49.
Hall, Melinda Gann. 1992. Electoral Politics and Strategic Voting in State Supreme Courts. Journal of Politics 54(May):427-46.
Heyde, Christopher C. 1997. Quasi-Likelihood And Its Application: A General Approach to Optimal Parameter Estimation. New York: Springer-Verlag.
Hopkins, W. Wat. 1991. Mr. Justice Brennan and Freedom of Expression. New York: Praeger.
Hsiao, Cheng. 1996. Logit and Probit Models. In László Mátyás and Patrick Sevestre, eds. The Econometrics of Panel Data: A Handbook of the Theory with Applications. 2nd Revised Ed. Dordrecht: Kluwer Academic Publishers.
King, Gary. 1989. Event Count Models for International Relations: Generalizations and Applications. International Studies Quarterly 33:123-47.
Knight, Jack and Lee J. Epstein. 1996. The Norm of Stare Decisis. American Journal of Political Science 40(November):1018-35.
Liang, Kung-Yee and Scott L. Zeger. 1986. Longitudinal Data Analysis Using Generalized Linear Models. Biometrika 73(1):13-22.
Maltzman, Forrest and Paul J. Wahlbeck. 1996a. May it Please the Chief? Opinion Assignments in the Rehnquist Court. American Journal of Political Science 40(May):421-43.
Maltzman, Forrest and Paul J. Wahlbeck. 1996b. Strategic Policy Considerations and Voting Fluidity on the Burger Court. American Political Science Review 90(September):581-92.
Mason, Alpheus T. 1956. Harlan Fiske Stone, Pillar of the Law. New York: Viking.
Mason, Alpheus T. 1964. William Howard Taft: Chief Justice. New York: Simon and Schuster.
McCullagh, P. and J. A. Nelder. 1983. Quasi-Likelihood Functions. Annals of Statistics 11(1):59-67.
McCullagh, P. and J. A. Nelder. 1989. Generalized Linear Models. 2nd Ed. London: Chapman and Hall.
Murphy, Walter F. 1964. Elements of Judicial Strategy. Chicago: University of Chicago Press.
Naft, Erik and Christopher J. W. Zorn. 1998. The Supreme Court and Habeas Corpus: An Integrated Model. Manuscript: Emory University.
Nelder, J. A. and R. W. M. Wedderburn. 1972. Generalized Linear Models. Journal of the Royal Statistical Society B 54(1):3-40.
Neuhaus, J. M., J. D. Kalbfleisch and W. W. Hauck. 1991. A Comparison of Cluster-Specific and Population-Averaged Approaches for Analyzing Correlated Binary Data. International Statistical Review 59(1):25-35.
Pendergast, J. F., S. J. Gange, M. A. Newton, M. P. Lindstrom and M. R. Fisher. 1996. A Survey of Methods for Analyzing Clustered Binary Response Data. International Statistical Review 64(1):89-118.
Prentice, Ross L. and L. P. Zhao. 1991. Estimating Equations for Parameters in Mean and Covariances of Multivariate Discrete and Continuous Responses. Biometrics 47:825-839.
Rohde, David W. and Harold J. Spaeth. 1976. Supreme Court Decision Making. San Francisco: W. H. Freeman and Co.
Segal, Jeffrey A. 1986. Supreme Court Justices as Human Decision Makers: An Individual Level Analysis of the Search and Seizure Cases. Journal of Politics 47(November):938.
Segal, Jeffrey A., and Albert J. Cover. 1989. Ideological Values and the Votes of U. S. Supreme Court Justices. American Political Science Review 83(June):557-566.
Segal, Jeffrey A., and Harold J. Spaeth. 1993. The Supreme Court and the Attitudinal Model. New York: Cambridge University Press.
Segal, Jeffrey A. and Harold J. Spaeth. 1996. The Influence of Stare Decisis on the Votes of United States Supreme Court Justices. American Journal of Political Science 40(November):971-1003.
Segal, Jeffrey A., Lee Epstein, Charles M. Cameron, and Harold J. Spaeth. 1995. Ideological Values and the Votes of U. S. Supreme Court Justices Revisited. Journal of Politics 57(August):812-823.
Songer, Donald R. and Sue Davis. 1990. The Impact of Party and Region on Voting Decisions in the United States Courts of Appeals, 1955-1986. Western Political Quarterly 42(June):317-334.
Songer, Donald R. and Susan Haire. 1992. Integrating Alternative Approaches to the Study of Judicial Voting: Obscenity Cases in the U.S. Courts of Appeals. American Journal of Political Science 36(November):963-82.
Songer, Donald R. and Stefanie A. Lindquist. 1996. Not the Whole Story: The Impact of Justices’ Values on Supreme Court Decision Making. American Journal of Political Science 40(November):1049-63.
Spaeth, Harold J. 1997. The United States Supreme Court Judicial Database, 1953-1995. Ann Arbor: Inter-University Consortium for Political and Social Research.
Spaeth, Harold J. and Michael F. Altfield. 1985. Influence Relationships Within the Supreme Court: A Comparison of the Warren and Burger Courts. Western Political Quarterly 37(March):70-83.
Spiller, Pablo T., and Matthew L. Spitzer. 1995. Where Is the Sin in Sincere? Sophisticated Manipulation of Sincere Judicial Voters (With Applications to Other Voting Environments). Journal of Law, Economics, and Organization 7(1):32-63.
Wahlbeck, Paul J., James F. Spriggs II and Forrest Maltzman. 1998. Marshaling the Court: Bargaining and Accommodation on the U.S. Supreme Court. American Journal of Political Science 42(January):294-315.
White, Halbert. 1980. A Heteroscedasticity-Consistent Covariance Matrix and a Direct Test for Heteroscedasticity. Econometrica 48:817-838.
Wedderburn, R. W. M. 1974. Quasi-Likelihood Functions, Generalized Linear Models, and the Gauss-Newton Method. Biometrika 61:439-447.
Zeger, Scott L. and Kung-Yee Liang. 1986. Longitudinal Data Analysis for Discrete and Continuous Outcomes. Biometrics 42(1):121-130.
Zellner, A. 1962. An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests of Aggregation Bias. Journal of the American Statistical Association 58:977-992.

Table 1
Model Comparisons for Civil Liberties Voting on the Burger Court, OT1981-1985

Variables                 Pooled Probit     Fixed-Effects Probit*   GEE Model, Exchangeable   GEE Model, Pairwise
(Constant)                0.949 (0.026)     1.163 (0.062)           0.959 (0.040)             0.709 (0.041)
Justice’s Conservatism    -1.511 (0.049)    -1.916 (0.086)          -1.427 (0.035)            -0.775 (0.039)
ρ                         -                 -                       0.484                     -
lnL                       -3814.85          -3608.64                n/a†                      n/a†

Note: N = 6572. Numbers in parentheses are estimated standard errors. The dependent variable is the justices’ votes in civil liberties cases, coded 0 for liberal votes and 1 for conservative votes.
*Due to collinearity in the ideology variable, Justice Marshall was not estimated separately.
†Since GEE models are estimated via a quasi-likelihood procedure, the results are not directly comparable to more conventional MLE models.
Estimated Matrix of Intracase, Interjustice Correlations, Pairwise GEE Model

| | Burger | Brennan | White | Marshall | Blackmun | Powell | Rehnquist | Stevens |
|---|---|---|---|---|---|---|---|---|
| Brennan | 0.112 | | | | | | | |
| White | 0.606 | 0.139 | | | | | | |
| Marshall | 0.084 | 0.873 | 0.122 | | | | | |
| Blackmun | 0.486 | 0.580 | 0.533 | 0.600 | | | | |
| Powell | 0.608 | 0.262 | 0.597 | 0.225 | 0.681 | | | |
| Rehnquist | 0.558 | 0.018 | 0.502 | -0.022 | 0.313 | 0.499 | | |
| Stevens | 0.385 | 0.578 | 0.451 | 0.536 | 0.839 | 0.559 | 0.306 | |
| O’Connor | 0.641 | 0.101 | 0.616 | 0.057 | 0.497 | 0.658 | 0.584 | 0.428 |

Table 2
Model Comparisons for Supreme Court Voting in Habeas Corpus Cases, 1953-1995

| Variables | Pooled Probit | Fixed Effects Probit* | Random Effects Probit | Pooled Dynamic Probit | Autoregressive GEE Probit |
|---|---|---|---|---|---|
| (Constant) | 0.824 (0.122) | 0.985 (0.249) | 1.086 (0.166) | 0.315 (0.147) | 0.709 (0.132) |
| Ineffective Counsel | -0.572 (0.169) | -0.639 (0.180) | -0.650 (0.179) | -0.618 (0.176) | -0.471 (0.149) |
| Trial Court Error | -0.006 (0.129) | -0.016 (0.136) | -0.015 (0.135) | 0.045 (0.133) | 0.090 (0.109) |
| Fourth Amendment | -0.084 (0.193) | -0.083 (0.203) | -0.094 (0.201) | -0.007 (0.202) | -0.027 (0.176) |
| Fifth Amendment | -0.240 (0.144) | -0.252 (0.153) | -0.254 (0.152) | -0.227 (0.149) | 0.015 (0.122) |
| Sixth Amendment | -0.124 (0.167) | -0.128 (0.179) | -0.122 (0.178) | -0.197 (0.176) | -0.125 (0.150) |
| Eighth Amendment | 0.025 (0.157) | 0.051 (0.167) | 0.032 (0.166) | 0.105 (0.163) | 0.236 (0.140) |
| Multiple Petitions | 0.499 (0.141) | 0.602 (0.151) | 0.590 (0.150) | 0.633 (0.147) | 0.673 (0.123) |
| U.S. as a Party | 0.436 (0.140) | 0.554 (0.153) | 0.547 (0.151) | 0.543 (0.149) | 0.559 (0.131) |
| Per Curiam Opinion | 0.059 (0.105) | 0.111 (0.113) | 0.104 (0.112) | 0.056 (0.108) | -0.036 (0.101) |
| Justice Ideology | -1.716 (0.138) | -2.002 (0.382) | -2.268 (0.222) | -1.396 (0.151) | -1.721 (0.192) |
| Justice’s Vote(t-1) | - | - | - | 0.621 (0.098) | - |
| ρ† | - | - | 0.199 (0.036) | - | 0.338 (n/a) |
| % Predicted Correctly | 66.81 | 73.15 | 64.93 | 68.89 | 62.54 |
| PRE | 29.7 | 43.2 | 25.8 | 34.1 | 20.7 |
| lnL | -566.863 | -502.089 | -525.845 | -521.113 | n/a‡ |
| N | 961 | 961 | 961 | 916 | 961 |

Note: Numbers in parentheses are standard errors.
* The fixed-effects probit model includes 22 justice-specific constant terms (not shown); additionally, six justices were dropped from the estimation due to perfect agreement on their ideology scores.
† ρ is the estimated intrajustice correlation in the random component for the random-effects model, and the estimated AR(1) nuisance parameter for the GEE model.
‡ Since GEE models are estimated via a quasi-likelihood procedure, the results are not directly comparable to more conventional MLE models.

Figure 1
Impact of Ideology on Probability of a Liberal Vote for Four Models in Civil Liberties Cases, OT1981-1985
[Plot not reproduced. The vertical axis runs from .2 to .9 and the horizontal axis from 0 to 1.]
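As a rough check on how curves like those in Figure 1 relate to the estimates in Table 1, the following sketch evaluates the probit probability Φ(constant + coefficient × ideology) over the 0 to 1 range shown on the figure's horizontal axis. Several things here are assumptions rather than details confirmed by the paper: that the figure was drawn from exactly these coefficients, that the ideology score runs from 0 to 1, and that the fixed-effects curve can be drawn from the reported constant alone (any justice-specific terms are ignored). Under the coding stated in the Table 1 note, Φ(·) is the probability of the vote coded 1, and the other outcome has probability one minus that value. The names `norm_cdf`, `predicted_curve`, `TABLE1`, `GRID`, and `CURVES` are illustrative only.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# (constant, ideology coefficient) pairs copied from Table 1.
# Assumption: the fixed-effects row is used without any justice-specific terms.
TABLE1 = {
    "Pooled probit": (0.949, -1.511),
    "Fixed-effects probit": (1.163, -1.916),
    "GEE, exchangeable": (0.959, -1.427),
    "GEE, pairwise": (0.709, -0.775),
}


def predicted_curve(constant, slope, grid):
    """Probit probability of the outcome coded 1 at each ideology value."""
    return [norm_cdf(constant + slope * x) for x in grid]


# Assumption: ideology is scaled 0 to 1, matching the figure's horizontal axis.
GRID = [0.0, 0.25, 0.5, 0.75, 1.0]

CURVES = {
    name: predicted_curve(b0, b1, GRID) for name, (b0, b1) in TABLE1.items()
}

if __name__ == "__main__":
    header = "".join(f"{x:>8.2f}" for x in GRID)
    print(f"{'Model':<24}{header}")
    for name, probs in CURVES.items():
        row = "".join(f"{p:>8.3f}" for p in probs)
        print(f"{name:<24}{row}")
```

Each printed row gives one model's predicted probabilities across the ideology grid. Under these assumptions all values fall between roughly .2 and .9, and the pairwise GEE curve is the flattest of the four because its ideology coefficient is the smallest in magnitude.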