GEE Models of Judicial Behavior

Christopher J. W. Zorn
Department of Political Science
Emory University
Atlanta, GA 30322
czorn@emory.edu
Version 1.0: April 1, 1998
Abstract
The assumption of independent observations in judicial decision making flies in the
face of our theoretical understanding of the topic. In particular, two characteristics of
judicial decision making on collegial courts introduce heterogeneity into successive
decisions: individual variation in the extent to which different jurists maintain consistency
in their voting behavior over time, and the ability of one judge or justice to influence
another in their decisions. This paper addresses these issues by framing judicial behavior
in a time-series cross-section context and using the recently developed technique of
generalized estimating equations (GEE) to estimate models of that behavior. Because the
GEE approach allows for flexible estimation of the conditional correlation matrix within
cross-sectional observations, it permits the researcher to explicitly model interjustice
influence or over-time dependence in judicial decisions. I utilize this approach to examine
two issues in judicial decision making: latent interjustice influence in civil rights and
liberties cases during the Burger Court, and temporal consistency in Supreme Court voting
in habeas corpus decisions in the postwar era. GEE estimators are shown to be
comparable to more conventional pooled and TSCS techniques in estimating variable
effects, but have the additional benefit of providing empirical estimates of time- and panel-based heterogeneity in judicial behavior.
Paper prepared for presentation at the Annual Meeting of the Midwest Political Science
Association, April 23-25, 1998, Chicago, Illinois. Thanks to Erik Naft for providing some of
the data used here, and to Neal Beck, Greg Caldeira, Mike Giles, Tom Walker and John
Wright for helpful comments and discussions. This is a preliminary version; please cite
extensively without the author’s permission.
1. Introduction
Fifteen years have passed since Gibson’s (1983) call for an “integrated” theory of
judicial decision making. A central concern of Gibson’s plea was the need to focus on
individual decision makers, including models which test theories from other levels of
analysis at the individual level. Since that time, scholars of judicial behavior have
responded with an increasingly complex array of models designed to incorporate
background and socialization variables, attitudes, roles, fact patterns and precedent, and
institutional and strategic considerations into explanations of judicial activity. But while
the theoretical richness of this literature has grown immensely during this period, little
development has occurred in the way in which we incorporate these developments into our
empirical work.
This lag in the development of models of judicial behavior becomes clear when we
examine the modus operandi of the archetypical judicial behavior study. Such studies
often consider data on a particular area of the law (e.g. search and seizure, obscenity) and
posit a model of the decision process based on one or more of the set of factors outlined
above. The variable of interest — the decision — is more often than not dichotomous, and
the unit of analysis is, variably, the decision of the Court or the vote of the various judges
or justices in those cases. A probit or logit model is estimated via MLE, probabilities
compared, and conclusions are drawn. Implicit in this formulation, however, is the
assumption that the observations are conditionally independent, a claim with important
implications, both statistical and substantive, for the conclusions we draw.
I argue here that this assumption of conditional independence flies in the face of our
knowledge of judicial behavior. This is true for at least two reasons: the influence of
consistency in decision making, and the impact of informal interjustice influence on judicial
decision making; I examine these two issues here. More generally, our understanding of
judicial politics implies a wide range of sources of heterogeneity in these observations, all
of which have a potentially critical influence on our understanding of decision making.
Moreover, these factors are of greatest concern in modeling individual-level decisions,
arguably the most fruitful ground for analyzing judicial politics.
The plan of the paper is as follows. In section 2, I motivate the discussion by
examining these two important sources of heterogeneity in judicial decisions, and
investigating the implications of those influences for specification and estimation of judicial
decision making models. Section 3 introduces a potential solution to those problems, the
technique of generalized estimating equation (GEE) models for time-series, cross-sectional
data, and discusses its statistical properties. I then apply these models to the two
applications discussed. In section 4, I examine interjustice influence in a simple
attitudinal model of Supreme Court decision making in civil rights and liberties cases
during the seventh Burger Court (OT1981-1985). Section 5 presents an analysis of the
importance of over-time voting consistency in a more complex, “integrated” model of merits
decisions in Supreme Court cases involving habeas corpus. In both applications, the GEE
results are compared and contrasted with those which obtain from commonly-used models
which fail to account for the heterogeneity in question. Section 6 closes the paper with a
discussion of the general question of cross-observation dependence and suggestions for
additional applications of the model, both within and beyond the area of judicial politics.
2. Heterogeneity in Judicial Decision Making
There are at least two compelling arguments refuting the notion that individual-level decisions on the Supreme Court are homogeneous. The first I term interjustice
influence: the impact of the informal, and largely unobserved, bargaining, persuasion, and
even coercion among the Court’s members. On a small, collegial decision making body
such as the Court, we would expect such influences to be substantial, and previous work in
this area has found that justices do influence one another to a large degree (e.g., Murphy
1964). Biographical and anecdotal accounts of the workings of the Court point to the
effectiveness of some justices in bringing their brethren around to their own way of
thinking (e.g. Mason 1956, 1964). Among these, Justice William Brennan stands out:
Brennan’s powers of persuasion have achieved close to legendary status (e.g. Baum 1995,
175-6; Hopkins 1991).
More empirical and quantitative studies have also documented the impact of
interjustice influence on Supreme Court decision making. Spaeth and Altfeld’s (1985)
study of special opinions on the Warren and Burger Courts, for example, found that while
the impact of such influence was not particularly strong during those periods and most
such instances involved justices of similar ideology, a majority of influence relationships
were unidirectional — indicating that “justices do differ in their power of persuading
others of the correctness of their views” (Spaeth and Altfeld 1985, 82; see also Altfeld and
Spaeth 1984). More recently, a series of studies by Maltzman and Wahlbeck (1996a,b;
Wahlbeck, Spriggs and Maltzman 1998) have begun to investigate these internal dynamics
in more depth, focusing on strategic influences on such activity as opinion assignment and
opinion coalitions.
A second factor advising against homogeneity is the effect of individual justices’
consistency on their decision making. Justices of the Supreme Court, as a rule, exhibit
consistent voting patterns over the course of their careers. Baum, for example, finds
that, while individual voting change “played a surprising large role” in the changing
policies of the Vinson, Warren and Burger Courts, membership change made the largest
contribution to overall policy change (1992, 21-2;
see also Baum 1988). This general consistency in individual voting behavior can be
ascribed to a number of sources: stability in justices’ attitudes and perspectives on the law,
a normative desire on the part of the justices to maintain and support positions that they
have taken in the past,1 and the influence of past decisions (i.e., precedent) on the justices’
current behavior.
Irrespective of its source, however, such consistency implies that decisions in
subsequent cases are unlikely to be independent. Nonetheless, informed modeling
decisions can advance our understanding of the reasons for this dependence. As an
example, consider the impact of precedent on justices’ voting decisions. The importance of
precedent in Supreme Court decision making is currently a hotly debated topic (see e.g.
Brenner and Stier 1996; Brisbin 1996; Knight and Epstein 1996; Segal and Spaeth 1996;
Songer and Lindquist 1996). A major issue of operationalization in this debate concerns
the observational equivalence of voting according to precedent or ideology in cases where
the two are in agreement; this equivalence makes untangling these separate effects
1
Segal and Spaeth, for example, point out Justice Harlan II’s tendency to “never ...
accede to any statement supportive of a majority opinion from which he dissented” (1996,
n.1).
difficult. In principle, however, one could begin to address this problem by considering the
extent to which justices’ behavior was consistent across similar cases over time after
controlling for the influence of ideology. I take a first step in this direction in one of the
applications below.
From these discussions, we can discern three different possible sources of
heterogeneity in judicial decision making. First, unobserved factors which vary from one
justice to another I denote within-justice effects. Included in this category would be
variations in behavior resulting from background, socialization, and personal
characteristics which change little or not at all over time. The impact of these variables on
decision making will thus vary across justices, but will, ceteris paribus, remain relatively
constant for a particular justice across different cases. A second source of heterogeneity
consists of influence relationships among the justices of the Court; I label these across-justice effects. Bargaining, pressure and coercion among the justices undoubtedly
influence the actions they take, and thus also the observed patterns of the votes and the
decisions of the Court. Finally, there are over-time influences on the justices’ decisions;
these include such factors as the Court’s agenda and precedent. These influences are case-specific; they are common across justices, varying instead over the cases the justices hear.
Moreover, they include as special cases such previously studied variables as case facts and
legal influences.
The “trick” to dealing with heterogeneity in judicial decision making is addressing
the different sources of heterogeneity simultaneously. For simplicity, I begin with an
abstract, general model of Supreme Court decision making at the individual level.
Concerns relating to interjustice influence and consistency are also present at more
aggregated levels of analysis, e.g. the decision of the Court, or the proportion of decisions
made in a certain direction by a particular justice or Court in a given year. Similarly,
these sources of heterogeneity will also be influential, albeit to different degrees, in lower
appellate courts as well: while interjudge forces will typically be absent in trial courts, for
example, matters relating to precedent and consistency clearly play an even greater role
there (Baum 1986). By focusing on individual decisions in the Supreme Court, however, I
offer the greatest degree of comparability to previous work; the models here readily
generalize to other settings as well.
Consider a binary decision process, where a justice must decide between two
possible outcomes; these may be a vote to grant or deny certiorari, a vote to affirm or
reverse on the merits, or any other such decision. Denote the outcome of that decision for a
specific justice j in a given case i as Yij. We might expect the probability of a given
outcome on Yij to vary according to some function of a vector of k independent variables;
these factors may vary over cases (e.g. case facts), justices (e.g. ideology), or both. We may
write this general model as:
$$\mathrm{Prob}(Y_{ij}) = f(X_{ijk}\beta_k) \qquad (1)$$
As is generally the case in models which vary over more than one unit of analysis
(e.g. those involving time-series cross-sectional data), the simplest model one can estimate
also involves making the greatest number of assumptions about the data-generating
process. If, conditional on the independent variables X, the impact of those independent
variables on the decision is assumed to be constant across both justices and cases, and if
the observations analyzed are assumed to be independent both across i and j (i.e., across
justices and cases), then an estimate of the impact of X on Y can be obtained by simply
“pooling” the observations and estimating a standard logit or probit model. This approach
has been widely used in empirical studies of judicial decision making, both of the U.S.
Supreme Court (e.g. Maltzman and Wahlbeck 1996b; Segal and Spaeth 1993) and of other
judicial fora (e.g. Brace and Hall 1993, 1997;2 Songer and Davis 1990; Songer and Haire
1992). The assumptions on which it is based are stringent; in particular, the assumptions
of uniform behavior and no dependence within or across either cases or justices are, for the
reasons stated previously and a host of others, highly unrealistic. I therefore consider in
turn a number of other alternatives which have been used to address the issue of
heterogeneity in this and other contexts.
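To make the pooled specification concrete, the following is a minimal sketch, not the estimation code used in this paper. It simulates votes from hypothetical ideology scores and hypothetical coefficients (the values 1.5 and -2.5 are invented for illustration) and fits a logit with an intercept and one covariate by Newton-Raphson, treating every vote as independent.

```python
import math
import random


def fit_pooled_logit(x, y, iterations=25):
    """Newton-Raphson for a logit with an intercept and one covariate.

    Every (x, y) pair is treated as an independent observation,
    which is exactly the pooling assumption discussed in the text.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0           # score vector
        h00 = h01 = h11 = 0.0   # information matrix (negative Hessian)
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information) * step = score by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: nine justices with evenly spaced ideology scores
# (0 = most conservative, 1 = most liberal) voting in 200 cases.
random.seed(0)
scores = [j / 8 for j in range(9)]
x, y = [], []
for _case in range(200):
    for s in scores:
        p_conservative = 1.0 / (1.0 + math.exp(-(1.5 - 2.5 * s)))
        x.append(s)
        y.append(1 if random.random() < p_conservative else 0)

b0, b1 = fit_pooled_logit(x, y)
print("intercept:", round(b0, 2), "ideology slope:", round(b1, 2))
```

Because the simulated votes really are independent, pooling is appropriate here; the concern raised in the text is that actual votes within a case or within a justice are unlikely to satisfy this assumption.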
Consider first within-justice heterogeneity; that is, variations among justices in
their behavior in the same cases. One possibility is that, while all justices respond in
2
Brace and Hall include separate dummies for varying time periods, and may in one
sense therefore be considered a variation on a fixed-effects specification.
similar ways to the same stimuli, each varies in his or her initial predisposition to behave
in a certain way. That is, there is variation in the model’s intercept terms for different
justices. Formally, we may write:
$$\mathrm{Prob}(Y_{ij}) = f(X_{ijk}\beta_k + \alpha_j) \qquad (2)$$
where the αj’s are separate intercepts for each justice. This is the standard fixed-effects
model; justice-specific heterogeneity in the intercepts is conditioned out of the model via
separate estimates of α and β. Like the general pooled model, the fixed-effects model has
received some use in the judicial politics literature (e.g. Hall 1992),3 although its application
is not as widespread as other types.4
A more commonly found generalization allows both intercept and slope coefficients
to vary over justices or time points:
$$\mathrm{Prob}(Y_{ij}) = f(X_{ijk}\beta_{jk}) \qquad (3)$$
Here justices respond differently to the case stimuli presented to them, and the model
explicitly allows for estimation of those differences. Maintaining our assumption of no
conditional correlation across cases, this model amounts to simply running separate
regressions for each justice in question, a technique widely used in the judicial politics
literature (e.g. Boucher and Segal 1995; Caldeira and Wright 1996; Flemming and Wood
1997; Segal 1986). Such models have the advantage of computational simplicity and
3
Hall’s analysis incorporates fixed effects for time-points, rather than justices, but
the two approaches are conceptually similar.
4
The relative absence of fixed-effects models from judicial politics research is
somewhat surprising, at least at the Supreme Court level. The fact that such models often
have relatively fixed (and small) numbers of justices and large (and increasing) numbers of
cases would seem to make fixed-effects specifications a natural choice.
explicit treatment of differences. At the same time, they are not especially parsimonious,
and comparisons of coefficient values are not advised.5
While these approaches represent intuitive ways of getting at variation within
different justices, a common trait of all these models is their failure to allow for
dependencies across justices. Because they maintain the assumption of the independence
of votes, none consider the possibility of conditional correlation across the votes of different
justices. Thus if, holding constant the effect of the independent variables, justices engage
in logrolling, persuasion, coercion, or other forms of interpersonal influence, this
information is ignored by the model, with a corresponding loss in efficiency of the
estimates. One potential solution to this problem lies in “seemingly-unrelated” regression
models (e.g. Zellner 1962), where separate equations are estimated with the assumption of
a correlated error structure. However, few of these models have been modified to address
the discrete choice variables typically encountered in judicial decision making (but see
King 1989).
The issue of case-specific heterogeneity raises a separate set of concerns. Even
assuming we have adequate measures for case differences (e.g. case facts, lower court
disposition, etc.), other factors contribute to cross-case heterogeneity. In particular, the
foregoing discussions of consistency and precedent imply that, among the decisions of a
particular justice j we might expect that cases will not be serially independent. In addition
to case-specific fixed- and random-effects models, a natural way of incorporating these
effects is through a simple dynamic specification:
$$\mathrm{Prob}(Y_{ij}) = f(X_{ijk}\beta_k + Y_{i-1,j}\rho) \qquad (4)$$
Here, the probability of a particular decision is a function of the independent variables, and
the immediately previous decision.6 Equation (4) represents the general AR(1) form of this
5
Other models, less widely used in judicial politics but popular elsewhere, include
random-effects models, which treat individual effects as products of a unit-specific mean
effect and a stochastic component (e.g. Hsiao 1996), and random-coefficient models.
6
See Beck and Katz (1996) for a discussion of the use of lagged endogenous variables
in time-series cross-sectional models generally. Beck and Tucker (1997) argue that such
model. As is the case for justice-specific heterogeneity, one could further generalize the
nature of the relationship by allowing the impact of the lagged endogenous variable to vary
by justice (i.e., ρ = ρj).
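The lagged term in the dynamic specification has to be built separately for each justice, since the relevant "previous decision" is that justice's own prior vote. A minimal sketch of that bookkeeping follows; the record layout (keys `justice`, `case`, `vote`) and the function name are hypothetical, and case numbers are assumed to order the cases in time.

```python
def add_lagged_vote(votes):
    """Attach each justice's immediately previous observed vote.

    votes: list of dicts with keys 'justice', 'case' and 'vote'.
    Returns new dicts with an extra 'lag_vote' key, which is None
    for a justice's first observed vote.
    """
    previous = {}
    out = []
    for row in sorted(votes, key=lambda r: (r["justice"], r["case"])):
        lagged = dict(row)
        lagged["lag_vote"] = previous.get(row["justice"])
        out.append(lagged)
        previous[row["justice"]] = row["vote"]
    return out


# Hypothetical example: justice B did not participate in case 2.
example_votes = [
    {"justice": "A", "case": 1, "vote": 1},
    {"justice": "A", "case": 2, "vote": 0},
    {"justice": "A", "case": 3, "vote": 1},
    {"justice": "B", "case": 1, "vote": 0},
    {"justice": "B", "case": 3, "vote": 0},
]
lagged_votes = add_lagged_vote(example_votes)
for row in lagged_votes:
    print(row)
```

In this sketch a nonparticipation simply means the lag reaches back to the last case in which the justice voted; whether that is the appropriate treatment is a modeling choice.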
Addressing both justice-specific and case-specific heterogeneity would seem to
suggest a combination of these models. So, for example, one might estimate separate
models for each justice’s response to case-specific variables, while including a dynamic
component for the effect of precedent and allowing the errors to be correlated. This
piecemeal approach to model estimation is, at best, questionable on substantive grounds,
and at worst incorrect and misleading. Instead, I suggest an alternative, unified approach
to addressing these sources of heterogeneity below.
3. Generalized Estimating Equation Models: An Overview7
I now introduce the generalized estimating equations (GEE) approach to time-series
cross-sectional models. The method of GEE is a marginal, or population-averaged,
approach to estimation. Population-averaged models differ from the more common cluster-specific approaches described above. Cluster-specific approaches model the probability
distribution of the dependent variable as a function of the covariates and a cluster-specific
parameter. This latter term may be either estimated concurrently with the model (as in
the fixed-effects approach) or be assumed to follow some stochastic distribution (as in the
random-effects specification). Population-averaged approaches, by contrast, model the
marginal (or population-averaged) expectation of the dependent variable as a function of
the covariates.8 These differences have ramifications for the interpretation of the
estimated parameters, as I discuss below.
techniques are of more limited utility in models with binary dependent variables (BTSCS),
preferring instead a modified event-history approach using period-specific dummy
variables. In contrast, Beck and Katz (1997) note that, while lagged dependent variables
are an imperfect correction for serially correlated errors in BTSCS models, they are
appropriate when the model specification calls for endogeneity. I address these issues
further in the applications, below.
7
This section draws extensively on the presentation of Zeger and Liang (1986) and
Fitzmaurice et al. (1993).
8
For a good discussion of this distinction, see Neuhaus et al. (1991).
The GEE model has its roots in the quasi-likelihood methods introduced by
Wedderburn (1974) and developed and extended by McCullagh and Nelder (1983, 1989).9
While standard maximum likelihood analysis requires that we specify the conditional
distribution of the dependent variable, quasi-likelihood requires only that we postulate the
relationship between the expected value of the dependent variable and the covariates and
between the mean and variance of the dependent variable. This generalized linear models
(GLM) approach has received widespread use in cross-sectional analysis.
Following notation similar to that of Zeger and Liang (1986), I consider a model of
observations on a dependent variable Yit and k covariates Xit, where i indexes the N
observations i = 1, 2, ... N and t indexes the T time points t = 1, 2,... T. Let Yi = (Yi1, Yi2, ...
YiT) denote the corresponding column vector of observations on the dependent variable for
observation i, and Xi indicate the T×k matrix of covariates for observation i.10 Writing
E(Yi) = µi, we define a function h which specifies the relationship between Yi and Xi:
$$\mu_i = h(X_i\beta) \qquad (5)$$
where β is a k×1 vector of parameters, and the inverse of h is known as the “link” function.
Likewise, the variance Vi of Yi is specified as a function g of the mean. In the cross-sectional case (i.e., T = 1), we write:
$$V_i = \frac{g(\mu_i)}{\phi} \qquad (6)$$
where φ is a scale parameter typically set to one for discrete distributions of Yi and to chi-squared divided by the residual degrees of freedom for continuous Yi. The quasi-likelihood
estimate of β is then the solution to a set of k “quasi-score” differential equations:
9
An excellent summary of this approach can be found in Heyde (1997).
10
For notational simplicity, I assume here that Ti = Ti′ for all i ≠ i′; i.e. that the “panels”
are balanced. This need not, however, be the case for estimation of the models presented
here; balanced panels are, however, necessary for likelihood-based “mixed parameter”
models (Fitzmaurice et al. 1993).
$$Q_k(\beta) = \sum_{i=1}^{N} \left(\frac{\partial\mu_i}{\partial\beta_k}\right)' V_i^{-1}\,(Y_i - \mu_i) = 0 \qquad (7)$$
Note that E[Q(β)] = 0 and Cov[Q(β)] = (∂µi/∂βk)′V⁻¹(∂µi/∂βk). The function Q(β) thus behaves
like the derivative of a log-likelihood (i.e., a score function); estimation is typically via a
generalized weighted least-squares approach.
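For concreteness, the quasi-score in (7) can be specialized to the familiar binary logit case. The following is a sketch under the assumptions of a logit link, Bernoulli variance, and a scale parameter fixed at one; X_{ik} denotes the kth covariate for observation i.

```latex
\begin{align*}
\mu_i &= h(X_i\beta) = \frac{\exp(X_i\beta)}{1+\exp(X_i\beta)}, \qquad
g(\mu_i) = \mu_i(1-\mu_i), \qquad \phi = 1 \\
\frac{\partial\mu_i}{\partial\beta_k} &= \mu_i(1-\mu_i)\,X_{ik}
\quad\Longrightarrow\quad
Q_k(\beta) = \sum_{i=1}^{N} X_{ik}\,(Y_i-\mu_i) = 0
\end{align*}
```

Under these assumptions the variance term cancels against the derivative of the mean, and the quasi-score equations coincide with the first-order conditions of the ordinary logit likelihood.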
For cases where T > 1, some provision must be made for dependence across t. Liang
and Zeger’s (1986) solution was to specify a T×T matrix Ri(α) of the “working” correlations
across t for a given Yi. While Ri(α) can thus vary across observations, it is assumed to be
fully specified by the vector of unknown parameters α, which have a structure determined
by the investigator and which are constant across observations. This correlation matrix
then enters the variance term of equation (7):
$$V_i = \frac{A_i^{1/2}\, R_i(\alpha)\, A_i^{1/2}}{\phi} \qquad (8)$$
where the Ai are T×T diagonal variance matrices of Yi with g(µit) as the tth diagonal
element. Substitution of (8) into (7) yields the GEE estimator; from this discussion, it is
clear that the GEE is an extension of the GLM approach, and that the former reduces to
the latter when T = 1.
Since the Vi’s are functions of both β and α, estimation typically is accomplished by
an iterative procedure (e.g. iteratively reweighted least squares). We can reexpress (7) as a
function of β alone by substituting a consistent estimate r of α, and a consistent estimate of
φ, into (7). We then solve for β and calculate standardized residuals, which are in turn
used to consistently estimate α and φ. These two steps are iterated until the estimates
reach convergence.
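The residual-based step of this iteration can be illustrated for an exchangeable working correlation with binary outcomes. The sketch below is a simplified moment estimator and makes assumptions that should be stated plainly: it omits the degrees-of-freedom corrections used in Liang and Zeger's estimators, it takes the fitted means as given, and the function names are hypothetical.

```python
import math


def pearson_residuals(y, mu):
    """Standardized residuals for binary outcomes with fitted means mu."""
    return [(yi - mi) / math.sqrt(mi * (1.0 - mi)) for yi, mi in zip(y, mu)]


def exchangeable_alpha(clusters):
    """Moment estimate of a common within-unit correlation.

    clusters: list of (y, mu) pairs, one per unit, where y holds the
    observed 0/1 outcomes and mu the fitted probabilities.
    No degrees-of-freedom correction is applied (a simplification).
    """
    sq_sum, n_obs = 0.0, 0
    cross_sum, n_pairs = 0.0, 0
    for y, mu in clusters:
        r = pearson_residuals(y, mu)
        sq_sum += sum(e * e for e in r)
        n_obs += len(r)
        for s in range(len(r)):
            for t in range(s + 1, len(r)):
                cross_sum += r[s] * r[t]
                n_pairs += 1
    dispersion = sq_sum / n_obs  # average squared residual
    return (cross_sum / n_pairs) / dispersion


# Hypothetical fitted values of 0.5 throughout.
agreeing = [([1, 1, 1], [0.5] * 3), ([0, 0, 0], [0.5] * 3)]
alternating = [([1, 0], [0.5] * 2), ([0, 1], [0.5] * 2)]
alpha_agree = exchangeable_alpha(agreeing)
alpha_alternate = exchangeable_alpha(alternating)
print(alpha_agree, alpha_alternate)
```

In a full GEE fit this estimate would be plugged back into the variance term, the regression parameters re-estimated, and the two steps repeated until the estimates stop changing.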
The GEE model set out here has a number of attractive properties for applied
researchers. Because the first two terms of (7) are not dependent on Yi, the score equations
converge to zero (and thus have consistent roots) so long as E[Yi - µi] = 0. Assuming that
the link function h⁻¹(·) is correct, GEE estimates of β will thus be consistent in N (Liang
and Zeger 1986). Moreover, √N(β̂GEE - β) is asymptotically multivariate normal, and the
covariance matrix of the estimates can be consistently estimated by the inverse of the
Hessian, evaluated at the values of β and α produced by (7).11 Most important, the
asymptotic consistency of the GEE estimates of β holds even in the presence of
misspecification of the “working” correlation structure; thus, the GEE offers the potential
of providing asymptotically unbiased estimates of the regression parameters of primary
interest even in cases where the nature of the time dependence is unknown.12
An additional advantage of the GEE approach is the broad range of options available
for specifying time dependence. Fitzmaurice et al. (1993) discuss four common
specifications of the “working” correlation matrix Ri(α) for observations Yis and Yit:
1. Ri(α) = I, a T × T identity matrix. This “working independence” assumption is
equivalent to assuming no over-time correlation, and yields estimates equivalent to
those from simple “pooled” models. No estimate of α is obtained.
2. Ri(α) = ρ if t ≠ s. This is the “exchangeable” correlation structure; values of Yi are
assumed to covary equally across all time points. This assumption corresponds to
“random effects” models which estimate an over-time correlation parameter. In this
specification, α is a scalar, which is estimated in the model.
3. Ri(α) = ρ^|t-s| if t ≠ s, an autoregressive (AR(1)) specification. The within-observation
correlation over time is an exponential function of the lag length; as is typically the
case in AR models, we assume |ρ| < 1.0. In an autoregressive specification, α is a
vector (1, ρ, ρ², ... )′ which is the same across observations.
4. Ri(α) = αst, an unstructured or “pairwise” correlation structure. That is, no
constraints are placed on the correlations across time points; instead, they are
estimated without restriction from the data. α in this context is a T×T matrix
containing the T(T - 1)/2 pairwise correlations for all possible combinations of time
points.
11
Zeger and Liang (1986) note that in most cases it is possible to estimate β and α without estimating φ directly, provided that the elements of the correlation matrix R be
multiples of the parameters α.
12
Note, however, that the consistency of the variance estimate for the GEE does
depend on proper specification of the working correlation structure. In such cases, Liang
and Zeger (1986) propose a “robust” estimate of the variance-covariance matrix of β̂GEE,
analogous to that derived by White (1980), which is consistent even under misspecification
of the correlation matrix. For reasons of brevity, I reserve treatment of robust standard
errors in the GEE context for a future paper.
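The four working correlation structures listed above are easy to write out explicitly for small T. The following sketch builds each as a nested list; the function names and the numerical values are illustrative assumptions only.

```python
def independence(T):
    """Identity matrix: no over-time correlation."""
    return [[1.0 if s == t else 0.0 for t in range(T)] for s in range(T)]


def exchangeable(T, rho):
    """Same correlation rho between every pair of distinct time points."""
    return [[1.0 if s == t else rho for t in range(T)] for s in range(T)]


def ar1(T, rho):
    """Correlation rho**|t - s|, decaying with the lag length."""
    return [[rho ** abs(t - s) for t in range(T)] for s in range(T)]


def unstructured(T, pairwise):
    """Free correlation for each pair; pairwise maps (s, t), s < t, to a value."""
    R = independence(T)
    for (s, t), value in pairwise.items():
        R[s][t] = R[t][s] = value
    return R


R_ind = independence(3)
R_exch = exchangeable(3, 0.3)
R_ar1 = ar1(3, 0.5)
R_unstr = unstructured(3, {(0, 1): 0.2, (0, 2): 0.1, (1, 2): 0.4})
for name, R in [("independence", R_ind), ("exchangeable", R_exch),
                ("AR(1)", R_ar1), ("unstructured", R_unstr)]:
    print(name, R)
```

With T = 3 the unstructured matrix has T(T - 1)/2 = 3 free parameters, against one for the exchangeable and AR(1) structures and none under independence.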
In addition to these, a number of other specifications of the working correlation
matrix are possible, including stationary and nonstationary models of varying orders.
Alternatively, the researcher may specify Ri(α) explicitly; this option is especially valuable
for testing the robustness of estimates to the correlation specification. More recently, Zhao
and Prentice (1991) have extended the GEE model to allow parameterization of the
working correlation matrix as a function of a set of covariates α = f(Zit), and to allow joint
estimation of β and α. Provided that the specification of both the mean and the correlation
structure is correct, this “GEE2” approach permits more efficient estimation of the
parameters of interest, as well as allowing the possibility of substantial substantive insight
into the determinants of over-time correlation. Its primary drawback is that, because of
the simultaneous estimation process, consistent estimates of β depend on proper
specification of the correlation matrix; thus, the GEE2 estimator lacks the robustness of
the GEE in estimating the regression parameters in the presence of time dependence of
unknown form.
The potential advantages of the GEE approach for estimating individual-level
models of judicial decision making are several.13 First, these models allow for explicit
incorporation and modeling of interdependencies across cases and justices through
specification of the working correlation matrix. At the same time, the parameter estimates
obtained through application of these models are robust to misspecification of those
interdependencies, an important trait since our knowledge of the nature of those
relationships is imperfect at best. Moreover, these models are capable of handling
unbalanced data (for example, that due to recusals), and are relatively robust in the
presence of such missing data.14
13
For two recent applications of GEE models to judicial politics, see Caldeira et al.
(1996, 1997).
14
A thorough treatment of missing data in these and similar models can be found in
Fitzmaurice et al. (1993).
In the following two sections I examine GEE models of judicial behavior, comparing
empirical estimates of factors in judicial decision making obtained from GEE models with
those of more commonly-used models for pooled data. In Section 4, I consider a specifically
“attitudinal” model, focused on the impact of judicial ideology on decision making in civil
rights and liberties cases. There, I focus on the issue of interjustice influence, showing how
GEE models may be used to both control for and estimate these influences within a model
of decision making. Section 5 turns to an “integrated” model of decision making in habeas
corpus decisions; this model is similar to a wide range of such studies in the recent judicial
behavior literature. There the emphasis is on the impact of consistency: that is, justice-specific, over-time dependence in judicial decisions.
4. Interjustice Influence and Judicial Decision Making
I begin the empirical part of the paper with an analysis of influence relationships
among justices of the Supreme Court. The data considered are the civil rights and liberties
decisions of the seventh Burger Court (OT1981-1985); data are all such cases decided by
the Court between the appointment of Justice O’Connor and the retirement of Chief
Justice Burger, and are drawn from Harold Spaeth’s Supreme Court Judicial Database
(Spaeth 1997). Excluding missing data, the number of such cases is 767; with an average
of 8.57 votes per case, this yields a total of 6572 individual votes for analysis.
The dependent variable is the justice’s votes in civil rights and liberties decisions,
coded 0 for a liberal outcome and 1 for a conservative one. To facilitate model comparisons,
I examine a simple, pure attitudinal model of Supreme Court decision making (e.g. Segal
and Spaeth 1993). The model posits that justices’ votes are determined by their political
ideologies. I therefore include as the sole independent variable modified Segal/Cover
(1989) measures of judicial ideology (Epstein and Mershon 1996). These range from 0.0
(for the most conservative justices) to 1.0 (for the most liberal), with the expectation that
this measure will exhibit a significant negative impact on the justices’ votes.
Model comparison results are presented in Table 1. For comparison purposes, I
begin the analysis by examining a simple pooled model, where votes are assumed to be
independent across both justices and cases. Comparison is facilitated by the fact that this
pooled model is equivalent to a GEE specification with an independent working correlation
structure. Consistent with expectations, judicial ideology has a substantively large and
statistically significant effect on voting, and the model as a whole is a considerable
improvement over the null model. The estimated probability of a conservative vote by the
most conservative justice is 0.83; this value declines to 0.29 for the most liberal members of
the Court. This simple model predicts 74.22 percent of the votes correctly, compared to the
null of 63.04 percent, for a reduction in error of 30.3 percent.
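The reduction-in-error figure follows directly from the two percentages just reported. The short sketch below shows the arithmetic; the function name is illustrative rather than taken from any package, and because the inputs are themselves rounded, the result matches the reported figure only to about one decimal place.

```python
def proportional_reduction_in_error(pct_correct_model, pct_correct_null):
    """Share of the null model's prediction errors that the fitted model removes."""
    null_errors = 100.0 - pct_correct_null
    model_errors = 100.0 - pct_correct_model
    return (null_errors - model_errors) / null_errors


# Percent correctly predicted, as reported for the pooled model and the null model.
pre_pooled = proportional_reduction_in_error(74.22, 63.04)
print(f"Proportional reduction in error: {pre_pooled:.3f}")
```

The same function applies to any of the models in Table 1 once its percent correctly predicted is known.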
As noted above, a simple way of controlling for justice-specific heterogeneity is to
allow for each justice to have a separate intercept term. This amounts to estimating a
fixed-effects model, with separate dummy variables for each justice save one;
unconditional fixed-effects probit is an asymptotically consistent estimator in T (Hsiao
1996). Results of estimating this fixed-effects specification are presented in column two of
Table 1. The dummy variable for Chief Justice Burger was omitted, making him the category for
comparison. Moreover, because Justices Marshall and Brennan, and Justices Burger and
Blackmun, have identical ideology scores, their fixed effects cannot be separated; thus,
these pairs of justices “share” the same intercept. The fixed-effects model also exhibits a
substantial improvement over the null; moreover, a likelihood-ratio test indicates that it is
also a substantial improvement over the pooled model presented in the first column (χ²(7) =
412.42, p < .001). Again, the impact of ideology is large and statistically significant;
assuming the fixed effects to be zero, the probability of the most liberal justice voting
conservatively is 0.23, a value which increases to 0.88 for the most conservative member of
the Court. Five of the seven estimated fixed effects (not shown) are also statistically
significant, with Justices Blackmun, Powell and Stevens significantly more likely to cast a
liberal vote and White and O’Connor more likely to cast a conservative one. At the same
time, the fixed effects model fails to improve on the predictive ability of the pooled model;
this is due to the fact that this particular fixed-effects model lacks explanatory variables
which vary over cases.
Both the pooled and fixed-effects models are unable to capture any interjustice
dynamics. Accordingly, I next turn to two GEE models where cross-justice influence may
be estimated. Treating each case as a single observation, and each justice’s vote as a
separate realization on that observation, allows the GEE specification to account for
within-case correlations among the justices. The first such model used the exchangeable
correlation structure; that is, it assumes that the correlation between justices is equal for
all pairs of justices, and estimates a single scalar value for that correlation. The second
model allows for an unstructured correlation structure; pairwise correlations between
justices are allowed to vary, and are estimated simultaneously with, and conditional on,
the other model parameters. These results are presented in columns three and four of
Table 1.
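The difference between the two working correlation structures can be made concrete with a small sketch. It builds an exchangeable matrix for a nine-member bench and counts the correlation parameters each structure must estimate. The function names and the value of rho are hypothetical and chosen only for illustration; they are not estimates from Table 1.

```python
def exchangeable_corr(n_members, rho):
    """n x n working correlation matrix: 1 on the diagonal, rho everywhere else."""
    return [[1.0 if j == k else rho for k in range(n_members)]
            for j in range(n_members)]


def n_free_correlations(n_members, structure):
    """Number of correlation parameters estimated under each working structure."""
    if structure == "independent":
        return 0
    if structure == "exchangeable":
        return 1
    if structure == "unstructured":
        # One parameter for every distinct pair of justices.
        return n_members * (n_members - 1) // 2
    raise ValueError(f"unknown structure: {structure}")


N_JUSTICES = 9          # a full bench; cases with fewer votes give smaller blocks
RHO_ILLUSTRATIVE = 0.5  # hypothetical value, not an estimate from the paper

R = exchangeable_corr(N_JUSTICES, RHO_ILLUSTRATIVE)
for structure in ("independent", "exchangeable", "unstructured"):
    print(structure, n_free_correlations(N_JUSTICES, structure))
```

With nine justices the unstructured form requires 36 pairwise correlations, against a single parameter for the exchangeable form, which is the trade-off between flexibility and parsimony at issue here.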
In comparing the GEE models to their more commonly used counterparts, two
differences are immediately apparent. First, the overall values of the coefficients are
generally smaller; this is particularly true for the pairwise model, where both the intercept
and the coefficient for the ideology variable are between twenty and thirty percent smaller
than the other model estimates. This is due to two factors. First, as lucidly demonstrated
by Neuhaus et al. (1991), estimated covariate effects for population-averaged models such
as the GEE will generally be smaller than those estimated by cluster-specific approaches.
Second, recall that the GEE approach accounts for the effect of inter-justice influence on
decision making. To the extent that that influence occurs between justices of similar
ideological dispositions, the estimated coefficients on ideology in the GEE will be
attenuated relative to those in models where this effect is not estimated. Put differently,
because both factors have the same effect on voting, simple pooled and fixed-effects models
conflate influence among ideologically similar justices with the influence of that ideology
itself; GEE models are better able to separate those effects.
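The first of these two factors can be illustrated in one standard special case. Assuming a probit link and a normally distributed cluster-specific intercept with standard deviation sigma (an assumption made here purely for illustration, not the specification estimated in Table 1), averaging the cluster-specific response over the intercept yields a probit curve whose coefficients are scaled by 1/sqrt(1 + sigma^2). The sketch below checks that closed form against a direct numerical average; all names and values are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def marginal_probability(eta, sigma, n_points=4001, limit=8.0):
    """Average Phi(eta + sigma * u) over u ~ N(0, 1) using the trapezoid rule."""
    step = 2.0 * limit / (n_points - 1)
    total = 0.0
    for k in range(n_points):
        u = -limit + k * step
        weight = 0.5 if k in (0, n_points - 1) else 1.0
        total += weight * norm_cdf(eta + sigma * u) * norm_pdf(u)
    return total * step


SIGMA = 1.0        # hypothetical spread of the cluster-specific intercepts
ETA_CLUSTER = 0.8  # hypothetical cluster-specific linear predictor

attenuation = 1.0 / math.sqrt(1.0 + SIGMA ** 2)
p_numeric = marginal_probability(ETA_CLUSTER, SIGMA)
p_closed = norm_cdf(ETA_CLUSTER * attenuation)
print(f"attenuation factor: {attenuation:.3f}")
print(f"numeric: {p_numeric:.6f}  closed form: {p_closed:.6f}")
```

Because the attenuation factor is below one whenever sigma is positive, population-averaged coefficients in this setting are smaller in absolute value than their cluster-specific counterparts, and the gap grows with the amount of between-cluster heterogeneity.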
The result of this decrease in the parameter estimates is a reduced impact on the
predicted probabilities generated by the GEE models. These differences are illustrated
graphically in Figure 1. While the change in probability associated with moving from most
conservative to most liberal is 0.51 for the exchangeable GEE, that difference is a reduced
0.29 in the model estimated with a pairwise correlation structure.
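Changes in predicted probability of this kind are first differences of the fitted response function between the two ends of the ideology scale. As a sketch of the calculation, assuming a probit link, the code below backs out the intercept and slope implied by the two probabilities reported for the pooled model (0.83 and 0.29) and then recomputes the first difference. The implied coefficients are derived values under that assumption, not the entries of Table 1, and the function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def implied_probit_coefficients(p_at_zero, p_at_one):
    """Intercept and slope of a one-variable probit that reproduces the predicted
    probabilities at x = 0 and x = 1."""
    intercept = STD_NORMAL.inv_cdf(p_at_zero)
    slope = STD_NORMAL.inv_cdf(p_at_one) - intercept
    return intercept, slope


def first_difference(intercept, slope, x_from=0.0, x_to=1.0):
    """Change in predicted probability when x moves from x_from to x_to."""
    return (STD_NORMAL.cdf(intercept + slope * x_to)
            - STD_NORMAL.cdf(intercept + slope * x_from))


# Ideology runs from 0.0 (most conservative) to 1.0 (most liberal).
b0, b1 = implied_probit_coefficients(0.83, 0.29)
change = first_difference(b0, b1)
print(f"implied intercept: {b0:.3f}, implied slope: {b1:.3f}")
print(f"first difference: {change:.2f}")
```

Substituting the intercept and ideology coefficient from any column of Table 1 into `first_difference` gives the corresponding change for that model.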
A second noticeable difference between the GEE and previous models is the reduced
predictive ability of the two GEE models. Both predict 62.26 percent of the
votes correctly, a decrease in predictive power relative to even the null model. This lack of
predictive ability arises from the inability of the models to accurately categorize the
predictions across both types of outcomes, rather than from any systematic over- or
underprediction of a single category. The marginal percentages for liberal and
conservative votes are 37.0 and 63.0 percent, respectively; the two cluster-specific models
predict 21.2 and 78.8 percent in each respective marginal category, while the GEE models
come closer to the true marginals with 43.9 and 56.1 percent.
The GEE models also provide estimates of the extent of interjustice influence,
conditional on the estimated parameter values. In the exchangeable specification, the
estimated rho is 0.484; this can be interpreted as the average (in the sense of “typical”)
interjustice correlation in voting, conditional on the effect of ideology. Thus, justices
appear to exercise some nontrivial degree of influence on each other’s decisions. Because
the model is relatively simple, however, it is likely that a good deal of this correlation is due
to institutional, contextual, and case-specific factors which are excluded from the model;
one would expect this average correlation to be lower in a more fully specified model of
judicial behavior.
In the model with unstructured, pairwise correlations, the estimated interjustice
effects are presented with Table 1. The results of this estimation are largely consistent
with the conventional wisdom regarding patterns of influence relationships on the Burger
Court. Thus, Justices Marshall and Brennan exhibit high, positive correlations, as do the
conservative bloc of Rehnquist, O’Connor, White and Burger. We also find high
correlations among the Court’s moderates: Justices Blackmun, Stevens, and also Powell.
Perhaps most surprising is that only one correlation, that between Justices Marshall and
Rehnquist, is negative, and that only slightly so. This again can be attributed to the fact
that the estimates are conditional on the model parameters, which in turn points to the
impact of model specification: presumably, a good deal of the interjustice correlation is due
to case facts and other variables which render the justices’ decisions more highly correlated
than they would be were those factors explicit in the model.
The results of this initial model comparison are suggestive: GEE approaches offer at
least the potential for greater understanding of judicial behavior, and in particular for
extracting additional information out of individual-level data on judicial decision making.
The extent to which this potential is realized, however, depends critically on, among other
things, correct model specification. Particularly in the GEE context, where the estimates
of this influence are conditional on the other model estimates, complete and proper model
specification is critical to obtaining useful estimates of the auxiliary parameters. To assess
the usefulness of the GEE approach in a more complete model, as well as to examine its
utility in dealing with temporal heterogeneity, I turn to a second application in judicial
behavior.
5. Decision Making Models with Individual Heterogeneity
As noted above, the importance of precedent in Supreme Court decision making has
been the subject of considerable dispute. What has not been recognized is that the debate
is of both substantive and methodological importance; to the extent that precedent implies
consistency in consecutive votes of the justices or decisions of the Court, it suggests a
particular form of longitudinal dependence. That dependence must in turn be considered if
we are accurately to model judicial behavior. In this section of the paper I conceptualize
consistency and precedent at the individual level: that is, the extent to which a justice’s
voting behavior is consistent with his or her previous decisions in similar cases.
The idea of consistency, particularly as it applies to precedent, must necessarily be
evaluated within a series of similar cases. Here I embed my examination of precedent in a
more general “integrated” model of judicial decision making in the area of habeas corpus.
Petitions for habeas corpus “challenge the constitutionality of a person’s detention and
request release” (Flango 1994, 1); these claims are thus premised on the assertion that a
constitutional violation took place prior to or during the defendant’s trial, conviction or
sentencing. I posit a model of Supreme Court decision making in habeas corpus cases,
based on both legal and attitudinal factors.[15] Legal factors center around the constitutional
basis for the claim, and include those based on ineffective counsel and trial court error, as
well as the Fourth, Fifth, Sixth and Eighth Amendments; general Due Process claims
relating to the Fourteenth Amendment are excluded, and constitute the baseline category
for comparison. Extralegal and attitudinal considerations include the presence of the U.S.
as a party to the case, the presence of a per curiam decision, the presence of multiple (i.e.,
second and subsequent) petitions for relief, and the ideological disposition of the justice
casting the vote, captured by rescaled Segal/Cover scores (Epstein and Mershon 1996). All
other variables are coded 0 in the absence of the indicated characteristic and 1 for cases
where it is present.
[15] For a full exposition of the model, see Naft and Zorn (1998).
The data consist of all habeas corpus decisions handed down by the Warren, Burger
and Rehnquist Courts during the 1953-1995 Terms. There were 109 such cases;
disaggregating by individual justice’s votes and excluding missing data, this leaves 961
observations on which to base our analysis. The dependent variable is the reported vote on
the merits, coded 0 if the vote was in favor of the habeas corpus claimant and 1 if the
justice voted against the claimant. In contrast to Section 4, here I treat the justice as the
primary unit of analysis (i.e., the “i”), and consider cases as repeated observations on the
decisions of the justices. In the context of the GEE approach, this allows me to estimate
the within-justice correlation across different cases, controlling for the effects of the
(generally case-specific) independent variables.
Table 2 presents results for five separate models of the vote in habeas corpus cases.
For purposes of comparison, I again begin by estimating a “pooled” probit model of justices’
votes, treating all such votes as conditionally independent; as noted, this is exactly
equivalent to the GEE with an assumed “independent” correlation structure. Despite the
possibility of correlation across cases, Liang and Zeger (1986) show that this pooled model
will produce estimates which are consistent and asymptotically normal. Because it fails to
account for interdependence over time, however, the estimates of the standard errors will
generally be inconsistent: standard errors of the fixed-over-time covariates will tend to be
underestimated, while those of time-varying covariates will be overestimated.
Results of the pooled probit model indicate that only one of the legal variables
exerts a substantial impact on justices’ votes: claims based on an assertion of ineffective
representation at the trial level are significantly more likely to receive a vote favorable to
the claimant than the baseline case. In contrast, all of the attitudinal factors save one
attain statistical significance. Both the presence of the U.S. as a party and multiple
petitions decrease the probability of a pro-claimant vote. Justice ideology is the largest
influence on the vote: liberal justices are very much more inclined to cast a pro-claimant
vote than are their conservative counterparts. Finally, the model does a fairly good job of
predicting the votes of the justices, resulting in a nearly 30 percent reduction in error from
a null model.
For comparison purposes, I also present the results of estimating the same equation
with two additional, commonly used models for dealing with time-series cross-sectional
data. As noted previously, the fixed-effects model addresses justice-specific heterogeneity
by estimating separate intercepts for each justice. The results for the fixed-effects model
are generally similar to those of the pooled model: most variables exhibit slightly greater
effects on the vote once individual-justice variations are controlled for, while standard
errors also generally increase as a result of the fewer degrees of freedom. A likelihood ratio
test indicates that inclusion of the fixed effects improves the model significantly (χ²(22) =
129.55, p < .01), and its predictive power is also substantially better than the pooled model
alone. On the other hand, the results also highlight a potential weakness of the fixed
effects specification: in the absence of variables which vary across observations, fixed
effects cannot distinguish individual effects. Thus, as with the previous analysis, the fixed
effects model treats justices with the same ideology score as equivalent.
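The likelihood-ratio comparison reported above can be checked with a few lines of code. The sketch below is an illustration only: it uses the closed-form chi-squared tail probability that holds for even degrees of freedom, which covers the 22 degrees of freedom in this test, and the function names are hypothetical. The log-likelihoods themselves appear in Table 2 and are not reproduced here.

```python
import math


def likelihood_ratio_statistic(loglik_restricted, loglik_full):
    """Twice the gain in log-likelihood from the restricted to the full model."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variate with an even number of
    degrees of freedom: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half_x = x / 2.0
    term = 1.0   # the k = 0 term
    total = 1.0
    for k in range(1, df // 2):
        term *= half_x / k
        total += term
    return math.exp(-half_x) * total


# Test statistic and degrees of freedom as reported for the fixed-effects model.
p_value = chi2_sf_even_df(129.55, 22)
print(f"p-value: {p_value:.3g}")
```

For odd degrees of freedom, such as the seven in the Section 4 comparison, the tail probability would instead come from a statistical library's chi-squared routine.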
Column three of Table 2 presents estimates from a random effects model, where
justice-specific effects follow a specific random distribution and are uncorrelated with the
independent variables (Butler and Moffitt 1982). Estimation is accomplished through
Gauss-Hermite quadrature with six support points; results were not generally sensitive
to variations in the number of points used. The findings are broadly similar to those of the
fixed-effects estimator, with similar coefficients and nearly identical standard errors. The
exceptions are the variable for justice ideology, which is estimated more precisely than in
the fixed effects case, and predictive fit, where the random effects model performs slightly
worse than the simple pooled model. The significant, positive estimate of rho indicates that we may
confidently reject the null hypothesis that the disturbances “within” a single justice are
uncorrelated.
None of the models presented thus far grapples with the main question of interest:
the issue of time dependence due to the impact of judicial consistency and legal precedent.
The finding in the random effects specification that errors for a particular justice are
correlated goes to the general issue of independence in the justices’ votes, but tells us little
about the form that consistency takes. Accordingly, I specify two alternative models which
explicitly model the impact of a justice’s previous votes on his or her current vote. The first
is a simple pooled probit model, with the inclusion of a variable for the justice’s vote in the
most recent habeas corpus case he or she decided. This model is far from perfect; substantively, it
does not differentiate among claims involving different areas of the law, while statistically
it suffers from all the drawbacks associated with dynamic BTSCS specifications outlined in
Beck and Tucker (1997). Nonetheless, the model has intuitive appeal, and will provide a
point of comparison for the GEE specification.
Results of a dynamic pooled probit specification appear in column four of Table 2.
The first case for each justice was dropped because of missing data on the previous decision
variable; estimates are therefore based on 916 observations. The findings are broadly
similar to previous models, both in terms of the coefficients and their standard errors.
Likewise, the addition of the dynamic term increases the predictive ability of the model
over the earlier pooled model by an appreciable amount. Most important, we see a
significant, positive impact of the prior decision on the justices’ voting behavior; holding
other variables constant at their means, a previously conservative vote increases the
probability of a conservative vote in the present case by 0.24.[16] By this initial estimate,
then, evidence of the impact of consistency (in the form of prior votes) in the area of habeas
corpus decisions is apparent.
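Constructing the previous-decision variable amounts to ordering each justice's votes and pairing every vote with the one before it, which is why each justice's first case drops out. A minimal sketch of that step follows; the tuple layout, the function name, and the toy data are assumptions made for illustration and do not describe the actual data file.

```python
from collections import defaultdict


def add_previous_vote(rows):
    """Attach each justice's vote in his or her preceding case.

    rows: iterable of (justice, case_order, vote) tuples, where case_order
    sorts a justice's cases chronologically. Returns a list of
    (justice, case_order, vote, previous_vote) tuples; each justice's first
    case is dropped because it has no previous vote.
    """
    by_justice = defaultdict(list)
    for justice, case_order, vote in rows:
        by_justice[justice].append((case_order, vote))

    lagged = []
    for justice, votes in by_justice.items():
        votes.sort()  # chronological order within justice
        for (_, previous_vote), (case_order, vote) in zip(votes, votes[1:]):
            lagged.append((justice, case_order, vote, previous_vote))
    return lagged


# Toy data: justice "B" did not participate in case 2.
toy_votes = [
    ("A", 1, 1), ("A", 2, 1), ("A", 3, 0),
    ("B", 1, 0), ("B", 3, 0),
]
lagged_votes = add_previous_vote(toy_votes)
print(lagged_votes)
```

The lag is taken over the cases a justice actually decided, so a justice who sat out a case is paired with his or her most recent participation rather than with a missing value.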
Finally, I estimate a GEE model of decision making on the same data. As suggested
above, we expect that justices’ decisions in older cases will have less impact on current
decisions than do more recent ones. With this in mind, and in the interest of parsimony, I
specify the working within-observation correlation matrix to follow an AR(1) format: for a
particular justice, the correlation between two decisions at times t and s (t>s) is estimated
as ρ^|t-s|, and is the same across justices. These results are presented in column five of
Table 2.
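The AR(1) working structure described above can be written out explicitly. The sketch below builds the matrix for a justice with four cases, using position in the justice's sequence of cases as a stand-in for time; the value of rho is hypothetical and is not the Table 2 estimate.

```python
def ar1_corr(n_cases, rho):
    """Working correlation matrix whose (t, s) entry is rho ** |t - s|."""
    return [[rho ** abs(t - s) for s in range(n_cases)]
            for t in range(n_cases)]


RHO_ILLUSTRATIVE = 0.5  # hypothetical value, not an estimate from the paper

R_ar1 = ar1_corr(4, RHO_ILLUSTRATIVE)
for row in R_ar1:
    print(["{:.3f}".format(value) for value in row])
```

Each additional case of separation multiplies the correlation by rho, so a single parameter governs the whole matrix and dependence decays geometrically with distance.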
Coefficient estimates for the GEE specification are for the most part similar to those
in previous models, generally tracking most closely with those derived from the simple
pooled estimator. The variable for Eighth Amendment claims is marginally significant
(p<.05, one-tailed) in the GEE model, providing weak confirmation for the idea that such
claims tend to be “last gasp” efforts of litigants with no other more hopeful avenue for an
appeal. Standard errors for the coefficients are slightly smaller than for previous models,
with the exception of the variable for ideology; this fact conforms with Liang and Zeger’s
(1986) general statement that cluster-specific pooled models overestimate the variability in
time-varying variables (here, all variables save ideology) while underestimating the
variance of time-stationary variables.
[16] Including an interaction of this variable with justice ideology resulted in a
negative and insignificant coefficient (p = .17, two-tailed) and no substantial changes in the
remaining coefficients. Previous votes do not, at least in this preliminary investigation,
appear to vary systematically in their impact on voting across justices with differing
ideological positions.
The empirical estimate of ρ is close to the coefficient estimated in the dynamic
model;[17] successive cases correlate conditionally at 0.34, those two cases apart at 0.16, and
so on, suggesting that the impact of a particular habeas corpus vote on similar votes in the
future has a relatively short memory, and dies out quickly as new cases come along. It is
especially important to note that this effect is conditional on the estimated values of the
coefficients, and vice-versa. Thus, to the extent that individual-level consistency in voting
due to ideological stability is controlled for in the model, the significant, positive estimate
of rho measures the average overall degree of consistency stemming from other factors.
While precedent may be only one of these potential factors, these results are at least
suggestive that justices may place some emphasis on stare decisis in making their
decisions.
Finally, as was the case in the more parsimonious model investigated in section 4,
the GEE consistently underperforms its more commonly-used counterparts in predicting
voting outcomes. Unlike the previous analysis, however, the model does represent a real
improvement over the null, and the extent of difference between the GEE and the others is
of smaller magnitude. This suggests that model specification plays a crucial part in the
predictive ability of estimates obtained via GEE estimation. This is not surprising; as
noted before, the estimates of the within-observation correlations in the GEE are
conditional on the parameter estimates. Thus, to the extent that the model is well-specified,
the auxiliary parameters will also be better estimated. Intuitively, careful model
specification allows the GEE estimator to more accurately parse out the effects on the
dependent variable between the coefficients on the independent variable(s) and the
parameters of the working correlation matrix.
[17] As noted above, no estimated standard errors may be obtained for this parameter.
6. Conclusion
Judges, and especially justices of the U.S. Supreme Court, do not make their
decisions in a vacuum. Over and above the wide range of factors investigators typically
include in their models of judicial behavior, a host of other influences lead one to the
inescapable conclusion that votes, judgments, and other decisions on collegial courts are far
from independent. That this fact is clearly implied by much of what we know about what
courts do is unsurprising; more surprising is the near-absence of any concerted efforts at
accounting for, let alone modeling, the nature of that dependence.
This paper represents an initial step in that direction. Here I suggest that a class of
generalized estimating equation models developed for investigating panel data in medical
research can be applied to address these issues. In particular, the ability of GEE models to
provide empirical estimates of conditional cross-observation correlations within a specific
unit of observation can and should be exploited by researchers in judicial politics to address
these questions of decision heterogeneity. From the two analyses presented, the potential
usefulness and limitations of the GEE model for studying judicial behavior begin to become
apparent. By explicitly incorporating substantively-driven assumptions about over-time
and cross-justice dependency in justices’ decision making, the GEE model makes more
complete, theoretically-informed use of the information contained in the repeated
“observations” that constitute justices’ decisions in a sequence of cases. This added
information offers the possibility of both more accurate and precise estimation of the
variability of our coefficient estimates, as well as a data-based, empirical measure of the
magnitude of such factors’ impacts on justices’ decisions. Moreover, the GEE’s ability to
provide robust, consistent estimates in the face of uncertainty over the proper specification
of the working correlation matrix suggests that these models may be particularly useful
when the nature of the dependency under investigation is unknown.
At the same time, my results also point out limitations of the GEE approach. For
example, GEE models did not, as a rule, offer the same level of predictive success as their
alternatives examined here, though the differences were attenuated by more fully specified
models. This does suggest, however, the importance of model specification in using the
GEE approach. In particular, since the estimates of the “nuisance” correlation parameters
are conditional on the estimated coefficients for the independent variables, problems with
specification will also potentially impact upon the conclusions one draws about
intra-observation dependence. This means that, even more than is usually the case, researchers
must be confident in their model specification before drawing any hard conclusions from
the GEE’s auxiliary parameters.
A related issue stems from the GEE’s inability to account for heterogeneity along
more than one “dimension” at a time. In the analyses presented here, for example, I focus
in turn on interjustice influence and temporal dependence, but do not address the two
issues simultaneously. While estimating both temporal and spatial effects is beyond the
scope of the GEE, the model remains a useful tool. For one, it remains an improvement
over other “pooled” models by allowing greater exploitation of information along one of
those axes. Moreover, to the extent that heterogeneity in the observations remains due to
the impact of the other, the GEE allows for the calculation of “robust”
(heteroscedasticity-consistent, e.g. White 1980) standard errors for the model estimates. Thus the researcher
may decide which specific problem he or she wishes to model explicitly in the GEE
context, then employ robust standard errors to mitigate the effects of heterogeneity due to
the remaining source(s) of variability.
With these caveats, the GEE clearly offers the potential for expanded inquiry into
the area of judicial decision making. Moreover, the method shows promise for application
in other areas of political science as well. For example, in the field of international
relations, dyadic conflict data often exhibit many of the same general varieties of
heterogeneity as the data discussed here, and other researchers have already begun to
investigate the applicability of the GEE in that context (e.g. Beck and Katz 1997). More
broadly, as the availability of time-series cross-section data in political science increases,
GEE methods offer an attractive alternative for dealing with issues of dependence in data
analysis.
Bibliography
Altfield, Michael F. and Harold J. Spaeth. 1984. Measuring Influence on the U.S.
Supreme Court. Jurimetrics Journal 24(2):236-247.
Barnhart, Huiman X. and John M. Williamson. 1997. Goodness-of-fit Tests for GEE
Modeling with Binary Responses. Biometrika forthcoming.
Baum, Lawrence. 1986. American Courts. Boston: Houghton-Mifflin.
Baum, Lawrence. 1988. Measuring Policy Change on the U.S. Supreme Court. American
Political Science Review 82(September):905-912.
Baum, Lawrence. 1992. Membership Change and Collective Voting Change in the United
States Supreme Court. Journal of Politics 54(February):3-24.
Baum, Lawrence. 1995. The Supreme Court. Fifth Edition. Washington, D.C.:
Congressional Quarterly.
Beck, Nathaniel. 1996. Reporting Heteroskedasticity Consistent Standard Errors. The
Political Methodologist 7(2):4-6.
Beck, Nathaniel and Jonathan Katz. 1997. The Analysis of Binary Time-Series
Cross-Section Data and/or The Democratic Peace. Paper presented at the Annual
Meeting of the Political Methodology Society, Columbus, OH.
Beck, Nathaniel and Richard Tucker. 1996. Conflicts in Time and Space. Paper presented
at the Annual Meeting of the American Political Science Association, San Francisco,
CA.
Boucher, Robert L., Jr., and Jeffrey A. Segal. 1995. Supreme Court Justices as Strategic
Decision-Makers: Aggressive Grants and Defensive Denials on the Vinson Court.
Journal of Politics 57(August):812-823.
Brace, Paul R. and Melinda Gann Hall. 1993. Integrated Models of Judicial Dissent.
Journal of Politics 55(November):914-935.
Brace, Paul R. and Melinda Gann Hall. 1997. The Interplay of Preferences, Case Facts,
Context, and Rules in the Politics of Judicial Choice. Journal of Politics
59(November):1206-31.
Brenner, Saul and Marc Steir. 1996. Retesting Segal and Spaeth’s Stare Decisis Model.
American Journal of Political Science 40(November):1036-48.
Brisbin, Richard A. 1996. Slaying the Dragon: Segal, Spaeth and the Function of Law in
Supreme Court Decision Making. American Journal of Political Science
40(November):1004-17.
Butler, J. S. and Robert Moffitt. 1982. A Computationally Efficient Quadrature Procedure
for the One-Factor Multinomial Probit Model. Econometrica 50(May):761-64.
Caldeira, Gregory A. and John R. Wright. 1996. Nine Little Law Firms: Agenda-Building
on the Supreme Court. Manuscript: Ohio State University.
Caldeira, Gregory A., John R. Wright and Christopher J. W. Zorn. 1996. Strategic Voting
and Gatekeeping in the Supreme Court. Paper presented at the First Annual
Conference on the Scientific Study of Law and Courts, St. Louis, MO.
Caldeira, Gregory A., John R. Wright and Christopher J. W. Zorn. 1997. Sophisticated
Judicial Behavior: Agenda-Setting Via the Discuss List. Paper presented at the
Annual Meeting of the American Political Science Association, Washington, D.C.
Epstein, Lee, and Carol Mershon. 1996. Measuring Political Preferences. American
Journal of Political Science 40(February):261-94.
Fitzmaurice, Garrett M., Nan M. Laird, and Andrea G. Rotnitzky. 1993. Regression
Models for Discrete Longitudinal Responses. Statistical Science 8(3):284-309.
Flango, Victor E. 1994. Habeas Corpus in State and Federal Courts. Williamsburg, VA:
National Center for State Courts.
Flemming, Roy B. and B. Dan Wood. 1997. The Public and the Supreme Court: Individual
Justice Responsiveness to American Policy Moods. American Journal of Political
Science 41(April):468-98.
Gibson, James L. 1983. From Simplicity to Complexity: The Development of Theory in the
Study of Judicial Behavior. Political Behavior 5(1):7-49.
Hall, Melinda Gann. 1992. Electoral Politics and Strategic Voting in State Supreme
Courts. Journal of Politics 54(May):427-46.
Heyde, Christopher C. 1997. Quasi-Likelihood And Its Application: A General Approach to
Optimal Parameter Estimation. New York: Springer-Verlag.
Hopkins, W. Wat. 1991. Mr. Justice Brennan and Freedom of Expression. New York:
Praeger.
Hsiao, Cheng. 1996. Logit and Probit Models. In László Mátyás and Patrick Sevestre,
eds. The Econometrics of Panel Data: A Handbook of the Theory with Applications.
2nd Revised Ed. Dordrecht: Kluwer Academic Publishers.
King, Gary. 1989. Event Count Models for International Relations: Generalizations and
Applications. International Studies Quarterly 33:123-47.
Knight, Jack and Lee J. Epstein. 1996. The Norm of Stare Decisis. American Journal of
Political Science 40(November):1018-35.
Liang, Kung-Yee and Scott L. Zeger. 1986. Longitudinal Data Analysis Using Generalized
Linear Models. Biometrika 73(1):13-22.
Maltzman, Forrest and Paul J. Wahlbeck. 1996a. May it Please the Chief? Opinion
Assignments in the Rehnquist Court. American Journal of Political Science
40(May):421-43.
Maltzman, Forrest and Paul J. Wahlbeck. 1996b. Strategic Policy Considerations and
Voting Fluidity on the Burger Court. American Political Science Review
90(September):581-92.
Mason, Alpheus T. 1956. Harlan Fiske Stone, Pillar of the Law. New York: Viking.
Mason, Alpheus T. 1964. William Howard Taft: Chief Justice. New York: Simon and
Schuster.
McCullagh, P. and J. A. Nelder. 1983. Quasi-Likelihood Functions. Annals of Statistics
11(1):59-67.
McCullagh, P. and J. A. Nelder. 1989. Generalized Linear Models. 2nd Ed. London:
Chapman and Hall.
Murphy, Walter F. 1964. Elements of Judicial Strategy. Chicago: University of Chicago
Press.
Naft, Erik and Christopher J. W. Zorn. 1998. The Supreme Court and Habeas Corpus: An
Integrated Model. Manuscript: Emory University.
Nelder, J. A. and R. W. M. Wedderburn. 1972. Generalized Linear Models. Journal of the
Royal Statistical Society A 135(3):370-384.
Neuhaus, J. M., J. D. Kalbfleisch and W. W. Hauck. 1991. A Comparison of
Cluster-Specific and Population-Averaged Approaches for Analyzing Correlated Binary
Data. International Statistical Review 59(1):25-35.
Pendergast, J. F., S. J. Gange, M. A. Newton, M. P. Lindstrom and M. R. Fisher. 1996. A
Survey of Methods for Analyzing Clustered Binary Response Data. International
Statistical Review 64(1):89-118.
Prentice, Ross L. and L. P. Zhao. 1991. Estimating Equations for Parameters in Mean and
Covariances of Multivariate Discrete and Continuous Responses. Biometrics
47:825-839.
Rohde, David W. and Harold J. Spaeth. 1976. Supreme Court Decision Making. San
Francisco: W. H. Freeman and Co.
Segal, Jeffrey A. 1986. Supreme Court Justices as Human Decision Makers: An
Individual Level Analysis of the Search and Seizure Cases. Journal of Politics
47(November):938.
Segal, Jeffrey A., and Albert J. Cover. 1989. Ideological Values and the Votes of U. S.
Supreme Court Justices. American Political Science Review 83(June):557-566.
Segal, Jeffrey A., and Harold J. Spaeth. 1993. The Supreme Court and the Attitudinal
Model. New York: Cambridge University Press.
Segal, Jeffrey A. and Harold J. Spaeth. 1996. The Influence of Stare Decisis on the Votes
of United States Supreme Court Justices. American Journal of Political Science
40(November):971-1003.
Segal, Jeffrey A., Lee Epstein, Charles M. Cameron, and Harold J. Spaeth. 1995.
Ideological Values and the Votes of U. S. Supreme Court Justices Revisited.
Journal of Politics 57(August):812-823.
Songer, Donald R. and Sue Davis. 1990. The Impact of Party and Region on Voting
Decisions in the United States Courts of Appeals, 1955-1986. Western Political
Quarterly 42(June):317-334.
Songer, Donald R. and Susan Haire. 1992. Integrating Alternative Approaches to the
Study of Judicial Voting: Obscenity Cases in the U.S. Courts of Appeals. American
Journal of Political Science 36(November):963-82.
Songer, Donald R. and Stefanie A. Lindquist. 1996. Not the Whole Story: The Impact of
Justices’ Values on Supreme Court Decision Making. American Journal of Political
Science 40(November):1049-63.
Spaeth, Harold J. 1997. The United States Supreme Court Judicial Database, 1953-1995.
Ann Arbor: Inter-University Consortium for Political and Social Research.
Spaeth, Harold J. and Michael F. Altfeld. 1985. Influence Relationships Within the
Supreme Court: A Comparison of the Warren and Burger Courts. Western Political
Quarterly 37(March):70-83.
Spiller, Pablo T., and Matthew L. Spitzer. 1995. Where Is the Sin in Sincere?
Sophisticated Manipulation of Sincere Judicial Voters (With Applications to Other
Voting Environments). Journal of Law, Economics, and Organization 7(1):32-63.
Wahlbeck, Paul J., James F. Spriggs II and Forrest Maltzman. 1998. Marshaling the
Court: Bargaining and Accommodation on the U.S. Supreme Court. American
Journal of Political Science 42(January):294-315.
White, Halbert. 1980. A Heteroscedasticity-Consistent Covariance Matrix and a Direct
Test for Heteroscedasticity. Econometrica 48:817-838.
Wedderburn, R. W. M. 1974. Quasi-Likelihood Functions, Generalized Linear Models, and
the Gauss-Newton Method. Biometrika 61:439-447.
Zeger, Scott L. and Kung-Yee Liang. 1986. Longitudinal Data Analysis for Discrete and
Continuous Outcomes. Biometrics 42(1):121-130.
Zellner, A. 1962. An Efficient Method of Estimating Seemingly Unrelated Regressions and
Tests of Aggregation Bias. Journal of the American Statistical Association 58:977-992.
Table 1
Model Comparisons for Civil Liberties Voting on the Burger Court, OT1981-1985

Variables                   Pooled       Fixed-Effects   GEE Model,      GEE Model,
                            Probit       Probit*         Exchangeable    Pairwise
(Constant)                   0.949        1.163           0.959           0.709
                            (0.026)      (0.062)         (0.040)         (0.041)
Justice’s Conservatism      -1.511       -1.916          -1.427          -0.775
                            (0.049)      (0.086)         (0.035)         (0.039)
ρ                            -            -               0.484           -
lnL                      -3814.85     -3608.64            n/a†            n/a†

Note: N = 6572. Numbers in parentheses are estimated standard errors. The dependent
variable is the justices’ votes in civil liberties cases, coded 0 for liberal votes and 1 for
conservative votes. *Due to collinearity in the ideology variable, Justice Marshall was not
estimated separately. †Since GEE models are estimated via a quasi-likelihood procedure,
the results are not directly comparable to more conventional MLE models.
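As a rough illustration of how the Table 1 coefficients translate into the kind of curves summarized in Figure 1, the sketch below evaluates the standard normal CDF at the fitted index for three of the four columns. It rests on assumptions not stated in the table: that the GEE columns use a probit link, that the ideology measure runs from 0 to 1, and that the fitted probability is for the outcome coded 1. The fixed-effects column is left out because its justice-specific terms are not reported. The names `probit_prob`, `TABLE1`, `GRID`, and `curves` are illustrative only.

```python
import math


def probit_prob(constant, slope, x):
    """Standard normal CDF evaluated at the linear index constant + slope * x."""
    z = constant + slope * x
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# (constant, coefficient on Justice's Conservatism) as reported in Table 1.
# Assumption: a probit link for the two GEE columns.
TABLE1 = {
    "Pooled Probit": (0.949, -1.511),
    "GEE, Exchangeable": (0.959, -1.427),
    "GEE, Pairwise": (0.709, -0.775),
}

# Assumption: the ideology score is scaled from 0 to 1.
GRID = [0.0, 0.25, 0.5, 0.75, 1.0]

# Fitted probability of the outcome coded 1 at each grid point, by model.
curves = {
    name: [probit_prob(a, b, x) for x in GRID]
    for name, (a, b) in TABLE1.items()
}

for name, probs in curves.items():
    print(name, [round(p, 3) for p in probs])
```

Because the pairwise GEE slope is roughly half the size of the pooled probit slope, its fitted curve is visibly flatter over the same range.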
Estimated Matrix of Intracase, Interjustice Correlations, Pairwise GEE Model

             Burger   Brennan  White    Marshall  Blackmun  Powell   Rehnquist  Stevens
Brennan       0.112
White         0.606    0.139
Marshall      0.084    0.873    0.122
Blackmun      0.486    0.580    0.533    0.600
Powell        0.608    0.262    0.597    0.225     0.681
Rehnquist     0.558    0.018    0.502   -0.022     0.313     0.499
Stevens       0.385    0.578    0.451    0.536     0.839     0.559    0.306
O’Connor      0.641    0.101    0.616    0.057     0.497     0.658    0.584      0.428
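To make the contrast between the two GEE working correlation structures concrete, the following sketch rebuilds the pairwise matrix above as a full symmetric 9 x 9 matrix and sets it beside an exchangeable matrix using the single estimate of 0.484 from Table 1. This is only an illustration of the two structures, not the estimation routine used in the paper; the helper names `symmetric_from_lower` and `exchangeable` are hypothetical.

```python
JUSTICES = ["Burger", "Brennan", "White", "Marshall", "Blackmun",
            "Powell", "Rehnquist", "Stevens", "O'Connor"]

# Lower triangle of the pairwise matrix: row i holds the correlations of
# JUSTICES[i + 1] with JUSTICES[0], ..., JUSTICES[i].
LOWER = [
    [0.112],
    [0.606, 0.139],
    [0.084, 0.873, 0.122],
    [0.486, 0.580, 0.533, 0.600],
    [0.608, 0.262, 0.597, 0.225, 0.681],
    [0.558, 0.018, 0.502, -0.022, 0.313, 0.499],
    [0.385, 0.578, 0.451, 0.536, 0.839, 0.559, 0.306],
    [0.641, 0.101, 0.616, 0.057, 0.497, 0.658, 0.584, 0.428],
]


def symmetric_from_lower(lower):
    """Fill a symmetric matrix with unit diagonal from a lower triangle."""
    n = len(lower) + 1
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, row in enumerate(lower, start=1):
        for j, r in enumerate(row):
            m[i][j] = m[j][i] = r
    return m


def exchangeable(n, rho):
    """One common correlation rho for every distinct pair."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


R_pairwise = symmetric_from_lower(LOWER)
R_exch = exchangeable(len(JUSTICES), 0.484)

# Simple average of the 36 pairwise estimates, for a rough comparison only.
off_diag = [r for row in LOWER for r in row]
mean_pairwise = sum(off_diag) / len(off_diag)
print(round(mean_pairwise, 3))
```

The exchangeable structure forces every pair of justices to share one value, whereas the pairwise structure allows entries as different as Brennan and Marshall (0.873) and Marshall and Rehnquist (-0.022).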
Table 2
Model Comparisons for Supreme Court Voting in Habeas Corpus Cases, 1953-1995

Variables                Pooled      Fixed       Random      Pooled      Autoregressive
                         Probit      Effects     Effects     Dynamic     GEE Probit
                                     Probit*     Probit      Probit
(Constant)                0.824       0.985       1.086       0.315       0.709
                         (0.122)     (0.249)     (0.166)     (0.147)     (0.132)
Ineffective Counsel      -0.572      -0.639      -0.650      -0.618      -0.471
                         (0.169)     (0.180)     (0.179)     (0.176)     (0.149)
Trial Court Error        -0.006      -0.016      -0.015       0.045       0.090
                         (0.129)     (0.136)     (0.135)     (0.133)     (0.109)
Fourth Amendment         -0.084      -0.083      -0.094      -0.007      -0.027
                         (0.193)     (0.203)     (0.201)     (0.202)     (0.176)
Fifth Amendment          -0.240      -0.252      -0.254      -0.227       0.015
                         (0.144)     (0.153)     (0.152)     (0.149)     (0.122)
Sixth Amendment          -0.124      -0.128      -0.122      -0.197      -0.125
                         (0.167)     (0.179)     (0.178)     (0.176)     (0.150)
Eighth Amendment          0.025       0.051       0.032       0.105       0.236
                         (0.157)     (0.167)     (0.166)     (0.163)     (0.140)
Multiple Petitions        0.499       0.602       0.590       0.633       0.673
                         (0.141)     (0.151)     (0.150)     (0.147)     (0.123)
U.S. as a Party           0.436       0.554       0.547       0.543       0.559
                         (0.140)     (0.153)     (0.151)     (0.149)     (0.131)
Per Curiam Opinion        0.059       0.111       0.104       0.056      -0.036
                         (0.105)     (0.113)     (0.112)     (0.108)     (0.101)
Justice Ideology         -1.716      -2.002      -2.268      -1.396      -1.721
                         (0.138)     (0.382)     (0.222)     (0.151)     (0.192)
Justice’s Vote (t-1)      -           -           -           0.621       -
                                                             (0.098)
ρ†                        -           -           0.199       -           0.338
                                                 (0.036)                 (n/a)
% Predicted Correctly    66.81       73.15       64.93       68.89       62.54
PRE                      29.7        43.2        25.8        34.1        20.7
lnL                    -566.863    -502.089    -525.845    -521.113       n/a‡
N                       961         961         961         916         961

Note: Numbers in parentheses are standard errors. * The fixed-effects probit model includes 22 justice-specific
constant terms (not shown); additionally, six justices were dropped from the estimation due to perfect
agreement on their ideology scores. † ρ is the estimated intrajustice correlation in the random component for
the random-effects model, and the estimated AR(1) nuisance parameter for the GEE model. ‡ Since GEE
models are estimated via a quasi-likelihood procedure, the results are not directly comparable to more
conventional MLE models.
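For readers less familiar with the AR(1) nuisance parameter reported for the GEE model, the sketch below shows the working correlation matrix it implies, assuming the conventional first-order autoregressive form in which two of a justice's votes that are k decisions apart have correlation ρ^k. The function name `ar1_correlation` and the choice of four periods are illustrative assumptions, not taken from the paper.

```python
def ar1_correlation(n_periods, rho):
    """AR(1) working correlation: entry (s, t) equals rho ** |s - t|."""
    return [[rho ** abs(s - t) for t in range(n_periods)]
            for s in range(n_periods)]


RHO_GEE = 0.338  # AR(1) estimate reported for the GEE model in Table 2

# Illustrative choice: four successive votes by the same justice.
R = ar1_correlation(4, RHO_GEE)
for row in R:
    print([round(r, 3) for r in row])
```

Under this structure the implied correlation falls off geometrically with the lag, so votes more than a few decisions apart are treated as nearly independent.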
Figure 1
Impact of Ideology on Probability of a Liberal Vote for Four Models
in Civil Liberties Cases, OT1981-1985
[Figure not reproduced. Vertical axis tick labels run from .2 to .9; horizontal axis tick labels run from 0 to 1.]