Stairway to heaven or highway to hell?
A skeptical view on combining regression analysis and
case studies
– draft, 3/27/2008 –
Paper to be presented at the ECPR Joint Sessions, Rennes, 11-16 April 2008
Workshop on
“Methodological Pluralism? Consolidating Political Science Methodology“
Ingo Rohlfing, PhD
Research Associate
Department of Management, Economics and Social Sciences
University of Cologne
rohlfing@wiso.uni-koeln.de

Dr. Peter Starke
Research Associate
Collaborative Research Center 597 "Transformations of the State"
University of Bremen
peter.starke@sfb597.uni-bremen.de
1. Introduction
Mixed-method approaches, triangulation, nested design – those are the terms that have been
used to describe – and often advocate – methods that combine small-N, case study work and
large-N, statistical techniques of analysis (Symposium, 2007). After years of mutual
ignorance, skepticism, or even hostility between the two camps of qualitative and quantitative
researchers in political science, a growing number of scholars now take the view that
methodological pluralism and, indeed, the combination of various methods is the way forward
(see, for example, Bäck and Dumont, 2007; Bennett, 2002; Capoccia and Freeden, 2006;
Coppedge, 1999; Lieberman, 2005; Tarrow, 1995). We share this view but will, in this paper,
point to a number of practical problems of ’mixing methods’, and of combining
process-tracing case studies with multivariate regression in particular. More specifically, this
paper addresses several serious problems which affect precisely the case selection procedures
that have been advanced in the methodological literature as ‘good practice’.
In what follows, we concentrate on three research designs in particular: typical case,
deviant case and pathway case. In contrast to what is argued in the literature, we contend that
there is no simple way to identify genuine deviant or typical cases. The same goes for the
recently developed pathway case technique (Gerring, 2007b), which largely derives its logic
from the more long-standing deviant-case design. The bottom line of our paper is that the model-dependence of case selection should move to the center of the discussion. While there is
considerable literature on the model-dependence of statistical results (e.g., Bartels, 1997; Ho,
Imai, King and Stuart, 2007; King and Zeng, 2007), there is almost no reflection about how
this dependence affects the choice of cases and within-case analyses. On the basis of our
criticism of the existing techniques, we suggest a robust case selection technique that allows
us to select deviant and typical cases with more confidence. Our point is not to argue that
multi-method work is not advisable per se, but to point to the problems and unresolved issues
that may, in some cases, seriously endanger the validity of the causal inferences. Put
differently, the whole may not be more, but less than the sum of its parts (Dunning, 2007;
Rohlfing, 2008).
The paper is structured as follows: We first present what has emerged as something
like the ‘canon’ of case selection in multi-method designs. We particularly focus on mixed-method designs aimed at uncovering spurious empirical relationships and at identifying
omitted variables. We then go on to criticize the case selection techniques that have been
advised in this context. The deviant case analysis and its more recent ‘enhanced version’, the
pathway case, are at the center of our argument. In the fourth section, we generalize our
critique and look at what we call the ‘model-dependence’ of case selection and its various
forms. We present, in section five, a procedure to select ‘robust’ typical or deviant cases that
explicitly takes some of our criticisms into account. The last section concludes.
2. Why and how to select cases in regression analysis?
Before addressing some of the problems of case selection and analysis in mixed-methods
designs, we should ask for the main reasons for using mixed methods in the first place. Why
should quantitative large-N techniques – usually multivariate regressions – be supplemented
with less technical methods based on a much smaller number of units (cf., Beck, 2006)? Or,
conversely, why should qualitative scholars base their case selection on large-N analyses?
Don’t they already have enough detailed knowledge about ‘their’ cases to be able to develop a
workable research design without having to rely on quantitative methods and their numerous
and demanding underlying assumptions (Ebbinghaus, 2005)?
A variety of reasons for using mixed methods have been advanced in the literature.
For instance, after having carried out an in-depth study of a single case, a scholar may have
reason to believe that his or her conclusions can be generalized (cf., Rueschemeyer, 2003).
The validity of the small-N conclusions can then simply be tested on a larger number of cases
in a regression analysis. An exploratory case study is thus combined with a confirmatory
large-N analysis (Lieberman, 2005). In this paper, however, we focus on two specific mixed-methods research designs and case selection techniques usually associated with them. These
are, first, testing for spuriousness and the identification of within-case causal processes and,
second, finding omitted variables.
On the first count, process tracing can be used to identify the causal processes
operating at the within-case level, presuming that a causal effect is underpinned by a causal
process (Tarrow, 1995: 472). Indeed, testing for spuriousness and the presence of causal links
is one major purpose of process tracing (George and Bennett, 2005: 35; Lieberman, 2005:
444). The most basic form of spuriousness is present when an X/Y-relationship identified
through large-N analysis is non-causal because both X and Y depend on a common
background variable Z (Simon, 1954). Process tracing thus looks for the presence and nature
of causal pathways linking X and Y and, if there is no causal process, whether a third variable
Z causes both X and Y. It might even be the case that X and Y correlate with each other, but
there is neither a causal link between them, nor is there an antecedent variable Z accounting
for the correlation. An example for this variant of spuriousness can be found in Thomson’s
(2007) regression analysis of the compliance of 15 European Union (EU) member states with
six labor market directives. One independent variable is the degree of centralization of the EU
countries. The hypothesis is that an increasing degree of decentralization makes the timely
implementation of directives less likely, which is supported by the (large-N) results. However,
a qualitative analysis of the directives shows that the sub-federal level was not involved at all
in the implementation stage (Falkner, 2007). Thus, the correlation between the
decentralization variable and the dependent variable is spurious.
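To illustrate the statistical notion of spuriousness invoked here, the following sketch (in Python, with simulated data and arbitrary coefficients chosen purely for illustration) generates an X/Y-correlation that is entirely due to a common background variable Z and that disappears once Z is included in the model:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 1000
    z = rng.normal(size=n)             # common background variable Z
    x = 0.8 * z + rng.normal(size=n)   # X depends on Z, but not on Y
    y = 0.8 * z + rng.normal(size=n)   # Y depends on Z, but not on X

    # Bivariate regression: X appears to "explain" Y ...
    print(sm.OLS(y, sm.add_constant(x)).fit().params)

    # ... but the association vanishes once Z is controlled for.
    print(sm.OLS(y, sm.add_constant(np.column_stack([x, z]))).fit().params)

Process tracing attacks the same problem from the within-case side: rather than conditioning on Z statistically, it asks whether a causal process linking X and Y can be observed at all.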
If there is a causal process, one can inductively generate hypotheses about how a
certain X causes Y (Lijphart, 1971: 692). Alternatively, one can proceed deductively by
developing theoretical expectations about causal within-case patterns that are then tested
empirically through ‘pattern matching’, that is, by contrasting theoretical expectations with
empirical observations.1 In these variants, process tracing is particularly useful for the
analysis of overdetermined causal relationships at the cross-case level, that is, situations in which several theories
predict the same X/Y-relationship. Ideally, we have several rival theories and can assume that
all, except one, are spurious. Small-N techniques then help to identify the one theory that
causally explains the outcome. Process tracing is suitable for such purposes because rival
theories are usually based on rival mechanisms that we can try to uncover at the within-case
level through pattern matching. 2 For instance, the reciprocal exchange of concessions in
international trade can be explained through the politicians’ concerns about national welfare,
national security, and the lobbying of private economic actors (cf., Bhagwati, 2002; Gowa and
Mansfield, 1993, 2004). Thus, observing reciprocity does not suggest which of the three
approaches really explains trade cooperation. According to the security account, however, the
military should play an important role in the trade policy process and lobbying of economic
actors should be irrelevant. The domestic politics explanation makes the opposite predictions,
thus allowing us to clearly discriminate between the two theories at the within-case level.3
1 Another purpose of process tracing is to develop comprehensive explanations for a small number of cases (Mahoney and Goertz, 2006), which is the key feature of Comparative Historical Analysis (CHA) (Mahoney and Rueschemeyer, 2003). In terms of causal perspectives, one can say that case studies in regression analysis are X-centered, i.e., one is interested in the effects of certain independent variables. Case studies in the realm of CHA, on the other hand, are Y-centered insofar as they are interested in understanding a specific outcome.
2 Obviously, this view is based on the assumption that causal mechanisms are observable. There is no scholarly agreement on this point (cf., Gerring, 2007c). For the sake of our argument, however, that issue is not crucial.
3 One important precondition for pattern-matching is that theories are sufficiently specified in terms of their underlying causal mechanisms, which is often not the case. Strictly speaking, in order to do deductive process tracing correctly, we need to ‘establish an uninterrupted causal path linking the putative causes to the observed effects, at the appropriate level(s) of analysis as specified by the theory being tested’ (George and Bennett, 2005: 222). Spelling out the expected empirical implications of a theory in a fine-grained manner, however, is no trivial task and may itself be based on a number of contestable assumptions.
Now, what case(s) should be selected for pattern-matching in a mixed-method design?
Prima facie, it is argued that one should select a ‘typical case’ or ‘onlier’ (Eckstein, 1975:
108; Gerring, 2007a: 91-97). In statistical terms, a typical case is defined by a low residual in
a regression model. Statistically typical cases are believed to be theoretically typical too. This means that no omitted variables should be in place and that such cases are best suited to test the within-case implications of one or multiple theories. For example, Lieberman (2003) performs a regression analysis of tax structures and picks South Africa as a typical case so as to perform a within-case analysis (South Africa is additionally compared with Brazil, which
is an outlier).
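In computational terms, this classification of cases can be sketched as follows (Python; the data file, variable names, and the one-standard-deviation cut-off are placeholders, not a fixed convention):

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("cases.csv")                      # hypothetical cross-case dataset
    model = smf.ols("y ~ x1 + x2 + x3", data=df).fit()

    resid = model.resid
    cutoff = resid.std()                               # e.g., one residual standard deviation
    df["status"] = ["typical" if abs(r) <= cutoff else "deviant" for r in resid]

    typical = df[df["status"] == "typical"]   # candidates for pattern matching
    deviant = df[df["status"] == "deviant"]   # candidates for the search for omitted variables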
A second important aim of mixed-method designs is the search for omitted variables.
The most prominent case selection technique in this context is the ‘deviant case analysis’
which is done in an exploratory fashion (Eckstein, 1975: 110). In Arend Lijphart’s classic
definition (1971: 692) deviant case analyses are ‘studies of single cases that are known to
deviate from established generalizations’. The aim is ‘to uncover relevant additional variables
that were not considered previously or to refine the (operational) definitions of some or all of
the variables’.4 This mode of research is at present probably more widely acknowledged as a
key feature of case studies than pattern-matching. In fact, some of the more respected case-study designs are based on the idea that within-case analysis may help in the search for
variables previously not considered relevant (Rogowski, 1995). A famous example of a
deviant case analysis is Lijphart’s (1968) own study of the Netherlands. Whereas it was
previously thought that democratic consolidation in segmented societies is impossible when
cross-cutting cleavages between groups are absent, Lijphart demonstrated that this is not
necessarily the case. The Netherlands, a ‘pillarized society’ marked by high segmentation into
a small number of religious/ideological groups nonetheless had a history of stable democracy.
He explained this particular outcome by highlighting the ‘politics of accommodation’ at the
elite level, a variable which he thought may enrich the existing pluralist theories of
democracy. Lijphart thus went beyond explaining the single case of the Netherlands as an
historical curiosity and towards the development of more general propositions (to be tested in
later studies).
In statistical terms, we are thus looking for cases with a high residual, or ‘outliers’
(Gerring, 2007a: 105-108). Contrary to what is often believed, quantitative analysis is not
blind to the issue of deviant cases and case knowledge more generally. In a textbook
4 The second part of the quote refers to the refinement of concepts and indicators. This is an important task of within-case analysis (Adcock and Collier, 2001), which we cannot address in more detail in this paper.
regression, the researcher is equipped with within-case insights that inform the specification
of the model (Achen, 2005; Beck, 2006). Given that deviant cases pose problems for proper
regression estimation, sound knowledge of cases particularly extends to deviant cases (cf.,
King and Zeng, 2007).5 The rationale for the choice of statistically atypical cases, that is,
cases with a high residual, is that they are expected to be theoretically deviant as well. It is
precisely this theoretical deviance one aims to resolve by discerning the variables keeping the
case away from the regression surface. Because of this, deviant cases are considered
inappropriate for pattern-matching (George and Bennett, 2005: 20-21; Gerring, 2007a: 105-106); pattern-matching should instead be performed on typical cases, as described above.
One serious problem of both typical case and deviant case analysis is the issue of
systematic and non-systematic variables, which we think is a crucial problem when it comes
to integrating quantitative and case study methods. In the quantitative literature, a variable is
considered important when it has a systematic effect, that is, when the causal effect is
different from zero in a large number of cases (King, Keohane and Verba, 1994: 76-85).6
Thus, “systematic” and “non-systematic” are cross-case properties of variables that cannot, by definition, be identified in small-N research. Finding omitted factors should not be a
problem in a case study because the empirical picture is much more complex than the one
captured by the model (Eckstein, 1975: 107; Geddes, 2003: Ch. 3). However, the tricky task is
to separate the idiosyncratic factors – e.g., the factors explaining stable democracy in the
Netherlands and only there – from the ones that are relevant at the cross-case level – e.g.,
factors that explain democratic consolidation in divided societies more generally. Due to the
larger number of cases, regression analysis is more suitable for separating systematic from
non-systematic variables. Yet, as emphasized earlier, the systematic relationships may still be
non-causal, that is, spurious. Conversely, while the advantage of small-N analysis is its ability
to detect spuriousness, the corresponding disadvantage is the difficulty of detecting non-systematic variables through case-study analysis.
What does this mean for the problem of selecting deviant and typical cases? In the
case of the deviant case, we may well find a cause that pulls the case away from the
regression surface, but without looking at a larger number of cases, we simply cannot know if
the causal factor we have found is systematic or not. With respect to the typical case, there
5 We readily acknowledge that the practice of handling outliers often deviates from the textbook advice, which is probably true for all methods. In the case of regression estimation, one may dispense with a deviant-case analysis by running a robust regression that diminishes the influence of the case on the results (Berk, 1990; Western, 1995) or by eliminating the case altogether. The latter strategy is viable if it can be shown that the case should not belong to the population. However, this requires a within-case analysis in the first place.
6 Of course, a variable with a systematic effect must also be substantively relevant, meaning that it must make theoretical sense to include it in the model at hand.
may equally be non-systematic factors that transform a theoretically atypical case into an
empirically typical case, i.e., one with a low residual. In other words, the residual alone does not tell us whether a case is not just statistically but also theoretically typical or deviant.
Notwithstanding this criticism, we still believe that it makes sense to select cases on the basis of the residual. However, we would like to sound a note of caution with respect to some of the
strong theoretical assumptions regarding the status of a deviant or typical case. A deviant-case
analysis should be performed with the knowledge that the acquired process tracing evidence
can only deliver clues about another model specification, the appropriateness of which needs
to be determined through large-N analysis and diagnostics. Another implication of the inability of case studies to credibly identify theoretically typical and deviant cases is that the large-N method should provide the best possible context for a within-case analysis. This means that the estimated residuals should be as valid an indicator as possible of the theoretical status of a case. In the following sections, we argue that this is less often the case than is
acknowledged in the literature.
3. Outliers, pathway cases, and the statistics of case selection
In the previous section, we have detailed the standard approach toward the choice of cases in
regression analysis. Recently, the pathway case procedure has been proposed as a somewhat
more sophisticated variant of the conventional account (Gerring, 2007b). The rationale for the
choice of pathway cases is to achieve the best possible context for the inductive development
of hypotheses on causal process and pattern-matching.7 The search for pathway cases begins
with the identification of the model thought to be free of misspecification errors and which
displays a good performance.8 After having selected the model, the variable one wants to
make the subject of a within-case analysis is dropped from the equation. The reduced model is
then estimated and the residuals for all cases are computed. In the next step, one identifies
those cases that are typical in the full model and for which the absolute value of the reduced-model residual is larger than the absolute value of the full-model residual. In other words, we
should look for cases for which the inclusion of the additional variable in the full model
7 The pathway approach can be applied to classic two-case comparisons with binary variables as well as to continuous variables in regression analysis. We limit the following discussion to the latter type because of our interest in regression analysis and residuals as the basis for case selection.
8 Selecting a pathway case without having a well-performing model in the first place makes no sense because there is little value in examining independent variables without considerable explanatory power. The precise criterion for model performance and selection – like the Akaike and Bayesian Information Criterion (cf., Kuha, 2004) – is not relevant for the point we make.
makes a big difference in terms of pulling them towards the regression surface. The set of
cases satisfying these criteria are the pathway cases. Within this set, one should choose the
one case with the largest difference between the residuals of the reduced and the full model
(Gerring, 2007b: 242-243).9 After having selected the appropriate case, one proceeds with a
process tracing analysis of the X/Y relationship of interest. As we explained above, this can
be done in an exploratory fashion, i.e., by discerning how X actually produces Y, or
deductively, which requires theorizing in advance of the within-case analysis about competing
causal mechanisms (George and Bennett, 2005: Ch. 10).
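A sketch of this selection rule (Python; the dataset and variable names are hypothetical, with x1 standing in for the variable of interest) might look as follows:

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("cases.csv")                        # hypothetical dataset
    full = smf.ols("y ~ x1 + x2 + x3", data=df).fit()    # x1 is the variable of interest
    reduced = smf.ols("y ~ x2 + x3", data=df).fit()      # same model with x1 dropped

    res_full, res_red = full.resid, reduced.resid
    typical_in_full = res_full.abs() <= res_full.std()   # 'typical' cut-off, here one SD
    candidates = typical_in_full & (res_red.abs() > res_full.abs())

    # Among the candidates, pick the case with the largest difference between
    # the reduced-model and the full-model residual (Gerring, 2007b: 242-243).
    pathway_case = (res_red - res_full).abs()[candidates].idxmax()
    print(df.loc[pathway_case])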
As a matter of fact, the pathway case technique is a formalization of a longstanding
argument about the analysis of deviant cases. When a variable previously identified in an
outlier is added to a model, the case should be a typical case in the expanded model. The
reason is that a variable that is relevant at the cross-case level should add explanatory power
so that the cases move closer to the regression surface on average. The pathway case
technique takes the opposite starting point. It drops a variable from a model that seems well-specified and then picks the case that is most deviant in the reduced model. Because of the
intimate link between the pathway case procedure and the established deviant-case analysis,
our criticism of the former automatically applies to the latter as well in ways that we detail at
the end of this section.
The rationale for the pathway case technique, and case selection based on regression
analysis more generally, is that the residuals capture the causal effect of unmeasured
variables. With respect to the pathway case, this means that the difference between the full-model and reduced-model residual can be attributed to the variable that is dropped from the
model. We argue that this interpretation of differences in residuals is misleading because it
ignores the adverse statistical effects of omitting variables in regression estimation. In this
context, we want to emphasize that we do not claim to provide innovative statistical insights: the statistics on which our critique rests are basic to quantitative analysis. Instead, our argument is that in the realm of mixed-method designs, the seemingly intuitive manipulation of regression analysis through the lens of case-study analysis is fallacious because of inherent incompatibilities between quantitative and qualitative research.
In the best of all worlds, the included independent variables are orthogonal, that is,
completely uncorrelated. In practice, however, this is almost never the case (Gujarati, 2004:
9 The formula summarizing the procedure is: Pathway = |Res_reduced − Res_full|, if |Res_reduced| > |Res_full| (Gerring, 2007b: 243).
513), so we discuss the more realistic case of multicollinear independent variables here.10
Multicollinearity is present when some variance of an independent variable can be modeled
through a linear combination of the other independent variables. The presence of
multicollinearity is the rule in multivariate analysis, and it becomes more severe as the number of independent variables grows. With respect to case selection, collinearity is a two-fold
problem. First, the estimated coefficients in the full model may change in size and may even
switch signs as compared to the identical model with no multicollinearity (Fox, 1991: 11).
Since the choice of pathway case also depends on the accuracy of the full-model residuals, it
is obvious that this problem undermines case selection.
Second, the causal effect of the independent variable one drops from the full model is
not fully absorbed by the residuals. The stronger the degree of multicollinearity, the larger the share of the causal effect that will be absorbed by the other independent variables. In technical terms, the omission of a multicollinear variable from a model renders the
estimators of the remaining independent variables biased and inconsistent (Gujarati, 2004:
510-511). As a consequence of that, the estimated reduced-model coefficients are
systematically different from the true coefficients that we need to know in order to obtain
meaningful residuals. More specifically, the reduced-model residuals capture the causal
influence of the eliminated independent variable plus a specification error with ambiguous
effects on the regression output. Thus, case selection takes place under uncertainty about the
interpretability of the reduced-model residuals and is therefore error-prone. Because of these problems of collinearity, we argue that the pathway procedure is inherently unsuited to the purpose for which it was designed.11
While multicollinearity is the more severe problem, we want to add that the pathway
approach is questionable even if the variables are orthogonal. In this constellation, the good
thing is that the estimation of the full model is not undermined and the estimators of the
reduced-model coefficients are unbiased. However, the estimator of the intercept of the
regression surface remains biased (Gujarati, 2004: 511). Since an accurate estimation of the
intercept is as important as the correct estimation of the coefficients for the identification of
pathway cases, it can be seen that the omission of a variable is a general problem.
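A small simulation (Python; the coefficients and the degree of collinearity are arbitrary illustrative choices) makes the point concrete: when x1 and x2 are correlated and x2 is dropped, part of x2's effect migrates into x1's coefficient instead of into the residuals:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 500
    x1 = rng.normal(size=n)
    x2 = 0.7 * x1 + rng.normal(scale=0.7, size=n)   # x2 is collinear with x1
    y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)    # 'true' data-generating process

    full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
    reduced = sm.OLS(y, sm.add_constant(x1)).fit()

    print(full.params)      # close to the true coefficients (0, 1, 1)
    print(reduced.params)   # x1's coefficient is inflated: it absorbs part of x2's effect

    # The reduced-model residuals correlate only weakly with the dropped variable,
    # i.e., they capture its causal influence incompletely.
    print(np.corrcoef(reduced.resid, x2)[0, 1])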
10 One may argue that our critique is somewhat unfair because multicollinearity is a specification problem that is not characteristic of a “true model”, the identification of which is considered a prerequisite for the pathway technique. However, we do not see much value in discussing a technique that is detached from real-world empirical analyses, in which multicollinearity is a pervasive problem. Thus, we evaluate the pathway approach in the presence of multicollinearity.
11 On a more general level, the implications of multicollinearity show that case selection is model-dependent, which is what we address in more detail in the following section.
Ultimately, the problem for case selection is that one may choose a wrong case. We
define a wrong case as a case whose observed status differs from its true status.
This means a truly typical case appears as deviant and vice versa. Because of such
discrepancies, one may select a true outlier for pattern-matching because of the belief that the
case is typical. Similarly, it is conceivable that a truly typical case is selected for an
exploratory within-case analysis searching for omitted variables. When a wrong case is chosen,
process tracing will be based on false premises and may undermine the generation of valid
causal inferences. Of course, not all cases have the wrong status in the pathway procedure.
However, there is some unknown potential for erroneous case selection when the estimated
model is not the correct one, thus introducing uncertainty in the validity of case selection.
At the beginning of this section we explained that the pathway procedure takes the
reverse view on an established deviant-case argument. According to this, an outlier should
become a typical case when the model is expanded by a variable that has been identified
earlier in the within-case analysis of an outlier. We believe that this perspective is deficient,
too, because a decreasing residual should not come as a surprise. In general, more variables
tend to capture more variance in the outcome and the cases are closer to the regression
surface on average. An increase in model-fit may be spurious because the variable that is
relevant in the deviant case may be non-systematic in the whole set of cases. Thus, it is
essential to run the appropriate diagnostics for overfitting, such as Hausman specification tests, on
the expanded model. If the originally omitted variable is indeed systematic, the test results for
overfitting should be negative. This finding can be strengthened even further by running tests
for underfitting on the original model. If these tests are negative and the test for overfitting on
the expanded model is positive, there is good reason to believe that the within-case evidence
is particular to the outlier.
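As one simple illustration of such diagnostics (not the authors' specific prescription), a nested-model F-test asks whether the variable uncovered in the outlier adds systematic explanatory power across all cases (Python; variable names are hypothetical):

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("cases.csv")                              # hypothetical dataset
    original = smf.ols("y ~ x1 + x2", data=df).fit()
    expanded = smf.ols("y ~ x1 + x2 + x_new", data=df).fit()   # x_new was found in the outlier

    # Does x_new improve the fit beyond what chance would suggest?
    f_stat, p_value, df_diff = expanded.compare_f_test(original)
    print(f_stat, p_value)
    # A large p-value would suggest that x_new is idiosyncratic to the deviant case
    # rather than a systematic variable at the cross-case level.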
An additional problem we see with the outlier approach corresponds closely with our
critique of the pathway case. When a systematic variable is missing in the original model, the
residuals carry the effect of this variable as well as a specification error. A case with a large
residual may indeed be a theoretical outlier and therefore suitable for an exploratory within-case
analysis. However, it is also conceivable that the residual is an artifact deriving from a
misspecified regression model producing misleading coefficients. In this instance, the case is
theoretically typical, not atypical, and it only appears as a statistical outlier because of the
adverse effects of the ignored variable on regression estimation. If one picks such a case in the search for omitted variables, one selects a wrong case and performs process tracing on
false premises because the case actually is a theoretical onlier. To conclude, we believe that
the traditional deviant-case perspective and the pathway technique are less appropriate for the purposes for which they were developed than is currently argued in the literature.
4. The model-dependence of case selection
The particular problem of the pathway technique and the deviant case analysis is that cases
may not have the status they should have because of an inherent misspecification of the
model. In this section, we generalize this argument by highlighting the model-dependence of
case selection. Put simply, the problem can be summarized as one of “how the model you
choose affects the cases you get”. We base our discussion in this section on the analysis of
welfare state expenditure data. The analyses serve to illustrate our methodological claims.
They do not imply any substantive claims on welfare state development or the veracity of the
models we estimate.
Assume that we aim to assess the explanatory power of a simple linear-additive model
having as independent variables the share of elderly people (65 years and older), Gross
Domestic Product (GDP), and the share of unemployed people. The outcome we want to
explain is welfare state expenditure as a share of the GDP. The dataset comprises observations
for 21 OECD member states for the years 1980, 1990, and 2000, totaling 63 observations. We
estimate two models: one taking logged welfare state expenditure as the dependent variable
and one taking the non-transformed data as the outcome. The logged expenditure model
performs somewhat better according to the AIC and BIC (output not reported here), so a
“mindless quantitative researcher” (Beck, 2006) would select this model without running any
diagnostics. Afterwards, all cases within one standard deviation above or below the regression
line are classified as typical.12 While the model performs more or less well when using the
transformed dependent variable, it is a contestable choice. The original data yields better results in the Shapiro-Wilk W test for the normal distribution of the dependent variable: the p-value for the logged data is .11, while the result for the ordinary data is .60.13 In this respect, the model drawing on the non-transformed data is superior. The result for the transformed data is not significant at the conventional .05 level, which would have been a strong sign of a non-normal distribution. It is, however, sufficiently close to create strong doubts about the suitability of
12 The assignment of cases to the set of typical and deviant cases evidently depends on where the boundary is drawn. The sensitivity of case classification to the specification of the dividing line is an important issue and affects case selection too. However, we cannot pursue this topic further here.
13 Of course, our simple model suffers from additional specification problems. Recall that the example is purely illustrative.
the transformed data and to consider the non-transformed model the better one (leaving aside
all other specification issues).
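The steps of this illustration can be sketched as follows (Python; the data file and variable names are placeholders, not the dataset actually used here):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from scipy.stats import shapiro

    df = pd.read_csv("oecd_welfare.csv")          # hypothetical file: 21 OECD states x 3 years
    df["log_exp"] = np.log(df["exp"])

    m_exp = smf.ols("exp ~ elderly + gdp + unemployment", data=df).fit()
    m_log = smf.ols("log_exp ~ elderly + gdp + unemployment", data=df).fit()

    print(m_exp.aic, m_exp.bic, m_log.aic, m_log.bic)   # information criteria
    print(shapiro(df["exp"]), shapiro(df["log_exp"]))   # Shapiro-Wilk W test on the DV

    # Classify cases as typical within one residual standard deviation of the regression line.
    for name, m in [("exp", m_exp), ("logexp", m_log)]:
        df["status_" + name] = np.where(m.resid.abs() <= m.resid.std(), "typical", "deviant")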
The decision between the original and the transformed data is important because the status of a case as typical or deviant may be sensitive to the model one estimates.14 Table 1 details the cases that are outliers in the two models.15 Cases that are deviant in both analyses are marked with an asterisk. All other listed cases are outliers in only one of the two models. Cases that are not included at all are typical irrespective of the data used.
Table 1: Sensitivity of case selection to the estimated model

Deviant cases, exp model        Deviant cases, logexp model
Denmark, 1980*                  Austria, 1980
Finland, 1990*                  Denmark, 1980*
Greece, 1980*                   Finland, 1990*
Ireland, 1990*                  Greece, 1980*
Ireland, 2000*                  Ireland, 1990*
Japan, 1990*                    Ireland, 2000*
Japan, 2000*                    Japan, 1990*
The Netherlands, 1980*          Japan, 2000*
New Zealand, 1990*              The Netherlands, 1980*
Sweden, 1980*                   New Zealand, 1990*
Sweden, 1990*                   Portugal, 1980
Sweden, 2000                    Spain, 2000
Switzerland, 1990*              Sweden, 1980*
United States, 1990*            Sweden, 1990*
                                Switzerland, 1990*
                                United States, 1990*

* deviant in both models (robust deviant cases)
We call those cases that have the same status in both regressions robust cases because their
classification is insensitive to the estimated model. 13 out of 16 outliers in the regression
operating with the transformed data are robust deviant cases, denoting that they are outliers in
both estimated models. Consequently, all cases that are listed in neither column of the table
are robust typical cases. Furthermore, it can be seen that four cases are non-robust. Austria
and Portugal in 1980 and Spain in 2000 are outliers in the wrong model and typical in the
correct regression drawing on the non-transformed data. The reverse constellation holds for
Sweden in 2000.
Similar to what we have discussed above, the presence of non-robust cases opens the door to the choice of wrong cases, that is, cases having the wrong status given the research
question at hand. On the one hand, three of sixteen cases appear as deviant in the wrong
14 See Deken and Kittel (2006) for a treatment of the (lack of) sensitivity of panel regression results to the type of welfare state expenditure data one uses.
15 The regression output is not reported here because it is irrelevant for our point.
regression while they are typical in the correctly specified one. This means that there is a
chance of nearly twenty percent that one will select a wrong case (presuming that cases are
randomly chosen, cf. Lieberman, 2005: 446-448). On the other hand, Sweden in 2000 appears
as typical in the logged model, but actually should be classified as deviant because this is the
case’s status in the non-transformed model. Since there is only one out of 47 typical cases that
should not be selected for a typical-case analysis because of an incorrect status, the chance of
committing a wrong case selection is rather small in our hypothetical example.
In practice, however, one is often confronted with multiple and probably more severe
specification problems affecting the proper estimation of the regression surface and the valid
identification of cases. In some instances, it may be easy to make the correct specification
decision, while it will be a rather ambiguous endeavor to determine the most appropriate form
of the model in other cases. As a matter of fact, it is often not possible to single out one model
as the unequivocally superior one (Bartels, 1997; Ho et al., 2007; Kittel, 1999; Kittel and
Winner, 2005). Thus, we believe that the model-dependence of case selection is a pervasive
problem that is currently largely neglected in the methodological literature and in empirical research selecting cases from regression analyses. While model-dependence is a widespread
phenomenon, we also argue in the next section that there is a way to diminish this problem.
5. A robust procedure for the choice of cases
The model-dependence of case selection is a problem because the expectations with which we
approach a case crucially hinge on whether it is statistically typical or deviant – or a
‘pathway case’, for that matter. Thus, it is essential to maximize the confidence in the
accuracy of the residuals and the classification of cases. We argue that this can be achieved by
determining the sets of robust typical and deviant cases and by picking a case from this set.
The precise case selection procedure one should follow depends on whether the estimated
models differ from each other statistically or with respect to the included variables.16
There generally is a variety of ways to estimate a model containing the same
independent variables. For example, panel data can be estimated with or without a lagged
dependent variable, with or without panel-corrected standard errors, and so on (cf., Beck and
Katz, 1995; Kittel, 1999; Kittel and Winner, 2005). When a case is an onlier independently of
the applied estimation techniques, we can be as certain as possible that this case is appropriate
for theory-generating or theory-testing process tracing. A similar argument applies to robust
16 Robust case selection is no solution to the problems of the pathway technique. That approach presumes that one has identified the true model, which is difficult enough and which we do not assume, because in that case there would be no need for the robust choice of cases. The problems of the pathway procedure are inherent to this approach and cannot be fully resolved.
deviant cases, i.e., cases that are outliers in whatever way the model is estimated. We can trust
in the suitability of the case for the search of omitted variables when it is an outlier under all
variants of estimation techniques.
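A sketch of this procedure for two estimation variants of the same substantive model (Python; the panel structure, the chosen variants, and the variable names are placeholders):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("panel.csv").sort_values(["country", "year"])   # hypothetical panel
    df["lag_y"] = df.groupby("country")["y"].shift(1)
    data = df.dropna(subset=["lag_y"])

    specs = {
        "static": "y ~ x1 + x2 + x3",            # without lagged dependent variable
        "lagged": "y ~ lag_y + x1 + x2 + x3",    # with lagged dependent variable
    }

    status = pd.DataFrame(index=data.index)
    for name, formula in specs.items():
        resid = smf.ols(formula, data=data).fit().resid
        status[name] = np.where(resid.abs() <= resid.std(), "typical", "deviant")

    robust_typical = status[(status == "typical").all(axis=1)]
    robust_deviant = status[(status == "deviant").all(axis=1)]
    non_robust = status[status.nunique(axis=1) > 1]   # excluded from case selection

Only the robust typical and robust deviant cases would then be considered for pattern-matching and for the search for omitted variables, respectively.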
Robust case selection works best when the correctly specified model is among the
estimated models (presuming that there is one). Of course, one generally does not know the
true model because in that case one would simply estimate it and select an onlier for process
tracing.17 However, it is often possible to identify a set of model specifications that appear to
be the most appropriate without being able to identify a single model as the unambiguously
superior one. The failure to determine the correct model is not a major problem for case
selection as long as it is one of the estimated models, since then the true model contributes to
the identification of the robust cases. If the correct model is run, the set of robust typical and
robust deviant cases will be equal to or, more likely, a subset of the sets of typical and deviant
cases one would derive from the true model. The identified robust cases are only a subset of
the true sets – even if we cannot identify the true model yet – because some cases may have a
wrong status in one of the incorrect models. Consequently, these cases appear as non-robust
and will be excluded from the empirical analysis. Figure 1 visualizes this argument with a
hypothetical example involving the true model and an alternative model that is misspecified.
Figure 1: Robust case selection when the true model is estimated
[Figure: the true model, a misspecified alternative model, and the joint perspective each partition the sample of cases into outliers and onliers; jointly, this yields robust outliers, robust onliers, and non-robust cases.]
17 The analysis of outliers would then be fruitless because we would know that there are no omitted variables. Nevertheless, theory-oriented process tracing is useful, since in most studies multiple causal processes are compatible with the same cross-case evidence (George and Bennett, 2005: Ch. 10).
Since we do not know which of the two models is the correct one, we estimate both
and determine the groups of robust and non-robust cases. As can be seen, the number of
robust typical cases is smaller than the true number of onliers. At the same time, the set of robust outliers is a subset of the true set of deviant cases. Consequently, there are also some non-robust cases that have a different status in the wrong and the correct model. In sum, the
number of robust onliers and outliers is smaller than the number of truly typical and deviant
cases. What matters, however, is that all robust cases have the same status as in the true
model. This ensures that one will select the right cases for pattern-matching and the search for
omitted variables, respectively. Moreover, Figure 1 shows that one generalizes to fewer cases
when one implements the robust case selection procedure. The causal inferences generated in
pattern-matching and deviant case analyses are not generalized to non-robust cases because
one does not know to which of the groups a non-robust case belongs. In this view, our
approach is conservative when it comes to the generalization of causal inferences. We think
that this is a beneficial aspect of robust case selection because it avoids the overgeneralization
of causal insights (cf., Collier and Mahoney, 1996).
This discussion implies that our robust procedure may result in the choice of a wrong
case when the true model has not been estimated. The identified robust cases may be identical
to the set one would obtain if the correct model had been estimated too. Yet, it is more likely that the group of robust cases is too large inasmuch as it includes some cases that have a different status in the true model and that would be excluded as non-robust had it been estimated. Since this has not been done, there is a certain probability that one selects a wrong case even if one picks it from the set of robust onliers or outliers. However, as long as the results of the estimated models are not substantially different from the true regression output, a considerable share of the robust cases is likely to have the same status as under the true, yet unknown, model.
Figure 2 captures this scenario. In this hypothetical example, we estimate two models that are misspecified in some respect and additionally fail to run the correct model. The non-estimation of the true model is a problem because a small set of cases that are truly deviant
appear as robust typical when taking a joint perspective on the cases’ status in the wrong
models. Thus, there is some potential for the choice of a wrong case.
Figure 2: Robust case selection when the true model is not estimated
[Figure: the sample of cases is classified as outliers or onliers under two wrong models and under the true model; the joint classification into robust outliers, robust onliers, and non-robust cases is shown once without and once with the (unestimated) true model.]
In principle, robust case selection is equally applicable to models that differ with
respect to the included variables. However, some additional remarks are in order when one
confronts the situation that is at the heart of the pathway technique and the classic deviant
case that we discussed in section 3. The identification of robust typical cases can be
performed as described above when the variable of interest is not the variable distinguishing
the estimated models from each other. Independently of whether the smaller or expanded
model is the superior one, the inferences generated in process tracing are likely to apply to all
robust typical cases.
As explained above, non-robust cases should be ignored when the estimated models
exhibit divergent statistical specifications. In contrast to this, cases that lack robustness are
useful targets for within-case analysis when the models differ with respect to the included
variables. More specifically, cases that are outliers in the smaller model and onliers in the
expanded model are suitable for the search of omitted variables. On a general level, we thus
agree with the pathway procedure and the deviant-case argument. However, observing a
change of status from deviant to typical alone is a rather weak basis for making a decision
about whether to expand the smaller model or not. As explained above, more variables tend to
capture more variance in the outcome. Thus, it is intuitive that cases move closer to the
regression surface when the model is expanded even if no systematic variable has been
omitted from the original model.
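For models that differ in the included variables, the cases of interest can be located in the same way (Python; variable names are hypothetical, with x_new standing in for the candidate omitted variable):

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("cases.csv")                               # hypothetical dataset
    smaller = smf.ols("y ~ x1 + x2", data=df).fit()
    expanded = smf.ols("y ~ x1 + x2 + x_new", data=df).fit()

    def is_typical(resid):
        return resid.abs() <= resid.std()                       # one-SD cut-off, as above

    # Cases that switch from deviant (smaller model) to typical (expanded model)
    # are candidates for a within-case search for the omitted variable; on its own,
    # though, this switch remains a weak basis for expanding the model.
    switchers = df[~is_typical(smaller.resid) & is_typical(expanded.resid)]
    print(switchers)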
In a similar vein, it would be fallacious to infer from the robust deviance of a case that
no variable has been omitted from the reduced model. Outliers do not necessarily move closer to the regression surface when a previously omitted systematic variable is added. Two reasons may
account for robust deviance when a systematic variable is added to the model. First, one may
have ignored two (or more) variables, some of which go undetected in an exploratory within-case analysis. Second, a non-systematic variable may exert a strong effect and keep the case
away from the regression surface in the expanded model. In sum, we contend that one should
not make strong claims from non-robustness or robust deviance. If one decides on the basis of
a deviant case analysis to add a variable to a model, it is mandatory to run the appropriate
diagnostics for underfitting on the original model and tests for underfitting and overfitting for
the expanded equation. This is the point where our approach is sharply distinct from the
existing approaches that are solely based on the (non-)observation of a shrinking residual. Of
course, the diagnostics for underfitting and overfitting should always be applied in regression
analysis so as to check whether the model is too narrow, too broad, or both. In the light of
how the inspection of outliers is treated in the literature, however, we deem it particularly
necessary to emphasize that process tracing and the change from deviant to typical, or the lack thereof, are no viable substitute for regression diagnostics.
6. Conclusions
The choice of cases on the basis of a regression analysis is the oldest and most widely
accepted way to combine small-N and large-N research. We have shown that the intuitively
plausible perspective on case selection is methodologically deficient. The basis for all
problems is the impossibility of distinguishing between systematic and non-systematic variables
in process tracing. Because of this, the validity of case selection fully hinges on the quality of
the estimated residuals. This is a problem because one often estimates a range of plausible
models that perform equally well in the regression diagnostics. Just as the regression output
may be model dependent, so is the classification of cases as typical and deviant. Moreover,
we have demonstrated that the existing perspective on the analysis of outliers and the recently
suggested pathway technique are flawed in several respects. As a solution to both problems, we
proposed the choice of robust cases and the systematic application of tests for underfitting and
overfitting. We believe that case selection in regression analysis will improve when our
guidelines are followed.
References
Achen, Christopher H. (2005): Two Cheers for Charles Ragin. Studies in Comparative
International Development 40(1):27-32.
Adcock, Robert, and David Collier (2001): Measurement Validity: A Shared Standard for
Qualitative and Quantitative Research. American Political Science Review 95(3):529-546.
Bäck, Hanna, and Patrick Dumont (2007): Combining Large-N and Small-N Strategies: The
Way Forward in Coalition Research. West European Politics 30(3):467-501.
Bartels, Larry M. (1997): Specification Uncertainty and Model Averaging. American Journal
of Political Science 41(2):641-674.
Beck, Nathaniel (2006): Is Causal-Process Observation an Oxymoron? Political Analysis
14(3):347-352.
Beck, Nathaniel, and Jonathan N. Katz (1995): What to Do (and Not to Do) with Time-Series
Cross-Section Data. American Political Science Review 89(3):634-647.
Berk, Richard A. (1990): A Primer on Robust Regression. In Modern Methods of Data
Analysis, edited by John Fox, and J. Scott Long, pp. 292-324. Newbury Park: Sage.
Bhagwati, Jagdish N. (2002): Introduction: The Unilateral Freeing of Trade Versus
Reciprocity. In Going Alone: The Case for Relaxed Reciprocity in Freeing Trade, edited by
Jagdish N. Bhagwati, pp. 1-30. Cambridge, Mass.: MIT Press.
Capoccia, Giovanni C., and Michael Freeden (2006): Multi-Method Research in Comparative
Politics and Political Theory. Committee on Concepts and Methods working paper series, no.
9
Collier, David, and James Mahoney (1996): Insights and Pitfalls: Selection Bias in Qualitative
Research. World Politics 49(1):56-91.
Coppedge, Michael (1999): Thickening Thin Concepts and Theories - Combining Large N
and Small in Comparative Politics. Comparative Politics 31(4):465-476.
Deken, Johan De, and Bernhard Kittel (2006): Putting the Chain Saw into Social
Expenditures. Retrenchment and the Problems of Using Aggregate Data. In Welfare Reform
in Advanced Societies: Exploring the Dynamics of Reform, edited by Nico Siegel, and Jochen Clasen. Cheltenham: Edward Elgar.
Dunning, Thad (2007): The Role of Iteration in Multi-Method Research. APSA Qualitative
Methods Newsletter 5(1):22-24.
Ebbinghaus, Bernhard (2005): When Less Is More: Selection Problems in Large-N and Small-N Cross-National Comparisons. International Sociology 20(2):133-152.
Eckstein, Harry (1975): Case Study and Theory in Political Science. In Strategies of Inquiry.
Handbook of Political Science, Vol. 7, edited by Fred I. Greenstein, and Nelson W. Polsby,
pp. 79-137. Reading, Mass.: Addison-Wesley.
Falkner, Gerda (2007): Time to Discuss: Data to Crunch or Problems to Solve? A Rejoinder
to Robert Thomson. West European Politics 30(5):1009-1021.
Fox, John (1991): Regression Diagnostics. Newbury Park: Sage.
Geddes, Barbara (2003): Paradigms and Sand Castles: Theory Building and Research Design
in Comparative Politics. Ann Arbor: University of Michigan Press.
George, Alexander L., and Andrew Bennett (2005): Case Studies and Theory Development in
the Social Sciences. Cambridge, Mass.: MIT Press.
Gerring, John (2007a): The Case Study Method: Principles and Practices. Cambridge:
Cambridge University Press.
Gerring, John (2007b): Is There a (Viable) Crucial-Case Method? Comparative Political
Studies 40(3):231-253.
Gerring, John (2007c): The Mechanismic Worldview: Thinking inside the Box. British
Journal of Political Science 38:161-179.
Gowa, Joanne, and Edward D. Mansfield (1993): Power Politics and International Trade.
American Political Science Review 87(2):408-420.
Gowa, Joanne, and Edward D. Mansfield (2004): Alliances, Imperfect Markets, and Major-Power Trade. International Organization 58(4):775-805.
Gujarati, Damodar N. (2004): Basic Econometrics. Toronto: McGraw-Hill.
Ho, Daniel E., Kosuke Imai, Gary King, and Elizabeth A. Stuart (2007): Matching as
Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal
Inference. Political Analysis 15(3):199-236.
King, Gary, Robert O. Keohane, and Sidney Verba (1994): Designing Social Inquiry:
Scientific Inference in Qualitative Research. Princeton: Princeton University Press.
King, Gary, and Langche Zeng (2007): When Can History Be Our Guide? The Pitfalls of
Counterfactual Inference. International Studies Quarterly 51:183-210.
Kittel, Bernhard (1999): Sense and Sensitivity in Pooled Analysis of Political Data. European
Journal of Political Research 35(4):533-558.
Kittel, Bernhard, and Hannes Winner (2005): How Reliable Is Pooled Analysis in Political
Economy? The Globalization-Welfare State Nexus Revisited. European Journal of Political
Research 44(2):269-293.
Kuha, Jouni (2004): AIC and BIC: Comparisons of Assumptions and Performance.
Sociological Methods & Research 33(2):188-229.
Lieberman, Evan S. (2003): Race and Regionalism in the Politics of Taxation in Brazil and
South Africa. Cambridge: Cambridge University Press.
Lieberman, Evan S. (2005): Nested Analysis as a Mixed-Method Strategy for Comparative
Research. American Political Science Review 99(3):435-452.
Lijphart, Arend (1968): The Politics of Accommodation: Pluralism and Democracy in the
Netherlands. Berkeley: University of California Press.
Lijphart, Arend (1971): Comparative Politics and the Comparative Method. American
Political Science Review 65(3):682-693.
Mahoney, James, and Gary Goertz (2006): A Tale of Two Cultures: Contrasting Quantitative
and Qualitative Research. Political Analysis 14:227-249.
Mahoney, James, and Dietrich Rueschemeyer (2003): Comparative Historical Analysis in the
Social Sciences. Cambridge: Cambridge University Press.
Rogowski, Ronald (1995): The Role of Theory and Anomaly in Social-Scientific Inference.
American Political Science Review 89(2):467-470.
Rohlfing, Ingo (2008): What You See and What You Get: Pitfalls and Problems of Nested
Analysis in Comparative Research. Comparative Political Studies
Rueschemeyer, Dietrich (2003): Can One or a Few Cases Yield Theoretical Gains? In
Comparative Historical Analysis in the Social Sciences, edited by James Mahoney, and
Dietrich Rueschemeyer, pp. 305-332. Cambridge: Cambridge University Press.
Simon, Herbert A. (1954): Spurious Correlation: A Causal Interpretation. Journal of the
American Statistical Association 49(267):467-479.
Symposium (2007): Multi-Method Work, Dispatches from the Front Lines. APSA Qualitative
Methods Newsletter 5(1):9-28.
Tarrow, Sidney (1995): Bridging the Quantitative-Qualitative Divide in Political-Science.
American Political Science Review 89(2):471-474.
Thomson, Robert (2007): Time to Comply: National Responses to Six EU Labour Market
Directives Revisited. West European Politics 30(5):987-1008.
Western, Bruce (1995): Concepts and Suggestions for Robust Regression Analysis. American
Journal of Political Science 39(3):786-817.