Draft only -- comments welcome

Humans as Instruments; or, The Inevitability of Experimental Philosophy

Jonathan M. Weinberg
Indiana University
jmweinbe@indiana.edu

[N.B. This document is about 80% a draft but still 20% notes. Caveat lector.]

I. Introduction: Humans as Philosophical Instruments

Here's a pocket theory on philosophy's poor progress, compared to the sciences: in the sciences, we have generally done a great job of taking humans out of the equation, both in terms of removing intentional idioms from our scientific explanations (going back to Descartes, at least), and, perhaps more importantly, in terms of learning how to expand human cognitive powers (e.g., with instruments) and overcome human cognitive foibles (e.g., by using statistics instead of only eyeballing generalizations). Even in the scientific study of human psychology, we have had greater success in the places where such de-humanization can be carried out, such as low-level vision, than in places where it can't, such as mate selection. (This raises its own set of issues for philosophy of psychology, which I won't pursue here.) But so much of what is still part of philosophy's portfolio concerns domains for which we have no sources of information that go beyond the more-or-less unaided human agent.1 Justice, goodness, agency, beauty, explanation, mereology, meaning, rationality -- all of these are matters for which we have next to no capacity to build detectors, or even to begin to imagine how to build them. The only detectors we have for them, it seems, are us. (Perhaps for some of these we could train artificial pattern-detectors to cotton on to pretty close to the same things that we do -- but that would be purely parasitic on the human capacities, and not any way to extend those capacities.) So, let's embrace that situation, in good Reidian fashion.
In these domains, humans are the only instruments we are likely to get, so let's accept as a starting point that humans are at least minimally decent instruments for them -- pace radical views like Mackie's error theory, there are truths to be had in these domains, and we are getting at least some good information about those truths. Where do we go from there? Some philosophers, even at the highest levels of the profession, have claimed that our ordinary capacities are epistemically successful enough for philosophers to muddle through, and have suggested that to ask more than that of philosophy is to betray a commitment to an untenable form of scientism. (Williamson; Sosa) And if we were just interested in some fairly basic level of justification in our philosophical beliefs on the whole, then maybe that modicum of reliability would be sufficient, just as people's ordinary arithmetical abilities are often sufficient for the more rudimentary tasks of their daily lives. But, as philosophers, we're presumably not just interested in having some fairly basic level of justification in our philosophical beliefs, any more than scientists or mathematicians are just interested in having some fairly basic level of justification in their scientific or mathematical beliefs. We appropriately ask much more of ourselves, in order to increase our chances of getting hold of the real facts of the matter.

1 Actually, it might be wise to try to figure out how to make use of some nonhumans, especially the likes of other primates, to shed light on philosophical issues. Arguably some philosophers, such as Colin Allen and Peter Carruthers, have already done so regarding such topics as the nature of language and folk psychology. But I will not pursue such avenues further here.
One way of thinking about this is that we expect that our minimally decent philosophical reliability is not evenly distributed across our cognitive range, and that by this point we've already got our hands on most of the facts for which we're particularly reliable... and that the questions that are still (still!) open for us are ones for which our basic capacities turn out not to be up to snuff on their own. (We will revisit this idea of differing ranges of reliability shortly, and with the added concern that it isn't just reliability per se that will be important here.) One way that the sciences have gotten past the limits of basic human competence is through building new instruments that provide novel means of detection (e.g., telescopes; litmus strips). But we've noted that that's likely just not an option here. What we need to do, then, is to deploy a different approach that has also been successfully used by other forms of inquiry: find better ways of extracting the information from instruments that we already have. And how are we to do that? My contention in this paper is that answering that question will require a substantial contribution from experimental philosophy -- given our interest in the continual improvement and refinement of our philosophical methods, I argue, experimental philosophy will prove inevitable. [Clarificatory note on what sort of x-phi: not just surveys of the folk, and likely will include lots of psychological work done by non-philosophers.]

II. Responding to the Experimental Restrictionist Challenge

The first reason why experimental philosophy is inevitable comes from experimental philosophy itself, for recent "x-phi" research has uncovered findings that pose a fairly direct challenge to the trustworthiness of our ordinary intuitive capacities regarding some rather celebrated philosophical thought-experiments.
Most of this work can be divided into four categories: demographic differences; order effects; framing effects; and environmental influences. For example, judgments about knowledge, reference, and morality2 have all been found to differ somewhat depending on whether the agent offering the judgment is of Western or Asian descent, even, in some cases, where both groups are native-English-speaking American college undergraduates (though I agree with Williamson that it is not at all obvious at this time how best to explain this variation). The order in which thought-experiments are considered also seems capable of influencing judgments about morality and knowledge.3 Interestingly, in preliminary follow-up studies, subjects who are disposed to reflect harder on the cases show no immunity to the order effects; instead, they demonstrate a predilection toward an order effect in the opposite direction.4 Petrinovich & O'Neill (1996), in a study of trolley cases, discovered that small differences in wording could exploit framing effects along the lines of those famously studied by Tversky and Kahneman; for one group of participants, the action being considered was described as throwing "the switch which will result in the death of the one innocent person on the side track." For another group, the action was described as throwing "the switch which will result in the five innocent people on the main track being saved." The difference in wording had a significant effect on participants' judgments despite the fact that, in the context of the trolley problem vignette, they are obviously describing the same action.5 Perhaps most unexpectedly, in many cases people's judgments are influenced by features of the physical or social situation in which the judgment is elicited. These influences, as with order effects and framing, are typically covert. Those affected usually have no idea that they are being influenced and, short of doing or reading about carefully controlled empirical studies, they have no way of finding out. For example, psychologists6 asked subjects to make moral judgments on a range of vignettes. Some of the subjects performed the task at a clean and tidy desk. Others did it at a desk arranged to evoke mild feelings of disgust: there was a dried-up smoothie and a chewed pen on the desk, and adjacent to the desk was a trash container overflowing with garbage that included a greasy pizza box and dirty-looking tissues. They found that the judgments of the subjects in the gross setting were substantially more severe. These sorts of empirical findings indicate that armchair practice with thought-experiments may be inappropriately sensitive to a range of factors that are psychologically powerful but philosophically irrelevant. Unwanted variation7 in any source of evidence presents a prima facie challenge to any practice that would deploy it. Once they recognize that a practice faces such a challenge, practitioners incur an intellectual obligation to do one of three things: (i) show that their practice's deliverances are immune to the unwanted noise; (ii) find ways of revising their practice so that it is immune; or (iii) abandon the practice. "Immune" here of course should not be read as requiring anything like infallibility -- just a reasonable insulation of the conclusions produced by the practice from the unwanted variation that may afflict its evidential sources.8 Joshua Alexander and I (2007) call this line of argument "the experimental restrictionist challenge", since it contends that the scope of our intuition-deploying practices may need to be restricted. My claim here is just the limited one that if a successful response to the challenge can be had, it will need to make its own nontrivial use of experimental philosophy results. What is at stake are a range of empirical questions -- whether and, if so, to what extent philosophers' intuitions are susceptible to these sorts of effects, and what might be done to reduce or eliminate such susceptibilities -- and they are not of a sort that can be settled by the kind of casual observation we might be able to make from the armchair. Responses to the restrictionist challenge so far have predominantly attempted to conflate it with classical skeptical arguments; yet as the previous section indicates, a mere rejection of skepticism is not enough to establish that our methods are not flawed and improvable.

III. The Promises & Limits of Abduction

One might hope that the deployment of coherence norms of rationality could suffice to address that challenge. That's what coherence is for, one might have thought -- to take in noisy information streams, and filter signal from noise. The experimental philosophers have shown only, at worst, that there is some noise to be thus filtered, not that current philosophical practices of disputation and reflection aren't up to the task of doing so.

2 Weinberg et al. (2001); Machery et al. (2004); Doris and Plakias (2008).
3 Haidt and Baron (1996); Swain et al. (2008).
4 Weinberg, Alexander, and Gonnerman (2008), "Unstable Intuitions and Need for Cognition: How Being Thoughtful Sometimes Just Means Being Wrong in a Different Way," SPP poster presentation.
5 Baron (1994) and Sunstein (2005) have argued that the distorting influences of framing are widespread in the ethical judgments of philosophers, jurists, and ordinary folk. See also Sinnott-Armstrong (2008).
6 Schnall et al., in press.
7 Strictly speaking, unwanted lack of variation can be as much of a problem as unwanted variation; a thermometer that always gave the same readings even as the temperature changed would be an epistemically unsuccessful thermometer.

Unfortunately, we have good reason to worry that such general invocations of coherence will be insufficient. First, seeking coherence can help only if the right mix of information is coming into the process in the first place: an error will be corrigible only if sufficient correcting information is present. Given the very substantial ethnic and cultural homogeneity of the profession, we may just not yet be receiving any correcting information for whatever errors of cultural bias we may be making. Moreover, the many stages of selection and professional enculturation that any would-be philosopher (quite appropriately) must persevere through will have the unintended consequence of shielding us from other variants of the human instrument whose inputs we might stand in need of. Some of the discussion of these matters has been marred by treating the question of the epistemic consequences of professionalization in a binary fashion: is it virtuous development of expertise, or a threat of circularity and insularity? (Williamson; Ichikawa; McBain) The answer will surely be, to at least some extent: both. And what expertise does get developed is likely to be both more limited and more costly than one might expect. Limited, in that the psychological literature on expertise indicates that the extent of acquired expertise is usually very circumscribed. And costly, in that that literature also reveals that the inculcation of expertise in one matter often leaves one with a diminished, or perhaps more accurately, distorted competence elsewhere. To become an expert is to turn oneself into a specialized cognitive tool.
8 One appropriate line of response would be to deny that the variation is unwanted, by defending a form of relativism, contextualism, or the like. I will not address such responses here, though see Swain et al. (2008) for a brief discussion.

But the general extent and precise contours of the improvement and distortion produced by any particular regimen of training are not discernible from the armchair -- scientific investigation would be required to ascertain such facts. Relatedly, any noise due to unconscious factors like heuristics and biases will be hard to filter out just with our general coherence norms. Rather, such noise will likely be a source of instability in any such reflections until it can be brought to consciousness, and some manner of explicitly addressing it devised. The history of experimenter effects in scientific practice presents both cautionary tales and models of success. (E.g., the development of double-blind methods; "protocol analysis" as a refinement of introspectionism.) One sort of coherence-based reasoning that has become more and more popular in philosophy is inductive/abductive inference. (Does the trend perhaps start with Lewis?) The attraction of this sort of inference in the presence of noise is obvious: such inferences aim to manage conflicts in our data sets by weighing the various inputs and discerning which hypotheses best fit them. Yet such forms of reasoning present their own epistemic risks, and have substantial preconditions for their success. First, such inferences are nonmonotonic, as they pretty much need to be if they are to help with the task of separating wheat from chaff. It follows that the requirement of total evidence must be taken very seriously here -- what we don't know really can hurt us.
One particularly salient way that can happen here is through the risk of our data set reflecting a biased sample, either in the demographics of which human instruments are consulted, or in the set of cases about which we have attempted to get readings. Second, looking at the best practice of such inference reveals that evaluating the fit of hypotheses to data must take into account how noisy that data is expected to be in the first place. The economist Robin Hanson makes such a point concerning moral intuitions: "In the ordinary practice of fitting a curve to a set of data points, the more noise one expects in the data, the simpler a curve one fits to that data. Similarly, when fitting moral principles to the data of our moral intuitions, the more noise we expect in those intuitions, the simpler a set of principles we should use to fit those intuitions." The point easily generalizes: the noisier our methods, the less subtlety we should allow in our generalizations. This perhaps sets up an argument similar in some ways to Brian Weatherson's terrific "What Good Are Counterexamples?" paper. If our sources of evidence for knowledge attributions are sufficiently noisy, then maybe we should take JTB to be a better analysis than JTB+W (for some anti-Gettier factor W), even if we don't have any specific reason to doubt our judgments about Gettier cases in particular. Given that JTB and JTB+W agree on the overwhelming majority of cases, it may be that the additional complexity introduced into the model by covering the Gettier cases is unwarranted. In particular, in the absence of a sufficiently validating account of the reliability of the Gettier judgments, we may not be licensed in making an abductive inference from the overall pattern of our knowledge ascriptions to a rejection of a JTB theory. I don't mean to pick especially on the Gettier judgments [especially not in a talk at UMass!!!] or even the "S knows that p" literature on the whole.
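Hanson's curve-fitting point, and the JTB vs. JTB+W worry just sketched, can be illustrated with a small simulation. (This is entirely my own toy construction; all parameters are invented for illustration.) The true relation is mostly quadratic, plus a small systematic "wiggle" standing in for the rare cases where the more refined theory diverges from the simple one. A flexible model can capture the wiggle when the data are clean; once the assumed noise is large, the extra flexibility costs more than the wiggle is worth, and the simpler model generalizes better.

```python
import numpy as np

rng = np.random.default_rng(42)

def true_f(x):
    # Mostly quadratic, plus a small systematic wiggle -- the stand-in for
    # the rare cases where the refined theory (JTB+W) differs from JTB.
    return x**2 + 0.3 * np.sin(6 * x)

def avg_generalization_error(degree, noise_sd, n=30, reps=200):
    """Average squared error, against the true curve, of a degree-`degree`
    polynomial fit to n noisy observations, averaged over many replications."""
    x = np.linspace(-1, 1, n)          # fixed equispaced design points
    grid = np.linspace(-1, 1, 200)     # where we judge generalization
    errs = []
    for _ in range(reps):
        y = true_f(x) + rng.normal(0, noise_sd, n)
        coeffs = np.polyfit(x, y, degree)
        errs.append(np.mean((np.polyval(coeffs, grid) - true_f(grid)) ** 2))
    return float(np.mean(errs))

for noise_sd in (0.05, 1.0):
    simple = avg_generalization_error(2, noise_sd)    # too simple to catch the wiggle
    flexible = avg_generalization_error(8, noise_sd)  # rich enough to catch it
    winner = "flexible" if flexible < simple else "simple"
    print(f"noise sd {noise_sd}: simple={simple:.4f}, flexible={flexible:.4f} -> {winner} wins")
```

With low noise, the degree-8 model's extra structure pays for itself; with high noise, its variance swamps the small real wiggle and the quadratic generalizes better -- Hanson's "the more noise, the simpler the curve" in miniature.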
The case is just very useful for illustrating the much more general point that our current intuition-based methodology involves a bad mismatch between, on the one hand, the extravagant degree of precision we expect from it, in terms of the intricacy of the theories we want to be able to argue for on the basis of such intuitions, and, on the other, the rather scarce information we have about just how much and what sorts of noise the intuitive instrument may be susceptible to. An interesting epistemological upshot here is that mere reliability-based weighting isn't enough: when trying to match a model to our data, the particular sorts of errors that may afflict the data set need to be tracked as well. A simple illustration: two thermometers may be equally reliable over a given range, yet one is accurate to within +/- 1 degree, so when it reads 77 we may be fairly confident the temperature lies between 76 and 78; the other only reads high when it is wrong, and is accurate to within -2 degrees, so when it reads 77 we may be fairly confident the temperature lies between 75 and 77. The correct inferences to draw from a data set generated by the first thermometer will differ from those to be drawn if the same data set were produced by the second. Even if we take ourselves to be basically reliable in our philosophical judgments, that claim -- the starting point of this essay -- is insufficient by itself to secure the sought-for result that abductive inferences based on such judgments will be sufficiently shielded from noise.

IV. Epistemic Profiles & Experimental Philosophy

What we need, then, is an account of what I will call the epistemic profile of a source of evidence. We are used to asking of a source of evidence whether it is reliable, and over what target domain, and perhaps to what extent it is reliable over that domain (95%, 99%, or what not).
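To make the preceding two-thermometer illustration concrete, here is a quick simulation. (The numbers are my own, rescaled from the essay's example so that the two instruments match exactly in mean absolute error: symmetric error within +/- 1 degree versus one-sided error within +1 degree.) Equal "reliability" in the aggregate still licenses different inferences from the very same reading.

```python
import numpy as np

rng = np.random.default_rng(7)
truth = rng.uniform(60, 90, 100_000)

# Instrument A: symmetric error, uniform on [-1, +1] degrees.
# Instrument B: one-sided error, uniform on [0, +1] (it only ever reads high).
# Both have the same mean absolute error (0.5 degrees) -- equally "reliable".
read_a = truth + rng.uniform(-1.0, 1.0, truth.size)
read_b = truth + rng.uniform(0.0, 1.0, truth.size)

mae_a = np.mean(np.abs(read_a - truth))
mae_b = np.mean(np.abs(read_b - truth))

# The correct interval to infer from a reading r differs by instrument:
#   A: truth lies in [r - 1, r + 1]      B: truth lies in [r - 1, r]
cover_a = np.mean((truth >= read_a - 1) & (truth <= read_a + 1))
cover_b = np.mean((truth >= read_b - 1) & (truth <= read_b))

# Applying B's narrower one-sided rule to A's readings under-covers badly:
miscover = np.mean((truth >= read_a - 1) & (truth <= read_a))

print(f"mean abs error: A={mae_a:.2f}  B={mae_b:.2f}")
print(f"coverage with matched rules: A={cover_a:.3f}  B={cover_b:.3f}")
print(f"B's rule applied to A's readings covers only {miscover:.3f}")
```

Both instruments pass the same aggregate reliability check, yet a reading of 77 legitimately supports "between 76 and 78" from the first and "between 76 and 77" from the second; use the wrong error model and the inferred interval misses the truth about half the time.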
An epistemic profile expands such a reliability characterization along several dimensions at once. In addition to a target domain, we must also consider particular environmental contexts and modes of use: a compass's reliability is partly a function of how near it may be to a deposit of iron or an active MRI machine; a medical thermometer will reveal body temperature accurately only when it is inserted at certain appropriate locations on the body. The target domain needs to be articulated in such a way that we can distinguish these matters at different degrees of precision. Moreover, for any such point in this high-dimensional space of performance-relevant factors, we want to know not just whether the source is reliable, and not just to what extent it is reliable, but furthermore to what sorts of errors it may be prone. It's not that we are utterly lacking an account of the epistemic profile of human judgment about matters philosophical. For example, we expect on average that judgments about longer and more complicated propositions will be less reliable than those about the short and sweet. We have in the course of the field's history identified some sources of noise, such as the ease of conflating use and mention, the epistemological and the metaphysical, the semantic and the pragmatic. We have tools like the formalism of the predicate calculus to reduce the noise in our judgments of validity, particularly noise generated by things like quantifier- or negation-scope ambiguity. None of my discussion here should be taken as downplaying the value of this methodological knowledge, much of which stands as a counterexample to any who would claim that philosophy never makes any progress at all. The point is not to belittle those accomplishments, but rather to emphasize the scope of the task that is still unfinished.
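As a toy sketch of what recording such a profile might look like -- the schema, field names, and all numbers here are my own illustrative inventions, not a proposal from the literature -- one entry pairs a point in the space of performance-relevant factors with a reliability estimate and the characteristic errors at that point:

```python
from dataclasses import dataclass, field

@dataclass
class EpistemicProfile:
    """Toy schema for one point of a source's epistemic profile."""
    source: str
    domain: str                                  # target domain of the judgments
    context: dict = field(default_factory=dict)  # environment / mode of use
    reliability: float = 0.0                     # estimated P(correct) here
    error_modes: tuple = ()                      # characteristic errors when it misfires

# A familiar instrument, for comparison:
compass_near_iron = EpistemicProfile(
    source="magnetic compass",
    domain="heading",
    context={"near_ferrous_mass": True},
    reliability=0.55,
    error_modes=("systematic deflection toward the mass",),
)

# The human philosophical instrument -- the reliability figure is a
# placeholder standing in for exactly the information we currently lack:
gettier_judgment = EpistemicProfile(
    source="trained epistemologist",
    domain="knowledge ascription (Gettier-style cases)",
    context={"presentation_order": "after clear cases", "setting": "tidy office"},
    reliability=0.9,
    error_modes=("order-driven flip", "framing-driven severity shift"),
)

print(compass_near_iron)
print(gettier_judgment)
```

The point of the exercise is only that a bare reliability number occupies one field among several; the context and error-mode fields are what the restrictionist findings show we cannot leave blank.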
If we want to get maximum epistemic value out of the human philosophical instrument, then we're simply going to need a much better understanding of what it is and how it works. I understand that this move can sound radical; indeed it has sometimes been construed as advocating giving up philosophy altogether and replacing it with science. Yet this is not so. Revisions of philosophy’s methods are as old as philosophy itself. Hume’s epistemological proscriptions are one obvious line of our intellectual lineage, though the arguments do not rest on any specifically empiricist epistemology or philosophy of mind. (Indeed, I am in contemporary terms both a moderate rationalist and a nativist.) There is a clear affinity between the nature of the challenge we are offering here and Hume’s willingness to make use of what he took to be contingent facts about our minds to draw metaphilosophical conclusions. (Note, though, that I am more moderate than Hume; I am advocating only that armchairs be consigned to the flames, not books.) The philosophical ancestry of this sort of argument can also be traced through to Kant’s attempts – motivated at least in part by what he perceived to be a mismatch between science’s successes and philosophy’s lack thereof – to delimit the space of legitimate philosophical theorizing with an admonition of philosophical humility. And even though these arguments are not skeptical, I do share with Descartes a desire to revise received philosophical practices to make them more fruitful. Even though I am advocating that we attempt a more ambitious use of science within our philosophy, the goal is not to destroy philosophy by dissolving it into science, but to make good use of science to put philosophy on a sounder footing. One can also see strains – both in the sense of "directions" and of "stresses" – in contemporary mainstream analytic methodology for which experimental philosophy is a natural extension. 
Philosophers are already willing to make local, piecemeal, and informal invocations of results from cognitive psychology to "explain away" intuitions that are problematic for one's preferred philosophical thesis. (Hawthorne and Williamson on the availability heuristic and contextualism.) Philosophers tend to dip in and out of the scientific psychological literature very quickly when they do so. But it turns out that a successful "explaining away" can be more difficult than one might have thought, and requires getting the empirical particulars nailed down at a level of detail that requires attending to a rather broader chunk of the scientific literature, and attending to it much more deeply, than seems standard in current practice. (Nagel (forthcoming) vs. Hawthorne and Williamson.) One way in which this project cannot be collapsed into a purely scientific psychological one is that an absolutely essential part will be substantially metaphilosophical: what is the nature of the philosophical truths in question, such that various aspects of the instrument's performance can be understood properly as providing a true signal, and others, as distorting noise? This is a question to which experimental results may be at best relevant to finding an answer, but not the source of the answer itself.

V. On Taking the Instrument Out of the Armchair

In closing, let me note that in my discussion here I have not tried to distinguish between "the human philosophical instrument" in general and the particular sorts of deployment of that instrument found in contemporary philosophical practices of thought-experiment. Yet we might want to explore other ways in which our attunement to the philosophical truth might be exploited and extracted.

--ecological invalidity of the armchair?
perhaps especially with moral judgment, affectively hot and personal reactions need to be considered
--neurological explorations of aesthetic experience in real time, more than judgments of beauty
--might be that we can track our values and goals more directly than how they play out in particular cases. Cross-checking method of pragmatist engineering (Craig; Neta; Horgan and Henderson; Justin Fisher; me)

References

Alexander, J. and Weinberg, J. (2007). "Experimental Philosophy and Analytic Epistemology," Philosophy Compass.
Baron, J. (1994). "Nonconsequentialist Decisions," Behavioral and Brain Sciences, 17, 1-42.
Charness, Feltovich, Hoffman, and Ericsson, eds. (2006). The Cambridge Handbook of Expertise and Expert Performance. Cambridge: Cambridge University Press.
Cohen, S. (2002). "Basic Knowledge and the Problem of Easy Knowledge," Philosophy and Phenomenological Research, 65, 309-329.
Haidt, J. & Baron, J. (1996). "Social Roles and the Moral Judgement of Acts and Omissions," European Journal of Social Psychology, 26, 201-218.
Hanson, R. (2002). "Why Health Is Not Special: Errors in Evolved Bioethics Intuitions," Social Philosophy and Policy, 19, 153-179.
Hawthorne, J. (2004). Knowledge and Lotteries. New York: Oxford University Press.
Lehrer, K. (1990). Theory of Knowledge. Boulder, CO: Westview Press.
Ludwig, K. (2007). "The Epistemology of Thought Experiments: First vs. Third Person Approaches," Midwest Studies in Philosophy: Philosophy and the Empirical, 31(1), 128-159.
Machery, E., Mallon, R., Nichols, S., & Stich, S. (2004). "Semantics, Cross-Cultural Style," Cognition, 92, B1-B12.
Nagel, J. (forthcoming). "Knowledge Ascriptions and the Psychological Consequences of Thinking about Error," to appear in The Philosophical Quarterly.
Petrinovich, L. & O'Neill, P. (1996). "Influence of Wording and Framing Effects on Moral Intuitions," Ethology and Sociobiology, 17, 145-171.
Schnall, S., Haidt, J., Clore, G. L., & Jordan, A. H. (in press).
"Disgust as Embodied Moral Judgment," Personality and Social Psychology Bulletin.
Shanteau, J. (1992). "Competence in Experts: The Role of Task Characteristics," Organizational Behavior and Human Decision Processes, 53, 252-266.
Sinnott-Armstrong, W. (2008). "Framing Moral Intuitions," in W. Sinnott-Armstrong, ed., Moral Psychology: The Cognitive Science of Morality. Cambridge, MA: MIT Press.
Sunstein, C. (2005). "Moral Heuristics," Behavioral and Brain Sciences, 28, 531-542.
Swain, S., Alexander, J., and Weinberg, J. (2008). "The Instability of Philosophical Intuitions: Running Hot and Cold on Truetemp," Philosophy and Phenomenological Research, 76, 138-155.
Tversky, A. & Kahneman, D. (1981). "The Framing of Decisions and the Psychology of Choice," Science, 211, 453-458.
van Cleve, J. (2003). "Is Knowledge Easy -- or Impossible? Externalism as the Only Alternative to Skepticism," in S. Luper, ed., The Skeptics: Contemporary Essays (pp. 45-59). Aldershot, Hampshire: Ashgate.
Weinberg, J. (2007). "How to Challenge Intuitions Empirically Without Risking Skepticism," Midwest Studies in Philosophy, 31(1), 318-343.
Weinberg, J., Nichols, S. & Stich, S. (2001). "Normativity and Epistemic Intuitions," Philosophical Topics, 29(1&2), 429-460.
Weinberg, J., Alexander, J., and Gonnerman, C. (2008). "Unstable Intuitions and Need for Cognition: How Being Thoughtful Sometimes Just Means Being Wrong in a Different Way," SPP poster presentation.
Williamson, T. (2005). "Contextualism, Subject-Sensitive Invariantism, and Knowledge of Knowledge," The Philosophical Quarterly, 55, 213-235.
Williamson, T. (2007). The Philosophy of Philosophy. Oxford: Blackwell.
Williamson, T. (2009). Replies to Ichikawa, Weinberg, and Martin.