Context Effects in Judicial Decision Making

Jeffrey J. Rachlinski
Chris Guthrie
Andrew J. Wistrich

DRAFT 01/06/11

Abstract

In this paper, we present five sets of studies that demonstrate both how lawyers can use psychological influences to induce judges to make erroneous judgments and how judges respond to these efforts. In the first, we show that lawyers can take advantage of contrast effects to influence how judges evaluate the credibility of expert witnesses. Surprisingly, the study shows that adding a worthless expert witness who lacks credibility can help one's case. In the second study, we found that judges' assessments of the probative value of forensic evidence depend upon how the information is presented. But we also found that judges resist this attempt at misdirection. Similarly, in the third set of studies, we show that judges are vulnerable to the misleading effects of conjunctive logic, but that they also seem able to resist this effect in settings that are more relevant to the kinds of judgments they make. In the fourth set of studies, we identify how judges avoid the influence of the hindsight bias on judgments of probable cause by altering the target of the judgment that they make. In this study, the natural setting, rather than crafty lawyering, provides the potential source of misdirection, but judges seem to navigate it well. In the final set of studies, we show how judges attempt to ignore inadmissible evidence in ways that make them vulnerable to further undesirable extralegal influences. Taken together, these studies show that misdirection is a two-way street in the courtroom. Lawyers can try to misdirect judges to induce erroneous judgments, but judges might well respond by changing their own focus to more stable targets that make them less vulnerable to untoward influences. These efforts, however, can also backfire, as judges can make themselves more vulnerable to error.
Introduction

An old expression advises lawyers that "if the facts are against you, argue the law; if the law is against you, argue the facts."1 The aphorism states the obvious, perhaps. But it also identifies a particular talent that good lawyers are supposed to possess--the ability to change the terms of the debate so as to win a case that cannot otherwise be won. We suspect that judges hate this expression, as it reifies lawyers' conflicting efforts to control the agenda in lawsuits and potentially distract the judge from the core issues of a case. Judges prefer to control the agenda in their own courtrooms, and naturally dislike dilatory, distracting litigation strategies. They also regard such efforts as wasteful because they believe that they will see through them. From the judges' perspective, changing the subject will only waste time and delay the inevitable. But we wonder whether lawyers' efforts to distract and to control the way a case is framed are really such a waste of time. Perhaps the real problem is not that these tactics are wasteful, but that they are effective.

The opening aphorism about facts and law dovetails with a fundamental lesson of social psychology concerning how to influence people's judgment. That is, social psychologists report that although changing people's judgment of objects is challenging, people's choices can be influenced easily by simply shifting the object of judgment. The often misunderstood study on conformity by Solomon Asch illustrates the point. Asch gave a room of eight undergraduates the seemingly innocuous task of identifying which of three lines of very different lengths was closest in length to each of a series of target lines. The target lines were always identical in length to one of the three options, making the task seem incredibly easy. At least the task looked easy until six of the supposed participants apparently chose the wrong line.
In reality, participant number seven was the only real subject in the experiment; the others were confederates of the experimenter who were instructed to choose the wrong line in several of the rounds. In many instances, the real subject went along with the group's choice, even though it seemed erroneous. Although this study is often described as an experiment on conformity, the subjects in fact complied with the group choice only because they felt that they had misunderstood either the instructions or some other aspect of the task. They were never confused about the lines, which were too different in length to inspire confusion. In effect, the confident, seemingly erroneous answers provided by the six confederates changed the target of judgment. The subjects were not assessing the length of lines, but were trying to understand the motives of their supposed colleagues.

The notion of misdirection and controlling the object of judgment has been a fundamental tenet of advertising almost as long as there has been advertising. Clothing lines are selling images, not clothing. Decades of cigarette ads sold youth and cool, along with nicotine. Beer commercials that feature attractive young people are quite obviously selling sex, not beer. If advertisers try to misdirect the public, surely litigators try to misdirect judges and juries. And if the public is susceptible to these manipulations, maybe judges are as well.

1 Variations sometimes add, "if both the law and facts are against you, pound the table."

In this paper, we present five sets of studies that provide examples of why litigators are apt to attempt the same kind of tricks in the courtroom as Solomon Asch and Madison Avenue. In the first, we demonstrate that a simple contrast effect can influence judges by showing that adding a worthless expert witness who lacks credibility can boost one's case.
In the second, we show that judges' assessments of the probative value of forensic evidence depend upon how the information is presented. But at the same time, we show that judges resist this attempt at misdirection. Similarly, in the third set of studies, we show that judges are vulnerable to the misleading effects of conjunctive logic, but that they also seem able to resist this effect in settings that are more relevant to the kinds of judgments they make. In the fourth set of studies, we identify how judges avoid the influence of the hindsight bias on judgments of probable cause by altering the target of the judgment that they make. In the final set of studies, we show how judges attempt to ignore inadmissible evidence in ways that make them vulnerable to further undesirable extra-legal influences. Taken together, these studies show that misdirection is a two-way street in the courtroom. Lawyers can try to misdirect judges to induce erroneous judgments, but judges respond by changing their own focus to more stable targets that make them less vulnerable to untoward influences. These efforts, however, can also backfire, as judges can make themselves more vulnerable to error.

I. Contrast Effects: Or How to Improve Your Case by Hiring a Lousy Expert

Contrast effects are among the most pernicious distractions found in the psychological literature on judgment and choice. This literature demonstrates that the addition of undesirable items to a choice set can produce wildly inconsistent choices. As one paper on the subject put it, "someone who prefers chicken over pasta should not change this preference upon learning that fish is available" (Kelman et al., 1996)--and yet preferences seem to be that fickle. In a series of studies, Amos Tversky and his students had subjects choose between consumer products that varied along multiple dimensions and found that the addition of an inferior product altered the subjects' stated preferences for these products.
(Simonson & Tversky, 1992 and others). For example, in one study, subjects choosing between accepting (as a gift) either a nice Cross pen or $6 were more likely to choose the pen when a third option of a vastly inferior pen was added to the choice set. This phenomenon has been documented in a wide variety of settings, including legal settings. Settlement offers, in particular, are amenable to context effects. In deciding whether to accept a settlement, a litigant is choosing between further litigation and accepting a certain settlement. Consequently, giving a litigant a choice of two settlements, one of which is clearly inferior, makes the more attractive settlement more compelling relative to litigation (just as the addition of the inferior pen makes the good pen seem more attractive than the money). (Guthrie, 2001?). Adding inferior settlement offers can similarly affect the choice between types of settlements. (Kelman et al., 1996). Similar effects can be found in choices among criminal penalties as well. (Kelman et al., 1996).

These kinds of contrast effects would seem to defy logic. At the very least, they are not consistent with a widely held assumption that preferences are invariant. A person who prefers A to B, A to C, and B to C should not prefer B to A and C. The addition of an inferior contrast does not so much change people's judgment of the objects as alter the object of their judgment. People might not truly know whether they prefer a Cross pen to $6, but they can be sure that they prefer a Cross pen to a cheap, used Bic. The contrast between the Cross and the Bic gives them a dimension on which they can evaluate a choice with some measure of certainty. (Hsee, 1998). They can support and defend the choice of the Cross over the Bic in a way that they cannot support the choice of the Cross over the money.
So powerful is the need to support and evaluate one's choice that people can be induced to favor accepting less of a desirable commodity. (Hsee, 1998). In one study, for example, subjects indicated that they would be willing to pay an average of $1.66 for eight ounces of ice cream presented in a ten-ounce cup (making the cup seem underfilled), even though a similar group of subjects indicated that they would pay an average of $2.26 for seven ounces of ice cream in a five-ounce cup (making the cup seem overfilled). The object of judgment is not the absolute value of the ice cream, but whether the subjects think they are getting good value for their money.

All of these studies provide examples of the basic social psychological point that changing the object of judgment can manipulate people's choices. When people are faced with a choice between the Cross pen and the money, they are evaluating the Cross pen in terms of its cash value. When the cheap Bic becomes available, however, they are judging the Cross in terms of its value as a pen as compared to other pens--and it looks good on that dimension. The nature of the inquiry has changed. Similarly, putting ice cream in a cup that is too large makes it look like you are being cheated, while putting it in a cup that is too small makes it look like you are getting a bargain. This kind of misdirection is much like the driver who looks for his keys under a lamppost across the street from where he actually lost them because the light is better. No one knows what a Cross pen is worth to them, but they know it is worth more than a Bic, so they choose it more frequently. Despite the illogic of these choices, "evaluability" and contrast effects are potent phenomena.

But can contrast effects influence judges? To assess this, we had a group of trial judges evaluate a hypothetical legal question designed to elicit contrast effects.
The judges were in attendance at a state-wide annual judicial education conference for Florida Circuit Court Judges in June of 2006. The judges participated in this research as part of their annual conference and were part of a plenary session labeled only "Judicial Decision Making". During this session, the judges completed questionnaires that included a number of hypothetical scenarios.2

2 This sample is described in our other papers.

We designed one of these scenarios to test the contrast effect in the evaluation of expert witnesses. The scenario described a child-custody dispute, with the following text:

"Imagine that you are presiding over a child custody dispute in which the husband and wife are at odds over the custody of their 11-year-old son, Jeremy. The husband and wife are both competent parents, but their relationship with each other is profoundly strained. They have rejected a joint custody relationship and are each seeking sole custody of Jeremy (though the other parent would retain visitation rights). Both the husband and the wife have retained experts to testify as to the custodial arrangement that would serve Jeremy’s “best interests.” Based solely on the information provided below, which of the following experts would you deem to be most credible (please select one only):"

The materials then described the expert witnesses. The judges were randomly assigned to one of two conditions: a control condition, in which they chose between two different experts--one working on behalf of each party--who were designed to be of roughly comparable quality, and a contrast condition, which provided the same two experts but also added a third expert for the husband. The third expert had vastly inferior qualifications. In effect, the third expert is the equivalent of the Bic pen in this variation on contrast effects. The materials described the two comparable experts as follows:

Wife’s Expert – Dr. Henry is a licensed psychologist with a B.A.
in Psychology from Stanford and a Ph.D. in Clinical Psychology from the University of Michigan. Dr. Henry has practiced as a clinical psychologist for 20 years in the District of Columbia, working primarily with children and families. Dr. Henry has testified as an expert in 15 child custody cases, seven times for the wife and eight times for the husband. In this case, Dr. Henry will testify that the wife should get custody.

Husband’s Expert – Dr. Williams is a licensed psychiatrist with a Bachelor’s Degree in Biology and an M.D. from Emory University. Following medical school, Dr. Williams completed a psychiatric residency and has since practiced psychiatry for 10 years in the Miami area, working primarily with children and families. Dr. Williams has testified as an expert in ten child custody cases, four times for the husband and six times for the wife. In this case, Dr. Williams will testify that the husband should get custody.

The third expert, also identified as the husband's expert, was described as follows:

Husband’s Expert – Dr. Hancock is a psychiatrist with a B.A. in Psychology from the University of Mississippi and an M.D. from St. George’s University School of Medicine in Grenada. Dr. Hancock has never been admitted to practice medicine in the United States. Dr. Hancock has, however, testified as an expert in 37 prior child custody cases, each time for the husband. In this case, Dr. Hancock will testify that the husband should get custody.

Of the 144 judges who reviewed this problem,3 six did not respond (four who saw the two-option version and two who saw the three-option version). None of the judges in the three-option condition chose the third expert.4 Even though no judges identified the weaker expert as the most credible, the addition of this expert made the husband's better expert seem more credible. In the control (two-option) condition, 54.3% (38 out of 70) of the judges chose the husband's witness as the more credible. In the contrast condition, this rose to 72.1% (49 out of 68).
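For readers who wish to check the comparisons reported in this Part, the following is a minimal sketch of a two-sided Fisher's exact test applied to the response counts given in the text and footnotes. It is an illustration only, not the analysis code used for the study; the function name fisher_exact_two_sided and the table labels are our own hypothetical choices, and the sketch assumes the conventional two-sided definition (summing every table with fixed margins that is no more probable than the observed one).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are conditions (control, contrast); columns are choices
    (target option chosen, target option not chosen).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # holding all row and column totals fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# Counts as reported in the text and footnotes (chosen, not chosen) per condition.
tables = {
    "Florida judges (expert witnesses)": (38, 32, 49, 19),
    "Original version (other jurisdiction)": (26, 5, 28, 1),
    "Combined expert-witness samples": (64, 37, 77, 20),
    "Magistrate judges (settlement)": (10, 11, 8, 13),
}

for label, counts in tables.items():
    print(f"{label}: p = {fisher_exact_two_sided(*counts):.3f}")
```

The printed values can be compared against the p-values reported in the footnotes; small differences could arise if a different two-sided convention was used.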
This difference was significant statistically.5 Judges, it seems, are no different than ordinary consumers in that they are vulnerable to contrast effects. It is hard to evaluate the reliability of an expert witness (particularly since we did not provide the testimony, or any detail other than their qualifications). But it was easy enough for the judges in our study to tell that one psychiatrist was better than the other. In effect, the addition of this weak expert changed the object of judgment; the judges were rightly seeing the good psychiatrist as more qualified than the weaker psychiatrist, which distracted them from the more critical inquiry as to whether the psychiatrist was more qualified than the psychologist.

This result suggests an insidious litigation strategy--put forth a weak expert to make your good expert look better. That can, and should, seem ridiculous, but contrast effects are powerful in real-world settings, just as they are in the lab. Real estate agents reportedly show clients houses that are wildly inferior on some dimension that is important to homebuyers as a way of getting them to see the real target that the agent is trying to sell as a good buy. CITE. Firms often offer extremely high-end (and high-priced) versions of their product without any real hope of selling many of that version, but only to make the price of the version they really hope to sell seem more attractive. (Tversky & Simonson, XXX). If such distractions can affect homebuyers and consumers, who are making serious decisions, then judges might be just as vulnerable.

3 This was roughly half of the judges in attendance at the conference. We varied our materials slightly so that half of the judges read this problem and the other half read an unrelated scenario.

4 We originally presented this problem to a group of judges at an educational conference in another jurisdiction.
In the original version, we identified the wife's expert as the psychiatrist and the husband's experts as psychologists, one of whom had vastly inferior qualifications to the other. In that version, we encountered a kind of ceiling effect, in that in the control condition, 84% of the judges (26 out of 31) chose the husband's expert as more credible; that is, there was not much room for more support for this expert in the contrast condition. The addition of the contrasting inferior expert psychologist increased this to 97% (28 out of 29), but this trend was not significant. (Fisher's exact test, p = .20.) Many of the judges informed us that they believed psychologists were better witnesses in such cases, and hence we re-wrote the qualifications and education of the experts so as to make the choice in the control condition a closer call.

5 Fisher’s exact test, p = .035. If we combine the results from the similar, original version of the problem and this version, the combination also produces a significant contrast effect, with 63% (64 out of 101) choosing the husband's expert in the control condition as compared to 79% (77 out of 97) in the contrast condition. Fisher's exact test, p = .01.

Contrast effects might also explain why lawyers often include weak arguments in their briefs to accompany arguments that stand a real chance of success. The weak arguments might make the strong ones seem stronger by contrast. Obviously, including a weak argument might undermine the credibility of the lawyer as well. And it would seem surprising if judges could not make more stable judgments of the quality of the arguments attorneys present. But contrast effects are surprisingly strong and seem to affect consumers in situations in which they have quite a bit of experience (such as evaluating the monetary value of ice cream).
Our study demonstrates that contrast effects can influence judges in at least one relevant legal setting, and perhaps the effect is as potent and general in judges as it is in consumers.

Similarly, lawyers might take advantage of the related phenomenon known as compromise effects. That is, asking for a more extreme result than one expects to obtain can move judgments in the direction of that extreme result. In one study demonstrating this effect, people's judgment as to whether a homicide was an instance of second- or first-degree murder was affected by whether they also had available a more extreme choice, consisting of first-degree murder with special circumstances (which could have drawn the death penalty). (Kelman et al., 1996; Guthrie, Iowa?). Although few subjects thought that murder with special circumstances was appropriate, the availability of this option increased the percentage of subjects who judged that the homicide constituted first-degree murder. The addition of a more extreme option shifts the scale in that direction, so that the "compromise" of choosing in the middle moves along with that extreme judgment. Although we did not test the compromise effect in judges, it is closely related to the contrast effect.

Even though context effects seem to be a widespread phenomenon that permeates marketing and the consumer environment, and our study demonstrates their potential influence on judges, we have conducted another study in which we were unable to elicit contrast effects in judges. In this study, we presented judges in attendance at an educational conference sponsored by the Federal Judicial Center for United States Magistrate Judges with the task of evaluating two settlement offers. The case consisted of a civil rights claim by an African-American high school honors student who had been shot in the back by a security guard at a public university where he was taking classes.6 Because the plaintiff was a minor, the judge had to approve the settlement. The defendant university offered either a $3 million settlement or a $1.5 million settlement together with firing the security guard. The materials indicated that the student wanted to accept the lesser sum so as to ensure that the guard would get fired. For half of the judges, we created the contrast effect by indicating that the University had initially only been willing to offer $1.5 million and a suspension of the guard.7 The materials ultimately asked the judges whether they would allow the plaintiff to accept the lesser sum.8

The 42 judges at the conference all answered the question, but they did not exhibit a contrast effect.9 In the control condition, 48% (10 out of 21) indicated that they would allow the plaintiff to accept the lesser settlement, as compared to 38% (8 out of 21) in the contrast condition. Thus, contrary to our prediction, the addition of the earlier settlement offer, which contrasted unfavorably with one of the options, made that option slightly less attractive. The difference in acceptance rates of this option was not significant.10

6 The facts of the scenario, labeled "Settlement Problem," consisted of the following:

“You are presiding over a settlement conference in a lawsuit filed by a minor, Henry Johnson, against Ted Samuelson, a campus police officer employed by the State University. The suit includes a claim under 42 U.S.C. § 1983 and state law tort claims. The University will indemnify Samuelson and has assumed the defense of this action.

"Johnson is a 16-year-old African-American from a poor neighborhood. He lives with his four younger siblings and his mother, who works as a hotel maid. He has not seen his father in many years. In January of Johnson’s junior year in high school, he began taking some classes at the local University, through a special program offered to honor students.

"While returning from the University library late one evening, Johnson was stopped by Officer Samuelson. Samuelson began questioning Johnson aggressively in connection with an armed robbery. Nervous and frightened, Johnson ran off. Samuelson shouted at him to stop. When Johnson kept running, Samuelson shot Johnson in the back. The bullet damaged Johnson’s spinal cord leaving him permanently unable to walk. The incident has left Johnson bitter and angry. He is nonetheless determined to complete his classes at the University (which is the best public college in the state) and perhaps enroll as a full-time student after high school.

"Johnson’s mother is his guardian ad litem. Between her job, managing Johnson’s injuries, and caring for her other four children, however, she is overwhelmed. She is counting largely on others to do the best for her son, and will approve any decision her son and his attorney make. Johnson’s lawyer seems competent, but is inexperienced. Neither side has actively pursued discovery and both seem interested in settling. Criminal charges against Samuelson for the shooting were dropped after a brief investigation."

7 This was described as follows: "In a previous settlement conference, Johnson insisted that Officer Samuelson be disciplined and the University initially refused. After protracted discussions, the University reluctantly offered to pay Johnson $1.5 million and to suspend Samuelson for three months without pay. Johnson rejected that offer, and you adjourned the conference to give the parties a chance to reconsider their positions."

8 This was stated as follows: “The [second] settlement conference also proved to be contentious. The University attorneys wanted to discuss a cash offer (which would go into a trust because Johnson is a minor), but Johnson seemed interested only in disciplinary action against Officer Samuelson. After a lengthy bargaining session, University attorneys gave Johnson two options, and announced that unless he accepted one, it would withdraw both and actively litigate the claim. The options are:

A) The University fires Officer Samuelson and pays Johnson $1.5 million
B) The University takes no disciplinary action against Officer Samuelson and pays Johnson $3 million

You privately discussed the University’s offer with Johnson and his attorney. Johnson wants to accept option A. He is furious that Officer Samuelson has not been disciplined and is worried about studying on the same campus that Samuelson patrols. Johnson also needs cash to pay his medical bills, however, which have become a concern to his family. Because Johnson is a minor, you must also approve the settlement. Johnson has told you that he will reluctantly accept option B if you would not approve option A. If Johnson accepts settlement option A, will you approve it as a settlement?

___ Yes, I would approve option A as a settlement
___ No, I would not approve option A as a settlement”

9 The 42 judges included 41 U.S. Magistrate Judges and one Federal District Judge.

10 Fisher's exact test, p = .75.

While we think it important to report this apparent disconfirmation of the effect, we do not think that it undermines the basic conclusion that judges are vulnerable to contrast effects. In the study, we were attempting to replicate some of the studies of contrast effects in settlement conducted by Kelman and his coauthors (1996) and by Guthrie (2002). In these studies, the addition of an inferior choice tended to boost acceptance of the choice that it most closely resembled. Unlike the previous studies, however, we did not offer the judges a third option.
Instead, we had thought that the judges might consider the previous settlement offer as a contrast. Even though we feel our version represented a more realistic way that settlement offers might be structured in such a case, it is possible that the contrast effect depends upon that previous offer being available as a choice, just as it is in the studies of consumer choice. It is also possible that the role in which we cast judges in this study undermined the effect. Judges were not selecting their preferred choice, but were deciding whether to approve or reject the decision of a litigant. Although we believe that the contrast effect should have improved the desirability of the option that the litigant chose, it could be that the judges were focused on a different object of judgment--the maturity of the litigant. The contrast effect was thus somewhat indirect, at best, in this case, which might have diluted the phenomenon.

II. Imaging the Numerator: Or How to Lie To Judges With Statistics

In other work, we have argued that judges, like most adults, use two distinct cognitive systems to make judgments: an intuitive system founded largely on affective processes ("System 1") and a deliberative system founded largely on deductive processes ("System 2") (Guthrie et al., 2007). The intuitive system is surprisingly accurate (Gladwell, 2004), but can lead to predictable errors in judgment. Because the intuitive system is faster than the deductive system, good judgment requires the ability to suppress the intuitive response and substitute a response based on deduction. We have found that judges sometimes suppress misleading intuitive responses, but they do not do so consistently. (Guthrie et al., 2007).

This gap in judicial ability suggests a vulnerability that savvy lawyers can exploit. A lawyer who can change the object of judgment to one that triggers a favorable affective judgment potentially gains an advantage in trying to persuade a judge.
The presentation of actuarial or statistical information provides a prominent example of how manipulating the object of judgment can influence judgment by triggering a misleading intuitive response. For example, it is widely thought that actuarial information can often seem drab and unpersuasive relative to anecdotes. (Borgida, 1978?). Even Josef Stalin is reported to have articulated this tendency in the quote (widely attributed to him) that "the death of a single life is a tragedy, but the death of millions is a mere statistic." (cite). Anecdotes and stories play on the intuitive system, triggering rapid responses. Statistics require the slower, emotionally disconnected deliberative system to process. Individual examples and identifiable victims trigger quick, emotional reactions that people do not easily override with judgments based on statistical information. (Loewenstein et al, cite).

The ease with which exemplars can trigger people to use their emotional system to make judgments that are inconsistent with a careful, statistical analysis is illustrated neatly by the so-called "jellybean study" by Seymour Epstein. (Epstein, 1998?). In this study, Epstein told subjects that they would receive a prize for drawing a red jellybean from one of two jars filled with red and white jellybeans. One jar contained 1 red and 9 white jellybeans, while the other contained 10 red and 90 white jellybeans. Even though the probability of drawing a red jellybean was identical for the two jars, subjects preferred to draw from the jar with 10 red jellybeans. Epstein termed the phenomenon "imaging the numerator", arguing that the subjects' intuitive systems reacted to "more chances to win" in the second jar, thereby ignoring the fact that this jar also provided a proportional number of chances to lose. Many people did not override the intuitive focus on the larger set of red jellybeans in the second jar.
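The arithmetic behind the jellybean study can be made explicit with a short sketch. It is offered only as an illustration of the proportions described in the text; the helper name p_red is a hypothetical label of our own, and the third jar (7 red, 90 white) is the variant of the second jar that the study is reported to have used.

```python
from fractions import Fraction


def p_red(red, white):
    """Exact probability of drawing a red jellybean from a jar."""
    return Fraction(red, red + white)


small_jar = p_red(1, 9)      # 1 red, 9 white
large_jar = p_red(10, 90)    # 10 red, 90 white
skewed_jar = p_red(7, 90)    # 7 red, 90 white (variant of the large jar)

# The two original jars offer exactly the same chance of winning.
print("small jar :", small_jar, "=", float(small_jar))
print("large jar :", large_jar, "=", float(large_jar))

# The variant has more red jellybeans than the small jar (7 versus 1)
# but a lower probability of drawing one.
print("skewed jar:", skewed_jar, "=", round(float(skewed_jar), 4))
print("skewed jar worse than small jar:", skewed_jar < small_jar)
```

A subject who attends only to the numerator (7 red jellybeans versus 1) will favor the variant jar even though its denominator (97 jellybeans in total) makes it the worse bet.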
So powerful is the phenomenon of imaging the numerator that many subjects preferred the second jar even when it contained as few as 7 red jellybeans (while still containing 90 white jellybeans).

Work by John Monahan and Eric Silver (2003) suggests that the phenomenon of imaging the numerator also affects judges. These researchers gave judges hypothetical scenarios involving a question of whether to commit an individual suffering from mental illness involuntarily to an institution. They described the individual's condition and diagnosis and had judges identify how great the risk of violence would have to become before they would commit the individual. The judges chose one of five categories of risk: 1%, 8%, 26%, 56%, or 76%. When the researchers presented the risk of violence in subjective format (that is, as a percentage chance, as in the previous sentence), the modal threshold for commitment was 26%. When the researchers presented the risks in a frequency format (that is, 1 in 100, 8 in 100, 26 in 100, 56 in 100, or 76 in 100), the modal threshold dropped to 8 in 100. Presenting the risks of violence in a "frequentist" format made it easier for judges to think of the person as violent. The authors liken this effect to imaging the numerator, arguing that the judges focused on the instances of violence described in the frequentist format. In contrast, the percentages feel abstract.

The trend for judges to be more willing to commit mentally disturbed individuals when the risks of violence were expressed in frequency formats was not significant in Monahan and Silver's study.11 But the result is consistent with results found with forensic experts in a series of other studies. (Slovic & Monahan, 1998, 2000). In one of these studies, for example, clinical psychologists were asked to indicate whether they would be willing to recommend committing a similar individual.
Among the experts who were told that "8%" of people with this individual's condition commit a violent act, 39% stated that they would recommend committing him. Among the experts who learned that "8 out of 100" people with this individual's condition commit a violent act, 61% stated that they would commit him.

11 Although the authors do not report test statistics, they report the raw data on page 4, which enabled us to conduct an ordered logistic regression, which confirmed that the trend was not significant (p = .38). The authors only had 26 judges available for the study, a sample that was unlikely to detect the effect that researchers have found for probability format in other contexts. QUICK POWER ANALYSIS HERE--effect size from Slovic & Monahan (1998, 2000).

The lesson for an attorney in this context is straightforward. A lawyer who wants a judge to commit a patient involuntarily should present relevant statistics both to the judge and courtroom expert in a frequentist format. In similar work, Koehler has demonstrated how frequentist format can affect ordinary adults acting as jurors in a criminal case. (Koehler, 2001). Koehler presented jurors with a one-page description of a criminal case in which a prosecutor presented probabilistic testimony of a forensic technique known as PCR matching (which is a simple form of a DNA matching test).12 The materials indicated that blood from the crime scene matched that of the defendant using this PCR technique. Koehler also presented the probability that an innocent person randomly drawn from the community at large would also match the blood sample. He presented this probability either as 0.1% (subjective format) or as 1 in 1,000 (frequency format).13 The frequentist format made it easier to imagine that a large number of individuals would match the blood type.
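To see why the frequency format invites a reader to picture other matching individuals, it helps to work out how many coincidental matches a 1 in 1,000 rate implies. The sketch below is illustrative only. The population sizes are hypothetical values chosen for the example and do not come from Koehler's materials, and the second function assumes that exactly one person in the population is the true source and that, before the test, every person is equally likely to be that source.

```python
from fractions import Fraction

MATCH_RATE = Fraction(1, 1000)  # 0.1%, the same figure in either format

def expected_innocent_matches(population, rate=MATCH_RATE):
    """Expected number of non-source people who match by coincidence."""
    return (population - 1) * rate

def chance_matcher_is_source(population, rate=MATCH_RATE):
    """Probability that a known matcher is the source, ASSUMING one true
    source and an equal prior chance for everyone in the population."""
    return 1 / (1 + expected_innocent_matches(population, rate))

# Hypothetical population sizes, for illustration only.
for population in (1_001, 100_001, 1_000_001):
    matches = expected_innocent_matches(population)
    chance = chance_matcher_is_source(population)
    print(f"{population:>9,} people: about {float(matches):,.0f} innocent "
          f"matches; chance the matcher is the source = {float(chance):.4f}")
```

Under these assumptions, the 0.1% figure is not itself the chance that a matching person is innocent; that chance depends on how many other people could have left the blood.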
In contrast, the subjective format makes it easier to commit the "inverse fallacy" (Thompson, 19??); that is, people might quickly confuse the 0.1% with the probability that the defendant is innocent. Hence, while the deductive system will easily see these two formats as identical, the intuitive system will see the frequentist presentation as pointing towards innocence and the subjective format as pointing towards guilt. Koehler found exactly these results. Subjects who saw the statistics presented in subjective format were more likely to conclude that the blood at the crime scene had come from the defendant than those who saw the frequentist format. Interestingly, as discussed below, they did not also express a greater willingness to convict the defendant. These studies produce a clear set of advice for lawyers. Prosecutors should present the statistics in a subjective format and defense attorneys should present the statistics in a frequentist format.

12 The text of Koehler's full scenario was as follows:

"Imagine that you are presiding over People v. Nethers, a criminal case. In People v. Nethers, Steven Nethers was accused of murdering Richard Oden during an attempted robbery of a hardware store owned by Mr. Oden. According to reliable eyewitness accounts, the perpetrator entered Mr. Oden's hardware store at approximately 4:30 p.m. on November 2, 1997 wearing a Halloween-type of mask, and waving a small caliber handgun. The perpetrator approached Mr. Oden (who was behind the cash register) and said, "Open it fast or you're a dead man." According to the eyewitnesses, when the perpetrator turned his head to survey the store, Mr. Oden grabbed a hammer from the counter and smashed the perpetrator on the head with a single blow. The perpetrator fired a single shot into Mr. Oden's chest and fled the store. Mr. Oden died shortly thereafter in a local hospital.

"During an investigation of the hardware store crime scene, the police identified and recovered several moist blood drops from the path that was taken by the perpetrator as he fled the store. These drops were subjected to a form of DNA analysis called PCR testing. The PCR tests revealed the blood to be of a type known as "2, 3." Because this blood type was different from Mr. Oden's blood type, police believed that the recovered blood drops came from the bleeding head of the robber. During routine interviews of people who live in the neighborhood, the police identified several potential suspects. All of these individuals agreed to provide blood samples to police for comparison with blood that was recovered from the crime scene. One of the suspects, Mr. Steven Nethers, matched the 2, 3 blood type and was arrested for the murder.

"At trial, the prosecution alleged that the blood analysis demonstrated Mr. Nethers was the source of the wet blood drops, and that he was therefore guilty of attempted robbery and murder. A DNA expert testified that his tests could not rule out Mr. Nethers as a possible source of the blood drops. He also testified that the probability that the suspect would match the blood sample if he were not the source is 0.1%. The defense argued that the blood evidence is irrelevant because there was no direct evidence, such as eyewitness identifications, that linked Mr. Nethers to these crimes."

13 Koehler also varied the specificity of the population of potential innocents. He stated this either in a general way (e.g., "the probability that the suspect would match the blood sample if he were not the source is 0.1%") or by identifying a specific population (e.g., "0.1% of people in Houston would also match the blood drops"). In our study, reported below, we also varied this parameter, but found that it had little effect on the judges, and it is less relevant to our hypothesis than the subjective versus frequentist format. Hence, we do not discuss this in detail.
Each format flags a slightly different aspect of the statistical evidence. The subjective format induces a sense of the power of forensics and the near certainty of a 0.1% match. People exposed to statistics in this format are assessing whether they believe that the person has only a 0.1% chance of being innocent. By contrast, the frequentist format invites people to imagine a large number of potentially innocent people who might also match the evidence. People exposed to statistics in this format have to judge whether the defendant is one of the many innocents or is the guilty party, based on the other facts. The format changes the target of the subjects' analysis and changes the outcome.

To assess whether this same effect could be observed in trial judges, we presented Koehler's materials to a group of 68 trial judges from an urban Eastern jurisdiction who were attending their annual educational conference in May 2002.14 Judges saw one of four versions of Koehler's scenario, two in which the probability of matching was presented in subjective (0.1%) format and two in which it was presented in frequentist (1 out of 1,000) format. This variation was crossed with the degree of specificity of the population from which the potential innocent matches might be drawn (matching in general, or matching people from the District of Columbia). Following Koehler, we asked the judges three questions about the fact pattern:

"Based on this evidence, what is the probability that the defendant is the source of the recovered DNA trace? ___%

"Based on this evidence, what is the probability that the defendant is guilty of the murder of Mr. Oden? ___%

"Based on this evidence, how would you find the defendant (assuming a bench trial)? (Guilty or Not Guilty)"

As to the first two questions, the judges produced results similar to those of the subjects in Koehler's study.
Among the 24 judges who saw the frequentist presentation, the average probability that the blood was from the defendant was 42%, whereas it was 73% among the 29 judges who saw the subjective format.15 This difference was significant statistically.16 Judges echoed this result when asked about the probability that the defendant committed the crime. The judges in the frequentist condition gave an average probability of guilt of 44% (21 judges), as opposed to 68% in the subjective condition (27 judges). This difference was also significant statistically.17 These assessments, however, did not affect the judges' verdicts. Among the judges who saw the frequentist presentation, 20% (6 out of 30) indicated they would find the defendant guilty, as opposed to 25% (9 out of 36) of the judges who saw the subjective version. This difference was not significant.18 This result also echoes that of Koehler, who found that 32% of lay adults who read frequency statistics were willing to convict, as opposed to 36% of those who read subjective statistics. (Koehler, 2002).

14 These judges preferred not to have the jurisdiction identified. Three judges declined to allow us to use their results for any further discussion, and the results from these judges have been omitted from all analysis.

15 Fourteen of the judges did not respond to this question or responded by indicating "I don't know".

The results suggest an interesting dichotomy. Judges (and lay adults) were influenced by the format. They assessed the facts differently and reported that they were more persuaded by the evidence when it was presented in the subjective format. But the format did not affect the judges' willingness to convict the defendant. This result suggests that asking judges to assess guilt changes their perspective. We believe that when judges render a verdict in a criminal context, they slow down and think a bit harder.
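The verdict comparison reported above rests on Fisher's exact test, and the test can be recomputed from the counts given in the text (6 of 30 guilty verdicts in the frequentist condition and 9 of 36 in the subjective condition). The sketch below is a minimal pure-Python version of the two-sided test, offered as a check rather than as the analysis code used for the study; the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_prob(x):
        # Probability that x of the col1 "guilty" verdicts fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        table_prob(x)
        for x in range(lowest, highest + 1)
        if table_prob(x) <= observed * (1 + 1e-9)  # tolerance for float ties
    )

# Guilty / not guilty counts as reported in the text.
p_value = fisher_exact_two_sided(6, 24, 9, 27)
print(f"two-sided p = {p_value:.2f}")
```

With these counts the function should return a value close to the reported p = 0.77, which is consistent with the conclusion that format did not reliably shift verdicts in a sample of this size.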
The verdict triggers careful thoughts about due process and the degree of confidence in the evidence in a different way than when we simply ask for an assessment of probabilities. (Nesson, 19??). The question of guilt directly flags confidence in the police and prosecutors, and the judges' own willingness to endure the risk of a wrongful conviction. For judges, a criminal verdict represents a different question from that of the probability of a blood match.

To be sure, the fact that we, and Koehler, observed a small trend towards greater conviction in the subjective format suggests that we might not have enough statistical power to detect an effect on a binary outcome, like guilt. Koehler, in fact, suggests as much, and cites a similar study in which a significant effect was observed on guilt. (Koehler, 2002). But in light of our research on the role of intuition and deliberation, we suggest that the question of guilt prompts more deliberative judgments. Judges and jurors have to be confident in guilt to conclude that they will convict. This greater confidence requires more deliberation and produces a different judgment than an assessment of probability.

16 F(1,49) = 7.86, p = .008. Specificity (general versus specific city) and the interaction between specificity and format were not significant (F = 0.83 and F = 0.89, respectively).

17 F(1,44) = 4.83, p = .03. Specificity (general versus specific city) and the interaction between specificity and format were not significant (F = 1.26 and F = 0.00, respectively).

18 Fisher's exact test, p = 0.77.

III. Conjunction Effects: Or, Persuading With Specificity?19

The so-called “extension rule”20 is “perhaps the simplest and most transparent rule of probability theory.”21 This rule states that “if A is a subset of B, then the probability of A cannot exceed that of B.”22 For example, the probability of a terrorist act in New York City (A) cannot exceed the probability of a terrorist act in the United States (B) because the United States includes New York City (as well as many other locations that might be subjected to such an attack). Implicit in the extension rule is the “conjunction rule.”23 This rule states that “the probability of A&B can exceed the probability of neither A nor B, since it is contained in both.”24 For example, the probability of a terrorist attack in New York City carried out by Muslim extremists (A&B) cannot exceed the probability of a terrorist attack in New York City (A) or the probability of a terrorist act carried out by Muslim extremists (B).

The extension and conjunction rules are deductively accurate, as only a little deliberation shows. Psychologists have found repeatedly, however, that people tend to violate these rules of logic. Rather than engaging in careful deliberation, which leads to compliance with the rule, people often engage in intuitive, impressionistic thinking and thereby violate the rules. Apparently, it seems more likely that New York might face a terrorist act committed by Muslim extremists than that New York might face a terrorist attack.

The most famous problem of this type, the “Linda Problem,”25 is instructive. In this widely administered problem, Professors Amos Tversky and Daniel Kahneman gave subjects the following information about Linda:

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.26

The researchers asked the subjects to rank-order the likelihood of eight different statements, including these three: “Linda is active in the feminist movement”; “Linda is a bank teller”; and “Linda is a bank teller and is active in the feminist movement.”27 The description made it seem as though Linda was a feminist, but not a bank teller.

19 Note: Much of this section (everything other than the between-subjects conjunction problem) was taken from our article in the Duke Law Journal ("The 'Hidden Judiciary': An Empirical Investigation of Executive Branch Justice"), where the data from our extensional fallacy problem was first reported.

20 Maya Bar-Hillel & Efrat Neter, How Alike Is It Versus How Likely Is It: A Disjunction Fallacy in Probability Judgments, 65 J. PERSONALITY & SOC. PSYCHOL. 1119, 1119 (1993); see also Amos Tversky & Daniel Kahneman, Extensional Versus Intuitive Reasoning: The Conjunction Fallacy in Probability Judgment, 90 PSYCHOL. REV. 293, 293–94 (1983) (“[The] probability theory does not determine the probabilities of uncertain events—it merely imposes constraints on the relations among them. For example, if A is more probable than B, then the complement of A must be less probable than the complement of B.”).

21 Researchers have described this as “perhaps the simplest and most transparent rule of probability theory,” Bar-Hillel & Neter, supra note 20, at 1130, which “even untrained and unsophisticated people accept and endorse.” Id.

22 Id. at 1119.

23 Id.

24 Id.

25 See Tversky & Kahneman, supra note 20, at 297; Amos Tversky & Daniel Kahneman, Judgments of and by Representativeness, in JUDGMENT UNDER UNCERTAINTY: HEURISTICS AND BIASES 84, 92 (Daniel Kahneman, Paul Slovic & Amos Tversky eds., 1982).
As a result, subjects generally reported that it was more likely that Linda was a bank teller active in the feminist movement than that she was a bank teller.28 Obviously, this is inaccurate. Under the conjunction rule, it cannot possibly be the case that it is more likely that Linda was a bank teller and was active in the feminist movement than that she was simply a bank teller.29

To explore whether judges would comply with, or violate, the conjunction rule, we gave those who attended the national conference a problem called the “Employment Case.” We asked a group of 102 Administrative Law Judges ("ALJs") to imagine that they were presiding in a case involving an employment dispute between Dina El Saba, a public sector employee, and the agency for which she previously worked. The judges learned that Dina worked as an administrative assistant for a senior manager named Peter before the agency fired her. While at the agency, Dina’s employment evaluations were “average” to “above average,” so she claimed her termination must have been motivated by unlawful discrimination. The agency contended, instead, that it terminated Dina because she repeatedly violated workplace rules and norms. Among other things, she “took too many breaks during the workday and took odd days off as holidays”; “dressed in ways that made her co-workers and agency visitors feel uncomfortable, covering herself mostly in black”; acted “odd” and “aloof”; and refused to eat lunch in the presence of male co-workers. Based solely on these facts, we asked the ALJs to rank-order the likelihood of the following four options:

_____ The agency unlawfully discriminated against Dina based on her Islamic religious beliefs
_____ The agency actively recruited a diverse workforce
_____ The agency adhered to its internal employment policies
_____ The agency actively recruited a diverse workforce but also unlawfully discriminated against Dina based on her Islamic religious beliefs

26 Tversky & Kahneman, supra note 20, at 297; Tversky & Kahneman, supra note 25, at 92.

27 Tversky & Kahneman, supra note 20, at 297; Tversky & Kahneman, supra note 25, at 92.

28 Tversky & Kahneman, supra note 20, at 297; Tversky & Kahneman, supra note 25, at 93.

29 The Linda problem can be criticized as methodologically flawed in that people might be assuming that the single feature is actually meant to be the conjunction of “bank teller” and “not active in the feminist movement.” But Professors Tversky and Kahneman have conducted a version in which they avoid this problem by changing the second response to “Linda is a bank teller whether or not she is active in the feminist movement” and obtained similar results. Tversky & Kahneman, supra note 20, at 299.

Based on the facts we provided, we believed that the ALJs would deem it likely that the agency unlawfully discriminated, but not that the agency actively recruited a diverse workforce. Thus, we expected the ALJs to deem the fourth option (“The agency actively recruited a diverse workforce but also unlawfully discriminated against Dina based on her Islamic religious beliefs”) more likely than the second option (“The agency actively recruited a diverse workforce”), even though, as a matter of deductive logic, the second option must be at least as likely as the fourth. Judges who identified the fourth option as more likely than the first option (“The agency unlawfully discriminated against Dina based on her Islamic religious beliefs”) would also be violating the conjunction rule. As expected, we found that the ALJs violated the conjunction rule. Rather than thinking through the problem deliberatively, which would have led them to rank options one and two as more likely than option four, most did the exact opposite.
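The scoring rule implied by this design can be made concrete. The following is a minimal sketch of how a single response might be scored, assuming that each judge's answer is recorded as a rank for each of the four options (1 = most likely). This data layout and the function name `conjunction_violations` are hypothetical and are not taken from the study's materials. Consistent with the way the errors are tallied in the text, a tie between the fourth option and one of its components is scored as an error.

```python
def conjunction_violations(ranks):
    """Return the components that option 4 (the conjunction) fails to rank below.

    `ranks` maps option number (1-4) to its rank, where 1 = most likely.
    Option 4 is the conjunction of options 1 and 2, so it should be ranked
    as less likely (a larger rank number) than each of them.
    """
    violated = []
    for component in (1, 2):
        # A tie or a better rank for the conjunction is scored as an error.
        if ranks[4] <= ranks[component]:
            violated.append(component)
    return violated

# Hypothetical responses, for illustration only.
consistent = {1: 1, 2: 3, 3: 2, 4: 4}   # conjunction ranked least likely
lured = {1: 2, 2: 4, 3: 3, 4: 1}        # conjunction ranked most likely

print(conjunction_violations(consistent))  # []
print(conjunction_violations(lured))       # [1, 2]
```

A response is logically consistent only when the returned list is empty; a list naming one or both components corresponds to the error categories reported in the text.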
Of the ninety-nine ALJs who responded to this problem, eighty-four (or 84.8 percent) violated the conjunction rule in some way. These eighty-four judges committed all of the possible errors, albeit at different rates: thirty-three rated the fourth option as either equally likely as, or more likely than, both the first and second options;30 thirty-six ranked the fourth option as either equally likely as, or more likely than, the first option (but not the second option);31 fifteen ranked the fourth option as either equally likely as, or more likely than, the second option (but not the first option).32 Thus, the problem lured most judges into committing the conjunction error, and they were more likely to commit the error with the distraction of the more probable of the two components, just as happened in the classic Linda problem.33

30 Of these thirty-three judges, twenty ranked the fourth option as more likely than both options one and two, eight wrote in that they were all equally likely, three assigned the same ranking to options one and two as to option four, and two simply put a check mark next to option four.

31 Of these thirty-six judges, one ranked the first and fourth options as equally likely.

32 Of these fifteen judges, seven ranked the first and fourth options as equally likely.

33 Although female judges were somewhat more likely to commit the conjunction error than male judges (89 percent, or fifty out of fifty-six, versus 79 percent, or thirty-one out of thirty-nine, respectively), this difference was not significant. Fisher's exact, p = .24. More experienced judges tended to be more likely to commit the error than their younger counterparts, but this trend was also not significant. Logistic regression of committing the error on years of experience yielded a negative, but not significant, coefficient of -.048, z = 1.35, p = 0.18. The error rate was also nearly identical among judges who make recommendations as opposed to the other judges (86 percent, or thirty-six out of forty-two, versus 84 percent, or forty-two out of fifty, respectively). This was not a significant difference. Fisher's exact, p = 1.00. Previous work on the CRT has shown that people who score high on the CRT are less likely to commit the conjunction error. Oechssler et al., supra note, at 5 (“Of our subjects in the Low CRT group, 62.6% [committed the conjunction fallacy on the ‘Linda’ problem, but t]his percentage is much lower for the High CRT group at 38.3% . . . .”). We found that among the judges who scored perfectly on the CRT and answered this question, 72 percent (thirteen out of eighteen) committed the fallacy, whereas 86 percent (sixty out of seventy) of the judges who got at least one of the CRT questions wrong committed the fallacy. The percentage who committed the conjunction error broken down by exact CRT score: zero right on CRT, 84 percent (twenty-five out of thirty); one right on CRT, 86 percent (nineteen out of twenty-two); two right on CRT, 89 percent (sixteen out of eighteen); three right on CRT, 72 percent (thirteen out of eighteen). This difference was not, however, significant. Fisher's exact, p = 0.29. We also ran a logistic regression of whether the judges committed the error with CRT score as a predictor, which also showed no significant effect. z = .72, p = .47. That we only observed a trend might be due to a somewhat small sample size with which to identify the effect.

As with our other findings, we believe that the judges in responding to this problem were judging a different object of judgment than one might have thought. Judges were not treating this problem as one of deductive logic and taking advantage of its logical structure. Rather, they were trying to make sense of the story, which led them to be distracted by their intuitions. Just as "Linda" feels like a feminist bank teller and not an ordinary bank teller, Dina feels like a likely target of discrimination, so options that highlight that fact seem more likely than those that do not. This suggests that providing more detail can induce judges to believe fact patterns more readily, even though the added details necessarily (as a matter of logic) make the stories less likely. But as with the other efforts to channel judges with misdirection in probabilistic evidence, judges might resist this effect when asked a question with more direct legal effect.

We crafted another test of extensional reasoning that provided a more straightforward test of the thesis that adding more detail will persuade judges. We gave this problem to the same group of 68 judges from an urban Eastern jurisdiction who read the problem on probabilistic evidence. In the problem, labeled "Evaluation of Probable Cause", we asked judges to assess probable cause in the issuance of a search warrant. We did not ask the judges whether they would grant a warrant or not, but asked them to assess the likelihood that the defendant had actually committed the crime. The facts ran as follows:

"Imagine that you are trying to decide whether to issue the police a warrant to search the home of a suspect in a murder case. The suspect is John P., a meek man, 42 years old, married with two children. His neighbors describe him as mild-mannered, but somewhat secretive. He owns an import-export company based in New York City, and he travels frequently to Europe and the Far East. Mr. P. was convicted a couple of years ago of smuggling precious stones and metals (including uranium) and received a suspended sentence of 6 months in jail and a large fine. The murder victim is one of his employees.

"The police have an eyewitness who knows Mr. P and states that he thinks that he may have seen Mr. P. at the victim’s apartment shortly before the time that the police believe that the victim was murdered. The police have no other meaningful evidence against Mr. P. They have no other suspects at this time.

"In deciding whether there is probable cause for issuing the warrant, it is important to get a sense of the likelihood that Mr. P may have committed the crime. . . . "

All of the judges received these facts, but we varied the question we asked judges slightly. For half, we asked simply "Given the description above, how likely is it that Mr. P. committed the murder?" For the other half, we added a motive: "Given the description above, how likely is it that Mr. P. committed the murder to prevent the victim from talking to the police about smuggling?"34 Asking whether the defendant committed this crime as a result of this particular motive excludes the possibility that he committed the crime out of some other motive. Hence, the probability that the defendant committed the crime because of that particular motive is necessarily smaller than the probability that the defendant committed that particular crime. But research on conjunction and extensional problems suggests that adding details that explain an event makes it seem more likely. And theorists have suggested that this effect might influence how judges and juries think. (Kelman et al, 1996).

But the judges in our study were not affected by this manipulation. The 32 judges in the "no motive" condition stated that, on average, the defendant had a 26.5% chance of having committed the crime. The 33 judges in the "motive" condition gave a slightly smaller average of 23.5%. In addition to running in the opposite direction from that predicted by the previous work on the conjunction fallacy, this difference was not significant statistically.35 In effect, when judges were properly focused on the relevant issue, they were not distracted by conjunction effects. This single study cannot rule out the possibility that conjunction effects can be used to persuade judges.
They might simply have imputed the motive (which was the obvious one) in the "no motive" condition, and under other circumstances, the effect might emerge. And conjunction effects have proven to be powerful in other, similar contexts. (Tversky and Kahneman, 1983). But the result supports the continuing theme of this paper that judgment can be influenced by an adjustment to the object of judgment. In the "Dina" problem, judges did not realize that a normative approach would have set them to looking for logical, deductive relationships among the potential stories. But they were quite adept at avoiding such distractions when focused directly on a question of whether a potential defendant committed a crime. It is as if judges are rearranging the target of their judgment in an appropriate way.

34 We also indicated to both groups that they should use a percentage to respond: "(Please state a percentage between 0% and 100%.)"

35 t(60)= .063, p = .53.

Some of our studies above show that misdirecting judges might be a profitable undertaking for lawyers. Adding a weak witness produced notable context effects, describing probability in a subjective format made forensic evidence more compelling, and adding conjunctive details distracted judges from the logical structure of a legal problem. But our studies also show that judges resist these efforts, and can rearrange the target of their judgments so as to avoid misdirection. Even though judges found forensic testimony more persuasive in the subjective format, it did not influence their verdicts, and judges were not moved by conjunctive details in assessing the likelihood that a potential defendant had committed a crime. Judges' ability to defend against potential misdirection suggests that judges manipulate the object of a legal judgment on their own in ways that can avoid bias.
Our final two sets of problems test this thesis with the potentially misleading influence of the hindsight bias (in this study) and of evidence that must be suppressed (in the final study).

IV. Hindsight Bias: Probable Cause Probability

The hindsight bias provides an excellent test for the thesis that judges can spontaneously alter the object of judgment so as to avoid erroneous judgments. Psychologists have found that people are prone to the “hindsight bias,” according to which the past comes to seem more predictable than it actually was.36 Once the outcome of an event is known, that outcome comes to feel inevitable or at least much more likely to have occurred than it would have seemed before it actually happened. This bias is thought to influence a wide range of judgments in legal settings (Rachlinski, 1998). And we have found that in at least two settings, it influences judges. (Guthrie et al., 2001; Rachlinski et al., 2009).

Legal scholars have argued that the hindsight bias likely influences assessments of probable cause. (Bibas, 20??). Judges assess probable cause both in foresight, when they decide whether to grant a search warrant, and in hindsight, when prosecutors attempt to introduce evidence obtained in a search conducted pursuant to an exception to the requirement of a warrant. In the latter instance the judge must assess whether the police had probable cause to conduct the search, even though the judge knows that the search produced incriminating evidence. Arguably, knowledge of the outcome taints that judgment. Judges might determine that probable cause existed in hindsight, even in circumstances in which they would not have granted a warrant in foresight.

In our initial investigation of this issue, we found no evidence that the hindsight bias influences judgments of probable cause. (Wistrich et al, 2005). In our study, we asked judges either whether to grant a warrant or whether to admit evidence obtained after a search. In both cases, we presented identical facts to support the finding of probable cause, except that in hindsight, we informed the judges that the search had produced incriminating evidence. To our surprise, the same percentage of judges admitted the testimony as granted the warrant. Hindsight did not seem to cloud their judgment.

How did judges manage to avoid the influence of the hindsight bias? We undertook further study of probable cause determinations to find out if they truly avoided the bias, or whether they simply made judgments that did not really turn on their assessments of probability.

36 See Baruch Fischhoff, Hindsight ≠ Foresight: The Effect of Outcome Knowledge on Judgment Under Uncertainty, 1 J. EXPERIMENTAL PSYCHOL.: HUM. PERCEPTION & PERFORMANCE 288, 288 (1975) (“Reporting an outcome's occurrence increases its perceived probability of occurrence . . . .”).

We recruited a total of 224 sitting judges attending five judicial education conferences to participate in our study: 80 federal district judges (from three different educational conferences), 43 federal magistrate judges, and 101 state trial judges from Florida. The federal judges were all attending educational conferences sponsored by the Federal Judicial Center. The Florida judges were attending an annual educational conference organized by the judges themselves. The educational conferences were all generic, including a wide variety of topics for the judges. In all cases, the data were collected at a “breakout” session during the conference. The judges who participated in this research were thus judges who attended our panel, entitled “the Psychology of Judging”, rather than some other educational session.37 As with our other studies, we did not ask the judges to identify themselves by name, but we did ask them to identify their gender,38 number of years of experience on the bench,39 and title (to ensure that they were judges).
We also asked whether they had served as a defense attorney or criminal prosecutor before becoming a judge.40 Finally, we inquired into their political affiliation. For the Federal District Judges, this inquiry consisted of asking them to identify the political party of the President who appointed them. For the magistrates and the state judges (both of whom are appointed largely through a committee recommendation process), we instead asked: “Which of the two major political parties in the United States most closely matches your own political beliefs?”41 None of these demographic variables influenced the judges’ decisions, and we therefore omit further discussion of them.

37 The judges in this sample all shared important characteristics. All of the judges participating in this study were appointed, rather than elected. The Florida judges and the federal magistrate judges are appointed for fixed terms by merit-based selection panels. They both can be (and often are) re-appointed. The Federal District Judges are appointed by the President with the Advice and Consent of the U.S. Senate, and have life tenure. All of the judges function essentially as trial-court judges who manage settlement, hear motions, and preside over trials, although some variations exist among the three groups. The Federal District Judges preside over all manner of civil and criminal suits filed in the federal courts in the United States. Federal cases are individually assigned to a single federal judge, meaning that the judges preside over all stages of a case, except appeal. The U.S. Magistrate Judges perform a role similar to that of the Federal District Judges, except that the Magistrates may conduct trials only in civil cases and misdemeanor criminal cases. By and large, the Magistrates focus more of their attention on pre-trial matters, such as discovery and settlement, than do the Federal District Judges. The workload of the state trial judges resembles that of the federal district judges, with one exception. The state judges rotate through various departments (civil, criminal, family, and probate), such that at any point in time, these judges will only hear matters arising within that department. Furthermore, some of the state judges specialize in a single department; notably, the sample includes several judges who hear only family court matters.

38 Overall, only 21.6% of the judges in this part of the study were women (48 out of 222, with 4 judges who did not answer the question). The percentage of women did not vary much by judge type: 20.3% among the federal district judges (16 out of 79); 23.3% (10 out of 43) among the Magistrate judges; and 21.4% (21 out of 98) among the state judges.

39 Overall, the judges had an average of 12.4 years of experience in their current positions as judges. As with gender, the level of experience did not vary much by judge type. The federal district judges served for an average of 11.4 years, the magistrates an average of 10.2 years, and the state judges for 14.4 years. This figure understates the experience of the federal judges, however, as many had experience in state court before serving as a federal judge. Among the 124 federal district and magistrate judges who responded to the question about their experience, 41 (33.1%) had previous experience as a judge. Adding this experience to their time on the federal bench shows the district judges and magistrates had an average of 15.1 and 12.1 years of total experience as judges, respectively. Factoring in this prior experience among the federal judges reveals that the three samples of judges had an average of 14.2 total years of experience as judges.

40 With regard to their experiences before becoming judges, 48.0% (106 out of 221) had been prosecutors (46.3%, or 37 out of 80, of the federal district judges; 46.5%, or 20 out of 43, of the federal magistrate judges; and 50.0%, or 49 out of 98, of the state judges), and 55.0% (121 out of 220) had had experience as criminal defense attorneys (51.9%, or 41 out of 79, of the federal district judges; 51.2%, or 22 out of 43, of the federal magistrate judges; and 59.2%, or 58 out of 98, of the state judges).

We gave each participating judge a questionnaire that included between four and seven scenarios, only one of which dealt with probable cause. Some of the data from the other items have been reported elsewhere (Wistrich et al., 2005). The probable cause scenario was always included as the second item in the questionnaire, and the demographic information on the judge as the last page.

Each judge received one of four versions of the questionnaire to create a 2x2 between-subjects design. Half of the judges received a questionnaire cast in a foresight perspective and half in a hindsight perspective. In foresight, the critical inquiry was whether the facts of the scenario constituted probable cause for a search and thereby justified granting a warrant. In hindsight, the materials indicated that a police officer had already conducted a search that revealed incriminating information, and the critical inquiry was whether the circumstances surrounding the search had constituted probable cause, thereby making the search legal. Furthermore, we varied the severity of the crime being investigated. The crime consisted of either an assault on or the murder of a police officer. The questionnaires were shuffled thoroughly before our presentation and were distributed randomly to the judges.

The basic story in all four variations was the same. The one-page fact pattern, labeled “Fourth Amendment Issue,” began with one of the following two paragraphs:

FORESIGHT: “You have been asked to issue a telephonic warrant authorizing Officer George McAllen to search the trunk of a parked car for evidence related to the battery of a police officer.
After you placed McAllen under oath, he relayed the following information to you:”

HINDSIGHT: “You have been asked to rule on a motion to suppress evidence obtained from a warrantless search of the trunk of a parked car made by Officer George McAllen for evidence related to the battery of a police officer. The parties’ briefs convey the following information:”

Both versions then presented the following facts (with variations by crime noted):

41 Across all of the judges, 48.6% (101 out of the 208 who answered the question) identified themselves as more affiliated with the Republican than the Democratic Party (or were appointed by a Republican President). This varied somewhat by judge type: 57.0% (45 out of 79) of the federal district judges were appointed by a Republican President; only 15.4% (6 out of 39) of the federal magistrate judges identified more with the Republican party; and 55.6% (50 out of 90) of the state judges identified with the Republican party.

“Officer McAllen was part of a task force investigating a drug distribution network operating in a poor urban area. McAllen was driving to meet a potential informant at 1:47 am on a Saturday morning when he received a call from his supervisor stating that another police officer had just been attacked while operating undercover one mile from McAllen’s location. [Battery: The officer had been struck hard in the head by a blunt instrument. Although the attack had left him groggy, he was expected to recover fully. Murder: The officer had been bludgeoned to death by a blunt instrument.]

“The perpetrator remains at large and the only available information about him is that he is likely a drug dealer who had identified the officer. The officer also wounded the perpetrator with a knife, which was found at the scene. [NOTE: in hindsight, we used past tense]

“Officer McAllen remained in place after receiving the call, waiting for his informant.
Fifteen minutes later, McAllen observed a late-model, black BMW park in front of a small nightclub. The driver got out of the car, opened the back door, pulled out a long, curved piece of metal from the seat, and placed it into the trunk of the car. The driver closed the trunk and then entered the club. McAllen noted that the driver had a bandaged hand.

“Officer McAllen walked over to the car. He observed that the front left tire was a small, temporary tire of the type used as a spare. This observation made him realize that the metal object was likely a crowbar. He looked into the car and observed a car jack on the floor and three envelopes on the back seat, two of which appeared to be stuffed with cash. McAllen also observed a stain, possibly from blood, on the steering wheel.”

The materials then closed with one of the two passages below:

FORESIGHT: “Based on these observations, Officer McAllen believes there is probable cause to search the trunk of this car and has asked you to issue a telephonic warrant authorizing the search.

“Would you issue the warrant?
_____ Yes, there is probable cause for the search; I would issue the warrant
_____ No, there is not probable cause for the search; I would not issue the warrant”

HINDSIGHT: “Based on these observations, Officer McAllen believed that there was probable cause to search the trunk of the car. He opened the trunk and found a bloodied crowbar and a large quantity of white powder that appeared to be cocaine. After phoning for backup, McAllen and his colleagues arrested the driver when he returned to his car. Subsequent investigative work confirmed that the driver’s fingerprints were on the crowbar. DNA tests also matched the blood on the crowbar with that of the officer who had been attacked. The BMW driver is now being prosecuted for battery and drug violations.
“The driver’s defense attorney has filed a motion to suppress the evidence obtained from the trunk on the ground that there was no probable cause to conduct the search.

“Would you allow the evidence to be admitted?
_____ Yes, there was probable cause for the search; I would admit the evidence
_____ No, there was not probable cause for the search; I would not allow the evidence to be admitted”

The final question thus provides the first dependent variable. It requests that the judges give their opinion as to whether the circumstances supported a finding of probable cause. The call of the question lists both the determination of probable cause and the consequence.

The materials presented above depict the versions presented to the state judges. The federal version differed slightly in that the law enforcement agents were identified as FBI agents, rather than police officers. This was necessary to ensure that the crime was of a type that would come before a federal judge.

In effect, each of the versions gives an ambiguous set of information about an individual who may have been a drug dealer who attacked a police officer (or FBI agent). The story was intended to suggest either that the suspect was the perpetrator or that he was just someone out at night who had recently changed his tire. In foresight, the materials simply ask whether granting a warrant is appropriate. The hindsight version goes further, indicating that the police officer conducted the search without a warrant. It describes the results of the search and indicates that the suspect was indeed the perpetrator, presenting evidence that leaves little doubt about it. None of the evidence will be admissible, however, unless the police officer had probable cause to conduct the search.

Because the search involved an automobile, the officer can conduct the search without a warrant.
Law enforcement officials can engage in warrantless searches if exigent circumstances would justify acting quickly, without the time necessary to obtain a warrant. Case law firmly establishes that the search of an automobile constitutes exigent circumstances, owing to the concern that automobiles are apt to be moved while the officer obtains the warrant. Carroll v. United States, 267 U.S. 132 (1925); United States v. Ross, 456 U.S. 798 (1982). Officers sometimes seek a warrant, even when exigencies would excuse its absence, as a means of ensuring the admissibility of any evidence they might obtain. Thus, both our foresight and hindsight conditions portray realistic situations. Exigency does not, however, excuse the necessity that probable cause be present. Absent probable cause, a judge should not issue a warrant in foresight and should suppress evidence obtained in hindsight.

At the top of the page that followed the probable cause hypothetical, the judges confronted a second question concerning the scenario. They had been given no notice of this question, although had they turned ahead before answering, they might have seen it. This second question was designed to elicit a probability estimate as to the success of the search. In the foresight conditions, we asked: “In the problem on the previous page, what is the likelihood that the search, if conducted, would uncover evidence that would incriminate the driver in the attack on the police officer?” In the hindsight conditions, we asked: “In the problem on the previous page, if Officer McAllen had requested a telephonic warrant before conducting the search, what would you have said was the likelihood that the search would have uncovered evidence that would incriminate the driver in the attack on the undercover police officer?” Below the question was a blank line ending in a percent symbol.

As to the determination of probable cause, the judges overall displayed no significant effects, as the table below shows.
Across both crimes, 57.1% (68 out of 119) of the judges in foresight found the circumstances supported a finding of probable cause, as compared to 54.3% (57 out of 105) of the judges in hindsight. Among the judges reviewing the battery case, 55.7% (34 out of 61) in foresight found probable cause, as compared to 44.0% (22 out of 50) of the judges in hindsight. Among the judges reviewing the murder case, 58.6% (34 out of 58) in foresight found probable cause, as compared to 63.6% (35 out of 55) in hindsight. Logistic regression of the choice on time (foresight versus hindsight) and crime revealed no significant main effects or interactions.42

Table: Percent of judges finding probable cause, by condition.

             Battery          Murder           Total
Foresight    55.7 (34/61)     58.6 (34/58)     57.1 (68/119)
Hindsight    44.0 (22/50)     63.6 (35/55)     54.3 (57/105)
Total        50.5 (56/111)    61.1 (69/113)    55.8 (125/224)

42 All p’s > .20.

These results replicate and extend our earlier finding that the hindsight bias does not appear to influence judgments of probable cause. This scenario replicates, in a different fact pattern, the result that judges make the same judgments in foresight as in hindsight with respect to probable cause. And we extend the result by showing that the severity of the crime does not influence judges either. We had thought that even if judges could overlook their knowledge that the defendant was actually guilty for minor offenses, it would be more difficult to do so for more serious crimes. The more serious the crime, the more costly it is to suppress the evidence. In the literature on the hindsight bias, in fact, negative outcomes produce a more significant bias than positive ones, and we reasoned that an even more negative outcome might facilitate the bias. But judges resisted this manipulation, at least in this hypothetical setting.

Our second dependent measure affords us some ability to assess how judges avoid the influence of the hindsight bias.
In fact, our initial question is somewhat unlike that which hindsight bias studies typically pose. Typically, researchers ask for assessments of probability, whereas we had asked for a judgment with direct legal effect. Our follow-up question, however, asked judges to make probability estimates. The results also showed no overall hindsight bias effects, at least at a superficial level, as the table below indicates. ANOVA of these results showed no significant main effects or interaction.43

Table: Mean rating of probability, by condition.

             Battery    Murder    Total
Foresight    53.4       56.9      55.2
Hindsight    56.0       61.1      58.6
Total        54.8       59.0      56.5

But these results are apt to be influenced somewhat by the judges’ rulings on either granting a warrant or admitting the evidence. We asked the judges these questions before asking for a probability assessment. When we re-ran the analysis, controlling for the effect of their ruling, we found that time had a marginally significant effect on the probability estimate.44 The table below, which collapses across the type of crime but controls for the judges’ rulings, shows a marked hindsight bias when judges determined that there was no probable cause. It also shows that the judgment of probable cause did correlate with their probability judgments.45 Hence, the judges were subject to a hindsight bias, but their rulings masked the effect.

Table: Mean rating of probability, by condition and ruling on probable cause.

Condition    Ruling: No    Ruling: Yes    Total
Foresight    40.5          65.0           55.2
Hindsight    52.0          64.6           58.6
Total        46.3          64.8           56.5

43 All p’s > .20.
44 ANOVA of the probability estimate on time, crime, ruling, and all interactions. Main effect of time was marginally significant, F(1, 183) = 3.04, p = .08.
45 This was significant as well, F(1, 193) = 27.8, p < .001.

Thus, the hindsight bias has a complex relationship to the assessment of probable cause.
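The masking effect can be seen by comparing the hindsight and foresight means within each ruling rather than overall. The short Python sketch below does only that arithmetic on the cell means reported in the table above. The dictionary layout and variable names are illustrative assumptions, and the sketch does not reproduce the ANOVA.

```python
# Mean probability estimates (percent) from the table above,
# keyed by the judge's ruling on probable cause.
# Layout and names are illustrative; this is not the authors' analysis code.
means = {
    "No":    {"foresight": 40.5, "hindsight": 52.0},
    "Yes":   {"foresight": 65.0, "hindsight": 64.6},
    "Total": {"foresight": 55.2, "hindsight": 58.6},
}

# Hindsight-minus-foresight gap within each ruling, in percentage points.
gaps = {
    ruling: round(cell["hindsight"] - cell["foresight"], 1)
    for ruling, cell in means.items()
}

for ruling, gap in gaps.items():
    print(f"{ruling:>5}: {gap:+.1f} points")
```

On these figures the gap is concentrated among judges who found no probable cause, while the overall difference is much smaller.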
Curiously, judges do not seem to base their rulings of probable cause on an assessment of the likelihood that the search would turn up (or would have turned up) incriminating evidence. Rather, we suspect that the judges are assessing police conduct in a more general manner. Probable cause has generated a mountain of case law, much of which is familiar to the judges we studied and most of which represents a judicial effort to manage police behavior, not guide criminal investigations. Judges largely ignored the probability that the search would turn up incriminating evidence and ignored even the severity of the crime. The object of judgment is the judicial evaluation of what constitutes reasonable investigation tactics, not the probability of a successful search.

Probability judgments, and the accompanying hindsight bias, lurked in the background. Controlling for the judges’ ruling, the judges gave higher probability estimates of the likelihood of a successful search when evaluating in hindsight, at least among judges who felt that there was no probable cause. Probability did, however, bear a relationship to the judges’ rulings. Judges who determined that probable cause existed gave markedly higher estimates for the likelihood of a successful search. But it seems likely that the judgment of probable cause drove the probability estimate, rather than the other way around.

This pattern of results suggests, overall, that judges are not making probability judgments when assessing probable cause. They focus their attention on different factors and thereby avoid the hindsight bias that the situation would otherwise seem likely to produce.

V. Suppressed Confessions: Policing the Police

In our final study reported here, we test another situation in which judges are under great cognitive pressure to rely on an untoward influence: that of suppressed evidence.
Bench trials can sometimes place judges in the awkward position of ruling on what evidence is to be considered, and then limiting their assessments to only that evidence. Juries have a luxury in that regard, in that they are commonly (although certainly not always) shielded from even knowing what evidence was excluded. Psychology includes a large literature on the human ability to ignore known information, and that literature does not bode well for a judge who must ignore highly relevant testimony. The human brain does not seem well suited to ignoring relevant information any more than it is well suited to ignoring known outcomes (which produces the hindsight bias).

In a previous study, we demonstrated that judges have difficulty ignoring evidence that they deem inadmissible (Wistrich et al., 2005). In numerous contexts, we showed that trial judges rely on evidence even after they rule it to be inadmissible. But in two instances, judges seemed to resist the influence of inadmissible evidence. The first was that they resisted the influence of the hindsight bias on their probable cause assessments, as discussed above. The second was that they seemed able to disregard a reliable confession that was illegally obtained. In this scenario, we asked judges to determine whether they would find a criminal defendant guilty or innocent in a bench trial. We provided a set of somewhat weak evidence as to the guilt of a defendant charged with robbing a convenience store. Half of the judges also learned that the defendant had provided a reliable confession to the crime, although the confession was obtained two hours after the defendant had asked for a lawyer, a request that the police ignored. When we asked judges to rule on the admissibility of this confession, almost all of the judges deemed it inadmissible. And they ignored it as well. Judges who learned of the confession and suppressed it were as likely to convict the defendant as judges who had never heard the confession.
As with the hindsight bias, we wondered how judges managed this cognitively challenging task. And as with the hindsight bias, we suspected that judges were not truly immune from the effects of the inadmissible information. Anecdotal comments from the judges in the study provided a few clues to what they were doing. Judges noted that the crime was “only a robbery”; no one was hurt, and not much cash was stolen. In such a case, the cost of suppressing the evidence, in terms of a lost conviction, is lower than it would be if the crime were more serious. Also, judges seemed annoyed at the blatant disregard of constitutional rights practiced by the police in this scenario. We suspected that if the police conduct were more extreme, judges might reveal their displeasure with even lower conviction rates than had they not learned of the confession, and of the police misconduct that produced it.

In effect, we suspect that the judges are not truly ignoring the confession. Rather, they are balancing different factors: the cost of suppression and the degree of police misconduct. To test this, we altered our scenarios somewhat to vary the severity of the crime (just as with the hindsight bias) and the police misconduct. We thereby created a 2x3 design in which we crossed the severity of the crime (armed robbery versus murder) and the degree of police misconduct (none, ignoring a request for a lawyer, and a severe interrogation). We predicted that both the severity of the crime and the degree of misconduct would influence conviction rates.

For the crime scenario, we used a fact pattern modeled loosely on our earlier study. The problem, labeled “Evaluation of a Robbery Trial”, included the following facts (with the added facts making the robbery into a related murder noted in brackets):

“Mr. Simson is on trial for bank robbery under 18 U.S.C. § 2113 [MURDER: and a related murder]. Mr. Simson has waived his right to a jury trial.
You are thus presiding in a bench trial. The following summarizes the evidence presented at trial:

“In the early morning, an armed assailant wearing jeans, a white t-shirt, a ski mask, and black gloves entered a small branch of First Federal Bank (a federally insured bank), ordered everyone onto the ground, and demanded that a teller put money in a plastic shopping bag. The teller complied, quickly emptying $520 that she had just received from a customer into the bag. As she reached for more money, the perpetrator ran off. The robbery was captured on a surveillance camera videotape. [MURDER: On his way out of the bank, the perpetrator stumbled into a young woman pushing a stroller with her infant son. Startled, the perpetrator fired at the woman twice, killing her instantly. The child was unharmed.]

“When police arrived, the teller reported that once outside the bank, the perpetrator pulled off his ski mask, discarding both it and the gun as he climbed quickly into a white Ford Taurus and sped off. Police retrieved the gun and mask; neither had usable fingerprints. The gun was unloaded. The gun had been reported stolen several years earlier by its original owner, who is now deceased.

“Several police officers then began a search of the neighborhood for a white Ford Taurus. Two hours after the crime, they found one, parked 10 blocks from the crime scene. The Department of Motor Vehicle records identified the owner as the defendant. The police knocked on the door to his apartment. The defendant matched the height, weight and race of the perpetrator in the surveillance videotape, although he was wearing different clothing. The police then insisted that the defendant accompany them to the station-house to answer questions, which he did.

“Upon arrival, the police led him to a room, locked the door, read him his Miranda rights, and began interrogating him. The defendant reported that he had been home alone all morning.
The police allowed the teller to listen in from the next room. The teller reportedly said “that sounds like the guy.” The police then placed the defendant under arrest. They obtained a search warrant and searched his apartment. They found shopping bags similar to the one used by the perpetrator of the crime, a pair of black gloves, and clothing matching that of the perpetrator (white t-shirt and jeans) in the washing machine. The defendant also had nearly six hundred dollars in cash in his apartment. The police did not find firearms or ammunition of any kind.”

In the two versions in which there was no confession, the materials ended by noting: “The police continued questioning the defendant, but he requested a lawyer and the interrogation ended.” Then the materials ended by asking: “Based solely on the evidence admitted at trial, would you convict the defendant?”

In the two versions involving mild police misconduct, the story continued instead as follows:

“The police continued questioning the defendant. Even though the defendant clearly requested a lawyer, twice, the police refused to call one and continued the interrogation. Two hours later, the defendant confessed, and agreed to write out a description of the crime. His written description matched the events perfectly, including the fact that he discarded the ski mask and gun outside the store (which the police had not told him). The entire interrogation was recorded with both video and audio.”

In the two versions involving severe police misconduct, the story continued instead to state that:

“The police continued to interrogate the defendant. During the entire interrogation, they had denied the defendant access to the restroom, and he ultimately soiled his clothing. One officer had threatened the defendant and “pushed him around.” After nine hours of this treatment, the defendant confessed and agreed to write out a description of the crime.
His written description matched the events perfectly, including the fact that he discarded the ski mask and gun outside the store (which the police had not told him). The entire interrogation was recorded with both video and audio.”

Finally, in the four versions in which the defendant confessed, the materials went on to indicate that: “The defendant’s attorney has moved to suppress the confession, arguing that the interrogation violated the defendant’s rights under Miranda by continuing after the defendant had requested an attorney. Would you grant the motion and suppress the evidence?” After obtaining the ruling, the materials then asked, “Based solely on the evidence admitted at trial, would you convict the defendant?”

The design is described in the table below:

Table: Experimental Design

Crime      Police Misconduct
Robbery    None (no confession)
Robbery    Ignored requests for lawyer
Robbery    9-hour severe interrogation
Murder     None (no confession)
Murder     Ignored requests for lawyer
Murder     9-hour severe interrogation

We presented this scenario at multiple judicial education conferences in order to obtain enough data to fill out all six conditions. Much of the data came from judges who are described above. Notably, the 81 Federal District Judges, 44 U.S. Magistrates, and 101 Florida State judges who participated in our extended study of the hindsight bias also responded to this scenario. Additionally, 88 judges in attendance at a conference of California State trial judges in May of 2006 (in Palm Springs) assessed this scenario, for a total of 314 judges.
In analyzing the results, we excluded the 7 judges who declined to suppress the evidence.46 With these judges excluded, the table below reports the conviction rates by condition.47

Table: % Conviction by Condition

           No Confession    2-hour           9-hour           Total
Robbery    29.5 (13/44)     43.1 (22/51)     28.3 (15/53)     33.8 (50/148)
Murder     24.0 (12/50)     36.0 (18/50)     44.0 (22/50)     34.7 (52/150)
Total      26.6 (25/94)     39.6 (40/101)    35.9 (37/103)    34.2 (102/298)

As the table shows, the severity of the crime did not affect the conviction rate when the defendant did not offer a confession.48 Unlike in our previous study, the confession seemed to affect judgment, although as predicted, this effect depended both on the crime and on the police misconduct. When the police engaged in a minor violation of the defendant’s rights, the judges were influenced by the confession; even though they suppressed the confession, they were more likely to convict the defendant.49 This conviction rate did not vary with the type of crime.50 When the police misconduct was severe, however, the conviction rate dropped slightly for the robbery (albeit not significantly),51 but did not drop for the murder.52

46 These consisted of 1 District Judge in the murder/9-hour condition who convicted; 1 District Judge in the robbery/9-hour condition who acquitted; 1 Magistrate in the robbery/2-hour condition who convicted; 1 Florida judge in the murder/2-hour condition who acquitted; 1 Florida judge in the murder/2-hour condition who convicted; 1 Florida judge in the robbery/2-hour condition who convicted; 1 Palm Springs judge in the robbery/2-hour condition.
47 Note: 9 judges gave no verdict: 6 Florida judges (1 robbery/no confession; 1 robbery/2-hour; 1 robbery/9-hour; 1 murder/2-hour; 2 murder/9-hour) and 3 Palm Springs judges (1 robbery/2-hour; 1 murder/2-hour; 1 murder/9-hour).
48 Fisher’s exact test, p = .642.
49 Collapsing across crime, the confession in the minor-violation condition produced marginally significantly more convictions. Fisher’s exact test, p = .07.

Analyzing the results using logistic regression revealed that the crime overall had no significant effect;53 that the presence of a confession overall had a marginally significant effect;54 that the harsh interrogation produced a slightly lower conviction rate than the mild police misconduct;55 that the crime by confession (overall) interaction was not significant;56 but, notably, that the crime by harshness of the interrogation interaction was significant.57

These results show that judges are not, in fact, able to ignore the confession. Rather, they are making a more complex judgment. Just as in the case of the hindsight bias, they are assessing the degree of police misconduct. The judges reacted to the greater degree of police misconduct by reducing their willingness to convict the defendant when the crime was less serious. When the crime was a horrific homicide, however, judges were unwilling to punish the police misconduct with lower conviction rates. Note that throughout, the judges followed the law exceptionally well: the confessions were all clearly illegally obtained and almost all of the judges suppressed them. But the judges also took two extra-legal steps: they convicted the defendant more often when they had learned of (and suppressed) the confession, and they punished severe police misconduct not only by suppressing the confession (which the law obliges them to do), but also by reducing their willingness to convict the defendant.

The object of judgment in the case of these materials was thus a combination of the underlying facts, the additional confession, the severity of the crime and the degree of police misconduct. Had the judges been truly ignoring the confession, only the facts would have affected their judgment.
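The pairwise comparisons reported in the footnotes rely on Fisher's exact test. As a hedged illustration, the Python sketch below computes a two-sided Fisher's exact p-value from the cell counts in the conviction table, using only the standard library. The function name and the two-sided rule (summing every table no more probable than the observed one) are assumptions of this sketch. The text does not state which software or variant was used, so the results need not match the reported values exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper written for this illustration: rows are groups,
    columns are (convicted, acquitted).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x convictions falling in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables that are no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    total = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


# No-confession conditions: robbery 13/44 vs. murder 12/50 convicted.
p_no_confession = fisher_exact_two_sided(13, 31, 12, 38)

# Collapsed across crime: no confession 25/94 vs. 2-hour condition 40/101.
p_minor_vs_none = fisher_exact_two_sided(25, 69, 40, 61)

print(f"robbery vs. murder, no confession: p = {p_no_confession:.3f}")
print(f"2-hour vs. no confession:          p = {p_minor_vs_none:.3f}")
```

The counts are taken directly from the table, with acquittals computed as the cell total minus convictions.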
50 Fisher’s exact test, p = .54.
51 Fisher’s exact test, p = .15.
52 Fisher’s exact test, p = .54.
53 z = .61, p = .54.
54 z = 1.68, p = .09.
55 z = 1.98, p = .05.
56 z = .20, p = .85.
57 z = 2.03, p = .04.

VI. Conclusion

The results of these studies paint an intricate portrait of how judges might be navigating the risk of cognitive errors in the courtroom. Lawyers have opportunities to exploit judicial vulnerability to untoward influence by manipulating the setting to alter the object of judgment from a normatively appropriate one (such as credibility of witnesses in an absolute sense) to a misleading one (relative credibility of witnesses). Judges are not always so easily fooled by tricky litigation strategies, however, as they can refocus their attention when the ultimate judgment must be made. In our studies of probabilistic testimony and the misleading effects of conjunctive events, judges seemed to be misled by the illusion in some respects, but ultimately made good judgments on the most critical question (likelihood of guilt). Our study of the hindsight bias also shows that judges seem to divert their attention away from potentially misleading judgments (such as a probability assessment) onto a more stable set of judgments (such as what constitutes appropriate police conduct). But even this kind of refocusing can lead judges astray, as we show in our last study, in which judicial attention to police misconduct leads to a somewhat lawless response.

Overall, we believe these kinds of studies begin to paint a more thorough portrait of both judicial vulnerability and resilience to errors in judgment. Judicial errors are not random mistakes, but are products of the limits of human cognition, efforts by lawyers to exploit those limits, and a judicial effort to cope with those limits.