Teaching an Audience of One: The Judicial Reception of Statistical Evidence
Mary W. Gray
American University, Washington DC
mgray@american.edu

For the rational study of the law the black letter man may be the man of the present, but the man of the future is the man [woman] of statistics and the master of economics.
Oliver Wendell Holmes, The Path of the Law (1897)

Landmark cases
- Yick Wo v. Hopkins, 118 U.S. 356 (1886)
- Baker v. Carr, 369 U.S. 186 (1962)
- Griggs v. Duke Power Company, 401 U.S. 424 (1971)
- Castaneda v. Partida, 430 U.S. 482 (1977)
- Hazelwood School District v. United States, 433 U.S. 299 (1977)
- O.J. Simpson trial (1995)

The role of the statistician
- To present the evidence clearly and ethically
- To prepare the litigator

Cautions for the statistician
- Legal proceedings are adversarial
- Expert testimony cannot reach legal conclusions
- Early involvement is essential
- Responsibility
- Accountability
- Guard your reputation
- Avoid advocacy
- Resist unrealistic expectations

The tasks of the statistician
- Be certain that you know what questions must be answered
- Get the data
- Clean the data
- Grapple with the data
- Consider the strategy of the opposition

Communication
- Statisticians should attempt to promote and preserve the confidence of the public without exaggerating the accuracy or explanatory power of their data.
- Statisticians should provide adequate information to permit their methods, procedures, techniques, and findings to be assessed.
- Statisticians should not promise more than they can deliver.
- Statisticians should address rather than minimize uncertainty.

Recognition of ethical concerns
Recent headlines:
- “Vioxx Kept Trial Going in Spite of Concern”
- “Heart Deaths Concealed?”
- “US Scientists Say They are Told to Alter Findings”
- “FDA Employee Seeks Help from Whistle-Blowers Group”
- “CDC Study Overstated Obesity as a Cause of Death”
- “EPA Inspector Finds Mercury Proposal Biased”
- “Abuses Endangered Veterans in Cancer Drug Experiments”
- “Alarm over Single AIDS Case Is Challenged by Questioners”

Missteps

People v. Collins (1968)

Characteristic                    Probability
Partly yellow automobile          1/10
Man with mustache                 1/4
Woman with ponytail               1/10
Blond woman                       1/3
Black man with beard              1/10
Interracial couple in a car       1/1000

Probability: (1/10) × (1/4) × (1/10) × (1/3) × (1/10) × (1/1000) = 1/12,000,000

People v. Collins
Suppose each of N couples matches the description with probability 1/N, so that the number of matching couples is approximately Poisson with mean 1. Then:
p(more than one given at least one) = p(more than one)/p(at least one)
p(more than one) = 1 – p(0) – p(1) = 1 – 1/e – 1/e = .26
p(at least one) = 1 – p(0) = 1 – 1/e = .63
p(more than one given at least one) = .264/.632 ≈ .42
Hardly “beyond a reasonable doubt”!

Maryland v. Wilson (2002)
Are SIDS deaths in the same family independent?
During rebuttal closing argument, the State's Attorney referred to the statistics that the experts relied on in forming their opinion that Garrett's death was criminal homicide, and argued the probability of petitioner's innocence. The State's Attorney did not merely argue that there was a low probability that two SIDS deaths would occur in one family; he argued that there was a low probability that petitioner was innocent. He told the jury, "if you multiply his numbers, instead of 1 in 4 million, you get 1 in 10 million that the man sitting here is innocent. That was what a doctor, their expert, told you." Defense counsel's motion for a mistrial was denied and, instead, the court gave a curative instruction.
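
The People v. Collins arithmetic can be checked with a short script. This is only a sketch: the product rule is valid only if the six characteristics are independent (which is doubtful, for example, for a beard and a mustache), and the Poisson model with mean 1 for the number of matching couples is an assumption used to illustrate the conditional probability, not something established by the evidence. The function and variable names are illustrative.

```python
from fractions import Fraction
from math import exp

# Probabilities the prosecution assigned to each characteristic.
characteristics = {
    "partly yellow automobile": Fraction(1, 10),
    "man with mustache": Fraction(1, 4),
    "woman with ponytail": Fraction(1, 10),
    "blond woman": Fraction(1, 3),
    "Black man with beard": Fraction(1, 10),
    "interracial couple in a car": Fraction(1, 1000),
}


def product_rule(probabilities):
    """Multiply probabilities; meaningful only under independence."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result


def prob_more_than_one_given_at_least_one(mean=1.0):
    """P(X > 1 | X >= 1) for X ~ Poisson(mean) (assumed model)."""
    p0 = exp(-mean)          # P(X = 0)
    p1 = mean * exp(-mean)   # P(X = 1)
    return (1 - p0 - p1) / (1 - p0)


joint = product_rule(characteristics.values())
conditional = prob_more_than_one_given_at_least_one()

print(joint)                  # 1/12000000
print(round(conditional, 2))  # 0.42
```

Even granting the prosecution's figures, the chance that a second matching couple exists, given that one does, is roughly four in ten.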

Results of Wilson and other SIDS cases
- Wilson’s conviction overturned by Maryland’s highest court
- Misuse of statistics in British cases led to review of 250 convictions of murder in possible SIDS (“cot death”) cases

Gonzales v. Carhart and Gonzales v. Planned Parenthood
- Appeal from 8th and 9th Circuit cases involving “partial birth abortions” in which testimony regarding the Chasen study was cited (used also in a Nebraska case)
- Stephen T. Chasen (2004), Dilation and evacuation at ≥ 20 weeks: Comparison of operative techniques, American Journal of Obstetrics and Gynecology, 190, 1180.
- Null hypothesis: two different procedures led to the same rate of subsequent premature births
- p = 0.30
- Testimony of government’s expert Dr. Clark: 30% is just “stretching it a little bit” from 5% and “There is a 30 percent chance this occurred by chance and a 70 percent chance that it in fact is a true, meaningful, increased risk.”
- Note: p = 0.30 means that, if the null hypothesis were true, a difference at least this large would arise about 30% of the time; it is not the probability that the result occurred by chance.

But how can it be more?
- Salary differences increase with time
- But how can that be if raises are always straight percentages?
- A woman is hired for $40,000
- A man is hired for $50,000
- The woman gets $10,000 less than the man
- Each gets a 10% raise
- Now the difference is $11,000, more than $10,000!

U.S. Federal Evidence Rule 702:
If scientific, technical or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training or education, may testify thereto in the form of an opinion or otherwise, if (1) the testimony is based upon sufficient facts or data, (2) the testimony is the product of reliable principles and methods, and (3) the witness has applied the principles and methods reliably to the facts of the case.

U.S. Federal evidentiary standards:
- “Commercial Marketplace Test”: early 1900s
- Frye v. United States (1923): generally accepted, e.g., peer-reviewed
- Daubert v. Merrell Dow Pharmaceuticals (1993): judge must evaluate the methodology according to the following:
  - testing and validation
  - peer review
  - existence and maintenance of standards controlling the use of the technique
  - rate of error
  - “general acceptance”

Effectiveness of the Daubert rule
- General Electric Co. v. Joiner (1997) and Kumho Tire Co. v. Carmichael (1999): further clarification of the role of the judge and of who is an expert

How have statistics been used in legal settings?
- discrimination (race, sex, age)
- pipeline regulation
- police profiling
- assaults on prisoners
- SIDS
- human rights violations
- service interruption
- lotteries
- drug trials
- evidence-based medicine
- environment
- clinical trials
- glass fragments
- anti-trust
- epidemiology
- driving offenses
- redistricting
- DNA
- death penalty
- sales figures
- intellectual property
- sentencing
- recidivism
- bullet composition
- product liability
- earprints

What is discrimination?
- Disparate treatment: similarly situated individuals are treated differently on the basis of race, sex, etc.
- Disparate impact: a facially neutral criterion or process has a disparate impact on members of one sex, race, etc.

How “disparate” must an impact be?
- Inexorable zero
- Difference in percentages
- Four-fifths (4/5) rule
- Selection ratio
- Odds ratio
- Statistical significance
- 2 or 3 standard deviations
- No “bright line”

Case                    % minorities in pool    % minorities on jury panels    Probability
Swain v. Alabama        26%                     16%                            1 × 10⁻⁸
Avery v. Georgia        5%                      0%                             4.6 × 10⁻²
Castaneda v. Partida    79%                     39%                            1 × 10⁻¹⁴⁰

Cassell v. Texas

                                              Expected number    Observed number
Juries with no African Americans              9.14               4
Juries with one African American              7.87               17
Juries with more than one African American    3.99               0

χ² = Σ (expected – observed)²/expected = 17.47, p < 0.001

Title IX of the Education Amendments of 1972 requires in collegiate athletics that
- Opportunities be provided to men and women in numbers substantially proportionate to their respective enrollments, or
- History and continuing policies of program expansion be demonstrated, or
- Interests and abilities of the underrepresented sex be effectively accommodated

What does “substantially proportionate” mean?

Cohen v. Brown University
- Percentage women among students: 51%
- Percentage women among athletes: 39%
- Difference? 51% – 39% = 12%
- Ratio? 39%/51% = .76
- Pass rates? 12%/20% = .60
- Statistically significant? p < .001

Methodology
- Descriptive statistics
- t-tests
- Non-parametric tests
- Matched pairs
- Lorenz curve
- Meta-analysis
- Regression
- Power
- Sensitivity
- Mantel-Haenszel
- Change point analysis
- Urn models
- Gini index of inequality
- Capture-recapture
- Multiple systems estimation
- Bayesian methods
- Sampling

Acceptance of techniques

Probability
- Note the jury selection, discrimination cases
- But there are still gaps in understanding the interpretation of “p” and the meaning of “reject the null hypothesis”

Regression
- Widely used in discrimination, anti-trust, etc.
- Are assumptions met?

Bayesian techniques
- First proposed around 1970
  M.O. Finkelstein and W.B. Fairley (1970), A Bayesian approach to identification evidence, Harvard Law Review, 83, 489.
- Still not generally accepted
  D.J. Balding (1998), Court condemns Bayes, Royal Statistical Society, 25, 1-2.
- But statisticians keep trying
  A.P. Dawid, Julia Mortera, and Paola Vicard (2005), Building blocks for DNA identification from Bayesian networks, 6th International Conference on Forensic Statistics.
  Roderick J. Little (2006), Calibrated Bayes: A Bayes/Frequentist Roadmap, The American Statistician, 60, 213-223.

Sampling
- Courts have always had problems with sampling
- Is it a sample or is it the population?
- The Census
- Can estimates based on sampling be used in drug cases where quantity determines the sentence?
  United States v. Shonubi, 103 F.3d 1085 (2d Cir. 1997)
  Alan J. Izenman (2003), Sentencing illicit drug traffickers: How do the courts handle random sampling issues? International Statistical Review, 71, 535-556.
- Can damages be based on sampling?
  Copyright violations: Robert L. Basmann and Daniel J. Slottje (2003), Copyright damages and statistics, International Statistical Review, 71, 557-564.
  Damages: L. Walker and J. Monahan (1998), Sampling damages, Iowa Law Review, 545-568.

Where next?

Challenging orthodoxy
- Fingerprints
  David H. Kaye (2003), Questioning a courtroom proof of the uniqueness of fingerprints, International Statistical Review, 71, 521-533.
- Bullet composition
  Committee on Scientific Assessment of Bullet Lead Elemental Composition Comparison, National Research Council (2004), Forensic Analysis: Weighing Bullet Lead Evidence. Washington DC: National Academy of Science.
- DNA
  L.A. Foreman, C. Champod, I.W. Evett, J.A. Lambert and S. Pope (2003), Interpreting DNA Evidence: A Review, International Statistical Review, 71, 473-495.
- Lie detector evidence

New techniques
- Suzanne Bell and Jennifer Wiseman (2005), Data fusion, data mining and pattern recognition applied to fiber analysis, 6th International Conference on Forensic Statistics.
- Kathy Barnes (2005), A Bayesian model to control for selection bias, with an application to racial profiling, 6th International Conference on Forensic Statistics.
- Rose M. Ray and Jeffrey S. Goldman (2005), Demonstration of minority disadvantage when minority populations are small, 6th International Conference on Forensic Statistics.
- James M. Curran (2003), The statistical interpretation of forensic glass evidence, International Statistical Review, 71, 497-520.
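
As a worked check of the Cassell v. Texas figures presented earlier in this talk, the chi-square statistic can be recomputed from the expected and observed jury counts. This is a minimal sketch: the counts are those on the slide, while the choice of 2 degrees of freedom (three categories, no parameters estimated from the data) is an assumption, and the function name is illustrative. For 2 degrees of freedom the upper-tail probability of the chi-square distribution is exactly exp(−x/2), so no statistics library is needed.

```python
from math import exp

# (category, expected number of juries, observed number of juries)
cassell_counts = [
    ("no African Americans", 9.14, 4),
    ("one African American", 7.87, 17),
    ("more than one African American", 3.99, 0),
]


def chi_square_statistic(rows):
    """Pearson chi-square: sum of (expected - observed)^2 / expected."""
    return sum((expected - observed) ** 2 / expected
               for _, expected, observed in rows)


statistic = chi_square_statistic(cassell_counts)

# Assumed degrees of freedom: number of categories minus one.
degrees_of_freedom = len(cassell_counts) - 1

# Upper-tail probability; exp(-x/2) is exact only for 2 degrees of freedom.
p_value = exp(-statistic / 2)

print(round(statistic, 2))  # 17.47
print(p_value < 0.001)      # True
```

Most of the statistic comes from the middle category: 17 juries with exactly one African American were observed where fewer than 8 were expected.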

Death penalty
- McCleskey v. Kemp, 481 U.S. 279 (1987)
- Callins v. Collins, 510 U.S. 1141 (1994)

Justice Blackmun:
From this day forward, I no longer shall tinker with the machinery of death. … Rather than continue to coddle the Court's delusion that the desired level of fairness has been achieved and the need for regulation eviscerated, I feel morally and intellectually obligated simply to concede that the death penalty experiment has failed. It is virtually self-evident to me now that no combination of procedural rules or substantive regulations ever can save the death penalty from its inherent constitutional deficiencies. The basic question -- does the system accurately and consistently determine which defendants "deserve" to die? -- cannot be answered in the affirmative. … The problem is that the inevitability of factual, legal, and moral error gives us a system that we know must wrongly kill some defendants, a system that fails to deliver the fair, consistent, and reliable sentences of death required by the Constitution.

- J.S. Liebman et al. (2000), A Broken System: Error Rates in Capital Cases. New York: Columbia School of Law.
- Michael O. Finkelstein and Bruce Levin (2005), The Machinery of Death, Chance, 18, 34-37.

New areas
- Brian Werner (March/April 2005), Distribution, abundance and reproductive biology of captive Panthera tigris populations living within the United States of America, Feline Conservation Federation Magazine, 49, no. 2.
- Efstathia Bura, Joseph L. Gastwirth, and Reza Modarres (2005), Statistical Methods for Assessing the Fairness of the Allocation of Shares in Initial Public Offerings, Law, Probability and Risk, 4, 143-158.

Human rights
- Estate of Marcos Human Rights Litigation, 910 F. Supp. 1460 (D. Haw. 1995) (aff'd in Hilao v. Estate of Marcos, 103 F.3d 767 (9th Cir. 1996)).
- Patrick Ball and Jana Asher (2002), Statistics and Slobodan: Using data analysis and statistics in the War Crimes Trial of former president Milosevic, Chance, 15, 17-24.
- Robin Mejia (2006), Grim Statistics, Science, 313, 288-290.

References
DeGroot, M., Fienberg, S. and Kadane, J.B. (1986). Statistics and the Law. New York: Wiley.
Faigman, D.L., Fienberg, S.E. and Stern, A.C. (2003). Issues in Science and Technology online.
Fienberg, S. and Kadane, J.B. (1983). The presentation of Bayesian statistical analyses in legal proceedings. The Statistician, 32, 88-108.
Fienberg, S. (ed.) (1989). The Evolving Role of Statistical Assessments in the Courts. New York: Springer.
Fienberg, S.E., Krislov, S.H. and Straf, M.L. (1995). Understanding and evaluating statistical evidence in litigation. Jurimetrics Journal, 36, 1-32.
Finkelstein, M.O. and Levin, B. (2001). Statistics for Lawyers, 2nd edition. New York: Springer-Verlag.
Gastwirth, J.L. (1988). Statistical Reasoning in Law and Public Policy, vols. I and II. San Diego: Academic Press.
Gastwirth, J.L. (1997). Statistical evidence in discrimination cases. Journal of the Royal Statistical Society, Series A, 160, 289-303.
Gastwirth, J.L. (ed.) (2000). Statistical Science in the Courtroom. New York: Springer-Verlag.
Gray, M.W. (1993). Can statistics tell us what we do not want to hear? The case of complex salary structures. Statistical Science, 8, 144-179.
Gray, M.W. (1996). The concept of “substantial proportionality” in Title IX athletics cases. Duke Journal of Gender and Social Policy, 3, 165-188.
Jasanoff, S. (1998). The age of everyman: Witnessing DNA in the Simpson trial. Social Studies of Science, 28, 713-740.
Kadane, J.B. (2005). Ethical issues in being an expert witness. Law, Probability & Risk, 4, 21-23.
Kaye, David H. and Freedman, David A. (2000). Reference guide for statistics, in Reference Manual on Scientific Evidence. Washington: Federal Judicial Center.