"Evidence-based policing must engage a wider audience, including students and police officers new to the subject. Jerry Ratcliffe's accessible and practical book is an ideal introduction. It not only shows why EBP is so important, but also how to do policing with better evidence for better results. If every police officer could master the content of this book, the world would be a safer place."
Lawrence Sherman, Cambridge University and Honorary President, Society of Evidence-Based Policing

"As a police practitioner who understands the complexity of evidence-based policing, I highly recommend Jerry's new book which breaks it down into manageable bite-size chunks. With simple figures, insightful callouts, flowcharts, and short, easy-to-read chapters, this is the perfect guide to this emerging paradigm."
Renée Mitchell, Police Sergeant (Sacramento PD retd.) and President of the American Society of Evidence-Based Policing

"Police leaders who are interested in understanding the knowledge base of their profession need this book. It helps executives make smart, informed decisions about new plans, programs, and initiatives they may be considering. It also gives leaders the information necessary to collaborate with academics on research projects that will benefit their agencies and the profession."
John Hall, Deputy Inspector, New York City Police Department and National Institute of Justice LEADS scholar

"This book breaks down and practically explains evidence-based policing. Not only is it a useful guide for police officers wanting to understand if their strategies, tactics, or policies are having the desired impact, it should be used by researchers wanting to work with police to better understand evidence-based policing."
Mike Newman, Detective Inspector, Queensland Police Service, Australia

EVIDENCE-BASED POLICING

What is evidence-based policing and how is it done? This book provides an answer to both questions, offering an introduction for undergraduate students and a hands-on guide for police officers wanting to know how to put principles into practice. It serves as a gentle introduction to the terminology, ideas, and scientific methods associated with evidence-based policy, and outlines some of the existing policing applications.

A couple of introductory chapters summarize evidence-based policy and its goals and origins. The core of the book eases the reader through a range of practical chapters that answer questions many people have about evidence-based practice in policing. What does good science look like? How do I find reliable research? How do I evaluate research? What is a hypothesis? How do randomized experiments work? These chapters not only provide a practical guide to reading and using existing research, but also a roadmap for readers wanting to start their own research project. The final chapters outline different ways to publish research, discuss concerns around evidence-based policing, and ask what is in the future for this emerging field.

Annotated with the author's own experiences as a police officer and researcher, and filled with simple aids, flowcharts, and figures, this practical guide is the most accessible introduction to evidence-based policing available. It is essential reading for policing students and police professionals alike. Further resources are available on the book's website at evidencebasedpolicing.net.

Jerry H. Ratcliffe is a former British police officer, Professor of Criminal Justice at Temple University, US, and host of the popular Reducing Crime podcast.
EVIDENCE-BASED POLICING
THE BASICS
JERRY H. RATCLIFFE

Cover image: barbol88

First published 2023 by Routledge, 4 Park Square, Milton Park, Abingdon, Oxon OX14 4RN, and by Routledge, 605 Third Avenue, New York, NY 10158. Routledge is an imprint of the Taylor & Francis Group, an informa business.

© 2023 Jerry H. Ratcliffe

The right of Jerry H. Ratcliffe to be identified as author of this work has been asserted in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

British Library Cataloguing-in-Publication Data: A catalogue record for this book is available from the British Library.

ISBN: 978-0-367-70326-4 (hbk)
ISBN: 978-0-367-70325-7 (pbk)
ISBN: 978-1-003-14568-4 (ebk)
DOI: 10.4324/9781003145684

Typeset in Bembo by Apex CoVantage, LLC

Further resources are available on the book's website at evidencebasedpolicing.net

CONTENTS

List of Figures
List of Boxes
Foreword
1 What is evidence-based policing?
2 What are the origins of evidence-based policy?
3 What does good science look like?
4 What is the scientific method?
5 How do you identify a specific problem?
6 How do you find reliable research?
7 How do you evaluate policy research?
8 How do you develop a hypothesis and research question?
9 What are some core research concepts?
10 How do you make research methodology choices?
11 How do randomized experiments work?
12 How do you design a powerful experiment?
13 How do you know if an intervention is significant?
14 Where do you publish results?
15 What is next for evidence-based policing? (challenges and future directions appear in Chapters 15 and 16)
Index
FIGURES

1.1 Evidence-based policing oversight role for policing models and strategies
1.2 The scientific method
1.3 Organizational development of evidence-based practice
3.1 The relationship between evidence and knowledge terms
4.1 The scientific method
4.2 Effect sizes for problem-oriented policing studies
5.1 Identifying a problem as part of the scientific method
6.1 Background research as part of the scientific method
7.1 Evaluating research as part of the scientific method
7.2 Trend and seasonality are common in serious violent crime data
7.3 Evidence hierarchy for policy decision-making
7.4 Decision flowchart for evidence hierarchy studies
8.1 Developing a hypothesis as part of the scientific method
9.1 Undertaking a study as part of the scientific method
10.1 Evidence hierarchy for policy decision-making
10.2 Quantitative methods flow diagram
10.3 Posttest-only research design
10.4 Pretest-posttest research design
10.5 Posttest-only control group research design
10.6 Single-group time-series research design
10.7 Two-group pretest-posttest research design
10.8 Two-group pretest-posttest randomized research design
11.1 Non-comparable group assignment can introduce selection bias
11.2 Potentially biased effects with unexplained pre-treatment differences
11.3 Confounding variables can affect groups differently
11.4 Potentially biased outcomes to true tests with confounder effects
11.5 Simple and block randomization methods
12.1 Type I and type II error example
12.2 Three different effects for a hypothetical wellness program
13.1 Analysis as part of the scientific method
13.2 Contingency table for odds ratio calculation
13.3 The Kansas City Gun Experiment
13.4 Excel spreadsheet with Kansas City Gun Experiment values
13.5 Interpreting confidence intervals
14.1 Publication as part of the scientific method
15.1 Publication bias can warp perception of treatment efficacy
15.2 Replication as part of the scientific method
16.1 Basford and Schaninger's building blocks for change

BOXES

2.1 James Lind's scurvy experiment
2.2 Types of evidence relevant to policy makers
2.3 The PANDA crime reduction model
3.1 Ten responsibilities of an expert
4.1 Ethical considerations in policing research
4.2 Systematic reviews and crime research as a self-correcting science
4.3 Cold fusion and the importance of replication
5.1 CHEERS elements for identifying a problem
6.1 Characteristics of reputable sources
6.2 A quick note on citing library sources
6.3 The dangers of predatory journals
7.1 Overcoming the Sports Illustrated jinx
8.1 Hypothesis and theory of foot patrol in Philadelphia
8.2 The PICOT framework for research questions
9.1 Different sampling approaches
9.2 The contribution of pilot studies
10.1 General principles for focus groups
10.2 Observer guidance for a ride-along
11.1 Achieving randomization for the Philadelphia Foot Patrol Experiment
11.2 Tips for running a randomized experiment
12.1 The winner's curse
12.2 The errors of the boy and the wolf
12.3 Red lights and the dangers of underpowered studies
12.4 Experimental power considerations
13.1 A quirk regarding the odds ratio
13.2 Why are you not reading about p-values?
14.1 Adapting the policy brief for a proposed research study
14.2 How to write a strong lede
14.3 Ten tips for better presentation slides
15.1 How Stanislav Petrov prevented nuclear war
16.1 Ten project implementation problems that usually occur
16.2 The 12-step guide to conducting an experiment

FOREWORD

Over ten years ago, the Philadelphia Police Department put my Intelligence-Led Policing book on their promotion exams. They did not consult with me about this, because I would have advised against it. It is a more academic book written for scholars and not the type of text I would recommend for busy professionals. I first learned they had done this during fieldwork with cops in a high-crime area of the city. In the middle of a busy homicide scene, where a man had been shot to death on his porch, one grizzled sergeant with more than a few years on the job turned to me and growled, "I had to read your fucking book". It is my hope that the text you now hold does not generate a similar reaction.

This is designed to be an introductory book that covers the basics and not a thorough treatise. The chapters are shorter and more digestible. Color and graphics break up the page. Pull quotes and boxes identify key text or elaborate on specific areas, and technical language is replaced with English. All of this might frustrate seasoned academics looking for minutiae, but if that is you, bear in mind you are not the students and police officers who are the primary audience for the book.

If you are seeking additional reading after this book, there are a couple of authored and edited books worth exploring next. Evidence-Based Policing: Translating Research into Practice by Cynthia Lum and Chris Koper was the first thorough treatment of the subject and remains an excellent resource. Advances in Evidence-Based Policing (edited by Johannes Knutsson and Lisa Tompson), Evidence Based Policing: An Introduction (edited by Renée Mitchell and Laura Huey), and The Globalization of Evidence-Based Policing: Innovations in Bridging the Research-Practice Divide (edited by Eric Piza and Brandon Welsh) all contain chapters by insightful authors. The Cambridge Journal of Evidence-Based Policing, edited by Lawrence Sherman, contains many practical journal articles, freely downloadable and worth reading. The list continues to grow, and the website for this book contains additional recommendations.

I am grateful to the following people who volunteered material, insights, or their time to help this book along: Katy Barrow-Grint, Rob Flanagan, John Hall, Andy Hill, Natalie Hiltz, Josh Hinkle, Renée Mitchell, Mike Newman, Jess Phillips, Frank Plunkett, Kim Rossmo, Lawrence Sherman, Tom Sutton, Kevin Thomas, David Wilson, and my undergraduate and graduate students. Much love and thanks are reserved for Shelley Hyland, who went above and beyond by carefully reading through this manuscript several times. I would love to blame her for any errors or omissions, but alas they are all mine. I guarantee they will be gone by the second edition, though by that time I will have introduced a few more.

A couple of final notes. Any text in this bold blue font style indicates it has an online glossary entry at the book's website, evidencebasedpolicing.net. There you can also find additional information and materials to help your evidence-based policing work. And if you are an instructor and you want to assign the book for a class, there is a wealth of instructor materials available. Visit evidencebasedpolicing.net for details.

1 WHAT IS EVIDENCE-BASED POLICING?
THE COPPER'S NOSE

In April 2019, two British police officers went to the east London home of Zahid Younis. Younis, a 36-year-old Hungarian, had been reported missing by a friend, and the two Metropolitan Police officers decided to enter and search his apartment. They noticed a strange odor around a small, padlocked chest freezer, and something about it just did not feel right to them. They forced the freezer open with a crowbar and found the remains of two women Younis had murdered. Neither woman had been seen for nearly a year. If the apartment's electricity supply had not been disconnected (causing the smell from the freezer), the bodies could have remained undiscovered indefinitely. After Younis received a lengthy prison sentence, the lead detective said he had "the highest praise for the officers who went to Younis' flat that day—it was a good old-fashioned police hunch that made them force open the freezer, intuition that something just wasn't right".1

Intuition—the good old-fashioned police hunch—has been a cornerstone of policing since modern law enforcement began in 1829. It has also been referred to as having a good 'copper's nose', unfortunately appropriate in the Younis case. Intuition featured in sociologist Egon Bittner's definition of the policing role as working "in accordance with the dictates of an intuitive grasp of situational exigencies".2 Alongside supporting colleagues, demonstrating good sense and temperament, a strong work ethic, and a commitment to the esprit de corps (the feeling of pride and loyalty within policing), intuition is central to the 'craft' of policing. The television series The Wire described officers with these characteristics as 'natural police'. And we can certainly agree it was insightful of the officers to break into the freezer. But is an officer's intuition sufficient to guide police decision-making in the 21st century?

These days, a wealth of police activity data and crime statistics are available to journalists, reformers, critics, and the public. We can assess police performance across topics, from success in tackling violent crime to recruitment efforts in hiring officers of color. Police departments are being challenged about use of force policies, how they employ body-worn cameras, vehicle pursuit guidelines, and how officers deal with vulnerable populations such as people with addictions, people experiencing mental illness, and those who are homeless. There are increasing demands for evidence that programs and strategies are effective, and the public are, reasonably, asking police to make the case for the work they do.

In response, data and evidence are increasingly common tools of a new cohort of police leaders. This goes beyond crime, and there is an increasing desire by progressive leaders in policing to not just manage criminality, but also to demonstrate improvements in community safety, public trust, and police legitimacy. This inclusion of data and science can feel like a challenge to officers vested in the idea that policing remains a craft—a profession that functions based on individual officers' experience, intuition, and opinion. One aim of evidence-based policing is to moderate these beliefs, subjective opinions, and systematic biases and enhance the policy decisions of police organizations with data and research evidence. As you will discover in this book, science and data analysis have the potential to improve the police profession significantly.
There remains, however, a role within evidence-based policing for officer experience and intuition. Evidence-based policing can be an enhancement to current police practices, not necessarily a replacement.

THE ORIGINS OF EVIDENCE-BASED POLICY

Policing is not the first area of public life to discover the value of an evidence-informed approach. There is a centuries-long tradition of learning from the results of actions and incremental improvements in military equipment and tactics. Much of this was driven by an important motivator: if you did not learn and adapt, you lost and died. In response to armored knights roaming medieval battlefields, military engineers developed the English longbow. In Europe this morphed into the crossbow, a weapon so devastating that it was banned in 1139 by the Pope as 'hateful to God'. Progress was, however, ad hoc, and not formalized in a specific process.

The history of medical advancement is similarly anecdotal. This may explain why bloodletting survived as a medical practice for so long. Bloodletting is the archaic practice of draining some of a patient's blood. It is sometimes referred to as 'phlebotomy', 'depletion therapy', or a 'venesection'—evidence that giving something a fancy name does not make it any more effective. Egyptians, Greeks, and Romans thought it balanced the four 'humors': blood, phlegm, yellow bile, and black bile. In 1799, physician Benjamin Rush drained between 5 and 9 pints from America's first president, George Washington, when he lay sick with a throat infection. Not surprisingly, Washington died the next day. Though the practice was largely discredited by the 20th century, it was still a recommended treatment for pneumonia in a medical textbook published in 1942.3

Why did bloodletting survive so long? Progress was hindered because doctors were hampered by an unfortunate human trait: confirmation bias,* one of many cognitive biases to which we are vulnerable.4 Confirmation bias is our human tendency to interpret information and evidence in a way that supports our existing beliefs. The argument ran "if a patient recovered, it was because of bloodletting; and if they died, it was because bloodletting was not employed quickly enough". From the doctor's perspective, "heads I win, tails you lose".

* As this is the first such entry, please note that any standalone text in this bold blue font has a glossary entry at the book's website, evidencebasedpolicing.net, explaining what the term means.

With a more evidence-based approach, progress was so dramatic that improvements could be observed by simply watching patients get better. Initially, the benefits of sanitation, refrigeration of perishable food, insulin, and penicillin were so obvious that meticulous research methods were not required.5 But subsequent advances in medicine have required moving beyond trial-and-error, resulting in the development of rigorous methods of experimentation, grounded in the scientific method.

This development has occurred across a range of social policy areas. When my former colleague, Joan McCord, published an evaluation of the Cambridge-Somerville Youth Study, she created quite a furor. The program provided children with family counseling, access to social workers, academic assistance, and recreational activities. It epitomized the immersive assistance for at-risk youth advocated by social work experts, so it was fully expected to be a roaring success. There was only one problem. When the program was made available to adolescent at-risk boys randomly selected from a pool of over 500 similar kids, the boys that went through the program were later shown to have engaged in more crime, were more likely to be charged and appear in court, and were more likely as adults to have been convicted of serious street crime.6–7 Joan had people shout at her in presentations and was even threatened. People take umbrage when rigorous scientific evidence slays their sacred cows.
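Random selection of this kind is what gives such evaluations their strength: chance, not a social worker's judgment, decides who receives the program, so the two groups should differ only in the treatment received. The sketch below shows one simple way to split a pool at random, in Python. It is illustrative only—the IDs and the 50/50 split are invented, and the original study's assignment procedure was more involved.

```python
import random

# A minimal, hypothetical sketch of simple random assignment.
# Chance alone decides who gets the program, so pre-existing
# differences should balance out across the two groups.
random.seed(42)  # fixed seed so this example split is reproducible

pool = [f"youth-{i:03d}" for i in range(1, 501)]  # invented IDs
random.shuffle(pool)                              # random ordering

half = len(pool) // 2
program_group = pool[:half]   # offered counseling, social workers, etc.
control_group = pool[half:]   # no program; later outcomes compared

print(len(program_group), len(control_group))  # 250 250
```

It was exactly this comparison of later outcomes between randomly formed groups that allowed McCord to detect that the program had backfired.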
Evidence-based practice has subsequently expanded into areas such as public health, fiscal policy, education, and personnel management,8 and is emerging slowly into policing. Why slowly? Even though evidence-based policing was first introduced over 20 years ago, policing has largely shunned science, preferring to cherish and prioritize the role of experience in the policing craft—to its detriment.

"HOW WE'VE ALWAYS DONE IT"

Being slow to get onto the 'evidence-based' bandwagon has had consequences for policing. In the aftermath of the murder of George Floyd, when cities in the US began to discuss defunding, or even disbanding, their police departments, there was little research to evaluate these demands. One rare natural experiment happened when two New Jersey (US) police departments considered laying off officers during the 2008 economic recession. Unlike experiments that are specifically designed as studies, natural experiments are policy situations that occur in the real world without any researcher intervention and that provide an opportunity for observational study. One department—Jersey City—managed to resolve its financial problems. But the Newark police department laid off 13 percent of its officers. Piza and Chillar estimate this resulted in about 108 additional violent and 100 more property offences per month.9

Outside this unique case study, most police effectiveness and reform researchers concentrate less on police numbers and more on how to better deploy and use police officers. For crime control, a baseline is needed as a starting point for discussion. The standard model of policing involves random patrol across all parts of a jurisdiction, rapid response to crime when it happens, and a focus on criminal investigation.10 You can think of this baseline model as 'how we've always done it'. This has been the dominant approach since modern patrol policing began in the 1820s and detectives were introduced in the 1840s. It remains in many jurisdictions, in part because "reactivity and the aspects of the standard model are institutionalized in almost every fiber of the policing organization".11

There is now evidence that the standard model of policing is not associated with safer outcomes or improvements in the trust and confidence people have in police. The standard model is the basis against which more recent innovations are compared. The classic example is the Kansas City Preventative Patrol Experiment.
Researchers found that varying the number of cars randomly patrolling an area had no effect on the crime rate.12 A similar lack of effectiveness occurred in the Newark Foot Patrol Experiment when police converted large car patrol areas to foot beats.13 The work of detectives came under scrutiny in the 1970s and 1980s. The detective function was initially criticized for having little to do with clearing cases.14–15 John Eck found that in most situations, the solvability of a case was not related to the amount of effort exerted by a detective.16 Problems with routine investigation continue to this day, and clearance rates have declined for the last 60 years.17

Alternative models of policing have been proposed. These include community policing, intelligence-led policing, problem-oriented policing, and harm-focused policing.18 As Cynthia Lum and Chris Koper note, "Evidence-based policing does not reject or stand in juxtaposition to any of these alternative approaches or even to the standard model."11 Instead, if alternatives are to be used, we should be able to connect the outcomes to the programs and test if they are delivering results.

Evidence-based policing is designed to reduce reliance on our experience or blind hope that strategies will work—the fingers-crossed approach. It provides an evidentiary foundation on which to evaluate any approach, whether it be the traditional standard model of policing or more innovative methodologies. As can be seen in Figure 1‑1, evidence-based policing is not a conflicting or alternative approach to other strategies, but rather the mechanism to test their effectiveness.

Figure 1‑1 Evidence-based policing oversight role for policing models and strategies

EXPERIENCE VERSUS EVIDENCE

Experience is difficult to quantify. However, as US Supreme Court Justice Potter Stewart said about pornography, we know it when we see it. Experience is context-specific knowledge that is situated in the challenges police face. Unlike intuition, personal experience is acquired over time if, when faced with a situation, we reflect on the outcomes of actions taken in similar circumstances. In other words, we think back to how we acted in similar events and adjust our current behavior to achieve a better outcome. Bittner wrote that "the substance of police professionalism must issue mainly from police practice and police experience".19 When viewed as a craft—rather than a science—professionalism in policing stems from experience, not research knowledge.

As police officers deal with an array of situations and people, their experience grows, and with it their knowledge and skill set.20 This 'situated knowledge'21 is specific to the background, location, and situation in which police officers find themselves. I have worked with police in a dozen countries and often seen this situated knowledge. Downtown officers develop expertise dealing with drunken revelers, officers in prolific drug markets quickly develop an expertise around narcotics and the various stages of substance intoxication, and good detectives manage crime victims empathetically. But can this tactical experience translate into broader operational insights? The clinical expertise that police officers gain dealing with people with drug addictions may improve their street-level interactions but may be less valuable in identifying an enforcement or mitigation strategy for the area.
The individual experience of a detective sergeant may be extensive but not necessarily sufficient to design a project to improve clearance rates for a city. As we will see later in the book, individual experience is often anecdotal and does not necessarily give us the broader insight to establish an effective policy decision. Research or scientific evidence can often usefully supplement local experience.

WHAT IS SCIENTIFIC EVIDENCE?

Scientific evidence is different from the criminal evidence that police gather to achieve a prosecution in court. Criminal evidence can help determine if there has been a crime, who might have done it, and aid in prosecution. Fingerprints, DNA evidence, confessions, and eyewitness testimony will all help convict a specific robber, but may be less help in determining how to reduce robberies across a city. Scientific evidence is the accumulated wisdom from systematic studies and observations that can help a policy maker reach a conclusion about a policy choice.

People define and approach scientific 'research' in different ways. There are books filled with discussions of research philosophy, and I make no attempt to seriously address that here; however, two broad strands are relevant to social science. Positivists lean towards the 'science' in 'social science' and favor quantitative methods to understand the laws and principles that influence human behavior and our social trends. Data and methods should be representative of the community, replicable and reliable, and represent broad patterns and relationships between people and the environment. Positivists tend to seek causes for human behavior and use statistics to avoid making causal claims for phenomena that could occur by chance.

An interpretivist approach tends towards the 'social' in 'social science', arguing that people are more individualistic and respond differently to the same stimuli and societal forces. Researchers in this tradition are more interested in gaining insights into individual motivations and employ more qualitative research methods, such as interviews and observation. The aim is to better understand the nuances around how the individual explains their behavior and worldview. One maxim to consider is that quantitative methods can explain 'what' is going on, while qualitative research can explain 'why'. Increasingly, researchers employ mixed methods to understand not only the external influences that explain examples of human behavior, but also the individual actions that generate the broader patterns.

Both approaches constitute forms of research, albeit in different flavors. Policy makers, such as city administrators and police leaders, benefit when drawing on a breadth of scientific evidence.
DEFINING EVIDENCE-BASED POLICING

Definitions of evidence-based practice often reference the conscientious, explicit, and judicious use of the best evidence available in making decisions to improve outcomes.8, 22 Lawrence Sherman has translated this to law enforcement by calling it "the use of the best available research on the outcomes of police work to implement guidelines and evaluate agencies, units and officers".23 Cordner characterizes it as the use of "data, analysis, and research to complement experience and professional judgment, in order to provide the best possible police service to the public".24 The various existing definitions have common threads:

• Conscientious and explicit: evidence-based practice is an active, not passive, activity that should be energetically pursued
• Best evidence available: the search for the highest quality evidence, while being realistic about what is available at the time a decision is needed
• Guiding practice: being more than a theoretical exercise, evidence-based means adjusting decisions and policies based on the evidence
• Integrating experience: research evidence should be integrated with the clinical expertise of police officers
• Public service: evidence-based practice seeks to improve not just crime control but the whole gamut of community safety and public service

These threads coalesce in a definition from the UK's College of Policing: "In an evidence-based policing approach, police officers and staff create, review, and use the best available evidence to inform and challenge policies, practices, and decisions".25 I like that the emphasis is on police officers and staff taking a central role in evidence-based policing, and not just academics or researchers. It also stresses the importance of challenging policies and practices. It is the preferred definition for this book.

The role of practitioner experience within evidence-based practice is sometimes forgotten. Advocates of evidence-based medicine recognize the importance of integrating a doctor's clinical expertise with the best available evidence.22 In policing, an officer's 'clinical expertise' is a combination of their experience, training, and judgement. Evidence-based practice involves reconciling this clinical expertise with not just the scientific evidence, but also with the views and evidence gleaned from the police organization and from other stakeholders who have an interest in the outcome, such as local government and the community.

Evidence-based policing does not demand a rigid adherence to any specific research method, though there is a hierarchy of approaches, as you will learn later in the book. Neither is it just about crime control. Rather, evidence-based practice is a process for improving policing outcomes across the array of police activity, from recruitment and training to victim support. There is hardly any area of policing that will not benefit from a more scientific approach.

AN OUTLINE OF THE SCIENTIFIC METHOD

There is no consensus about what science 'is', though most scientists agree it is a combination of principles and techniques.
Science aims to create, organize, and share knowledge about how the universe works, using principles of logic, rationality, and objectivity.26 The overall goal of social science is to describe and explain the human world and environment around us, and to test theories and ideas about how it works. According to Karl Popper, knowledge that stems from science requires the capacity to test a hypothesis against our observable experience.27 Feeding this into a formal process, we get the system illustrated in Figure 1‑2.

Figure 1‑2 The scientific method

Imagine you identify a specific problem like vehicle break-ins around a university (step 1, at the top of the figure). You check online libraries and websites to learn more about car crime (step 2) and from this background research find out that increased foot patrol by police might be effective. You then ponder, "Would doubling security guard patrols in high crime areas reduce vehicle crime by 20 percent?" This is your hypothesis and research question (step 3). You spend several weeks doubling the security guards in some high crime areas, constantly checking to confirm the patrols are there (step 4, undertake a study or experiment). Afterwards, you compare the amount of car crime in the target areas with car crime in other high crime areas that did not have additional guards. From this you conclude that doubling the number of security officers reduced crime by 25 percent (step 5). You tell colleagues about your findings and publish your results in a blog post (step 6). You now wonder if additional security guards on foot would work to reduce violent crime, or if you could get the same result by only increasing guards by half instead of doubling them (step 7). These questions send you back to step 1 (Figure 1‑2).
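The step-5 comparison in this example is simple arithmetic, and it is worth seeing spelled out. The sketch below works through it in Python with invented monthly counts (none of these figures come from a real study): the comparison areas estimate what would have happened anyway, and the effect attributed to the patrols is the difference between the two percentage changes.

```python
# A minimal sketch of the step-5 comparison, using invented numbers.
# 'treatment' areas received doubled guard patrols; 'comparison' areas
# are similar high-crime areas that did not.

treatment = {"before": 80, "during": 48}    # monthly vehicle break-ins
comparison = {"before": 76, "during": 65}

def pct_change(before: float, during: float) -> float:
    """Percentage change from the before period to the patrol period."""
    return 100.0 * (during - before) / before

treat_change = pct_change(**treatment)      # -40.0%
comp_change = pct_change(**comparison)      # about -14.5%

# Comparison areas capture trend and seasonality, so the patrol effect
# is estimated as the difference between the two changes.
effect = treat_change - comp_change         # about -25.5 percentage points
print(f"Treatment areas:  {treat_change:.1f}%")
print(f"Comparison areas: {comp_change:.1f}%")
print(f"Estimated effect of extra patrols: {effect:.1f} percentage points")
```

This is why the comparison areas matter: without them, the raw 40 percent drop in the target areas could be mistaken for the patrol effect, when part of it would have happened anyway.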
BEYOND SCIENCE TO ACTION

Understanding the scientific method is one thing. Doing it is another. This book introduces you to the basic principles of evidence-based policing as painlessly as possible. Of course, everyone has a different pain threshold, so I make no promises. But what follows aims to be a gentle introduction.

One of the challenges for some police officers is thinking that they might benefit from a book like this at all. Bill Bratton—one of the most famous leaders in the history of American policing—summarized the overconfidence that can come with a little experience, based on his Boston police academy days in the 1970s. After just six weeks in the academy, he was deployed—before completing his training—to help with Christmas traffic and crime. When he and his colleagues returned to the academy a few short weeks later, "we thought we were veterans. Each of us had had experiences... The instructors had a hard time training us; in fact, they couldn't wait to get rid of us because we had become unmanageable. Three months earlier we'd known nothing; now, as far as we were concerned, we knew it all."28 I winced when first reading this, recognizing my own overconfidence as a 19-year-old police officer on the streets of east London in the 1980s.

One starting point is an honest assessment of a police organization's level in either using evidence-based practice or contributing to the profession when the evidence base is not available. In Figure 1‑3, police departments that regularly cite and use research and evidence as the foundation for their policies would be optimal organizations.29 But moving down the chart, if they do not use a research foundation for their policies and are not investing in learning to answer important questions about the most effective path forward, they are mired in stagnation.

Figure 1‑3 Organizational development of evidence-based practice

The scientific process is not a guarantee that the answers produced are correct. But good science is a self-correcting practice that can discard outdated ideas when new knowledge becomes available. And this is especially important in social science because human beings are less predictable than the physical world. Some doubt and a willingness to question the current orthodoxy is not only useful, but also essential to progress. A 'culture of curiosity'—as a police colleague and friend likes to call it—is what propels us towards better policing.

Chapters 2 through 4 outline the origins of evidence-based policy and the basics of the scientific method. Chapters 5 through 7 will arm you with the understanding and skills necessary to review the existing knowledge we have about policing. Chapters 8 to 13 provide a foundation for designing your own research project and understanding the results. The remaining chapters (14 to 16) review ways to publish studies, discuss challenges to evidence-based policing, and provide some final thoughts on how to move it forward.

SUMMARY

Nearly 100 years ago Frederick Wensley—head of detectives at Scotland Yard—wrote that "one of the first qualities of an ambitious detective must be industry. To that he must add patience, courage, tact and resourcefulness".30 Alongside these characteristics of 'natural police', intuition has long been an essential ingredient of police experience. But we also need evidence-based policing because:

• There is more police and crime data available publicly, and a range of stakeholders are now scrutinizing police decision-making.
• We see increasing demands for police to produce evidence that their programs and strategies are more effective than alternatives.
• There now exists a growing body of research on the impacts of different police approaches to problems.
• While intuition can have a role, the limitations of the good old 'copper's nose' are increasingly apparent.

Adopting the definition that "In an evidence-based policing approach, police officers and staff create, review, and use the best available evidence to inform and challenge policies, practices, and decisions",25 police officers and students should not only use and review scientific evidence, but also apply it to real world problems. To 'police officers and staff' we could also add partners in government and service providers. Academics must also step out of the ivory tower and help address issues around crime and policing.

There are numerous obstacles to overcome. To this day, many police officers match their strong preference for experience and intuition with an equivalent aversion to academic study and research. One prominent police chief commented to me that "policing is the only field where the word 'clever' is an insult". This is not a trait unique to policing.
Author Isaac Asimov lamented that "anti-intellectualism has been a constant thread winding its way through our political and cultural life, nurtured by the false notion that democracy means that 'my ignorance is just as good as your knowledge'".31

But there are positive signs. When officers are exposed to research integrated into their professional workplace, they are more likely to use it to make better decisions.32 And policing research has grown exponentially in the last 20 years, adding to our store of knowledge. Astrophysicist and science communicator Neil deGrasse Tyson wrote that "If you cherry-pick scientific truths to serve cultural, economic, religious or political objectives, you undermine the foundations of an informed democracy".33 So, congratulations. By reading this book and applying the scientific and evidence-based principles within, you are contributing to democracy.

REFERENCES

1. Hardy, J. (2020) Police face questions after "system failed victims" of double murderer who hid bodies in freezer. Telegraph: London, 3 September, p. 1.
2. Bittner, E. (1990) Aspects of Police Work, Northeastern University Press: Boston, p. 131 (emphasis added).
3. Thomas, D.P. (2014) The demise of bloodletting. Journal of the Royal College of Physicians of Edinburgh 44 (1), 72–77.
4. Mitchell, R.J. (2022) Twenty-One Mental Models That Can Change Policing, Routledge: New York.
5. Baron, J. (2018) A brief history of evidence-based policy. The Annals of the American Academy of Political and Social Science 678 (1), 40–50.
6. McCord, J. (2003) Cures that harm: Unanticipated outcomes of crime prevention programs. Annals of the American Academy of Political and Social Science 587 (1), 16–30.
7. McCord, J. and McCord, W. (1959) A follow-up report on the Cambridge-Somerville youth study. Annals of the American Academy of Political and Social Science 322 (1), 89–96.
8. Barends, E., Rousseau, D.M. and Briner, R.B. (2014) Evidence-Based Management: The Basic Principles, Center for Evidence-Based Management: Amsterdam.
9. Piza, E.L. and Chillar, V.F. (2021) The effect of police layoffs on crime: A natural experiment involving New Jersey's two largest cities. Justice Evaluation Journal 4 (2), 176–196.
10. Weisburd, D., Majmundar, M.K., Aden, H., Braga, A.A., Bueermann, J., Cook, P.J., . . . Tyler, T. (2019) Proactive policing: A summary of the report of the National Academies of Sciences, Engineering, and Medicine. Asian Journal of Criminology 14 (2), 145–177.
11. Lum, C. and Koper, C.S. (2017) Evidence-Based Policing: Translating Research into Practice, Oxford University Press: Oxford, pp. 12, 16.
12. Kelling, G.L., Pate, T., Dieckman, D. and Brown, C.E. (1974) The Kansas City Preventative Patrol Experiment: A Summary Report, Police Foundation: Washington, DC, p. 56.
13. Kelling, G.L. (1981) The Newark Foot Patrol Experiment, Police Foundation: Washington, DC.
14. Chaiken, J.M., Greenwood, P.W. and Petersilia, J. (1977) Criminal investigation process—a summary report. Policy Analysis 3 (2), 187–217.
15. Greenwood, P.W., Chaiken, J.M. and Petersilia, J. (1977) The investigative function. In The Criminal Investigation Process (Greenwood, P.W. et al. eds), D.C. Heath: Lexington, MA, pp. 9–13, 225–235.
16. Eck, J.E. (1983) Solving Crimes—The Investigation of Burglary and Robbery, National Institute of Justice: Washington, DC.
17. Eck, J.E. and Rossmo, D.K. (2019) The new detective. Criminology and Public Policy 18 (3), 601–622.
18. Ratcliffe, J.H. (2019) Reducing Crime: A Companion for Police Leaders, Routledge: London.
19. Bittner, E. (1970) The Functions of the Police in Modern Society, National Institute of Mental Health, Center for Studies of Crime and Delinquency: Rockville, MD, p. 88.
20. Willis, J.J. (2013) Improving policing: What's craft got to do with it? In Ideas in American Policing, Police Foundation: Washington, DC, Issue 16, pp. 1–14.
21. Thacher, D. (2008) Research for the front lines. Policing and Society 18 (1), 46–59.
22. Sackett, D.L., Rosenberg, W.M., Gray, J.A., Haynes, R.B. and Richardson, W.S. (1996) Evidence based medicine: What it is and what it isn't. British Medical Journal 312 (7023), 71–72.
23. Sherman, L.W. (2002) Evidence-based policing: Social organisation of information for social control. In Crime and Social Organisation: Essays in Honour of Albert J. Reiss Jr. (Waring, E. and Weisburd, D. eds), Transaction Publishers: New Brunswick, pp. 217–248, 226.
24. Cordner, G. (2020) Evidence-Based Policing in 45 Small Bytes, National Institute of Justice: Washington, DC, p. 2.
25. U.K. College of Policing (2017) What Is Evidence-Based Policing? http://whatworks.college.police.uk/About/Pages/What-is-EBP.aspx (accessed February 2022).
26. Chang, M. (2014) Principles of Scientific Methods, CRC Press: Boca Raton.
27. Popper, K. (1935 [2002]) The Logic of Scientific Discovery, Routledge Classics: London.
28. Bratton, W. and Knobler, P. (2021) The Profession, Penguin Press: New York, pp. 43–44.
29. Adapted from Martin, P. (2019) Moving to the inevitability of evidence-based policing. In Evidence Based Policing: An Introduction (Mitchell, R.J. and Huey, L. eds), Policy Press: Bristol, pp. 199–213.
30. Wensley, F.P. (1930) Forty Years of Scotland Yard, Doubleday, Doran and Company: New York, p. 68.
31. Asimov, I. (1980) A cult of ignorance. Newsweek, p. 19.
32. Fleming, J. (2015) Experience and evidence: The learning of leadership. In Rising to the Top: Lessons from Police Leadership (Fleming, J. ed), Oxford University Press: Oxford, pp. 1–16.
33. Tyson, N.D. (2016) What Science Is, and How and Why It Works. www.haydenplanetarium.org/tyson/commentary/2016-01-23-what-science-is.php.

2 WHAT ARE THE ORIGINS OF EVIDENCE-BASED POLICY?

EVIDENCE-BASED MEDICINE

For most of human history, medical practice was conducted by shamans, witch doctors, or an assortment of fakers, quacks, and charlatans. The cures were usually more harmful than beneficial. As we slowly adopted scientific principles, we have advanced human life from being "nasty, brutish, and short"1 to where life expectancy has doubled since 1900 and one in every 6,000 people lives for at least 100 years.

Evidence-based medicine merges the best research evidence with clinical expertise and a patient's unique values and circumstances.2 Much of the emphasis with evidence-based practice is on the science; however, it also considers the willingness of the patient to accept and undergo treatment.

A key determinant of the adoption of evidence-based policy is the enthusiasm the field has for new ideas. Take James Lind's unique experiment described in Box 2‑1. After Lind's study in the Atlantic Ocean, it took the Royal Navy 42 years before they made citrus juice a mandatory requirement in the diet of sailors. This might sound like a policy failure—and it was—but there were multiple causes for the delay. Some of Lind's recommendations for storing citrus inadvertently destroyed the vitamin C.
It was also a reality that the British Admiralty had been inundated with suggestions for scurvy cures, and Lind was of a lower status than other specialists, so his discovery received less attention.3

BOX 2‑1 JAMES LIND'S SCURVY EXPERIMENT

"On the 20th of May 1747, I took twelve patients in the scurvy, on board the Salisbury at sea. Their cases were as similar as I could have them. They all in general had putrid gums, the spots and lassitude, with weakness of the knees".4 In this way, Scottish naval surgeon James Lind opened his description of the first recorded clinical trial using control groups. Lind had observed scurvy—a disease associated with a lack of vitamin C—decimate ship crews since he first went to sea in 1739. It was responsible for the deaths of more seamen than enemy action, and as surgeon to HMS Salisbury, a naval warship, Lind watched as 80 of the 350 sailors were struck down with scurvy over a ten-week period.3

Lieutenant Lind took six pairs of sailors, each with similar cases of scurvy, and gave them different dietary supplements. Some treatments were better than others. One pair was given two oranges and a lemon, and another pair cider. Other unfortunate souls were made to consume a vitriolic elixir (basically diluted sulfuric acid), vinegar, sea water, or what was called a 'purgative mixture'. Vitamins were unknown in Lind's time, but Lind managed to get lucky with one of his treatments, writing, "The consequence was, that the most sudden and visible good effects were perceived from the use of the oranges and lemons; one of those who had taken them, being at the end of six days fit for duty".4

Notwithstanding a 42-year delay, Lind's modest experiment is recognized as a cornerstone in evidence-based medicine. The groundwork was, however, laid long before. Around 400 BCE, the Greek physician Hippocrates stressed the importance of researching the existing knowledge before embarking on a new study. And a millennium later, the Persian doctor Abu Bakr Muhammad ibn Zakariya al-Razi wrote about the importance of a comparison group in his Comprehensive Book of Medicine (Kitab al-Hawi fi al-tibb).
Yet the mortality rate was three times higher than the midwives’ ward. It was so bad that some women chose to give birth in the street, due to the medical student ward’s terrifying reputation. An unfortunate incident sparked Semmelweis’ curiosity. One of his friends died after being accidentally infected with a medical student’s scalpel during a post-mortem, and Semmelweis reasoned that perhaps the mothers were being infected in some way by the medical students.5 Semmelweis implemented a regime of chlorine handwashing and overnight reduced the instances of puerperal fever. While handwashing is now heralded as one of the seminal breakthroughs in medical science, at the time it was rejected by the leading obstetricians in the city. Semmelweis’ handwashing doctrine 22 W hat are the origins of evidence - based policy ? conflicted with the theories of the day. His research was portrayed as an attack on the knowledge and integrity of other doctors in the hospital system and elicited considerable enmity towards Semmelweis. His handwashing stations were curtailed, his position at the hospital was not renewed, and he was forced to leave. AVIATION SAFETY Alongside medical science, the evolution of aviation safety has been one of the triumphs of human ingenuity. As a pilot—I have in the past owned and partially built a couple of small seaplanes—I pay particular attention to aviation safety. From Orville Wright’s first powered flight in 1903, through the first transatlantic jet passenger service in the 1950s, in little more than a hundred years, commercial aviation has become the safest way to travel. At less than 0.1 deaths per billion commercial passenger miles, it is safer than any other transportation mode. Cars, for example, have a fatality rate of 7.2 per billion miles travelled.6 Between 2010 to 2019, even though flying more than 200 million passengers for over 600 billion passenger miles a year, US commercial aviation suffered just two fatalities. The accounts of the Wright brothers’ progress from early gliders to powered flight is testament to a focus on continual evaluation and incremental progress. It has not always been this way. The accounts of the Wright brothers’ progress from early gliders to powered flight is testament to a focus on continual evaluation and incremental progress. Having seen others killed in early accidents, they made progress cautiously, with hundreds of mundane short test flights, usually as low as possible. As Wilbur Wright pointed out, “While the high flights were more spectacular, the low ones were fully as valuable for training purposes”.7 Aviation advanced quickly. In 1935, the US Army Air Corps tested three aircraft to determine which would get the contract to W hat are the origins of evidence - based policy ? 23 be the country’s long-range bomber. One candidate was the Boeing Model 299. It could fly farther and carry a greater payload than the competitors, and the company was optimistic. But on October 30th, two test pilots took off, entered a steep climb, and stalled. The crew had forgotten to remove the gust locks, devices that hold the elevator and rudder in a fixed position to prevent damage when the airplane is on the ground. The elevator is the horizontal part of the tail that can move up and down and helps the aircraft climb and descend. Now airborne, but with the elevator locked into a fixed position, the aircraft could not be controlled. The aircraft nosed over, crashed, and killed both pilots. 
At the time, the Model 299 was one of the most complicated aircraft in existence, and a newspaper of the day said it was “too much plane for one man to fly”.8 The Boeing’s Model 299 did not get the military contract; however, a few were purchased for test purposes. The complexity of the Boeing Model 299 combined with more testing spurred a simple, yet innovative solution: the pilot’s checklist. A checklist is a list of items that require checking, verification, action, or inspection, and is usually performed in a certain order. The great value of checklists is their ability to reduce complexity down to a series of more manageable tasks. They are not the result of an experiment, but rather an innovation, continually evaluated and incrementally improved over time. There are further policing lessons from aviation. On April 15, 2016, Allegiant Air flight 436, with 158 people on board, accelerated down the runway at Las Vegas McCarran International Airport. Partway through the takeoff, at 138 miles an hour, the aircraft nose lifted prematurely into the sky, before the crew were ready or had moved any controls. The pilots quickly cut the power, the plane settled back onto the tarmac, and they were able to stop before careering off the runway. The cause? A nut had fallen off the left elevator. The nut was supposed to be secured by a cotter pin, but after routine maintenance, a mechanic had failed to replace this finger-length strip of metal. After a couple of hundred takeoffs and landings, the nut had worked its way loose, nearly causing disaster. The last seaplane I owned had dozens of cotter pins. They cost two cents. 24 W hat are the origins of evidence - based policy ? On investigating Allegiant Air, the Federal Aviation Administration (FAA) reported “Deliberate acts of noncompliance by company personnel resulted in improper maintenance that endangered numerous lives”. The FAA inspector went on to rail against, not human error, but a deliberate lack of oversight by the maintenance company, and a ‘culture of disregard’ for managerial oversight.9 Oversight, combined with the ongoing tracking of data, has since become a vital component of maintaining flight maintenance safety. POLICY LESSONS FOR POLICING What are the policy lessons for policing from medicine and aviation? Trial and error to find solutions, while mired in complicated issues, is as important to the evolution of policing as it is to medicine and the aviation industry. Importantly, advancing an evidence-based strategy is not just a matter of developing and using research evidence, but also implementing a change in policy. Advancing an evidence-based strategy is not just a matter of developing and using research evidence, but also implementing a change in policy. Both Lind and Semmelweis brought empirical methods to their research but failed to see the science converted into lifesaving policies in their lifetimes. Having not attended one of the top universities, Lind’s status was questioned, and his findings received little attention. Semmelweis’ work was manipulated politically, and his lack of communication skills brought him into conflict with hospital administrators. It would be nice to think that people in positions of power can behave altruistically; however, human experience often suggests otherwise. The two doctors made their advances through experiments. Lind used a controlled comparison approach, while Semmelweis took advantage of local conditions to undertake a field experiment. 
A field experiment is a research study that uses experimental methods, not in a laboratory, but in a natural setting. Experiments can be a robust and effective way to advance knowledge, but they are not the only way to improve humanity. The Wright brothers were pioneers in an immensely risky area. As I have heard many times in aviation hangars, “There are old pilots, and there are bold pilots, but there are no old, bold pilots”. The Wright brothers moved forward with incremental change and continuous evaluation.

Orville and Wilbur Wright employed two techniques that gave them a significant advantage. First, they broke their overall problem into multiple smaller and more manageable tasks. Rather than wring their hands and fixate on the overall challenges of flying, they tackled the problems of engine, flight surfaces, propeller and so forth, independently. They took an apparently inscrutable problem and broke it into solvable challenges. Second, they achieved incremental improvements by continuously monitoring and recording their activities. By tracking their successes and failures, they could advance without taking excessive risks.

Armed with checklists that covered the vital flight conditions, the Boeing Model 299 went on to fly over a million hours without serious incident, and the government placed orders for over ten thousand planes. It was renamed the B-17 and given the nickname the “Flying Fortress”.8 The humble checklist, developed to prevent pilots having to remember all of the steps necessary to fly the B-17, has become central to multiple areas of complex human endeavor, from surgery to space flight.10

Data tracking is the frequent or continuous monitoring of specific data points to indicate what, where, and when people and systems are performing activities related to specific objectives.11 In aviation, maintenance errors such as the Allegiant Air flight are now infrequent because the industry invests in oversight and tracking to minimize errors and prevent people taking shortcuts. For example, the cockpit voice recorder and flight data recorder—aviation’s version of the police body-worn camera—became mandatory in commercial aviation in 1967 and are now essential tools in incident review.

THE EMERGENCE OF EVIDENCE IN POLICING

What about evidence-based policing? Police chiefs were largely oblivious to research until the Kansas City Preventative Patrol Experiment. In 1972, police in Kansas City, Missouri, experimented with varying police patrols across the city. When George Kelling and colleagues12 showed that random patrol strength had little effect on crime or public perception of crime, it repudiated a doctrine that had existed since the first modern police in 1829. Regardless of whether an area had the normal level of patrol (control beat), two to three times the level of normal patrol (proactive beat), or no patrol and only response policing (reactive beat), the results were the same: “The three areas experienced no significant differences in the level of crime, citizens’ attitudes towards police services, citizens’ fear of crime, police response time, or citizens’ satisfaction with police response time”.12 The Kansas City work prompted some interest in police departments as ‘experimental laboratories’, places where conditions could be adjusted, and the results observed.
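To see the experimental logic in modern terms, here is a minimal sketch in Python of how a three-beat comparison could be checked for group differences. The monthly crime counts are invented for illustration and are not the Kansas City data; the original study used far richer measures.

```python
from scipy.stats import f_oneway  # one-way analysis of variance

# Invented monthly reported-crime counts for the three beat conditions
# (illustration only; these are not the actual Kansas City figures).
control   = [52, 48, 55, 50, 49, 53, 51, 47, 54, 50, 52, 49]
proactive = [50, 51, 47, 53, 49, 52, 48, 50, 55, 46, 51, 50]
reactive  = [53, 49, 52, 48, 54, 50, 51, 49, 47, 55, 52, 48]

# If patrol strength mattered, the group means should differ by more
# than chance. A large p-value is consistent with 'no effect'.
stat, p = f_oneway(control, proactive, reactive)
print(f"F = {stat:.2f}, p = {p:.3f}")
```

Because the invented group means are nearly identical, the test returns a large p-value, mirroring the ‘no significant differences’ finding reported by Kelling and colleagues.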
In Newark, New Jersey, a foot patrol experiment cast doubt on the effectiveness of foot beats covering large areas,13 and spurred Broken Windows theory.14 The Minneapolis Domestic Violence Experiment influenced domestic abuse arrest policies,15 and a deeper understanding of investigative realities identified the limitations of detective work.16

Research grew through the subsequent decades, but it did not coalesce into a movement until Lawrence Sherman wrote a summary of evidence-based policing in 1998 for the Police Foundation (now called the National Policing Institute). He described evidence-based policing as using “the best evidence to shape the best practice. It is a systematic effort to parse out and codify unsystematic ‘experience’ as the basis for police work, refining it by ongoing systematic testing of hypotheses”.17 Sherman stressed the careful targeting of scarce resources, testing of police methods to identify what works best to reduce harm, and tracking of various data sources to monitor service delivery.11

Data and analytical resources had been improving in policing since the 1980s and had spurred the rise of Compstat18 and problem-oriented policing,19 and created the analytical basis on which evidence-based policing could grow. Rather than replacing these law enforcement strategies, evidence-based policing provides the scientific foundation for their evaluation, enabling the strategies to refine their approach to public safety. Discussions around the evidence base for law enforcement have become a part of policing debates, for many of the reasons discussed in the previous chapter. There is more police and crime data available publicly, we see increasing demands for police to produce evidence for programs and strategies, and there now exists a growing body of research on the impacts of different police approaches to problems.

BASIC EVIDENCE-BASED PRINCIPLES

If there is anything easier to adopt in policing than buzzwords, I have not seen it. Thus, while many organizations and people claim to be evidence-based, the assertion is easier to make than demonstrate. A commitment to evidence-based policy is not just talking about evidence and using what we already know from existing evaluations, but also having an obligation to build knowledge to help future decisions.20–21 These basic principles are common to most evidence-based approaches:20

• Build a rigorous evidence base for what initiatives do and do not work
• Consider return on investment by monitoring benefits, costs, and negative impacts
• Incorporate research evidence into policy and budget decisions
• Continuously monitor programs to confirm they are implemented as planned
• Track and evaluate the outcomes of programs to confirm they are achieving desired results

As you can see, these principles involve understanding organizational systems and effects, and bringing together evidence from a variety of sources. Box 2‑2 lists the main sources of evidence recommended by evidence-based policy advocates.22–23 They integrate clinical expertise (called here professional evidence) with organizational and scientific evidence. And from a policing perspective, it is also important to include stakeholder perspectives.
SUMMARY

Evidence-based policy builds a rigorous evidence base; considers benefits, costs, and impacts; integrates research into policy and budget decisions; and monitors program implementation. Data tracking, a lesson from aviation safety, is also vital. Importantly, evidence-based policy focuses on whether these programs achieve their stated outcomes. Like the medical field, policing is moving towards the integration of scientific, organizational, professional, and stakeholder evidence, albeit at a different pace.

Another innovation increasingly adopted from the medical and aviation fields has been checklists. For example, a number of police departments have adopted the PANDA crime reduction model as their approach to problem-solving (Box 2‑3), each stage of which includes simple checklists.22

BOX 2‑2 TYPES OF EVIDENCE RELEVANT TO POLICY MAKERS

Scientific evidence: Scientific research is usually found in academic journals and books. In recent years, there has been an explosion of research into policing. You can also find relevant studies in government and agency reports.

Organizational evidence: Police departments record huge volumes of data that can provide significant insight yet are rarely analyzed. These include crime and arrest data as well as reports from investigative interviews and victim surveys. Employee information, budget details, internal affairs records, and sickness reports can help understand organizational problems.

Professional evidence: The tacit understanding of problems that officers and detectives build up over time can provide insights where hard (digital) data are lacking. Officers—especially those in a specialized role—can reflect on their wealth of experiences to illuminate meaningful lessons that can help tackle problems and implement solutions.

Stakeholder evidence: Stakeholders can create a context for analyzing the other forms of evidence. Any person or group likely to be affected by the outcome of a project can be a stakeholder. They can be internal to the police service or an external partner. Understanding their values and concerns can help avoid implementation problems and pushback.

BOX 2‑3 THE PANDA CRIME REDUCTION MODEL

• Problem scan
• Analyze problem
• Nominate strategy
• Deploy strategy
• Assess outcomes

At the time of writing, we are still in the early days of evidence-based policing, but I anticipate that both experimentation and tracking incremental change will be fundamental to improvements in outcomes for policing and the community. Evidence-based policy is not, however, without its critics. It has, for example, been critiqued for failing to fully recognize the value of mechanistic reasoning (thinking about pathways and cause and effect) in treatment and establishing causality.24 Some proponents have also been criticized for overemphasizing the merits of randomized experiments and minimizing the value of other research traditions.25 Nonetheless, rigorous science can aid policy makers by bringing an objectivity that is sorely needed to our political environment. The public safety world would be significantly improved if more decision makers could take the advice of Neil deGrasse Tyson to “do whatever it takes to avoid fooling yourself into thinking something is true that is not, or that something is not true that is”.26

REFERENCES

1. Hobbes, T. (1651) Leviathan, Scolar Press: Menston, p. 9.
2. Straus, S., Glasziou, P., Richardson, W. and Haynes, R. (2019) Evidence-Based Medicine: How to Practice and Teach EBM, Elsevier Health Sciences: Toronto.
3. Tröhler, U. (2003) James Lind and Scurvy: 1747 to 1795. James Lind Library Bulletin. www.jameslindlibrary.org/articles/james-lind-and-scurvy-1747-to-1795/.
4. Lind, J. (1753) A Treatise of the Scurvy, Sands, Murray and Cochran: Edinburgh, p. 191.
5. Kadar, N. (2019) Rediscovering Ignaz Philipp Semmelweis (1818−1865). American Journal of Obstetrics and Gynecology 220 (1), 26–39.
6. Savage, I. (2013) Comparing the fatality risks in United States transportation across modes and over time. Research in Transportation Economics 43 (1), 9–22.
7. Wright, W. (1903) Experiments and observations in soaring flight. Journal of the Western Society of Engineers 3 (4), 1–18, p. 7.
8. Ludders, J.W. and McMillan, M. (2016) Errors in Veterinary Anesthesia, John Wiley and Sons: Chichester, p. 133.
9. Allen, B.E. (2015) Memorandum: Return of EIR 2015WP390002, Federal Aviation Administration: Washington, DC, pp. 8, 18.
10. Gawande, A. (2011) The Checklist Manifesto: How to Get Things Right, Picador: London.
11. Sherman, L.W. (2013) The rise of evidence-based policing: Targeting, testing and tracking. In Crime and Justice in America, 1975–2025 (Tonry, M. ed), University of Chicago Press: Chicago.
12. Kelling, G.L., Pate, T., Dieckman, D. and Brown, C.E. (1974) The Kansas City Preventative Patrol Experiment: A Summary Report, Police Foundation: Washington, DC, pp. vii, 56.
13. Kelling, G.L. (1981) The Newark Foot Patrol Experiment, Police Foundation: Washington, DC.
14. Wilson, J.Q. and Kelling, G.L. (1982) Broken windows: The police and neighborhood safety. The Atlantic Monthly, pp. 29–38.
15. Sherman, L.W. and Berk, R.A. (1984) The Minneapolis Domestic Violence Experiment, Police Foundation: Washington, DC, p. 13.
16. Eck, J.E. (1983) Solving Crimes—The Investigation of Burglary and Robbery, National Institute of Justice: Washington, DC, p. 388.
17. Sherman, L.W. (1998) Evidence-Based Policing, Police Foundation: Washington, DC, p. 4.
18. Bratton, W. and Knobler, P. (2021) The Profession, Penguin Press: New York.
19. Goldstein, H. (1990) Problem-Oriented Policing, McGraw-Hill: New York.
20. Evidence-Based Policymaking Collaborative (2016) Principles of Evidence-Based Policymaking, pp. 1–11. https://www.urban.org/sites/default/files/publication/99739/principles_of_evidence-based_policymaking.pdf.
21. Pew Charitable Trusts (2014) Evidence-Based Policymaking: A Guide for Effective Government, Pew-MacArthur Results First Initiative: Philadelphia, pp. 1–30.
22. Ratcliffe, J.H. (2019) Reducing Crime: A Companion for Police Leaders, Routledge: London.
23. Barends, E., Rousseau, D.M. and Briner, R.B. (2014) Evidence-Based Management: The Basic Principles, Center for Evidence-Based Management: Amsterdam.
24. Howick, J.H. (2011) The Philosophy of Evidence-Based Medicine, John Wiley & Sons: Chichester.
25. Tilley, N. (2010) Whither problem-oriented policing. Criminology and Public Policy 9 (1), 183–195.
26. Tyson, N.D. (2016) What Science Is, and How and Why It Works. www.haydenplanetarium.org/tyson/commentary/2016-01-23-what-science-is.php.

3 WHAT DOES GOOD SCIENCE LOOK LIKE?

EVERYONE HAS AN OPINION

Once you spend any time in the criminal justice field, you soon discover that everyone has a perspective on crime and policing.
I have sometimes been introduced at parties with “This is Jerry, he was a police officer”, inevitably generating a flurry of uninvited opinion. I doubt this happens to accountants. Not only do people have opinions on the criminal justice system, they also believe their personal involvement (often of being stopped by the police or reporting a crime) is representative of the experiences of everyone. These conversations are challenging because policing and justice are strongly associated with moral positions. Research has found people “do not want to live near, be friends with, or even sit too close to someone who does not share their core moral convictions”.1 If we are to embrace the value of evidence-based policing, we need evidence rather than anecdotal or uninformed opinion, however strongly felt. By the end of the chapter, you should have a clearer understanding of the difference between facts and opinions, how to distinguish between data and information, and how context can move us towards meaningful knowledge and informed opinions.

OPINIONS AND INFORMED OPINIONS

When reading a news article, it can be difficult to distinguish between what is happening (facts) and an interpretation of those facts and what they might mean (opinion). You may notice this distinction when watching a news channel or listening to a university lecturer. When are they conveying information, and when are they expressing their opinion?

The US has the highest incarceration rate in the world. At the time of writing, this is a fact (though that might change in the future). A fact is a statement or detail that can be proven to be true or to exist. Facts can be supported by data and evidence and are truthful statements, however vehemently you might want them to be something else. For my example, you can gather data from around the world and compare them to confirm the US has the highest rate. Data are collections of facts and statistics compiled for reference or to aid analysis.

I could make a statement such as “In America we incarcerate more people than any other country.” This is a fact, because it is supported by data (whether by population rate or raw numbers). I could go on to say, “I think we could reduce our prison population and crime would not go up much”. This is an opinion. An opinion is a feeling, viewpoint, or a person’s perception of a given item or event. While I would argue that Freddie Mercury is the greatest rock singer in history, it is still just an opinion. It can be agreed or disagreed with, but it cannot be verified or proven to be true. Your opinion may differ, be just as strongly held, and be equally valid.

Certain phrases and verbs—such as ‘I think’ or ‘I believe’—can indicate an opinion is about to follow. They can change a statement from reporting facts into a statement in which facts are interpreted, analyzed, or commented upon. Words such as ‘indicates’, ‘implies’, ‘suggests’ or ‘proves’ can be signposts that a fact is being used to substantiate an opinion.
• “Here in America, we incarcerate more people than any other country” (fact)
• “This suggests we can reduce the prison population without many negative effects” (opinion)

Sometimes an opinion is grounded in the fact of a personal experience. A person could have experienced several negative interactions with police, such as being stopped for no apparent reason. They may say that “the police always stop people without justification”. That they have been stopped several times is a fact. The belief that the police in general always stop people without justification is an opinion. Over 60 million Americans had contact with the police in 2018 alone.2 Unless we can examine a substantial number of those interactions directly, the ‘without justification’ claim is not verifiable as a fact. Taken in isolation, an individual’s subjective perception is rarely a sound foundation for public policy. There is always the risk that personal incidents are idiosyncratic and not reflective of a more widely shared experience. In fact, some anecdotal occurrences are memorable precisely because they fall outside the normal realm of experience.

What about more informed opinions? As Douglas Adams noted, “All opinions are not equal. Some are a very great deal more robust, sophisticated and well supported in logic and argument than others.”3 Your uncle could proclaim drunkenly over the Christmas turkey, “I think we could reduce our prison population and crime would not go up much”. Because he is a stockbroker with no expertise in the criminal justice field, this would be an opinion. But if you hear this from a nationally recognized criminal justice expert on a respectable news channel, it might be an informed opinion. An informed opinion is an opinion grounded in knowledge of the available facts and carefully considered scientific principles. An informed opinion relies more on scientific evidence than just limited personal experience.

We would expect that the expert is voicing an informed opinion based on their knowledge surrounding crime and incarceration. This does not mean they are necessarily correct. But we might want to give their informed opinion more credence and consideration. If I develop a rash, my doctor may not know the cause; however, her informed opinion is more helpful than the random thoughts of my bartender friend Rob (though his knowledge of bourbon is substantial). I would trust that my doctor uses her training, education grounded in scientific evidence, and experience to rule out various possibilities, and then conducts tests to identify the cause of my plight.

If you are unsure whether someone is voicing an opinion or an informed opinion (with data and facts to support their case), ask “Why do you think that?”, or “Is there data to support that claim?” As a current or future expert in criminal justice, people may look to you for both data (facts) as well as your interpretation of what they mean (informed opinion), so it is good to be clear about the difference. It also places an onus on you to be knowledgeable in your area of expertise. I have some thoughts on the responsibilities of being a criminal justice expert in Box 3‑1.

BOX 3‑1 TEN RESPONSIBILITIES OF AN EXPERT

As you progress through your career, people may start to see you as an expert. I view ‘expert’ as a gift title—you can give it to someone else, but you should not claim it for yourself.
However, if others do award you this honor, here are some responsibilities and principles it should entail. As an expert you should:

1. Actively seek different perspectives (even if you disagree)
2. Embrace, and encourage others to adopt, scientific principles
3. Stay current with the latest scientific knowledge
4. Be open to the possibility of being wrong
5. Be cognizant of your potential biases
6. Always remember that facts can change
7. Pause and check before sharing information
8. Recognize the importance of lifelong learning
9. Encourage others to develop their expertise
10. Put your expertise to public good

DATA, INFORMATION, KNOWLEDGE, AND CONTEXT

Informed opinions stem from knowledge, but what is knowledge? The building blocks of knowledge are data and information, as well as the capacity and motivation to put them into context. As noted earlier, data are collections of facts and statistics compiled for reference or to aid analysis. They are the observations and measurements we make about topics such as crime or disorder. Data are formed from simple observations, unencumbered with additional meaning, inference, or opinion.4 Examples include crime reports, arrest databases, or spreadsheets of recruitment statistics.

Information is data infused with topic relevance and related meaning and context. Information can, however, be unstructured. For example, an intelligence analyst might learn that a prolific burglar was recently released from incarceration. That information is of little value until the analyst receives data showing burglaries in the vicinity of the burglar’s home address have suddenly increased. The context has given the information potential significance.

Context is the wider environment in which data and information reside, often represented by a knowledge of the broader subject area. If we have watched a sport for much of our life, we can appreciate it more than newcomers to the game because we understand the rules, the tactics, and the strategies involved. I now appreciate a good screen pass in American football because I have been watching the game since I moved to the US in 2001. But I do not really understand basketball, and when I watch highlights, I do not get what all the fuss is about. Context is an important aspect in the development of knowledge. Data and information are the core constituents of organizational evidence.

Once we have this foundation of data and information, and can place it all in context, we can progress to being more knowledgeable. Knowledge is data and information within a given context that has meaning and a particular interpretation, and reflects a deep theoretical and practical understanding of a subject. Knowledge is construed from evidence from organizations, stakeholders, and other sources. Knowledge is difficult to structure, store, or communicate, yet it is also immensely valuable. It is usually based on more than just experience because it suggests an appreciation of wider implications. Being able to recall a lot of data and information is also not the same as being knowledgeable. Winning pub quizzes does not translate to the theoretical and practical knowledge necessary to build an aircraft or design a crime prevention program.

When a person (or a discipline) possesses a strong foundation of knowledge on a policy-related subject (such as crime or policing), they can usually support their position with scientific evidence.
Scientific evidence (at least in the context of this book) is proof that can support a position or claim of effectiveness. Scientific evidence can be research-based, often stemming from scientific study and experiments. It can also be practice-based, derived from the measured experiences of police officers and people involved with the criminal justice system. Scientific evidence is usually drawn from the specific context in question. For example, hot spots policing strategies are informed by studies of patrol policing from around the world. However, lessons around mentorship and field education of new police officers can come from other disciplines such as medicine or education.

These related strands are visualized in Figure 3‑1, where the ongoing analysis of data and information on a problem constitutes much of the organizational evidence. Evidence from practitioners (professional evidence) and other parties affected by, or who can influence, the problem (stakeholder evidence) can also heighten our understanding of the issue. To this point, the knowledge base comprises sources connected to the context of the problem.

Figure 3‑1 The relationship between evidence and knowledge terms

This is all enhanced when thinkers in this area incorporate scientific evidence from both within the problem domain as well as outside. This is some of that ‘thinking outside the box’ you might have heard about. Scientific research spans the boundary between the specific context and the wider research domain (see scientific evidence in Figure 3‑1). For example, while there is (at the time of writing) little research on the effectiveness of implicit bias training within policing, it has been studied more extensively in healthcare.5 That research might be relevant. With a combination of the stakeholder, organizational, professional, and scientific evidence, we have the foundation for a strong knowledge base around the problem.

In an ideal scenario, this knowledge should inform and influence policy. As you can see from Figure 3‑1, policy can improve policing and the community not just within the context of the original problem, but also beyond that immediate context. For professionals working in the knowledge environment, the move from data and information to being knowledgeable is also the move from having an opinion to having an informed opinion (top of Figure 3‑1). As Gary Cordner has noted, evidence-based policing “represents the most logical and rational approach for a law enforcement agency to adopt as it considers, implements, evaluates, and refines its strategies of policing, whatever they are”.6 The starting point for a ‘logical and rational approach’ is to develop knowledge. There is no knowledge without being evidence-based, and as you will see, evidence can come in several forms. The next section explains four important components of reliable evidence.

THE FOUR COMPONENTS OF GOOD EVIDENCE

To move from an opinion to having an informed opinion requires accessing evidence and developing knowledge.
When assessing the evidence supporting a particular policy choice, decision-makers can judge what is presented to them against four criteria:

• A plausible mechanism
• Temporal causality
• Statistical association
• Rejection of competing explanations

First, without a plausible mechanism linking an intervention with an outcome, we can end up with correlations that make no sense. A plausible mechanism is a reasonable or persuasive process that links a cause to an effect. For example, many cities suffer increases in violence in the summer. Ice-cream sales also increase during the same period. Does ice cream consumption cause violence? One probable cause for the summer crime spike is that people spend more time outdoors in the nice weather drinking and interacting with other people, which can lead to violence. This plausible mechanism is more likely than jumping to the conclusion that sweet frozen treats turn people into homicidal maniacs.4

A second component of good evidence is temporal causality. Temporal causality means that any effect should occur during, or after, a related cause. Notwithstanding rare exceptions (called anticipatory effects), we would expect that the impact of something occurs during or after the initiative, and not before. For example, if we saw an increase in Black and minority police applicants before a new recruitment campaign, we do not have temporal causality. It would seem illogical that the campaign caused the increase in applications. But if we have a substantial increase in applicants after the advertising, then a claim of causality is more supportable.

Statistical association is also important because we want to ensure any differences or changes are not potentially the result of random fluctuations in data (i.e., chance). Statistical association helps us confirm a demonstrated relationship between a variable (a cause, such as a recruitment campaign) and the result of changing that variable (an effect, like an increase in applicants). It reduces the chance we fall prey to (and here is your word of the day) apophenia. This is the human tendency to perceive correlations between unrelated events. A statistical association can prevent us from putting too much weight on random chance connections.

Finally, we would hope that good researchers reject competing explanations for any change in an outcome. Rejecting competing explanations for an observed outcome means making a concerted and good faith effort to rule out other possible reasons why a study result occurred. I once watched a police commander claim success for a street violence prevention campaign without acknowledging that another reason could have been the unseasonably bad weather that kept people off the street. Drive-by shootings are notoriously difficult if the roads are blocked by two feet of snow. By failing to reject this competing explanation, his claims could not be taken seriously.

Competing explanations must be reasonable. Philosopher Bertrand Russell argued that just because nobody can prove that a china teapot does not revolve in space between Mars and Earth, we should not take such a claim seriously.7 Outlandish claims shift the burden of proof to the claimant. If, however, a competing explanation is reasonable and plausible, it is worth examining seriously.
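To make the statistical association test concrete, here is a minimal sketch in Python, reusing the hypothetical recruitment campaign above. The applicant counts are invented, and the 0.05 threshold is a widely used convention rather than a magic number.

```python
from scipy.stats import chi2_contingency  # requires scipy

# Invented counts (illustration only): applicants from the targeted
# group versus all other applicants, before and after the campaign.
observed = [
    [40, 160],  # before campaign
    [70, 150],  # after campaign
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")

# A small p-value suggests the shift is unlikely to be chance alone.
if p < 0.05:
    print("Change is unlikely to be random fluctuation alone.")
else:
    print("Change could plausibly be random fluctuation.")
```

Note that a statistical association only addresses chance. The other three criteria, a plausible mechanism, temporal causality, and the rejection of competing explanations, still have to be satisfied separately.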
These four tests serve as a handy check on the quality of scientific evidence. And while they do not guarantee what is presented is dependable, failing any of them is a strong indication you should be wary. Is there a plausible mechanism? Do the data suggest temporal causality? Is there statistical association suggesting more than a chance finding? And has an attempt been made to reject competing explanations for the results? A study that addresses these questions is often referred to as demonstrating good internal validity. In other words, it is more likely that a relationship between an innovation and an outcome is actually causal. I elaborate more on internal validity later.

WHAT ABOUT BAD SCIENCE?

It can be tough to locate good science that is thorough, self-critical, and often laborious. In contrast, bad science and science denialism can be found everywhere. How often have we seen misleading headlines, or scientific breakthroughs claimed on very small numbers of test subjects or one-off studies? While the promulgation of bad science can be the result of laziness or ignorance on the part of researchers or journalists, it can also be deliberately propagated through pseudoscience and science denialism. Pseudoscience is a collection of practices and beliefs that are mistakenly thought to be grounded in the scientific method. Pseudoscience can appear on the surface to mimic real science yet is essentially non-scientific. Frequently, pseudoscience claims are not falsifiable.

Science denialism is the use of rhetorical arguments to give the appearance of legitimate debate with the ultimate aim of rejecting a proposition on which a scientific consensus exists.8 There are five main approaches, many of which you may have seen:

1. Promotion of conspiracy theories
2. Use of fake experts
3. Selective cherry-picking of evidence
4. Creation of impossible expectations
5. Misrepresentation and logical fallacies

Here is an example of the first approach. A quarter of US adults think that it is ‘probably’ or ‘definitely true’ that powerful people intentionally planned the COVID-19 coronavirus outbreak, even though there is not a scrap of evidence to support this conspiracy theory.9 The second approach is just as problematic, given that—these days—fake experts seem to flourish. One example is the advisor to a previous US federal administration on reproductive health drugs, who advocated prayer and bible reading in the management of premenstrual syndromes.10 The third approach is equally common: when people are challenged with evidence with which they disagree, they will scrutinize it for the slightest defect to dismiss it, yet when offered the feeblest information that matches their desires, most people rush to embrace cherry-picked data.11

The fourth approach of science denialists is to set impossible expectations to derail an initiative. Predictive policing is one example, a tactic that “has been so hyped that the reality cannot live up to the hyperbole”.12 It has been argued that predictive policing specifically targets minority groups, but also that “when a group feels less favorable toward local police, they are less likely to report a crime they witness”.13 This logical fallacy would, however, result in a predictive policing algorithm being less likely to target an area due to reduced crime reporting in that neighborhood.
SUMMARY

When I first started teaching evidence-based policing, the lack of training around how to evaluate science and knowledge became apparent. This is a fundamental flaw in our school education. After all, social media bombards us with both science and pseudoscience, and it sometimes feels that nothing spreads more quickly than ludicrous conspiracy theories. You only need look at the warped discussions around the COVID-19 vaccines for examples. Being able to differentiate between good and bad science, between meticulous research and pseudoscience, and between facts and opinions, is essential for better decision-making, public policy, and our personal well-being.

Data and information are the building blocks of knowledge, and when we situate that knowledge within a decision-making context, we can improve the quality of public policy. A plethora of intellectual tools can detect and defeat pseudoscience and science denialism. These include seeking plausible mechanisms, temporal causality, and statistical association, and rejecting competing explanations. Whether bad scientific information is propagated through error or malice, the solution is the same—the promotion of better scientific education and practice.

And the cornerstone of scientific rigor is the scientific method. The scientific method of rigorous and careful research has resulted in us living longer, traveling farther, and increasing our understanding of the universe. You were briefly introduced to it in Chapter 1, and it is the subject of the next chapter.

REFERENCES

1. Skitka, L.J., Bauman, C.W. and Sargis, E.G. (2005) Moral conviction: Another contributor to attitude strength or something more? Journal of Personality and Social Psychology 88 (6), 895–917, p. 914.
2. Harrell, E. and Davis, E. (2020) Contacts Between Police and the Public, 2018—Statistical Tables, Bureau of Justice Statistics: Washington, DC, p. 14.
3. Adams, D. (2002) The Salmon of Doubt, Harmony Books: New York, p. 98.
4. Ratcliffe, J.H. (2016) Intelligence-Led Policing, Routledge: Abingdon, Oxon.
5. Forscher, P.S., Lai, C.K., Axt, J.R., Ebersole, C.R., Herman, M., Devine, P.G. and Nosek, B.A. (2019) A meta-analysis of procedures to change implicit measures. Journal of Personality and Social Psychology 117 (3), 522–559.
6. Cordner, G. (2020) Evidence-Based Policing in 45 Small Bytes, National Institute of Justice: Washington, DC, p. 8.
7. Russell, B. (1969) Dear Bertrand Russell: A Selection of His Correspondence with the General Public, 1950–1968, Allen & Unwin: London.
8. Diethelm, P. and McKee, M. (2009) Denialism: What is it and how should scientists respond? European Journal of Public Health 19 (1), 2–4.
9. Schaeffer, K. (2020) A Look at the Americans Who Believe There Is Some Truth to the Conspiracy Theory That COVID-19 Was Planned. www.pewresearch.org/fact-tank/2020/07/24/a-look-at-the-americans-who-believe-there-is-some-truth-to-the-conspiracy-theory-that-covid-19-was-planned/ (accessed 6 March 2021).
10. Mckee, M. and Novotny, T.E. (2004) Political interference in American science: Why Europe should be concerned about the actions of the Bush administration. European Journal of Public Health 13 (4), 289–291.
11. Russell, B. (1919) Proposed Roads to Freedom, Henry Holt and Company: London, p. 147.
12. Perry, W.L., McInnis, B., Price, C.C., Smith, S.C.
and Hollywood, J.S. (2013) Predictive Policing: The Role of Crime Forecasting in Law Enforcement Operations, RAND Corporation: Washington, DC, p. xix.
13. Richardson, R., Schultz, J. and Crawford, K. (2019) Dirty data, bad predictions: How civil rights violations impact police data, predictive policing systems, and justice. New York University Law Review 94, 192–233, p. 8.

4 WHAT IS THE SCIENTIFIC METHOD?

THE MEANINGS OF SCIENCE

The benefits of science to human achievement, longevity, and advancement are hard to overestimate. I fell over 300 feet while ice-climbing in the Scottish Highlands and am only here thanks to the skill of multiple surgical teams and the efficacy of medical science. The processes of systematic observation, experimentation, logical reasoning, and the formation and testing of hypotheses have been central to reducing suffering, promoting better health, and increasing our longevity.

The word ‘science’ has many meanings. It can represent the intellectual activity of the pursuit of knowledge, the body of knowledge itself (such as ‘crime science’), and the application of a systematic methodology. This latter application epitomizes how science is ‘done’ and is centered on the scientific method. The scientific method is a procedure for the systematic observation and measurement of studies or experiments designed to test research hypotheses. Arguably, one strength of the scientific method is that it restrains a natural tendency to rush to conclusions and skip the analytical stages. In press conferences and on Twitter, too frequently we can observe leaders grasp at any passing notion to explain a new phenomenon. At that point, as Thomas Chamberlin wrote in 1897, “Interpretation leaves its proper place at the end of the intellectual process and rushes to the forefront. Too often a theory is promptly born and evidence hunted up to fit in afterwards”.1

I outline one version of the scientific method in this chapter, but bear in mind other researchers and instructors might articulate different variations. It is also useful to appreciate that we do not have to rigidly follow the scientific method to advance knowledge. If your car is running low on fuel, a thorough research study is not required to determine it is time to stop by a gas station. Also, the scientific method is not a rigid determination of which techniques are more appropriate than others. As I explain later in the book, some research methods are considered more useful for estimating policy effects, but do not take that to mean researchers using other approaches are not ‘doing science’. A final note of caution: the realities of scientific practice are rather messy and continually evolving. They can differ considerably between academic disciplines, research areas, and even between scientists.

For this chapter, let us take a quick spin around the scientific method. Later chapters will go into each section in more detail.

THE SCIENTIFIC METHOD

Identify a specific problem

Exposure to the ideas around evidence-based policing can start people on the path to becoming more hesitant about blithely accepting the prevailing wisdom of the profession.
This opens the door to interesting questions about policing and crime control policy, and the development of a research attitude. A research attitude is a desire to advance policing by exploring different approaches, asking questions, evaluating data and information, challenging tradition, and being open to new ideas and the merits of testing different experiences. It involves being curious about the policing world.

Armed with a ‘culture of curiosity’, officers can identify specific problems through observation and research. Police officers are often embedded in areas that have many challenging problems, problems that they observe every day. These can include questions about data management, recruitment, tactics, equipment, documentation, community relations, leadership, or public relations. Students of crime and policing can make use of the wealth of academic research that has increased over the last couple of decades and seek out new research opportunities. The starting point at the top of Figure 4‑1 is to identify a specific research problem. This is discussed in more depth in the next chapter.

Conduct background research

Background research involves exploring the topic both internally and externally to learn more about the subject. First explore internal organizational data. For a crime example, are gunpoint robberies concentrated in one or two areas, or at specific times? Are particular victims targeted? For a non-crime example, are there disparities in the demographics of police department recruits? Once you have a good grasp of the problem, you can then explore the research literature and expand your understanding of the issue.

Figure 4‑1 The scientific method

Exploring the topic externally means reaching outside the organization to examine what external knowledge exists. In the last thirty years, there has been an exponential rise in scholarship and research around policing problems, and some of the questions that police leaders often ask are already well researched. Chapter 6 identifies some of the useful websites for your own research, and Chapter 7 explains how to interpret what you find.

Develop a hypothesis and research question

A hypothesis is not a general question about the area, but a clear, falsifiable statement about what you think might be going on. Being falsifiable means that the idea can be reasonably tested and found either true or false. “There is a china teapot in space somewhere between Earth and Mars” is not detectable with the technology currently in existence and so is not falsifiable. However, the hypotheses that “increasing foot patrols in a high crime area would reduce robberies”, or that “conducting recruitment drives at Historically Black Colleges and Universities would increase the number of Black police applicants”, are reasonable and falsifiable. We could test them both. Once you have a hypothesis, you can expand to make some sort of prediction and research question.
For example, if your town has a robbery problem, your research might lead you to discover that during the Philadelphia Foot Patrol Experiment, foot beat officers reduced violence by 23 percent.2–3 You might reasonably hypothesize that your town could experience a similar reduction, and this leads to your research question: ‘Would the introduction of foot patrol in high crime areas of our town reduce violence by at least 20 percent?’ Chapter 8 goes into hypotheses and research questions in more detail.

Undertake a study or experiment

Armed with a hypothesis that incorporates your prediction of what could happen, and a research question to address, you now implement a formal study. Different types of research and experiments are explained later in the book (Chapters 9 to 11), but for now this is where you get to the real work. Deploy foot patrol officers, or initiate recruitment drives in colleges, or whatever you have hypothesized will address your problem. You should carefully track not just the results, but also the activities. I have learned that if you are not successful in reducing a problem, nobody takes interest in what you did. But if you managed to make the community safer, everyone wants to know exactly what you did.

Policing is subject to a high level of public scrutiny, and it is important that evidence-based policing is conducted with the highest level of research integrity. This is especially important in the stages of undertaking a study and analyzing the results. Box 4‑1 outlines some key ethical considerations when conducting research.

Analyze results and draw conclusions

Once the study period has finished, it is important to not draw premature conclusions, but instead analyze the data with an open mind. Determine if the results of the study answer your research question, and whether they support or refute your hypothesis. Some police officers can be intimidated by this stage and worry that they need a statistical background. That may be true if you are conducting a ‘bleeding-edge’ scientific study that aims for the highest levels of statistical precision; however, many times there are straightforward tools that can help. Chapters 12 and 13 discuss some relatively simple approaches.

If your hypothesis is not supported, you may want to return to an earlier stage and refine your hypothesis. Do not, however, forget that negative results can be a win (see Box 4‑1 for why there are also ethical reasons to report negative or null findings). It can be a huge time and money saving for policing to stop doing something that does not work. Negative results can also generate potential new questions for you and others to explore.

BOX 4‑1 ETHICAL CONSIDERATIONS IN POLICING RESEARCH

“Ethical behaviour helps protect individuals, communities and environments, and offers the potential to increase the sum of good in the world”.4 The US National Institutes of Health outlines seven main principles for ethical research.5 Research ethics are the values, norms and institutional arrangements that regulate and guide scientific work, based on the general ethics of morality in science and the community.6 They are broadly applicable to policing research and are adapted here for evidence-based policing.

• Social value: Research studies should address questions that are important enough to justify any risk or inconvenience to people.
• Scientific validity: The study should be designed and conducted in a way that maximizes the likelihood of answering the question, with feasible and appropriate research methods. To do otherwise would be to waste time and resources, and potentially cause harm.
• Subject selection: The main criteria for selecting research subjects or areas should be to answer the research question thoroughly, and participating people and areas should be able to enjoy any potential benefits from the research, such as improved public safety.
• Favorable risk-benefit ratio: Researchers should seek to maximize the research benefits and minimize risk and inconveniences.
• Independent review: To ensure a prospective study is ethically acceptable, independent review panels can assess whether researchers are sufficiently free of bias and have addressed some of the questions raised in this box. Research conducted through universities, for example, should be approved by an institutional review board.
• Informed consent: Individuals who volunteer to participate in studies should get accurate and complete information about risks and rewards so they can make an informed decision about participation.
• Respect for participants: Researchers have a duty to respect the decisions of people who choose to participate or decline, their information should be kept confidential, and they should be told what the research discovered.

To these elements, we should also add:

• Research honesty: It is important to fully report findings and not just those that agree with your hypothesis. This includes reporting any important caveats and study limitations.
• Sensitivity: Some individuals and groups have been disproportionately harmed by the criminal justice system. Be sensitive to their concerns about policing. Conduct research and report findings and conclusions in a way that recognizes others may have a different interpretation.

Peer review and publish findings

Like asking whether a tree falling in the woods makes a sound, if a study was never published or circulated, did it really happen? Whether results of a study are positive or not, publication is vital to improving public safety and advancing good practice in policing. Readers can add to their knowledge of what works and what does not, and systematic reviews can use the study to help add to the preponderance of evidence (see next section for what I mean by this).

Peer review involves submitting your research findings or ideas to review and scrutiny by other experts. In academia, research findings are generally given the greatest credence and weight if they have been published in journals that conduct double-blind peer reviews. Double-blind means that the reviewers do not know who authored the study, and the authors do not learn who reviewed their work. In this way, the reviewers are not biased, intimidated, or influenced by the status or prestige of the research team. Peer reviews frequently discover inadvertent mistakes in research methods or statistics or make other suggestions to improve a study.

The system is not perfect. It requires diligence on the part of the reviewers. They must know the existing research in the field, be current on the latest statistical and methodological techniques, and have the time to provide constructive and useful feedback. While I regularly undertake reviews—because contributing to the growth of scientific knowledge in crime research is important—I am also selective.
I always scrutinize the study title and abstract forwarded by the journal editors to confirm I know enough about the area before committing to do what I hope is a thorough and helpful review.

The feedback process can extend beyond journal publication. Box 4‑2 recounts one example of where the scientific process worked to correct an error. I discuss different ways to publish research in Chapter 14.

BOX 4‑2 SYSTEMATIC REVIEWS AND CRIME RESEARCH AS A SELF-CORRECTING SCIENCE

It has long been known that offenders commit less crime as they get farther from home—a ‘distance decay’ in crime. One theory has argued that in the immediate area around their house, there is also a local area where offenders tend to reduce criminal activity. In this buffer zone, offenders are relatively less likely to offend due to fewer opportunities and in case they are recognized by neighbors. To explore this hypothesis, two researchers from the Netherlands undertook a systematic review in 2020.7 This type of ‘study of studies’ addresses a specific question in a systematic and reproducible way, by identifying and appraising all of the literature around the topic area.8 Research reports are gathered and included in the study if they meet certain criteria for inclusion. Systematic reviews are commonly conducted in areas for which there is a large body of research, such as the buffer zone hypothesis. From 33 studies, just 11 confirmed the buffer zone hypothesis.

After questions were raised about the study, the authors made the list of articles available, and one external reviewer conducted a further examination. The reviewer, Kim Rossmo, found that several of the publications the Dutch researchers used did not meet their inclusion criteria. The Dutch team subsequently retracted their study.9 No science is perfect, and errors happen. This is the case even with experienced researchers, such as the Dutch experts, to whom credit should go for retracting the article. This example also demonstrates how social science, while rarely flawless, can use review and feedback as a self-correcting mechanism. Progress occurs when researchers work together like this to collaborate, discuss—and sometimes—disagree. It could have been worse. A more egregious case within criminology has involved the suggestion of serious irregularities, a university investigation, and involvement of the police.10
I have seen this in dozens of Compstat meetings across numerous agencies. Senior managers become fixated with one or two events in a new location while ignoring the mass of dots clustered in the same place month after month. It is similarly easy to read a new innovative study of crime prevention and be tempted to believe it tells the entire story. The cold fusion saga (see Box 4‑3) has become a cautionary lesson in the importance of replication. Inconsistent results cast doubt on the validity of any conclusions, and if nobody can reproduce your results (as with the cold fusion debacle) then there is clear indication of a problem. BOX 4‑3 COLD FUSION AND THE IMPORTANCE OF REPLICATION Fusion is the process by which two atoms fuse together into a single heavier atom. Under the right conditions, this releases energy, and has long been considered a future power source for humanity. The W hat is the scientific method ? 57 problem is you need a lot of initial energy to make fusion happen. Fusion powers the sun, for example, but only because the sun’s core temperature is estimated to be 27 million degrees Fahrenheit (15 million °C). Therefore, when electrochemists Stanley Pons and Martin Fleischmann held a press conference in 1989 to claim they had produced ‘cold fusion’ at room temperature, the scientific community marveled at the possibilities. Pons and Fleischmann had taken pieces of palladium and platinum, submerged them in heavy water, and triggered electrolysis by adding an electrical charge. Unfortunately, the scientific community could not replicate the result. Every reasonable attempt to duplicate the work of Pons and Fleischmann did not produce the necessary cold fusion neutrons. The two scientists had not correctly calculated the size of the forces acting in proximity to the palladium, and fusion was not theoretically possible. It is not entirely clear what Pons and Fleischmann measured, but either fraud or simple equipment error have been raised as possibilities.12 Social science research involves human behavior, and humans do not always behave in predicable and consistent ways. As a result, rather than putting too much faith in a single report, smart researchers look across a range of studies and consider the ‘preponderance of evidence’. This term has its origins in the language of civil law13 but I find it useful when discussing social science research. The preponderance of evidence is an indication of whether the weight of evidence would leave one to conclude that a fact or theory is more likely than not. Let us look at an example. Figure 4‑2 shows the results of over 30 studies exploring the effectiveness of problem-oriented policing on various outcomes. It is adapted from a systematic review conducted by Josh Hinkle and colleagues14 and shows a value similar to odds ratios, except that they are standardized across the studies. Do not worry about the statistical numbers for now and instead look at the graph (it is called a forest plot). The blue squares show the mean differences between the areas or individuals that received problem-oriented policing approaches 58 W hat is the scientific method ? Figure 4‑2 Effect sizes for problem-oriented policing studies Adapted with permission from Hinkle, J.C., Weisburd, D., Telep, C.W., Petersen, K. (2020). Problem-oriented policing for reducing crime and disorder: An updated systematic review and meta-analysis. Campbell System Review 16, e1089. https://doi.org/10.1002/cl2.1089. W hat is the scientific method ? 
It can be interpreted similarly to an odds ratio, except that values to the right of the solid black 0.00 line suggest that problem-oriented policing was effective.* Most—but not all—of the studies showed a benefit from problem-oriented policing. This is confirmed by the blue ‘summary’ diamond at the bottom, indicating a beneficial average effect across all the studies. It shows a positive value of 0.183. You can also see the numerous positive values in blue on the right. Some studies are close to the 0.0 line, and three studies reported negative effects. These can be seen below the horizontal dashed line I added. Overall, however, the preponderance of evidence would clearly confirm that problem-oriented policing is an effective crime reduction strategy. The preponderance of evidence does not demand that every study show positive results; rather, it suggests that in general, a police department is more likely to be successful if it employs the approach. While we should still remain cautious of this inductive approach to scholarly research,15 interpreting a preponderance of evidence can be a useful option when police leaders have to make a decision, regardless of the state of the scholarly research.

* With odds ratios, values greater than 1.0 (rather than 0.0 as in Figure 4‑2) suggest treatment effectiveness.

Not only is it important to replicate interesting research; it is also important to expand our understanding of new findings. While my colleagues and I found that foot patrols in crime hot spots reduced violence3 and focusing marked police cars in predicted grids reduced property crime,16 those studies generated as many new questions as they answered. For example, would the foot patrols be as effective if they focused on community contacts? Or on field investigations of suspicious persons? Or with fewer officers? With predictive policing efforts, how long should officers spend in the predicted grids? And what should they do when they are there? These questions were not specifically addressed by the existing research, but you can imagine answering them might be very useful to police leaders. Good research frequently generates more interesting questions.

SUMMARY

Like checklists for pilots and surgeons, having an established process for doing complicated work can improve its quality. Scientific discovery is no different, involving the following generalized stages:

1. Identify a specific problem
2. Conduct background research
3. Develop a hypothesis and research question
4. Undertake a study or experiment
5. Analyze results and draw conclusions
6. Peer review and publish findings
7. Replicate or expand the research

Science can advance in many ways, but adherence to these steps maximizes the chances that the researcher will be effective and limits the possibility of error. Following these steps and adhering to the fundamental philosophy of conducting honest, open research places a moral obligation on the researcher. As the preceding chapter showed, there is enough pseudoscience and science denialism around for us to be cautious. But when science is conducted honestly and diligently, it can advance the quality of human life.
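For readers curious about the arithmetic behind a summary value like the 0.183 diamond in Figure 4‑2, the sketch below shows the general idea: each study’s effect size is weighted by the inverse of its variance, so more precise studies count for more. The effect sizes and standard errors here are invented for illustration; they are not the Hinkle et al. data, and real meta-analyses involve further steps, such as random-effects models.

```python
# A minimal sketch of a fixed-effect meta-analytic summary.
# The effect sizes (standardized mean differences) and standard
# errors below are hypothetical, not the Hinkle et al. data.
import math

studies = [
    # (effect size, standard error)
    (0.45, 0.20),
    (0.10, 0.15),
    (-0.05, 0.25),  # one study with a negative effect
    (0.30, 0.10),
]

# Weight each study by the inverse of its variance.
weights = [1 / (se ** 2) for _, se in studies]
summary = sum(w * es for (es, _), w in zip(studies, weights)) / sum(weights)
summary_se = math.sqrt(1 / sum(weights))

print(f"Summary effect: {summary:.3f} (SE {summary_se:.3f})")
# A 95% confidence interval spans roughly 1.96 standard errors.
print(f"95% CI: {summary - 1.96 * summary_se:.3f} "
      f"to {summary + 1.96 * summary_se:.3f}")
```

If the resulting interval sits entirely to the right of zero, the pooled studies favor a beneficial effect, which is what the summary diamond in a forest plot conveys visually.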
The cornerstone of scientific progress—the scientific method—remains as vital today as when Francis Bacon articulated its principles in the 17th century.17 The next chapters expand on the steps of the scientific method in more detail.

REFERENCES

1. Chamberlin, T.C. (1897) The method of multiple working hypotheses. Journal of Geology 5 (8), 837–848, p. 838.
2. Ratcliffe, J.H. and Sorg, E.T. (2017) Foot Patrol: Rethinking the Cornerstone of Policing, Springer (CriminologyBriefs): New York.
3. Ratcliffe, J.H., Taniguchi, T., Groff, E.R. and Wood, J.D. (2011) The Philadelphia foot patrol experiment: A randomized controlled trial of police patrol effectiveness in violent crime hotspots. Criminology 49 (3), 795–831.
4. Israel, M. and Hay, I. (2006) Research Ethics for Social Scientists, Sage: London.
5. National Institutes of Health (2022) Guiding Principles for Ethical Research. www.nih.gov/health-information/nih-clinical-research-trials-you/guiding-principles-ethical-research (accessed May 2022).
6. Adapted from Norwegian National Research Ethics Committees (2019) Guidelines for Research Ethics in the Social Sciences, Humanities, Law and Theology. www.forskningsetikk.no/en/guidelines/social-sciences-humanities-law-and-theology/guidelines-for-research-ethics-in-the-social-sciences-humanities-law-and-theology/ (accessed February 2022).
7. Bernasco, W. and van Dijke, R. (2020) Do offenders avoid offending near home? A systematic review of the buffer zone hypothesis. Crime Science 9 (8).
8. Neyroud, P. (2019) Systematic reviews: “Better evidence for a better world”. In Evidence Based Policing: An Introduction (Mitchell, R.J. and Huey, L. eds), Policy Press: Bristol, pp. 103–116.
9. Bernasco, W. and van Dijke, R. (2021) Retraction note to: Do offenders avoid offending near home? A systematic review of the buffer zone hypothesis. Crime Science 10 (8), 1.
10. Bartlett, T. (2019) The criminologist accused of cooking the books. The Chronicle of Higher Education: Washington, DC, 24 September.
11. Nicholl, J. (2009) Task definition. In Strategic Thinking in Criminal Intelligence (Ratcliffe, J.H. ed), Federation Press: Sydney, 2nd ed, pp. 66–84.
12. Taubes, G. (1993) Bad Science: The Short Life and Weird Times of Cold Fusion, Random House: London.
13. Weiss, C. (2003) Expressing scientific uncertainty. Law, Probability and Risk 2 (1), 25–46.
14. Hinkle, J.C., Weisburd, D., Telep, C.W. and Petersen, K. (2020) Problem-oriented policing for reducing crime and disorder: An updated systematic review and meta-analysis. Campbell Systematic Reviews 16, e1089, 1–86.
15. Eck, J.E. (2017) Some solutions to the evidence-based crime prevention problem. In Advances in Evidence-Based Policing (Knuttson, J. and Tompson, L. eds), Routledge: London, pp. 45–63.
16. Ratcliffe, J.H., Taylor, R.B., Askey, A.P., Thomas, K., Grasso, J., Bethel, K., . . . Koehnlein, J. (2021) The Philadelphia predictive policing experiment. Journal of Experimental Criminology 17 (1), 15–41.
17. Bacon, F. (1676) The Novum Organum of Sir Francis Bacon, Baron of Verulam, Viscount St. Albans, Epitomiz’d, for a Clearer Understanding of His Natural History, Translated and Taken Out of the Latine by M.D. Thomas Lee: London, p. 5.

5 HOW DO YOU IDENTIFY A SPECIFIC PROBLEM?

DEVELOP A RESEARCH ATTITUDE

Spend enough time with police officers and you soon identify the ‘expert’ group. These are the cops who ‘know’ how to fix the crime problem.
Regardless of the issue, they immediately propose a solution with unfaltering confidence. Their pronouncement sometimes starts with the phrase “in my experience . . .” This can be followed by a lament that if it were not for politicians, prosecutors, or senior officers holding them back, they could magically bring tranquility to the community. Having been a police officer, I agree a good hunch is often valuable—just reread the opening story in this book. Given that many areas of policing and crime control policy still lack a strong foundation for good practice, a good hunch may be all there is on which to base a decision. Furthermore, in fast-moving and dynamic operational situations on the street, decisiveness and a ‘task-not-ask’ approach to management can be essential. But when time allows, and policies and strategies are being determined, overconfidence in hunches can be damaging to good policing. Appearing decisive is not a replacement for a little well-placed doubt, and confidence is only weakly associated with actual ability.1 Like General Custer boldly yelling “Charge!” at the Battle of Little Bighorn, confidence does not always win the day.*

* It is also worth noting that General George Armstrong Custer graduated from West Point Military Academy . . . at the bottom of his class.

If you are a police officer reading this, you might recognize this trait in one or two of your colleagues. Cops with this mindset lack an essential ingredient for an evidence-based approach: a research attitude. A research attitude is a desire to advance policing by exploring different approaches, asking questions, evaluating information, challenging tradition, and being open to new ideas and the merits of different people’s experiences.

A research attitude requires embracing uncertainty and doubt, and many police officers find this challenging. It asks them to be skeptical about the limits of their own knowledge and practice, to be comfortable with not having all the answers, and to try new ways of policing. In a job that stresses decisiveness and experience, doubt—and a recognition of the limits of one person’s knowledge—seems the antithesis of good policing. When frontline officers are bouncing from one radio call to the next, who has time for the luxury of doubt?

I am not the only person to have noticed there are only two things coppers hate: how we are doing policing right now, and change. Chesterton’s fence, a heuristic that cautions against rushing to change something until the purpose of the thing is understood, reminds us to exercise restraint.2 Furthermore, change carries risk, as was understood hundreds of years ago by Niccolò Machiavelli when he commented:3

there is nothing more difficult to carry out, nor more doubtful of success, nor more difficult to handle, than to initiate a new order of things. For the reformer has enemies in all those who profit by the old order, and only lukewarm defenders in all those who would profit by the new order.

And yet, officers who have embraced a research attitude move policing forward in myriad ways.
None of the classic policing studies touched on in this chapter would have been possible without the support of leadership and the frontline officers involved in design, implementation, and data collection. Pracademics—police officers with not only professional expertise but also the academic training to examine research problems—are now central to many policing experiments. In this chapter, I will explore how researchers and pracademics develop the first part of the research process: identifying a problem (Figure 5‑1).

Figure 5‑1 Identifying a problem as part of the scientific method

IDENTIFY A PROBLEM THROUGH OBSERVATION

The shift to a research attitude is a shift in mindset. It requires reimagining the business of policing, often by actively observing your surroundings. Scanning your environment for good questions means challenging the status quo and not accepting the “how we’ve always done it” approach. It might mean simply asking, “Is there a better way to do that?” As Isaac Asimov is quoted as saying, the classic phrase that precedes some new discovery in science is not “Eureka!” but more like “Hey, wait a minute”.4 If you are in policing, you might wonder whether there is a better way to do morning rollcall. Could a change in advertising strategy improve recruitment of female police officers? Are officers patrolling on their own more proactive than those patrolling with a colleague? Just a few of the topics that researchers and officers have explored include:

• The benefits of community support officers5
• Hot spots policing6
• Foot patrol7
• Implicit bias training8
• Offender-focused proactive strategies9
• Public satisfaction with police in violent crime hot spots10
• Problem-oriented policing11
• Community policing12
• Focused deterrence13
• Body-worn cameras14–15
• Public perception of police officers in different types of uniform16
• 10–15-minute proactive patrols in crime hot spots17
• Patrols using predictive policing18
• The work of crime analysts19–20
• Saturation patrols21
• Policing drug hot spots22
• Acoustic gunshot detection systems23
• Crisis intervention and de-escalation training24
• Procedural justice at traffic stops25

These are but a handful of thousands of policing studies that have explored how we police our communities (you will learn where to access these studies in the next chapter). Most of these topics need more study, and there are countless other areas for which little or no research exists. Being up to date with current affairs and the news can also spark interesting research topics. But for practitioners, just looking around and doing the job can be enough to ignite ideas and curiosity. If you are police staff, listen to conversations around the workplace. Speak to people involved in policy and find out where they perceive your department’s strengths and weaknesses. If you are not in policing, you can still talk to officers in the street, or when they come as guests to a class. Perhaps, like Sir Alexander Fleming—the microbiologist who discovered penicillin—you will notice something unusual in the workplace and remark, “That’s funny”.26

IDENTIFY A PROBLEM THROUGH RESEARCH

For students of policing, the academic literature can be a source of questions. For this, there really is no replacement for reading regularly (and I can hear my students groaning while reading this).
Research into policing is relatively recent, and a range of new publications covering tactics, strategies, attitudes, technologies, crimes, or problems can all spur ideas. The discussion and conclusion sections of academic articles sometimes identify gaps that need to be filled, or indicate what the research team plans to do next, but that does not preclude anyone else from pursuing the idea. The more, the merrier!

The grey literature can be another source of motivation. Grey literature is research that is published in non-commercial form, such as from industry or government, and includes internal reports, criminal justice data, working papers, or policies. Professional publications and websites can be sources for new study ideas or for replication of an existing approach. Examples include Police Chief magazine in the US, Police Science (the journal of the Australian and New Zealand Society of Evidence-Based Policing), or, in the UK, Policing Insight and the Going Equipped publications.

Finally, greater engagement with the profession can improve understanding of where policing is going. As an addition to reading (no, you cannot skip the reading), one popular approach is to use social media to follow interesting people. There is a risk here, as social media can be a massive time suck and an anti-intellectual rabbit hole if you are not careful. But if you are diligent about whom you follow, police leaders (of any rank), researchers, policy makers, and people at think tanks can be insightful and interesting. Through social media you can engage with them and learn where their perspectives differ from your own, because it is good to have our ideas challenged. Importantly, they can point you in the direction of other people, or to blogs, podcasts, and other outlets where the latest ideas are frequently circulated before they ever appear in print. I have often been told that the guests on my podcast, Reducing Crime, have been a source of new ideas for listeners from across the policing world.†

† A shameless plug, I know, but the Reducing Crime podcast features monthly interviews with influential thinkers in the police service and leading crime and policing researchers. It is available from most podcast providers, including Apple Podcasts, Spotify, iHeartRadio, Stitcher, Overcast, and SoundCloud.

EMBRACE PRACTICALITY

The more you engage with the politics around policing, the more you encounter policy suggestions that are more aspirational than practical. It can seem wonderfully aspirational to suggest that we can solve all of the problems with policing if we just end poverty.27 After all, does anybody not want to end poverty? But is this feasible within the constraints of any government budget, let alone the police mandate? And can it be achieved at a reasonable cost, and in a reasonable timeframe, to resolve a local crime problem for the community?

This is not to argue that if you are looking for a problem to study, you should already have selected a feasible solution. That is getting ahead of the scientific process. And we should not shrink from considering aspirational and ambitious goals within policing and the wider criminal justice system. It is, however, an argument for embracing practicality and feasibility in the selection of a topic. There is no shortage of current areas where the application of law enforcement or the work of the police can improve.
Bear in mind that medical researchers take an experimental approach to seemingly tiny details of their work, such as the effects of minuscule differences in lancets used to draw a blood sample from a finger.28 If there are benefits to the community, no area should be immune to scrutiny.

For my part, I try to stay practically engaged with policing by going on ride-alongs, walking patrols, or otherwise spending time with officers from agencies I work with on projects. This gets me out of the university and engaged with frontline policing, a nice dose of reality after spending time in higher-level policy meetings. A ride-along or other practical experience can provide significant opportunities to identify research topics.

USE THE CHEERS FRAMEWORK

One starting place can be choosing a topic that is an existing priority for police leaders and the community. Recurring problems, and especially ones that are getting worse, can concentrate the attention of policy makers and help promote your project. This may be why patrol policing continues to be the focus of so much experimentation in police research.

The CHEERS framework from Ron Clarke and John Eck (Box 5‑1) is a tool to help identify a community problem that police are expected to address.29 For example, with gunpoint robberies in a police district, the neighborhood residents comprise the affected community, the loss of property and potential loss of life is clearly harmful, there is a clear expectation that police participate in addressing the issue, there are measurable events that are recurring, and gunpoint robberies in the street would be considered similar, at least initially, unless analysis subsequently suggested otherwise.

BOX 5‑1 CHEERS ELEMENTS FOR IDENTIFYING A PROBLEM

Community: The problems are experienced by some members of the public, such as individuals, businesses, government agencies, or other groups.

Harmful: The problem should not just be against the law; it must cause harm that can involve property loss or damage, injury or death, or serious mental anguish.

Expectation: It must be evident that some members of the public expect the police to do something about the causes of the harm or participate in the response.

Events: There must be more than one event associated with the problem. These might be break-ins to cars, shootings outside a notorious bar, or loud motorcycles disturbing citizens.

Recurring: Having more than one event implies that events must recur. They may be symptoms of a spike or a chronic problem, but the key is that there is an expectation of continuation.

Similarity: The events must be similar or related. They may all be committed by the same person, happen to the same type of victim, occur in the same types of locations, take place in similar circumstances, involve the same type of weapon, or have one or more other factors in common.

You could also identify a problem that retains only some elements of the CHEERS framework. For example, a paucity of minority recruitment to the police department might not be considered harmful within the definition in Box 5‑1, yet it likely harms police legitimacy in the eyes of the community. The problem does share many of the other CHEERS characteristics. In many departments it is a recurring problem, with negative consequences, measurable data points (each person recruited could be an event), and an expectation that police improve the situation.
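To make the elements concrete, here is a minimal sketch of how the CHEERS elements might be recorded as a simple screening checklist. The function and the gunpoint robbery judgments are hypothetical illustrations of my own; Clarke and Eck’s framework is a thinking tool, not a piece of software.

```python
# A minimal checklist sketch for the CHEERS elements in Box 5-1.
# The example problem and the True/False judgments are hypothetical.
CHEERS = ["Community", "Harmful", "Expectation",
          "Events", "Recurring", "Similarity"]

def screen_problem(name, judgments):
    """Report which CHEERS elements a candidate problem satisfies."""
    missing = [e for e in CHEERS if not judgments.get(e, False)]
    print(f"{name}: {len(CHEERS) - len(missing)}/6 elements met")
    if missing:
        print("  Worth re-examining:", ", ".join(missing))

screen_problem("Gunpoint robberies in the district", {
    "Community": True,    # neighborhood residents are affected
    "Harmful": True,      # property loss and potential loss of life
    "Expectation": True,  # public expects a police response
    "Events": True,       # multiple recorded robberies
    "Recurring": True,    # the pattern persists month after month
    "Similarity": True,   # street robberies involving a firearm
})
```

Run against the minority recruitment example above, such a checklist would flag the Harmful element for re-examination, which is exactly the kind of discussion the framework is meant to provoke.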
SUMMARY

I am frequently surprised when I encounter officers who complain about how policing is being done, yet express no interest in figuring out how to do it better. These officers lack a research attitude, sometimes an indication that their department does not have a ‘culture of curiosity’. Embracing a research attitude means being comfortable with uncertainty and doubt. This can be hugely challenging, not just in policing, but for people generally. As has been recently noted, “In contemporary American culture, the three little words in shortest supply may not be ‘I love you,’ but ‘I don’t know’”.30 Yet when police officers become more at ease with uncertainty, the insights this brings help to advance policing.

Problems can be identified through observation in the field and through knowledge and surveillance of the scholarly literature. The CHEERS framework highlights the need to address problems that are harmful to a community and where there is an expectation that police can be a part of the solution. And reading about the area generally will help you focus your question. The key to using previous research as a guide to your work is to emphasize reliable sources, as I discuss in the next chapter. Finally, it is important to keep an eye on the potential practical benefits. This will help you, as a researcher, maintain currency and value to the policing profession.

REFERENCES

1. Freund, P.A. and Kasten, N. (2012) How smart do you think you are? A meta-analysis on the validity of self-estimates of cognitive ability. Psychological Bulletin 138 (2), 296–321.
2. Chesterton, G.K. (1990) The thing. In The Collected Works of G.K. Chesterton (Chesterton, G.K. ed), Ignatius Press: San Francisco, vol. 3.
3. Machiavelli, N. (1532 [1903]) The Prince, Grant Richards: London, p. 22.
4. Burnham, R. (1997) Comet Hale-Bopp: Find and Enjoy the Great Comet, Cambridge University Press: Cambridge, p. 53.
5. Ariel, B., Weinborn, C. and Sherman, L.W. (2016) “Soft” policing at hot spots—do police community support officers work? A randomized controlled trial. Journal of Experimental Criminology 12 (3), 277–317.
6. Braga, A.A. and Weisburd, D.L. (2022) Does hot spots policing have meaningful impacts on crime? Findings from an alternative approach to estimating effect sizes from place-based program evaluations. Journal of Quantitative Criminology 38 (1), 1–22.
7. Ratcliffe, J.H. and Sorg, E.T. (2017) Foot Patrol: Rethinking the Cornerstone of Policing, Springer (CriminologyBriefs): New York.
8. Worden, R.E., McLean, S.J., Engel, R.S., Cochran, H., Corsaro, N., Reynolds, D., . . . Isaza, G.T. (2020) The Impacts of Implicit Bias Awareness Training in the NYPD, The John F. Finn Institute: Albany, NY, p. 188.
9. Groff, E.R., Ratcliffe, J.H., Haberman, C., Sorg, E., Joyce, N. and Taylor, R.B. (2015) Does what police do at hot spots matter? The Philadelphia policing tactics experiment. Criminology 51 (1), 23–53.
10. Haberman, C.P., Groff, E.R., Ratcliffe, J.H. and Sorg, E.T. (2016) Satisfaction with police in violent crime hot spots: Using community surveys as a guide for selecting hot spots policing tactics. Crime and Delinquency 62 (4), 525–557.
11. Hinkle, J.C., Weisburd, D., Telep, C.W. and Petersen, K. (2020) Problem-oriented policing for reducing crime and disorder: An updated systematic review and meta-analysis. Campbell Systematic Reviews 16, e1089, 1–86.
12. Gill, C., Weisburd, D., Telep, C.W., Vitter, Z. and Bennett, T. (2014) Community-oriented policing to reduce crime, disorder and fear and increase satisfaction and legitimacy among citizens: A systematic review. Journal of Experimental Criminology 10 (4), 399–428.
13. Braga, A.A. and Weisburd, D.L. (2012) The Effects of “Pulling Levers” Focused Deterrence Strategies on Crime, Campbell Collaboration: Oslo, Norway.
14. Gaub, J.E., Choate, D.E., Todak, N., Katz, C.M. and White, M.D. (2016) Officer perceptions of body-worn cameras before and after deployment: A study of three departments. Police Quarterly 19 (3), 275–302.
15. Lum, C., Koper, C.S., Wilson, D.B., Stoltz, M., Goodier, M., Eggins, E., . . . Mazerolle, L. (2020) Body-worn cameras’ effects on police officers and citizen behavior: A systematic review. Campbell Systematic Reviews 40.
16. Simpson, R. (2017) The police officer perception project (POPP): An experimental evaluation of factors that impact perceptions of the police. Journal of Experimental Criminology 13 (3), 393–415.
17. Mitchell, R.J. (2016) The Sacramento Hot Spots Policing Experiment: An Extension and Sensitivity Analysis, Institute of Criminology, University of Cambridge: Cambridge.
18. Ratcliffe, J.H., Taylor, R.B., Askey, A.P., Thomas, K., Grasso, J., Bethel, K., . . . Koehnlein, J. (2021) The Philadelphia predictive policing experiment. Journal of Experimental Criminology 17 (1), 15–41.
19. Ratcliffe, J.H. and Kikuchi, G. (2019) Harm-focused offender triage and prioritization: A Philadelphia case study. Policing: An International Journal 42 (1), 59–73.
20. Smith, J.J., Santos, R.B. and Santos, R.G. (2018) Evidence-based policing and the stratified integration of crime analysis in police agencies: National survey results. Policing: A Journal of Policy and Practice 12 (3), 303–315.
21. Taylor, B., Koper, C.S. and Woods, D.J. (2011) A randomized controlled trial of different policing strategies at hot spots of violent crime. Journal of Experimental Criminology 7 (2), 149–181.
22. Weisburd, D. and Green, L. (1995) Policing drug hot spots: The Jersey City drug market analysis experiment. Justice Quarterly 12 (4), 711–735.
23. Ratcliffe, J.H., Lattanzio, M., Kikuchi, G. and Thomas, K. (2019) A partially randomized field experiment on the effect of an acoustic gunshot detection system on police incident reports. Journal of Experimental Criminology 15 (1), 67–76.
24. Peterson, J., Densley, J. and Erickson, G. (2020) Evaluation of “the R-model” crisis intervention de-escalation training for law enforcement. The Police Journal 93 (4), 271–289.
25. Mazerolle, L., Antrobus, E., Bennett, S. and Tyler, T.R. (2013) Shaping citizen perceptions of police legitimacy: A randomized field trial of procedural justice. Criminology 51 (1), 33–63.
26. Brown, K. (2004) Penicillin Man: Alexander Fleming and the Antibiotic Revolution, Sutton Publishing: Stroud, GL.
27. Cooper, R. (2015) To end police violence, we have to end poverty. The Week. https://theweek.com/articles/573307/end-police-violence-have-end-poverty (accessed February 2022).
28. Jarus-Dziedzic, K., Zurawska, G., Banys, K. and Morozowska, J. (2019) The impact of needle diameter and penetration depth of safety lancets on blood volume and pain perception in 300 volunteers: A randomized controlled trial. Journal of Medical Laboratory and Diagnosis 10 (1), 1–12.
29. Clarke, R.V. and Eck, J.E. (2003) Becoming a Problem Solving Crime Analyst in 55 Small Steps, Jill Dando Institute: London.
30. Barker, D.C., Detamble, R. and Marietta, M. (2022) Intellectualism, anti-intellectualism, and epistemic hubris in red and blue America. American Political Science Review 116 (1), 38–53, p. 49.

6 HOW DO YOU FIND RELIABLE RESEARCH?

UNRELIABLE SOURCES

The global COVID-19 pandemic was devastating in so many ways; however, it produced one unanticipated benefit. It revealed which colleagues and family members get their news from Facebook and conspiracy websites run by alien abduction aficionados living in their mothers’ basements. The level of science denialism and disinformation that emerged during the pandemic has been a cautionary warning as to how poorly people filter their information sources.

This chapter focuses on how to conduct background research, an important part of the scientific method (Figure 6‑1).

Figure 6‑1 Background research as part of the scientific method

Before getting into useful resources, it is worth first ruling a few out. Facebook, Wikipedia, most social media, and the random thoughts of your overly chatty Uber driver are all awful places to seek insight into the criminal justice system. As will be explained, these fail one or more of the key tests for reliability (see next section).

There has also been a decline in the perceived quality of journalism over the last half century. Mass media are increasingly viewed as partisan, and the majority of people now question whether they report the news fully, accurately, and fairly.* There are still good journalists and reputable news organizations; however, if a specific news event only appears in one outlet, this should raise your suspicions. If a story interests you, read about it in multiple sources to get a more balanced perspective. Box 6‑1 has a list of key characteristics of trustworthy sources to consider when reviewing news sources and websites. In general though, at least for detailed scholarship, newspapers and news websites do not carry the detail necessary for good evidence-based policing research.

* Only 36 percent of Gallup respondents in 2021 had a great deal or fair amount of trust and confidence in the mass media to report the news fully, accurately, and fairly. https://news.gallup.com/poll/1663/media-use-evaluation.aspx.

BOX 6‑1 CHARACTERISTICS OF REPUTABLE SOURCES

Reputable sources tend to demonstrate the following characteristics:

• Identify authors
• Check facts and identify sources
• Acknowledge limitations
• Correct errors
• Generate unique content (rather than rehashing the work of others)
• Identify content type (distinguishing between news and opinion)
• Acknowledge biases
• Write accurate (not emotional or clickbait) titles and headlines

Within policing, there are three areas where one might hope to find current and up-to-date research, yet which tend to be a disappointment:

• The police academy
• Insights from the field
• Ranking officers

The National Academies of Sciences recently argued that police training “activities, tactics, and strategies should be supported by good evidence”.1 And police academies are slowly moving away from the less effective traditional militaristic pedagogic style.2–3 But as Seth Stoughton laments, at least in the US, “Police training in the United States is heavily anecdotal . . . Researchers have not studied every aspect of policing, yet what research there is tends to be ignored or discounted in police training”.
Police academy training seems designed to minimize the liability of police departments rather than teach evidence-based strategies for handling crime and social problems. The focus is on officer safety, law updates, and legal procedures. Frequently, I have seen instruction revert to ‘war stories’ with questionable educational insight or value. Unless you know them to be well versed in the academic literature, that one thief-taker on your shift is also not a source for reliable research.

Insights from the field, such as those from natural police officers, can contain a wealth of skill and intuition and be a source of tactical advice and other practical lessons. Insights from the field are venerated in police culture; however, as Stoughton continues, “a court would be remiss if it qualified a witness as an expert based on their incidental exposure to casual information over the course of their careers, yet that is how a significant amount of knowledge is disseminated in policing”.4 Within evidence-based policing, these insights generate ideas that we evaluate and study; however, until they are subjected to more rigorous scrutiny, they should not necessarily be taken as gospel.

A third source of which to be cautious is ranking officers. The police are a paramilitary organization, replete with rank insignia and uniforms. This results in a deference to rank that is often a proxy for experience, even though some leaders have neither extensive street time nor relevant expertise. Leading internal affairs or running the property office is not a training ground for a district or precinct crime command, yet some officers consider the experience of senior leaders a sufficient evidential basis for action.5 I once observed a senior officer arguing for a colleague to be awarded a post based on his twenty-five years of experience. The police chief, who was less of a fan, said “Does he have twenty-five years’ experience, or one year of experience twenty-five times?”

WHAT MAKES A SOURCE RELIABLE?

You can improve your chances of finding good evidence if you use reliable sources. These tend to be found in the academic and grey literature. Academic literature is scholarly work written by specialists in the discipline and published in academic journals that use peer review to screen the quality of articles. If you find an article through either a library website or Google Scholar, and it is from a peer-reviewed journal, then it is more likely to be reliable (and if you use it in your writing, pay attention to the note in Box 6‑2 on source citation). Articles in academic journals tend to be more rigorous, by which I mean they are more exact and carefully conducted, with attention to precision, thoroughness, and accuracy. Furthermore, they go through a process of peer review (though read about predatory journals in Box 6‑3).

BOX 6‑2 A QUICK NOTE ON CITING LIBRARY SOURCES

If you are affiliated with a university, you can often access their library resources through secure websites. This does not mean that the library and the academic literature accessed through their portal are considered ‘internet sites’ for the purposes of academic scholarship and citation.
It means that the scholarly journals are content that you happen to access electronically. In your writing you should always cite the specific journal article or book, and not cite a link to the resource webpage. Websites should only be cited if the web page is the only content type. Books, podcast episodes, and journal and newspaper articles accessed through web pages should all cite the source material and not the web address.

In some disciplines, conference proceedings are peer-reviewed and rigorous, but in others there is little barrier to entry. As a rule, it may be worth viewing conference proceedings as a form of grey literature. As explained in the previous chapter, grey literature is published in non-commercial form from sources like industry or government, and includes internal reports, criminal justice data, working papers, or policies. Trade magazines, such as Police Chief magazine and other outlets mentioned in Chapter 14, would fall within grey literature. Caution should be exercised with any source, but more so with grey literature.

Scholarly work (academic or grey) will likely be reliable if it is sourced from a university library or Google Scholar, is produced by researchers from within a university or research institute, is peer-reviewed, has a reference list, and explains any sources of funding. Why is funding important? If you read a provocative article questioning climate change, I would hope it would affect your interpretation to learn that the authors received funding from the coal industry—raising questions about reliability and bias.

ABCDE source reliability test

The ABCDE source reliability test is a five-item check on the reliability of sources for your research question:

• Authenticity
• Bias
• Credibility
• Depth
• Evidence

Authenticity refers to the need to confirm that the authorship of an article is clear and that the author is who they say they are. Can you determine authorship? Is it clear that the source can be attributed to the authors? Anonymous sources where the authorship is dubious (such as Wikipedia) are not reliable because the authenticity (and credibility) cannot be confirmed. It is also important to check the web address when accessing news sources. Fake news sites abound and deliberately mimic real sources.

Bias is an inclination to favor a group, thing, or idea. It can manifest in a tendency for a person’s beliefs or ideological perspective to shape their research and writing. Do they convey a balanced perspective suggesting the author is aware of different viewpoints or data? Does the author acknowledge evidence even if it counters the main thrust of their argument? There are increasing numbers of think tanks and working groups that operate from an ideological bias or political slant. A useful litmus test is to review a few articles from a publication or author and ask yourself if they would publish something contrary to their general position. If your answer is ‘probably not’ then that suggests bias.

Credibility refers to the trustworthiness and reputation of the source to write or create the material. Just because someone can establish their authenticity, it does not guarantee their credibility. For example, an Instagram influencer could say that crime is going down, but it would not have the same credibility as an official government account (unless you are in North Korea, in which case it is a toss-up).
An academic or police chief with a long-established reputation will be a more credible source for a summary of research than a first-year undergraduate’s blog.

Depth refers to whether an article conveys the nuance and complexity of an issue. Does the work consist of noise and a shallow treatment of the material, or does it contribute clarity and insight? If so, what are those key insights, and in what direction do they take your research question? Importantly, is there a sense of representativeness around the wider topic? Are they writing about an unusual case study or about the general trend? Both are useful, but it is important to understand the difference. The murder of George Floyd was a brutal incident and devastating to police legitimacy. It was not necessarily representative of most police-community contacts, considering more than 60 million Americans interact with police each year. The Bureau of Justice Statistics reports that about two percent of Americans who report contact with law enforcement experienced threats or use of force from police.6

Evidence refers—in terms of understanding and reading newspaper articles or grey literature—to the writer being clear about their sources and methods. Do they detail their sources for quotes, information, and statistics? If they do not, it does not mean they are necessarily being deceptive; however, it does suggest caution. Being able to independently verify facts is an important aspect of evidence-based transparency.

No single item in the ABCDE source reliability test is, on its own, a deal-breaker. Some police officers write anonymously for fear of retaliation from their bosses. Some academic journals have an acknowledged activism and bias. These are not reasons to automatically ignore the work; however, it is good practice to be explicitly aware of these issues when reviewing and employing their writing.

All of this leads us towards scholarly academic research that is published in books from respected professional or university presses, or journal articles from quality journals. What are quality academic journals? Unlike predatory journals (Box 6‑3), they tend to have processes of blind peer review, respected editors, distinguished editorial boards, ethical standards, and established reputations.

BOX 6‑3 THE DANGERS OF PREDATORY JOURNALS

Predatory journals are doing immeasurable damage to science. They publish articles in return for a fee—of course—but do not perform any quality control function, rarely (if ever) send articles for review by peers, and make no attempt to detect ethical failures in research. They make misleading statements on their websites and bombard unwitting scholars to publish in their ‘journals’, frequently concocting editorial boards that cannot be verified. A group of leading scholars have defined predatory journals and the publishers that promote poor scholarship as “entities that prioritize self-interest at the expense of scholarship and are characterized by false or misleading information, deviation from best editorial and publication practices, a lack of transparency, and/or the use of aggressive and indiscriminate solicitation practices”.7

Predatory journals have trapped researchers and the public alike. Researchers can inadvertently believe that a journal is legitimate, and the public can be tricked into thinking research scholarship in these outlets has undergone the vetting process that exemplifies the scientific method.
Worse, some unscrupulous scholars knowingly publish weak research in predatory journals just to get a line in their resume. One innovative researcher—who seems to delight in exposing predatory journals—published an article on COVID-19 in the impressive-sounding (yet predatory) American Journal of Biomedical Science & Research. The editor agreed to publish the article after just four days, even though it claimed that “eating a bat-like Pokémon sparked the spread of COVID-19” and included a citation from the fictitious Gotham Forensics Quarterly, published by none other than ‘Bruce Wayne’.8

HOW TO READ ACADEMIC ARTICLES

Here is a dirty secret: professors rarely read entire journal articles. There just is not time in a busy work schedule, and there are too many potentially interesting papers being published. Therefore, many academics develop a system to keep abreast of the literature while still finding time for research, teaching, grading, peer review, grant writing, and maybe some semblance of a personal life.

It helps that many scholarly articles have a similar format. The title and abstract provide an overview of the work. These are freely available. If you have access to the full article, then most start with an introduction and a section that reviews the existing scholarship in the area. This literature review summarizes what is known, and not known, about the subject. In doing so it should set up why the rest of the paper exists. Then there is a section outlining the data and methods. This can frequently get a little technical. The results follow this, before the paper wraps up with a discussion of the pertinent findings and their implications.

People develop their own skimming systems, which vary depending on why they are reading. For example, are you looking to learn a new analytical technique, or to gather conclusions about what works to tackle crime? These different goals might result in distinct approaches; however, here is one general method to consider.

Start with the title. It will indicate if the paper is in your area of interest and understandable. If you cannot fathom the title, do not be hopeful that the rest of the article is any more readable.

If you are still interested, read the abstract. A well-written abstract is a concise summary of the study, briefly describing the rationale, methods, and results. Only continue to the full paper if the abstract suggests that the paper will be of value to you. Otherwise, move on and seek out another, more relevant article. Seriously. There is little point wasting time on an article that is not relevant to your work.

If the title and abstract have piqued your interest, jump into the introduction. If the paper is in a familiar research area, you might skim through this. Many disciplines such as criminology and sociology tend towards long-winded introductions. Instead, focus on the last paragraph or two of the introduction. This is where the authors usually outline or reiterate the key research question and how they intend to tackle it.

Next, read the conclusion. What?! Skip all the meat in the middle? Yes, for now. Journal styles differ, but if there is a conclusion or discussion, jump ahead to it. This is where authors summarize what they found, and what they learned from doing all the hard work that you just skipped past. Discussion sections can veer off into the authors’ interpretation of the findings, but the key first paragraph or two should summarize the results.
Look at any figures and graphs. It is often true that a picture is worth a thousand words. If there are numerical tables, asterisks (*) tend to indicate variables that are statistically significant. In other words, these are the variables that can matter (though heed the warning in Chapter 13). Look over the results section to see if the authors explain why they are important and how to understand them.

If you are interested in the methodology, wade into the methods section. Academic methodology can often be analytically dense and jargon-ridden. Bear in mind the authors may have been studying the area for decades, and both the analytical technique and terminology may be impenetrable. Cut yourself a break. You do not have to understand every nuance of a study to take something useful from it.

Finally, consider skimming through the literature review and the remainder of the paper. You already looked at a paragraph or two of the study’s preamble, but if the area is new to you, look for any literature flagged as ‘seminal’, ‘key’, or otherwise suggested as being important. These may be foundational articles in the field and the next articles you should explore. You can find the citations in the reference list.

Remember, the key here is speed and efficiency, so that you only spend time on the scholarly work that is directly relevant to your study. You do not have to digest every nuance and sentence of an article to glean its pertinent details, and saving time allows you to explore a wider body of research. Dedicate the time you save to a word-by-word understanding of the most pertinent studies.

WEBSITES WITH RESEARCH SUMMARIES

Scholars can access journal articles electronically through academic libraries, but what about everyone else? A few government and academic websites have taken the hard work out of wading through endless research reports and academic articles by providing simple and more digestible summaries. This section describes a few. Web addresses tend to change, so if you are unable to find the sites described in this section, check for updated links at the website for this book (evidencebasedpolicing.net).

The UK College of Policing Crime Reduction Toolkit
college.police.uk/research/crime-reduction-toolkit

The Crime Reduction Toolkit uses the five-item EMMIE framework to summarize the best available research evidence on what works to reduce crime. EMMIE stands for effect, mechanism, moderators, implementation, and economic cost. What effect did the intervention have on crime? By what mechanism is it supposed to work? What moderators limit where it will and will not work? Are there implementation considerations? And what are the economic cost impacts? It only draws on systematic reviews (level 5* in Figure 7‑3) but still details over 50 different types of intervention, from after-school clubs to youth curfews.

CrimeSolutions
crimesolutions.gov/programs.aspx

This National Institute of Justice website collates both programs and practices and reviews them against a simple scale of effective, promising, and no effect. Programs are specific activities that are an attempt to solve a crime problem somewhere. For example, a study of New York’s Integrated Domestic Violence courts found no statistically significant differences in re-arrests and conviction rates when comparing dedicated court cases to traditional family court cases. The website’s programs are often illustrative and worth a read.
Practices are general approaches and strategies that you might employ, such as hot spots policing, focused deterrence, or cognitive behavioral therapy for anger-related problems in children and adolescents. Practices are often the focus of evidence-based researchers.

The Campbell Collaboration
campbellcollaboration.org

The Campbell Collaboration produces systematic reviews (level 5* in Figure 7‑3) across a range of social science areas, such as international development and climate solutions. In the area of crime and justice, many reviews are of policing interventions. The website contains the full reports with detailed explanations of each study, but they also provide useful plain language summaries. Their summaries include Scared Straight programs, the effectiveness of counterterrorism strategies, and interview techniques. If you ever conduct a systematic review, their methodological documents are informative, along with the work of Neyroud.9

The Center for Problem-Oriented Policing
popcenter.org

The ASU Center for Problem-Oriented Policing contains over 70 problem-specific guides that explain everything you need to know about how to understand and combat a variety of problems. Topics include abandoned buildings, witness intimidation, vandalism, and gun violence. Various response guides are also available. These can tell you how, and under what conditions, several common responses to crime do or do not work. Summaries span asset forfeiture, CCTV, dealing with crime in urban parks, and police crackdowns. There are also useful guides to crime and problem analysis.

The evidence-based policing matrix
cebcp.org/evidence-based-policing/the-matrix

The matrix categorizes evidence-based policing research on three axes: the type or scope of the target, the extent to which the strategy is proactive or reactive, and the specificity of the prevention mechanism. It is centered around a visual format designed to let you see, as they explain:

clusters of studies within intersecting dimensions, or ‘realms of effectiveness’. These realms provide insights into the nature and commonalities of effective (or not effective) police strategies and can be used by police agencies to guide developing future tactics and strategies, or assessing their tactical portfolio against the evidence.

The introduction has tools and videos to help you understand the matrix and its studies, as well as presentations by the authors.

Evidence-Based Policing: The Basics
evidencebasedpolicing.net

The website that supports the book you are holding is not designed as a repository of crime prevention knowledge; however, check in with it every now and again, as updates, additions, and corrections to this list will be published there. It is also the place to visit if the links listed under this heading do not work.

SUMMARY

The internet and social media are full of useless or unreliable sources of information. In an evidence-based policy environment, we must actively seek out and interrogate sources of research that are more reliable, rigorous, and transparent. This means eschewing many online sources and social media content. It also means casting a critical eye on many professional sources of information, as they often lack the objectivity needed for honest research. Fellow officers, the police academy, and those in positions of rank can provide many valuable tips to improve policing; however, they rarely possess the wider insights needed to spark an evidence-based policy.
Reliable sources are authentic, credible, representative of the issue, and have meaning for your project. You will inevitably be drawn to the academic and grey literature. While the peer review and discerning editorial control found within academic journals are not a guarantee of reliability or objectivity, there is at present no better source of objective, quality scientific information. If you are associated with a university, seek out their library. You can also access peer-reviewed research through Google Scholar, and via the websites listed in this chapter.

REFERENCES

1. National Academies of Sciences, Engineering, and Medicine (2022) Police Training to Promote the Rule of Law and Protect the Population, National Academies Press: Washington, DC.
2. Buehler, E.D. (2021) State and Local Law Enforcement Training Academies: 2018—Statistical Tables, Bureau of Justice Statistics: Washington, DC, p. 28.
3. Vodde, R.F. (2009) Andragogical Instruction for Effective Police Training, Cambria Press: New York.
4. Stoughton, S.W. (2019) The legal framework for evidence based policing in the US. In Evidence Based Policing: An Introduction (Mitchell, R.J. and Huey, L. eds), Policy Press: Bristol, pp. 41–50, 47.
5. Davies, P., Rowe, M., Brown, D.M. and Biddle, P. (2021) Understanding the status of evidence in policing research: Reflections from a study of policing domestic abuse. Policing and Society 31 (6), 687–701, p. 9.
6. Harrell, E. and Davis, E. (2020) Contacts Between Police and the Public: 2018—Statistical Tables, Bureau of Justice Statistics: Washington, DC, p. 14.
7. Grudniewicz, A., Moher, D., Cobey, K.D., Bryson, G.L., Cukier, S., Allen, K., . . . Lalu, M.M. (2019) Predatory journals: No definition, no defence. Nature 576, 210–212, p. 211.
8. Shelomi, M. (2020) Opinion: Using Pokémon to detect scientific misinformation. The Scientist, p. 1. www.the-scientist.com/critic-at-large/opinion-using-pokmon-to-detect-scientific-misinformation-68098 (accessed December 2021).
9. Neyroud, P. (2019) Systematic reviews: “Better evidence for a better world”. In Evidence Based Policing: An Introduction (Mitchell, R.J. and Huey, L. eds), Policy Press: Bristol, pp. 103–116.

7 HOW DO YOU EVALUATE POLICY RESEARCH?

RESEARCH IS THE CORE OF SCIENCE

There is a cartoon on the internet showing a character in front of a computer saying, “I’ve heard the rhetoric from both sides . . . time to do my own research on the real truth”. He then claims ‘jackpot’ as he clicks on a link that reads “Literally the first link that agrees with what you already believe”. Too many people believe that ‘research’ should confirm their preconceived notions.

Good research helps us avoid cognitive traps, such as the one into which our ‘jackpot’ researcher falls. Good social science is an error reduction mechanism, but it is not capable of error removal. Inevitably, there is variability in the quality of research studies. Bad research evaluations are like flashlights with exhausted batteries: they do not shine much light on the subject. On the other hand, police leaders cannot wait for high-quality research in every area of their responsibility. That expansive body of research does not yet exist. While the ‘gold standard’ for research involves randomized controlled experiments or systematic reviews, one policing scholar complained:

This is too high a standard for police. They need evaluations that address high-priority choices with information that is more informative than what is customarily available. Call this the ‘coal standard’—not as good as what academics like, but an improvement over anecdotes.1
Whether policing research is at the gold standard, the coal standard, or worse, this chapter explains how to differentiate these criteria, a key part of good background research (Figure 7‑1).

Figure 7‑1 Evaluating research as part of the scientific method

INTERNAL VALIDITY

According to the humorous novel of the same title by Douglas Adams, the Hitchhiker’s Guide to the Galaxy outsold the competition because it was slightly cheaper and because the book had the words “Don’t panic” printed on the cover.2 Perhaps this book should come with an equivalent warning, because terminology such as ‘internal validity’ can discourage many readers. I will try to be as gentle as possible.

Earlier in the book (The four components of good evidence, Chapter 3) I pointed out the correlation between increased summer violence and ice-cream sales. The correlation lacked a plausible mechanism. With the exception of the deadly ice-cream wars in my hometown of Glasgow (Scotland) in the 1980s,3 attempts to connect ice cream consumption with murder fail because there are more plausible causes for the association. Because the relationship between a vanilla Magnum and a .357 Magnum lacks internal validity, inferring any causality would be a mistake.

The term ‘internal validity’ refers to the legitimacy of inferences we make about the causal relationship between two things. A key to policy research is recognizing that a causal relationship means changes in one thing effect, or cause, a change in the other. If I repeatedly punch you on the arm, you might infer a cause-and-effect relationship that being punched on the arm hurts. Given the plausible mechanism that blunt force causes pain receptors in nerve endings to activate, the inference that a punch on the arm will hurt has strong internal validity. Inferring a cause-and-effect relationship between ice cream and assaults has weak internal validity, because the move to summer—with associated temperature and social changes—is a more likely cause for the observed changes.

Since the 1800s and the time of philosopher John Stuart Mill, we have understood that for any claim to have strong internal validity—in other words, to establish a causal relationship—the cause should precede the effect, the cause should be related to the effect, and alternative explanations should be excluded.4 For any claim we should therefore rule out potential threats to internal validity. Shadish and colleagues outline a number of these, including:4

Selection effects: This is a problem if any effects observed reflect differences that already existed between the people or places receiving the treatment and the comparison areas. For example, consider when police inspectors volunteer to adopt a new crime prevention strategy. It would be tempting to use areas with inspectors who did not volunteer as the comparison (or control) areas. Any observed effect could be caused by the intervention; however, the effect may also occur simply because the volunteer inspectors are more dynamic and innovative.

Temporal order: Psychologists have a joke about two rats in a laboratory experiment. One says to the other: “I have this guy conditioned.
Since the 1800s and the time of philosopher John Stuart Mill, we have understood that for any claim to have strong internal validity—in other words, to establish a causal relationship—the cause should precede the effect, the cause should be related to the effect, and alternative explanations should be excluded.4 For any claim we should therefore rule out potential threats to internal validity. Shadish and colleagues outline a number of these, including:4

Selection effects: This is a problem if any effects observed reflect differences that already existed between the people or places receiving the treatment and the comparison areas. For example, consider when police inspectors volunteer to adopt a new crime prevention strategy. It would be tempting to use areas with inspectors who did not volunteer as the comparison (or control) areas. Any observed effect could be caused by the intervention; however, the effect may also occur simply because the volunteer inspectors are more dynamic and innovative.

Temporal order: Psychologists have a joke about two rats in a laboratory experiment. One says to the other: “I have this guy conditioned. Every time I press the bar, he drops in a piece of food”. It can sometimes be difficult to establish which variable changed first, and this can result in confusion about whether A caused a change in B, or B caused a change in A.

Confounding: The effect was caused by some other event occurring at the same time as the treatment intervention. For example, imagine measuring a reduction in after-school disorder around a major public transit center. A police chief attributes the crime reduction to the opening of a local police athletic league program that keeps kids occupied. Would you remain as confident in the effect of the police athletic league if you learned that the transit police had also doubled patrols at the transit center? Good evidence requires rejecting competing explanations like this.

Trend and seasonality: Errors can occur when evaluators do not consider the normal behavior of a phenomenon being observed. What can be mistaken as a treatment effect is often just the continuation of a pre-existing trend. For example, if a project to reduce serious violence in Philadelphia had been implemented between July and November 2009, it would be tempting to claim success (the solid line indicated in Figure 7‑2). But within the context of 2009 to 2011, the reduction is clearly part of the normal seasonality of violence that peaks in the summer and declines in the winter months.

Figure 7‑2 Trend and seasonality are common in serious violent crime data

Measurement effect: If the method of measuring the outcome changes over time, this can be mistaken for an effect. This can trap the unwary researcher when comparing year-to-year crime in rural and urban areas, for example. The official definitions of these areas changed for England and Wales in 2014, and the US Bureau of Justice Statistics did the same thing when reporting criminal victimization in 2019.5

Testing effect: The act of testing or studying something before the intervention affects the posttest measure. This is related to the Hawthorne effect, the change in performance of people attributed to being observed by researchers, rather than the intervention itself. It is named after a 1920s productivity study that took place at the Hawthorne Works factory in Cicero, Illinois. The effect is not unlike how you might drive more carefully when you have your mother in the car compared to when you are alone.

Attrition: If people or places are lost or removed from a study, the pattern of loss can create an apparent effect. Say a police department sends youth drug abusers to either individual counseling or group counseling. If young people with supportive families are more likely to stay in one type of treatment (for example, the individual counseling), at the end of the study the individual counseling will have more participants. If the treatment appears more effective, is this because of the treatment, or because more people likely to fail the program dropped out? Attrition must be monitored carefully, because what can start as a balanced study does not necessarily end that way.

Regression to the mean: When treatment areas or people are selected because they score particularly high (or low), the tendency to revert to a more moderate value can be mistaken as a treatment effect. Regression to the mean is discussed in Box 7‑1, where it is also known as the Sports Illustrated jinx.
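Before Box 7‑1 works through a policing example, here is a minimal simulation, not from the book, of how selecting the worst-performing area almost guarantees an apparent ‘improvement’. Every area here has the same underlying crime rate, and all numbers are invented.

```python
# Regression to the mean: 20 areas share the same underlying crime
# rate, but we select the one with the worst recent spike. With no
# intervention at all, its next measurement tends back toward the mean.
import random
import statistics

random.seed(7)

areas = 20
true_mean = 100  # identical underlying monthly crime count everywhere

before = [true_mean + random.gauss(0, 15) for _ in range(areas)]
after = [true_mean + random.gauss(0, 15) for _ in range(areas)]

worst = max(range(areas), key=lambda i: before[i])  # 'highest crime' area
print(f"Selected area, before: {before[worst]:.0f} crimes")
print(f"Selected area, after:  {after[worst]:.0f} crimes (no treatment)")
print(f"All-area average:      {statistics.mean(before + after):.0f}")
# Any 'decline' in the selected area is a statistical artifact of how
# it was chosen, which is why comparison areas matter.
```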
BOX 7‑1 OVERCOMING THE SPORTS ILLUSTRATED JINX

As I have pointed out elsewhere, Sports Illustrated is a magazine that often features athletes or teams on winning streaks.6 They then become victims of the Sports Illustrated jinx. The jinx involves having a poor spell after appearing on the magazine cover. Rather than being jinxed, the athletes are simply regressing to their average (or mean) level of performance. Teams tend to have good and bad streaks, but not in an easily predictable way. Otherwise, betting companies would go out of business. But after a good (or bad) run, they tend to ‘regress’ back to their usual performance level.

Now imagine a police chief wants to demonstrate the benefit of saturating an area with high-visibility patrols. She picks the highest crime area, floods it with cops, and—compared to the crime rate before the saturation patrols—crime declines. She claims success, but the problem with this before/after study is that the area was picked because it was suffering an unusually high crime streak when the chief’s project started. We cannot rule out that crime declined not because of the police initiative, but because it simply regressed to the mean. Comparison areas tell us what might have happened under circumstances where the initiative did not occur. Comparing these to treatment areas—if chosen carefully—provides a more realistic measure of the impact of an intervention than before and after counts of crime in treatment places alone.

These threats to internal validity are wrapped up in my four components of good evidence from Chapter 3, which might be easier to remember. They are:

• Plausible mechanism
• Temporal causality
• Statistical association
• Reject competing explanations

Many of these threats to internal validity can be minimized or even overcome with certain types of research study. Research that can ‘design out’ these problems tends to have greater internal validity and is viewed as more rigorous by quantitative scholars. The evidence hierarchy in the next section explains how each level improves internal validity.

EVIDENCE HIERARCHY FOR POLICY MAKERS

At least for determining policy, certain approaches provide more reliable findings than others. Social work and medicine have employed hierarchies of research methods for more than 30 years, often with systematic reviews at the pinnacle and expert opinion at the bottom.7 Systematic reviews tend to rely on randomized controlled trials, a type of study that—while intellectually rigorous—is not always practicable. This has understandably riled some researchers, with one leader in the field noting “people whose projects are excluded from systematic reviews correctly interpret this as a criticism of the methodological quality of their work”.8 The limited range of quality policing research suggests it is not the time to take an overly purist approach. As Ariel points out, excluding the work of leading scholars and thinkers simply because they did not take an experimental approach is unnecessarily limiting and alienating.9 Great scholarship has flowed from times when experimental methods were neither practically nor politically feasible. Furthermore, questions that relate to how initiatives are adopted and perceived by the public or police officers can be better addressed by other approaches. Given these caveats, I will explain here the evidence hierarchy for policy decision-making.
It has its origins in the Maryland Scientific Methods Scale,7, 10 which in turn originates with equivalent hierarchies in the medical field.11–12 Mine is similar but tweaked a little for a policing policy audience. As such, it emphasizes quantitative studies and randomized trials. Some argue that randomized trials can be of limited value, or difficult to implement, and that observational studies and other sources of information can also inform policing. This is all true. It is useful to remember that quantitative research can tell you what is going on, qualitative can tell you why. Qualitative research can help interpret evaluation results, provide insights into why programs succeed or fail, and consider where they can move forward.

Any research field has variable levels of methodological quality. This is the extent to which the design and conduct of a study has been able to prevent systematic problems that might affect the trustworthiness of the results. If you think all evaluations are useful for deciding how we spend our money, then just consider Amazon reviews. They rarely compare one product against another. You more frequently find five-star reviews alongside comments such as ‘Cannot wait to try this!’ or ‘It arrived on time and works as advertised’. A widget might indeed work as advertised, but better than other widgets?

In Figure 7‑3, reproduced from Chapter 10 of Reducing Crime: A Companion for Police Leaders,6 the lowest level (0) shows examples of opinions and sources that should not be relied upon. While police chief memoirs are a fun read, they reflect only the author’s perspective and tend to be rather self-aggrandizing. Like reviews written by commercial companies, they have an incentive to only publish positively skewed information. Focus groups, even with academic experts, would be included at this level. I have been an enthusiastic attendee at many learned gatherings, and they can be fascinating and insightful. But this does not negate the reality that an organization drawing together multiple expert opinions and anecdotes does not decrease the likelihood of bias or other internal validity issues. As is commonly noted, the plural of anecdote is not data.

Figure 7‑3 Evidence hierarchy for policy decision-making

At level 1, we enter the territory of research that, as explained by the last column, at least might have interesting implications if confirmed by more rigorous research. These studies have significant limitations. For example, simple observations could point out that towns with more police tend to have less crime. It might be, however, that instead of the police causing a reduction in crime, towns that start off with less crime are more affluent and can afford more police officers.

At level 2, before and after tests of the same area are vulnerable to some issues with internal validity. If you recall, that is the ability of a study to demonstrate a causal mechanism, temporal causality, statistical association, and to reject competing explanations. Level 2 studies struggle to do this because they cannot demonstrate what might have happened if the treatment had not occurred in the area.
A police department might credit citywide de-escalation training with a reduction in use of force incidents, but what if the public generally became less aggressive towards the police, or officers conducted fewer proactive stops? Before-and-after tests can often look good, but they are too vulnerable to the effects of external factors. We can include statistical reports from government agencies, such as the UK Home Office or the US Bureau of Justice Statistics, at this level if they are recurring reports. They typically contain nationally representative cross-sectional surveys or administrative data. Therefore, while they rarely test interventions, differences between agencies on various metrics can be informative to policy makers.

Once we hit level 3, we cross an important threshold, because comparison groups enter the fray. Lawrence Sherman argues “the bare minimum, rock-bottom standard for EBP [evidence-based policing] is this: a comparison group is essential to every test to be included as ‘evidence’”.13 A comparison group or area represents vital information in evaluating any program. It tells you what might have happened if the initiative had not occurred. Like numerous commentators, I agree this should be the minimum standard that counts as scientific evidence.7, 14 Studies at level 2 assume—frequently without much evidence—that crime will continue as before, and that assumption forms the comparison point. At level 3 we seek out similar areas or groups and examine what happened when they did not get the project intervention. This overcomes one huge issue, which is regression to the mean (Box 7‑1).

Studies at and above level 3 all benefit from using comparison areas or groups as a counterfactual. A what, you say? Counterfactual areas or groups can be used to represent what would have happened in the absence of an initiative, and better estimate the real impact on the areas or people receiving the intervention. Counterfactuals—sometimes called control areas or groups—should be carefully selected to mimic as closely as possible the treated areas or people. If your control areas are a reasonable and representative comparison to the treatment areas, then they can be realistically considered the counterfactual indicator of what would have happened in the treatment areas, if the treatment had not been conducted. Like level 4 studies, successful interventions at this level indicate promising initiatives.

Level 4 involves either careful and methodical exploration of crime and other issues over extensive periods of time, or expands on level 3 studies by using multiple intervention areas and multiple counterfactual sites.

Finally, level 5 involves randomized controlled experiments or trials, and systematic reviews. The term randomized controlled trial is often abbreviated to RCT. RCTs involve two important elements:

1. They must have control areas or groups (the same thing as comparison areas or groups)
2. Areas or groups should be selected randomly into either the intervention/treatment or control condition

Randomization is important because it reduces the chance that bias is introduced to the study inadvertently or deliberately. For example, if a police chief is asked to pick certain areas of a city for their pet project, they could be tempted to select precincts where the local area commanders would be extra diligent. This would skew the results.
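As a minimal sketch of what random assignment can look like in practice, the following uses invented beat names; the beat labels and arm sizes are assumptions for illustration only.

```python
# Random assignment for an RCT-style test: shuffle the eligible hot
# spots, then split them into equal treatment and control arms.
import random

random.seed(2023)  # record the seed so the allocation can be audited

beats = [f"Beat-{n:02d}" for n in range(1, 21)]  # 20 eligible hot spots
random.shuffle(beats)

treatment = sorted(beats[:10])  # receives the intervention
control = sorted(beats[10:])    # continues business as usual

print("Treatment:", treatment)
print("Control:  ", control)
# No commander, researcher, or chief chooses who gets treated, which
# removes the temptation to stack the deck with diligent precincts.
```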
Also at level 5, and as was explained in Box 4‑2, a systematic review is a ‘study of studies’ that addresses a specific question in a systematic and reproducible way, by identifying and appraising all the literature on the chosen topic. Research articles and reports are gathered and included in the study if they meet certain criteria for inclusion. Systematic reviews can be time consuming, but also of great value.

If you are reviewing academic scholarship, the flowchart in Figure 7‑4 might help determine what type of study you are reading.

Figure 7‑4 Decision flowchart for evidence hierarchy studies

The evidence hierarchy is a guide to internal validity, but it does not guarantee that the right method has been used to answer the question. Just because a study is randomized and controlled does not guarantee that the researchers explored the research question in an appropriate way. Sometimes, simple observational research can answer fundamental questions of how officers do their work, and why an initiative was a success or failure. Furthermore, hierarchies such as this have been criticized by academics like Malcolm Sparrow, who complained that “the vast majority of ordinary ‘lay inquiry’ and natural science methods fall short of” thresholds set by evidence-based policy hierarchies.15 Further critiques of randomized trials are discussed in Chapters 11 and 15.

SUMMARY

Finding and understanding research can be challenging enough without having to also worry about whether a study has reasonably addressed issues of internal validity. People new to research tend to accept the merits of scholarship they find at face value, and only after time come to appreciate methodological limitations. It is a simple truth that research varies in quality as well as usefulness. High quality studies that are methodologically rigorous and able to withstand methodological critique are still rare in the policing environment. And while groundbreaking studies may be interesting, do they address the problem you are trying to fix?

The evidence hierarchy is a continuum of study designs that increasingly address threats to internal validity, where we can progressively be more confident that any identified relationships are causal. Careful research design and implementation takes skill and time, otherwise research can lack a plausible mechanism, temporal causality, statistical association, or fail to reject competing explanations for the outcomes observed. The evidence hierarchy for policy makers, shown in Figure 7‑3, represents an attempt to guide you in the direction of studies with greater internal validity, but you will have to judge each study on its merits, and whether it is applicable to your research question.

REFERENCES

1. Bayley, D.H. (2016) The complexities of 21st century policing. Policing: A Journal of Policy and Practice 10 (3), 163–170, p. 169.
2. Adams, D. (1995) The Hitchhiker’s Guide to the Galaxy, Del Rey: New York.
3. Skelton, D. and Brownlie, L. (1992) Frightener: The Glasgow Ice-Cream Wars, Mainstream Publishing: Edinburgh.
4. Shadish, W.R., Cook, T.D. and Campbell, D.T. (2002) Experimental and Quasi-Experimental Designs for Generalized Causal Inference, Cengage Learning: Belmont, CA, 2nd ed.
5. Morgan, R.E. and Truman, J.L. (2020) Criminal Victimization, 2019, Bureau of Justice Statistics: Washington, DC.
6. Ratcliffe, J.H. (2019) Reducing Crime: A Companion for Police Leaders, Routledge: London.
7. Farrington, D.P. (2003) Methodological quality standards for evaluation research. Annals of the American Academy of Political and Social Science 587 (1), 49–68.
8. Farrington, D.P. (2002) Methodological Quality Standards for Evaluation Research, Third Annual Jerry Lee Crime Prevention Symposium, University of Maryland: College Park, p. 3.
9. Ariel, B. (2019) “Not all evidence is created equal”: On the importance of matching research questions with research methods in evidence based policing. In Evidence Based Policing: An Introduction (Mitchell, R.J. and Huey, L. eds), Policy Press: Bristol, pp. 63–86.
10. Sherman, L.W., Gottfredson, D., MacKenzie, D., Eck, J., Reuter, P. and Bushway, S. (1998) Preventing Crime: What Works, What Doesn’t, What’s Promising, National Institute of Justice: Washington, DC.
11. Clarke, B., Gillies, D., Illari, P., Russo, F. and Williamson, J. (2014) Mechanisms and the evidence hierarchy. Topoi 33 (2), 339–360.
12. Brounstein, P.J., Emshoff, J.G., Hill, G.A. and Stoil, M.J. (1997) Assessment of methodological practices in the evaluation of alcohol and other drug (AOD) abuse prevention. Journal of Health and Social Policy 9 (2), 1–19.
13. Sherman, L.W. (2013) The rise of evidence-based policing: Targeting, testing and tracking. In Crime and Justice in America, 1975–2025 (Tonry, M. ed), University of Chicago Press: Chicago, p. 44.
14. Cook, T.D. and Campbell, D.T. (1979) Quasi-Experimentation: Design & Analysis Issues for Field Settings, Houghton Mifflin: Boston.
15. Sparrow, M.K. (2016) Handcuffed: What Holds Policing Back, and the Keys to Reform, Brookings Institution Press: Washington, DC.

8 HOW DO YOU DEVELOP A HYPOTHESIS AND RESEARCH QUESTION?

THE IMPORTANCE OF A HYPOTHESIS

If you ever find yourself speaking to a roomful of police officers and want them to scroll through their phones, roll their eyes, or daydream, then simply mention the word hypothesis. Its involuntary soporific qualities are almost unparalleled, rivaling those of the great sleep inducer, theory. Yet within the world of evidence-based policing, a hypothesis (or hunch) about how we might make changes to police strategies and tactics is a vital starting point. A hypothesis is a statement that proposes one possible reason or outcome for a phenomenon or relationship.

There are many possible places from which to draw a hypothesis. You might read about an area of policing and, from what you learn, form a hypothesis. For example, prior research has shown that a police Crisis Intervention Team model might be effective at de-escalating incidents. You could then hypothesize that police departments that mandate crisis intervention training for patrol officers will have fewer use of force incidents. Alternatively, you might go on patrol or a ride-along and form a hypothesis based on your observations.
While on ride-alongs with the transit police, I noticed that experienced police officers were skilled at talking empathetically with people in vulnerable situations, such as drug addiction and homelessness. Thus, while there is a push for more street outreach from social workers, I could hypothesize that—with training and experience—police officers can encourage people into treatment or shelter at similar rates to social workers.

You can pay attention to the opinions and thoughts of officers. They often form hunches based on their experiences, ad hoc hypotheses from which can spring opportunities to evaluate potential remedies to societal problems. An additional benefit is that police officers are more likely to buy into a research project if the idea comes from them. And finally, a research project can spark new research hypotheses. For example, while my colleagues and I showed that foot patrol reduced violence in high crime Philadelphia police beats, we do not know whether the reduction stemmed from the proactive enforcement work of the officers, from an increase in their community-oriented engagement, or a combination of both.1 This would make an ideal series of subsequent hypotheses.

A hypothesis is important within the scientific method (Figure 8‑1) because it explicitly states how you think changing one part of a system will affect another part of a system. You might hypothesize that increasing proactive policing in a high crime area will increase the perception of risk to likely offenders and thus reduce crime. You can use your hypothesis to frame the context of your study, explain to other participants what your study aims to test, and define the variables you need to collect and measure.

Figure 8‑1 Developing a hypothesis as part of the scientific method

FROM A PROBLEM TO A HYPOTHESIS

How do you go from a general problem to a hypothesis? Imagine you have done some background reading, checked the scholarship and academic literature, talked to some police officers, made patrol observations, or worked alongside detectives or crime analysts, and have thought about plausible mechanisms for what you have observed. You might find that the question forming in your mind has already been answered to your satisfaction. For example, if your topic is hot spots policing and you wondered if it was generally effective at reducing crime, the preponderance of evidence presented by the National Academies of Sciences, Engineering and Medicine may be all that you need. After all, they concluded that the “available research evidence suggests that hot spots policing strategies generate statistically significant crime reduction effects without simply displacing crime into immediately surrounding areas”.2

But perhaps you do not find a satisfactory answer. The reading will still improve your understanding of what is known and not known and focus you towards possible causes and effects. From this background knowledge, you can form a hypothesis around your chosen topic. As stated earlier, a hypothesis is a statement that proposes one possible reason or outcome for a phenomenon or relationship. It is the foundation and precursor for a research study. While people tend to use the terms theory and hypothesis in similar circumstances, formally they should not be used interchangeably.
A hypothesis is proposed before a study, while a theory is developed after a study or observation, using the result to clarify the idea. A scientific theory is a principle that has been formed to explain a phenomenon or relationship based on substantiated evidence and data. A theory can often be formed to explain the results observed in a study. As Box 8‑1 demonstrates, a hypothesis precedes a study and is a proposition for what might be happening, while a theory emerges as the result of study and is an attempt to explain what has occurred.

BOX 8‑1 HYPOTHESIS AND THEORY OF FOOT PATROL IN PHILADELPHIA

When dealing with a significant violent crime problem in the city, Philadelphia Police Commissioner Charles Ramsey and his leadership team developed the hypothesis that foot patrol is more effective than vehicle-bound patrolling at reducing street-level violence, because officers on foot can better identify the troublesome individuals in the area. At the time, there were few robust scientific studies to support this statement, so it was only a hypothesis. With a large class of new officers emerging from the police academy, the leadership team worked with me and my colleagues to design the Philadelphia Foot Patrol Experiment. For several months, the rookie officers worked in pairs on dedicated beats throughout the city’s high crime blocks. We were able to compare the results of that activity with the crime rate in equivalent high crime beats that did not have foot patrol officers (the counterfactual). We found that after three months, violent crime reduced by 23 percent in the foot beat areas.

Based on the study, the police department confirmed the theory that officers on foot were more able to positively engage with people on the street, as well as get to know—and reduce the anonymity of—potential offenders in the area. The data showed that foot patrol officers conducted more proactive policing activity, and this activity may have acted as a crime deterrent. Their initial hypothesis was supported. Since the Philadelphia Foot Patrol Experiment, and based on the subsequent theory, new officers in the city have been assigned to foot patrol for the first few months of their service. For more detail, see the page for this chapter at evidencebasedpolicing.net.

One way to turn your question into a hypothesis is to structure it in a format along the lines of “By changing (a) the effect will be (b)”. In practical policing terms, we can think of a hypothesis as a statement that proposes:

• If we conduct this activity (a) we will get this effect (b)

For example, if a neighborhood has a crime problem, we might start to think of ways that policing could be done differently, such as “Increasing the number of officers patrolling in plain clothes will reduce the number of vehicles being stolen”. In this example, the activity or thing that is changing (a) is the number of officers patrolling in plain clothes and not in uniform. The effect (b) would be a change in the number of stolen cars. Another way to think about this is to state a hypothesis in terms of a relationship. For example, “The number of school children loitering near the transport center after school is directly related to the number of graffiti and vandalism reports”. Note that a hypothesis is a statement and not a question.
A good hypothesis is also:

• Specific (the activity and effect are not vague or poorly defined)
• Relatable (the activity is directly related to the effect)
• Feasible (the activity is practicable and the effect discernible)
• Brief (long-winded hypotheses tend to get convoluted and confusing)
• Falsifiable (we can measure levels of the activity and the outcome)

We could hypothesize that “varying police patrols would reduce crime”, but what do we mean by varying? And what type of crime could be affected? This hypothesis fails the test of being specific. If we hypothesized that “banning ice-cream sales will improve public perception of police”, it would be a challenge to explain the connection. This fails the relatable test. Furthermore, we could also hypothesize “mandatory population-wide meditation for six hours a day will reduce the number of traffic accidents” and probably be correct. After all, there would probably be fewer vehicles on the road—unless people meditated in their cars (not recommended). Mandating daily meditation for an entire city is, however, impractical. It fails the feasible test. Finally, we could hypothesize that “people who sleep better at night are less likely to engage in road rage”, but how would we quantify and test this? If a hypothesis is not realistically testable, it fails the falsifiable test.

I HAVE A HYPOTHESIS; SO WHAT IS A NULL HYPOTHESIS?

If you have a hypothesis that some intervention will make a change, the question becomes “a change compared to what?” The ‘what’ is the status quo, the business-as-usual case, whatever that is. It is the default position that your intervention did not have an effect. In academic-speak, this is called the null hypothesis. Your proposed hypothesis is sometimes referred to as the alternative hypothesis, because it is the alternative to the null hypothesis. One purpose of research is to exhaustively test and collect sufficient evidence that we can reject the null hypothesis and provide empirical support for the alternative hypothesis.

The null hypothesis is linked to the idea of a counterfactual, first discussed back in “Evidence hierarchy for policy makers” in Chapter 7. If you recall, if you have a control group that constitutes a reasonable and representative comparison to the treatment group, then they can be realistically considered the counterfactual indicator of what would have happened in the treatment group if the treatment had not been conducted. With law enforcement, the counterfactual is rarely ‘no policing’, but instead the normal level of policing that would occur. If your study finds that the treatment or intervention behaves just like the control group, this tends to support the null hypothesis.

TURN A HYPOTHESIS INTO A RESEARCH QUESTION

Armed with a testable hypothesis, we can convert that into a research question. A research question is not a simple restatement of your hypothesis as a question. This is not the television show ‘Jeopardy!’. While your hypothesis can be a general statement of a relationship—such as “foot patrols are more effective in reducing violence in small beats than vehicle patrols”—applying an evidence-based approach to this question will require focusing this into a testable premise. Let us look at how to do that. We should first be clear about certain components. What is the population, group, or place that could be affected?
What is the policing activity or intervention? What outcome is expected to change, and if it does, compared to what? Similar areas or groups? Finally, over what time frame do you hypothesize the change takes place? The PICOT framework in Box 8‑2 is one way to structure a research question.

BOX 8‑2 THE PICOT FRAMEWORK FOR RESEARCH QUESTIONS

The PICOT framework is adapted from the medical field and can be used to restructure a testable hypothesis into a research question.3 With PICOT, you outline the population (P) or area under scrutiny, the intervention (I), a comparison area or group (C), the outcome (O) that is important, and over what period of time (T). Here are three examples, and note that the last example is still effective even when the PICOT items are reordered:

Do high crime areas (P) that have assigned foot patrols (I), compared to high crime areas without foot patrols (C), experience fewer gunpoint robberies (O) over the summer (T)?

Among police applicants (P), when recruitment fairs are concentrated at historically Black colleges (I), compared to when recruitment fairs are held at state colleges (C), do we observe the ratio of Black applicants increase (O) over a three-month period (T)?

Will the number of people attending police-community meetings (P), compared to the number of people attending police-community meetings before the intervention (C), significantly increase (O) if we provide childcare (I) for the next year (T)?

This type of structured research question highlights variables that we can measure to test the intervention. In the last example in Box 8‑2, the count of people attending police-community meetings is the dependent variable. The dependent variable is the variable being tested and measured. In evidence-based policing, it is the group, place, or population you want to impact. This is the population (P) in the PICOT framework. We hope to effect changes on the dependent variable by making changes through the intervention (I), in this case the provision of childcare. The provision of childcare is therefore one example of an independent variable. The independent variable is something that you can adjust that you hope will influence your dependent variable. As a result, your dependent variable is dependent on the independent variable. Do not worry if this does not make complete sense right now. We will return to this later in the book.
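As a minimal sketch of the distinction, not from the book, the following uses invented attendance counts for the childcare example. The independent variable is whether childcare was provided; the dependent variable is how many people attended.

```python
# Dependent vs independent variables in the childcare example.
import statistics

# (month, childcare_provided, attendees) -- invented records
meetings = [
    ("Jan", False, 14), ("Feb", False, 11), ("Mar", False, 17),
    ("Apr", True, 25), ("May", True, 22), ("Jun", True, 28),
]

with_childcare = [n for _, childcare, n in meetings if childcare]
without_childcare = [n for _, childcare, n in meetings if not childcare]

print("Mean attendance with childcare:   ", statistics.mean(with_childcare))
print("Mean attendance without childcare:", statistics.mean(without_childcare))
# Attendance (dependent) is examined against the condition we adjusted
# (independent). A real study would still need a comparison group and
# a significance test before crediting childcare with the difference.
```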
THE OVERALL PROCESS

Effectively what we have done to this point is to start with a general topic area, create a hypothesis, and transform that into a research question. In reality, once you are embedded in thinking about policing, reading about the topic, listening to podcasts, and engaging with frontline staff or executive leaders, it is likely you will start to develop a culture of curiosity. I find myself listening to politicians proposing new ideas on the radio and thinking what their hypothesis would be, and what is the mechanism they think would make their idea work. That inevitably leads to pondering how I might evaluate their plan and what my research question might look like. I also recognize that this last paragraph is starting to make me sound a little nerdy.

As you explore policing more, you will find that these processes can take place throughout your academic or professional life. They can occur in order, concurrently, or can overlap between different areas. For example, thinking about how police departments use body-worn cameras may change your thinking about how police engage with proactive work or impact public trust and police legitimacy. This can spark ideas about new ways to conduct community policing or to advance recruitment of confidential informants.

SUMMARY

For many students of policing, getting a viable research topic is the hardest part of the process. What should you explore and study? How do you figure out a research question? Developing areas of inquiry and new hypotheses can be helped by being in the right environment. I have found that many police officers have learned to be better researchers through exposure to academics in classes, attending meetings with professors, or working alongside research-minded peers and pracademics in policing. Equally, academics become better researchers by getting out of their office and spending time in Compstat conferences, police-community meetings, or squad cars. Once you add a healthy dose of reading into the mix, interesting questions start to emerge.

A solid understanding of the problem is a precursor to establishing a hypothesis. The structured approach of (1) an activity that will generate (2) a specific effect, is one way to construct a hypothesis. From this, the PICOT framework can help you formulate a research question. Armed with a viable question, you can now move on to a study (and therefore the next chapter).

Moving to a culture of curiosity, where your thinking shifts to different approaches to policing and alternative ways to evaluate what people do, is an important professional step. This is what it means to work in a profession rather than a job. A person doing a job turns up, does what is asked, and takes a paycheck. A professional is constantly learning and developing their skill and expertise, staying current with good practice, demonstrating standards and integrity, and providing leadership to move the field forward.

REFERENCES

1. Ratcliffe, J.H., Taniguchi, T., Groff, E.R. and Wood, J.D. (2011) The Philadelphia foot patrol experiment: A randomized controlled trial of police patrol effectiveness in violent crime hotspots. Criminology 49 (3), 795–831.
2. Weisburd, D. and Majmundar, M.K., eds (2018) Proactive Policing: Effects on Crime and Communities, National Academies of Sciences Consensus Study Report: Washington, DC.
3. Stillwell, S.B., Fineout-Overholt, E., Melnyk, B.M. and Williamson, K.M. (2010) Asking the clinical question: A key step in evidence-based practice. American Journal of Nursing 110 (3), 58–61.

9 WHAT ARE SOME CORE RESEARCH CONCEPTS?

NOT ALL RESEARCH INVOLVES A KNIFE FIGHT

Late one evening in September 2020, I was working on a field research study, talking to a couple of transit police officers working at one of the stations that comprise the Kensington transit corridor. These are a series of three elevated subway stations and bus routes along Kensington Avenue in Philadelphia, an area plagued with rampant drug dealing, drug use, and associated social problems. It was a warm night outside Allegheny station, and we were minding our own business, standing on the street chatting.
Suddenly a man came half-jumping, half-falling down the steps from the station platform shouting “Yo, they’re stabbing each other up there!” Seconds later, I was on the ground helping the officers wrestle apart two men and secure the knife that one had used to stab the other. Blood was on the men and the station floor, people were agitated and shouting, and police radios were blaring the backup request. For a minute or two, it was bedlam.

This is not a typical scholarly research experience. Researchers mostly work in offices, analyzing data, interviewing people, or writing reports. But policing research can take many forms. To gather the necessary data to help policing grow and improve, researchers and practitioners have to download data, attend meetings, observe training, run statistics, interview officers, speak to crime victims, scour the internet and the dark web, catalogue observations, run surveys, or—as stated previously—document ride-alongs.

If you are engaged in a research project and have made it this far in the book, let us assume that you have established a problem you want to address, conducted a responsible amount of background research and reading, formed a general hypothesis, and designed a research question. You are now at the stage where you want to do some sort of study to answer your question (Figure 9‑1). We are also going to assume that you have a culture of curiosity (as discussed in previous chapters). This means you are open to the idea that a methodological approach to research can advance our understanding and knowledge around areas such as policing.

Figure 9‑1 Undertaking a study as part of the scientific method

Embracing research as a methodology for solving problems means being skeptical of anecdote, custom, and opinion. It should also mean casting a critical and analytical eye on how knowledge is created. Do researchers use appropriate data and methods to conduct their studies? Do these data and methods allow them to draw reasonable conclusions about their research? Academics frequently disagree about data and research methods, and—especially in more quantitative fields—methodologies are being critiqued and improved all the time. Research methodology is a vast and dynamic area, and there is only the space in this book for the brief overview that follows.

CORE CONCEPTS IN RESEARCH

There are many ways to undertake research. Evidence-based policing research tends to be defined as policy research or applied research. Policy research explores the impacts of different government or organizational policies and how they affect various outcomes. For example, you might study whether initiating a mandatory arrest policy for people who engage in hate crime might subsequently impact prevalence or recidivism around that offense. Applied research tends to be more focused on the practical application of ideas to change the criminal justice system. For example, you might be interested to learn about the techniques used by combined police and social worker teams to encourage people in vulnerable situations to transition into shelter or treatment. Whichever approach is adopted, there are some core concepts that are important for any research: data validity, reliability, and generalizability.

Validity relates to how much strength we can place in our conclusions and inferences.
You may recall from Chapter 7 that internal validity refers to the legitimacy of inferences we make about the causal relationship between two things. It refers to whether there is a true causal relationship between the program being studied and the outcomes observed. Questions about validity can occur around other aspects of a study, such as data collection. For example, a researcher might gather data on police-involved shootings and claim that the introduction of tasers to a police department reduced uses of force; however, the term use of force covers more than just firearm use. It might be that overall use of force increased even though firearm use declined.

Concerns about validity can also occur if data are not representative. For example, a news team claimed low morale in one police agency even though only five percent of the officers in the department responded to the informal online survey.* As any college professor reviewing course feedback will tell you, the most enthusiastic respondents are students who really loved, or really hated, a course. How accurately did the views of those 321 cops represent the 6,000 officers in the department? This also raises questions of external validity, that is, the extent to which the findings of a study can be generalized to other settings, groups, or situations.

* www.nbcphiladelphia.com/investigators/nbc10-investigators-low-morale-among-philadelphia-police/3060247/

Reliability refers to whether you can consistently replicate the same results. For example, can you reproduce the findings with a similar sample size of the same group? If I have a sample of officers from a large department and ask them to rank the leadership qualities of the executive team, would I get broadly similar results by repeating the study with a different sample of cops from the same department? If not, my data may not be reliable. It is worth remembering that just because a data collection is reliable, it is not necessarily valid. It could be reliably bad. Reliability is no consolation if the validity of the construct (the core idea, underlying theme, or subject matter) being measured is weak. If you survey residents of a small town and ask how they feel about local police, yet their ratings reliably rise and fall with the frequency of controversial police-involved shootings publicized at the national level nowhere near the town, then the measure of local police legitimacy may not be valid.

Generalizability speaks to the capacity of research to apply to other people or situations. How sensitive are the findings to the idiosyncrasies of the research setting? If a research study finds that a focused deterrence strategy in Oakland, California reduced gang-involved shootings by 26 percent, would such a plan be as effective in Long Beach or San Francisco? What about an east coast city like Philadelphia or Baltimore? What about in a more rural area, or a different country? Ultimately, these are again empirical questions of external validity.

When training police officers on crime reduction approaches, I will discuss strategies shown to have a high likelihood of success.
The group cynic—there is always one—skeptical about any contribution new ideas can make, will often proclaim, “oh, that would not work here”. When pressed, they have little concrete evidence to support this negativity. At the time of writing, there are certainly reasons to be concerned about the state of policing. The cynic, however, does not want to try anything, and like Eeyore from Winnie-the-Pooh, prefers spreading a sense of pessimism to everyone in the room. The more often a practice is successfully replicated, the more it demonstrates a high measure of generalizability; however, the only real way to find out is to implement the practice and evaluate it.

Sample size is an important consideration if data are going to be sampled. Sampling means to draw a subset from a broader population. A population is a group of individuals that share a common characteristic, such as residency of a city. Also known as the eligibility pool, these are the cases or units that comprise all individuals eligible for a study.1 A sample is a smaller subset of the eligibility pool or population. For example, claiming police officers in New York take fewer proactive actions (like pedestrian or traffic stops) when issued a body-worn camera, on the basis of observing one officer, would be nonsense given the size of the city. Conversely, it would be challenging to study every officer across the city—given there are more than 35,000 officers in the New York City Police Department (NYPD). Gathering data from a sample of officers would therefore be appropriate. When estimating a sample size, you need sufficient officers to generate enough experimental power to reasonably detect a change in behavior. On the other hand, if your sample is larger than you need, you will incur additional time and cost unnecessarily.2 There is more on experimental power later in Chapter 12.

Following on from the need to determine a sample size, choosing an appropriate sampling approach that is representative of the phenomenon being studied is a key aspect of a methodologically rigorous study. Some of the most common sampling approaches are shown in Box 9‑1, along with their advantages and disadvantages. Box 9‑1 uses the example of a researcher studying body-worn cameras in the NYPD.

BOX 9‑1 DIFFERENT SAMPLING APPROACHES

Simple random sampling
Technique: Every potential candidate in the population has the same chance of being selected.
Example: Every officer in the NYPD is eligible to be selected for a camera, regardless of role.
Advantages: The process removes researcher bias from the selection.
Disadvantages: If the sample is too small, it may not be representative of the overall organization.

Stratified sampling
Technique: Potential candidates are randomly selected from all subgroups (strata).
Example: Patrol officers are sampled from each borough.
Advantages: The study can provide subgroup estimates in addition to overall estimates.
Disadvantages: Identifying all strata can be challenging, and failure to do so will introduce bias.

Cluster sampling
Technique: The population is divided into meaningful clusters, and the clusters are randomly selected.
Example: The sample includes patrol officers from randomly selected precincts.
Advantages: More cost effective to focus research on specific groups of interest.
Disadvantages: Groups may not be representative of the population, or the process can be prone to bias.

Systematic sampling
Technique: Potential candidates are selected in a regular, systematic manner.
Example: Select every 10th officer ordered by seniority, or officers with a payroll number ending in the same digit.
Advantages: Simple to calculate, and generally achieves a random spread of candidates.
Disadvantages: Can inadvertently introduce bias, such as if payroll numbers have a pattern. Does not guarantee a spread of coverage of the population.

Purposive sampling
Technique: A sample chosen deliberately based on the subject’s specific knowledge or experience.
Example: Officers who have experienced assaults recorded on body-worn camera are invited to interview.
Advantages: A good way to quickly access a specific group of individuals with definitive experiences.
Disadvantages: A biased sample, as it is potentially unlikely to be representative of the experiences of the overall population.

Convenience sampling
Technique: A sample selected because they are available and accessible to the researcher.
Example: All officers from the precinct in which the researcher is conducting another project.
Advantages: An easy way to access a sample of officers in the researcher’s immediate vicinity.
Disadvantages: High bias risk, not only due to a more focused selection, but also the effect of knowing the researcher. Good for a pilot study, though.

Snowball sampling
Technique: Participants invite other people they know to participate in the study.
Example: Officers who experienced assaults recorded on body-worn camera invite officers they know with similar experiences.
Advantages: A useful way for a researcher to contact additional participants that are not easy for the researcher to discover or find.
Disadvantages: The participants can introduce bias based on whom they know and introduce to the researcher.
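As a minimal sketch of the difference between two of the approaches in Box 9‑1, the following uses an invented roster; the borough spread, roster size, and sample sizes are illustrative assumptions only.

```python
# Simple random vs stratified sampling from a hypothetical roster.
import random

random.seed(11)

boroughs = ["Manhattan", "Brooklyn", "Queens", "Bronx", "Staten Island"]
# Invented roster: 200 officers spread across the five boroughs
roster = [(f"Officer-{i:03d}", random.choice(boroughs)) for i in range(200)]

# Simple random sample: every officer has the same chance of selection,
# but a small draw may miss a borough entirely
simple_sample = random.sample(roster, 20)

# Stratified sample: draw four officers from every borough, so each
# subgroup is guaranteed representation
stratified_sample = []
for borough in boroughs:
    stratum = [officer for officer in roster if officer[1] == borough]
    stratified_sample.extend(random.sample(stratum, 4))

print("Boroughs in simple sample:    ", {b for _, b in simple_sample})
print("Boroughs in stratified sample:", {b for _, b in stratified_sample})
```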
SOME GENERAL METHODOLOGICAL APPROACHES

Let us look at one distinction between types of research methods: quantitative and qualitative. Quantitative research involves measurable characteristics and is central to much of evidence-based policing. It frequently involves the use of statistical techniques to estimate how likely it is that any observed differences between characteristics could have occurred by chance. Quantitatively focused analysts tend to embrace a positivist philosophy (remember that from Chapter 1?), believing that objective, replicable, and systematic procedures should be applied to observed phenomena, usually expressed numerically in some way.

Let’s say you are interested in whether de-escalation training is effective.3 You could track use of force incidents recorded by patrol officers in a large urban police department over a couple of years and explore any changes in the rate of incidents involving use of force after some of the officers receive training. A statistical test could tell you if any reduction in use of force by those officers could have occurred by chance, or whether random chance was unlikely and instead the reduction was probably associated with the training. Quantitative reasoning that tests outcomes—rather than exploring whether people think they are effective—is central to evidence-based policing.
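One simple way to see what ‘could have occurred by chance’ means is a permutation test, sketched below with invented monthly incident counts; the counts and group sizes are assumptions for illustration, not results from any study.

```python
# A minimal permutation test: if training truly had no effect, how
# often would shuffling the group labels produce a gap between the
# groups at least as large as the one observed?
import random
import statistics

random.seed(3)

trained = [4, 2, 3, 1, 2, 3, 2, 1]    # invented incident counts
untrained = [5, 4, 6, 3, 4, 5, 3, 4]

observed_gap = statistics.mean(untrained) - statistics.mean(trained)

pooled = trained + untrained
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    gap = (statistics.mean(pooled[len(trained):])
           - statistics.mean(pooled[:len(trained)]))
    if gap >= observed_gap:
        extreme += 1

print(f"Observed gap: {observed_gap:.2f} incidents per officer")
print(f"Share of shuffles at least that large: {extreme / trials:.4f}")
# A very small share suggests chance alone is an unlikely explanation
# for the reduction (chance is only one of several rival explanations).
```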
Positivists point to the importance of quantitative, objective data analysis as a foundation for statistical testing of hypotheses about the impact of interventions, but there are other approaches to learning about the policing world. While quantitative research favors numbers, qualitative research is more grounded in words and narrative. Qualitative research methods can help you understand the context of an intervention and how individuals in the criminal justice system experience change and events. To return to the training example, while quantitative research can tell us that de-escalation training reduced use of force incidents among patrol officers who took the training, it cannot tell us anything about the officers’ experience of the training, how to best implement the training, or how they think it changed their behavior. Perhaps they enjoyed the training and absorbed the lessons. Alternatively, perhaps they hated the training, but it made them aware that management were scrutinizing use of force. They may not have changed their behavior, but rather just reported fewer use of force incidents.

Certain methods of research are more qualitative in nature. This may involve gathering information through interviews, group interviews (which are called focus groups), scouring through documents, or through careful observation and note taking. For example, to better understand how transit cops felt about working in one of the largest drug market areas of the US, a graduate researcher and I spent over 400 hours riding in a police vehicle, observing and talking to officers about their experiences and beliefs.4 For his doctoral dissertation on police socialization, Peter Moskos went so far as to join the Baltimore Police Department as a sworn officer, quite a committed example of a research method called participant observation.5

Increasingly, researchers use mixed methods research designs to understand both what is occurring and why. Mixed methods research combines quantitative and qualitative research techniques and approaches into a single study.6 Scholars sometimes start with qualitative work to learn different ways police deal with an issue and then quantitatively examine the different impacts. This method starts with qualitative and then moves to quantitative. Conversely, researchers can identify the effect of an intervention and supplement their understanding of the results through qualitative data collection.

Surveys can be classified as quantitative or qualitative. Surveys involve respondents self-reporting their thoughts or behaviors on various measures. Because questions can be categorized as either closed-ended or open-ended, they can be analyzed quantitatively or qualitatively. Closed-ended survey questions limit the responses that a person can provide, such as when students were asked to rate whether different police uniforms and clothing appeared aggressive or not aggressive, the officer’s look appeared friendly versus not friendly, and whether the officer appeared approachable versus not approachable.7 These types of questions lend themselves to quantitative analysis because specific responses or numbers can be recorded and compared.
Open-ended survey questions provide the opportunity to answer and give feedback more fully. An open question like “What has been the impact of the murder of George Floyd on your police department?” will generate broader and more illuminating answers than simply asking “Has the murder of George Floyd changed your police department? (yes/no)”. Open-ended questions can stimulate a wider range of responses, allowing respondents to provide potentially new insights. The downside to open-ended questions is that their unstructured format can sometimes make categorization and statistical analysis challenging, if not impossible.

CHOOSING A METHODOLOGICAL APPROACH

The methodological approach you adopt will have a significant impact on your findings and what you can say about the intervention or object you are studying. Qualitative researchers complain that quantitative data lack meaning and context, providing no information about how people experience or operationalize crime and policing strategies. Number crunching can provide minimal understanding as to why initiatives succeed or fail. Conversely, it is often argued that qualitative research “provides only anecdotal, non-scientific examples of marginally interesting and valuable insights, . . . is the realm of pseudo-science, and provides little or no value for addressing how crime and societal responses to crime transpire”.8 Qualitative research cannot empirically ‘prove’ that a policy initiative changed a measurable outcome. There have been situations where participants expressed overwhelmingly positive views of initiatives that were subsequently found to be ineffective. Examples include Scared Straight9 and the Drug Abuse Resistance Education (DARE) program.10

On the evidence of what research tends to be favored in publications, evidence-based policing leans towards the quantitative.11 Policy makers tend to be more agnostic, swayed by a variety of research findings. Research that tests a policing policy tends to be of more benefit to police managers than street officers,12 but for a comprehensive policy to work, we also need to know how to implement change on the front line.

Do not start by choosing an analytical approach. The starting point should be consideration of the problem you are trying to solve and the research question you are trying to understand. Do you want to know if a crime prevention publicity campaign worked to reduce vehicle theft? If so, your characteristic of interest (also known as the dependent variable) is the number of vehicle thefts. That is a quantifiable variable because you can count it. But if you want to know why a program to improve officer wellness and health did not reduce the number of sick days taken by cops in a department, you may have to speak to the officers directly or ask them to complete a survey. If you want to know what is going on, look at quantitative methods. If you want to know why something happened or is happening, consider qualitative approaches. Quantitative and qualitative approaches to research can be complementary.
The next chapter unpacks these different approaches in more detail.

ADVANTAGES OF A PILOT STUDY

Regardless of your methodological choices, a pilot study is an often underappreciated yet potentially vital phase of a project. Pilot studies (also known as feasibility studies) are common in the medical field but less so in criminology and policing, which is a shame because they can save money and heartache. A pilot study is a preliminary project conducted at a smaller scale than the main project, so that the researcher can evaluate the feasibility and validity of the research design and methodology before embarking on the full-scale study. The goal is to improve the eventual study design for the main project. Imagine spending your entire budget on a citywide survey of public safety, only to discover—too late—that the questions were confusing and misinterpreted. Better to discover this flaw early with a pilot study.

Researchers can use a pilot study in different ways. For example, it can be useful to trial a set of survey questions to make sure they are easy to answer, the survey flows logically, and the questions are worded in a way that respondents understand. A pilot study can also test a procedure that will be evaluated, or confirm that management and resources are in place for the main study to operate smoothly.13 Pilot studies are like a dry run. This can be useful when an innovative process requires coordination between different agencies or departments. For example, researchers wanted to examine the effects of TASER exposure on cognitive functioning.14 Exposure to electricity can impair cognitive functioning, and electricity is a key component of a TASER device. Before proceeding with a randomized study on a substantial number of human volunteers, the researchers piloted their study on a small number of police recruits at the San Bernardino County (CA) Sheriff's Training Center. The recruits could volunteer (or not) for the research study, which involved cognitive tests before and after they were zapped; but they had no choice regarding TASER exposure, as it was just part of their course! Police training has changed quite a bit since I was a recruit.

While pilot studies are an excellent way to prevent errors in subsequent larger studies, there is little guidance available on how large they should be. One rule of thumb suggests the pilot should involve about 10 percent of the places or people that will be in the final study.15 If piloting a survey that will be sent to 500 community residents, the ten percent rule of thumb suggests piloting the survey with 50 people. As these are only guidelines, some common sense is required. For example, if you are testing a questionnaire that will be used for in-depth interviews with ten police chiefs, ten percent will not be sufficient. Whatever piloting approach you take, aim for a small pilot that is cost and time effective, yet big enough that you can identify and resolve any problems that might plague your full study.
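As a minimal sketch of that ten percent rule of thumb (a trivial Python helper of my own construction, not a procedure from the cited guidance), the calculation looks like this:

```python
def pilot_size(main_study_n: int, fraction: float = 0.10) -> int:
    """Rule-of-thumb pilot size: roughly ten percent of the main study.

    For very small main studies (e.g., in-depth interviews with ten
    police chiefs) the rule breaks down, so treat the result as a
    starting point for judgment, not a requirement.
    """
    return max(1, round(main_study_n * fraction))

print(pilot_size(500))   # citywide survey of 500 residents -> pilot with 50
print(pilot_size(1200))  # larger survey -> pilot with 120
```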
A range of tips for pilot studies is shown in Box 9‑2.

BOX 9‑2 THE CONTRIBUTION OF PILOT STUDIES

Pilot studies are great for:
• Testing survey questions and questionnaires
• Assessing the feasibility of a research intervention
• Checking that administration of the study flows efficiently
• Establishing observation procedures
• Confirming reporting and paperwork protocols
• Training a research team
• Convincing stakeholders a project is feasible
• Checking the efficacy of subject recruitment processes
• Estimating how burdensome project activities will be

Pilot studies should not be used for:
• Testing hypotheses
• Estimating a sample size for the full study

Pilot studies work best when they:
• Are integrated into the overall research design
• Are designed as cost-effective and reasonably quick studies
• Mimic the important components of the full planned study
• Are smaller in size and scope than the full research project
• Include the time necessary to modify the full project with lessons from the pilot study
• Involve around ten percent of the participants (one rule of thumb)

SUMMARY

Are you looking to evaluate and test a specific intervention on a measurable outcome? For example, if you are interested in whether a new patrol strategy reduces public disorder events, then the analysis of recorded disorder incidents—most likely sourced from a police crime database—will be a quantitative approach. Conversely, if you are more interested in learning whether officers are correctly using a new protocol for de-escalating public disorder incidents, you may lean towards a qualitative approach such as observations or a survey. For a more rounded study that explores whether a new de-escalation protocol is implemented correctly and is having a measurable effect, a mixed methods research approach that combines these methodologies would be ideal.

Within evidence-based policing, data downloaded or provided by police computer systems are a staple of many analytical reports, and quantitative analyses dominate the core of police research knowledge. But there is a place for qualitative research and surveys to better understand implementation, perspectives on new initiatives, and how people make and experience change in the dynamic world of policing. Be open to new methodological approaches to research, and always try to test them first with pilot studies before committing to your full research project.

REFERENCES

1. Weisburd, D. and Petrosino, A. (2005) Experiments, criminology. In Encyclopedia of Social Measurement (Kempf-Leonard, K. ed), Elsevier: Amsterdam, pp. 877–884.
2. Britt, C. and Weisburd, D. (2010) Statistical power. In Handbook of Quantitative Criminology (Weisburd, D. and Piquero, A. eds), Springer: New York, pp. 313–332.
3. Engel, R.S., McManus, H.D. and Herold, T.D. (2020) Does de-escalation training work? Systematic review and call for evidence in police use-of-force reform. Criminology and Public Policy 19 (3), 721–759.
4. Ratcliffe, J.H. and Wight, H. (2022) Policing the largest drug market on the eastern seaboard: Officer perspectives on enforcement and community safety. Policing: An International Journal 45 (5), 727–740.
5. Moskos, P. (2008) Cop in the Hood: My Year Policing Baltimore's Eastern District, Princeton University Press: Princeton, NJ.
6. Maruna, S. (2010) Mixed method research in criminology: Why not go both ways? In Handbook of Quantitative Criminology (Weisburd, D. and Piquero, A. eds), Springer: New York, pp. 123–140.
7.
Simpson, R. (2020) Officer appearance and perceptions of police: Accoutrements as signals of intent. Policing: A Journal of Policy and Practice 14 (1), 243–257.
8. Tewksbury, R. (2009) Qualitative versus quantitative methods: Understanding why qualitative methods are superior for criminology and criminal justice. Journal of Theoretical and Philosophical Criminology 1 (1), 38–58, p. 40.
9. Petrosino, A., Turpin-Petrosino, C., Hollis-Peel, M.E. and Lavenberg, J.G. (2013) Scared Straight and other juvenile awareness programs for preventing juvenile delinquency: A systematic review. Campbell Systematic Reviews 9 (1), 1–55.
10. Rosenbaum, D.P., Flewelling, R.L., Bailey, S.L., Ringwalt, C.L. and Wilkinson, D.L. (1994) Cops in the classroom: A longitudinal evaluation of drug abuse resistance education (DARE). Journal of Research in Crime and Delinquency 31, 3–31.
11. Buckler, K. (2008) The quantitative/qualitative divide revisited: A study of published research, doctoral program curricula, and journal editor perceptions. Journal of Criminal Justice Education 19 (3), 383–403.
12. Thacher, D. (2008) Research for the front lines. Policing and Society 18 (1), 46–59.
13. Thabane, L., Ma, J., Chu, R., Cheng, J., Ismaila, A., Rios, L.P., . . . Goldsmith, C.H. (2010) A tutorial on pilot studies: The what, why and how. BMC Medical Research Methodology 10 (1), 1–10.
14. White, M.D., Ready, J.T., Kane, R.J. and Dario, L.M. (2014) Examining the effects of the TASER on cognitive functioning: Findings from a pilot study with police recruits. Journal of Experimental Criminology 10 (3), 267–290.
15. Hertzog, M.A. (2008) Considerations in determining sample size for pilot studies. Research in Nursing & Health 31 (2), 180–191.

10 HOW DO YOU MAKE RESEARCH METHODOLOGY CHOICES?

MARS AND VENUS OF RESEARCH METHODS

Graduate students in criminology and criminal justice quickly discover the underlying tension that simmers between quantitatively and qualitatively trained scholars. Like children at a school dance, they stand uncomfortably on opposite sides of the gym (or conference mixer), rarely interacting unless forced. The 'quantoids' huddle in one corner debating p-values versus confidence intervals, while the qualitative researchers huddle in another, discussing subjective interpretation. I am often reminded of the old joke that the battles in academia are so bitter because the stakes are so low. Perhaps qualitative scholars really are from Mars and quantitative researchers from Venus. Cops observe these esoteric disagreements with amusement, or more often frustration.1

Scholars often teach that, rather than remaining fixed in one camp, it is better to put your research question front and center and then decide how you will answer it. But the résumés of those same scholars often show a singular methodological focus. This is only an introductory book, so what follows is a too-quick overview of some general areas of quantitative, qualitative, survey, and observational research design. If you are engaging in a real policing evaluation, you are strongly encouraged to read more widely. Choose the correct analytical approach for the problem rather than trying to contort the problem to match your skill set. View it as an opportunity to expand your analytical repertoire. More information is also available from the website for this book at evidencebasedpolicing.net.
Also, this chapter refers to the evidence hierarchy from Chapter 7, so for ease of reference the key figure is shown once again.

Figure 10‑1 Evidence hierarchy for policy decision-making

DIFFERING LEVELS OF CAUSAL SOPHISTICATION

As noted earlier, the evidence standard for much policy work leans towards the quantitative. If your work is policy oriented, you should strive for at least a level 3 study (summarized in Figure 10‑1), what Sherman called the "bare minimum, rock-bottom standard".2 This requires you to be able to gather data before and after any intervention took place, in both the treatment (target/intervention) area and a comparison (control) area. If you can do this for multiple areas, even better (level 4). Unfortunately, as you proceed up the evidence hierarchy, more highly rated studies tend to involve more effort.

None of this is to say that lower-level studies make no contribution. I would love to count the chats on my Reducing Crime podcast as high-quality research; however, I recognize that the anecdotes and thoughts of policing researchers, police chiefs, and officers sit at the bottom of the evidence hierarchy, regardless of how insightful they might be. But they are still useful. It has always been rewarding to hear from researchers who are conducting new studies inspired by listening to one of my guests on the podcast.

Given the introductory goal of this book, there is insufficient space to go into quantitative research methods in detail; however, the flow diagram in Figure 10‑2 might help guide your further reading and—along with the evidence hierarchy—give you some guidance in finding a research method.* If you follow along with the figure, imagine you are measuring an intervention. If you can measure outcomes before and after it starts, and the intervention was applied to one place and one control site, then you have a level 3 quasi-experimental design or natural experiment. You may recall that natural experiments are policy situations that occur in the real world without any researcher intervention, providing an opportunity for observational study. If you were able to test the intervention in multiple treatment and control places, it would rise to a level 4 study.†

* There are many variations of this type of research decision tree.3–5
† To get technical for a moment, interrupted time series or regression discontinuity designs can vary between level 2 and 4 depending on the sophistication of their statistical approaches. A dedicated textbook on time series analysis is recommended.

Alternatively, imagine exploring whether a police officer or a social worker is better able to convince a vulnerable person to seek shelter or treatment, and you can randomize whether a police officer or social worker approaches the person. At this point, you likely have a randomized controlled experiment. The benefits and challenges of these level 5 studies are discussed in the next chapter.
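As a rough illustration of that decision logic, here is a minimal Python sketch of the kind of decision tree shown in Figure 10‑2 (a simplification of my own construction, not a reproduction of the book's figure; real designs such as interrupted time series need more nuance than this):

```python
def evidence_level(pre_and_post: bool, control_group: bool,
                   multiple_sites: bool, randomized: bool) -> int:
    """Map basic design features to an approximate evidence-hierarchy level.

    Randomized assignment with a control group -> level 5; before/after
    measures with multiple treatment and control sites -> 4; before/after
    with one control site -> 3; a control group or pre/post data alone -> 2;
    anything less -> 1.
    """
    if randomized and control_group:
        return 5
    if pre_and_post and control_group:
        return 4 if multiple_sites else 3
    if control_group or pre_and_post:
        return 2
    return 1

# A one-site quasi-experiment with before/after data in treatment and control:
print(evidence_level(pre_and_post=True, control_group=True,
                     multiple_sites=False, randomized=False))  # -> 3
```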
It should be noted that the rating for survey research applies only within the context of evidence-based policy decision-making and is not a reflection on survey research generally. There are policies that were assessed as successful by survey respondents (DARE and Scared Straight) and subsequently found to be ineffective. Survey methods and approaches vary so considerably that a general ranking is difficult to assign. Depending on how they are deployed in a research study, surveys could feature at every level of the evidence hierarchy. As Figure 10‑2 shows, a one-off survey of a treatment group only would be at level 1. If you use multiple surveys before and after a policing intervention, then the rating would be higher.

My colleagues and I surveyed residents of high crime hot spots before and after the Philadelphia Policing Tactics Experiment.6 There is often a perception that focused police operations—such as foot patrol or offender-focused proactive tactics—have a backfire effect on public perception and negatively impact community views of police. Comparing our pre and post surveys, we did not find any noticeable change in the community's perceptions of crime and disorder, perceived safety, satisfaction with police, or procedural justice—a finding that reassured police leaders worried about backfire effects.7

Figure 10‑2 Quantitative methods flow diagram

QUASI-EXPERIMENTAL RESEARCH DESIGN

True experiments involve random selection of participants or areas into treatment and control groups. But even when that is not possible, we can still gain considerable insight from what are called quasi-experiments. A quasi-experimental design attempts to approximate the characteristics of a true experiment even though it does not have the benefit of random allocation of units to treatment and control conditions.8 Because of the lack of randomization, quasi-experiments generally cannot produce treatment and control areas or groups that are exactly alike on every measure, and they therefore have some issues with internal validity. With the introduction of time series or control groups, however, they can achieve results that can be considered 'evidence' within an evidence-based policing framework.9

Before proceeding, a quick caveat. There are dozens of ways to design quasi-experimental research studies, and you can read about more advanced techniques elsewhere. The book's website (evidencebasedpolicing.net) has more information on research design on the page dedicated to this chapter.

Posttest-only

The following posttest-only scenario is, alas, a common one for crime and policing researchers. The police chief calls and says, "We just finished a project, and we think it worked. Can you come and evaluate it?" I spoke to one chief who claimed that department-wide procedural justice training had improved how officers interact with the community. Being department-wide, there was no control group, and because I was only hearing about the training after the fact, there was no opportunity to gather any pre-training data. In this situation, the best option could be to survey officers and ask if the training improved their skills, or if they think it improved their community contact. In the end, however, we are left with a weak (level 1) study (Figure 10‑3).

Figure 10‑3 Posttest-only research design

Pretest-posttest design

A pretest-posttest research design (Figure 10‑4) involves a pretest measure of the outcome of interest prior to conducting an initiative, followed by a posttest on the same measure after treatment occurs.10 It is an improvement over the posttest-only design, but not by much. Challenges to internal validity remain.
For example, imagine measuring a reduction in bar assaults before and after the introduction of a public awareness program. Did assaults decline because of the program, or because the weather deteriorated and fewer people went out? Or because football season ended?

Figure 10‑4 Pretest-posttest research design

Posttest-only control group design

It might appear that having a control group—people or places that did not receive the intervention, as a comparison—would be a strong design; however, this approach lacks two important features. It is missing pretest measurements, and it also lacks random assignment (Figure 10‑5). I am always dubious when a police leader implements a favored program in just one precinct and claims success, as I often discover that the precinct captain was known to be a dynamic leader who went to extraordinary lengths to make the program work. These programs are rarely as effective when scaled up to more precincts or locations. Regression analysis with a dummy variable can analyze this type of project; however, it still only rates a 2 on the evidence hierarchy.

Figure 10‑5 Posttest-only control group research design

Single-group time-series design

Politicians (and police leadership) are often unwilling to test an intervention in only a subset of locations or people. When they demand a jurisdiction-wide or citywide implementation, all is not lost if you can measure the outcome of interest over an extended period before and after the policy change. Regular, repeated measurements before and after the treatment (Figure 10‑6) can be analyzed using either interrupted time series analysis or a related approach called regression discontinuity.

Figure 10‑6 Single-group time-series research design

The challenge is that you require enough data points (repeated pretests) to build a reliable time series that can control for seasonality, trend over time, and other challenges; at least 50 pre-intervention data points have been recommended.11 If you wanted to examine the impact of a Neighborhood Watch program on monthly theft from vehicle counts, you would need more than four years of pre-intervention monthly counts to reliably estimate how seasonality and trend affect crime in the area. These approaches are advanced, and referring to textbooks and experts on the subject is recommended. If the analytical challenges can be overcome competently, a single-group time-series design can rise to a level 4 on the evidence hierarchy.

Two-group pretest-posttest design

This is a popular evaluation design, and one that achieves the important distinction of having a comparison (also known as control) group (Figure 10‑7), thereby reaching the "bare minimum, rock-bottom standard".2 Assuming the two areas or groups are comparable, these designs can be analyzed with a difference-in-difference type of test. Using multiple treatment and control sites can lift this to level 4 on the evidence hierarchy; however, even with only one target and one comparison area, level 3 can be attained if it can be shown that the control area is comparable to the target site.
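To make the difference-in-difference logic concrete, here is a minimal Python sketch with invented crime counts: the estimate is the change in the treatment area minus the change in the control area, which nets out trends that affect both areas alike.

```python
def diff_in_diff(treat_pre: int, treat_post: int,
                 ctrl_pre: int, ctrl_post: int) -> int:
    """Difference-in-difference estimate of a treatment effect.

    Subtracting the control area's change removes influences
    (seasonality, citywide crime drops) shared by both areas.
    """
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

# Hypothetical counts: treatment falls 120 -> 90, control falls 115 -> 105.
effect = diff_in_diff(treat_pre=120, treat_post=90,
                      ctrl_pre=115, ctrl_post=105)
print(effect)  # -20: about 20 fewer crimes than the shared trend predicts
```

A real evaluation would also test whether that difference could plausibly have arisen by chance (see Chapter 13).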
My colleagues and I took this approach when we used the Rollin' 60's gang territory as a comparable control area in estimating that a major FBI and Los Angeles Police Department gang takedown of the Rollin' 30's Harlem Crips in South Central Los Angeles—Operation Thumbs Down—reduced violence by 22 percent.12

Figure 10‑7 Two-group pretest-posttest research design

Randomized experiments

A check of the evidence hierarchy in Figure 7‑3 shows that a level 5 study can be achieved if the membership of the groups in the two-group pretest-posttest design (which you just read about in the previous section) can be randomized. This is the 'classic' experimental design10 and has been described as the 'gold standard' of evaluation research.9 There are many ways to run a randomized controlled experiment, such as the two-group randomized pre-post design shown in Figure 10‑8. Randomized experiments deserve a bit of discussion, so they are the subject of the next chapter.

Figure 10‑8 Two-group pretest-posttest randomized research design

SELECTING A QUALITATIVE APPROACH

The evidence hierarchy ranks quantitative research methods based on their capacity to address challenges to internal validity. "Qualitative research, on the other hand, does not have a single method from which results are seen as more truthful than others".13

Interviews

If you want to understand how a large group of people (such as every officer in a police department) thinks about some concept, then interviews would be time-consuming, repetitive, and exhausting. Interviews are better for gaining a deeper perspective on a subject from a small group of individuals. For example, interviews would be a useful technique for learning about the barriers that police executives experience when trying to implement evidence-based change in their organizations. There are three main types of interviews (structured, semi-structured, and unstructured), each with their own benefits and weaknesses.

Structured interviews follow a pre-designed interview protocol used to guide the researcher. Structured interviews do not stray from the interview guide. If you have prepared and tested a reliable and comprehensive protocol (see Advantages of a pilot study, Chapter 9), there are numerous advantages. This approach ensures you do not miss questions, you gather information in the same manner with every subject, and, if you have multiple researchers doing interviews, it increases the chances of consistent results. Unfortunately, it limits the opportunities for subjects to expand on their answers or for the interviewer to probe more deeply into different areas that come up in the interview.

Semi-structured interviews start with an interview protocol as a guide but give the researcher an opportunity to probe the participant for additional details through the interview process. Think of it more as a guided conversation between the researcher and subject. Being semi-structured, this approach maintains a core set of discussion topics but allows the conversation to delve more deeply into interesting areas or lets the participant clarify what they say. This method is popular because the interview guide retains the structure necessary to keep the interview on topic and avoids the risk of forgetting to cover an area, while still allowing greater exploration of areas of interest to the researcher. The main challenge is keeping track of time and ensuring the overall structure is retained.
Unstructured interviews tend to be the least guided and most conversational approach. Researchers use this approach to establish rapport or to lead into sensitive subjects. The researcher has to take the lead and be aware that the respondent may deliberately or inadvertently draw the conversation away from the research topic. Furthermore, if you are interviewing more than one person, there is no guarantee that the interviews will cover similar topics, hampering data collation. Unstructured interviews are advantageous because there is little need to prepare an interview protocol as there is with semi-structured and structured interviews. Researchers benefit from practice in keeping the conversation on topic.

Interviews work best when the researcher has prepared and piloted an interview protocol, builds a rapport with the interviewee, is an active listener, and makes an effort to put the interviewee at ease.

Focus groups

Focus groups are moderated meetings with a small group of participants who are carefully selected for their ability to contribute insights into the research topic. Between 6 and 11 participants is often a suitable number.14 With interviews, the researcher can access a single perspective, but with focus groups the researcher can garner a variety of opinions, increase the diversity of perspectives, and observe how discussions and disagreements unfold and are resolved. The insights tend to differ from those of one-to-one interviews, because participants are often responding to the comments of other participants rather than just the prompts and questions from the interviewer or moderator.

Focus groups require more preparation and organization, and the moderator plays an important role in managing the meeting and keeping it on track. By bringing a group together, the researcher can save time and money, though inevitably with more people in the group, each person may provide less individual input to the project. Attention needs to be paid to the role of the moderator, as their biases can (deliberately or inadvertently) steer the discussion. Box 10‑1 has some general principles for focus groups. Focus groups have a lot of flexibility: they can be run with smaller or larger groups, in person or virtually (though this requires careful management), they allow for more than one moderator, and they should not rely on the attendance of any single participant.

BOX 10‑1 GENERAL PRINCIPLES FOR FOCUS GROUPS

1. Clear any organizational hurdles, such as permission from the police department or university institutional review procedures.
2. Pay careful attention to participant selection to get the right mix of perspectives.
3. Establish a mechanism for recording the meeting through audio, video, or a note-taker. Make sure the participants are aware of this.
4. Select a moderator who will be effective at managing the group, will encourage quieter members to speak up, and is aware of the potential negative effects of any biases they might have.
5. When choosing a moderator, or when having observers in the meeting, be cognizant of any power dynamics that might hinder the full involvement of participants.
For example, the presence of senior police officers can limit honest feedback from line officers. To improve discussions, consider holding separate focus groups for different participant types.
6. Provide the moderator with a protocol that helps them steer the discussion towards areas of research interest. Discuss this with the moderator in advance of the focus group.
7. Thank people for attending and take time to introduce everyone who is present, including note-takers and observers.
8. If you require documented consent from the participants, introduce the form, discuss it with them, and provide sufficient time to answer any questions they have.
9. Agree on a time limit for the focus group. In-person meetings can run up to half a day or even a day (with regular breaks), while virtual meetings should be shorter (one or two hours is ideal, and no longer than four hours).
10. If discussing a controversial or sensitive subject, be empathetic and aware of everyone's comfort level, pulling back from the sensitive topic if necessary.

Field observations

Field observation involves a researcher directly gathering information from a police or criminal justice setting by observing participants in their community or work environment. The primary mechanism is as a participant observer, where researchers "conduct themselves in such a way that they become an unobtrusive part of the scene, people the participants take for granted".15 In an ideal world, the participants even forget that the researcher is there. Participant observation differs from passive observation, where the researcher is remote from the work and plays no part in proceedings, such as when researchers observed shoplifting behavior through the surveillance cameras of a large pharmacy.16 Research reviewing police body-worn camera video would also fall into this category. Body-worn cameras can only provide one perspective, a viewpoint that can be interpreted differently depending on the passive observer.17

Participant observers are in the action and can more easily place what they observe in context. The participant observer can hear the initial radio call, take in the entire scene, smell the breeze, and feel the rain. They can hear the sirens of the backup cars, read the mood of the gathering crowd, and feel the tension in the air. After an incident, the observer can chat with the officer, listen to the cops debrief, and watch as the crowd slowly disperses.

In policing research, the participant observer should be careful not to influence the research setting inadvertently or unduly. Considering the nature and level of participation is important. For example, while my graduate researcher and I adopted a participant observer role with transit officers in Philadelphia's Kensington neighborhood,18 the incident described at the start of Chapter 9 required me to temporarily take a more active role. I was not going to stand by and watch a police officer or member of the public get injured if I could help it. I acknowledge that other researchers might have made a different choice. This is never discussed in social science methodology books, but in policing research—while these situations are extremely rare—thinking in advance about what might occur is recommended. I offer further guidance in Box 10‑2.
BOX 10‑2 OBSERVER GUIDANCE FOR A RIDE-ALONG

1. Make sure you have all necessary permissions (university, police chief, as necessary)
2. Bring appropriate comfortable clothing, inclement weather gear (if necessary), money, and a fully charged cell phone
3. Establish whether you are required to wear protective clothing such as a ballistic vest
4. Introduce yourself to the officer you will be accompanying
5. Show the officer any forms or reports you must complete
6. Record the officer's cell phone number and immediately call it to confirm you have each other's number in case you get separated
7. Ask the officer to explain what they would like you to do at traffic stops, incidents, and in the event of any emergencies
8. If you will document the ride-along (notes or recordings), discuss this with the officer, establish their level of comfort, and get their permission
9. On the ride-along, chat, be non-judgemental, and be good company!
10. If unsure of anything, ask immediately

Good field research involves careful recording of field notes, either through audio or video (if the police officer agrees), or through note taking. Notes can be both descriptive (what you witnessed occurring) and reflective (how, as the researcher, you perceived and felt about the situation). I have previously used pre-printed forms with spaces for the research team to complete key research data during ride-alongs. With the officer's permission, photographs can be a powerful tool, not only as a memory jog for your research, but also as a mechanism to convey the context of your research in reports and presentations.‡

‡ Photographs and video taken in public places where people would not have any reasonable expectation of privacy may not require informed consent; however, the rules change from place to place. Academics should clarify their research protocol with local law and their respective institutional review board.

LEARNING FROM A SURVEY

Surveys are a method of asking people a range of questions about related research topics to gain an understanding of a larger population. Police departments sometimes survey residents of cities or neighborhoods about their perceptions of police or experiences of crime. The survey is a flexible research tool and can be administered through in-person interviews, mailed questionnaires, telephone surveys, or online tools. There is quite a science to survey methodology, so this will only be the briefest of overviews. Further reading is always recommended.

First, give thought to how you will administer your survey. Self-report online surveys can use a wider range of question and response options than an interviewer-conducted telephone survey. But self-report surveys do not allow for follow-up or for interviewers to explain questions. Will you ask open-ended questions that provide greater opportunity for insightful responses, but make categorization and statistical analysis challenging? Or will you choose closed-ended questions that allow for more limited responses but are more amenable to descriptive and statistical analysis?

How many questions will you ask? While it is tempting to ask dozens of questions, this can lead respondents to skip questions, skim through answers (e.g., checking the same box each time without really reading a question), or even abandon the questionnaire. Careful attention to question wording to reduce ambiguity and bias is also important, which is another reason why piloting a survey is recommended. You should focus on the format that best answers your research questions, in a way that maximizes the number of responses, and does so within reasonable cost, time, and convenience to everyone.
And of course, you should consider how you will analyze the results. How will you handle incomplete questionnaires? Will you gather demographic information? There is a lot to consider. Computer software tools make it easy to develop and administer an online questionnaire, and the data are readily available to download. With traditional paper surveys, you will need to enter the data into a spreadsheet before you can conduct your analyses. Here are some of the most common types of survey questions:

Likert scales

One of the most common question types, this psychometric scale is named after its inventor, Rensis Likert. It typically offers five response options with a neutral option in the center. For example:

The police do a good job in my neighborhood:
• Strongly agree
• Agree
• Neither agree nor disagree
• Disagree
• Strongly disagree

Multiple choice questions

Multiple choice questions are similar to a Likert scale in that they have specific answer choices to choose from. However, the researcher can change the number of options and the number of responses, and the options do not have to fall on a scale. You can even allow the respondent to choose more than one response. With some survey designs, you can add an "Other" option with space for the respondent to write in an answer not available in the listed options.

Rating questions

Rating responses offer a scaled response but differ from a Likert scale because there is not a labeled category for each value. For example, you could ask people to rate the helpfulness of a police officer during a community meeting on a scale of 1 (extremely unhelpful) to 10 (extremely helpful). These can be used to measure change in respondents' perceptions over time, if asked more than once.

Ranking questions

These can be used to identify the priorities or preferences of respondents. They are a flexible survey tool, though they do not quantify the distances between options like a scale does, nor do they provide the reasoning behind a choice. As an example, you could ask a respondent to rank these four community police officer characteristics in order of importance, from most important (1) to least important (4):
• Friendly
• Punctual
• Authoritative
• Knowledgeable

Open-ended questions

Open-ended responses are not easily quantified; however, they provide a chance to learn about respondent sentiment and opinion. You can also learn basic information that might otherwise be difficult to capture in other question types. For example, next to a space for a response, you could ask: "Why did you attend the community meeting today?" Open-ended questions can also be a way to gather precise data not available from other types of question, such as "How many robberies occurred in your city last year?"

SUMMARY

It can be a challenge to read and understand a research study, let alone design your own. Always ensure that the methods answer the original research question. Researchers can sometimes be vulnerable to falling in love with a certain research design, even when it does not suit their research question. If necessary, use a new project as an opportunity to learn a new approach. If you are a police officer wishing to do research, this book is a starting point, but seeking advice from academics or pracademics can be helpful. If police leaders and politicians are reluctant to engage
in true experimental research, quasi-experimental designs can still achieve meaningful answers. Qualitative research methods can generate considerable insights. Interviews with decision-makers and key stakeholders can inform your understanding of implementation issues, focus groups help you observe how differing views play out, and field observations (such as ride-alongs) provide a chance to witness the impacts of policies on the ground. And finally, surveys can be quantitative or qualitative, opening a window on the perspectives, thoughts, and opinions of more people than we might reasonably be able to interview. True experiments occur with randomized and controlled research studies, and they are the subject of the next chapter.

REFERENCES

1. Murray, A. (2019) Why is evidence based policing growing and what challenges lie ahead? In Evidence Based Policing: An Introduction (Mitchell, R.J. and Huey, L. eds), Policy Press: Bristol, pp. 215–229.
2. Sherman, L.W. (2013) The rise of evidence-based policing: Targeting, testing and tracking. In Crime and Justice in America, 1975–2025 (Tonry, M. ed), University of Chicago Press: Chicago, p. 44.
3. Ariel, B. (2019) "Not all evidence is created equal": On the importance of matching research questions with research methods in evidence based policing. In Evidence Based Policing: An Introduction (Mitchell, R.J. and Huey, L. eds), Policy Press: Bristol, pp. 63–86.
4. Mazzucca, S., Tabak, R., Pilar, M., Ramsey, A., Baumann, A., Kryzer, E., . . . Brownson, R. (2018) Variation in research designs used to test the effectiveness of dissemination and implementation strategies: A review. Frontiers in Public Health 6 (32), 1–10.
5. Grimes, D.A. and Schulz, K.F. (2002) An overview of clinical research: The lay of the land. Lancet 359 (9300), 57–61.
6. Groff, E.R., Ratcliffe, J.H., Haberman, C., Sorg, E., Joyce, N. and Taylor, R.B. (2015) Does what police do at hot spots matter? The Philadelphia policing tactics experiment. Criminology 51 (1), 23–53.
7. Ratcliffe, J.H., Groff, E.R., Sorg, E.T. and Haberman, C.P. (2015) Citizens' reactions to hot spots policing: Impacts on perceptions of crime, disorder, safety and police. Journal of Experimental Criminology 11 (3), 393–417.
8. Shadish, W.R., Cook, T.D. and Campbell, D.T. (2002) Experimental and Quasi-Experimental Designs for Generalized Causal Inference, Cengage Learning: Belmont, CA, 2nd ed.
9. Farrington, D.P., Lösel, F., Braga, A.A., Mazerolle, L., Raine, A., Sherman, L.W. and Weisburd, D. (2020) Experimental criminology: Looking back and forward on the 20th anniversary of the academy of experimental criminology. Journal of Experimental Criminology 16, 649–673.
10. Bell, B.A. (2012) Pretest–posttest design. In Encyclopedia of Research Design (Salkind, N.J. ed), Sage: Thousand Oaks, pp. 1087–1091.
11. Box, G.E.P., Jenkins, G.M., Reinsel, G.C. and Ljung, G.M. (2015) Time Series Analysis: Forecasting and Control, Wiley: Hoboken, NJ, 5th ed.
12. Ratcliffe, J.H., Perenzin, A. and Sorg, E.T. (2017) Operation Thumbs Down: A quasi-experimental evaluation of an FBI gang takedown in South Central Los Angeles. Policing: An International Journal of Police Strategies and Management 40 (2), 442–458.
13. Huey, L., Mitchell, R.J., Kalyal, H. and Pregram, R. (2021) Implementing Evidence-Based Research: A How-to Guide for Police Organizations, Policy Press: Bristol.
14. Harrell, M.C. and Bradley, M.A.
(2009) Data Collection Methods: Semi-Structured Interviews and Focus Groups, RAND Corporation: Santa Monica, CA.
15. Taylor, S.J., Bogdan, R. and DeVault, M. (2015) Introduction to Qualitative Research Methods: A Guidebook and Resource, John Wiley and Sons: Hoboken, NJ.
16. Dabney, D.A., Dugan, L., Topalli, V. and Hollinger, R.C. (2006) The impact of implicit stereotyping on offender profiling: Unexpected results from an observational study of shoplifting. Criminal Justice and Behavior 33 (5), 646–674.
17. Boivin, R., Faubert, C., Gendron, A. and Poulin, B. (2020) Explaining the body-worn camera perspective bias. Journal of Qualitative Criminal Justice & Criminology 9 (1), 2–28.
18. Ratcliffe, J.H. and Wight, H. (2022) Policing the largest drug market on the eastern seaboard: Officer perspectives on enforcement and community safety. Policing: An International Journal 45 (5), 727–740.

11 HOW DO RANDOMIZED EXPERIMENTS WORK?

ARE PARACHUTES EFFECTIVE?

Academic journal articles are not exactly known for their mirth. In fact, I would usually avoid recommending scholarly writing if you seek any kind of levity; however, a splendid tongue-in-cheek paper by Gordon Smith and Jill Pell is worth noting. In it, they point out the absence of randomized controlled trials of parachutes.1 As you will learn in this chapter, randomized trials involve one group of people or places receiving an intervention and a control group who miss out. Smith and Pell humorously argue that without robust randomized trials, we cannot be entirely sure parachutes work; therefore, rigid advocates of the importance of experiments should organize their own randomized trials. Would you like to be in the control group?

Obvious situations aside, it is still the case that "the universally accepted best method for determining a treatment effect is to conduct a controlled application of a treatment to two groups, composed at random, who either receive or do not receive the treatment".2 The distinction of having random assignment into either treatment or control group is the core of what are called randomized controlled trials, or RCTs—also known as randomized controlled experiments.

A sensible approach to evidence in evidence-based policing is to exclude little from consideration, but to recognize that some research is better at minimizing potential biases and generating more reliable evidence. In that vein, randomized experiments were originally described as the 'gold standard' of evaluation research,3 due to their ability to produce the strongest internal validity. On the whole, criminologists no longer rigidly argue this, and they recognize that a variety of methodologies can contribute evidence.4 That being said, randomized experiments are central to experimental criminology.5 As David Weisburd has written, "Despite the fact that experiments are difficult to carry out, and take tremendous effort and time to develop, they are worth their weight in gold—so to speak, if you want to influence public policy".4

THE STUDY DESIGN PROBLEMS TACKLED BY RANDOMIZATION

Faced with two groups—only one of which received some intervention—inexperienced evaluators can revert to the traditional metrics from their undergraduate studies, such as regression analysis. It is naïve to assume these tools will provide the answer if their limitations are not also considered.
For example, imagine if kids enrolled in an after-school program had less subsequent involvement in disorder than children from a neighboring school without the program. While it is tempting to claim success, there are multiple causes of potential bias that lie ready to trap the unwary analyst. What if the kids in the treatment program attend a wealthy private school, while the untreated children are from a public school? The danger to the analyst here is failing to control for what are called pre-treatment differences, such as—in this case—selection bias. Selection bias commonly occurs when the participants selected into the treatment group (which receives the program) and control group (which does not) are not comparable (Figure 11‑1).

Figure 11‑1 Non-comparable group assignment can introduce selection bias

The problem can become noticeable when there are substantial differences between the treatment and the control groups on the outcome of interest at the pretest stage. In Figure 11‑2, we would have more confidence that we have a real effect when both the treatment and the control group start around the same value on the pretest outcome measure (the graph on the left, A). When there are significant differences between the treatment and the control group before the intervention has been introduced (arrow in graph B), then any apparent treatment effects are potentially biased. That is, unless the difference shown at the arrow can be explained and dealt with. This requires the use of control variables and other tricks that can impact the efficacy of the study.

Figure 11‑2 Potentially biased effects with unexplained pre-treatment differences

Pre-treatment selection effects are not the only potential problem with the example study. Imagine if the local police department has an ongoing educational contribution to the after-school program and therefore has an established relationship with the children in the treatment group. If officers are more lenient when encountering those kids engaged in disorder, this creates a potential source of what is called confounding. Confounders are external variables that can occur at the same time as the treatment intervention and interfere with the interpretation of the study. The confounding variable in this case would be police leniency towards kids in the program, resulting in fewer detentions. A confounding variable is one that affects the treatment and control groups differently, or only affects one of them. The possible effect of a confounding variable is shown in Figure 11‑3, where the thicker arrow from the confounder towards the participants in the program shows an unbalanced effect, biasing the outcome.

Figure 11‑3 Confounding variables can affect groups differently

The impact of a significant confounder on the outcome can go either way. It can suppress the true effect, or (as in the example shown in graph B of Figure 11‑4) push the posttest outcome measure (arrow in graph B) to suggest a more effective intervention than warranted. Confounder effects can be interwoven with the true effect, masking the true impact. If this additional 'push' results in a statistically significant finding where the true effect would not have achieved significance, unwary evaluators can claim success for an intervention in error.

Figure 11‑4 Potentially biased outcomes to true tests with confounder effects
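To see how a confounder can exaggerate an apparent effect, here is a minimal simulation in Python. Every number is invented; the point is only that a naive group comparison attributes the program effect and the leniency bias together to the program:

```python
import random

random.seed(1)

# Hypothetical setup: the program truly prevents 2 detentions per 100 kids,
# but police leniency (a confounder) hides 3 more detentions per 100 kids
# in the treatment group only.
true_effect = -2
leniency_bias = -3

control = [random.gauss(10, 2) for _ in range(1000)]
treatment = [random.gauss(10 + true_effect + leniency_bias, 2) for _ in range(1000)]

naive_estimate = sum(treatment) / len(treatment) - sum(control) / len(control)
print(round(naive_estimate, 1))  # about -5: much larger than the true -2
```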
When confounding variables are unobserved, this means that they are not measured in your study. Earlier in the book, I pointed out the problem of ice-cream sales being correlated with increased violence. In that case, the unobserved variable is temperature. You could include it as a control in your study, and then it becomes observed. But there might be other possible causes for an increase in violence in the summer (such as school vacations). The challenge with unobserved confounders is that they are—well—unobserved. We do not really know what they might be. There are a couple of solutions to the problem of unobserved variables. One is the use of guru-level statistical techniques that are beyond the level of this book. The other is randomization.

HOW DOES RANDOMIZATION WORK?

The problems identified in the previous section are examples of systematic bias. This bias is inadvertently introduced by the research design encouraging one outcome over another. For example, if selection bias always benefits the researcher's intervention, whenever the same biased pre-treatment selection occurs, the result will always be a skewed outcome. The bias is baked into the system (the research design). Researchers seek an unbiased estimation of the impact of an intervention, so we aim to minimize systematic bias like selection effects, as well as accidental biases that are unknown, such as unobserved confounders. The best way to do this is through randomization, a three-stage process:

1. Identify similar individuals from the eligibility pool
2. Randomly select the individuals into either treatment or control
3. Faithfully implement the study using the randomized groups

First, start with similar individuals from the population to be studied so that when they are partitioned into the treatment group and the control group, they are basically the same. Small differences are acceptable, but the groups should be largely alike in all the important aspects, with the exception of which group they are assigned to.6 In other words, you should not be able to differentiate them on any meaningful measure. All individual units in the experiment should be similar, or they should be groupable into similar blocks from which randomization can occur (see block randomization later in this chapter).

The second stage is to take the chosen individuals and randomly assign them to the treatment and control groups. If you have a small number of individuals, you can toss a coin or cut a deck of cards. You can even ask a digital device: "Hey Siri, toss a coin".* For larger studies, you can use one of the free online random number generators, or use Microsoft Excel, where the command =RAND() copied to 100 cells will produce 100 random numbers that are greater than or equal to 0 and less than 1. Numbers that are less than 0.5 could be assigned as control and 0.5 or greater as treatment (or vice versa). The important aspect is that every individual has an equal and unpredictable chance of being in either the treatment or control group.

* I bet you just tried to do this.
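The same coin-toss logic is easy to script. As a minimal Python sketch of the =RAND() approach described above (the beat names are invented for illustration), seeding the generator makes the assignment reproducible for transparency:

```python
import random

def simple_randomize(units: list[str], seed: int = 42) -> dict[str, str]:
    """Assign each unit to 'treatment' or 'control' by a fair coin toss.

    Mirrors the Excel =RAND() method: a uniform draw of 0.5 or greater
    means treatment, less than 0.5 means control, so each unit has an
    equal, unpredictable chance of either condition.
    """
    rng = random.Random(seed)
    return {u: ("treatment" if rng.random() >= 0.5 else "control")
            for u in units}

beats = [f"Beat {i}" for i in range(1, 11)]  # ten hypothetical police beats
print(simple_randomize(beats))
```

Note that a pure coin toss does not guarantee equal group sizes or balanced groups in small studies, which is exactly the problem block randomization (discussed below) is designed to address.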
The final stage of effective randomization is to implement the randomization faithfully and honestly. For transparency purposes, it is common in psychology and medical trials to pre-register studies at websites such as clinicaltrials.gov or the Open Science Framework (http://osf.io). Pre-registering the research design and analytical methodology can improve research clarity and reduce suggestions after the study that the researchers have pulled some sneaky statistical tricks to get positive results. In some medical trials, the randomization is 'blinded' so that the participants do not know until after the study whether they were in the treatment or control group. In many studies, of course, that is not possible. If you are testing the effect of mandatory arrest at domestic violence incidents, folk tend to notice when they are nicked. Honest implementation also means maintaining the integrity of the randomization, even when the resulting groups do not come out exactly as expected (assuming they maintain equivalence). This can be challenging when trying to conduct policing evaluations in charged political environments (Box 11‑1).

BOX 11‑1 ACHIEVING RANDOMIZATION FOR THE PHILADELPHIA FOOT PATROL EXPERIMENT

Randomized controlled experiments are rare in criminology and even rarer in policing. They often involve a blend of expedience, political will, funding, and sometimes skeptics from within the experimenters' own disciplines.7 Randomization was initially a difficult sell in the Philadelphia Police Department. Several area commanders—some of whom didn't think that foot patrol would have any impact—nevertheless contended that there were specific foot beat areas that absolutely 'had' to have officers. Others were (not unreasonably) concerned about the political impact on the department when it became public knowledge that there were high-crime areas of the city that didn't have the additional patrols. The police department and researchers organized a briefing for Mayor Nutter who, on the advice of Commissioner Ramsey, gave his approval for the experiment. Ramsey's view was that the department only had the resources to police 60 areas, so why not find 120 high crime areas and run a rigorous experiment at half of them to find out if foot patrol could help the city?8

DIFFERENT APPROACHES TO RANDOMIZATION

Earlier in the book, I discussed different approaches to sampling in Box 9‑1. Randomization in that context was about the selection of people whom you might survey or interview for a study. Randomization in this chapter has a different meaning. Here, the distinction is the process of selecting people or places into a treatment or a control condition during an experiment.

Simple randomization

The simplest approach is to include every individual person or place (unit) in your study (the eligibility pool) and randomly select each into either the control or treatment group. Simple randomization gives every unit an equal chance of being in the treatment or the control condition. With enough units (places or people) in the study, you will achieve a reasonable balance across control and treatment participants and a high level of internal validity. With larger studies, and studies with very similar people or areas, simple randomization can generally produce treatment and control groups that are similar enough prior to the study that any differences in the posttest are due only to the intervention.

Block randomization

Block randomization happens when potential participants are assigned to a block of similar units, and then random selection occurs within the block.
In other words, the eligibility pool is divided into groups (known as blocks) of similar individuals. Block randomization should be considered if you are likely to have fewer than 50 individuals or places in either the treatment or control group, because with a smaller sample, block randomization will help achieve the pretest equivalence that is so important to good experimental design.9 It is also useful if there is a great deal of variation in the outcome of interest within the eligibility pool.

The value of this approach is demonstrated in Figure 11‑5. Imagine you have a city with 30 police beats. You want to study the impact of a saturation patrol strategy on the police beats with the most crime. You count the crime in every beat and decide to include only beats with at least 20 crimes in the preceding three months. You rank these ten hot spots as shown for both the simple (A) and block (B) outcomes in Figure 11‑5.

Figure 11‑5 Simple and block randomization methods

With simple randomization, each beat has an equal chance of being selected for either treatment or control. By the luck of a coin toss, you end up with five treatment and five control areas; however, as you can see from the totals at the bottom of the assignment groups in panel A on the left, the randomization process has produced some imbalance between the two groups. With a total of 280 crimes, the proposed treatment beats have more crime than the control areas (247 crimes).

With block randomization, you are constrained to select one treatment and one control area from each block (panel B in Figure 11‑5). Note that here we block in pairs; however, you do not always need to block in pairs. If you had 12 units, you could generate three blocks of four units and then make a random selection of two units from each block. The Jersey City Drug Market Analysis Experiment had ten drug hot spots in a 'very high' crime group, eight in a 'high' crime group, 26 in a 'moderate' group, and 12 in a 'low' crime group.10–11

In Figure 11‑5 panel B, the benefit of the blocking design is evident from the totals row. With 265 and 262 counts of crime in each group, the two groups are more balanced. This has methodological and statistical advantages. Importantly, even though there is still a minor difference between the two groups, this difference is not substantial, nor is it introduced by systematic bias. The block randomization has addressed the major internal validity concerns.
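To show how pairwise blocking can work in practice, here is a minimal Python sketch that mirrors the Figure 11‑5 logic (the beat names and crime counts are invented): rank the hot spots by crime count, pair them off, and flip a coin within each pair.

```python
import random

def block_randomize_pairs(counts: dict[str, int], seed: int = 7):
    """Pairwise block randomization: rank units by crime count, pair
    adjacent units, and randomly assign one of each pair to treatment.

    Because each pair holds units with similar counts, the resulting
    groups are far more balanced than a coin toss over all units.
    """
    rng = random.Random(seed)
    ranked = sorted(counts, key=counts.get, reverse=True)
    treatment, control = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)            # coin toss within the block
        treatment.append(pair[0])
        control.append(pair[1])
    return treatment, control

# Hypothetical hot-spot beats and three-month crime counts:
beats = {"B1": 88, "B2": 71, "B3": 64, "B4": 55, "B5": 49,
         "B6": 43, "B7": 38, "B8": 31, "B9": 26, "B10": 22}
t, c = block_randomize_pairs(beats)
print(sum(beats[b] for b in t), sum(beats[b] for b in c))  # similar totals
```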
CONSIDERATIONS WITH RANDOMIZED EXPERIMENTS

So far in the chapter, it might appear that running a randomized experiment is a piece of cake. You might be wondering, "If it is this easy, why are there so few experiments in policing?" Scientific principles notwithstanding, getting an experiment off the ground can run into a few hurdles, such as:

• Randomized experiments are not well understood in policing
• Ethical concerns about withholding the treatment are common
• People in control groups may try to replicate the treatment

Randomized experiments are not well understood in policing. Randomized trials are hugely beneficial to advancing police science; however, they remain relatively rare and as a result are not well understood by most police officers. Some police leaders incorrectly think that other research designs (such as pretest-posttest) are just as effective. Therefore, implementers of randomized experiments should anticipate having to spend considerable time explaining why randomization is important. I have found that practicing how to explain the benefits of randomization in lay terms—avoiding all academic jargon—is time well spent.

Ethical concerns about withholding the treatment are common. Some have objected to experimentation on the grounds that it would be unethical to withhold potentially beneficial treatment from the people in the control condition. In one study, many training centers asked to participate in a job training program had ethical and public relations concerns about using random assignment to offer or withhold potentially life-changing training skills.12 Chapter 15 discusses this further, but for now consider that there is also the argument that it is more unethical to universally implement untested interventions that could have unforeseen harms and side effects.

People in control groups may try to replicate the treatment. This concern relates to substitution bias, which occurs when individuals in a control group independently seek to replicate the treatment activity or try to find alternatives. During the Philadelphia Predictive Policing Experiment we randomized districts across the city.13 Some districts had access to the predictive policing software and used a dedicated car to target predicted crime areas, while others policed as usual (the control condition). There is no evidence that any 'cheating' took place, but we did discuss the possibility that some police captains in the control districts might try to access crime grid predictions for their districts. After all, if the initiative were successful, the crime in their districts would look progressively worse and might reflect poorly on them. We did discover a district-wide benefit to the use of marked police cars targeting property crime areas;14 however, we limited control district access to the software, and all signs suggest the experiment was implemented correctly.

Other concerns include the cost and time to run RCTs, questions about whether they are applicable beyond the study population, the challenges of using them in urgent situations—in a health context, such as during a global pandemic—and the limitation that they might not be possible with rare social problems due to a lack of cases that can contribute to treatment and control groups.15 And without fully understanding the mechanism by which an intervention works, "the randomized field experiments that some policing scholars view as the only truly compelling source of knowledge often provide no information that can be used to improve practice."16

As you can see, research involving randomized assignment can be quite challenging. If you are going to take it on, some of the critiques of randomized studies are discussed in Chapter 15, and Box 11‑2 has some tips for running a randomized experiment.
BOX 11‑2 TIPS FOR RUNNING A RANDOMIZED EXPERIMENT

• Anticipate having to repeatedly explain the value of RCTs
• Prepare a response for when people ask about the ethics of randomization
• Be honest about the limitations of randomized experiments
• Pay attention to the assignment process and ensure true randomization
• Check the power of your experiment in advance (see next chapter)
• If possible, pre-register your experiment (http://osf.io)
• Monitor field implementation carefully and watch for substitution bias
• Policing is a dynamic environment, so mid-experiment adjustments may be necessary
• If feasible, conduct follow-up tests some months after your experiment

SUMMARY

The goals of academic scholars and police leaders are not the same, and they approach the challenges of public safety differently.17 In this complex environment, running experiments can be organizationally, politically, and culturally challenging. The many hurdles are "symptomatic of the fact that police performance has become a high profile, political subject, with both the approach to measurement and the accountability framework surrounding it at issue".18 And as critics of RCTs point out, there is a lot they cannot tell us. But within policing, they are becoming more commonplace (the website for this book has summaries of some notable examples). Using random assignment to place people, districts, hot spots, police beats, or any other reasonable unit into either a treatment or a control group remains, despite all the critiques and limitations, the best way to learn whether an intervention is effective. Randomization can still result in some differences between the treatment and control group, but importantly those differences are unlikely to be due to systematic bias—bias introduced by the research design deliberately or inadvertently encouraging one outcome over another.

REFERENCES

1. Smith, G.C.S. and Pell, J.P. (2003) Parachute use to prevent death and major trauma related to gravitational challenge: Systematic review of randomised controlled trials. British Medical Journal 327, 1459–1461.
2. Loughran, T.A. and Mulvey, E.P. (2010) Estimating treatment effects: Matching quantification to the question. In Handbook of Quantitative Criminology (Weisburd, D. and Piquero, A. eds), Springer: New York, pp. 163–180.
3. Sherman, L.W., Gottfredson, D., MacKenzie, D., Eck, J., Reuter, P. and Bushway, S. (1998) Preventing Crime: What Works, What Doesn't, What's Promising, National Institute of Justice: Washington, DC.
4. Farrington, D.P., Lösel, F., Braga, A.A., Mazerolle, L., Raine, A., Sherman, L.W. and Weisburd, D. (2020) Experimental criminology: Looking back and forward on the 20th anniversary of the Academy of Experimental Criminology. Journal of Experimental Criminology 16, 649–673, p. 662.
5. Sherman, L.W. and Strang, H. (2004) Verdicts or inventions? Interpreting results from randomized controlled experiments in criminology. American Behavioral Scientist 47 (5), 575–607.
6. Suresh, K. (2011) An overview of randomization techniques: An unbiased assessment of outcome in clinical research. Journal of Human Reproductive Sciences 4 (1), 8–11.
7. Boruch, R. (2015) Street walking: Randomized controlled trials in criminology, education, and elsewhere. Journal of Experimental Criminology 11 (4), 485–499.
8. Ratcliffe, J.H. and Sorg, E.T. (2017) Foot Patrol: Rethinking the Cornerstone of Policing, Springer (Criminology Briefs): New York, p. 27.
9. Farrington, D.P. and Welsh, B.C. (2005) Randomized experiments in criminology: What have we learned in the last two decades? Journal of Experimental Criminology 1 (1), 9–38.
10. Gill, C.E. and Weisburd, D. (2013) Increasing equivalence in small-sample, place-based experiments: Taking advantage of block randomization methods. In Experimental Criminology: Prospects for Advancing Science and Public Policy (Welsh, B.C. et al. eds), Cambridge University Press: Cambridge, pp. 141–162.
11. Weisburd, D. and Green, L. (1995) Policing drug hot spots: The Jersey City drug market analysis experiment. Justice Quarterly 12 (4), 711–735.
12. Heckman, J.J. and Smith, J.A. (1995) Assessing the case for social experiments. Journal of Economic Perspectives 9 (2), 85–110.
13. Ratcliffe, J.H., Taylor, R.B., Askey, A.P., Thomas, K., Grasso, J., Bethel, K., . . . Koehnlein, J. (2021) The Philadelphia predictive policing experiment. Journal of Experimental Criminology 17 (1), 15–41.
14. Taylor, R.B. and Ratcliffe, J.H. (2020) Was the pope to blame? Statistical powerlessness and the predictive policing of micro-scale randomized control trials. Criminology and Public Policy 19 (3), 965–996.
15. Frieden, T. (2017) Why the "Gold Standard" of Medical Research Is No Longer Enough. www.statnews.com/2017/08/02/randomized-controlled-trialsmedical-research/ (accessed 2 August 2017).
16. Thacher, D. (2008) Research for the front lines. Policing and Society 18 (1), 46–59, p. 49.
17. Neyroud, P. and Weisburd, D. (2014) Transforming the police through science: The challenge of ownership. Policing: A Journal of Policy and Practice 8 (4), 287–293.
18. Neyroud, P. (2008) Past, present and future performance: Lessons and prospects for the measurement of police performance. Policing: A Journal of Policy and Practice 2 (3), 340–348, p. 341.

12 HOW DO YOU DESIGN A POWERFUL EXPERIMENT?

EXPERIMENTS ARE USUALLY A TRADEOFF

At this point you might be wondering why this book claims to cover just the basics. You have just waded through a couple of chapters of research design, and now I am about to talk about experimental power. If it is any consolation, I will briefly tell you about the boy who cried wolf, and the dangers of turning right on a red traffic light. But experimental power is an important aspect of a research project, so let us wade straight in.

What is experimental power? The power of an experiment is a measure of its capability to detect an effect, assuming there is a real effect. It tells you the likelihood that, if your intervention really did move the needle, the experiment will detect that change. It is important, because "Scientists who knowingly run low-powered research, and the reviewers and editors who wave through tiny studies for publication, are introducing a subtle poison into the scientific literature, weakening the evidence that science needs to progress."1

This subtle poison could be introduced in different ways. For example, the statistical test (see next chapter) might not be strong enough to detect subtle differences between treatment and control outcomes. Or you might find that the experiment does not run long enough to produce data that shows a difference. Experiments are considered underpowered if the combination of the analytical test and the experimental design is not able to detect the effect size in your hypothesis.
Underpowered studies can also be vulnerable to the winner's curse (explained in Box 12‑1), raising unrealistic expectations that cannot be replicated in subsequent experiments.

BOX 12‑1 THE WINNER'S CURSE

Because small, low-powered studies have to overcome a considerable statistical threshold to be 'discovered', when an effect is found it is often overinflated, which is a fancy academic way of saying the apparent effect is unrealistically high.2 What does this mean? Imagine testing a new approach to policing domestic abuse. The real effect reduces recidivism by ten percent. If it is studied in three underpowered experiments, the design of these underpowered studies means the outcome must achieve a reduction in repeat domestic violence of 20 percent to be statistically significant. Two studies of the approach achieve the correct ten percent but are deemed to be insignificant and are not published due to publication bias (see Chapter 15). One study gets lucky and achieves the overinflated threshold of 20 percent. That study is the only one that gets published, and people get excited because they think they can reduce domestic abuse by 20 percent. This overinflated result masks the more accurate ten percent. Everyone, from cops to researchers and domestic abuse advocates, is subsequently disappointed that they are unable to replicate the initial positive results. This is the winner's curse: what appears to be a win is, in reality, misleading and unrealistic.
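A toy simulation makes the mechanism in Box 12‑1 concrete. This sketch is not from the book: the true effect (ten percent), the amount of noise, and the 20 percent publication threshold are all illustrative assumptions.

import random

random.seed(1)

# Each underpowered study produces a noisy estimate of the true
# ten percent reduction; only estimates clearing a 20 percent
# threshold count as 'significant' and get published.
published = []
for _ in range(1000):
    observed = random.gauss(10, 8)   # true effect is 10, estimate is noisy
    if observed >= 20:
        published.append(observed)

print(len(published))                   # only a minority get published
print(sum(published) / len(published))  # published average is far above 10

The published studies average well above the true ten percent—the winner's curse in miniature.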
We often make tradeoffs when we design an experiment. For example, to explore whether implicit bias training changed officer behavior, it would be ideal to train the more than 4,000 sworn officers of Britain's Thames Valley Police. But if we did that, we would not have a comparison group. We could split them in two, but training half the force is still expensive. If we trained only a handful, we would have an underpowered study. We want to find that ideal balance: teaching the minimum number of officers needed to detect a real effect, while not training so many that we add unnecessary cost. That is where power calculations help, and to understand those it is useful to appreciate two types of potential error.

DIFFERENT TYPES OF ERROR

Imagine taking a COVID-19 test and it shows positive, but you do not really have the virus. This is called a false positive, or type I error. Now imagine taking the test and it shows negative, but you have COVID-19. This is called a false negative, or type II error.* You can see the difference in Figure 12‑1.

                                  Null hypothesis is true       Null hypothesis is false
                                  (you do not have COVID-19)    (you have COVID-19)
Reject null hypothesis            Type I error                  Correct outcome
(test shows positive)             (false positive)              (true positive)
Fail to reject null hypothesis    Correct outcome               Type II error
(test shows negative)             (true negative)               (false negative)

Figure 12‑1 Type I and type II error example

Both type I and type II errors have significant implications. With a false positive (type I error), you must quarantine, avoid school or work, and it generally causes concern and disruption. With a false negative (type II), you go about your life as usual, potentially infecting and harming people around you. If you are confused about the difference between type I and type II errors, it might help to recall the fable about the boy and the wolf (Box 12‑2).

* Verbally, we say these are "type one errors" and "type two errors", even though we type them with a capital I.

BOX 12‑2 THE ERRORS OF THE BOY AND THE WOLF

The easy way to remember the difference between type I and type II errors is as follows. The ancient Greek storyteller Aesop recounted the fable of the boy who cried wolf. Bored tending his sheep in the pasture, the boy cried 'wolf!' even though there was no wolf, amused to watch the villagers all come running (a false positive). After he did this a couple of times, the villagers got wise to his shenanigans. Later a wolf actually appeared and attacked his sheep, but when the boy yelled 'wolf!' the annoyed villagers ignored him (a false negative). Here's the memory aid: the villagers committed type I and type II errors, in that order.

We could solve the first problem by creating a test so conservative that it only shows a positive result when we have the most severe and obvious case of the virus. This would reduce the number of false positives. But it would also increase the number of false negatives, because the test would be so restrictive that it would not flag people with a moderate case of the virus. This tradeoff is handled by choosing a statistical significance level. As you will read in the next chapter, this is usually set at the 95 percent level in the social sciences. In very general terms, we accept the risk that we may incorrectly claim a positive case about one time in 20. We also need to avoid—as much as reasonably possible—type II errors. That is, we should be careful not to incorrectly reject an important finding. We can do this by adjusting the parameters of the experiment and managing the experimental power.

EXPERIMENTAL POWER

Experimental power is an estimate of how sensitive a study is when detecting a difference between experimental groups. A high-powered experiment can detect subtle differences between groups; a low-powered experiment can only detect large effects, and that leaves a lot of leeway for problems. Ideally, we want an experiment with more power than the 1970s Virginia study of red traffic light behavior (Box 12‑3). A more effective study might have identified the problem of right-turn-on-red and saved a lot of lives.

BOX 12‑3 RED LIGHTS AND THE DANGERS OF UNDERPOWERED STUDIES

When I moved to the United States, I discovered that after you stop at a red traffic light, you are allowed—unless explicitly prohibited by a sign—to turn right if the way is clear. Even if the traffic light is red. It took a few weeks to understand the nuances of how drivers and pedestrians navigate right-turn-on-red (RTOR) junctions. Moving to a new country is full of unexpected challenges.

Through the 1970s, US states individually began adopting RTOR, and by 1980 it was permitted almost everywhere. Engineers argued it was dangerous; however, there was an oil crisis in 1973 and states were under pressure to reduce unnecessary vehicle idling and fuel wastage. A consultant for the Virginia Department of Highways and Transportation conducted a before-and-after study of twenty intersections where they implemented RTOR.3 If you recall, pre-post studies like this are below the 'evidence threshold' set by Sherman for the evidence hierarchy for policy makers (Figure 7‑3).
Before the change there were 308 accidents at the intersections; after, there were 337 over a similar length of time;4 however, this difference was not statistically significant, and thus the consultant told the state government there was no impact on safety. The problem was not that RTOR is not dangerous. It is! The trouble was that the Virginia study was underpowered. This study (and others like it) had such a small sample that—unless the effect was huge—it was not powerful enough to detect the danger. When researchers examined multiple years of accidents in five large cities, they found that, as you might expect, RTOR is more dangerous for cyclists and pedestrians.5 By then, however, the public had become used to turning right on a red light, and the practice continues to this day, causing accidents and killing pedestrians and cyclists in the process.

You can improve the power of an experiment in several ways:

• Increase the sample size
• Reduce the significance level you are prepared to accept
• Select cases in the eligibility pool that are very similar to each other

You can see that there is a tradeoff. You can detect more subtle differences between treatment and control groups, but only if you are prepared to increase the time and cost of your study (increase sample size), reduce the variance (have similar cases), or increase the risk of claiming the intervention worked when in reality it did not (reduce the significance level). Otherwise, you run the risk of claiming failure unless the effect is obvious.

We use a power analysis to estimate the minimum sample size needed for an experiment. The researcher provides their desired significance level, effect size (see Cohen's d below), and statistical power. Statistical power is measured on a range from zero (awful) to one (supercharged), with social science convention generally settling on a power of 0.8. This means that a study has an 80 percent chance of detecting a statistically significant finding, if one really exists. Anything less than that becomes increasingly problematic, though there is no agreed 'rule of thumb' to say when it gets bad. Once power creeps below 0.5, be concerned.

A final note about the Virginia RTOR study. Using G*Power, a calculator available online that can estimate power from study information, I estimated that the power of the Virginia study was only about 0.18.† To discern a statistically significant difference between the accident rate before and after the introduction of RTOR would have required including 229 junctions in the study, and not the 20 used. Think for a moment how many people have likely been killed or maimed due to RTOR since the 1970s.

† You really do not need to read this, because it is quite technical; however, if you are interested in how I got these numbers, it took a little work. Because the available information on the study does not include the original data or the variance, the standard deviation was impossible to calculate. I therefore took the average value as the median and estimated the minimum to be 0.5 times the median and the maximum to be 1.5 times the median. The estimate of the standard deviation was drawn from the work of Wan and colleagues.6 See? I warned you that you did not really need to read this.

CALCULATING A POWERFUL SAMPLE SIZE

Imagine you want to implement an officer wellness program for the nearly 10,000-strong New Zealand Police. Your hypothesis is that the program will improve the health of the officers, as measured by a reduction in their resting heart rate. Your research question asks, "Does the program significantly reduce the resting heart rate of officers completing the program?" and the null hypothesis is that the program has no meaningful impact on officers' resting heart rate. The program is a free gym membership, a series of sessions with a personal trainer, and diet and nutrition advice from experts over a six-month period. It is likely to be expensive.
Testing the program first on a small sample of officers is cost-effective and responsible, in the event the program does not work. The first question to consider is: how large is the effect you are hoping to detect?

Cohen's d

The effect size quantifies the difference between the treatment and control group outcomes in the metric of the study. If you are researching violent crime, then it is a measure of violent crime. If you study recruitment, then it is a measure of recruitment. The result is tailored to the research. A standardized effect size measure is called Cohen's d (because it was designed by an American psychologist called Jacob Cohen). If you have a statistical leaning, d is a measure of the difference between the group means, divided by the pooled standard deviation. If you do not have a statistical leaning, that last sentence made no sense. But as you will see, you do not need to know how it is calculated to understand it.

Cohen's d measures the size of the difference between two means (for our purposes here, the mean is the same as the average). If the means of the two groups are the same, then d = 0. As the two groups (treatment and control) become increasingly different, the d value increases. When it reaches d = 1, the two groups are, on average, one standard deviation apart (you can see this in Figure 12‑2). Cohen helpfully converted the numbers into a general rule of thumb, such that:

• Small effect = 0.2
• Medium effect = 0.5
• Large effect = 0.8
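For readers who do have a statistical leaning, the 'difference in means divided by the pooled standard deviation' definition translates into a few lines of code. This is a minimal sketch of the standard formula, not code from the book; the two groups passed in would be whatever outcome measurements your study collects.

import numpy as np

def cohens_d(treatment, control):
    # Difference between group means divided by the pooled
    # standard deviation of the two groups
    nt, nc = len(treatment), len(control)
    pooled_var = ((nt - 1) * np.var(treatment, ddof=1) +
                  (nc - 1) * np.var(control, ddof=1)) / (nt + nc - 2)
    return (np.mean(treatment) - np.mean(control)) / np.sqrt(pooled_var)

A returned value near 0.2 would indicate a small effect, 0.5 a medium effect, and 0.8 a large one, following Cohen's rule of thumb above.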
Let us revisit the hypothetical New Zealand Police wellness program. From an eligible pool of 400 police officers, you randomly select 200 for the wellness program, leaving the unselected 200 as the control group. Before the program, both groups have very similar heart rate patterns, and after the program the control group has not changed. The distribution of resting heart rates for the control group officers is shown as the white bars in each graph of Figure 12‑2. The most common heart rate is 74 beats per minute, but some officers are as high as 83 and as low as 67.

Figure 12‑2 Three different effects for a hypothetical wellness program

Imagine your treatment reduced the resting heart rate of the officers in the treatment group by an average of one percent. The pattern of resting heart rates for those officers is shown as the blue bars in panel A of Figure 12‑2. You can hardly discern any difference between the blue and the white bars with the naked eye. This shows a small effect, with a Cohen's d of around 0.2. With a three percent difference (panel B) you can see that the overall pattern of resting heart rates for the intervention group is noticeably lower, roughly indicative of a medium effect and a Cohen's d of around 0.6. With panel C, the distinction between the groups is clear. This five percent difference reflects a large effect and a Cohen's d of around 1.0. The average difference between the groups is approximately one standard deviation.

Do not expect such glaring program effects too often. Cohen warned that when you are studying something new, effect sizes are likely to be small because they have not been under "good experimental or measurement control".7 In other words, because people have not looked at the research area, there is likely to be a lot of 'noise' around how it is defined and measured.

Cohen's d gets a mention because effect size calculators often ask for it. If you want to plan an experiment or study with a treatment and a control group, and you are not sure how many people or places to include in your study, there is a calculator on the website for this book at evidencebasedpolicing.net. Start at the page dedicated to this chapter.
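If you prefer a scripting language to an online calculator, libraries such as statsmodels can solve for the sample size directly. Here is a hedged sketch for the hypothetical wellness program; the effect sizes, significance level, and power are the conventional values discussed above, not figures from a real study.

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()  # power tools for a two-sample t-test

# Officers needed per group to detect a small effect (d = 0.2)
# at a 0.05 significance level with the conventional power of 0.8
n_small = analysis.solve_power(effect_size=0.2, alpha=0.05, power=0.8)
print(round(n_small))   # roughly 394 per group

# A medium effect (d = 0.5) needs far fewer officers per group
n_medium = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n_medium))  # roughly 64 per group

The pattern matches the tradeoff described earlier: the subtler the effect you hope to detect, the larger (and more expensive) the study must be.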
Finally, if you are designing your own study, Box 12‑4 has some tips to help you avoid a study that is underpowered for your hypothesis.

BOX 12‑4 EXPERIMENTAL POWER CONSIDERATIONS

Here are some things to consider when maximizing experimental power:

• Increasing your sample size increases your likelihood of finding an effect.
• Larger samples can detect smaller effects.
• If your population is 100 or fewer, sample your entire population.
• Because more sampling adds costs, you rarely need to go beyond 500 (but see next line).
• Increase your sample size if you anticipate low response rates.
• If underpowered, consider relaxing your statistical significance from 0.05 to 0.1. This is a more exploratory level of significance and is not normally recommended by scholars.

SUMMARY

"The main threats to statistical conclusion validity are insufficient statistical power to detect the effect (e.g., because of small sample size) and the use of inappropriate statistical techniques (e.g., where the data violate the underlying assumptions of a statistical test)".8 I groan when people infer their understanding of policing from the outlier incidents they see on television. That is like claiming a paint swatch represents the Sistine Chapel. To really understand policing from a scientific perspective, it is vital that we examine meaningful trends in larger and more representative studies.

Like this chapter, the next one is also focused on the numbers you need to compare a treatment group to a control group, just as you might when studying a policing intervention. If you are lucky enough to be able to design your own study, you might want to learn more about experimental power than is available in this basics book. There is more information at the website for this book (evidencebasedpolicing.net). Hopefully, the sobering tale of why Americans are among the few nations to allow right-turn-on-red demonstrates the importance of having sufficient experimental power.

REFERENCES

1. Ritchie, S.J. (2020) Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth, Metropolitan Books: New York, p. 143.
2. Button, K.S., Ioannidis, J.P.A., Mokrysz, C., Nosek, B.A., Flint, J., Robinson, E.S.J. and Munafò, M.R. (2013) Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews: Neuroscience 14 (5), 365–376.
3. Hauer, E. (1991) Should stop yield? Matters of method. ITE Journal 69 (9), 25–32.
4. Hauer, E. (2004) The harm done by tests of significance. Accident Analysis and Prevention 36 (3), 495–500.
5. Preusser, D.F., Leaf, W.A., DeBartolo, K.B., Blomberg, R.D. and Levy, M.M. (1982) The effect of right-turn-on-red on pedestrian and bicyclist accidents. Journal of Safety Research 13 (2), 45–55.
6. Wan, X., Wang, W., Liu, J. and Tong, T. (2014) Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC Medical Research Methodology 14 (135), 1–13.
7. Cohen, J. (1988) Statistical Power Analysis for the Behavioral Sciences, 2nd ed, Routledge: Abingdon, Oxfordshire, p. 25.
8. Farrington, D.P. (2003) Methodological quality standards for evaluation research. Annals of the American Academy of Political and Social Science 587 (1), 49–68, p. 52.

13 HOW DO YOU KNOW IF AN INTERVENTION IS SIGNIFICANT?

PICK A NUMBER, ANY NUMBER

There is a version of a joke about people being asked "What is two plus two?" A teacher replies "four", a mathematician says "4.0", and a statistician replies, "What do you want it to be?" Undergraduates find learning statistics so terrifying and miserable that universities often give the instructors latitude on the usually dire course evaluations. The previous chapter probably seemed guru level if you are a little 'statistics-terrified', and you are probably also dreading this one. Do not despair. I failed high school mathematics, and it was not until I had real-world applications to worry about that I started to get the hang of it, so I understand and will try to be as gentle as possible.

To fully appreciate much of the evidence-based policing scholarship, some basic numeracy is required. If one approach to tackling domestic violence is better than another, it is useful to know by how much. It is also good to know if it is a real difference and not a disparity that could have occurred by chance. Armed with this knowledge, policy makers can discuss whether the improvement is worth the costs of the new policy. Analysis is central to science and the scientific method (Figure 13‑1).

Figure 13‑1 Analysis as part of the scientific method

Part of the problem is that so much social science education focuses on the issue of statistical significance. In grossly simplified terms (that will annoy most academics), a statistically significant result is unlikely to have been a fluke. But even if your intervention is statistically different from the 'business-as-usual' policing model, does it matter? A slight increase in the clearance rate could be statistically significant, but would you be popular if you exhausted the entire overtime budget for a police department just to move the bicycle theft clearance rate from seven to eight percent? Effect sizes are arguably more important than significance tests, because the effect size is a measure of how much your intervention changed things. And so often, that is exactly what police leaders, politicians, and the community care about. This chapter is not designed to replace a university course or dedicated book on statistics; however, I will try to help you with the basics to guide your future reading should you decide to delve further into evidence-based policing.

DESCRIPTIVE DATA AND INFERENTIAL STATISTICS

It is not uncommon for people to use the term statistics for basic descriptive data.
While this is not technically incorrect, calling descriptive tables and graphs that simply summarize the characteristics of a data set 'statistics' is a little generous. Analogy? I can ski competently down most slopes, but I am not exactly US Olympic skier Lindsey Vonn. Consider this example. In the year leading to July 2020, 2.3 percent of Australians were assaulted. What can you infer from this? This type of descriptive statistic tells you about the data, such as the average or the range of values—but not what it all means. A table showing the race and ethnicity of police officers in your local police department would be descriptive but would not tell you if the differences are meaningful. Descriptive data generally do not answer the 'So what?' question.

Inferential statistics help generalize something meaningful about a larger population by inferring from a smaller sample. They can test whether an intervention's effectiveness was more than just a fluke, and whether the intervention could be generalized to the wider population. An introductory textbook can guide you through the basics of descriptive statistics, so the remainder of this chapter moves on to some introductory inferential methods.

ODDS RATIOS AND THE MAGNITUDE OF YOUR INTERVENTION

Researchers and pracademics like to compare a new way of policing with the status quo—the 'business-as-usual' model of policing (if you recall, this is the null hypothesis)—by asking, "How much better is the new way of doing things?" This question is really asking, "What is the size of the effect of the intervention?" The effect size is a measure of the magnitude of the difference between one variable and another (touched on in the previous chapter). For example, if you train officers to be more skilled at identifying and stopping people carrying illegal weapons, how much of an effect would that have compared to officers who were not trained (the null hypothesis)? This effect size can be the difference between the averages (means) for two groups, a percentage difference, or even a correlational change.1 There are a couple of ways to explore effect sizes.

Simple differences

First, you can look at the simple difference. Sometimes called the raw difference,1–2 this approach has the advantage of being easily understood. Imagine you send several police department recruiters to a training seminar. You then send both trained and untrained recruiters to attend the same high school employment events. If trained recruiters average eight applicants per event, while untrained recruiters average five, then the effect of the training is an increase of three candidates per event, on average. Pretty easy to understand. The main limitations of the simple difference method are that it does not easily allow comparisons between different research studies, and it does not give us any idea whether the result could have occurred by chance.

Odds ratio

One of the most common effect size metrics is the odds ratio. The odds ratio is the odds that an outcome will occur given an intervention, compared to the odds of the outcome occurring in the control group. (Cohen's d, from the previous chapter, is another example of a standardized effect size metric.) Let us return to an earlier example from the book, assessing whether kids exposed to an after-school program had fewer subsequent crime and disorder incidents than children who did not attend the program.
After the experiment, we have four groups of children, which can be shown in what is called a contingency table (Figure 13‑2).

Figure 13‑2 Contingency table for odds ratio calculation

Looking at panel A of Figure 13‑2, kids who went through the after-school program (treatment) could have two outcomes: involved in disorder (a) or not (b). Children who did not attend the program could similarly end up in trouble (c) or not (d). The math is reasonably simple. You multiply the a and d groups, and divide that number by the multiplication (shown with an asterisk) of groups b and c, as shown here:

Odds Ratio = (a * d)/(b * c)

If you plug in the numbers from Figure 13‑2 panel B, we get:

Odds Ratio = (11 * 45)/(69 * 6) = 1.196 = 1.2 (rounded)

When the odds ratio equals one, the intervention does not affect the outcome. When the odds ratio is less than one, the intervention lowers the odds of the outcome occurring. And when the odds ratio is greater than one, the intervention increases the odds of the outcome. As we can see from this fictitious example, the after-school program did not go well.

This leads to a peculiarity around odds ratios, which is how to talk about them. Odds ratios are a comparison of two odds values. With an odds ratio of about 1.2 from Figure 13‑2, the appropriate wording is to say that "the odds of youths getting into trouble after participating in the after-school program are 1.2 times greater than for kids who did not attend the program". But what if you want to talk about percentage change? There is a slightly different calculation for that:

% change in odds = 100 * (OR – 1)

For this hypothetical study, the numbers would look like this:

% change in odds = 100 * (1.2 – 1) = 100 * (0.2) = 20%

In other words, the intervention program contributed to a 20 percent increase in the odds of disorder among the children that participated compared to children who did not participate. This is not the same thing as saying there is a 20 percent increase in disorder. It is a 20 percent increase in the odds of being involved in disorder. Not a good result for the program, but that is why policing and crime prevention should be evidence-based: sometimes we get results we do not expect.
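The arithmetic is simple enough to script. Here is a minimal sketch of the odds ratio and percentage-change calculations using the hypothetical after-school program counts from Figure 13‑2; only the four cell values are needed.

def odds_ratio(a, b, c, d):
    # a: treatment with outcome,  b: treatment without outcome
    # c: control with outcome,    d: control without outcome
    return (a * d) / (b * c)

or_value = odds_ratio(11, 69, 6, 45)
pct_change = 100 * (or_value - 1)

print(round(or_value, 3))    # 1.196, i.e. about 1.2
print(round(pct_change, 1))  # 19.6, i.e. about a 20 percent change in odds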
BOX 13‑1 A QUIRK REGARDING THE ODDS RATIO

There is an obscure peculiarity with respect to the odds ratio. In the previous example, we know the row totals before the study begins. In other words, we know the number of kids who either went into the after-school program or did not. It is a closed system. It is possible, however, to create a similar contingency-type table with numbers that are not known at the start of the experiment. We might replace the counts of children in each group with the counts of gun crimes in two police beats, one of which had a police intervention to reduce gun violence. This is exactly what happened in the early 1990s during the Kansas City Gun Experiment.3 Police in Kansas City, Missouri, used intensive patrols in a large crime beat to target gun carrying. Patrol officers seized guns by frisking arrested individuals and by spotting firearms in plain view during routine traffic violation or safety stops. Results were compared to a control beat. Because the numbers of gun crimes in each beat are not known beforehand and must be discovered empirically during the experiment, it is not a closed system. In situations like this, David Wilson has proposed replacing the odds ratio with a relative incident rate ratio (RIRR).2 If you are new to evidence-based policing, this distinction will appear rather pedantic, and it is unclear at the time of writing if the RIRR will become widely accepted. But at least for now you can sound super-smart the next time odds ratios come up in casual conversation. Because this nuance is not germane to understanding the basic premise, the rest of the chapter will continue to refer to odds ratios.

Let us look at a specific real-world example. In the Kansas City Gun Experiment (Box 13‑1), the outcome of interest is what happened in the treatment police beat during the experiment. It is helpful to always put the outcome of interest in the 'a' box in the contingency table, so we set up the table as Figure 13‑3.

Figure 13‑3 The Kansas City Gun Experiment

Using the equation as before (though see the caveat in Box 13‑1):

Odds Ratio = (86 * 184)/(169 * 192) = 15,824/32,448 = 0.488

With a value of 0.488, we can say that the odds of there being a gun crime in the treatment area are about half the odds of a gun crime occurring in the control police beat. That's quite a result for directed police patrols in gun crime hot spots.3

You may recall that if you want to talk about percentage change, you use the following equation:

% change in odds = 100 * (OR – 1)

For the Kansas City Gun Experiment, the proactive policework equated to roughly a 51 percent reduction in the odds of gun crime in the target beat:

% change in odds = 100 * (0.488 – 1) = 100 * (–0.512) = –51.2%

Percentages here are easier to interpret. Negative percent changes indicate a reduction in the outcome (and positive values an increase) relative to the control location or group. This is a different interpretation from the odds ratio itself. Remember: when the odds ratio is less than one, the intervention lowers the odds of the outcome occurring, and when the odds ratio is greater than one, the intervention increases the odds of the outcome.

The challenge with estimates of effect size is that they do not tell us whether the identified effect is statistically significant. What is statistical significance? Pause, take a break, walk the dog, and then read on.

STATISTICAL SIGNIFICANCE

If you ever want to watch an academic crash-and-burn in front of an audience of police officers, watch what happens when they mention 'statistical significance'. Most of the room reaches for their phones to tune out and catch up on Twitter. Statistical significance is one of those phrases that nearly everyone finds daunting, so be gentle with yourself. It takes a while to get the hang of statistics.

Statistical significance is a measure of confidence that any difference between your treatment and control groups was a real difference, and not the consequence of a skewed sample or simple chance. How could a sample be skewed? Imagine if you wanted to know the average height of students at a university, and you decided to measure or ask people walking past you. If you happen to be outside the gym just as the basketball team are leaving practice, you are likely to get a skewed sample and an unreliable result.
It would be unwise to infer that your sample of surprisingly tall students is representative of the entire population of the university.

What do I mean by chance? Instead of hanging around outside the gym, you decide to draw your sample from different times and places across the campus. But you only record the heights of five students. If the university has 20,000 students, the chances are high that your tiny sample is not representative of the student population. In general, increasing the size of your sample improves the likelihood of detecting an accurate estimate and reduces the chance that your sample is unrepresentative.

Another reason to use inferential statistics is to be able to say something about the intervention. This is a question about the process that produced a change. Imagine selecting a substantial sample of police officers from a large department (such as my old force, London's Metropolitan Police) and randomly splitting them into two groups. One group receives de-escalation training. You then track how many use-of-force reports both groups submit over the next year. You find that, on average, the de-escalation trained group have nine percent fewer recorded uses of force. Is this random variation and just a quirk of your sample? Or is it a valid representation of the likely change we would see if you expanded the de-escalation training across the entire Metropolitan Police? Statistical significance can answer this question.

Statistical significance is a tradeoff. Because we are studying a sample and not the entire population, we can never be entirely certain that our sample reflects the behavior we would see in the entire population. In return for the convenience of only studying a subset of the eligibility pool, we must accept some uncertainty. Statistical significance evaluates that level of uncertainty.4 It answers the question "How likely is it that our sample reflects the behavior of the population?" In most social science applications, the convention is to accept a small risk that the real effect of your intervention falls outside of the estimate provided by your sample and the analysis. How 'small' is a 'small risk'? The convention is usually a five percent risk, such that we would expect the analysis to reflect the real effect 95 percent of the time, or 19 times out of 20. This is only a popular convention, however, and there is nothing to prevent a researcher justifying a different approach.

Now at this point, if you have had any kind of introduction to statistics, you are probably expecting me to discuss p-values. A p-value is a probability—expressed as a number between 0 and 1—of seeing a result at least as extreme as the one observed if the null hypothesis were true. It is a common misconception that the calculation of p-values is essential to establishing statistical significance, but this is not true. They are only one way, and indeed there is a push from some academic disciplines to move away from p-values entirely (Box 13‑2). The next section discusses confidence intervals, another way to establish the significance of your outcome.

BOX 13‑2 WHY ARE YOU NOT READING ABOUT P-VALUES?

If you have studied a social science such as criminology or sociology, you might have encountered p-values. Why not in this chapter?
A p-value is another way of estimating the "probability that the chosen test statistic would have been at least as large as its observed value if every model assumption were correct, including the test hypothesis".4 The test hypothesis is usually that the intervention had no effect. With smaller and smaller p-values, you can be increasingly certain that your result is incompatible with the idea that the intervention had no effect. p-values are common in non-experimental research, such as studies that use regression analysis. They are less common in the experimental type of research that is favored in quantitative evidence-based policing, though still frequently reported in randomized trials. Quantitative evidence-based work often explores changes in treatment groups or places compared to control groups. In experimental designs, the effect size has more practical value, and the confidence interval reflects a measure of certainty in the findings. That is the focus of this book. The American Statistical Association have warned that "scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold" and that "by itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis".5 Some psychology journals have even gone so far as to ban the use of p-values.6

HAVING CONFIDENCE YOUR RESULT IS NOT A FLUKE

When you calculate an odds ratio, you are calculating an estimate of the ratio between the odds of the outcome for the treatment group and the odds of the outcome for the control group. Because you are using a sample and not the entire population, one concern is that your sample is not representative of the population. It is therefore helpful to calculate a range, or interval, within which we can be fairly confident that the true odds ratio value lies. You can use a formula to calculate this confidence interval. The confidence interval is the range of values on the outcome that most likely contains the actual population value. Confidence intervals incorporate the margin of error for your estimated effect.

What does this mean in English? From the de-escalation training example, the results from the sample of trained Metropolitan Police officers would generate a low and a high confidence interval. If you had the opportunity to train all the officers in the entire force, the real change in use-of-force incidents would fall within that range 95 percent of the time. The two confidence interval calculations (for the high and low estimate) are more complicated than the equations earlier in this chapter, but the upper and lower bounds of the confidence interval can be calculated in Microsoft Excel.

For a real example, you can use the figures from the Kansas City Gun Experiment found in Figure 13‑3. If you open an Excel worksheet, put the values from the contingency table in the cells as follows:

• In cell A1 enter the 'a' value (86)
• In cell B1 enter the 'b' value (169)
• In cell A2 enter the 'c' value (192)
• In cell B2 enter the 'd' value (184)

This forms your contingency table.

• In cell A3 enter the odds ratio (0.488)

Now in two empty cells (say A4 and A5) enter the following two formulas:

=EXP(LN(A3) - (1.96 * SQRT(1/A1 + 1/B1 + 1/A2 + 1/B2)))
=EXP(LN(A3) + (1.96 * SQRT(1/A1 + 1/B1 + 1/A2 + 1/B2)))

Pay careful attention to each character and type the entry exactly as written here.
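If you would rather script it than use a spreadsheet, the same two formulas translate directly into a few lines of Python. This is a minimal sketch mirroring the Excel calculation above, using the Kansas City values; 1.96 is the usual z-value for a 95 percent interval.

import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    or_value = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)      # standard error of ln(OR)
    lower = math.exp(math.log(or_value) - z * se)
    upper = math.exp(math.log(or_value) + z * se)
    return or_value, lower, upper

print(odds_ratio_ci(86, 169, 192, 184))
# roughly (0.488, 0.351, 0.678): the interval does not span one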
If you are old school and want the traditional formula, read this footnote.† With the second Excel formula, the only difference is that one sign changes from minus to plus (also note that there are three right parentheses at the end of the equation). The first of the two formulas gives the lower 95 percent confidence interval, and the second gives the upper 95 percent confidence interval.

† If you are interested, the EXP instruction tells Excel to calculate an exponential value, the LN instruction calculates a natural log, and the SQRT instruction calculates the square root. If you want to impress someone with the fancy equation, this is what the Excel code is doing: c.i. = exp(ln(OR) ± z(α/2) × √(1/a + 1/b + 1/c + 1/d)).

Figure 13‑4 Excel spreadsheet with Kansas City Gun Experiment values

If you have completed all of this correctly, your Excel spreadsheet will look like Figure 13‑4. If you prefer not to do this yourself, the spreadsheet on the book's website calculates odds ratios and confidence intervals for you. The values from the Kansas City Gun Experiment result in a confidence interval between 0.35 and 0.68. We have 95 percent confidence that the true odds ratio of the police activity in the experiment was between 0.35 and 0.68.

Now here's the important part. The odds ratio is a useful measure, and you should include it in any report; however, the confidence interval is central to statistical significance. A quick reminder: when the odds ratio is less than one, the intervention lowers the odds of the outcome occurring, and when the odds ratio is greater than one, the intervention increases the odds of the outcome. But these results are only statistically significant if supported by the confidence interval values. If the confidence interval spans a value of one, you cannot be sure if the intervention made things better or worse. Formally, we say that the intervention was not statistically significant.

This is shown in Figure 13‑5, where the left and right sides of the double-headed arrows represent the lower and upper confidence intervals for four hypothetical interventions. The entire double-headed arrow for intervention A is greater than 1, and therefore we can be confident that the treatment would increase the outcome 95 percent of the time. Using the same interpretation, we can be confident that intervention C generated a meaningful reduction in the outcome. Interventions B and D have confidence intervals that cross the 1 line, even if only a little. Because they do, we cannot state with confidence that the effect of the intervention was statistically significant, as the true population value could theoretically be either above or below the odds ratio of one.

Figure 13‑5 Interpreting confidence intervals

Fortunately, in the case of the Kansas City Gun Experiment the confidence interval does not span one (0.35 to 0.68), so we have 95 percent confidence that the proactive policework reduced gun crime in the target beat.

PRACTICAL SIGNIFICANCE

A final word on statistical significance: statistical significance is not practical significance. The practical significance of an intervention is an assessment of not only whether the effect was unlikely to have occurred by chance, but also whether the magnitude of the effect justified the effort and impact of the intervention. As some statistical gurus from the American Statistical Association have pointed out, "any effect, no matter how tiny, can produce a small p-value if the sample size or measurement precision is high enough, and large effects may produce unimpressive p-values if the sample size is small or measurements are imprecise".5 While we are not using p-values in this chapter (see Box 13‑2 to understand why), it is still possible, with large enough sample sizes, to get results that are statistically significant based on the confidence intervals. This does not guarantee they mean much in the real world.

Consider this hypothetical example. Your local police department starts a policy to stop, question, and frisk individuals on the street as frequently as legally possible. While the police department stays within the law, the policy dramatically increases the number of people who are unnecessarily stopped and inconvenienced. You subsequently estimate that after six months, a tripling of the number of pedestrians stopped has resulted in a three percent reduction in violent crime. This comes, however, with a huge drop in public perception of police legitimacy and an increase in negative press coverage of the police department. Was your three percent finding of practical significance?

A committee convened by the US National Academies of Sciences, Engineering, and Medicine wrestled with this situation.7 While they concluded that "Evidence regarding the crime-reduction impact of stop, question, and frisk when implemented as a general, citywide crime-control strategy is mixed" (p. 177), they also noted that "studies of citizens' personal experiences with person-focused strategies do show marked negative associations between exposure to stop, question, and frisk and proactive traffic enforcement approaches and community outcomes" (p. 210). What if you achieved a 12 percent reduction in violence? Or achieved a seven percent reduction with only a modest decrease in public support? As you can tell, the issue of practical significance is not an empirical measure that you can estimate with a spreadsheet. It is a subjective decision that requires subject matter expertise, an understanding of the context of the policing environment, and an appreciation for more than the field of statistics. As philosopher David Hume warned us, knowing what the result is does not guide us in determining what to do.8

SUMMARY

Some grounding in descriptive statistics is useful, but you can get that from most undergraduate courses at a university, or even from high school. Evidence-based policing practitioners are more interested in inferential statistics because they explain whether an intervention made a difference and whether that difference was more than just chance. The odds ratio is easy to calculate, and the result is not difficult to understand. It is a simple tool for estimating the size of any difference between a treatment and control, as long as the effect of the treatment is the measure included in box 'a' of the contingency table. You can also use the spreadsheet available on the book's website. The confidence interval is a little trickier but can be calculated with the help of online tools or Microsoft Excel.
Combined with confidence intervals, odds ratios can reveal the "strength, direction, and a plausible range of an effect as well as the likelihood of chance occurrence".9 Remember: if the confidence interval does not span one (the null hypothesis value), then you have a statistically significant finding. Lower than one? The intervention reduced the outcome. Higher than one? It increased the outcome.

In all of this, do not lose sight of the fact that practical significance is still essential. Policing does not happen in a vacuum. Changes to the operating environment can have far-reaching consequences. Sometimes the consequences are positive, like when my former colleague, Jim Fyfe, showed how a change to New York's use of force policy reduced fatal shootings by police.10 Becoming more comfortable with the numeracy around evidence-based policing will inform better decisions that have real practical significance in the world.

REFERENCES

1. Durlak, J.A. (2009) How to select, calculate, and interpret effect sizes. Journal of Pediatric Psychology 34 (9), 917–928.
2. Wilson, D.B. (2022) The relative incident rate ratio effect size for count-based impact evaluations: When an odds ratio is not an odds ratio. Journal of Quantitative Criminology 38 (2), 323–341.
3. Sherman, L.W., Shaw, J.W. and Rogan, D.P. (1995) The Kansas City Gun Experiment, National Institute of Justice: Washington, DC, p. 11.
4. Greenland, S., Senn, S.J., Rothman, K.J., Carlin, J.B., Poole, C., Goodman, S.N. and Altman, D.G. (2016) Statistical tests, P values, confidence intervals, and power: A guide to misinterpretations. European Journal of Epidemiology 31 (4), 337–350, p. 339.
5. Wasserstein, R.L. and Lazar, N.A. (2016) The ASA's statement on p-values: Context, process, and purpose. The American Statistician 70 (2), 129–133, p. 132.
6. Woolston, C. (2015) Psychology journal bans P values. Nature 519, 9.
7. Weisburd, D. and Majmundar, M.K., eds (2018) Proactive Policing: Effects on Crime and Communities, National Academies of Sciences Consensus Study Report: Washington, DC.
8. Hume, D. (1739 [1986]) Treatise of Human Nature, Penguin Classics: London.
9. Grimes, D.A. and Schulz, K.F. (2002) An overview of clinical research: The lay of the land. Lancet 359 (9300), 57–61.
10. Skolnick, J.H. and Fyfe, J.J. (1993) Above the Law: Police and the Excessive Use of Force, Free Press: New York.

14 WHERE DO YOU PUBLISH RESULTS?

SHOUTING INTO THE VOID

Like pondering whether a tree falling in the woods makes a sound if nobody is around to hear it, does research that never gets publicized have any effect? As Aldous Huxley said, "Facts do not cease to exist because they are ignored";1 however, he was not living in a century of short attention spans. Facts are often ignored, even when well-known and published. Criminologists have a whole subfield—public criminology—dedicated to reducing the sense among academics that their work is ignored and they are 'shouting into the void'.

If you have completed a research study in policing, especially quality work that crosses Sherman's evidence threshold (see Figure 7‑3), you might consider that you have an obligation to publish it in some fashion. Publication of studies is, after all, central to the scientific method (Figure 14‑1). The world of policing research is still in relative infancy, and the need to improve the profession for both community safety and the well-being of officers remains vital.
These days, there are many different avenues for getting research into the public sphere, avenues that are not necessarily in the academic domain.

Figure 14‑1 Publication as part of the scientific method

Academics complain that nobody reads their scholarly research. A lot. Whole articles and books have been written on why this is, but in general the problems include:

• Academic articles are too long, technical, esoteric, and inaccessible
• Academics are not rewarded for engaging with more accessible mediums
• Policy makers are busy and have short attention spans
• Policy makers do not understand the jargon that academics seem to relish
• Academics are not well trained or socialized in the art of persuasion
• Academics rarely learn how to translate their research into lay terms

For all this, there are many ways that research can be more accessible. This chapter looks at a few.

POLICY BRIEFS GET THE ATTENTION OF POLICY MAKERS

One way to get your research noticed is to distribute a policy brief. A policy brief is a concise document that is designed to inform and advise a specific but generally non-technical audience about an issue. If you have completed a study or conducted a review of the research in an area of policing, a policy brief can help decision-makers. Before you write the document, have a clear idea of the following:

• Audience: Who are they? What do they care about? How will you get their attention? What tone is most suitable? Is a policy brief the best way to reach them? For this last question you could compare to the other dissemination approaches in this chapter.
• Key ideas: What is your overall message? What key data points, information, tables, or other research details are compelling and help tell your story?
• Implications and/or recommendations: What are the 2–4 important things that you want the audience to know? Implications can be less direct (but still important) while recommendations must be actionable and succinct.

With these key components identified, you can start to plan the policy brief. Create an outline and keep revising and restructuring it until you are happy. When you start writing, continually refer to these bullet points to make sure you remain on point and focused on the audience and their needs.

Structure and formatting

The UK government’s Parliamentary Office of Science and Technology specializes in writing policy briefs, and some guidelines can be drawn from its advice:2

• No more than four pages of text (references can be extra)
• A summary of key points up front
• A clear structure with well signposted sections
• Use of boxes for figures, case studies, and other contextual materials
• Accessible language with short sentences and commonly used words
• Use of headings and subheadings, and bullet points where appropriate
• Briefs often use two columns on the page for readability
• Text can be broken up with tables, charts, boxes, callouts, or pictures
• Multiple drafts are reviewed before publication

A standard format for a policy brief resembles the following (with an indication of the percentage contribution of each section to the whole). I have also shown roughly how many words should be in each section, assuming you are writing a brief of around 1,500 words.

• Title: Aim for engaging, informative, and succinct.
• Executive summary: 10 percent (about 150 words). Overview of the key points, and the implications and recommendations. These can be in bullet form.
• Introduction: 10 percent (about 150 words). Explain why the reader should read on, capture their attention, and explain the importance of the topic.
• Methodology: 5 to 10 percent (100–150 words). If a research study, this is a non-technical overview of the research conducted. Can contain details—such as the number of control and treatment people or places and the length of study—but avoid technical jargon.
• Results and conclusions: 30 percent (500 words). Summarize findings for the audience and use the results as the basis for the conclusions. Identify any specific highlights or concerns.
• Implications (and recommendations if relevant): 30 percent (500 words). Implications address the ‘so what’ question about what might be happening, or what should take place. Recommendations are a clear statement of what the audience should do or how they should use the research.
• Further reading or useful resources: 5 to 10 percent (100–150 words). Links to key readings or your fuller report go here.

Following the upper word counts here will result in a 1,600-word document, not including the title and other organizational information, such as your name and contact details.

Once you have the document written, it can be useful to review layouts used by other organizations for design ideas. Policy briefs are often glossy color documents with high quality graphics and images because they are competing for the attention of busy people. Many writers underestimate how much time it can take to turn a bland Word document into an engaging one. The last step is to enlist a reviewer to help you proofread and revise the policy brief so the final product is polished and professional.

In the criminal justice sphere, policy briefs tend to perform two functions: they review the state of research on a particular topic, or they translate new research evidence into how it might impact an issue. But they can also be used as a format to convince police leaders and politicians to test or implement a new strategy or innovation. If you have read about a new approach to a policing problem and want to propose a study to test it locally, the outline in Box 14‑1 adapts the format for a study proposal.

BOX 14‑1 ADAPTING THE POLICY BRIEF FOR A PROPOSED RESEARCH STUDY

With some tweaks, the policy brief format can be adapted to a layout suitable to propose a study of an innovation to policy makers or leaders in a police department. The section headings are slightly different, as are the suggested word ranges, but the general layout remains.

• Title: Aim for engaging, informative, and succinct.
• Executive summary: 10 percent (about 150 words). Overview of the innovation being proposed, how the innovation might help the city or town, and the research question.
• Previous work: 20 percent (about 300 words). Outline key aspects of the existing academic literature relevant to the innovation to be tested. Identify what is known and unknown.
• Methodology: 50 percent (about 750 words). Outline details of the proposed research study. Be specific but avoid technical jargon. Break into subheadings as necessary, such as who or what will be studied, where and for how long, using what research approach. Be clear about the outcome variable to be measured and any limitations. Remember that policy makers may need help understanding the value of randomization or the use of controls.
• Conclusion (and recommendations if relevant): 10 percent (150 words). A brief conclusion that reiterates the important aspects of your study: why it is important, what city problems it might help with, and how it might help. Recommendations might suggest next steps.
• Further reading or useful resources: 10 percent (150 words). Links to key readings or your fuller proposal go here.

OP-EDS GET THE ATTENTION OF THE PUBLIC

An ‘op-ed’ originally referred to the page ‘opposite the editorial’ (op-ed, get it?) in a newspaper, though some refer to it as the opinions and editorial page. Newspapers carry op-eds from authors who are not employees of the paper. They are guest writers who pen an opinion (usually related to their expertise) to facilitate thought and discussion on a particular topic. Fortunately, with the growth of online media alongside traditional print, there are increasing opportunities to write op-eds. While this is not an avenue to publish the expansive details of a research project (an academic journal, professional article, or policy brief would be more suitable), it is a mechanism to bring attention to the important implications or recommendations of your research. Recall the difference: an implication suggests how a research finding may be important for policy, practice, or theory; a recommendation is a specific action from the research that should be taken regarding a policy or practice.

Police officers are strongly advised to seek the permission of their management before writing an op-ed. This is because an op-ed is designed to be opinionated and stimulate conversation, thought, and action. This is not the time to be vague or unsure of your thoughts on the topic. It therefore requires content knowledge. And with op-ed writing, it can also require speed. Editors often look for expert opinions when specific public events occur, so timing can be as valuable as good writing.

Op-eds tend to be about 700 to 800 words in length, clearly and simply written for a broad, public audience. Here is one possible structure:

• A strong opening sentence: Called a lede in newspaper jargon, the lede must grab the reader’s attention and hook them into reading the rest of the op-ed. Some ideas are suggested in Box 14‑2.
• The topic: A clear description of the place (hot spots?), issue (rising crime?), people (young crime victims?), or other aspect that is the primary focus of the op-ed.
• Thesis: A statement of your argument or the problem you are addressing. This should be supported by data, information, evidence, or first-hand experience.
• Research with conclusions: Your research, with a sentence or so of conclusion after each important detail, so it is clear to the reader what the research means. You can have a few of these pieces. Make your conclusions concrete and actionable, not vague and abstract.
• Caveats: If you are worried you might overstate your argument or there are limitations in your work, include qualifying statements. You can also use this part of the op-ed to anticipate and address any potential criticisms.
• Ending: Finish strong with a call to action, a warning if something is ignored, some key detail, or a callback to an emotive or important point from earlier.

BOX 14‑2 HOW TO WRITE A STRONG LEDE

As the Harvard Kennedy School notes, you can snare the reader with a lede that has “a strong claim, a surprising fact, a metaphor, a mystery, or a counterintuitive observation that entices the reader into reading more”.3 You can often capitalize on a recent news article to increase the prominence of your op-ed, but there are other ways that you can grab the attention of the reader from the first sentence.

• Tell a dramatic anecdote (usually not difficult to find in policing)
• Mark an anniversary (end of year crime figures, or the anniversary of a notable crime)
• Cite a research study (ideally your study of course: new research is often newsworthy)
• Use a personal experience (if you work in policing, these can be all too frequent)
• Use a startling or counterintuitive statistic (a single data point that seems shocking)

One of the sins of op-ed writing is to have the lede further down the article, where it is buried among the topic detail or thesis and less likely to draw attention. This is the origin of the phrase, to “bury the lede”.

PROFESSIONAL PUBLICATIONS GET THE ATTENTION OF PRACTITIONERS

Professional publications are trade journals that are designed for practitioners working in specific industries and occupations. For example, the UK’s College of Policing publication Going Equipped has a principal aim of sharing knowledge, reflections, ideas, and research, primarily for those working in frontline roles. Most of the content in Going Equipped is written by serving officers and staff. You can also write for the newsletters of law enforcement member organizations such as IADLEST,† Community Policing Dispatch from the COPS Office,‡ or the publication of the Australia and New Zealand Police Advisory Agency (ANZPAA), Policing Innovator.

† International Association of Directors of Law Enforcement Standards and Training.
‡ Formally known as the US Department of Justice’s Office of Community Oriented Policing Services.

Pitching at the needs of the audience is key with practitioner-focused outlets. It is important to review multiple articles from the magazine that you are writing for, because these will give you a sense of the tone and style they seek. Also check the formatting and general word length of the publication. The latter is important because you can increase the likelihood of getting your submission accepted by conforming as much as possible to the target journal. Editors do not want to expend a lot of effort to rewrite your work to fit their magazine.

Some publications have different types of articles. For example, shorter articles in Police Chief—the magazine for the International Association of Chiefs of Police (IACP)—run to about 800 words; however, longer review articles can be double that number. The College of Policing’s Going Equipped has various short-format styles. Of relevance to evidence-based practitioners, they accept ‘long read’ submissions based on academic research carried out by an officer or member of staff, with a limit of 2,000 words. In general, these publications differ in writing style from scholarly journals in that they tend to move swiftly from one point to the next, rather than dwell on the specifics of a position in laborious detail.
Given their wide readership, magazines and professional publications need articles that have broad implications, so be cautious about delving too deeply into the local context. If you are discussing research from Glasgow in Scotland, you should make clear why the issue is of relevance outside of Scotland. A short cover letter should accompany your work and make clear that you wrote the article for that journal, why the journal should publish it, and what your credentials are for writing the submission. Finally, proofread extensively before submitting your work. Professional magazines usually have career editors who are experts in the nuances and detail of grammar. You will not engender much confidence if your submission or cover letter has glaring errors of syntax. In summary:

• Research the target publication carefully
• Create and rework an outline first
• Mimic the style and tone of published articles
• Have several ideas you want to convey
• Lean towards shorter, more crafted sentences
• Stick to the point and do not waffle

ACADEMIC WRITING GETS THE ATTENTION OF GRADUATE STUDENTS

Okay, that is probably a little trite. I hope more than a few graduate students read my articles. However, to adulterate a line from poet Thomas Gray, “Full many a journal article is born to blush unseen, and waste its sweetness on the desert air”. Many journals remain behind paywalls, limiting their availability to practitioners and other scholars. And with the increase in online journals, even academics struggle to find time to really digest much of the scholarly work in their area. Because they have a specific need to complete a thesis or dissertation, graduate researchers pursuing advanced degrees can be among the most current on a topic.

Getting a paper accepted in a peer-reviewed academic journal can take months (or often years) and involves considerable effort. Rejection is a part of scholarly life: some journals reject 90 percent of the articles submitted to them. So why does anyone do it? The peer review process can improve scholarly work. While getting a response from ‘Reviewer 2’§ can be disheartening, peer review is designed to help scholars improve their work. It is also a reality that getting an article into a respected journal carries with it prestige and is tied to academic promotion and tenure. It can enhance the credibility of the author and bring them to the notice of police leaders. And rather than the fleeting recognition from a magazine article, or the less prestigious acceptance of a chapter in a book, a scholarly journal article is an enduring credential in the intellectual world.

§ Journal editors often send the most critical response to the prospective author as the second review, perhaps so they can lead with a more positive review. As a result, ‘Reviewer 2’ has become academic idiom for a reviewer who is unduly harsh or overly critical.

Writing your first scholarly article can be a daunting task, and there is not the space in this introductory text to cover it. If you are considering it, you can learn more at this chapter’s page on the companion website (evidencebasedpolicing.net). It is also worth considering whether to accompany any scholarly writing with one of the other products suggested in this chapter.
Scientific literature is becoming increasingly incomprehensible to most people,4 so while a journal article is prestigious, it is less likely to be noticed—and therefore acted on—by busy professionals.

PRESENTATIONS CAN ENGAGE A BROAD AUDIENCE

Presentations at conferences or webinars are double-edged swords. Good ones can enhance the speaker’s reputation, convey research to a broad audience, and encourage replication and adoption of the ideas and strategies. Bad presentations reflect negatively on speakers and their research, and can feel like an anesthetic-free trip to the worst dentist in the world.** Given the work that some people put into their research and analysis, it sometimes defies belief that they make a complete hash of their moment to impress and shine as an intellectual. After months of slaving over a hot keyboard, scholars often use the opportunity presented to them by a briefing or conference to confuse the audience, contradict themselves, and generally annoy the whole room.

** And anyone who has seen the 1976 movie Marathon Man still winces at the question, “Is it safe?”

This book’s website (evidencebasedpolicing.net) has some short guides to preparing presentation slides and speaking to an audience. As usual, click on the page dedicated to this chapter. For now, Box 14‑3 has some tips for effective presentation slides.

BOX 14‑3 TEN TIPS FOR BETTER PRESENTATION SLIDES

• Ensure there is a strong contrast between the text and the background color
• Use simple titles and bullet points
• Use a font size and type that is clearly visible from anywhere in the room
• Limit the number of bullet points (4–6 is a good rule of thumb)
• Avoid the trap of fancy builds and dimming
• Don’t rely on the spel chequer
• If you want to bore your audience, include technical detail
• Maps, graphs, and figures rarely disappoint
• Continuity of style throughout the presentation looks professional
• The last slide should be your title slide or a black screen

Related to presentations, being a podcast guest is an opportunity to elaborate on your results and explain why you engaged with the study, developed the ideas, and undertook the research. Most podcasts (such as my Reducing Crime podcast) do not accept solicitations and you usually must be invited; however, if you are, here are a couple of tips. Podcast listeners tend to be relaxing, or driving, or engaged in other activities, so keep the specifics to a minimum and do not bombard them with a lot of numbers and detail. Instead, leave the listener with a couple of key details that will stick with them after the episode. Also keep your answers short and only to one or two points. If the host wants you to expand on them, they will ask. You cannot squeeze all your research into a response, and shorter answers allow it to be more of a conversation. These suggestions apply equally well to radio and television news appearances.

OTHER FORMS OF PUBLICATION

In today’s practically focused environment, other forms of information dissemination are important. The chapter has already discussed academic journals, professional magazines, and policy briefs. You can draw attention to your work in other ways. A Twitter thread is a series of multiple tweets that are linked, related, and tell a longer, more detailed story than is possible in the standard 280-character single tweet.
Threads can attract more attention than passing individual tweets and allow you to tell the story of your research. The first tweet is the equivalent of your lede (see Box 14‑2) and teases the reader to continue with your thread. Using a ‘thread’ emoji†† or listing that a tweet is one of several tweets (for example, 1/7) is considered good practice. Each tweet should have a specific detail that readers can respond to or retweet, and can include figures or graphs from your research. Your final tweet can be a link to your research paper.

†† The emoji is a spool of thread, not just a piece of yarn.

Blog posts (such as the National Policing Institute’s OnPolicing blog) are a potential outlet for evidence-based scholarship. You can either write a guest blog for someone, or it is relatively easy to start your own blog at minimal cost. Blog posts can vary in length. About 1,500 to 2,500 words is considered enough content to show details and outline your research, though if you are targeting a blog hosted by someone else, contact them in advance to seek their thoughts. You can also pen shorter 500 to 800 word ‘teasers’ and follow these up with a longer blog post that contains more detail. Then link the teaser to the detailed blog post.

Finally, one of the most effective ways to reach a significant number of people quickly is to get your research featured in a media story. Twitter is often an effective way to reach journalists and get them interested in your work. If you work in policing, you may need to speak to your agency’s media office for permission or guidance before reaching out.

HOW TO CONVEY SCIENTIFIC RESULTS SIMPLY

The Survey of Adult Skills—a large international study of adult literacy and numeracy—found adults in both England and the US lag behind the international average in basic numeracy.5 The majority of adults struggle to interpret graphs, work with percentages, or interpret information beyond simple arithmetic, so statistics can sound to them like ancient Aramaic. If you are trying to reach a broader audience, the challenge with the number-crunching aspect of evidence-based policing is the difficulty of communicating these concepts simply to the community. This does not mean you should ‘dumb down’ your research. Instead, discuss more relatable numbers. In Chapter 13 you were shown how to convert an odds ratio into a percentage change. Percentage changes can sometimes be rounded and simplified (see the short sketch at the end of this section). For example, rather than discussing a 98.5 percent increase in the arrest rate, you might say that the rate of arrests “almost doubled”. Analogies to other crime types or other statistics can also be helpful to increase understanding.

You should also be careful with the phrase “there is no evidence to suggest”.6 Using this phrase to describe a policing intervention is ambiguous because it can have multiple interpretations, such as:

• The intervention has been proven to be of no benefit
• The evidence is inconclusive or mixed
• There is a lack of quality evidence to make a call either way
• The intervention has not been tested at all

It might also be used when the intervention could have modest benefits that are, in the view of the speaker, outweighed by negative side effects. As you can see, the use of what appears to be a scientific-sounding phrase introduces ambiguity because it is not clear what is being said.
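To make the rounding idea concrete, here is a minimal sketch that turns an odds ratio into the sort of relatable phrasing suggested above. The (OR − 1) × 100 conversion and the rounding thresholds are my own illustrative assumptions, not a formula quoted from Chapter 13.

```python
def describe_effect(odds_ratio: float) -> str:
    """Turn an odds ratio into a plain-language phrase.

    Assumes the percentage-change reading of an odds ratio,
    (OR - 1) * 100, and rounds to relatable anchors.
    """
    pct = (odds_ratio - 1.0) * 100.0
    if 1.9 <= odds_ratio <= 2.1:
        return "almost doubled"
    if 0.45 <= odds_ratio <= 0.55:
        return "roughly halved"
    direction = "increase" if pct > 0 else "reduction"
    return f"about a {round(abs(pct))} percent {direction}"

print(describe_effect(1.985))  # almost doubled (not "a 98.5 percent increase")
print(describe_effect(0.93))   # about a 7 percent reduction
```

The exact thresholds matter less than the principle: a single rounded, relatable phrase communicates better to a lay audience than a decimal-laden percentage.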
SUMMARY

Traditional academic peer-reviewed journal articles bring prestige to the author, but other approaches can be easier to publish and garner a larger audience. While each approach (policy brief, blog post, magazine article) has its own specific niche and writing style, some aspects are common across all mediums. Who is the target audience? What are the key ideas you want to convey, and what are the implications of your research for the audience? Once you have these basics figured out, write in a jargon-free and readable style.

Evidence-based policing is not a discussion of late 14th-century Italian Renaissance art;‡‡ it has immediate, real-world practical implications. The key audience lies outside of academia, and there is an almost purist arrogance to arguing that the practitioner community must learn to understand academic jargon and statistical detail. It is incumbent on scholars to do a better job of reaching across the divide in ways that engage with policy makers and practitioners. Bringing scientific rigor and objectivity to our writing and communication does not have to rob it of spirit. What many fail to appreciate is that objectivity does not negate being passionate; instead, it lends that passion the mantle of credibility.

‡‡ With apologies to my numerous esoteric fine art readers.

REFERENCES

1. Huxley, A. (1933) Proper Studies, Chatto & Windus: London, p. 205.
2. Parliamentary Office of Science and Technology (2020) How to Write a Policy Briefing, UK Parliament: Westminster, London.
3. Harvard Kennedy School Communications Program (n.d.) How to Write an Op-Ed or Column. Accessed October 2022 online at: https://projects.iq.harvard.edu/files/hks-communications-program/files/new_seglin_how_to_write_an_oped_1_25_17_7.pdf
4. Hayes, D.P. (1992) The growing inaccessibility of science. Nature 356 (6372), 739–740.
5. U.S. Department of Education (2017) Program for the International Assessment of Adult Competencies (PIAAC), 2012–15. https://nces.ed.gov/surveys/piaac/international_context.asp (accessed March 2022).
6. Braithwaite, R.S. (2020) EBM’s six dangerous words. JAMA 323 (17), 1676–1677.

15 WHAT ARE THE CHALLENGES WITH EVIDENCE-BASED POLICING?

I could see the pain on the chief’s face. He understood the rationale for the proposed study. He knew the planned intervention was novel and untested. He recognized that a study with scientific reliability would tell him whether it was effective or not. He appreciated the importance of the work. Yet he could not bring himself to support a rigorous study. With a sigh of exasperation, he lamented that the political climate was neither conducive nor supportive, and the likely political backlash outweighed the potential benefits. And knowing the local political and media environment, I was sympathetic to his view.

Evidence-based policing is a relatively new idea. It can be challenging to understand, risky to implement, and potential failure can lurk around every corner. This chapter explores a number of common concerns.

EXPERIENCE AND CRAFT ARE IGNORED

As the opening chapter explained, experience is important in policing. It forms one component of the ‘procedural knowledge’ that is central to police work.1 But some officers worry their experience is not valued in the new, more scientific world.
One senior officer put it: “We do have to factor in professional instinct, community knowledge and understanding into these equations. It’s important also to acknowledge officers’ instincts, experience and values”.2 Gary Cordner feels “police decision-makers have to balance research and data with experience and professional judgment.”3 Overwhelmingly, the majority of the 960 officers surveyed by Telep and Lum agreed.4 While not negating a contribution from research, they favored their personal experience over scientific knowledge in their day-to-day decision making. And sometimes experience can make all the difference, as evident from Box 15‑1.

BOX 15‑1 HOW STANISLAV PETROV PREVENTED NUCLEAR WAR

On the morning of Sept. 26, 1983, Stanislav Petrov woke up, went to work, and saved the world. And then he got reprimanded. Stanislav Yevgrafovich Petrov was at the time a 44-year-old colonel in the Soviet Air Defense Forces, acting as duty officer at a secret ballistic missile early warning station, about an hour’s drive south of the Kremlin. 1983 was a tense time in the Cold War between the Soviet bloc and the West, and the risk of nuclear war was always present. Suddenly, the alarms activated, and the computer warned that five Minuteman intercontinental ballistic missiles had been launched from US bases. Colonel Petrov knew that every second counted. Did he warn his superiors, the general staff of the Soviet military, who would likely recommend a retaliatory strike against the United States? After a few tense minutes, he decided it was a system fault. “As he later explained, it was a gut decision, at best a ‘50–50’ guess, based on his distrust of the early-warning system and the relative paucity of missiles that were launched”.5 He was ultimately reprimanded for failing to record the incident fully in his logbook; however, history showed he made the right call. Sunlight reflecting off clouds had fooled the satellite computers programmed to detect missile launches. Petrov’s decision was more than just luck. He was later able to articulate why he made his decision. As the New York Times reported, he could not corroborate the computer’s findings with ground radar telemetry. Furthermore, he had been trained to expect a first strike to come with an overwhelming number of missiles, not the five that were reported.

What appears to be experience or instinct is often the result of good training situated in the context of a structured decision-making environment. Policing experience becomes converted to policing practice at the local level, through trial and error and learning from repeated exposure to events, and it is rooted in the immediate context (recall the role of context from Figure 3‑1). Not dissimilar to some public health efforts,6 local policing initiatives are often small scale and have a low methodological quality. Yet they can resonate with practitioners. Rightly or wrongly, local experience is often given an elevated status. What counts as policing research usually involves an officer ringing around a few local stations or police departments and adopting some initiative that is underway in their local professional cluster. Evidence-based policing is the mechanism by which police officers can—in a structured and systematic way—learn what works from the curated experiences of other police officers.
Two-thirds of the officers surveyed by Telep and Lum4 rated the DARE program as either ‘effective’ or ‘somewhat effective’, even though it has long been known that DARE does not work and can even have negative effects.7–9 I suspect this is illustrative of either the power of advertising on the part of the DARE organization, or of the optimism bias playing a role, whereby we fall prey to the inclination to overestimate the likelihood of experiencing positive outcomes and underestimate the chances of experiencing negative events.10

What is often fascinating in these discussions is that evidence-based policing incorporates a lot of police experience. I explain to officers I teach that—contrary to their frequent perception—evidence-based findings are not the work of academics. Evidence-based policing results are a repository of the collected efforts of tens of thousands of fellow police officers in departments near and far. It might be that academics help to collate, analyze, and disseminate these ideas, but underneath each study are police officers testing and trialing new ways to do the job. They are really learning about how their (often equally skeptical) colleagues conducted foot patrol, or dealt with opioid overdose patients, or investigated a non-fatal shooting.

As articulated more eloquently by Bayley and Bittner when discussing police training:11

What police say about how policing is learned is not incompatible with attempts to make instruction in the skills of policing more self-critical and systematic. Progress in police training will come by focusing on the particularities of police work as it is experienced by serving officers and by analyzing that experience and making it available to future police officers.

EVIDENCE-BASED POLICING ISN’T FOR STREET COPS

“The vast majority of the frontline would struggle to see the relevance [of research]”. A cursory view of the policing literature would seem to support this officer’s quote.2 Even when there is frontline benefit, officers can be reluctant to find value.
Quite a few criminologists are critical of law enforcement, skeptical about policing as an institution, or sometimes downright anti-police. Their arguments appear more advocacy-based than evidence-based. The concern that evidence-based policing will rob officers of their discretion remains a significant hurdle to this day. It has foundations in a degree of anti-intellectualism, a professional reluctance to admit that egghead academics in their ivory tower can say anything valuable to policing. Some feel evidence-based policing is not geared for police use in the same way as operational support such as problem-oriented policing. Harvard scholar and former British police officer Malcolm Sparrow has argued “the methods championed by proponents of [evidence-based policing] are fundamentally incompatible with the operational realities of problem-oriented policing”.13 On first glance his argument can appear persuasive. Problem-oriented policing requires tailoring responses to often similar problems, while evidence-based approaches seek to evaluate commonalities in tactic. But as I argued in the opening chapter, evidence-based policing is not a strategy or method in the same manner as problem-oriented 220 W hat are the challenges with evidence - based policing ? policing (see Figure 1‑1). Instead, as with the systematic review by Josh Hinkle and colleagues,14 it can provide the evidence that problem-oriented approaches are successful. EVIDENCE-BASED SCHOLARS DO NOT HAVE A DOG IN THE FIGHT Sherman’s original definition proposed that “evidence-based policing uses research to guide practice and evaluate practitioners”.15 Malcolm Sparrow retorted: This kind of language infuriates police practitioners. Should police managers—who carry all of the responsibility for day-today policing and suffer directly the consequences of failure—be chastised by social scientists (who carry none of the responsibility) simply because they chose to ignore a published research finding or executed an untested or unproven strategy?13 A more liberal reading of Sherman’s original article would suggest he was referring to evaluating the work done by police officers, and not the merits of their personal decision-making. However, it is worth noting that achieving a strong experimental foundation for policing is incredibly challenging. Even pracademics— professionally embedded in police departments—have struggled to get evidence-based practice implemented in their own agencies. If you refer to Box 11‑1, you will recall Commissioner Charles Ramsey and I had to petition Mayor Michael Nutter to get the city’s foot patrol experiment approved. I had nothing to lose, while Ramsey and Nutter shouldered all the risk. I have often found that research practice is a negotiation between what the researcher wants and what risk police executives can tolerate. If both parties can emerge with a study that is at least above Sherman’s ‘rock-bottom’ evidence threshold, then that’s a win. W hat are the challenges with evidence - based policing ? 221 I have often found that research practice is a negotiation between what the researcher wants, and what risk police executives can tolerate. If both parties can emerge with a study that is at least above Sherman’s ‘rock-bottom’ evidence threshold, then that is a win. In deference to John Eck’s16 marvelously titled journal article “When is a bologna sandwich better than sex?”,* sometimes a lesser quality study is better than no study at all. 
EVIDENCE-BASED POLICING IS TOO PURIST

When Sherman proposed evidence-based policing more than 20 years ago, some in academia were concerned that a purist, randomized-trials-or-nothing brinksmanship loomed. And thinking back to those times, the push for more methodological rigor was substantial. Researchers wondered how much valuable insight was lost in a Campbell Collaboration review of problem-oriented policing that started with over 5,000 studies and resulted in just ten making it into the final product.17 In an assessment of counterterrorism interventions, Lum and Kennedy18 reviewed over 20,000 articles, finding only 80 that evaluated a crime prevention measure, and just seven that met their rigorous methodological standard. Not surprisingly, their subsequent argument that there was a lack of evidence that fortification of embassies had reduced terrorist attacks received some pushback.19–20

I hope we have moved to a ‘second wave’ of evidence-based policing advocacy.21–22 Most of the evidence-based policing scholars I speak to recognize that it is about more than just randomized controlled trials, involving “many more different types of investigations to acquire and use knowledge”.23 A British inspector pointed out, “[We need] a mixture of academic research and practical application,”2 and in reality, police leaders are swayed by expert opinion, media focus, and community inputs. This has ramifications. Policy makers tend to be evidence-informed (using scholarship as one of several decision tools), rather than evidence-based (grounding their decisions predominantly in the research). Decisions in policing can rarely be postponed while academics gather research knowledge.24

Police officers have looked on with mild amusement as proponents of a scientific rationalism (jokingly referred to as randomistas) have engaged in esoteric debate with more pragmatic evaluation realists.25 There is a risk though, as Deputy Chief Constable Alex Murray has pointed out:

Many academics spend a considerable amount of time critiquing [evidence-based policing]. There is a place for challenging method but the scale, obfuscation, and quantity of literature on this subject can leave a police officer bewildered and cause them to eventually return to doing what they have always done.26

On the one hand, scholars should recognize that policy makers need a range of evidential mechanisms and cannot wait for often complicated and time-consuming experimental research. On the other, there is a reality that—however much it hurts the feelings of some in academia—some research methods are better at addressing threats to internal validity and can provide more reliable answers for people charged with crafting public policy.

POLICING RESEARCH SUPPORTS THE STATUS QUO

In the last couple of years, a more progressive concern has arisen.
Morrell and Rowe argued evidence-based policing “struggles, in methodological and scientific terms, to address the fundamental aim of policing.”27 The argument runs that policing scholars (like me) have sought to improve policing and not critically reflect on the harms of policing or whether we need policing at all. This book does not address the discussion of police abolition, as there has long been evidence that police reduce crime.28–30 At a small scale of just a few blocks, the Capitol Hill Organized Protest in Seattle in 2020 excluded police from the neighborhood and degenerated into violence within a few weeks.31 And when the Canadian city of Montreal suffered a police strike of just 16 hours in 1969 . . . Let’s just say it did not go well:

Six banks were robbed, more than 100 shops were looted, and there were twelve fires. Property damage came close to $3,000,000; at least 40 carloads of glass will be needed to replace shattered storefronts. Two men were shot dead. . . . A handful of frightened Quebec provincial police, called in to help maintain order, stood by helplessly. One was shot in the back by a sniper and died.32

Assuming that some form of policing is here to stay, Keay and Kirby observe that police should not only be expected to be effective at controlling and mitigating crime and disorder, but also to behave in a fair, proportional, non-discriminatory, and transparent manner.33 The principle of proportionality involves “balancing the seriousness of the intrusion into the privacy of the subject of the operation (or any other person who may be affected) against the need for the activity in investigative and operational terms”.34 It is thus reasonable to ask whether the intrusiveness of some policing approaches outweighs the case for any involvement of law enforcement at all.

It is true that evidence-based policing has, to date, been focused on crime and disorder outcomes, and less attention has been paid to potentially negative effects of operational policing, or to other forms of public safety.35 Scholars have argued that insufficient attention is paid to the effects of high rates of police contact on community members and neighborhood health.36 My own research with colleagues has explored whether hot spots policing in Philadelphia had a ‘backfire effect’ on the policed communities, finding no evidence of it;37 however, it is a rare example, and more research around the pros and cons of proactive policing is warranted.38 Linkages between crime policy and other societal impacts exist, such as public health.39 Bell and others have argued more research can focus on the role police play in confining marginalized groups, suppressing labor movements, exacerbating health discrepancies, and reinforcing racial segregation.40–41

This is, however, a more philosophical discussion than can be addressed in a practical introductory book. Few reasonable people doubt there are harms associated with policing, but there are also considerable benefits. The policy challenge is to minimize the former, while maximizing the latter. At a fundamental level it depends on whether you feel policing has an important role to play in public safety and tranquility. If so, then evidence-based policing has a part in establishing good practice in achieving beneficial outcomes with minimal side effects. If you feel there are areas where policing would be better replaced with an alternative option, there is still value in seeking an evidential foundation to support this perspective.
REPLICATION AND PUBLICATION BIAS

Replication is the process of repeating a study or experiment to see if results similar to the original study can be reproduced. It is an essential aspect of science. A single study with a positive finding can be interesting, but once you have repetition of the finding in similarly designed studies, then you start to elevate the intervention towards the status of scientific evidence.42 For example, while the Philadelphia Foot Patrol Experiment43 might have been a single novel data point, further research has been positive.44–45 It increases the confidence that the Philadelphia result was not unique. Science relies on repetition to confirm innovative approaches to problems. Unfortunately, lack of replication is a problem across many scientific fields,46 and has been referred to as a replication crisis.47

Policing research is no different. First, it is difficult to conduct high-quality research given police departments are often hesitant to begin with. Second, researchers get grants for new ideas and rarely attract support to replicate already-tested ideas. Third, there may be low replication of studies due to publication bias.

What is publication bias? Editors of journals and book publishers have an incentive to publish novel research that will garner attention and attract interest and citations, and are biased against reproductions of existing research as they are less interesting. In the broader field of criminology, just 2.2 percent of the articles published in the top-tier journals were replications, and often only because the follow-up study had conflicting findings from the original.48

A second component of publication bias is the reality that positive results are more likely to be published than null findings.49 The potential impact of this is shown in Figure 15‑1. On the left side, the desire of researchers and editors to favor studies that have positive findings can lead to the perception that a treatment has an effect. But as you can see on the right side of Figure 15‑1, if publication bias exists, and we add the studies that occurred but did not get published, the picture changes considerably. A small simulation of this effect appears at the end of this section.

Figure 15‑1 Publication bias can warp perception of treatment efficacy

Replication or expansion of the lessons from a research study is an important aspect of expanding knowledge and understanding, part of the scientific method (Figure 15‑2). Given the limited extent of replication in criminology generally, policing specifically, and the institutional hurdles pushing against repetition studies from within academia, the lack of replicability across many areas of policing is a significant problem.

Figure 15‑2 Replication as part of the scientific method
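To make the Figure 15‑1 point concrete, here is a toy simulation of a literature where a treatment genuinely does nothing. Every number in it, from the effect size to the noise level to the publication rule, is an illustrative assumption rather than data from any real field.

```python
import random
import statistics

random.seed(42)

TRUE_EFFECT = 0.0   # the intervention actually does nothing
N_STUDIES = 1000
NOISE_SD = 1.0      # sampling noise in each study's estimate

# Each study produces a noisy estimate of the (zero) true effect
estimates = [random.gauss(TRUE_EFFECT, NOISE_SD) for _ in range(N_STUDIES)]

# Publication rule: only clearly positive results make it into print
published = [e for e in estimates if e > 1.0]

print(f"mean effect, all studies:       {statistics.mean(estimates):+.2f}")
print(f"mean effect, published studies: {statistics.mean(published):+.2f}")
```

Reading only the published studies, a reviewer would conclude the treatment has a solid positive effect; pooling everything that was actually run recovers the true picture of no effect at all, which is the contrast between the two sides of Figure 15‑1.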
RANK STIFLES INNOVATION

Credit is due to the senior police officer who lamented, “There are a lot of people in the organization that would not challenge me as a superintendent because I am a superintendent. Even if the thing I said was the most absolutely ludicrous, ridiculous thing in the world”.50 I have seen too many police leaders dictate ineffective policies based on their limited experience, and force acquiescence from subordinates through the stifling power of their rank. Behind the scenes, while more informed mid-level commanders pointed out the operational flaws to me, they succumbed in public.

There are chilling comparisons to the medical field here. After all, “just a few decades ago, best medical practice was driven by things like eminence, charisma, and personal experience . . . Many doctors—especially the most senior ones—fought hard against this, regarding ‘evidence based medicine’ as a challenge to their authority”.51 If we are to embrace evidence-based policing rather than eminence-based policing, senior officers will have to become more comfortable with doubt and being honest about their knowledge limitations. They may have to learn to better explain their decisions, be open to alternative approaches, or even (gasp) accept respectful criticism from people below them.

The reality is that—faced with resource constraints or unimaginative leadership regimes—innovative officers are often stifled in their attempts to advance their profession. Some officers have been informally discouraged, and some have seen their careers suffer as a result. It is not surprising that a few choose rational ignorance, where the cost of learning about evidence-based policing outweighs any benefits of possessing the information, given an absence of opportunity to employ it.52

RESEARCH STUDIES THAT WITHHOLD TREATMENT ARE UNETHICAL

In some cases, engaging in a randomized controlled experiment would be considered an ethical breach.
When that optimism feeds a sense that it would be unethical to test the intervention due to its ‘obvious’ likely success, it subtly produces an ironic-unethical argument. A claim of ethical concern to support wholesale adoption of a strategy is actually unethical because it robs us of learning the real impact of the intervention. 2 †Adapted from www.forskningsetikk.no/en/guidelines/social-scienceshumanities-law-and-theology/guidelines-for-research-ethics-in-the-socialsciences-humanities-law-and-theology/. W hat are the challenges with evidence - based policing ? 229 The second problem associated with the well-meaning but flawed ironic-unethical argument, is that it permits harmful or ineffective practices to continue. The classic examples of this include the DARE program and Scared Straight—both found to be more harmful than doing nothing.9, 51, 54 As Richard Nisbett has lamented: Society pays dearly for the experiments it could have conducted but didn’t. Hundreds of thousands of people have died, millions of crimes have been committed, and billions of dollars have been wasted because people have bulled ahead on their assumptions and created interventions without testing them before they were put into place.55 SUMMARY Few argue that evidence-based policing is a panacea to all the challenges facing law enforcement in the 21st century. Numerous fields (such as aviation safety) have advanced significantly with few randomized trials.25 Police officers have complained that experimental criminologists pay little credence to their experience and the craft of policing, and current research tends to have more executive value than frontline relevance.56 Command staff have struggled to implement evidence-based policing. The scholars they work with can be perceived as too purist, failing to recognize that experiments contain risks for senior police officers, risks that are not borne by the researcher. Compromise is often necessary to mitigate the challenges of implementing a research program. Some of those challenges include the ironic-unethical argument that withholding any kind of potentially beneficial treatment is unethical. This chapter has pointed out that to not properly test a novel intervention is inherently unethical. We have an obligation to the public to properly evaluate significant interventions, not just in terms of efficient use of funds but also in the provision of community safety. Arguments that experimentation is unethical are often based in a flawed optimism bias and are responsible for allowing untested interventions to achieve greater attention than they have deserved. 230 W hat are the challenges with evidence - based policing ? REFERENCES 1. Williams, E. and Cockcroft, T. (2019) Moving to the inevitability of evidence-based policing. In Evidence Based Policing: An Introduction (Mitchell, R.J. and Huey, L. eds), Policy Press: Bristol, pp. 131–141. 2. Fleming, J. and Rhodes, R.A.W. (2018) Can experience be evidence?: Craft knowledge and evidence-based policing. Policy & Politics 46 (1), 3–26, p. 14. 3. Cordner, G. (2020) Evidence-Based Policing in 45 Small Bytes, National Institute of Justice: Washington, DC, p. 6. 4. Telep, C.W. and Lum, C. (2014) The receptivity of officers to empirical research and evidence-based policing: An examination of survey data from three agencies. Police Quarterly 17 (4), 359–385. 5. Chan, S. (2017) Stanislav Petrov, soviet officer who helped avert nuclear war, is dead at 77. New York Times: New York. 6. Greenhalgh, T. 
6. Greenhalgh, T. (2020) Will COVID-19 be evidence-based medicine’s nemesis? PLoS Med 17 (6), e1003266.
7. Aniskiewicz, R. and Wysong, E. (1990) Evaluating DARE: Drug education and the multiple meanings of success. Policy Studies Review 9 (4), 727–747.
8. Clayton, R.R., Cattarello, A.M. and Johnstone, B.M. (1996) The effectiveness of drug abuse resistance education (Project DARE): Five-year follow-up results. Preventive Medicine 25, 307–318.
9. Rosenbaum, D.P., Flewelling, R.L., Bailey, S.L., Ringwalt, C.L. and Wilkinson, D.L. (1994) Cops in the classroom: A longitudinal evaluation of drug abuse resistance education (DARE). Journal of Research in Crime and Delinquency 31, 3–31.
10. Sharot, T. (2012) The Optimism Bias: A Tour of the Irrationally Positive Brain, Vintage: New York.
11. Bayley, D.H. and Bittner, E. (1984) Learning the skill of policing. Law and Contemporary Problems 47 (4), 35–59, p. 36.
12. Sherman, L.W. (1984) Experiments in police discretion: Scientific boon or dangerous knowledge? Law and Contemporary Problems 47, 61–80, p. 75.
13. Sparrow, M.K. (2016) Handcuffed: What Holds Policing Back, and the Keys to Reform, Brookings Institution Press: Washington, DC, pp. 131, 137.
14. Hinkle, J.C., Weisburd, D., Telep, C.W. and Petersen, K. (2020) Problem-oriented policing for reducing crime and disorder: An updated systematic review and meta-analysis. Campbell Systematic Reviews 16, e1089, 1–86.
15. Sherman, L.W. (1998) Evidence-Based Policing, Police Foundation: Washington, DC, p. 4.
16. Eck, J.E. (2006) When is a bologna sandwich better than sex? A defense of small-n case study evaluations. Journal of Experimental Criminology 2 (3), 345–362.
17. Knutsson, J. and Tompson, L. (2017) Introduction. In Advances in Evidence-Based Policing (Knutsson, J. and Tompson, L. eds), Routledge: London, pp. 1–9.
18. Lum, C. and Kennedy, L.W. (2012) In support of evidence-based approaches: A rebuttal to Gloria Laycock. Policing: A Journal of Policy and Practice 6 (4), 317–323.
19. Laycock, G. (2012) In support of evidence-based approaches: A response to Lum and Kennedy. Policing: A Journal of Policy and Practice 6 (4), 324–326.
20. Laycock, G. (2012) Happy birthday? Policing: A Journal of Policy and Practice 6 (2), 101–107.
21. Fielding, N. (2019) Evidence-based practice in policing. In Critical Reflections on Evidence-Based Policing (Fielding, N. et al. eds), Routledge: London, pp. 201–213.
22. Scott, M.S. (2017) Reconciling problem-oriented policing and evidence-based policing. In Advances in Evidence-Based Policing (Knutsson, J. and Tompson, L. eds), Routledge: London, pp. 27–44.
23. Moore, M.H. (2006) Improving police through expertise, experience, and experiments. In Police Innovation: Contrasting Perspectives (Weisburd, D. and Braga, A.A. eds), Cambridge University Press: New York, pp. 322–338, p. 324.
24. Tilley, N. and Laycock, G. (2017) The why, what, when and how of evidence-based policing. In Advances in Evidence-Based Policing (Knutsson, J. and Tompson, L. eds), Routledge: London, pp. 10–26.
25. Sidebottom, A. and Tilley, N. (2019) Evaluation evidence for evidence-based policing: Randomistas and realists. In Critical Reflections on Evidence-Based Policing (Fielding, N. et al. eds), Routledge: London, pp. 72–92.
27. Morrell, K. and Rowe, M. (2019) Democracy, accountability and evidence-based policing: Who calls the shots? In Critical Reflections on Evidence-Based Policing (Fielding, N. et al. eds), Routledge: London, pp. 115–132.
28. Marvell, T.B. and Moody, C.E. (1996) Specification problems, police levels, and crime rates. Criminology 34 (4), 609–646.
29. Chalfin, A. and McCrary, J. (2018) Are U.S. cities underpoliced? Theory and evidence. The Review of Economics and Statistics 100 (1), 167–186.
30. Dau, P.M., Vandeviver, C., Dewinter, M., Witlox, F. and Beken, T.V. (2021) Policing directions: A systematic review on the effectiveness of police presence. European Journal on Criminal Policy and Research. https://doi.org/10.1007/s10610-021-09500-8
31. Best, C. (2021) #35 (Carmen Best) podcast episode. In Reducing Crime Podcast (Ratcliffe, J.H. producer): Philadelphia, PA.
32. Time (1969) Canada: City without cops. Time Magazine.
33. Keay, S. and Kirby, S. (2018) The evolution of the police analyst and the influence of evidence-based policing. Policing: A Journal of Policy and Practice 12 (3), 265–276.
34. Home Office (2010) Covert Surveillance and Property Interference: Revised Code of Practice, Home Office: London, pp. 24–25.
35. Kramer, R. and Remster, B. (2022) The slow violence of contemporary policing. Annual Review of Criminology 5, 43–66.
36. Pryce, D.K., Olaghere, A., Brown, R.A. and Davis, V.M. (2021) A neglected problem: Understanding the effects of personal and vicarious trauma on African Americans’ attitudes toward the police. Criminal Justice and Behavior 48 (10), 1366–1389.
37. Ratcliffe, J.H., Groff, E.R., Sorg, E.T. and Haberman, C.P. (2015) Citizens’ reactions to hot spots policing: Impacts on perceptions of crime, disorder, safety and police. Journal of Experimental Criminology 11 (3), 393–417.
38. Weisburd, D. and Majmundar, M.K., eds (2018) Proactive Policing: Effects on Crime and Communities, National Academies of Sciences Consensus Study Report: Washington, DC.
39. Wood, J.D., Taylor, C.J., Groff, E.R. and Ratcliffe, J.H. (2015) Aligning policing and public health promotion: Insights from the world of foot patrol. Police Practice and Research 16 (3), 211–223.
40. Bell, M.C. (2021) Next-generation policing research: Three propositions. Journal of Economic Perspectives 35 (4), 29–48.
41. Fagan, J.A. (2019) Policing and segregation. In The Dream Revisited (Ellen, I. and Steil, J. eds), Columbia University Press: New York, pp. 153–155.
42. Zwaan, R., Etz, A., Lucas, R. and Donnellan, M. (2018) Making replication mainstream. Behavioral and Brain Sciences 41 (E120).
43. Ratcliffe, J.H., Taniguchi, T., Groff, E.R. and Wood, J.D. (2011) The Philadelphia foot patrol experiment: A randomized controlled trial of police patrol effectiveness in violent crime hotspots. Criminology 49 (3), 795–831.
44. Piza, E.L. and O’Hara, B.A. (2014) Saturation foot-patrol in a high-violence area: A quasi-experimental evaluation. Justice Quarterly 31 (4), 693–718.
45. Novak, K.J., Fox, A.M., Carr, C.M. and Spade, D.A. (2016) The efficacy of foot patrol in violent places. Journal of Experimental Criminology 12 (3), 465–475.
46. Baker, M. (2016) Is there a reproducibility crisis? Nature 533, 452–454.
47. Lewandowsky, S. and Oberauer, K. (2020) Low replicability can support robust and efficient science. Nature Communications 11 (358), 1–12.
48. McNeeley, S. and Warner, J.J. (2015) Replication in criminology: A necessary practice. European Journal of Criminology 12 (5), 581–597.
49. Murad, M.H., Chu, H., Lin, L. and Wang, Z. (2018) The effect of publication bias magnitude and direction on the certainty in evidence. BMJ Evidence-Based Medicine 23 (3), 84–86.
50. May, T., Hunter, G. and Hough, M. (2017) The long and winding road. In Advances in Evidence-Based Policing (Knutsson, J. and Tompson, L. eds), Routledge: New York, pp. 139–156.
51. Goldacre, B. (2013) Building Evidence into Education: Independent External Review Report, Independent External Review Commissioned by the UK Department for Education: London.
52. Williams, D. (2021) Motivated ignorance, rationality, and democratic politics. Synthese 198, 7807–7827.
53. National Cancer Institute (2022) Diethylstilbestrol (DES) Exposure and Cancer. www.cancer.gov/about-cancer/causes-prevention/risk/hormones/des-fact-sheet (accessed January 2022).
54. Petrosino, A., Turpin-Petrosino, C., Hollis-Peel, M.E. and Lavenberg, J.G. (2013) Scared straight and other juvenile awareness programs for preventing juvenile delinquency: A systematic review. Campbell Systematic Reviews 9 (1), 1–55.
55. Nisbett, R.E. (2015) Mindware: Tools for Smart Thinking, Farrar, Straus and Giroux: New York, p. 148.
56. Thacher, D. (2008) Research for the front lines. Policing and Society 18 (1), 46–59.

16 WHAT IS NEXT FOR EVIDENCE-BASED POLICING?

Carl Sagan pointed out that “science is more than a body of knowledge. It’s a way of thinking”.* The IACP adopted the Law Enforcement Code of Ethics in 1957. One part states:

I will never act officiously or permit personal feelings, prejudices, political beliefs, aspirations, animosities or friendships to influence my decisions . . . I know that I alone am responsible for my own standard of professional performance and will take every reasonable opportunity to enhance and improve my level of knowledge and competence.†

* Carl Sagan, May 27, 1996, in a televised interview with Charlie Rose.
† www.theiacp.org/resources/law-enforcement-code-of-ethics (accessed February 2022).

This code would suggest that an evidence-based approach should be at the core of professional practice. At the time of writing, in too many police services, science is not a way of thinking but is instead viewed as an optional feature that can be indulged in if it is not too risky or too much effort. Evidence-based practice is one of the best ways to minimize these “personal feelings, prejudices, political beliefs, aspirations, animosities or friendships” in decision making, and it can certainly enhance knowledge and competence. So how can it be advanced? This chapter explores several hurdles within policing, and how they might be mitigated. It strives to look forward and explore how to expand evidence-based practice in policing and public safety.

EXPAND THE SCOPE OF EVIDENCE-BASED POLICING

At the time of writing, evidence-based policing is a little more than twenty years old, and even though rigorous research into crime and policing has existed for a longer period, there is still a great deal to be accomplished.
Until now, evidence-based policing has focused more on the management of street-level crime control and has somewhat neglected the needs of front-line officers.1 For example, we have advanced our policing of crime hot spots, domestic violence, gangs, and street-level drug dealing. But while we know we should focus on crime hot spots, there is less clarity on what officers should do when they get there.2–3 And areas such as internet scams, criminal investigation, the dark web, financial crimes, and traffic control still lack a strong evidence base to guide police practitioners. Police and academic researchers are only now turning their attention to recruitment, retention of police staff, training, officer safety and wellness, and the management of vulnerable populations such as people experiencing homelessness and addiction. As policing evaluation expands to these roles, it may become necessary to adjust the evidence hierarchy, or examine the role of different methodologies. Crime scientists and proponents of problem-oriented policing have argued that evidence-based policing should expand to incorporate a wider range of study approaches within what counts as ‘evidence’.4–6

CHANGE THE ‘BLAME GAME’

Aviation safety has numerous lessons for advancing evidence-based policing, such as checklists and incremental improvements. But there is another key example. In the 1990s, the aviation industry (like others) started to use the term ‘no-blame culture’ as organizations tried to replace punitive workplaces that punished what could often be genuine mistakes and errors of judgement. There was little evidence that penalizing offenders produced remedial benefits or prevention, and it generated a negative culture of fear and risk aversion. Many systemic errors can be reduced if organizations can learn not only from incidents, but also from ‘near misses’ and self-reported errors. These can be combined into analytical data sets of incidents and ‘free lessons’. As James Reason, a leader in this area, has however noted, “To achieve this, it is first necessary to engineer a reporting culture—not an easy thing, especially when it requires people to confess their own slips, lapses and mistakes”.7

The progress in American aviation has been impressive. When a pilot makes an error, there is an obligation to report it to the Aviation Safety Reporting System (ASRS), and design features encourage reporting. For example, the system is hosted by an independent third party—NASA (National Aeronautics and Space Administration)—rather than the FAA, an anonymity program strips pilot information from the report, and it includes an immunity policy. In aviation, the name ‘no-blame culture’ is a misnomer: this is not a blanket immunity policy. But it aims to differentiate those who willfully flout aviation rules and safety regulations from those who make inadvertent errors.

Making good evidence-based policies requires being informed and having access to data. Data can only come from a culture that reports when things go awry. As Reason explains, “an informed culture can only be built on the foundations of a reporting culture. And this, in turn, depends upon establishing a just culture”.8
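The mechanics of such a reporting pipeline are straightforward to prototype. As a minimal sketch only (written in Python, with invented field names, and not modeled on the ASRS or any agency’s actual system), the example below strips identifying information from a self-reported error before it joins an analytical data set of ‘free lessons’:

```python
# Minimal sketch of a 'free lessons' reporting pipeline, loosely inspired by
# the ASRS design described above. All field names are hypothetical.

IDENTIFYING_FIELDS = {"officer_name", "badge_number", "email"}

def anonymize(report: dict) -> dict:
    """Return a copy of the report with identifying fields removed,
    keeping only the lesson-relevant content for analysis."""
    return {key: value for key, value in report.items()
            if key not in IDENTIFYING_FIELDS}

# A self-reported near miss, as it might arrive from an officer.
raw_report = {
    "officer_name": "A. Example",          # hypothetical
    "badge_number": "1234",                # hypothetical
    "email": "a.example@example.org",      # hypothetical
    "incident_type": "near miss",
    "narrative": "Misread the briefing and patrolled the wrong hot spot.",
    "contributing_factors": ["unclear tasking", "fatigue"],
}

lessons_database = []  # the analytical data set of incidents and 'free lessons'
lessons_database.append(anonymize(raw_report))
print(lessons_database[0])  # identifying fields are gone; the lesson remains
```

The design choice that matters here is the one NASA’s hosting of the ASRS embodies: identity is separated from lesson before anyone analyzes the data.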
Such a reporting culture runs counter to how many police officers view law enforcement culture. As officers have explained, “We are culturally not ready to critically approach our interventions and accept that things went wrong”. This can result in projects being ‘doomed to succeed’.9 The highest risk of failure in evidence-based policing occurs when inexperienced managers in a police department with no previous knowledge of evidence-based practice attempt to test novel interventions.10

Learning comes from more honest reporting, and honest reporting comes from a place of trust. Now that most patrol officers wear body-worn cameras on the street, we are likely to witness honest lapses of judgement from just about every officer at some point. There will be times when errors are egregious and sanctions should be applied; however, the prevailing view from officers in many of the police departments I work with is that the system is overwhelmingly punitive. The trust between street cops and management is not there. As a former British police chief told me, “It’s hard to take responsibility for the organization, when people outside want individuals to be sacked”. In the future, let us hope that more informed and evidence-based policies can flow from greater access to data, not only when things go right, but when they go wrong.

Making progress by—paradoxically—embracing failure and failing often11 will flow from incorporating evidence-based principles into police academy training, and into police promotional training and assessment. Giving projects ‘permission to fail’ is vital to learning and moving forward.9

INSTITUTIONALIZE ANALYTICAL SUPPORT

It has been said that academics and police engage in a ‘dialogue of the deaf’ where neither side can communicate with the other.12 I worry this is increasingly common, having seen numerous academics with no practical or field experience fail miserably in their attempts to connect with practitioners. One possible conduit for less ‘deafness’ in the conversation is to enhance the role of crime analysts. Crime analysts are police personnel who analyze data and information and help the police department make operational decisions in investigations, crime control, and disorder mitigation. They are often knowledgeable regarding evidence-based techniques.13 Currently though, “What is lacking in the description and research of evidence-based practices is guidance for integrating crime analysis into the day-to-day crime reduction operations of a police department”.14

One area where analysts are often strong is practical criminological theory. Routine activities theory, the rational choice perspective, crime pattern theory, and deterrence theories are closely connected to crime prevention. These approaches can be immensely valuable in forming research hypotheses. They can help generate the mechanisms that can drive effective crime reduction and form a good foundation for evidence-based policy.15 Evidence-based policing is policy-related, and thus more of a strategic activity that incorporates theoretical perspectives. Unfortunately, crime analysts tend to be overwhelmingly tactical in their orientation, and strategic thinking and theoretical understanding are rarely expected or requested.
Analysts report being resentful when asked to perform activities that do not link to the immediate identification and capture of offenders.16 Analysts, whether sworn or civilian, may also worry that they do not have the education or training to undertake evidence-based policing projects, a point recognized by Evans and Kebbell: “Police agencies frequently lack the skills for effective evaluation and fear that what they have done may not withstand scrutiny”.17 And as Lum and Koper note (in a single spectacular sentence):

At the same time, if police officers do not see the value of products from research or analysis, if first-line supervisors view such tactics as overly academic or not implementable, if commanders see such information or those who generate it as a threat to their own authority, or if chiefs and sheriffs try to force implementation without a plan to improve receptivity among their officers, evidence-based policing (or any other reform, for that matter), will hit formidable barriers within organizational systems and cultures.13

A potential solution is of course more training in areas such as experimental methods, policy studies, and strategic intelligence. Another solution is for police departments to expand their researcher-practitioner partnerships. Rather than engage in what is sometimes called ‘hit-and-run’ research—a partnership just for the purposes of one study—researchers can work with a department in a long-term capacity, building trust and insider knowledge over a period of time. Some police forces have used visiting fellow programs, with practitioners seconded for some months into academic departments and academics into police agencies. The fellows can act as translators between academics and practitioners. Embedded criminologists also bring many benefits, as identified by Anthony Braga, including:18

• Bringing research knowledge and analytical skills to the department
• Bringing rigorous evaluation methods that help the department learn what works
• Making contributions to research and policy by accessing internal data
• Providing scientific evidence that can assist with day-to-day problems
• Providing real-world benefits in their interactions with other practitioners

Braga worked for many years with the Boston Police Department, and in an informal capacity I have worked for nearly two decades with the Philadelphia Police Department. At present, however, too few police agencies have long-term research collaborations with academics who could bring real value to the organization’s evidence-based policy initiatives.

ADDRESS IMPLEMENTATION CHALLENGES

Evidence-based studies and policies are useless unless they are successfully implemented. Fortunately, researchers have found that initiatives implemented from an evidence-based or data-driven perspective are more likely to be effective.19 The strategies were better implemented, the target communities received a greater dosage of the intervention, and it was more likely that the project was sustained beyond the initial funding. But policing is also littered with failed attempts to transfer good practice from one place to another. Naïve implementation is common, and I note ten project implementation problems that are likely to occur in Box 16‑1. Whether you are starting a new research study, or adopting an existing good practice from the evidence-based policing literature, how can these risks be reduced?
BOX 16‑1 TEN PROJECT IMPLEMENTATION PROBLEMS THAT USUALLY OCCUR

1. Participants will have unrealistic expectations of the project’s anticipated benefits
2. Other people will not see any potential benefit and will actively resist the work
3. There will be insufficient project-related communication within the police department
4. Holding key participants accountable to their tasks will be a challenge
5. External stakeholders will promise more than they can deliver
6. People always anticipate that project integrity and data capture will go better than they do
7. A few officers will severely misunderstand any instructions and ‘do their own thing’
8. Analysis of the results will take longer than anyone wants or expects
9. If there is a whiff of success, leaders will immediately want to expand the project
10. If there is a suggestion the program is failing, the project champion will be blamed

Researchers who evaluated various British projects argued it was important to collaborate widely and reconcile expertise from various stakeholders “such that the knowledge base comprises the optimum mix in specific contexts of culture, memory, local knowledge and skills to inform evidence-based policing”.20 Marginalized communities and indigenous groups can often be sidelined and ignored when policies that affect them are formulated. Incorporating their perspective can be vital to successful reduction of crime and disorder.

A second way to increase the potential success of a project is to engage in a pre-mortem. This is a good way to incorporate the experience of the team. As Gary Klein explains, a pre-mortem is a meeting before a project starts, in which the implementation team try to imagine what might cause their project to fail.21 They then work backward to create a plan to help prevent potential obstacles and reduce the chances of failure. It is the opposite of a post-mortem, where everyone tries to figure out what went wrong and how they could have mitigated the failure. The pre-mortem is a more positive experience because the project has not yet failed, and it can increase team collaboration and investment.

DO MORE EXPERIMENTS

Noted criminologist Frank Cullen said, “If we cannot develop effective evidence-based interventions that are capable of controlling crime by doing good, our collective irrelevance in policy and practice will continue. Criminology will remain a ‘So what?’ field”.22 In policing specifically, while there continues to be growth in scholarship, the field lacks experimental evidence across many areas. Dr. Peter Neyroud, former Chief Constable of Thames Valley Police (UK), estimates that by the end of 2019, fewer than 500 experiments had ever been conducted in policing worldwide.‡ By comparison, the US National Institutes of Health’s website registered over 30,000 clinical trials in 2019 alone.§

‡ 445, based on personal communication with Peter, February 2022.
§ 32,517 at ClinicalTrials.gov. Some may not be specifically medical trials or randomized experiments (accessed February 2022).

I often wonder if the reason there is less experimentation in policing is not only that it is more difficult to undertake than in the hard sciences, but also that practitioners are simply not used to seeing it. It is not normalized in a policing world that still reveres experience and intuition.
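For readers unfamiliar with what random assignment actually involves, the mechanics are less daunting than they sound. The sketch below is a minimal illustration in Python, with invented beat names and crime counts rather than data from any study, and it is not a full trial protocol. It pairs hypothetical patrol beats by baseline crime and randomly assigns one beat from each pair to treatment, the block randomization design discussed earlier in the book:

```python
import random

# Hypothetical patrol beats with baseline crime counts (invented numbers).
beats = {"B1": 42, "B2": 40, "B3": 25, "B4": 27, "B5": 11, "B6": 9}

random.seed(2022)  # a fixed seed lets the assignment be audited and replicated

# Block (pairwise) randomization: order beats by baseline crime so that
# similar beats are paired, then randomly assign one of each pair to
# treatment. This sketch assumes an even number of beats.
ordered = sorted(beats, key=beats.get, reverse=True)
treatment, control = [], []
for i in range(0, len(ordered), 2):
    pair = ordered[i:i + 2]
    random.shuffle(pair)          # the 'coin flip' within each matched pair
    treatment.append(pair[0])
    control.append(pair[1])

print("Treatment beats:", sorted(treatment))
print("Control beats:  ", sorted(control))
```

Pairing before randomizing helps the treatment and control groups start out comparable, which matters when, as in most policing studies, only a handful of units are available.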
Being a ‘pracademic’—a police officer with not only professional expertise but also the academic training to examine research problems—may not be viewed as real policing.23–24 As a result, few in the job advertise their academic credentials. That lack of normalization and experience with research can mean that officers are understandably reticent, so one way to overcome this is to do more studies and experiments. The value of checklists (explained earlier in the book) is their ability to break down complicated tasks into smaller, more manageable steps. In that vein, I volunteer a 12-step checklist for conducting experiments in Box 16‑2. In general, greater credence should be given to the reality that resistance to evidence-based approaches is both internal (poor communication, cultural resistance, lack of resources) and external (reliance on external funding resulting in political pressure).25

BOX 16‑2 THE 12-STEP GUIDE TO CONDUCTING AN EXPERIMENT

1. Find a research area that interests you and about which you are passionate
2. Conduct focused background research on reliable sources to hone a hypothesis and research question
3. Clearly identify an outcome you want to change, and an intervention you want to test
4. Select a research method that is appropriate for your intervention and outcomes
5. Monitor other variables that might be affected, including those with potentially negative repercussions
6. Pilot your intervention, where possible, to check project viability and study processes
7. If able, identify equivalent control areas or people as a comparison group
8. During the study, maintain its integrity and monitor the implementation closely
9. Analyze the outcome and the study honestly and diligently
10. Provide a short, readable summary to the stakeholders who participated
11. Publish the results in outlets that will be accessible to practitioners
12. Plan your next study

CHANGE BEHAVIOR

As one UK police stakeholder said, “That culture is not there in policing, and until it is, it will be very difficult to get evidence-based policing accepted more widely than by a group of enthusiasts . . . The real challenge is to get everyone in policing to understand how it’s got to change”.26 Not only is police culture vested with an often-inscrutable belief in ideology and experience, but these characteristics can also combine with “an anti-academic rhetoric within policing. There are phrases that refer to officers engaged in research as ‘living in an ivory tower’ or ‘all brain, no common sense.’ Real officers come from the ‘school of life’”.27

So far in this chapter, many of the suggestions for moving forward entail tackling a culture in policing that is reluctant to change and wary of the value of research. However, defining and addressing culture is tricky. The real goal is to change behavior. Numerous frameworks for behavior change exist, though one relatively coherent approach links behavior to two factors specific to the person (capability and motivation) and one external component (opportunity):28

• Capability: Does the officer have the physical and psychological capacity for the activity? Are they armed with the necessary training, skills, and authorizations?
• Motivation: Does the officer have the internal enthusiasm and energy? Is there anticipation of a reward or benefit for changing, or curiosity to try something new?
• Opportunity: Beyond the individual, does the physical and organizational opportunity to change exist? Does the organization provide enough time and encouragement?

There are four core building blocks to driving change in an organization. People will change if they are shown how to adapt to the new approach. This requires consistent role models, a story that fosters understanding and a conviction to change, the necessary talent and skills (see Capability in the previous list), and formal mechanisms that support the changes.29 Basford and Schaninger’s building blocks for change, shown in Figure 16‑1, look simple in diagram form.** However, they represent considerable work to achieve. For example, in my own work I have seen police departments claim to adopt new approaches such as problem-oriented policing and the related SARA30 or PANDA models,31 yet fail to make the necessary organizational changes. Leaders still demanded traditional short-term responses to crime problems (a failure of role modeling), they provided little-to-no information to the organization (not fostering understanding), innovative staff with problem-solving skills were kept in reactive roles (failing to develop skills and talents), and they still relied on Compstat’s short-term focus to mandate compliance (formal mechanisms did not support long-term problem-solving). Real change involves more than just talk.

** Adapted from the graphic in Basford and Schaninger.29

Figure 16‑1 Basford and Schaninger’s building blocks for change

WHY WE NEED SCIENCE, NOT ASSUMPTIONS

I briefly mentioned the Cambridge-Somerville study in the first chapter. It was devised in the 1930s by Harvard professor Richard Clarke Cabot, and was the first major randomized controlled experiment in criminology. Local schools, churches, police, and welfare groups identified more than 500 ‘average’ or ‘difficult’ boys around the age of 10 from the Massachusetts areas of Cambridge and Somerville, near Boston. Half were selected for the program, while the boys not selected became the control group. For more than five years, the chosen kids received individual counseling, academic tutoring, family support, summer camps, and home visits. This is the kind of immersive assistance frequently advocated to this day to prevent less-privileged boys from lapsing into delinquency.

It was known within a few years of the program ending that it had no effect. But by the time the boys were adults in their 40s, the findings were even worse. Program participants in the Cambridge-Somerville Youth Study were more likely to have engaged in and been convicted of crimes, suffered symptoms of alcoholism and mental illness, and experienced stress-related disorders.32–33 This is one of the most celebrated examples of (and here is your word for the day) iatrogenic crime prevention: crime prevention that is actually harmful or has adverse side effects.34

Recently, Independent Domestic Violence Advisors were introduced to a specialist domestic abuse court in the UK. The advisors were specially trained to arrange the safety of domestic abuse victims, be a point of contact for them, and act as their advocates.
One would expect that the advisors enhanced the effectiveness of the criminal justice system and kept victims safer; however, a recent study found them to be iatrogenic. Victims who had access to the advisors had “a 12% reduced likelihood of their accused abusers being convicted, a 10% increased risk of repeat victimization and harm of repeat crimes 700% higher than victims whose cases were heard without that opportunity”.35

Both studies are perfect examples of why research matters for social science interventions. Both made a lot of sense theoretically and were likely praised by administrators and even the participants. You can just imagine a sociologist wondering why you would study something that is so ‘obviously’ beneficial. These are cautionary lessons on the importance of evaluating every crime prevention initiative.
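Findings like these are at heart comparisons of risk between two groups, the style of calculation covered in the analysis chapter. As a hedged illustration only (the counts below are invented and are not data from either study), this short Python sketch shows how a relative risk and an odds ratio fall out of a two-by-two table:

```python
# Hypothetical 2x2 table of outcomes; all counts are invented for illustration.
treated_bad, treated_ok = 30, 70   # e.g., repeat victimization with the program
control_bad, control_ok = 20, 80   # the same outcome without the program

risk_treated = treated_bad / (treated_bad + treated_ok)   # 0.30
risk_control = control_bad / (control_bad + control_ok)   # 0.20

# A relative risk of 1.5 reads as "a 50% increased risk", the same style of
# statement as the 10% figure quoted above.
relative_risk = risk_treated / risk_control               # 1.5
odds_ratio = (treated_bad / treated_ok) / (control_bad / control_ok)  # ~1.71

print(f"Relative risk: {relative_risk:.2f}")  # above 1: the program did harm
print(f"Odds ratio:    {odds_ratio:.2f}")
```

Whether a comparison like this reflects a real effect or chance is the province of the significance tests described in Chapter 13.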
SUMMARY

Simone de Beauvoir once wrote, “I tore myself away from the safe comfort of certainties through my love for truth—and truth rewarded me”.36 A research-oriented and scientific approach to policing may be challenging and drag us away from the comfort of certainties, but the gains could be vital. Mitchell and Lewis argue that there is an ethical imperative to do so, in that “not implementing an evidence-based practice that prevents crime is a failure to protect the innocent and is no different than turning away from a crime in progress. Police leadership should begin a discussion . . . of their duty to use evidence-based practices”.37

Should police leaders be under an ethical obligation to embrace evidence-based policing as their duty? In pondering this question, you might consider the position of the 18th-century Scottish philosopher David Hume, who argued that you should not assert what a person ‘ought’ to do based only on what ‘is’.38 For example, while we know that hot spots policing can reduce crime (a largely factual statement), does this mean a police chief is obliged to do it (a moral position)? What if she is dealing with a community struggling with unrest, resentful of police presence at that time? Operational decisions are grounded in a variety of factors, and we should be careful about falling afoul of Hume’s ‘is-ought’ warning. At the very least, however, good leaders should want to access the best available research to aid their decision making.

This chapter explored ways to move evidence-based policing research forward. The culture of policing is still mired in the ‘blame game’, where failure is too often seen as individual rather than organizational and structural. This partially explains the reluctance of many in policing to embrace evidence-based policing. Supporting officers with analysts who are trained and expected to engage in evidence-based policing is one proposed approach, as is an expansion of the number of embedded criminologists in police departments. All of this requires a change in behavior, which means addressing the capabilities and motivation of officers and expanding the opportunities presented to them. These factors can be enhanced if leaders and peers (inside and outside of policing) act as role models, foster an understanding of and conviction to engage with evidence-based practice, and help develop the necessary skills and talent. Finally, the police managerial system has to be aligned with evidence-based policing. This means adjusting processes, procedures, forms, promotion criteria, and reward mechanisms. Advancing evidence-based policing is not a trivial task, but it is one we should embrace.

REFERENCES

1. Thacher, D. (2008) Research for the front lines. Policing and Society 18 (1), 46–59.
2. Braga, A.A. and Weisburd, D.L. (2022) Does hot spots policing have meaningful impacts on crime? Findings from an alternative approach to estimating effect sizes from place-based program evaluations. Journal of Quantitative Criminology 38 (1), 1–22.
3. Ratcliffe, J.H. and Sorg, E.T. (2017) Foot Patrol: Rethinking the Cornerstone of Policing, Springer (SpringerBriefs in Criminology): New York.
4. Knutsson, J. and Tompson, L. (2017) Introduction. In Advances in Evidence-Based Policing (Knutsson, J. and Tompson, L. eds), Routledge: London, pp. 1–9.
5. Tilley, N. and Laycock, G. (2017) The why, what, when and how of evidence-based policing. In Advances in Evidence-Based Policing (Knutsson, J. and Tompson, L. eds), Routledge: London, pp. 10–26.
6. Scott, M.S. (2017) Reconciling problem-oriented policing and evidence-based policing. In Advances in Evidence-Based Policing (Knutsson, J. and Tompson, L. eds), Routledge: London, pp. 27–44.
7. Reason, J. (1998) Achieving a safe culture: Theory and practice. Work and Stress 12 (3), 293–306, p. 302.
8. Reason, J. (2010) Safety paradoxes and safety culture. Injury Control and Safety Promotion 7 (1), 3–14, p. 2.
9. Fleming, J. and Wingrove, J. (2017) “We would if we could . . . but not sure if we can”: Implementing evidence-based practice: The evidence-based practice agenda in the UK. Policing: A Journal of Policy and Practice 11 (2), 202–213, p. 209.
10. Neyroud, P.W. (2017) Learning to Field Test in Policing: Using an Analysis of Completed Randomised Controlled Trials Involving the Police to Develop a Grounded Theory on the Factors Contributing to High Levels of Treatment Integrity in Police Field Experiments, Institute of Criminology, University of Cambridge: Cambridge.
11. Babineaux, R. and Krumboltz, J. (2013) Fail Fast, Fail Often: How Losing Can Help You Win, Penguin Publishing Group: New York.
12. Bradley, D. and Nixon, C. (2009) Ending the “dialogue of the deaf”: Evidence and policing policies and practices. An Australian case study. Police Practice and Research 10 (5–6), 423–435.
13. Lum, C. and Koper, C.S. (2017) Evidence-Based Policing: Translating Research into Practice, Oxford University Press: Oxford, p. 135.
14. Smith, J.J., Santos, R.B. and Santos, R.G. (2018) Evidence-based policing and the stratified integration of crime analysis in police agencies: National survey results. Policing: A Journal of Policy and Practice 12 (3), 303–315, p. 303.
15. Eck, J.E. (2017) Some solutions to the evidence-based crime prevention problem. In Advances in Evidence-Based Policing (Knutsson, J. and Tompson, L. eds), Routledge: London, pp. 45–63.
16. O’Shea, T.C. and Nicholls, K. (2003) Crime Analysis in America: Findings and Recommendations, Office of Community Oriented Policing Services: Washington, DC, p. 30.
17. Evans, J. and Kebbell, M. (2012) Integrating intelligence into policing practice. In Policing and Security in Practice (Prenzler, T. ed), Palgrave Macmillan: New York, p. 84.
18. Braga, A.A. (2013) Embedded Criminologists in Police Departments, Police Foundation: Washington, DC, p. 20.
19. Saunders, J., Hipple, N.K., Allison, K. and Peterson, J. (2020) Estimating the impact of research practitioner partnerships on evidence-based program implementation. Justice Quarterly 37 (7), 1322–1342.
20. Davies, P., Rowe, M., Brown, D.M. and Biddle, P. (2021) Understanding the status of evidence in policing research: Reflections from a study of policing domestic abuse. Policing and Society 31 (6), 687–701.
21. Klein, G. (2007) Performing a project premortem. Harvard Business Review.
22. Cullen, F.T. (2011) Beyond adolescence-limited criminology: Choosing our future—the American Society of Criminology 2010 Sutherland address. Criminology 49 (2), 287–330, pp. 318–319.
23. Bullock, K. and Tilley, N. (2009) Evidence-based policing and crime reduction. Policing: A Journal of Policy and Practice 3 (4), 381–387.
24. Huey, L. and Mitchell, R.J. (2015) Unearthing hidden keys: Why pracademics are an invaluable (if underutilized) resource in policing research. Policing: A Journal of Policy and Practice 10 (3), 300–307.
25. Huey, L., Mitchell, R.J., Kalyal, H. and Pregram, R. (2021) Implementing Evidence-Based Research: A How-to Guide for Police Organizations, Policy Press: Bristol.
26. May, T., Hunter, G. and Hough, M. (2017) The long and winding road. In Advances in Evidence-Based Policing (Knutsson, J. and Tompson, L. eds), Routledge: New York, pp. 139–156, p. 151.
27. Murray, A. (2019) Why is evidence based policing growing and what challenges lie ahead? In Evidence Based Policing: An Introduction (Mitchell, R.J. and Huey, L. eds), Policy Press: Bristol, pp. 215–229, p. 225.
28. Michie, S., van Stralen, M.M. and West, R. (2011) The behaviour change wheel: A new method for characterising and designing behaviour change interventions. Implementation Science 6 (42), 1–11.
29. Basford, T. and Schaninger, B. (2016) The four building blocks of change. McKinsey Quarterly 1–7, April.
30. Eck, J.E. and Spelman, W. (1987) Problem Solving: Problem-Oriented Policing in Newport News, Police Executive Research Forum: Washington, DC.
31. Ratcliffe, J.H. (2019) Reducing Crime: A Companion for Police Leaders, Routledge: London.
32. McCord, J. (2003) Cures that harm: Unanticipated outcomes of crime prevention programs. Annals of the American Academy of Political and Social Science 587 (1), 16–30.
33. McCord, J. and McCord, W. (1959) A follow-up report on the Cambridge-Somerville youth study. Annals of the American Academy of Political and Social Science 322 (1), 89–96.
34. Grabosky, P. (1996) Unintended consequences of crime prevention. In The Politics and Practice of Situational Crime Prevention (Homel, R. ed), Criminal Justice Press: Monsey, NY.
35. Ross, J., Sebire, J. and Strang, H. (2022) Tracking repeat victimisation after domestic abuse cases are heard with and without independent domestic violence advisors (IDVAs) in an English magistrate’s court. Cambridge Journal of Evidence-Based Policing 1–15, p. 14.
36. Beauvoir, S.D. (1974) All Said and Done: The Autobiography of Simone de Beauvoir 1962–1972, Putnam: New York.
37. Mitchell, R.J. and Lewis, S. (2017) Intention is not method, belief is not evidence, rank is not proof: Ethical policing needs evidence-based decision making. International Journal of Emergency Services 6 (3), 188–199, p. 191.
38. Hume, D. (1739 [1986]) Treatise of Human Nature, Penguin Classics: London.