A Relativity e-Book

Accelerate Your e-Discovery Efforts
8 Ways to Speed Up Review with Text Analytics

Let’s Catch Up With Your e-Discovery Workload.

Data volumes aren’t getting any smaller…
The largest cases run in Relativity have grown about 6x larger in the last 5 years:
• 32 million documents in the largest case in 2009
• 189 million documents in the largest case in 2014

… but you can handle them more efficiently than ever.
Reviewers working on cases in Relativity that use text analytics get through about 4x as many documents as those working on cases that don’t. (Based on median reviewer count and case size for cases in Relativity with and without analytics since 2009.)

Analytics is the engine that speeds up review. This e-book explains how real-world teams use it.

Copyright © 2015 kCura LLC. All rights reserved.

Table of Contents
Introduction: There’s More to Text Analytics Than Assisted Review
Chapter 1: Tie the Conversation Together with Email Threading
Chapter 2: Review Near-duplicate Documents at the Same Time
Chapter 3: Quickly Batch Foreign Language Documents to the Right Reviewers
Chapter 4: Expand Your Awareness of Critical Case Language
Chapter 5: Uncover Conceptually Similar Documents
Chapter 6: Prioritize Your Review with Document Clusters
Chapter 7: Categorize Case Data with Sample Documents
Chapter 8: Tackle Complex Cases with Computer-assisted Review
Conclusion: What Will You Accomplish with Text Analytics?

Introduction: There’s More to Text Analytics Than Assisted Review

There’s been a lot of discussion in legal and business communities about how computer-assisted review can make litigation less costly and tedious, but the story about the benefits of the technology behind it, text analytics, hasn’t been told.
We created this e-book for those who would like a better understanding of what text analytics is and how it can make their teams more productive. Each chapter focuses on one of eight principal ways this flexible and multi-faceted technology helps document review teams accomplish more with fewer resources and in less time.

The final chapter of this e-book explains how to get the most out of text analytics in computer-assisted review, but first, we’ll describe how to use text analytics to amplify your e-discovery efforts with features such as:
• Email threading
• Near-duplicate detection
• Foreign language identification
• Similar-document analysis
• Clustering
• ... and more

Throughout, you’ll find real-world examples of how text analytics increases review speed while lowering costs. Where massive review teams once needed to thumb through page after page of irrelevant information to dig into the real substance of a case, now smaller, more agile teams are using technology to quickly find the documents that make or break a case. Regardless of the number of documents in your review, you can handle them more efficiently and with more transparency than ever before.

In other words, text analytics helps you make sense of your data quickly. Read on to learn more.

Chapter 1: Tie the Conversation Together with Email Threading

Troutman Sanders faced a review of 742,000 documents. Of those documents, 441,362 were emails.

Email threading is a text analytics feature that works behind the scenes to detect all emails in a single conversation and organize them for faster review. With email messages grouped together, conversations are organized in a way that’s easy to understand and batch out to reviewers.
Additionally, identifying the inclusive emails, which include content from all emails in a chain, drastically decreases the number of records to review, preventing repetitive work while ensuring all the content is covered.

When assisting their client responding to a government subpoena, Troutman Sanders used email threading to eliminate 148,329 emails with 84,225 attachments. The team was able to focus their efforts on only 66 percent of their emails: those that contained entire conversations or unique attachments. Even with a conservative cost estimate of one dollar per document reviewed, email threading helped the team save their client approximately $233,000 (148,329 emails plus 84,225 attachments comes to 232,554 documents removed from review).

With email threading, you can:
• Review only what you need to. Identify the inclusive emails in a thread, such as the last email in a lengthy conversation, and avoid the repetitive replies and forwards.
• Understand custodian communication. Visually group together emails in a way that’s easy to understand.
• Keep email organized for reviewers. Batch out inclusive documents to your reviewers and keep email threads intact.

Troutman Sanders used Relativity Analytics to reduce their reviewable dataset by 34 percent. Read the complete Customer Win for details »

“We were very pleased with the email threading results in Relativity. Aside from helping us save time and money, the whole process took only a couple of hours to set up.”
Chris Haley, Director of Litigation Technology, Troutman Sanders eMerge

Chapter 2: Review Near-duplicate Documents at the Same Time

How much time do you spend rereading nearly identical information?
Text analytics can identify documents that are nearly identical, such as multiple versions or drafts of the same document, so you can focus your efforts on those documents at once. Near-duplicate identification saves you time when tagging highly similar documents with the issues in your case. When new documents are added during a review, near-duplicate detection can also incorporate those new documents into existing groups of duplicates where applicable and set apart any new information so you don’t have to reread content.

After text analytics finds documents with substantial amounts of text that are exactly the same, taking into account word order and location, it reports high-level information such as:
• The number of near-duplicate groups in your data set
• The average number of documents per group
• The average percent of similarity between the documents in each group

[Figure: groups of near-duplicate DOC and PDF files labeled with similarity percentages between 98 and 100.]

QUALITY CONTROL TIP: Use near-duplicate detection to verify the consistency of coding decisions made in your review by looking for any coding differences across groups of nearly identical documents.

As you review documents in each group of near-duplicates, you can quickly navigate to other documents in the group, view the percent of similarity between documents, and even compare documents to view the exact differences within the text.

Armed with the information text analytics provides through near-duplicate detection, you can apply issue-coding decisions to very similar documents quickly and accurately.

Chapter 3: Quickly Batch Foreign Language Documents to the Right Reviewers

Text analytics can detect the languages that exist in a document and tag the document with those languages. Documents can then be organized by language, so you can quickly batch them out to translators or reviewers fluent in those languages.
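As a rough illustration of organizing documents by language, the Python sketch below groups documents into batches by their most prevalent language. The document IDs, the per-document language shares, and the function name batch_by_primary_language are all hypothetical; this e-book does not describe how Relativity Analytics stores or exposes its language tags, so treat this as a picture of the idea, not of the product.

```python
from collections import defaultdict


def batch_by_primary_language(docs):
    """Group document IDs by the language with the largest share in each document.

    `docs` maps a document ID to a dict of {language: share of the text}.
    Both the IDs and the shares are made-up illustration data.
    """
    batches = defaultdict(list)
    for doc_id, language_shares in docs.items():
        # The primary language is the one with the highest share.
        primary = max(language_shares, key=language_shares.get)
        batches[primary].append(doc_id)
    return dict(batches)


# Hypothetical language tags for four documents.
docs = {
    "DOC-001": {"English": 0.90, "Portuguese": 0.10},
    "DOC-002": {"Chinese": 1.00},
    "DOC-003": {"Portuguese": 0.75, "English": 0.25},
    "DOC-004": {"English": 1.00},
}

batches = batch_by_primary_language(docs)
for language, doc_ids in batches.items():
    print(language, doc_ids)
```

A real workflow might also route mixed-language documents to more than one team; this sketch assigns each document to a single batch to keep the example short.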
Make the most of language detection in two simple steps:

Step 1: Gain insight into the languages of all documents in your case
Who knew your data set would include documents in Zulu? Information like the percent of prevalence for all languages present in each document is useful when planning your review.

Step 2: Batch documents to the correct review teams
Do all your documents in Chinese need to go to a specific review team? Do you have a project manager ready to send Portuguese content to your translation service? Easily batch out all documents by language to get them in the hands of the right people, right away.

Relativity Analytics can detect up to 173 languages.

HOW DOES IT WORK? Language identification analyzes the sentence structure and punctuation in your documents rather than solely relying on individual words to provide more accurate reports.

In addition to providing a big-picture view of which languages are present in your data set, text analytics technology can help you make better decisions about what should be sent to reviewers and what needs to be translated first.

Chapter 4: Expand Your Awareness of Critical Case Language

Ever find yourself staring at the final critical documents in a case but still missing the final pieces to the puzzle?

Through keyword expansion, text analytics gives you investigative power to widen searches and pull in more relevant documents sooner in the review process by teasing relevant terms out of key documents. Just search on keywords of your choice from any document, and from those terms text analytics can provide a list of terms that are conceptually related based on the unique language in your data set. By finding related terms from other documents, you can discover unexpected or hidden words, such as project code names and company or industry jargon, and ensure you aren’t overlooking anything important to your case.
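To make the idea of "conceptually related terms" more concrete, here is a minimal Python sketch that ranks terms by how similarly they are distributed across documents. It uses plain term-by-document counts and cosine similarity as a stand-in technique; the e-book does not say how Relativity Analytics computes related terms, and the sample documents, the code name "bluebird", and the function name related_terms are invented for illustration.

```python
import math
from collections import defaultdict


def related_terms(documents, keyword, top_n=3):
    """Rank terms by cosine similarity of their per-document counts to `keyword`."""
    # Build a sparse vector per term: {document index: count in that document}.
    vectors = defaultdict(lambda: defaultdict(int))
    for index, text in enumerate(documents):
        for term in text.lower().split():
            vectors[term][index] += 1

    if keyword not in vectors:
        return []

    def cosine(a, b):
        dot = sum(a[i] * b.get(i, 0) for i in a)
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
            sum(v * v for v in b.values())
        )
        return dot / norm if norm else 0.0

    target = vectors[keyword]
    scores = [
        (term, cosine(target, vector))
        for term, vector in vectors.items()
        if term != keyword
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


# Made-up documents in which "bluebird" plays the role of a project code name.
documents = [
    "merger pricing bluebird",
    "bluebird merger timeline",
    "lunch menu friday",
    "bluebird pricing memo",
]

related = related_terms(documents, "merger")
for term, score in related:
    print(f"{term}: {score:.2f}")
```

In this toy data set the unfamiliar term "bluebird" ranks first for the keyword "merger" because the two appear in the same documents, which is the kind of hidden code name the chapter describes. A production system would work from a much richer model of the language in the data set.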
With keyword expansion, you can:
• Get a sense of the uses of language to express the same or similar concepts
• Quickly uncover important terms that would have taken much longer to find by reading through documents
• Ensure the list of key terms in your review is complete
• Find code words or hidden language
• Understand how newly discovered terms rank in relationship to your search term

QUALITY CONTROL TIP: Do you continue to add unexpected terms, and time, during your review as you learn more from your documents? Use keyword expansion to find all your keywords up front and uncover code names and other terms you wouldn’t have known were important and maybe never would have found.

Ultimately, finding unknown terms saves time by refocusing your searches and returning your most relevant documents faster.

Chapter 5: Uncover Conceptually Similar Documents

Words that hold very similar meanings, such as “cold” and “frigid,” or words with multiple meanings, such as “leaves,” can skew results of traditional keyword searches and slow down the review process. Concept searching is another text analytics feature that helps overcome obstacles in standard searching techniques.

Concept searching goes beyond keywords to find documents based on ideas rather than specific terms. This allows you to identify important documents and follow an investigatory pattern, locating relevant documents even without knowledge of the specific terms, phrases, jargon, or code words that may be used in other documents.

While reviewing a key document, you can quickly locate additional documents that are conceptually similar, either based on a select excerpt of text or the document overall. Text analytics compares your chosen text or document with your entire data set, returning documents ranked by conceptual similarity. Matches are not based on any specific terms in the query or document, but the concepts found within each.

HOW DOES IT WORK? Instead of limiting your search to the exact words or phrases you enter, concept searching analyzes your query holistically, looking at how terms are related and arranged to identify context, and creates matches to documents with related meanings.

With concept searches, you can:
• Find documents conceptually related to the original
• Search based on text from external sources such as pleadings, briefs, or articles
• Create your own ideal document and search for any documents resembling your fabricated “smoking gun”
• Discover potentially relevant documents faster
• Focus on the concepts deemed most important in a case

QUALITY CONTROL TIP: Use concept searches to verify the coding for a given document is consistent with the coding applied to conceptually similar documents.

In short, text analytics can find documents you’re not sure even exist, a far more forgiving workflow than the typical constraints imposed by a standard keyword search.

Chapter 6: Prioritize Your Review with Document Clusters

One Friday morning, a team at Stradley Ronon Stevens & Young received approximately 7,900 documents related to an insurance matter. The partner on the case gave an associate until Monday morning to identify all the key documents in the data set.

Text analytics helps firms like Stradley Ronon get the most important groups of documents to review teams as soon as possible and batch documents by conceptual similarity for faster, more consistent coding.

With a feature called clustering, you can organize and prioritize your review much earlier in a case. Clustering automatically identifies and groups documents with similar concepts. It labels those groups by the most prevalent ideas in each one and visually represents how the groups relate to one another.
Unlike a concept search, the user provides no input as to what they’re looking for, so there’s no need for subject matter experts to identify example documents.

HELPFUL TERMS AND TIPS
Generality: You can adjust “generality” to affect the number of clusters. Set a higher generality for fewer, larger clusters when you need a high-level look at the concepts in your review. Set a lower generality for many specific clusters when you need more details.
Coherence: Each cluster is assigned a “coherence” score to indicate how closely related its documents are to each other. The higher the coherence, the more closely related the documents. Set a high minimum coherence if you need tighter sub-clusters with more closely related documents for shorter, quicker, batched reviews.
Depth: Set “depth” to tell text analytics the maximum level of sub-clusters it should create. A lower setting yields a simpler cluster structure if you want to do less conceptual analysis before review.

The associate who used text analytics to cluster documents in Stradley Ronon’s insurance matter found that while the majority of the clusters seemed to have nothing to do with the case, one cluster labeled with more meaningful terms stood out. This key cluster in the opposing counsel’s production contained approximately 450 documents, a mere six percent of the full data set. The associate was able to complete the review in a couple of hours.
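The e-book describes the “coherence” score only as a measure of how closely related a cluster’s documents are. One way to picture such a score, offered here as an assumption and not as Relativity’s actual formula, is the average pairwise cosine similarity between the word counts of the documents in a cluster. The function names and sample texts in this Python sketch are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[term] * b[term] for term in a)  # a Counter returns 0 for missing terms
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def coherence(cluster_texts):
    """Average pairwise cosine similarity of the documents in one cluster."""
    vectors = [Counter(text.lower().split()) for text in cluster_texts]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 1.0  # a single-document cluster is trivially coherent
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Made-up clusters: one about the same insurance topic, one mixing unrelated topics.
tight_cluster = ["policy claim denied", "claim denied policy renewal"]
loose_cluster = ["policy claim denied", "fantasy football lineup"]

tight_score = coherence(tight_cluster)
loose_score = coherence(loose_cluster)
print(f"tight: {tight_score:.2f}, loose: {loose_score:.2f}")
```

Under this toy definition, a minimum-coherence setting would simply reject clusters whose score falls below a chosen threshold, which matches the intuition that a higher minimum yields tighter groups.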
With documents in clusters, you can:
• Investigate an unknown data set in layers to quickly understand the topics involved
• Prioritize documents that are most likely to be relevant for review
• Rapidly code documents based on issues and batch them accordingly for a more efficient review
• Overlay coding information and metadata on top of clusters to create a document heat map and identify review inconsistencies among similar clusters
• Set aside documents that are clearly not responsive to litigation, such as those labeled with “league,” “fantasy,” “lineup,” and “football”

Stradley Ronon quickly eliminated 94 percent of the documents in their data set as non-responsive using the clustering feature of Relativity Analytics. Read the complete Customer Win for details »

“The fact that the associate, someone who had never used Relativity before, was able to so quickly learn how the system works and then find exactly what he needed was outstanding. Not only will he likely be using Analytics in the future, but increasingly our attorneys are using it in all sorts of cases. They know how powerful a feature it can be.”
Brendan Curran, Litigation Support Manager, Stradley Ronon Stevens & Young

Chapter 7: Categorize Case Data with Sample Documents

With a collection of about 500,000 documents, an anticipated responsiveness rate of just 2 percent, and a production deadline of 2.5 weeks, Innovative Discovery needed multiple tools from a text analytics toolbox.

Like the clustering feature of text analytics, categorization conceptually classifies your documents so you can more quickly find the ones most relevant to your case.
But while clustering is designed to move a review forward without user input, categorization allows subject matter experts to automatically group unreviewed documents into categories they define themselves with the issues coded in a small manually reviewed set. You can also use categorization to determine if documents are most likely to be responsive or non-responsive. Reviewers code example documents to train the system, and categorization uses the concepts found in the rest of the documents to group them according to the designation and issues coded in the sample set. This means you can organize and prioritize your review around the documents you already know are important to your case.

With your documents categorized, you can:
• Prioritize data that should be reviewed first. Allow subject matter experts to quickly get their eyes on documents that are most likely relevant to the case and related to their areas of expertise.
• Find the important documents from an opposing production. Use coded documents from your own data set to identify key issues and hot documents in the opposition’s production, allowing you to zero in on the documents that will be most beneficial to building your case.
• Automate issue coding. Automatically find and code documents similar to those you’ve already tagged with key issues, a great way to make the most of manual work on a large data set in little time.

TIPS TO SUCCESSFUL CATEGORIZATION
• Provide at least 5-10 examples per category for appropriate coverage
• High-quality, representative examples yield strong categories
• Each example document or excerpt should represent only one topic or issue
• Generally speaking, the more text in each example, the better

In Innovative Discovery’s case, Relativity Analytics reduced the number of documents for manual review by 92 percent.
They met their deadline and saved an estimated $600,000. Of the 500,000 documents collected, just 8 percent were manually reviewed.

QUALITY CONTROL TIP: Run a sample of your manually coded documents through categorization to identify missed documents or coding inconsistencies.

“Relativity empowered the legal team to assess their case quickly, despite a large data universe and an aggressive timeline.”
Cathy Fetgatter, Vice President of Managed Review Services, Innovative Discovery

Chapter 8: Tackle Complex Cases with Computer-assisted Review

When the U.S. Department of Justice investigated the Anheuser-Busch InBev merger, McDermott Will & Emery were called upon to review 1.6 million documents that could be relevant to the U.S. DOJ’s requests for information. They had just 2 months.

Computer-assisted review helps you accelerate your review process by amplifying your team’s efforts across any substantial document set. Text analytics (categorization) is one of the three key elements of computer-assisted review, which also includes statistical validation and, most importantly, actual humans.

In computer-assisted review, experts provide coded documents to a system in the form of seed sets, and the system applies their decisions to the rest of the document universe through an iterative workflow managed by the review team. The end result is a less costly and tedious e-discovery experience.

Within 6 weeks, McDermott Will & Emery completed productions for the U.S. DOJ and saved over $2 million in review costs using Relativity Assisted Review.
Read the complete Customer Win for details »

With computer-assisted review, you can:
• Code responsive documents more quickly for subsequent manual review by the most qualified experts, passing non-responsive items to other reviewers
• Choose to manually review only the documents statistically validated as responsive, saving significant time and money by eliminating unnecessary work on irrelevant data
• Create a production in a very short timeframe from a large data set in a non-litigation scenario, such as responding to a second request, where over-inclusiveness may not be a concern

HOW DOES IT WORK?
1. Your team codes example documents with responsiveness designations
2. During the assisted review process, text analytics applies your decisions across the larger data set
3. Your results are validated with transparent, defensible statistics

[Figure: assisted review workflow diagram with steps labeled Review Training Sample, Categorize, Review QC Sample, Report / Verify, and Complete.]

Regardless of the unique needs of your case, text analytics reports on results throughout an assisted review process to help you determine, based on statistics, when your review is complete. Because human experts validate the decisions made by the system using statistics, reviewers retain the control, flexibility, and transparency needed for an accurate and defensible review.

It’s not just the U.S. DOJ; there’s also federal court approval of assisted review. In Rio Tinto v. Vale, Judge Andrew Peck issued an opinion stating that whenever a producing party wants to use assisted review, the courts will permit it.

LEARN MORE: Want to learn more about how to use assisted review to efficiently produce defensible results? Check out the white paper, “Understanding the Components of Assisted Review and the Workflow That Ties Them Together.”

“The DOJ recognized that computer-assisted review could mean smaller productions with better quality information.”
Martha Louks, Discovery Consultant, McDermott Will & Emery

Conclusion: What Will You Accomplish with Text Analytics?

While electronically stored information continues to grow and cases become more complex, technology is keeping up with the challenge and becoming increasingly flexible. A better understanding of the ways text analytics speeds up review is the first step to making your job a lot easier.

Relativity Analytics helps e-discovery professionals leverage the power of technology to more effectively handle their data in any combination of the eight ways discussed in this e-book. Integrate Analytics into your workflow to conduct document review more efficiently and with more transparency than ever before, and find out for yourself why Analytics usage increased by 50 percent to a total of 222,379 gigabytes indexed in the past year alone.

Relativity Analytics has indexed over half a petabyte of real-world case data. That’s the equivalent of 20 million two-drawer file cabinets filled with paper.

Take the next step. Let us know how we can help you catch up with your e-discovery workload.

Two Ravinia Drive, Suite 850
Atlanta, GA 30346, USA
www.DTIGlobal.com

231 South LaSalle Street, 8th Floor
Chicago, IL 60604, USA
T: +1 312.263.1177 • F: +1 312.263.4351
sales@kcura.com • www.kcura.com