Accelerate Your e-Discovery Efforts

Compliments of
A Relativity e-Book
8 Ways to Speed Up Review with Text Analytics
Let’s Catch Up With Your e-Discovery Workload.
Data volumes aren’t getting any smaller…
The largest cases run in Relativity have grown about 6x in the last 5 years:
• 2009: 32 million documents in the largest case
• 2014: 189 million documents in the largest case
… but you can handle them more efficiently than ever.
Reviewers working on cases in Relativity that use text analytics get through about 4x
as many documents as those working on cases that don’t.
Based on median reviewer count and case size for cases in Relativity with and without analytics since 2009.
Analytics is the engine that speeds up review. This e-book explains how real-world teams use it.
Copyright © 2015 kCura LLC. All rights reserved.
Table of Contents
Introduction: There’s More to Text Analytics Than Assisted Review
Chapter 1: Tie the Conversation Together with Email Threading
Chapter 2: Review Near-duplicate Documents at the Same Time
Chapter 3: Quickly Batch Foreign Language Documents to the Right Reviewers
Chapter 4: Expand Your Awareness of Critical Case Language
Chapter 5: Uncover Conceptually Similar Documents
Chapter 6: Prioritize Your Review with Document Clusters
Chapter 7: Categorize Case Data with Sample Documents
Chapter 8: Tackle Complex Cases with Computer-assisted Review
Conclusion: What Will You Accomplish with Text Analytics?
Introduction: There’s More to Text Analytics Than Assisted Review
There’s been a lot of discussion in legal and business communities
about how computer-assisted review can make litigation less
costly and tedious, but the story about the benefits of the
technology behind it—text analytics—hasn’t been told.
We created this e-book for those who would like a better understanding of what text analytics is
and how it can make their teams more productive. Each chapter focuses on one of eight principal
ways this flexible and multi-faceted technology helps document review teams accomplish more
with fewer resources and in less time.
The final chapter of this e-book explains how to get the most out of text analytics in
computer-assisted review, but first, we’ll describe how to use text analytics to amplify your e-discovery efforts
with features such as:
• Email threading
• Near-duplicate detection
• Foreign language identification
• Similar-document analysis
• Clustering
• ... and more
Throughout, you’ll find real-world examples of how text analytics increases review speed while
lowering costs.
Where massive review teams once needed to thumb through page after page of irrelevant
information to dig into the real substance of a case, now smaller, more agile teams are using
technology to quickly find the documents that make or break a case. Regardless of the number of
documents in your review, you can handle them more efficiently and with more transparency than
ever before.
In other words, text analytics helps you make sense of your data quickly. Read on to learn more.
Chapter 1: Tie the Conversation Together with Email Threading
Troutman Sanders faced a review of 742,000 documents.
Of those documents, 441,362 were emails.
Email threading is a text analytics feature that works behind the scenes to detect all emails in a
single conversation and organize them for faster review. With email messages grouped together,
conversations are organized in a way that’s easy to understand and batch out to reviewers.
Additionally, identifying the inclusive emails—which include content from all emails in a chain—
drastically decreases the number of records to review, preventing repetitive work while ensuring
all the content is covered.
When assisting their client responding to a government subpoena, Troutman Sanders used
email threading to eliminate 148,329 emails with 84,225 attachments. The team was able to focus
their efforts on only 66 percent of their emails: those that contained entire conversations or
unique attachments.
Even with a conservative cost estimate of
one dollar per document reviewed,
email threading helped the team save
their client approximately $233,000.
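The figures above can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this chapter; treating every eliminated email and attachment as one avoided document is an assumption, although it is one reading that reproduces the quoted savings. The variable names are illustrative.

```python
# Figures quoted in this chapter for the Troutman Sanders review.
total_emails = 441_362
emails_eliminated = 148_329
attachments_eliminated = 84_225
cost_per_document = 1.00  # the chapter's conservative estimate, in dollars

# Assumption: each eliminated email and attachment counts as one avoided document.
documents_avoided = emails_eliminated + attachments_eliminated
savings = documents_avoided * cost_per_document
share_eliminated = emails_eliminated / total_emails

print(f"Documents avoided: {documents_avoided:,}")
print(f"Estimated savings: ${savings:,.0f}")
print(f"Emails eliminated: {share_eliminated:.0%}")
print(f"Emails left to review: {1 - share_eliminated:.0%}")
```

The savings line comes to $232,554, which rounds to the approximately $233,000 quoted above, and the remaining share of emails rounds to 66 percent.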
With email threading, you can:
• Review only what you need to. Identify the inclusive emails in a thread, such as the last email
in a lengthy conversation, and avoid the repetitive replies and forwards.
• Understand custodian communication. Visually group together emails in a way that’s easy
to understand.
• Keep email organized for reviewers. Batch out inclusive documents to your reviewers and
keep email threads intact.
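To make the idea of an inclusive email concrete, here is a minimal sketch. It is not how Relativity Analytics implements threading. It assumes each email has already been broken into the individual messages it contains (the hypothetical `segments` field), ignores attachments, and treats an email as inclusive when no other email in the thread contains everything it does plus more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    email_id: str
    # Hypothetical field: IDs of the individual messages whose text appears
    # in this email, including quoted replies and forwards.
    segments: frozenset


def inclusive_emails(thread):
    """Return the emails whose content is not fully contained in another email."""
    result = []
    for email in thread:
        # "<" on frozensets is a proper-subset test: the other email holds
        # everything this one does, plus something more.
        covered = any(email.segments < other.segments for other in thread)
        if not covered:
            result.append(email)
    return result


thread = [
    Email("A", frozenset({"m1"})),              # original message
    Email("B", frozenset({"m1", "m2"})),        # reply quoting A
    Email("C", frozenset({"m1", "m2", "m3"})),  # reply quoting B
    Email("D", frozenset({"m1", "m4"})),        # side branch replying to A
]

print([email.email_id for email in inclusive_emails(thread)])
```

In this toy thread, only C and D need review: C carries the full main conversation, and D carries a side branch that C does not contain.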
Troutman Sanders used Relativity Analytics to reduce their reviewable
dataset by 34 percent.
“We were very pleased with the email threading results in
Relativity. Aside from helping us save time and money, the
whole process took only a couple of hours to set up.”
Chris Haley
Director of Litigation Technology, Troutman Sanders eMerge
Chapter 2: Review Near-duplicate Documents at the Same Time
How much time do you spend rereading
nearly identical information?
Text analytics can identify documents that are nearly identical, such as multiple versions or drafts
of the same document, so you can focus your efforts on those documents at once. Near-duplicate
identification saves you time when tagging highly similar documents with the issues in your case.
When new documents are added during a review, near-duplicate detection can also incorporate
those new documents into existing groups of duplicates where applicable and set apart any new
information so you don’t have to reread content.
After text analytics finds documents with substantial amounts of text that are exactly the same,
taking into account word order and location, it reports high-level information such as:
• The number of near-duplicate groups in your data set
• The average number of documents per group
• The average percent of similarity between the documents in each group
QUALITY CONTROL TIP
Use near-duplicate detection to verify the consistency of coding decisions made in your review
by looking for any coding differences across groups of nearly identical documents.
As you review documents in each group of near-duplicates, you can quickly navigate to other
documents in the group, view the percent of similarity between documents, and even compare
documents to view the exact differences within the text.
Armed with the information text analytics provides through near-duplicate
detection, you can apply issue-coding decisions to very similar documents
quickly and accurately.
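For readers who want a feel for how near-duplicate grouping can work, here is a minimal sketch. It is a stand-in technique, not Relativity's algorithm: it compares overlapping three-word sequences (so word order matters) and scores each pair with Jaccard similarity. The sample documents, the 0.5 threshold, and the function names are all assumptions made for illustration.

```python
def shingles(text, size=3):
    """Split text into overlapping word sequences so word order is preserved."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


def near_duplicate_groups(docs, threshold=0.5):
    """Greedy grouping: a document joins the first group whose principal
    (first) document it resembles; groups of one are dropped."""
    groups = []
    for doc_id, text in docs.items():
        for group in groups:
            if similarity(docs[group[0]], text) >= threshold:
                group.append(doc_id)
                break
        else:
            groups.append([doc_id])
    return [group for group in groups if len(group) > 1]


docs = {
    "d1": "The supplier shall deliver the goods within thirty days of the order date",
    "d2": "The supplier shall deliver the goods within sixty days of the order date",
    "d3": "Lunch is on Friday at noon in the main conference room",
}

groups = near_duplicate_groups(docs)
print("Near-duplicate groups:", len(groups))
for group in groups:
    principal = group[0]
    for doc_id in group[1:]:
        score = similarity(docs[principal], docs[doc_id])
        print(f"{doc_id} vs {principal}: {score:.0%} similar")
```

The two contract drafts land in one group, while the unrelated lunch note is left out. A production system would use a more forgiving similarity measure, since a single changed word affects three shingles here.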
Chapter 3: Quickly Batch Foreign Language Documents to the Right Reviewers
Text analytics can detect the languages that exist in a document and tag the document with those
languages. Documents can then be organized by language, so you can quickly batch them out to
translators or reviewers fluent in those languages.
Make the most of language detection in two simple steps:
Step 1: Gain insight into the languages of all documents in your case
Who knew your data set would include documents in Zulu? Information like the percent
of prevalence for all languages present in each document is useful when planning
your review.
Step 2: Batch documents to the correct review teams
Do all your documents in Chinese need to go to a specific review team? Do you have a
project manager ready to send Portuguese content to your translation service? Easily
batch out all documents by language to get them in the hands of the right people,
right away.
Relativity Analytics can detect up to 173 languages.
HOW DOES IT WORK?
Language identification analyzes the
sentence structure and punctuation
in your documents rather than solely
relying on individual words to provide
more accurate reports.
In addition to providing a big-picture view of which languages are present in your data set, text
analytics technology can help you make better decisions about what should be sent
to reviewers and what needs to be translated first.
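The batching step can be sketched in a few lines. The example below does not perform language detection itself. It assumes detection has already produced, for each document, the percent of text in each language, and it routes every document to a batch for its most prevalent language. The document IDs and percentages are made up for illustration.

```python
from collections import defaultdict

# Hypothetical detection output: document ID -> {language: percent of text}.
detected = {
    "DOC-001": {"English": 100},
    "DOC-002": {"Chinese": 80, "English": 20},
    "DOC-003": {"Portuguese": 95, "English": 5},
    "DOC-004": {"Chinese": 100},
}


def batch_by_primary_language(detected):
    """Group document IDs by the language that makes up most of each document."""
    batches = defaultdict(list)
    for doc_id, languages in detected.items():
        primary = max(languages, key=languages.get)
        batches[primary].append(doc_id)
    return dict(batches)


for language, doc_ids in batch_by_primary_language(detected).items():
    print(f"{language}: {doc_ids}")
```

A team might instead route any document with a meaningful share of a second language to both review groups; choosing the primary language is simply the easiest rule to show.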
Chapter 4: Expand Your Awareness of Critical Case Language
Ever find yourself staring at the final critical documents in a case but still missing the final pieces
to the puzzle? Through keyword expansion, text analytics gives you investigative power to widen
searches and pull in more relevant documents sooner in the review process by teasing relevant
terms out of key documents.
Just search on keywords of your choice from any document, and from those terms text analytics
can provide a list of terms that are conceptually related based on the unique language in your
data set.
By finding related terms from other documents, you can
discover unexpected or hidden words, such as project code
names and company or industry jargon, and ensure you aren’t
overlooking anything important to your case.
With keyword expansion, you can:
• Get a sense of the uses of language to express the same or similar concepts
• Quickly uncover important terms that would have taken much longer to find by reading
through documents
• Ensure the list of key terms in your review is complete
• Find code words or hidden language
• Understand how newly discovered terms rank in relationship to your search term
QUALITY CONTROL TIP
Do you continue to add unexpected terms, and time, during your review as you learn more from
your documents? Use keyword expansion to find all your keywords up front and uncover code
names and other terms you wouldn’t have known were important and maybe never would
have found.
Ultimately, finding unknown terms saves time by refocusing your searches and
returning your most relevant documents faster.
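As a rough illustration of how related terms can surface from your own data, the sketch below ranks words by how many documents they share with a search term. This simple co-occurrence count is a stand-in, not the conceptual analysis Relativity Analytics performs, and the sample documents and function name are hypothetical.

```python
from collections import Counter


def related_terms(documents, seed, top_n=5, stopwords=frozenset()):
    """Rank terms by the number of documents in which they appear with the seed term."""
    seed = seed.lower()
    co_occurrence = Counter()
    for text in documents:
        terms = set(text.lower().split()) - stopwords
        if seed in terms:
            co_occurrence.update(terms - {seed})
    # Highest count first; ties are broken alphabetically for a stable order.
    ranked = sorted(co_occurrence.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


documents = [
    "project bluebird merger timeline",
    "bluebird merger approval pending",
    "merger announcement draft",
    "lunch menu friday",
    "bluebird budget review",
]

for term, count in related_terms(documents, "merger", top_n=3):
    print(f"{term}: appears with the search term in {count} document(s)")
```

Here the code name "bluebird" rises to the top for the search term "merger" even though a reviewer might never have thought to search for it.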
Chapter 5: Uncover Conceptually Similar Documents
Words that hold very similar meanings, such as “cold” and “frigid,” or words with multiple meanings,
such as “leaves,” can skew results of traditional keyword searches and slow down the review
process. Concept searching is another text analytics feature that helps overcome obstacles in
standard searching techniques.
Concept searching goes beyond keywords to find documents based on ideas
rather than specific terms. This allows you to identify important documents and follow an
investigatory pattern, locating relevant documents even without knowledge of the specific terms,
phrases, jargon, or code words that may be used in other documents.
While reviewing a key document, you can quickly locate additional documents that are conceptually
similar—either based on a select excerpt of text or the document overall. Text analytics compares
your chosen text or document with your entire data set, returning documents ranked by conceptual
similarity. Matches are not based on any specific terms in the query or document,
but the concepts found within each.
HOW DOES IT WORK?
Instead of limiting your search to the exact words or phrases you enter, concept searching
analyzes your query holistically, looking at how terms are related and arranged to identify
context, creating matches to documents with related meanings.
With concept searches, you can:
• Find documents conceptually related to the original
• Search based on text from external sources such as pleadings, briefs, or articles
• Create your own ideal document and search for any documents resembling your fabricated
“smoking gun”
• Discover potentially relevant documents faster
• Focus on the concepts deemed most important in a case
QUALITY CONTROL TIP
Use concept searches to verify
the coding for a given document is
consistent with the coding applied to
conceptually similar documents.
In short, text analytics can find documents you’re not sure even exist—a far more
forgiving workflow than the typical constraints imposed by a standard keyword search.
Chapter 6: Prioritize Your Review with Document Clusters
One Friday morning, a team at Stradley Ronon Stevens & Young
received approximately 7,900 documents related to an insurance
matter. The partner on the case gave an associate until Monday
morning to identify all the key documents in the data set.
Text analytics helps firms like Stradley Ronon get the most important groups of documents to
review teams as soon as possible and batch documents by conceptual similarity for faster, more
consistent coding. With a feature called clustering, you can organize and prioritize your review
much earlier in a case.
Clustering automatically identifies and groups documents with similar concepts. It labels those
groups by the most prevalent ideas in each one and visually represents how the groups relate to
one another. Unlike a concept search, the user provides no input as to what they’re looking for—
there’s no need for subject matter experts to identify example documents.
HELPFUL TERMS AND TIPS
Generality: You can adjust “generality”
to affect the number of clusters. Set a
higher generality for fewer larger clusters
when you need a high-level look at the
concepts in your review. Set a lower
generality for many specific clusters when
you need more details.
Coherence: Each cluster is assigned
a “coherence” score to indicate how
closely related its documents are to each
other. The higher the coherence, the
more closely related the documents. Set
a high minimum coherence if you need
tighter sub-clusters with more closely
related documents for shorter, quicker,
batched reviews.
Depth: Set “depth” to tell text analytics
the maximum level of sub-clusters it
should create. A lower setting yields a
simpler cluster structure if you want to do
less conceptual analysis before review.
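To build intuition for what a coherence score expresses, here is a minimal sketch. It is not how Relativity Analytics calculates coherence. It assumes coherence can be approximated as the average pairwise word overlap among the documents in a cluster, and the cluster labels and texts are invented for illustration.

```python
from itertools import combinations


def word_overlap(a, b):
    """Jaccard overlap of the distinct words in two texts, from 0.0 to 1.0."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def coherence(cluster):
    """Average pairwise overlap of a cluster's documents (1.0 for a single document)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 1.0
    return sum(word_overlap(a, b) for a, b in pairs) / len(pairs)


clusters = {
    "fantasy, football": [
        "fantasy football lineup",
        "fantasy football league lineup",
    ],
    "mixed": [
        "policy claim denied",
        "quarterly budget review",
    ],
}

for label, documents in clusters.items():
    print(f"{label}: coherence {coherence(documents):.2f}")
```

The tightly related cluster scores high and the mixed one scores zero, which mirrors the tip above: the higher the coherence, the more closely related the documents.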
The associate who used text analytics to cluster documents in Stradley Ronon’s insurance matter
found that while the majority of the clusters seemed to have nothing to do with the case, one
cluster labeled with more meaningful terms stood out. This key cluster in the opposing counsel’s
production contained approximately 450 documents—a mere six percent of the full data set. The
associate was able to complete the review in a couple of hours.
With documents in clusters, you can:
• Investigate an unknown data set in layers to quickly understand the topics involved
• Prioritize documents that are most likely to be relevant for review
• Rapidly code documents based on issues and batch them accordingly for a more
efficient review
• Overlay coding information and metadata on top of clusters to create a document heat map
and identify review inconsistencies among similar clusters
• Set aside documents that are clearly not responsive to litigation, such as those labeled with
“league,” “fantasy,” “lineup,” and “football”
Stradley Ronon quickly eliminated 94 percent of the documents in their data set
as non-responsive using the clustering feature of Relativity Analytics.
“The fact that the associate—someone who had never used Relativity before—was able
to so quickly learn how the system works and then find exactly what he needed was
outstanding. Not only will he likely be using Analytics in the future, but increasingly our
attorneys are using it in all sorts of cases. They know how powerful a feature it can be.”
Brendan Curran
Litigation Support Manager, Stradley Ronon Stevens & Young
Chapter 7: Categorize Case Data with Sample Documents
With a collection of about 500,000 documents, an anticipated
responsiveness rate of just 2 percent, and a production deadline
of 2.5 weeks, Innovative Discovery needed multiple tools from a
text analytics toolbox.
Like the clustering feature of text analytics, categorization conceptually classifies your documents
so you can more quickly find the ones most relevant to your case. But while clustering is designed
to move a review forward without user input, categorization allows subject matter experts to
automatically group unreviewed documents into categories they define themselves with the
issues coded in a small manually reviewed set. You can also use categorization to determine if
documents are most likely to be responsive or non-responsive.
Reviewers code example documents to train the system, and categorization uses the concepts
found in the rest of the documents to group them according to the designation and issues coded
in the sample set. This means you can organize and prioritize your review around the documents
you already know are important to your case.
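The train-by-example idea can be sketched as follows. This is a simplified stand-in, not Relativity's categorization engine: it measures plain word overlap (cosine similarity over word counts) rather than concepts, and assigns each unreviewed document to the category of its best-matching example. The categories, example texts, and the 0.2 minimum score are assumptions.

```python
import math
from collections import Counter


def vectorize(text):
    """Count the words in a text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors, from 0.0 to 1.0."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(document, examples, min_score=0.2):
    """Return (category, score) for the best-matching example,
    or (None, score) when nothing reaches min_score."""
    doc_vector = vectorize(document)
    best_category, best_score = None, 0.0
    for category, texts in examples.items():
        for text in texts:
            score = cosine(doc_vector, vectorize(text))
            if score > best_score:
                best_category, best_score = category, score
    if best_score < min_score:
        return None, best_score
    return best_category, best_score


# Hypothetical coded examples supplied by reviewers.
examples = {
    "Pricing": [
        "discount schedule and price list for distributors",
        "proposed price increase for retail customers",
    ],
    "Safety": [
        "incident report describing the warehouse injury",
        "safety inspection findings and corrective actions",
    ],
}

print(categorize("revised price list sent to distributors", examples))
print(categorize("holiday party invitation", examples))
```

Documents that match no example well are left uncategorized rather than forced into a category, which keeps weak matches from cluttering an issue's review queue.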
With your documents categorized, you can:
• Prioritize data that should be reviewed first. Allow subject matter experts to quickly get their
eyes on documents that are most likely relevant to the case and related to their areas of
expertise.
• Find the important documents from an opposing production. Use coded documents from
your own data set to identify key issues and hot documents in the opposition’s production,
allowing you to zero in on the documents that will be most beneficial to building your case.
• Automate issue coding. Automatically find and code documents similar to those you’ve
already tagged with key issues, a great way to make the most of manual work on a large data
set in little time.
TIPS FOR SUCCESSFUL CATEGORIZATION
• Provide at least 5-10 examples per category for appropriate coverage
• High-quality, representative examples yield strong categories
• Each example document or excerpt should represent only one topic or issue
• Generally speaking, the more text in each example, the better
In Innovative Discovery’s case, Relativity Analytics reduced the number of
documents for manual review by 92 percent. They met their deadline and
saved an estimated $600,000.
• $600,000 in savings
• 500,000 documents collected
• Just 8% manually reviewed
QUALITY CONTROL TIP
Run a sample of your manually coded
documents through categorization to
identify missed documents or coding
inconsistencies.
“Relativity empowered the legal team to
assess their case quickly, despite a large data
universe and an aggressive timeline.”
Cathy Fetgatter
Vice President of Managed Review Services, Innovative Discovery
Chapter 8: Tackle Complex Cases with Computer-assisted Review
When the U.S. Department of Justice investigated the Anheuser-Busch InBev merger, McDermott Will & Emery were called upon
to review 1.6 million documents that could be relevant to the
U.S. DOJ’s requests for information. They had just 2 months.
Computer-assisted review helps you accelerate your review process by amplifying your team’s
efforts across any substantial document set. Text analytics (categorization) is one of the three
key elements of computer-assisted review, which also includes statistical validation and, most
importantly, actual humans.
In computer-assisted review, experts provide coded documents to a system in the form of seed
sets, and the system applies their decisions to the rest of the document universe through an
iterative workflow managed by the review team. The end result is a less costly and tedious
e-discovery experience.
Within 6 weeks, McDermott Will & Emery completed productions for the U.S.
DOJ and saved over $2 million in review costs using Relativity Assisted Review.
With computer-assisted review, you can:
• Code responsive documents more quickly for subsequent manual review by the most
qualified experts, passing non-responsive items to other reviewers
• Choose to manually review only the documents statistically validated as responsive, saving
significant time and money by eliminating unnecessary work on irrelevant data
• Create a production in a very short timeframe from a large data set in a non-litigation
scenario, such as responding to a second request, where over-inclusiveness may not be
a concern
HOW DOES IT WORK?
1. Your team codes example documents with responsiveness designations
2. During the assisted review process, text analytics applies your decisions across the larger
data set
3. Your results are validated with transparent, defensible statistics
[Figure: assisted review workflow diagram. Stage labels: Training, Sample, Review, Categorize, QC, Report / Verify, Complete.]
Regardless of the unique needs of your case, text analytics reports on
results throughout an assisted review process to help you determine,
based on statistics, when your review is complete. Because human experts
validate the decisions made by the system using statistics, reviewers
retain the control, flexibility, and transparency needed for an accurate and
defensible review.
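One piece of the statistics behind validation is deciding how many documents to sample. The sketch below applies the standard sample-size formula for estimating a proportion, with a finite population correction. The 95 percent confidence level and 2.5 percent margin of error are assumptions chosen for illustration, not settings taken from this e-book or from any particular case.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.025, proportion=0.5):
    """Documents to sample to estimate a rate (such as responsiveness)
    within the given margin of error at the given confidence level."""
    # Two-sided z-score for the confidence level (about 1.96 for 95 percent).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively unlimited population; a proportion of
    # 0.5 is the most conservative choice.
    unlimited = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    # The finite population correction shrinks the sample for smaller sets.
    corrected = unlimited / (1 + (unlimited - 1) / population)
    return math.ceil(corrected)


for population in (7_900, 500_000, 1_600_000):
    needed = sample_size(population)
    print(f"{population:,} documents: sample about {needed:,}")
```

The required sample grows very slowly with the size of the data set, which is why a statistically valid check on 1.6 million documents takes only a little more effort than one on 500,000.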
It’s not just the U.S. DOJ—there’s also
federal court approval of assisted review.
In Rio Tinto v. Vale, Judge Andrew Peck
issued an opinion stating that whenever
a producing party wants to use assisted
review, the courts will permit it.
Want to learn more about how to use assisted review to efficiently produce
defensible results? Check out the white paper, “Understanding the
Components of Assisted Review and the Workflow That Ties Them Together.”
“The DOJ recognized that computer-assisted review could
mean smaller productions with better quality information.”
Martha Louks
Discovery Consultant, McDermott Will & Emery
Conclusion: What Will You Accomplish with Text Analytics?
While electronically stored information continues to grow and cases become more complex,
technology is keeping up with the challenge and becoming increasingly flexible. A better
understanding of the ways text analytics speeds up review is the first step to making your job
a lot easier.
Relativity Analytics helps e-discovery professionals leverage the power of technology to more
effectively handle their data in any combination of the eight ways discussed in this e-book.
Integrate Analytics into your workflow to conduct document review more efficiently and with more
transparency than ever before, and find out for yourself why Analytics usage increased by
50 percent to a total of 222,379 gigabytes indexed in the past year alone.
Take the next step. Let us know how we can
help you catch up with your e-discovery workload.
Two Ravinia Drive, Suite 850
Atlanta, GA 30346, USA
www.DTIGlobal.com
Relativity Analytics has indexed over
half a petabyte of real-world case data.
That’s the equivalent of 20 MILLION
two-drawer file cabinets filled with paper.
231 South LaSalle Street, 8th Floor
Chicago, IL 60604, USA
T: +1 312.263.1177 • F: +1 312.263.4351
sales@kcura.com • www.kcura.com