A Revolution in e-Discovery
The Persuasive Economics of the Document Analytic Approach
KPMG FORENSIC
ADVISORY
© 2005 KPMG LLP, the U.S. member firm of KPMG International, a Swiss cooperative. All rights reserved. 050117
Foreword
“The cost of e-discovery, probably the largest single cost in litigation today, poses an
economic threat to any company facing litigation. In cases with large volumes of potentially
relevant documents, document analytics can be an effective and strategic tool for managing
costs and meeting electronic discovery challenges.”
Stephanie Mendelsohn, Partner, Reed Smith
Regulatory Litigation Group, e-Discovery & Records Management Team, and
The Sedona Conference Working Group on Electronic Document Retention and Production
The use of electronic media in business has led to a massive explosion of digital documents,
especially e-mail. E-discovery (the discovery of digital documents) by traditional means is
grossly unsuited to handling the growing volume. Here, “traditional” refers to using either
paper or “e-paper.” Organizations that keep a traditional mindset toward e-discovery are
seeing their costs soar.
However, law firms now have the technological capability to manage e-discovery with
sophisticated “document analytics” in a purely electronic environment, more effectively and
efficiently than paper- or e-paper-based approaches allow. Because of this breakthrough, law
firms have the opportunity to provide more competitive, cost-efficient services to their
clients, and corporate counsel can drastically reduce litigation costs for their
organizations. Unfortunately, the persuasive economics driving this new method remain widely
unknown.
This paper provides an apples-to-apples comparison of traditional and nontraditional
e-discovery approaches to help law firms and corporate counsel evaluate their options.
Using our KPMG Forensic℠ engagement experience with dozens of companies, we break down
each approach into process components and analyze these in terms of cost, time, and
effectiveness. Based on the assumptions we have used for this study (e.g., document review
rates, attorney hourly rates, photocopy rates), we calculate that the document analytic
approach can be nearly 10 times more cost-effective than traditional approaches. We offer
our template as an analytical tool to help the reader arrive at an independent conclusion
based on his or her own cost assumptions.
CONTENTS
Background
e-Discovery Approaches
Hard Copy Approach
“Imaged” Hard Copy Approach
e-Paper Approach
Document Analytic Approach
Comparing the Approaches
Cost and Time Comparisons
Transition to the Document Analytic Approach
Document Analytic Success Story
Conclusion
Background
We’re concerned with e-discovery in the first place because as much as 92 percent of
information produced each year is stored in digital format, according to a 2003 study¹ by
the University of California, Berkeley. Businesses are far more prolific in generating digital
data than paper documents. Any e-discovery approach needs to deal with a large and rising
volume of unstructured digital data, particularly in e-mail format.
Additionally, this study concluded that fewer than 10 billion e-mails were sent per day
worldwide in the year 2000. Researchers expect this number to increase to 61 billion by
2006, with spam accounting for as much as one third.
One reason for the growing volume of stored e-mail is the precipitous decline in the cost
of digital storage, thanks to commoditization and lower-priced, high-capacity drives making
it to market. For example, compare the $193 price per megabyte of storage in 1980 with
today’s price of less than $0.01 per megabyte.
Volume isn’t the only challenge for e-discovery. According to Law Technology News
(January 2004), as much as 70 percent of e-mails and corporate documents are duplicates.
Extraneous and repetitive data escalates costs at every stage of any e-discovery process.
An effective e-discovery approach must be able to separate the wheat from the chaff as
early as possible to avoid incurring exorbitant costs.
Since the essence of discovery is sharing of documents, the issues concerning volume
and redundancy, as well as file size (gigabytes) and ease of transmission, become
particularly acute when companies face multiproduction matters or multijurisdictional
litigation, both of which require coordinating discovery among widely dispersed legal
support teams. Law firms need to compare the cost and staffing requirements of printing,
packing, and trucking documents with those of Web-based file sharing, for example.
All e-discovery approaches have two major types of components: (1) document review
and (2) various mechanical processes needed to prepare documents for review and to
implement “production.” Since document review is performed by an attorney or paralegal
and is highly labor intensive, this aspect of e-discovery is extremely costly. Mechanical
(nonreview) processes are less so, although highly inefficient methods can drive up costs.
Any approach that emphasizes mechanical processes (the more computerized the better)
and keeps billable attorney hours contained will tend to prevail.
1 www.sims.berkeley.edu/research/projects/how-much-info-2003/
e-Discovery Approaches
We present four different approaches to handling one hypothetical set of electronic
documents in a typical discovery matter. In the first approach, we apply a traditional hard
copy methodology (digital to print), which results in tremendous effort and costs.
In the second approach, we force a hard copy methodology (digital to print to digital) onto
the electronic review environment.
In the third approach, we present what we refer to as a “first generation” e-discovery
process. While the software and data standards differ somewhat from the paper discovery
environment (digital to TIFF, PDF, or HTML), the approach is essentially the same.
Finally, we present the “document analytic” approach, which keeps documents native² to
limit conversion costs and applies concept clustering and mapping technology to group
related documents and expedite the review.
The first phase in all e-discovery approaches is preserving the electronic data. We will refer
to the resulting data file as the “corpus.” How quickly each e-discovery approach can
reduce the size of the corpus has everything to do with its ultimate effectiveness.
The following graphics indicate the phases for each approach (preserve, pare in some
cases, process, produce) as well as the sequential steps within each phase. Each step
represents a line item on the e-discovery bill to corporate counsel. The height of each
“arrow” roughly corresponds to the phased reduction of the size of the corpus, which
directly correlates to the dollar amount of the bill.
Hard Copy Approach
In the hard copy approach, the electronic evidence is first preserved and a copy is made
available to the review team. This working copy is then printed to paper. Next, the pages
are serially “Bates” numbered for reference and several working photocopies are made.
Only after the review sets have been Bates numbered and copied can the attorneys perform
their reviews, on a document-by-document or page-by-page basis. Those documents
deemed responsive to the case are then flagged (via a coding sheet, document flag notes,
or color indicator markings), which assigns metadata and intelligence to the document.
Finally, the responsive set is photocopied for production.
[Figure: Hard copy approach. Phases: Preserve, Process, Produce. Steps in sequence: Preserve,
Print, Bates Numbering, Copy, Document Review & Coding, Copy Responsive, Deliver.]
The height of each “arrow” roughly corresponds to the phased reduction of the size of the corpus.
2 For definitions of technical terms, please see the e-Discovery Glossary.
Printing everything to hard copy casts a wide net over the population of documents and
assures that the corpus excludes nothing in the early phase that could be relevant to the
case. However, by maximizing the size of the corpus, this approach increases the risk of
omitting important documents during review due to paper mishandling or simple error.
Attorneys and paralegals often use page decisions or document decisions per hour metrics
to measure the progress of their matters and the review team’s efficiency. These rates
are at their lowest in the hard copy approach relative to other approaches.³
“Imaged” Hard Copy Approach
Similar to the hard copy approach, the “imaged” hard copy approach preserves the electronic
evidence and proceeds to print the data to paper. From the working hard copy, the documents
are unitized to reflect document breaks and scanned to image. An OCR engine is run on the
scanned images to enable full text keyword searches, and the documents are coded manually
to capture specific, predefined metadata (author, date, cc, bcc, etc.). Once imaged, the
documents are loaded to a repository on a discovery management software platform.
Attorneys can then perform the document review, collaborate on the documents, and manage
production sets.
Counterintuitive as this approach may seem (data to paper and then back to data), it is
performed by a considerable number of law firms and e-discovery organizations.
3 Many professionals are under the false impression that they can review pages faster by flipping them in the traditional way than by using a computer and a mouse (see e-paper and document analytic approaches below). In fact, the physical activity of flipping pages does not equate to speed or effectiveness. That’s because the paper review approach does not efficiently and accurately capture useful information about the documents and make that information quickly available to the discovery team.
[Figure: “Imaged” hard copy approach. Phases: Preserve, Process, Produce. Steps in sequence:
Preserve, Print, Prep & Scan, OCR, Code, Load to Repository, Document Review, Deliver.]
The height of each “arrow” roughly corresponds to the phased reduction of the size of the corpus.
e-Discovery Glossary
Bates number: Bates numbering goes back to the 1890s, when Bates Manufacturing Company
in New York invented an automatic handheld numbering machine. “Bates Stamping” is the
process of placing sequential numbers on a page.
corpus: The complete data file(s) subject to e-discovery. The size of the corpus is reduced
through the various phases of e-discovery.
custodian: An individual whose e-mail and data are subject to review. Custodians are also
potential witnesses.
dedupe: Short for deduplicate. This is the process of suppressing exact binary duplicates
for purposes of review.
document analytics: For purposes of an investigation or litigation, KPMG defines document
analytics as the emerging practice of applying algorithms and technology to identify
relationships and relevance of documents within a group.
e-paper: Short for electronic paper as viewed in a static, rastered image, such as a TIFF or
PDF file format.
metadata: Literally, data about data. Metadata describes how, when, and by whom a
particular set of data was collected, and how the data is formatted. In the legal context,
metadata contains document creation and/or modified date, author, file paths, and similar
information.
native file: Refers to a file in the original or default file format of a specific software
application.
OCR: Optical character recognition. The branch of computer science that involves reading
text from paper and translating the images into a form that the computer can manipulate.
pare: Usually the second phase of e-discovery. Eliminate nonresponsive, duplicative, and
irrelevant data by applying such criteria as date, custodian, file type, key words, and
native file review technologies, such as document analytics.
PDF: Portable document format. A file format developed by Adobe Systems that makes it
possible to send formatted documents and have them appear on the recipient’s monitor
and printer as they were intended.
preserve: The first phase of e-discovery. Creating a duplicate of potentially relevant and
responsive data to prevent loss and establish a full “chain of custody.”
process: Usually the third phase of e-discovery. Mechanically or electronically prepare the
corpus for production, often by converting documents to TIFF or PDF file formats.
produce, production: Usually the fourth phase of e-discovery. Includes reviewing documents
for privilege, redacting privileged and/or confidential information, assigning production
numbers, and transferring documents to appropriate media for delivery to opposing
parties. This is often performed by using discovery management software.
raster: To print an image to a static, bitmap file, such as a TIFF.
responsive: Relevant to the matter that is the subject of discovery or e-discovery.
TIFF: Tagged image file format. One of the most widely supported file formats for storing
bitmapped images on personal computers.
unitize: To separate or classify into units, such as “documents.”
Note: Some definitions sourced from merriamwebster.com and webopedia.com.
e-Paper Approach
The e-paper approach is used by most of today’s service providers. Upon preservation,
data is kept in its native electronic format, where duplicates are suppressed and data is
culled by keywords and metadata, such as date ranges, file types, and so on. The resulting
dataset is rastered into static TIFF, PDF, or HTML images, which are then loaded along
with their associated metadata to an online discovery management software platform.
Attorneys can then perform the document review, collaborate on the documents, and
manage production sets.
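The dedupe-and-cull step described above can be illustrated with a minimal sketch. It assumes exact binary duplicates are detected by comparing SHA-256 digests and that culling is a simple keyword-plus-date-range filter; the `Doc` record, the function names, and the sample files are hypothetical and are not taken from any particular discovery platform.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    """Hypothetical stand-in for one preserved file and its metadata."""
    name: str
    content: bytes
    modified: date


def dedupe(docs):
    """Keep the first copy of each exact binary duplicate (same SHA-256 digest)."""
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(doc.content).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def cull(docs, keywords, start, end):
    """Keep documents inside the date range that mention at least one keyword."""
    kept = []
    for doc in docs:
        text = doc.content.decode("utf-8", errors="ignore").lower()
        in_range = start <= doc.modified <= end
        if in_range and any(k.lower() in text for k in keywords):
            kept.append(doc)
    return kept


corpus = [
    Doc("a.txt", b"Q3 revenue recognition memo", date(2004, 3, 1)),
    Doc("a_copy.txt", b"Q3 revenue recognition memo", date(2004, 3, 2)),
    Doc("b.txt", b"Dinner plan for Friday", date(2004, 3, 5)),
    Doc("c.txt", b"Revenue forecast draft", date(2001, 1, 15)),
]
review_set = cull(dedupe(corpus), ["revenue"], date(2004, 1, 1), date(2004, 12, 31))
print([d.name for d in review_set])  # ['a.txt']
```

Running the dedupe before the cull means the keyword filter only has to read each distinct file once, which mirrors the point that paring early reduces the work in every later step.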
Document Analytic Approach
Given the challenges and costs of dealing with today’s large discovery matters, as well as
the inherent limitations and risks of keyword searching, companies and service providers
are beginning to establish more sophisticated criteria in their initial culling and rastering
of data. Hence, the emergence of document analytics.
For purposes of an investigation or litigation, KPMG defines document analytics as the
emerging practice of applying algorithms and technology to identify relationships and
relevance of documents within a group.
Using document analytics, “like” documents are clustered based on the co-occurrence
of their respective nouns or noun phrases in a native review environment. For example,
documents related to accounting will likely contain words such as income statement,
balance sheet, cash flow, or reconciliation, whereas documents related to personal
e-mail may include nouns such as football game, dinner plan, or spouse’s name.
Accordingly, the technology is able to distinguish between the two categories of documents
and cluster them separately.
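To make the clustering idea concrete, here is a toy sketch. It is not the algorithm used by any commercial document analytics product: it assumes the noun phrases have already been extracted for each document, measures overlap with Jaccard similarity, and groups documents greedily in a single pass. The document names, the phrase sets, and the 0.2 threshold are all illustrative assumptions.

```python
def jaccard(a, b):
    """Share of terms two term sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(docs, threshold=0.2):
    """Greedy single-pass clustering: a document joins the first cluster whose
    pooled term set is similar enough, otherwise it starts a new cluster."""
    clusters = []  # each cluster: {"terms": set of phrases, "members": [names]}
    for name, terms in docs.items():
        for c in clusters:
            if jaccard(terms, c["terms"]) >= threshold:
                c["members"].append(name)
                c["terms"] |= terms
                break
        else:
            clusters.append({"terms": set(terms), "members": [name]})
    return [c["members"] for c in clusters]


# Hypothetical documents, each reduced to its (pre-extracted) noun phrases.
docs = {
    "memo1": {"income statement", "balance sheet", "cash flow"},
    "memo2": {"balance sheet", "reconciliation", "cash flow"},
    "mail1": {"football game", "dinner plan"},
    "mail2": {"dinner plan", "spouse"},
}
print(cluster(docs))  # [['memo1', 'memo2'], ['mail1', 'mail2']]
```

The accounting memos and the personal e-mails share no phrases, so they land in separate clusters, which is the separation the paragraph above describes.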
[Figure: e-Paper approach. Phases: Preserve, Pare, Process, Produce. Steps in sequence:
Preserve, Dedupe & Cull, Extract & Raster, Load to Repository, Document Review, Deliver.]
[Figure: Document analytic approach. Phases: Preserve, Pare, Process, Produce. Steps in
sequence: Preserve, Dedupe & Cull, Native Review, Extract & Raster, Load to Repository,
Document Preparation, Deliver.]
The height of each “arrow” roughly corresponds to the phased reduction of the size of the corpus.
Document analytics takes e-discovery automation beyond simple deduping and culling.
It also assists in the document review by enabling the attorney or paralegal to digitally
highlight and explore clusters and complete the review function many times more
quickly than earlier approaches (see example of screen at left).
After this native review, the responsive set of documents—much smaller than in the
e-paper approach—is rastered and loaded to a discovery management platform where
further redaction and quality control take place.
Comparing the Approaches
The hard copy approach suffers by (1) capturing documents with obsolete processes and
(2) casting the widest possible net. With this approach, no paring occurs prior to review,
as indicated in the table below.
The same is true of the imaged hard copy approach. However, the main advantage here
is that review can be done with a computer and Boolean and keyword searches rather
than with stacks of paper documents.
The e-paper approach provides the first significant improvement in the e-discovery process
by removing duplicates and culling by keywords and metadata early on, thereby drastically
reducing volume and the cost and time required for subsequent steps. However, the
e-paper approach, like the hard copy approach, is still unable to put language and other data
into any type of context from one document to the next.
Finally, the analytic approach revolutionizes e-discovery by categorizing and relationally
sorting documents according to content, thereby facilitating drastic time-savings in
document review.
COMPARISON OF APPROACHES
PHASE     HARD COPY                             IMAGED HARD COPY            E-PAPER                     DOCUMENT ANALYTIC
Preserve  Preserve                              Preserve                    Preserve                    Preserve
Pare      NA                                    NA                          Dedupe and cull             Dedupe and cull
                                                                                                        Attorney native document review
Process   Print                                 Print                       Extract and raster          Extract and raster
          Bates number                          Prep and scan               Repository load             Repository load
          Copy                                  OCR                                                     Document prep
                                                Code
                                                Repository load
Produce   Attorney document review and coding   Attorney document review    Attorney document review    TIFF review
          Photocopy responsive documents        Deliver                     Deliver                     Deliver
          Deliver
A Hypothetical Case
Applying the four approaches to a hypothetical case provides an apples-to-apples cost
comparison. We use a scenario of a small to midsized e-discovery case that requires
a production to a government agency or an adversary in a litigation context. The scope
of the case is as follows:
• 15 custodians = 15 hard drives
• Assume an average of 2 gigabytes (GB) of data per custodian (1 GB = 1.024 billion bytes)
• Total estimated data size = 30 GB
• Assume 50,000 pages per GB (if printed out or imaged)
• Total page estimate = 1.5 million pages
• Assuming 2,000 pages per box if printed, total boxes = 750 boxes
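The scope figures listed above follow from simple multiplication. The short sketch below reproduces them so that a reader can substitute other assumptions; the variable names are illustrative only.

```python
custodians = 15          # one hard drive per custodian
gb_per_custodian = 2     # assumed average data volume per custodian
pages_per_gb = 50_000    # assumed pages per GB if printed or imaged
pages_per_box = 2_000    # assumed pages per box if printed

total_gb = custodians * gb_per_custodian
total_pages = total_gb * pages_per_gb
total_boxes = total_pages // pages_per_box

print(total_gb, total_pages, total_boxes)  # 30 1500000 750
```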
Cost Calculations Using KPMG Forensic Pricing Assumptions
The following tables provide cost calculations for each approach using “standard” rates.
The reader may wish to substitute different rates. Again, we view each approach in
terms of activities that are performed on the corpus: preserve, pare, process, and
produce. Note that the first two (traditional) approaches skip the “paring” phase.
Cost and Time Comparisons
HARD COPY APPROACH
PHASE     ACTIVITY                            QUANTITY            RATE        COST ($)
Preserve  Preserve                            15 hard drives      $800 ea     12,000
Pare      NA                                  NA                  NA          NA
Process   Print                               1.5 million pages   $0.06 ea    90,000
            (30 GBs of data x 50,000 pages/GB = 1.5 million pages)
          Bates number                        1.5 million pages   $0.02 ea    30,000
          Copy                                3 million pages     $0.06 ea    180,000
            (create two sets: client copy and law firm working copy)
Produce   Document review and issue coding    15,000 hours        $200/hr     3,000,000*
            (1.5 million pages @ 100 page decisions per hour)
          Photocopy responsive documents      150,000 pages       $0.06 ea    9,000
            (assume 10% of paper is responsive x 1.5 million pages)
Total                                                                         $3,321,000
*This cost does not reflect facilities and support for managing 750 boxes of paper.
“IMAGED” HARD COPY APPROACH
PHASE     ACTIVITY                            QUANTITY            RATE        COST ($)
Preserve  Preserve                            15 hard drives      $800 ea     12,000
Pare      NA                                  NA                  NA          NA
Process   Print                               1.5 million pages   $0.06 ea    90,000
            (30 GBs of data x 50,000 pages/GB = 1.5 million pages)
          Prepare and scan                    1.5 million pages   $0.18 ea    270,000
            (prepare and unitize the pages by document, then scan)
          OCR scan                            1.5 million pages   $0.06 ea    90,000
          Objective coding                    375,000 docs        $1.50 ea    562,500
            (1.5 million pages @ 4 pages per document)
          Repository load                     6 months hosting    $6,500/mo   39,000
            (load 1.5 million TIFF images, approx. 88 GBs)
Produce*  Document review                     7,500 hours         $200/hr     1,500,000
            (1.5 million pages @ 200 page decisions per hour)
          Delivery                            150,000 pages       $0.04 ea    6,000
            (assume 10% of pages are responsive x 1.5 million pages)
Total                                                                         $2,569,500
*Given keyword search capabilities (via OCR) and bibliographic search capabilities (via objective coding), we assume the review team can review 200 pages per hour versus hard copy at 100 pages per hour.
E-PAPER APPROACH
PHASE     ACTIVITY                            QUANTITY            RATE        COST ($)
Preserve  Preserve                            15 hard drives      $800 ea     12,000
Pare¹     Dedupe & cull                       1.5 million pages   Typically billed with extract & raster in the
                                                                  Process phase, not as a separate item
Process   Extract & raster                    750,000 pages       $0.15 ea    112,500
            (assume 50% of the 1.5 million pages are suppressed via deduping, keyword,
            and other metadata filtering)
          Repository load                     6 months            $3,500/mo   21,000
            (750,000 TIFF or PDF images, approximately 44 GBs when rastered)
Produce²  Review                              3,750 hours         $200/hr     750,000
            (750,000 pages @ 200 page decisions per hour)
          Deliver                             150,000 pages       $0.04 ea    6,000
            (assume 20% of e-paper is responsive, net of duplicates, x 750,000)
Total                                                                         $901,500
1 Paring here includes keyword searching and deduping, but does not include native file review as in document analytics.
2 Given keyword search capabilities (via OCR) and bibliographic search capabilities (via objective coding), we assume the review team can review 200 pages per hour versus hard copy at 100 pages per hour.
DOCUMENT ANALYTIC APPROACH
PHASE     ACTIVITY                            QUANTITY            RATE        COST ($)
Preserve  Preserve                            15 hard drives      $800 ea     12,000
Pare      Dedupe and keyword searching        30 gigabytes        $1,000/GB   30,720
            (original GB)
          Load data for native review         15 gigabytes        $3,500/GB   53,760
          Native file review                  750 hours           $200/hr     150,000
            (for responsiveness: 15 GBs x 50,000 pages/GB = 750,000 pages @ 1,000 pages/hour)
Process   Extract & raster                    150,000 pages       $0.15 ea    22,500
            (assume 20% of data reviewed is deemed responsive after native review:
            20% of 15 GBs = 3 GBs x 50,000 pages/GB = 150,000 pages)
          Repository load                     6 months            $1,000/mo   6,000
            (150,000 TIFF images)
Produce   TIFF review                         375 hours           $200/hr     75,000
            (for privilege, redaction, and production: 150,000 pages @ 400 page decisions per hour)
          Deliver                             150,000 pages       $0.04 ea    6,000
Total                                                                         $355,980
Note: The two per-GB costs appear to apply the factor of 1.024 from the scope assumptions
(30 x 1.024 x $1,000 = $30,720; 15 x 1.024 x $3,500 = $53,760).
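The four tables can be expressed as one small cost model, which makes it easier to substitute different rates. The sketch below is a minimal illustration, not KPMG's template: the function names are hypothetical, the rates are copied from the tables, and the 1.024 factor on the per-GB charges is an assumption inferred from the $30,720 and $53,760 line items.

```python
HOURLY = 200          # attorney rate, $/hour
PRESERVE = 15 * 800   # 15 hard drives at $800 each
PAGES = 30 * 50_000   # 30 GB x 50,000 pages/GB = 1.5 million pages


def review(pages, pages_per_hour):
    """Attorney review cost: hours needed times the hourly rate."""
    return pages / pages_per_hour * HOURLY


def hard_copy():
    process = PAGES * 0.06 + PAGES * 0.02 + 2 * PAGES * 0.06   # print, Bates, two copy sets
    produce = review(PAGES, 100) + 0.10 * PAGES * 0.06          # review, photocopy 10%
    return PRESERVE + process + produce


def imaged_hard_copy():
    docs = PAGES / 4                                            # 4 pages per document
    process = PAGES * (0.06 + 0.18 + 0.06) + docs * 1.50 + 6 * 6_500
    produce = review(PAGES, 200) + 0.10 * PAGES * 0.04
    return PRESERVE + process + produce


def e_paper():
    kept = 0.50 * PAGES                                         # 50% suppressed in paring
    process = kept * 0.15 + 6 * 3_500
    produce = review(kept, 200) + 0.20 * kept * 0.04
    return PRESERVE + process + produce


def document_analytic():
    # Assumption inferred from the table: per-GB charges are scaled by 1.024.
    pare = 30 * 1.024 * 1_000 + 15 * 1.024 * 3_500 + review(750_000, 1_000)
    responsive = 0.20 * 750_000                                 # 150,000 pages
    process = responsive * 0.15 + 6 * 1_000
    produce = review(responsive, 400) + responsive * 0.04
    return PRESERVE + pare + process + produce


models = (hard_copy, imaged_hard_copy, e_paper, document_analytic)
totals = {f.__name__: round(f()) for f in models}
for name, total in totals.items():
    ratio = total / totals["document_analytic"]
    print(f"{name:<18} ${total:>9,}  {ratio:.2f}x")
```

With the rates above, the totals match the four tables ($3,321,000, $2,569,500, $901,500, and $355,980), and the last column gives each approach's cost as a multiple of the document analytic total.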
Nonreview and Review Cost Comparison
Review costs represent the largest single “nut” of the total costs in all approaches. The
hard copy approach uses rudimentary processes, which necessitate extensive review
hours. The “imaged” hard copy approach halves the review cost but more than triples
nonreview costs.
NONREVIEW VERSUS REVIEW COSTS
APPROACH              NONREVIEW     REVIEW        PERCENTAGE REVIEW
Hard copy             $  321,000    $3,000,000    90%
“Imaged” hard copy     1,069,500     1,500,000    58%
e-Paper                  151,500       750,000    83%
Document analytic        130,980       225,000    63%
The e-paper approach substantially reduces nonreview costs by using a digital
environment to perform deduping and culling tasks. The major breakthrough occurs with
the document analytic approach, which drastically reduces review time as we have
noted above.
Nonreview and Review Time Comparison
The corpus size and review speed contribute directly to the time required for completing
the e-discovery process. Differences in time required are apparent even between the
e-paper approach and document analytics, as shown in the time and expense graphs.
Although the e-paper approach includes a deduping function, it takes longer at every stage
to reduce the corpus.
Review Analysis
Review costs have a significant impact on total e-discovery costs because of the high
hourly rate of attorneys ($200/hour in the sample). It is therefore useful to isolate
the factors that contribute to review costs, namely page quantities (the corpus) and
review speed.
REVIEW TIME = CORPUS ÷ REVIEW SPEED
APPROACH              PAGES IN CORPUS    REVIEW SPEED
Hard copy             1.5 million        100 pages/hour
“Imaged” hard copy    1.5 million        200 pages/hour
e-Paper               750,000            200 pages/hour
Document analytic     750,000            1,000 pages/hour
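As a quick check on the two factors in the table, review hours can be computed by dividing the pages in the corpus by the review speed. The sketch below uses only the table's figures plus the $200/hour rate of the sample; the dictionary and variable names are illustrative. Note that the document analytic row covers the native review only, and the cost tables add a further 375 hours of TIFF review for that approach.

```python
HOURLY_RATE = 200  # $/hour, the attorney rate used in the sample

# approach: (pages in corpus, review speed in pages/hour), from the table
review_inputs = {
    "Hard copy": (1_500_000, 100),
    "Imaged hard copy": (1_500_000, 200),
    "e-Paper": (750_000, 200),
    "Document analytic": (750_000, 1_000),
}

hours = {name: pages / speed for name, (pages, speed) in review_inputs.items()}
for name, h in hours.items():
    print(f"{name:<18} {h:>8,.0f} hours  ${h * HOURLY_RATE:>11,.0f}")
```

Halving the corpus and raising the review speed both shorten review time, and the two effects multiply: the document analytic native review needs one twentieth of the hours of the hard copy review.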
[Figure: Two graphs, one for the e-paper approach and one for the document analytic approach.
Each plots corpus size against time & expense across the Preserve, Pare, Process, and Produce
phases; the difference between the two is labeled “Time & expense savings.”]
Transition to the Document Analytic Approach
A law firm may have little incentive to upgrade its e-discovery approach if it is already
successfully billing for traditional services. Corporate counsel may be unaware of potential
cost savings in the discovery phase of litigation, or may have limited ability to engineer
these savings through its law firm. Needless to say, what “works” in the short term
(potentially “overcharging” clients) may not work in the long term, since both vendors and
buyers will need to stay competitive. Inevitably, market pressures will assert themselves,
especially when the transition to “modern” e-discovery is so painless.
Modern e-discovery is a “buy,” not “build,” proposition. That’s because few organizations
can afford the investment required to build an efficient e-discovery infrastructure. Nor do
they need to. The economics of the document analytics approach to e-discovery means
that a law firm can outsource the entire document management process to a dedicated
e-discovery provider and still come out ahead.
With outsourcing, the law firm gets out of the document management business
(photocopying, coding, imaging, rastering, boxing, shipping, and so on), enabling it to
reduce or reallocate support staff. With the review phase highly streamlined, attorneys can
spend more time on higher-value strategic issues of the case, thereby offering better
service to their clients. Modernized e-discovery offers the potential for both higher profit
margins for law firms and reduced bills for corporate counsel.
Change management issues are minimal when a law firm outsources document management
to a professional. These may include:
• Staffing changes
• Training in document sharing and review technology
• New vendor oversight responsibilities
• New cost accounting
For some organizations a myth persists that introducing new technology results in “loss
of control.” In reality, new technology can offer different and better controls. In the case
of document analytics and Web-based document sharing, law firms and clients have the
ability to assess exposure quickly with “snapshot” reviews, obtain automated audit
trails and status reporting, find documents quickly and in proper context, view the matter
as a whole or in parts, and leverage past work.
Document Analytic Success Story
An e-discovery service provider assisted a client’s corporate and outside counsel during
the course of an SEC investigation of suspected accounting irregularities related to
revenue recognition. The scope of the investigation included 21 “high-interest” employees
and their preserved e-mail, shared network files, and local hard drives.
The service provider performed the services described in the table below, using a
document analytic approach.
SEC INVESTIGATION CASE
PHASE     ACTIVITY                                                      TIME REQUIRED   DERIVED RATE
Preserve  Identified and preserved 168 GBs (3,829,000 pages)            192 hours¹      19,943 pages/hour
          based on custodian & date range
Pare      Suppressed 62.5% of the 3,829,000 pages (duplicates and       96 hours        39,885 pages/hour
          e-mail threads) to arrive at a review set of 1,438,500 pages
          Ran two 12-hour shifts of 30 reviewers for 3 days; reviewed   2,160 hours     666 pages/hour
          1,438,500 pages; identified 2,625 responsive pages
          (about 0.18%)
Process   Convert responsive set to TIFF images. Apply redaction        12 hours²       219 pages/hour
          and Bates numbering.
Produce   Delivered 2,625 pages via compact disc                        48 hours        NA³
Total                                                                   2,508 hours
1 “Hours” in this and the following table refer to “people hours.”
2 Assuming 2 million TIFF images or fewer, 12 hours is the minimum and includes setup, quality control, and data validation.
3 Production is process driven, rather than page driven.
This SEC case provides a concrete example of activity rates and total time required for
each phase of e-discovery, from preservation to production. If we apply the same rates
to the hypothetical case (except where we have already included a rate), we obtain the
following:
HYPOTHETICAL CASE
PHASE     ACTIVITY                                                      APPLIED RATE        ESTIMATED TIME
Preserve  Identified and preserved 1.5 million pages                    19,943 pages/hour   75 hours
Pare      Suppressed 50% of the 1.5 million pages to arrive at a        39,885 pages/hour   38 hours
          review set of 750,000 pages
          Reviewed 750,000 pages, identifying 150,000 responsive        1,000 pages/hour¹   750 hours
          pages (20%)
Process   Convert responsive set to TIFF images. Apply redaction        NA                  12 hours²
          and Bates numbering.
Produce   150,000 pages                                                 NA                  8 hours
Total                                                                                       883 hours³
1 Uses the assigned rate of the hypothetical case. Using the rate of the SEC case (666 pages/hour) would yield 1,126 hours for the review.
2 Assuming 2 million TIFF images or fewer, 12 hours is the minimum and includes setup, quality control, and data validation.
3 Using the review rate of the SEC case (666 pages/hour) the total would be 1,259 hours.
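The derived rates of the SEC case, and the time estimates they produce for the hypothetical case, can be recomputed as follows. This is a minimal sketch using only figures from the two tables; the names `sec_case`, `rate`, and `hypothetical_hours` are illustrative, and rounding to whole hours is assumed because the tables report whole hours.

```python
# phase: (pages handled, people hours), from the SEC investigation table
sec_case = {
    "preserve": (3_829_000, 192),
    "pare": (3_829_000, 96),
    "review": (1_438_500, 2_160),
    "process": (2_625, 12),
}
rate = {phase: round(pages / hours) for phase, (pages, hours) in sec_case.items()}


def hypothetical_hours(review_rate):
    """Total people hours for the 1.5 million page hypothetical case."""
    preserve = round(1_500_000 / rate["preserve"])
    pare = round(1_500_000 / rate["pare"])
    review = round(750_000 / review_rate)
    return preserve + pare + review + 12 + 8  # fixed 12 h process, 8 h produce


print(rate)
print(hypothetical_hours(1_000), hypothetical_hours(rate["review"]))  # 883 1259
```

The second figure swaps the assumed 1,000 pages/hour for the 666 pages/hour actually observed in the SEC case, which is the sensitivity the table footnotes describe.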
KPMG and the KPMG logo are registered trademarks of KPMG International. KPMG Forensic is a service mark of KPMG International.
Conclusion
By looking at four e-discovery approaches applied to one scenario, we see that e-discovery
efficiency depends on the efficiency of the review component, which in turn depends on
the efficiency of the mechanical processes that prepare the discoverable material (the
corpus) for review. Time is money.
Photocopiers, computerized imaging, computerized deduping and culling, Internet-based
document sharing, and document analytics have successively streamlined the process.
Content relationships are now at the fingertips of the document reviewer. There is little
doubt that the review itself will become further automated.
Law firms and corporate counsel need to consider the ease and potential cost savings
of the outsourcing option, the efficacy of document analytics, and long-term competitive
pressures to realize cost savings. KPMG Forensic’s template may serve as a useful tool
for validating one approach over another, depending on expense rates and other
marketplace variables. Our hypothetical sample results show document analytics to be 2.53
times more cost effective than e-paper, 7.22 times more cost effective than “imaged”
hard copy, and 9.33 times more cost effective than traditional hard copy. These findings,
our common-sense analysis of each approach, and empirical evidence from the SEC case
all argue strongly for e-discovery modernization.
For further information about this white paper or KPMG Forensic, please contact:
Chris Paskach
Partner in Charge
Forensic Technology Services
714-934-5442
cpaskach@kpmg.com
Vince Walden
Manager
Forensic Technology Services
714-934-5429
vwalden@kpmg.com
The information contained herein is of a general nature and is not intended to address the circumstances of any particular individual or entity. Although we endeavor to provide accurate and timely information, there can be no guarantee that such information is accurate as of the date it is received or that it will continue to be accurate in the future. No one should act on such information without appropriate professional advice after a thorough examination of the particular situation.
For additional news and information, please access KPMG LLP’s Web site at http://www.us.kpmg.com.