NEAR-DUPLICATE DETECTION

Our near-duplicate detection service is a pre-review procedure that identifies duplicate and near-duplicate documents. The results are then output in a format suitable for many standard litigation support programs.

Near-duplicates are similar documents that contain formatting and/or textual differences. They are distinct from duplicates, which are exact copies of a file. Near-duplicates include files with a percentage of textual differences (by far the most common case), variances in formatting (such as bold or italicized fonts) or different file types (such as an MS Word file converted to PDF). This technology can be used with electronic documents (native files), scanned/OCR’d collections or a combination of both.
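The distinction between duplicates and near-duplicates can be illustrated with a minimal Python sketch. It assumes that exact duplicates can be found by hashing file contents and that near-duplicates are scored by a word-level similarity ratio from the standard library. The function names and the similarity measure are illustrative assumptions, not a description of the technology behind the service.

```python
import hashlib
from difflib import SequenceMatcher


def is_exact_duplicate(a: bytes, b: bytes) -> bool:
    """Exact duplicates are byte-identical, so their hashes match."""
    return hashlib.sha256(a).hexdigest() == hashlib.sha256(b).hexdigest()


def text_similarity(a: str, b: str) -> float:
    """Word-level similarity ratio between 0 and 1 (illustrative measure only)."""
    return SequenceMatcher(None, a.split(), b.split(), autojunk=False).ratio()


original = "The parties agree to the terms set out in Schedule A."
revised = "The parties agree to the terms set out in Schedule B."

# The same bytes hash identically; a one-word change breaks the hash match
# but still leaves the two texts highly similar.
print(is_exact_duplicate(original.encode(), original.encode()))
print(is_exact_duplicate(original.encode(), revised.encode()))
print(round(text_similarity(original, revised), 2))
```

A hash comparison cannot see that the two clauses differ by a single word, which is why near-duplicate detection relies on a similarity measure instead.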

The cost of reviewing documents relevant to a litigation matter can be substantial. As technology has given us the ability to create, print and/or store more and more versions of the same document, we are seeing those review costs increase. Identifying and grouping these documents prior to a full document review reduces the review time, frees up expensive legal resources and ultimately saves the litigants money.

Statistics have shown that a typical document collection contains between 25% and 50% near-duplicates. Reviewing similar documents over and over again can be frustrating. Furthermore, having different reviewers look at similar documents is a risk, as they may treat similar documents differently. Identifying near-duplicate documents early, and reviewing and evaluating them in the same manner, helps ensure an accurate and consistent database.

Our near-duplicate detection service is relatively non-intrusive in the litigation process and data workflow. It creates data for each document, and this data is then recorded in the database. The data indicates whether the document is unique within the collection; if it is not unique, the data shows the duplicate or near-duplicate set (also known as an EquiSet) to which the document belongs. A Pivot document is selected within each EquiSet and becomes the reference against which the rest of the documents in the set are compared, based on a percentage of similarity.
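The per-document data described above can be pictured with a short Python sketch. It assumes a simple greedy grouping in which the first document of each set serves as the pivot, a hypothetical 80% similarity threshold, and made-up field names such as set_id and is_pivot. The service's actual grouping method and record layout are not documented here, so this is only one way such output could look.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def similarity(a: str, b: str) -> float:
    """Word-level similarity ratio between 0 and 1 (illustrative measure only)."""
    return SequenceMatcher(None, a.split(), b.split(), autojunk=False).ratio()


def group_into_sets(docs: Dict[str, str], threshold: float = 0.8) -> List[dict]:
    """Greedy grouping: each document joins the first set whose pivot it resembles."""
    sets: List[dict] = []
    for doc_id, text in docs.items():
        for s in sets:
            score = similarity(docs[s["pivot"]], text)
            if score >= threshold:
                s["members"][doc_id] = round(score * 100)
                break
        else:
            # No sufficiently similar pivot found: this document starts a new set.
            sets.append({"pivot": doc_id, "members": {doc_id: 100}})
    return sets


def to_records(sets: List[dict]) -> List[dict]:
    """One record per document: unique flag, set membership, pivot flag, similarity."""
    records = []
    for number, s in enumerate(sets, start=1):
        unique = len(s["members"]) == 1
        for doc_id, pct in s["members"].items():
            records.append({
                "doc_id": doc_id,
                "unique": unique,
                "set_id": None if unique else f"SET-{number:04d}",
                "is_pivot": doc_id == s["pivot"],
                "similarity_pct": pct,
            })
    return records


docs = {
    "DOC-001": "Please review the attached draft agreement before Friday",
    "DOC-002": "Please review the attached draft agreement before Monday",
    "DOC-003": "Lunch menu for the week of the conference",
}

for record in to_records(group_into_sets(docs)):
    print(record)
```

In this example the first two documents fall into one set with DOC-001 as the pivot, while DOC-003 is flagged as unique.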

The results of near-duplicate detection are straightforward to interpret. When the technology groups two documents together in an EquiSet and reports that they are 95% similar, the reviewer knows that no more than 5% of the text differs between these documents. Furthermore, comparison software can be used to determine exactly what the differences are.
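As an illustration of the comparison step, the following Python sketch lists the word-level differences between a pivot document and another member of its set. It uses the standard library's difflib module as a stand-in for dedicated comparison software; the function name changed_words and the sample texts are hypothetical.

```python
import difflib
from typing import List, Tuple


def changed_words(pivot: str, other: str) -> List[Tuple[str, str, str]]:
    """Return (operation, pivot_text, other_text) for each non-matching span."""
    a, b = pivot.split(), other.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


pivot = "Payment is due within 30 days of the invoice date"
other = "Payment is due within 45 days of the invoice date"

print(changed_words(pivot, other))  # [('replace', '30', '45')]
```

A reviewer who has already read the pivot only needs to look at the reported change, here the payment term, rather than re-reading the whole document.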

We provide an extract utility to facilitate loading the near-duplicate metadata into your review system, such as CT Summation, FTI Ringtail, iConect or Dataflight’s Concordance. Once the data is loaded into the review system, reviewers are able to work with sets instead of single documents. This “set-centric paradigm” allows for a much more systematic review process. This system has proven to be a very cost-effective solution for native electronic documents as well as OCR’d (Optical Character Recognition) documents.
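To give a rough idea of what loading set metadata involves, the sketch below writes near-duplicate records to a generic comma-separated file and then regroups them by set for set-centric review. The column names and the CSV layout are assumptions made for illustration; each review system defines its own load-file format, and the extract utility's actual output is not reproduced here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; real load files follow the target system's format.
FIELDS = ["doc_id", "set_id", "is_pivot", "similarity_pct"]


def write_load_file(records, out) -> None:
    """Write one row per document to a generic CSV load file."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)


def documents_by_set(records) -> dict:
    """Group document IDs by set so reviewers can work set by set."""
    grouped = defaultdict(list)
    for record in records:
        if record["set_id"]:  # unique documents have no set
            grouped[record["set_id"]].append(record["doc_id"])
    return dict(grouped)


records = [
    {"doc_id": "DOC-001", "set_id": "SET-0001", "is_pivot": True, "similarity_pct": 100},
    {"doc_id": "DOC-002", "set_id": "SET-0001", "is_pivot": False, "similarity_pct": 95},
    {"doc_id": "DOC-003", "set_id": "", "is_pivot": False, "similarity_pct": ""},
]

buffer = io.StringIO()
write_load_file(records, buffer)
print(buffer.getvalue())
print(documents_by_set(records))
```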

The ROI on near-duplicate detection is very concrete. Industry studies have shown that review costs at least $3 to $4 per document. The near-duplicate detection process costs pennies per document, and its results have been proven to be accurate. By eliminating the need to re-read similar documents in their entirety, review costs are reduced proportionately.
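The arithmetic behind this claim can be sketched in a few lines of Python. The figures below are assumptions chosen for illustration: a collection of 100,000 documents, the low end of the $3 review cost quoted above, a hypothetical detection cost of $0.05 per document, a 25% near-duplicate share and an assumed 80% reduction in review effort on those near-duplicates. This is not the Commonwealth Legal ROI Calculator, only a back-of-the-envelope estimate.

```python
def estimate_savings(
    documents: int,
    review_cost_per_doc: float,
    detection_cost_per_doc: float,
    near_duplicate_share: float,
    effort_saved_on_near_duplicates: float,
) -> dict:
    """Estimate net review savings from near-duplicate detection (illustrative)."""
    baseline_review = documents * review_cost_per_doc
    detection_cost = documents * detection_cost_per_doc
    review_savings = (
        documents
        * near_duplicate_share
        * effort_saved_on_near_duplicates
        * review_cost_per_doc
    )
    return {
        "baseline_review": baseline_review,
        "detection_cost": detection_cost,
        "review_savings": review_savings,
        "net_savings": review_savings - detection_cost,
    }


# All inputs are hypothetical; substitute figures from your own matter.
result = estimate_savings(
    documents=100_000,
    review_cost_per_doc=3.00,
    detection_cost_per_doc=0.05,
    near_duplicate_share=0.25,
    effort_saved_on_near_duplicates=0.80,
)

for name, value in result.items():
    print(f"{name}: ${value:,.0f}")
```

Under these assumptions, a $5,000 detection cost offsets roughly $60,000 of review effort, for a net saving of about $55,000 against a $300,000 baseline review.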

The near-duplicate detection ROI Calculator can demonstrate the cost savings of using this service. Contact a Commonwealth Legal representative in your area to access the calculator and for further information regarding our Near-Duplicate Detection service.


