Relativity Processing User Guide
Version 8.0 | May 10, 2013
For the most recent version of this document, visit our documentation website.


Table of Contents

1 Processing 4

2 Installing and configuring processing 4

2.1 License considerations 4

2.2 Importing the Processing application 4

2.3 Upgrade notes 5

2.4 Processing servers 5

2.5 Fields 6

2.6 Processing servers on resource pools 6

2.7 Processing agents 7

2.8 Creating a choice for the processing source location 8

3 Supported file types 8

3.1 Container file types 9

4 Home tab 9

4.1 Home tab consoles 11

5 Settings tab 13

5.1 Editing field mappings 17

5.2 Invalid OCR language combinations 18

5.3 Processing fields 18

5.3.1 Required fields 18

5.3.2 Optional fields 19

5.3.3 Pivot-enabled fields for processing 21

6 Custodians tab 22

6.1 Creating a new custodian 22

6.2 Fields 23

6.3 Viewing or editing custodian details 23

7 Password bank tab 24

7.1 Creating or deleting a Password Bank entry 24

7.2 Fields 25

7.3 Validations, errors, and exceptions 27

7.4 View audits 27

8 Processing sets tab 28

8.1 Creating a processing set 29

8.2 Fields 30

8.2.1 Invalid characters for the File extension(s) field 35


8.3 Deleting a processing set 36

9 Running a processing set 36

9.1 Discovering files 36

9.1.1 Reading processing status 37

9.1.2 Canceling discovery 39

9.2 Viewing the discovery report 40

9.3 Publishing files 41

9.3.1 Canceling publishing 41

9.3.2 Republishing files 42

9.4 Viewing errors 42

10 Errors tab 42

10.1 Reading individual processing errors 44

10.2 Ignoring errors 44

10.3 Un-ignoring errors 45

10.4 Retrying errors 46

10.4.1 Reading retry status for file discovery and publishing errors 46

10.5 Unresolvable errors 47

11 Reports tab 48

11.1 Generating a processing report 48

12 Managing the processing queue 50

13 Processing workers tab 51

13.1 Stopping or starting a worker 52

13.2 Enabling or disabling a processing worker 52

14 Technical considerations for deduplication 53

14.1 Loose files 53

14.2 Emails 53


1 Processing

Relativity's processing feature allows you to ingest raw data directly into your workspace for eventual search and review without the need for an external tool. You can also use processing to create custom processing jobs to handle a wide array of information.

2 Installing and configuring processing

You must have the following in order to use processing:

• A processing license. For steps on obtaining a Processing license, see the Licensing Guide.
• A processing server installed and configured. See the Processing Server Installation Guide for more information.
• A processing server attached to the resource pool in which the workspace resides.
• A Windows Authenticated processing Web API path specified in the configuration table. If you're currently only using forms authentication, you need to create a new Relativity site. See the Mixed Authentication Guide for more information.

2.1 License considerations

You are unable to process data in Relativity if any of the following conditions is true:

• There is no processing license associated with your environment.
• The processing license associated with your environment is invalid.
• The processing license associated with your environment is expired.
• The processing server associated with the resource pool is not included in the processing license.

Contact your System Administrator if any of these occur. See the Admin Guide for more information on Licensing.

2.2 Importing the Processing application

To install processing in your Relativity environment, import the Processing application from the applications library. To do this, you must have script administrator rights.

You must have obtained a Processing license before you can import the Processing application. For steps on obtaining a Processing license, see the Licensing Guide.

To import the Processing application:

1. Navigate to the Relativity Applications tab.
2. Click New Relativity Application.
3. Select Select from applications library.
4. Click on the Choose from applications library field.
5. Select Processing and click OK.
6. Click Import.

2.3 Upgrade notes

When upgrading the Processing application from 7.5 to Relativity 8, we strongly recommend that you first complete any outstanding processing sets in 7.5. If you upgrade while outstanding processing sets exist in 7.5, note the following:

• All documents published in 7.5 will retain the 7.5 document numbering format of nine digits.
• All documents published or republished in Relativity 8 will have the new 10-digit document numbering format. This new format extends to the Attachment Document ID, Parent Document ID, and Group ID fields.
• Documents republished in Relativity 8 could potentially be duplicated with the new document numbering format.
• Reference fields such as the Attachment Document ID, Parent Document ID, and Group ID on documents republished in Relativity 8 may not accurately reference the correct documents.

Note: Starting in Relativity 8, when you upgrade the Processing application, your global settings aren't overwritten, which means that you aren't required to reset them. Likewise, your field mappings aren't overwritten, which means you aren't required to re-map them.

2.4 Processing servers

To enable processing in your workspace, you must add a Processing Server to your Relativity environment through the Servers tab in Admin Mode. To do this:

1. Switch to Admin mode and navigate to the Servers tab.
2. Click New Resource Server.
3. Complete the fields. See Fields on the next page.
4. Click Save.

2.5 Fields

The Resources Server Information page contains the following fields:

• Name - the name you want the processing server to be listed by
• Type - the type of resource server you are adding. Select Processing Server.
• URL - the valid URL of the processing server you want to add. This must be valid in order to use processing in your workspace. The correct format is as follows: net.tcp://<fully-qualified domain name>:6859/InvariantAPI/
• Status - the current status of the server. Select Active.
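The URL format above can be illustrated with a short Python sketch. This is not part of Relativity: the function names and the host name ps01.example.com are assumptions made for the example, and the check looks only at the shape of the URL without contacting any server.

```python
from urllib.parse import urlparse

PROCESSING_PORT = 6859          # port from the URL format in the guide
API_PATH = "/InvariantAPI/"     # path from the URL format in the guide


def processing_server_url(fqdn: str) -> str:
    """Build a URL of the form net.tcp://<fqdn>:6859/InvariantAPI/."""
    return f"net.tcp://{fqdn}:{PROCESSING_PORT}{API_PATH}"


def looks_like_processing_url(url: str) -> bool:
    """Check the format only; this does not contact the server."""
    parts = urlparse(url)
    try:
        port = parts.port
    except ValueError:          # non-numeric or out-of-range port
        return False
    return (
        parts.scheme == "net.tcp"
        and bool(parts.hostname)
        and port == PROCESSING_PORT
        and parts.path == API_PATH
    )


url = processing_server_url("ps01.example.com")  # hypothetical host name
print(url, looks_like_processing_url(url))
```

A check like this can catch a missing port or a wrong scheme before you save the resource server, but only Relativity itself can confirm that the server is reachable.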

2.6 Processing servers on resource pools

You must add a Processing Server to the resource pool associated with the workspace that is hosting Processing. You can only have one Processing Server per resource pool.

Note: Don't change the Processing Server in a resource pool after you've processed data in a workspace that uses that resource pool. Changing the Processing Server after data has been processed causes unexpected results with retrying errors, deduplication, and document numbering. This is because a new server is not aware of what has happened in the workspace before it was added.

2.7 Processing agents

The processing application uses the following agents:

• Server Manager agent - retrieves version information from the processing server and updates Relativity with this information
• Processing Set Manager agent - manages the running of processing sets and retrieves errors encountered while sets are running

When you install the Processing application, one of each of these agents is automatically included. If your workspace requires additional agents, you can add them manually. It is not mandatory to add additional agents.

To manually install processing agents, perform the following steps:

1. Navigate to the Agents tab in Admin mode.
2. Click New Agent and complete the following required fields:
   • Agent Type - click to display a list of agents. Filter for one of the processing agents, select the agent, and click OK.
   • Number of agents - enter the number of agents you want to add.
   • Agent Server - click to display a list of servers, then select a server and click OK.
   • Run interval - enter the interval, in seconds, at which the agent should check for available jobs.
   • Logging level of event details - select Log critical errors only (recommended), Log warnings and errors, or Log all messages.
   • Enabled - select Yes.
3. Click Save.

2.8 Creating a choice for the processing source location

To save a processing set, you must select a value for the Select source for files to process field on the set. To make a value available for this field, you must create a choice for the Processing Source Location field. To create a choice for the Processing Source Location field:

1. In Admin mode, navigate to the Choices tab.
2. Click New Choice.
3. Enter the following values for the following required fields:
   a. Processing Source Location for Field. The Processing Source Location field is automatically created for you.
   b. The name of the repository containing the files you want to process for Name. Enter an absolute network path (UNC) for the Name field. For example, \\pt-func-file01.example.com\FileShare\Custodian\MJones.

      Note: The Relativity Service Account must have read access to the processing source location.

   c. A value of your choice for Order.
4. Add the source location you just created to the resource pool:
   a. Navigate to the Resource Pools tab in Admin mode.
   b. Select the pool to which you want to add the source location.
   c. Click Add on the Processing Source Locations object.
   d. Select the source location choice you created and click Ok. The source location is now attached to the resource pool.
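Because the Name field expects an absolute UNC path, it can help to check a candidate value before creating the choice. The Python sketch below is an illustration only: the function name is_unc_path is an assumption, and the check covers the form of the path, not whether the share exists or whether the Relativity Service Account can read it.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    """Return True when the path names both a server and a share (UNC form)."""
    drive = PureWindowsPath(path).drive
    if not drive.startswith("\\\\"):
        return False            # drive letter or relative path
    server_and_share = drive[2:].split("\\")
    return len(server_and_share) == 2 and all(server_and_share)


# The first path is the example from the guide; the others are hypothetical.
for candidate in (
    r"\\pt-func-file01.example.com\FileShare\Custodian\MJones",
    r"C:\Data\MJones",
    r"FileShare\MJones",
):
    print(candidate, is_unc_path(candidate))
```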

3 Supported file types

The following file types and extensions are supported by Relativity for processing:

File type | Extensions
Image files | JPG, JPEG, ICO, BMP, GIF, TIFF, TIF, JNG, KOALA, LBM, PBM, IFF, PCD, PCX, PGM, PPM, RAS, TARGA, TGA, WBMP, PSD, CUT, XBM, DDS, FAX, SGI, PNG, EXF, EXIF
Adobe files | PDF, PS, EPS
Vector files | SVG, SVGZ, WMF, PLT, EMF, SNP, HPGL, HPG, PLO, PRN, EMZ, WMZ
CAD files | DXF, DWG, SLDDRW, SLDPRT, 3DXML, SLDASM, PRTDOT, ASMDOT, DRWDOT, STL, EPRT, EASM, EDRW, EPRTX, EDRWX, EASMX
Word | DOCX, DOCM, DOTX, DOTM, DOC, DOT, RTF, ODT, WPD, WPS
Excel | XLSX, XLSM, XLSB, XLAM, XLTX, XLTM, XLS, XLT, XLA, XLM, XLW, ODC, ODS, UXDC, DBF
PowerPoint | PPTX, PPTM, PPSX, PPSM, POTX, POTM, PPT, PPS, POT, ODP
Publisher | PUB
OneNote | ONE
Project | MPP
Visio | VSD, VDX, VSS, VSX, VST, VSW
Email | PST, OST, Unencrypted NSF, MSG, P7M, ICS, VCF, MBOX, EML, EMLX, TNEF, DBX, Bloomberg XML
HTML | HTML, MHT, HTM, MHTML, XHTM, XHTML
Compressed files | ZIP, TAR, GZ, BZ2, RAR, Z, CAB, ALZIP, EnCase Logical Evidence Files
JungUm Global | GUL
Text files | TXT, CSV

Note: Self-extracting RAR files and PEM certificate files are not currently supported.

3.1 Container file types

The following file types can act as containers:

File type | Extensions
Bloomberg | XML (sometimes)
Cabinet | CAB
EnCase | Extensions start with L01 or Lx01 and go up to YZZ or YZZZ
RAR | RAR
Zip | ALZIP
Zip | ZIP
Zip | Z
Zip | BZ2
Zip | GZ
TAR (Tape Archive) | TAR
Outlook Offline Storage | OST
Outlook Mail Folder | PST
Outlook Express Mail Folder | DBX
PDF Portfolio | PDF (sometimes)
iCalendar | ICS (sometimes)
MBOX Email Store | MBOX

4 Home tab

When you click the Processing tab in Relativity, you're taken to the processing Home tab.

In the Home tab, you can create a new processing set, discover and publish processing set files, review and retry errors, and view processing statuses and reports.

You can also navigate the sub-tabs available at the top of the Home page and complete processing tasks within their respective tabs.

The available processing sub-tabs are:

• Settings – control default settings for all processing jobs in the workspace
• Custodians – view, create, and edit custodians
• Password Bank – create password bank entries that correspond to files to be processed so that password-protected files are automatically unlocked. See Password bank tab on page 24 for more information.
• Processing Sets – view, create, and edit processing sets
• Errors – view errors for each processing set, add notes, and retry processing on files with errors
• Reports – generate and save reports for one or multiple processing sets

The Home tab displays the processing set list, which contains the following columns:

• Processing Set – the name of the processing set. Click this to go to the Processing Set layout on the Processing Sets sub-tab.
• Custodian – the name of the custodian associated with the processing set. Click the name of a custodian to go to the Custodian layout on the Custodians sub-tab.
• Status – status of the processing set
• Percent complete – progress (as a percentage) of the currently running job. When you click Discover, this column displays the progress of the discovery process. When you click Publish for the same set, this column resets to 0, then displays the progress (as a percentage) of the publishing process.
• Discovered file count – after discovery, displays the number of files discovered in the processing set
• Error count – the number of errors with a status of Ready to retry or Unresolvable for the processing set. Click this number to go to the Errors tab. Only errors with a status of Ready to retry or Unresolvable appear for that processing set. The view that displays is the last view selected on the Errors tab. Depending on the view criteria, errors included in the count on the Home tab for the processing set may be excluded from the view.
• Published document count – after publishing, displays the number of files published from the processing set into Documents
• Discover - starts the discovery phase of the processing job, during which Relativity attempts to discover all files in the set
• Publish - starts the publish phase of the processing job, during which Relativity publishes the files it has discovered
  o This button remains enabled even after publish is complete so that you can republish the set if the first publish attempt failed or resulted in errors. You can't cancel a republish.
  o If you've arranged to auto-publish sets, then when you kick off discovery, you are also kicking off publish.

You can also discover or publish files from multiple processing sets at once. Select the check boxes for the processing sets to be discovered or published and click Discover Selected or Publish Selected directly below the processing set list.

4.1 Home tab consoles

Consoles display to the right of the processing set list on the Home tab.

The Processing Status console displays status information for the processing sets across your workspace.

This console includes the following values:

• Files to be Published – the number of files across processing sets in your workspace that are ready to be published. Files to be Published = (Discovered file count) - (Duplicate file count) - (Published file count). This count excludes all canceled processing sets and sets in the queue.
• Duplicate File Count - the number of duplicate files that have been excluded across all processing sets in the workspace.
• Total Published Files – the number of files across processing sets in your workspace that have been published to the Documents tab.
• Processing Sets – the number of processing sets in your workspace with the number of sets in each of the following states:
  o New – the number of processing sets that have been created but whose files have not been discovered or published. If you click Discover for a set but then cancel the discovery process before it begins, the set is still considered new.
  o Discovering – the number of processing sets whose files are currently being discovered. This reflects sets that are being discovered for the first time, as well as sets that are being retried.
  o Discovered – the number of processing sets whose files have been discovered but not published.
  o Publishing – the number of processing sets whose files have been discovered and are currently being published. This reflects sets that are being published for the first time, as well as sets that are being retried.
  o Published – the number of processing sets whose files have been discovered and published.
  o Canceling / Canceled – the number of processing sets that have had a discover or publish job canceled before completion.

  Note: A single processing set can't be included in more than one state. When a set is Retrying for Discovery, it's included in the Discovering count. When a set is Retrying for Publish, it's included in the Publishing count.

• Error Count – the number of errors the processing sets in your workspace have encountered. This count includes only errors with statuses of Ready to retry or Unresolvable.
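The Files to be Published formula can be worked through with a small Python sketch. The counts below are invented for illustration, and the status labels "Canceled" and "Queued" are assumptions standing in for canceled sets and sets in the queue; the guide does not state how Relativity labels them internally.

```python
from dataclasses import dataclass


@dataclass
class SetCounts:
    status: str        # hypothetical status label
    discovered: int    # Discovered file count
    duplicates: int    # Duplicate file count
    published: int     # Published file count


# Assumed labels for the sets the console leaves out of the total.
EXCLUDED_STATUSES = {"Canceled", "Queued"}


def files_to_be_published(sets):
    """Sum discovered - duplicates - published over the counted sets."""
    return sum(
        s.discovered - s.duplicates - s.published
        for s in sets
        if s.status not in EXCLUDED_STATUSES
    )


workspace = [
    SetCounts("Discovered", discovered=100, duplicates=10, published=0),
    SetCounts("Published", discovered=50, duplicates=5, published=45),
    SetCounts("Canceled", discovered=30, duplicates=0, published=0),
]
print(files_to_be_published(workspace))  # (100 - 10 - 0) + (50 - 5 - 45) = 90
```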

The Workspace Settings console displays information about the currently selected processing settings for your workspace.

This console includes the following values:

• Deduplication Method – the method of deduplication you've selected. Click this link to go to the Settings layout on the Settings sub-tab, from which you can edit the deduplication method.
• # of Fields Mapped – the number of processing fields that have been mapped to Relativity fields. If you haven't mapped any fields, little metadata is available for any processed files published to the Documents tab. Click this link to go to the Settings sub-tab, where you can map processing fields to Relativity fields.

The Available Reports console displays the reports available for your processing sets. Click a report name to go to the Reports sub-tab to generate and view reports for one or multiple processing sets. See Reports tab on page 48 for more information on these reports.

5 Settings tab

Use the Settings tab to view and modify settings for all the processing jobs in your workspace. The values you set here are the default for all new processing sets.

To view settings for all the processing sets in your workspace, navigate to the Settings tab. The Default Settings area displays your current settings. The Processing Field (Field Mapping) area displays your current field mappings.

To modify the settings listed in the Default Settings area, click Edit on the settings layout.

The following Default Settings fields are editable:

• Deduplication method - removes duplicate files. Select from the following options:
  o None - no deduplication occurs.

    Note: Even when you select None as the deduplication method, Relativity identifies duplicates by storing one copy of the native document on the file repository and using metadata markers for all duplicates of that document. For details on how deduplication works in the background, see Technical considerations for deduplication on page 53.

  o Global - documents from each processing set are deduplicated against all documents in all other processing sets in your workspace.
  o Custodial - documents from each processing set are deduplicated against only documents in processing sets owned by that custodian.

  Note: Deduplication only applies to parent files. Deduplication does not apply to children. If a parent is published, all of its children are also published.

• DeNIST - deNISTing separates and removes files found on the National Institute of Standards and Technology (NIST) list from the data you plan to process so that they do not make it into Relativity when you publish a processing set. The NIST list contains file signatures (hash values) for millions of files that hold little evidentiary value for litigation purposes because they are not user-generated. This list may not contain every known junk or system file, so deNISTing may not remove 100% of undesirable material. The options are:
  o Yes - removes all files found on the NIST list.
  o No - does not remove any files found on the NIST list. Files found on the list are published with the processing set.
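The difference between the None, Global, and Custodial deduplication methods can be sketched in Python. This is a simplified model, not Relativity's implementation: it assumes each parent file is represented by a (custodian, hash) pair, that the first occurrence is the one kept, and that the hash values are already computed. The function name and sample data are hypothetical.

```python
def deduplicate(parents, method):
    """Return the parent files kept under the given deduplication method.

    parents: iterable of (custodian, dedup_hash) pairs in processing order.
    method:  "None", "Global", or "Custodial".
    """
    if method not in {"None", "Global", "Custodial"}:
        raise ValueError(f"unknown deduplication method: {method}")
    seen = set()
    kept = []
    for custodian, dedup_hash in parents:
        # Global compares hashes across the workspace; Custodial compares
        # only against files owned by the same custodian.
        key = dedup_hash if method == "Global" else (custodian, dedup_hash)
        if method != "None" and key in seen:
            continue            # duplicate parent is skipped
        seen.add(key)
        kept.append((custodian, dedup_hash))
    return kept


# Hypothetical parents: the same file (hash h1) held by two custodians.
parents = [("A", "h1"), ("B", "h1"), ("A", "h1"), ("A", "h2")]
for method in ("None", "Global", "Custodial"):
    print(method, len(deduplicate(parents, method)))
```

Only parent files are modeled here because, as noted above, deduplication does not apply to children.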

The following Processing Set Setup Defaults fields are editable:

• Processing set name - the name of the processing set using these settings.
• Custodian - the owner of the processed data. When you select a custodian with a specified prefix, the default document numbering prefix field changes to reflect the custodian's prefix. When you select a custodian without a prefix, the default document numbering prefix field is cleared and you are required to manually enter a prefix value. For more information on custodians, see Creating a processing set in Processing sets tab on page 28.
• Use source folder structure - allows you to maintain the folder structure of the source of the files you process when you bring these files into Relativity. For more information on source folders, see Creating a processing set in Processing sets tab on page 28.
• Auto-publish set - arranges for the processing engine to automatically kick off publish after the completion of discovery, with or without errors. The options are:
  o Yes - arranges for automatic publish after discovery is complete.
  o No - does not arrange for automatic publish after discovery is complete. You always have to manually start the publish phase if you select No. This is the default value.
• Default time zone - click to display a list of time zones. Select the time zone appropriate for your environment. This selection determines the default time zone for processing sets you create later.
• Default OCR languages - click to display a list of OCR languages. Select the languages appropriate for your environment. This selection determines the default language for processing sets you create later.

The following Processing Set Document Numbering Defaults field is editable:

• Document numbering prefix - the prefix applied to each file in a processing set once it is published. The default value for this field is REL. When applied to documents, this appears as <Prefix>xxxxxxxxxx - the prefix followed by ten digits. If you select a default Custodian on the Settings tab, it overrides the REL value. If the custodian attached to the processing set contains a prefix different than that on the Settings tab, the processing set applies the custodian's prefix to each file.
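The numbering format can be illustrated with a short Python sketch. The guide states only that the number is the prefix followed by ten digits; the zero-padding of a sequence number, the function name, and the custodian prefix MJ are assumptions made for the example.

```python
def document_number(sequence, default_prefix="REL", custodian_prefix=None):
    """Format <Prefix>xxxxxxxxxx: a prefix followed by ten digits.

    A custodian prefix, when present, takes precedence over the default.
    Zero-padding the sequence number is an assumption of this sketch.
    """
    if not 0 <= sequence < 10**10:
        raise ValueError("sequence does not fit in ten digits")
    prefix = custodian_prefix or default_prefix
    return f"{prefix}{sequence:010d}"


print(document_number(42))                         # default REL prefix
print(document_number(42, custodian_prefix="MJ"))  # hypothetical custodian prefix
```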

The Processing Set File Handling Defaults fields are identical to those found on the processing set. See Processing sets tab on page 28 for descriptions of these fields.

After selecting options for each field, click Save to save the settings for all processing jobs in your workspace.

5.1 Editing field mappings

To edit field mappings:

1. Click the name of a processing field from the Processing Field Name column. The Processing Fields layout appears for that field.
2. Click Edit, then click in the Relativity Field field.
3. From the Select Item - Relativity Field dialog, select the Relativity field you wish to map to.
   • When mapping fields, make sure the field type matches. For example, if the processing field is a Fixed-Length Text field, the Relativity field you select must also be Fixed-Length Text.
   • You can only map each Relativity field to one processing field.
   • An error occurs if you attempt to map a processing field to a Relativity field that is mapped to a different processing field.
   • Relativity fields must be Unicode-enabled if they are mapped to processing fields.
   • If you change your field mapping and then republish the processing set, any documents designated to be republished have the new field mapping applied. Documents that aren't republished aren't updated with the new field mapping.
4. Click Set. The field you selected appears in the Relativity Field box. To remove the selection, click Clear.
5. Save the field mapping and repeat steps 1-4 for all fields you wish to map.

Note: If you publish processing sets without mapping the Document Extension processing field, the Text Extraction report does not accurately report document counts by file type.
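The mapping rules in step 3 (matching field types, one processing field per Relativity field, Unicode-enabled targets) can be expressed as a small pre-check. The Python sketch below is a hypothetical model: the RelativityField class, the check_mapping function, and the Doc Author field are invented for illustration and do not correspond to a Relativity API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelativityField:
    name: str
    field_type: str          # e.g. "Fixed-Length Text", "Long Text"
    unicode_enabled: bool


def check_mapping(processing_field, processing_type, target, existing):
    """Return a list of rule violations for one proposed mapping.

    existing maps Relativity field names to the processing field that is
    already mapped to them.
    """
    problems = []
    if target.field_type != processing_type:
        problems.append("field type mismatch")
    if existing.get(target.name, processing_field) != processing_field:
        problems.append("Relativity field already mapped to another processing field")
    if not target.unicode_enabled:
        problems.append("Relativity field is not Unicode-enabled")
    return problems


doc_author = RelativityField("Doc Author", "Fixed-Length Text", True)
print(check_mapping("Author", "Fixed-Length Text", doc_author, {}))
print(check_mapping("Author", "Long Text", doc_author, {"Doc Author": "Email From"}))
```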

5.2 Invalid OCR language combinations

You can't save processing settings if you select any of the following languages plus any other language. If one of the following five languages is selected, it must be the only one selected:

• Simplified Chinese
• Traditional Chinese
• Korean
• Japanese
• Thai
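The rule above can be written as a short validation check. The Python sketch below is an illustration only; the function name is an assumption, and the language names other than the five listed are examples.

```python
# Languages that must be the only selection, as listed above.
EXCLUSIVE_LANGUAGES = {
    "Simplified Chinese",
    "Traditional Chinese",
    "Korean",
    "Japanese",
    "Thai",
}


def is_valid_ocr_selection(languages):
    """Return True unless an exclusive language is combined with another."""
    selected = set(languages)
    if selected & EXCLUSIVE_LANGUAGES:
        return len(selected) == 1
    return True


print(is_valid_ocr_selection(["English", "French"]))    # True
print(is_valid_ocr_selection(["Japanese"]))             # True
print(is_valid_ocr_selection(["Japanese", "English"]))  # False
```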

5.3 Processing fields

5.3.1 Required fields

The following system-created metadata fields are always populated when data is processed:

Processing Field Name | Field Type | Description
Container Extension | Fixed-Length Text | Document extension of the container file in which the document originated.
Container ID | Fixed-Length Text | Unique identifier of the container file in which the document originated. This is used to identify or group files that came from the same container.
Container Name | Fixed-Length Text | Name of the container file in which the document originated.
Custodian | Single Object | Custodian associated with (or assigned to) the processing set during processing.
Extracted Text | Long Text | Complete text extracted from content of electronic files or OCR data field.
Last Published On | Date | Date on which the document was last updated via re-publish.
Level | Whole Number | Numeric value indicating how deeply nested the document is within the family. The higher the number, the deeper the document is nested.
Originating Processing Set | Single Object | The processing set in which the document was processed.
Processing Duplicate Hash | Fixed-Length Text | Identifying value of an electronic record that is used for deduplication during processing.
Processing File Id | Fixed-Length Text | Unique identifier of the document in the processing engine database.
Processing Errors | Multiple Object | Any associated errors that occurred on the document during processing. This field is a link to the associated Processing Errors record.
Relativity Native Time Zone Offset | Whole Number | The hour offset based on the Time Zone ID. Numeric field that controls how header dates and times appear for email messages in the viewer or on redacted or highlighted images. This does not modify actual metadata associated with the displayed values.
Relativity Native Type | Fixed-Length Text | The type of native file loaded into the system.
Supported By Viewer | Boolean | Yes/No field that indicates whether the native document is supported by the viewer.
Time Zone Field | Single Object | Indicates which time zone is used to display dates and times on a document image.
Virtual Path | Long Text | Folder structure and path to file from the original location identified during processing.

5.3.2 Optional fieldsThe following metadata fields can be, but are not required to be, mapped. For each of the followingProcessing fields, you can pick a Document field in Relativity to map to.

Processing Field Name Field Type DescriptionAttachment DocumentIDs

Long Text Attachment document IDs of all child items in family group, delimitedby semicolon, only present on parent items.

Author Fixed-Length Text

Original composer of document or sender of email message.

Comments Long Text Comments extracted from themetadata of the native file.Conversation Long Text Normalized subject of email messages. This is the subject line of the

email after removing the RE and FW that are added by the systemwhen emails are forwarded or replied to.

Conversation Family Fixed-Length Text

Relational field for conversation threads. This is a 44-character string ofnumbers and letters that is created in the initial email.

Page 20: Relativity Processing User Guide - 8 · Relativity|ProcessingUserGuide-3 8.3Deletingaprocessingset 36 9Runningaprocessingset 36 9.1Discoveringfiles 36 9.1.1Readingprocessingstatus

Relativity | Processing User Guide - 20

Conversation Index | Long Text | Email thread created by the email system. This is a 44-character string of numbers and letters that is created in the initial email and has 10 characters added for each reply or forward of an email.
Date Created | Date | Date and time from the Date Created property extracted from the original file or email message.
Date Last Modified | Date | Date and time from the Modified property of a document, representing the date and time that changes to the document were last saved.
Date Last Printed | Date | Date and time that the document was last printed.
Date Received | Date | Date and time that the email message was received (according to original time zones).
Date Sent | Date | Date and time that the email message was sent (according to original time zones).
Document Extension | Fixed-Length Text | Character extension of the document that represents the file type to the Windows Operating System. Examples are PDF, DOC, or DOCX.
Document Subject | Long Text | Subject of the document extracted from the properties of the native file.
Domains (Email BCC) | Multiple Object | Domains of 'Blind Carbon Copy' recipients of the email message. See the Note below.
Domains (Email CC) | Multiple Object | Domains of 'Carbon Copy' recipients of the email message. See the Note below.
Domains (Email From) | Multiple Object | Domains of Originator of the email message. See the Note below.
Domains (Email To) | Multiple Object | Domains of 'To' recipients of the email message. See the Note below.
Email BCC | Long Text | Recipients of 'Blind Carbon Copies' of the email message.
Email Categories | Long Text | Category(ies) assigned to an email message.
Email CC | Long Text | Recipients of 'Carbon Copies' of the email message.
Email From | Fixed-Length Text | Originator of the email message.
Email Subject | Long Text | Subject of the email message.
Email To | Long Text | List of recipients or addressees of the email message.
File Name | Fixed-Length Text | Original name of the file.
File Size | Decimal | Generally a decimal number indicating the size in bytes of a file.
File Type | Fixed-Length Text | Description that represents the file type to the Windows Operating System. Examples are Adobe Portable Document Format, Microsoft Word 97 - 2003 Document, or Microsoft Office Word Open XML Format.
Group Identifier | Fixed-Length Text | Group the file belongs to (used to identify the group if attachment fields are not used).
Has Hidden Data | Yes/No | Indication of the existence of hidden document data such as hidden text in a Word document, hidden columns, rows, or worksheets in Excel, or slide notes in PowerPoint.


Importance | Single Choice | Notation created for email messages to note a higher level of importance than other email messages added by the email originator.
Lotus Notes Other Folders | Long Text | A semicolon-delimited listing of all non-primary folders in which a Lotus Notes message or document was included.
MD5 Hash | Fixed-Length Text | Identifying value of an electronic record that can be used for deduplication and authentication generated using the MD5 hash algorithm.
Message Type | Single Choice | Indicates the email system message type. Possible values include Appointment, Contact, Distribution List, Delivery Report, Message, or Task. The value may be appended with '(Encrypted)' or 'Digitally Signed' where appropriate.
Number of Attachments | Whole Number | Number of files attached to a parent document.
Other Props | Long Text | Metadata extracted during processing for additional fields beyond the list of processing fields available for mapping. Field names and their corresponding values are delimited by a semicolon.
Parent Document ID | Fixed-Length Text | Document ID of the parent document. This field is only available on child items.
Password Protected | Single Choice | Indicates the documents that were password protected. It contains the value 'Decrypted' if the password was identified, 'Encrypted' if the password was not identified, or no value if the file was not password protected.
SHA1 Hash | Fixed-Length Text | Identifying value of an electronic record that can be used for deduplication and authentication generated using the SHA1 hash algorithm.
SHA256 Hash | Fixed-Length Text | Identifying value of an electronic record that can be used for deduplication and authentication generated using the SHA256 hash algorithm.
Sort Date | Date | Date taken from the Date Sent field on email messages repeated for the parent document and all child items to allow for date sorting.
Unified Title | Long Text | Subject of the document. If the document is an email, this field contains the email subject. If the document is not an email, this field contains the document's file name.
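
The Conversation and Conversation Index descriptions in the table can be illustrated with a short Python sketch. This is not Relativity's implementation: the function names are hypothetical, the pattern assumes that the reply and forward markers look like RE:, FW:, or FWD: in any letter case, and the reply count only applies the 44-plus-10-characters rule stated in the table.

```python
import re

# Assumption: reply/forward markers look like "RE:", "FW:", or "FWD:" (any case).
_PREFIX = re.compile(r"^\s*(?:re|fw|fwd)\s*:\s*", re.IGNORECASE)

BASE_LENGTH = 44   # characters created in the initial email (from the table)
REPLY_LENGTH = 10  # characters added for each reply or forward (from the table)


def normalize_subject(subject: str) -> str:
    """Strip leading reply/forward prefixes, repeatedly, from a subject line."""
    previous = None
    while previous != subject:
        previous = subject
        subject = _PREFIX.sub("", subject, count=1)
    return subject.strip()


def reply_depth(conversation_index: str) -> int:
    """Number of replies/forwards implied by the length of a Conversation Index."""
    extra = len(conversation_index) - BASE_LENGTH
    if extra < 0 or extra % REPLY_LENGTH != 0:
        raise ValueError("unexpected Conversation Index length")
    return extra // REPLY_LENGTH
```

Under these assumptions, a subject such as "RE: FW: Budget review" normalizes to "Budget review", and a 64-character Conversation Index corresponds to two replies or forwards.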

Note: Refer to the Relativity Admin Certification Workbook for information on domains and steps to create the Domains object and associative multi-object fields. The Domains processing fields listed in this table eliminate the need to perform domain parsing using transform sets for the processed documents.
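
To make the idea of domain parsing concrete, the following Python sketch derives a list of domains from a recipient string using the standard library's email.utils.getaddresses. It is only an illustration of the kind of value the Domains fields hold. The function name, the semicolon separator, and the choice to lowercase and deduplicate the results are assumptions, not documented Relativity behavior.

```python
from email.utils import getaddresses


def email_domains(recipients: str) -> list[str]:
    """Return the unique, lowercased domains found in a recipient string."""
    # Assumption: entries are separated by semicolons.
    parts = [p.strip() for p in recipients.split(";") if p.strip()]
    domains: list[str] = []
    for _name, address in getaddresses(parts):
        if "@" not in address:
            continue  # skip entries without a usable address
        domain = address.rsplit("@", 1)[1].lower()
        if domain not in domains:
            domains.append(domain)
    return domains
```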

5.3.3 Pivot-enabled fields for processing

Several processing fields are enabled for Pivot by default. You can use the following fields to generate Pivot reports in their respective processing objects to help you better understand your processing data:


Processing field name | Processing object found in
Error created on | Errors
Error status | Errors
Error type | Errors
Identified file type | Errors
Processing Set | Errors
Name | Processing Sets
Custodian | Processing Sets
Status | Processing Sets
Time zone | Processing Sets

6 Custodians tab

Custodians are the parties that own or facilitate the data included in a processing job. To view or edit custodian details, or to create a new custodian, navigate to the Custodians tab.

This view displays a list of all custodians in your workspace. You can select the custodians that appear in this tab on the Custodian field on the processing set.

6.1 Creating a new custodian

To create a new custodian:

1. Click the Processing tab and then click the Custodians sub-tab.
2. Click New Custodian on the Custodians tab.
3. Complete the fields on the Custodian layout. See Fields on the next page.
4. Click Save. The Custodian layout appears with a new section called Processing Set (Custodian).
5. (Optional) Click New from the Processing Set (Custodian) section to go to the Processing sets tab on page 28, where you can create a new processing set assigned to this custodian.


6.2 Fields

The Custodian layout provides the following fields:

n Name - the name of the custodian.
n Document numbering prefix - the prefix used to identify each file of a processing set once the set is published. The prefix entered on the custodian appears as the default value for the required Document numbering prefix field on the processing set that uses that custodian. If you provide a prefix for both the custodian and processing settings, the published file uses the custodian prefix. The identifier of the published file reads: <Prefix> # # # # # # # # # #
n Notes - any additional descriptors of the custodian.

6.3 Viewing or editing custodian details

Click the Edit link to the left of the Name column to access the Custodian layout. On the Custodian layout, you can edit a custodian's name, enter a document numbering prefix, and add notes.

You can view more details about a custodian by clicking the custodian's name from the list on the Custodians tab. The Custodian layout displays with an additional section called Processing Set (Custodian).


From the Processing Set (Custodian) section, you can create a new processing set for the custodian, or you can view and edit details for existing processing sets assigned to this custodian.

To create a new processing set for the custodian, click New on the Processing Set (Custodian) object.
To view and edit processing set details, click the name of the set in the Name column.

Clicking either option directs you to the Processing sets tab on page 28.

Note: You can't delete a custodian already associated with a processing set.

7 Password bank tab

The Password Bank is a password repository used to decrypt certain password-protected files during file discovery and native imaging. Using Password Bank, you can enter each password in the bank, and Relativity runs those passwords against each password-protected document until it finds a match. Likewise, when you run an imaging job, mass images, or use image-on-the-fly, the list of passwords specified in the Password Bank accompanies the imaging job so that password-protected files are imaged in that job.

Password Bank potentially reduces the number of errors in each job and eliminates the need to address password errors outside of Relativity.

Note: The Password Bank is located under both the Imaging and the Processing applications if both are installed, and it is updated in each to reflect the most current entries added, deleted, or edited.

7.1 Creating or deleting a Password Bank entry

To create a new entry in the bank:


1. Click Processing, and click the Password Bank tab.
2. Click New on the Password Entry category.
3. Complete the fields on the Password Entry Layout. See Fields below for more information.
4. Click Save. The entry appears among the others under the Password Entry object.

To delete a password, select the check box next to its name and click Delete on the Password Entry object.You must select a password to enable the Delete button.

Note: Encrypted email messages and Lotus Notes databases require a matching password and file. However, for loose files all entered passwords are considered as options used to unlock, regardless of file type and/or file association.

7.2 Fields

The Password Bank layout contains the following fields:


n Type - the file type for which you are storing a password. The options are:
o Passwords - any file that is not grouped with the two other type options of Lotus Notes or Email encryption certificate. When you select this type, you must enter at least one password in the Passwords field in order to save.

Note: Microsoft OneNote files and password protected Encase files are not supported in the Password Bank.

o Lotus Notes - any file generated by Lotus Notes software. When you select this type, you must upload a file with a Lotus Notes extension in the Upload file field in order to save. An associated password is optional.
o Email encryption certificate - files protected by various encryption software certificates. When you select this type, you must upload one of the eligible file extensions listed below in the Upload file field. An associated password is optional.

Note: Lotus Notes and Email encryption certificate files require a matching password and file. However, for Password or "loose" files all entered passwords are considered as options used to unlock, regardless of file type and/or file association.

n Description - the description of the entry you are adding to the bank. This field helps you differentiate between other entry types.

n Password(s) - the one or more passwords you are specifying for the Passwords type or for certificates without passwords. If you select Passwords as the file type, you must add at least one password here in order to save. You can also add values here if you are uploading certificates that do not have passwords. Separate passwords with a carriage return. If you enter two passwords on the same line, Password Bank interprets the value as a single password.

Note: Unicode passwords for ZIP files are not supported in the Password Bank.

n Upload file - the file you are required to upload for either the Lotus Notes or Email encryption certificate types. If uploading for Lotus Notes, the file extension must be "User.ID" with no exceptions. The file types eligible for upload for the Email encryption certificate type are:
o PFX
o P12
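
The one-password-per-line rule for the Password(s) field can be sketched in Python as follows. The function name is hypothetical, and ignoring empty lines is an assumption. The sketch only shows why two passwords typed on the same line are read as a single value.

```python
def parse_password_field(text: str) -> list[str]:
    """Split the Password(s) field into one password per line."""
    # splitlines() splits on \r, \n, and \r\n (and a few other Unicode
    # line boundaries).
    # Assumption: empty lines are ignored; everything else on a line,
    # including spaces, is kept as part of a single password.
    return [line for line in text.splitlines() if line != ""]
```

For example, the text "alpha", a carriage return, then "bravo charlie" yields two passwords, the second being the single value "bravo charlie".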

7.3 Validations, errors, and exceptions

Note the following:

n Including a password that doesn't belong to a document in your data set does not throw an error or affect the process.
n A password can unlock multiple files. This means that if you provide the password for a Lotus Notes file that also happens to correspond to a Word file, Password Bank will also unlock that Word file.
n If you delete a Password Bank entry after submitting a processing or imaging job, you can still complete those jobs.
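
The first two points can be pictured with a small Python sketch of trying each banked password against a file until one matches. Everything here is hypothetical: find_password, the can_unlock callback (a stand-in for real decryption), and the sample file names and passwords exist only for illustration and do not describe how the processing engine works internally.

```python
from typing import Callable, Iterable, Optional


def find_password(
    file_name: str,
    passwords: Iterable[str],
    can_unlock: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the first password that unlocks file_name, or None."""
    for password in passwords:
        if can_unlock(file_name, password):
            return password
    return None  # no match: the file stays locked until a correct password is added


# Hypothetical data: which password actually protects each file.
protected = {"notes.nsf": "s3cret", "memo.docx": "s3cret", "plan.xlsx": "other"}


def can_unlock(file_name: str, password: str) -> bool:
    return protected[file_name] == password


bank = ["unused-password", "s3cret"]
results = {name: find_password(name, bank, can_unlock) for name in protected}
```

In this sketch the unused password is simply skipped, the same password unlocks both notes.nsf and memo.docx, and plan.xlsx remains locked because its password is not in the bank.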

The following exception situations may occur when using the Password Bank:

n Word template files - the Password Bank can't unlock an encrypted Word file that was created based on an encrypted Word template where the Word file password is different than the template password, regardless of whether both passwords are in the Password Bank.

You can resolve Password Bank errors by supplying the correct password to the bank and then retrying those errors in their respective processing or imaging jobs.

7.4 View audits

Every time you send a Password Bank to the processing engine, Relativity adds an audit. The Password Bank object's audit history includes the standard Relativity audit actions of update and run, as well as a list of all passwords associated with a discovery job at run time.

To view the passwords sent to the processing engine during a job:

1. Click the Processing tab, and then click Password Bank.
2. Click View Audit on the Password Bank layout.
3. Click Details on the Password Bank history layout.
4. Refer to the Value field on the audit details window. Any properties not set on the password bank entry are not listed in the audit.

8 Processing sets tab

The Processing Sets tab allows you to see a list of all the processing sets in your environment.

This view provides the following information:


n Name - the name of the processing set.
n Custodian - the name of the person or entity who owns the data in the processing set.
n Status - the current status of the processing set.
n Percent complete - how much of the processing set has completed so far in the current phase.
n Discovered file count - how many files were discovered.
n Published document count - how many documents were published.

From this tab you can:

n Open and edit an existing processing set.
n Perform the following mass operations on selected processing sets:
o Delete
o Export to File
o Tally/Sum/Average

Note: The Copy, Edit, and Replace mass operations are not available for use with processing sets.

8.1 Creating a processing set

When you create a processing set, you are specifying the settings that the processing engine uses to process data.

You must have the following to create a processing set:

n Source location for the files you want to process
n Destination folder in Relativity in which to publish documents
n A custodian

To create a processing set:

1. Navigate to the Processing Sets tab or the Home tab.
2. Click the New Processing Set button to display the Processing Set layout.
3. Complete the fields on the Processing Set layout. See Fields on the next page.


8.2 Fields

The following fields are in the Setup section:

n Processing set - the name of the set.
n Custodian - the owner of the processed data.
o If no items exist in the Custodian list, no choices appear in this field.
o You cannot delete a custodian that is associated with a processing set.
o You cannot save a processing set unless you have selected a Custodian.
o If you change the custodian, the Document numbering prefix changes to reflect the new custodian's prefix.
o To create a Custodian:
l Click Add.
l Provide a name in the Name field.
l Click Save. The custodian is now available for selection in the required Custodian field on the processing set.
n Select source for files to process - the location of the data you want to process. Click Browse to select the path:


o The source path you select controls the folder tree below. The folder tree displays an icon for each file or folder within the source path. You can specify source paths in the resource pool under the Processing Source Location object. Click Save after you select a folder or file in this field.

n Select destination for published files - the folder in Relativity where the processed data is published.

Click to display the Relativity folder picker and select the appropriate folder. To create a new folder, right-click the base folder and click Create. Right-click on the New Folder and select Rename to update the name of the folder.


o If the source path you selected is an individual file or a container, such as a zip, then the folder tree does not include the folder name that contains the individual file or container.
o If the source path you selected is a folder, then the folder tree includes the name of the folder you selected.

n Use source folder structure - allows you to maintain the folder structure of the source of the files you process when you bring these files into Relativity. Select Yes to maintain the folder structure. Select No if you do not want to maintain the folder structure.

Note: If you select Yes for Use source folder structure, subfolders matching the source folder structure are created under this folder. See the following examples:

Example 1 (recommended)
- Select Source for files to process: \\server.ourcompany.com\Fileshare\Processing Data\Jones, Bob\
- Select Destination folder for published files: Processing Workspace \ Custodians \

Results: A subfolder named Jones, Bob is created under the Processing Workspace \ Custodians \ destination folder, resulting in the following folder structure in Relativity: Processing Workspace \ Custodians \ Jones, Bob \

Example 2 (not recommended)
- Select Source for files to process: \\server.ourcompany.com\Fileshare\Processing Data\Jones, Bob\
- Select Destination folder for published files: Processing Workspace \ Custodians \ Jones, Bob \

Results: A subfolder named Jones, Bob is created under the Processing Workspace \ Custodians \ Jones, Bob \ destination folder, resulting in the following folder structure in Relativity: Processing Workspace \ Custodians \ Jones, Bob \ Jones, Bob \

Any folder structure in the original source data is retained underneath.

If you select No for Do you want to use source folder structure, no subfolders are created under the destination folder in Relativity. Any folder structure that may have existed in the original source data is lost.
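
The two examples can be reproduced with a small Python sketch that computes the resulting folder from the source path, the destination folder, and the Use source folder structure choice. The function and constant names are hypothetical, and the sketch covers only the case where the source path is a folder, not an individual file or container.

```python
# Source path from Example 1 and Example 2.
SOURCE = "\\\\server.ourcompany.com\\Fileshare\\Processing Data\\Jones, Bob\\"


def published_folder(source: str, destination: str, use_source_structure: bool) -> str:
    """Return the Relativity folder that receives the published documents."""
    dest_parts = [p.strip() for p in destination.split("\\") if p.strip()]
    if use_source_structure:
        # The last folder of the source path becomes a subfolder of the destination.
        source_parts = [p for p in source.split("\\") if p]
        dest_parts.append(source_parts[-1])
    return " \\ ".join(dest_parts) + " \\"


example_1 = published_folder(SOURCE, "Processing Workspace \\ Custodians \\", True)
example_2 = published_folder(SOURCE, "Processing Workspace \\ Custodians \\ Jones, Bob \\", True)
```

Example 2 ends in a doubled Jones, Bob folder, which is why choosing a destination that already names the custodian is not recommended when the source folder structure is kept.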

n Auto-publish set - determines whether publish is started automatically after file discovery is complete for this set, with or without errors. By default, this is set to the value specified on Settings. The options are:
o Yes - arranges for automatic publish after discovery is complete.
o No - does not arrange for automatic publish after discovery is complete. You always have to manually start publish if you select No.
n Time Zone - determines what time zone is used to display date and time on a processed document. The default value for this is Coordinated Universal Time (UTC). Click to select from a picker list of available time zone values.
n OCR language - determines what language is used to OCR files where text extraction isn't possible, such as for image files containing text. If you select Japanese, Simplified Chinese, Traditional Chinese, Korean, or Thai on the OCR language field, that language must be the only one selected. Remove all other selected languages from the OCR language field before attempting to save the set.

The Document Numbering section contains the following field:


n Document numbering prefix - the prefix applied to the files once they are published. This appears as <Prefix>xxxxxxxxxx - the prefix followed by ten digits. If the custodian attached to the processing set includes a prefix, the set uses that prefix and that prefix is pre-populated on the Settings page. If the custodian attached to the set does not have a prefix, the set uses the prefix specified on the Settings page.
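
As an illustration of the numbering format, the following Python sketch builds an identifier from a prefix and a sequence number. The function name is hypothetical, and both the zero-padding and the idea of a simple running sequence number are assumptions. The guide states only that the prefix is followed by ten digits and that a custodian prefix takes precedence over the one on the Settings page.

```python
from typing import Optional


def document_identifier(
    sequence: int,
    settings_prefix: str,
    custodian_prefix: Optional[str] = None,
) -> str:
    """Build an identifier of the form <Prefix> followed by ten digits."""
    # A prefix on the custodian wins over the prefix from the Settings page.
    prefix = custodian_prefix or settings_prefix
    if not 0 <= sequence < 10**10:
        raise ValueError("sequence must fit in ten digits")
    # Assumption: the ten digits are a zero-padded running number.
    return f"{prefix}{sequence:010d}"
```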

The following fields are in the File handling section:

n Extract children - provides the ability to extract children files, such as attachments, embedded objects/images, and other non-parent files, from the data you plan to process. The options are:
o Yes - extracts all children files during discovery so that both children and parents are included in the processing job.
o No - does not extract children, so that only parents are included in the processing job.

n When extracting children, do not extract - allows you to exclude one or both of the following file types when the Extract children option above is set to Yes:
o MS Office embedded images - excludes images of various file types found inside Microsoft Office files, such as .jpg, .bmp, or .png in a Word file, from discovery so that embedded images aren't published separately in Relativity.
o MS Office embedded objects - excludes objects of various file types found inside Microsoft Office files, such as an Excel spreadsheet inside a Word file, from discovery so that the embedded objects aren't published separately in Relativity.

Note: This field does not include MS Outlook, since Outlook is not part of MS Office.

n Filter by file extensions - allows you to include or exclude certain file extensions from the processing job. Selecting either include or exclude makes the File extension(s) field below required:
o Do not filter by file extension(s) - discovers all files during processing.
o Include only file extensions entered below in processing - includes only those file extensions you specify in the File extension(s) field below.
o Exclude only file extensions entered below in processing - excludes only those file extensions you specify in the File extension(s) field below.

Note: Relativity excludes or includes only those file extensions listed in the File Extensions field. For example, if you list XLS only, Relativity doesn't filter XLSX, XLSM, XLA, XLSA, or any other Excel variations not specified in the extensions field.

n File extensions - the one or more file extensions you want to either include in or exclude from discovery. If you selected to either include or exclude in the field above, you must enter at least one file extension here or you cannot save the set. Separate each entry with a semi-colon. Periods before the extensions are optional and are removed upon save. See the Processing sets tab on page 28.

n Only apply filter to parent - determines whether or not to apply the file extension filter to only parent files. Container files are included in parent filtering. This means that if you specify .doc as the file extension to include, a .doc file in the folder structure is returned, as well as any .doc file in a container file. If you entered values into the Filter by file extensions and File extension fields above, you must complete this field in order to save the set. The options are:
o Yes - filters only parent documents
o No - filters both parent and child documents
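
The extension rules above (semicolon-separated entries, optional leading periods, and exact matching so that XLS does not cover XLSX) can be sketched in Python. The function names and the mode values 'none', 'include', and 'exclude' are hypothetical stand-ins for the three filter options, and case-insensitive matching is an assumption. The sketch does not model the Only apply filter to parent option.

```python
def parse_extensions(field_value: str) -> set[str]:
    """Parse the File extension(s) field: semicolon-separated, leading periods dropped."""
    extensions = set()
    for entry in field_value.split(";"):
        entry = entry.strip().lstrip(".")
        if entry:
            extensions.add(entry.lower())  # assumption: matching ignores case
    return extensions


def is_discovered(file_name: str, mode: str, extensions: set[str]) -> bool:
    """Decide whether a file passes the filter; mode is 'none', 'include', or 'exclude'."""
    if mode not in ("none", "include", "exclude"):
        raise ValueError(f"unknown filter mode: {mode}")
    if mode == "none":
        return True
    _, dot, ext = file_name.rpartition(".")
    # Only an exact extension match counts, so "xls" does not match "xlsx".
    listed = bool(dot) and ext.lower() in extensions
    return listed if mode == "include" else not listed
```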

The following field is in the Notifications section:

Email notification recipients - allows you to list the email addresses of all individuals you want to receive notifications while processing jobs are in progress. Emails are sent to notify the recipient of the following:

n Discovery
o Successful discovery completed
o Discovery completed with errors
o First discovery job level error is encountered
o File discovery error during submission of job
n Retry - Discovery
o First discovery retry job level error
o Discovery retry error during submission of job
n Publishing
o Successful publishing completed
o Publishing complete with errors


o First publishing job level error is encountered
o File publishing error during submission of job
n Retry - Publishing
o First publishing retry job level error
o Publishing retry error during submission of job

4. Click Save. When you save the set, a console appears on the right side of the screen. Use this console to run the processing job.

8.2.1 Invalid characters for the File extension(s) field

You can’t include any of the following Unicode characters when entering values for the File extension field on a processing set:

Code    Result  Description                     Abbreviation
U+0022  "       Quotation mark
U+003C  <       Less-than sign
U+003E  >       Greater-than sign
U+007C  |       Vertical bar
U+0000          Null character                  NUL
U+0001          Start of heading                SOH
U+0002          Start of text                   STX
U+0003          End-of-text character           ETX
U+0004          End-of-transmission character   EOT
U+0005          Enquiry character               ENQ
U+0006          Acknowledge character           ACK
U+0007          Bell character                  BEL
U+0008          Backspace                       BS
U+0009          Horizontal tab                  HT
U+000A          Line feed                       LF
U+000B          Vertical tab                    VT
U+000C          Form feed                       FF
U+000D          Carriage return                 CR
U+000E          Shift out                       SO
U+000F          Shift in                        SI
U+0010          Data Link Escape                DLE
U+0011          Device Control 1                DC1
U+0012          Device Control 2                DC2
U+0013          Device Control 3                DC3
U+0014          Device Control 4                DC4
U+0015          Negative-acknowledge character  NAK
U+0016          Synchronous Idle                SYN
U+0017          End of Transmission Block       ETB


U+0018          Cancel character                CAN
U+0019          End of Medium                   EM
U+001A          Substitute character            SUB
U+001B          Escape character                ESC
U+001C          File Separator                  FS
U+001D          Group Separator                 GS
U+001E          Record Separator                RS
U+001F          Unit Separator                  US
U+003A  :       Colon
U+002A  *       Asterisk
U+002F  /       Slash (Solidus)
U+003F  ?       Question mark
U+005C  \       Backslash
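
If you prepare extension lists outside Relativity, a check like the following Python sketch can flag the characters in the table before you paste a value into the field. The names are hypothetical, and the check mirrors only the table above: the control characters U+0000 through U+001F plus the nine listed printable characters.

```python
# Control characters U+0000 through U+001F plus the nine printable
# characters listed in the table.
INVALID_CHARACTERS = {chr(code) for code in range(0x20)} | set('"<>|:*/?\\')


def invalid_characters_in(value: str) -> list[str]:
    """Return the code points (as 'U+XXXX') of disallowed characters in value."""
    return [f"U+{ord(ch):04X}" for ch in value if ch in INVALID_CHARACTERS]
```

An empty result means the value contains none of the listed characters. The semicolon used to separate entries is not in the table, so it passes the check.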

8.3 Deleting a processing set

The following table breaks down when you can and cannot delete a processing set.

Phase | Can delete?
Discovery not yet kicked off | Yes
Discovery in progress | No
Discovery complete | No
Publish in progress | No
Publish complete | No
Republish in progress | No

9 Running a processing set

To complete a processing set, you must:

1. Discover files
2. Publish files
3. Republish files, if necessary

You can start each phase from both the Home page and the Processing Set layout.

Note: Don't add documents to a workspace and link those documents to an in-progress processing set. Doing this distorts the processing set's report data.

9.1 Discovering files

To start discovery from the processing set layout, click Discover Files on the console.


Clicking Discover Files sends the processing set to the processing queue, where it waits to be picked up. Once it is picked up, the processing engine begins to ingest the files you specified in the set. Relativity then performs OCR and/or text extraction on those files.

Note the following about file discovery:

n You can’t change the settings on any processing job at any point after file discovery begins. This means that once you click Discover Files, you can’t go back and edit the settings of the processing set and re-click Discover Files. You would need to create a new processing set with the desired settings.
n If you've arranged for auto-publish on the settings page, then when you start discovery on a processing set, you are also starting publish once discovery is complete, even if errors occur during discovery. This means that the Publish button is not enabled for the set until after the complete job is finished.
n When you start discovery or retry discovery for a processing job, the list of passwords specified in the password bank accompanies the processing job so that password-protected files are processed in that job. For more information, see Password bank tab on page 24.

When you start discovery, the Discover Files button changes to Cancel Discovery. Click this to cancel file discovery. Once discovery on a set has been canceled, you can't resume discovery. You must create a new processing set to fully discover those files.

9.1.1 Reading processing status

The Processing Set Status console provides data you can use to measure the progress of the processing job. This console and the information on the processing set layout refresh automatically every five seconds to reflect changes in the job.

Note: The frequency with which the processing set console refreshes is determined by the ProcessingSetStatusUpdateInterval entry in the configuration table. The default value for this is 5 seconds. See the Configuration Table Guide for more information.

The Processing Set Status console provides the following information:


• Status displays the current state of the processing job. Depending on the phase and progress of the job, any of the following status values appear:

| Status | What it means |
|---|---|
| New | New processing set created. |
| Waiting | The user clicked Discover or Publish Files and the agent has not picked up the job. |
| Initializing | The agent picked up the job, but hasn't yet submitted it. |
| <error message> | The agent could not submit the job and an error message appears. |
| Discovering/Publishing/Republishing Files | The agent has submitted the job. |
| Discovered/Published/Republished; Retrieving errors | The processing server is done discovering, publishing, or republishing files and is retrieving errors (whether or not there are any errors). |
| Discovered/Published/Republished; Updating tables | The processing server is done discovering, publishing, or republishing files and is updating tables. |
| Discovered/Published/Republished with errors | The processing server is done discovering, publishing, or republishing files and retrieving errors, and there was at least one error. |
| Discovered/Published/Republished files complete | The processing server is done working on the job, and Relativity is done retrieving errors, and there were no errors. |

• Deduplication method is the method selected for separating duplicate files during discovery, as set on the Settings object. The possible values here are:

    o None - means deduplication is not performed.

    Note: Even when None is selected as the deduplication method, Relativity identifies duplicates by storing one copy of the native document on the file repository and using metadata markers for all duplicates of that document.

    o Global - means deduplication is performed against every file in the workspace.
    o Custodial - means that deduplication is performed only against documents attached to the custodian selected for this set.

    Note: After a processing set is published, the deduplication method that was used for that set when it was published is respected. All retries of errors generated during file publishing use the same deduplication method used when publishing was taking place. For more information, see Technical considerations for deduplication on page 53.

• Discovered file count - the total number of files discovered to this point in the job. Depending on the number of container files, such as zips and .pst files, this value could fluctuate dramatically as the job goes on and the processing server continues to discover files.

• Published document count - the number of documents that are published into the Documents tab. After file publishing is complete, this count is updated as retries occur and additional files are published. If you delete some of the published documents and then retry errors, this count reflects the number of published documents currently in the processing set.

• Unresolvable errors - the number of unresolvable errors encountered during a job.
• Errors that are ready for retry - the number of errors that have a status of Ready for retry and which you can submit to be retried via the Retry mass operation.
• Retry jobs in queue - the number of error retries that remain in the queue.
• Last activity - the time at which the status console was last updated.

9.1.2 Canceling discovery

Once you start discovery, you can cancel it before the job reaches a status of Discovered with errors or Discover files complete.

To cancel discovery, click Cancel Discovery on the console.

Note the following about canceling discovery:

• If you click Cancel Discovery while the status is still Waiting, you can re-submit the discovery job.
• If you click Cancel Discovery after the job has already been sent to the processing engine, then the set is Canceled, meaning all options are disabled and it is unusable.
• If you have auto-publish enabled and you cancel discovery, file publishing does not start.
• Once the agent picks up the cancel discovery job, no more errors are created for the processing set.
• Errors that result from a job that is canceled are given a canceled status and can't be retried.
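
The cancel rules above can be read as a small decision function. The following is only an illustrative sketch: the function name and the returned labels are hypothetical and are not part of Relativity. Only the status names are taken from this guide.

```python
# Hypothetical sketch of the cancel-discovery rules; not a Relativity API.
FINISHED = {"Discovered with errors", "Discover files complete"}


def cancel_discovery_outcome(status):
    """Describe the effect of Cancel Discovery on a set whose discovery was started."""
    if status in FINISHED:
        # Discovery already finished, so there is nothing left to cancel.
        return "not cancelable"
    if status == "Waiting":
        # The agent has not picked up the job, so it can be submitted again.
        return "can re-submit"
    # The job already reached the processing engine: the set becomes unusable.
    return "Canceled"
```

For example, a set that is still Waiting maps to "can re-submit", while a set in Discovering Files maps to "Canceled".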

Once discovery is complete, the status console is updated to reflect this. The following options are now available for selection on the console:

• Publishing files (section 9.3)
• Viewing the discovery report (section 9.2)
• Viewing errors (section 9.4), if there are errors associated with the processing set

9.2 Viewing the discovery report

Once discovery is complete, you can view a report breaking down the different types of files discovered.

To view this report, click View Discovery Report on the console.

The discovery report provides the following information:

• Discovered Files by Custodian
• File Types Discovered - Processable
• File Types Discovered - Processable (By Custodian)
• Files Types Discovered - Unprocessable
• Files Types Discovered - Unprocessable (By Custodian)

See Reports tab on page 48 for more information.

9.3 Publishing files

At any point after file discovery is complete, you can publish the discovered files.

To publish files, click Publish Files on the console.

Note the following about publish:

• You can also republish for sets that have been previously published with or without errors. The Publish option is available even after publish is complete.

• If you've arranged for auto-publish on the settings page, then when you start discovery, you are also starting publish once discovery is complete, even if errors occur during discovery. This means that the Publish button is never enabled.

• The status console appears the same and updates in a similar fashion to discovery with the exception of the status field, which reflects publish progress instead of discovery.

• Once you publish files, you are unable to change the settings for the processing set, including the deduplication method.

9.3.1 Canceling publishing

Once you start publish, you can cancel it.

To cancel publish, click Cancel Publishing on the console.

Note the following about canceling publish:

• You can't cancel a republish job. The cancel option is disabled during republish.
• Once the agent picks up the cancel publish job, no more errors are created for the processing set.
• If you click Cancel Publishing while the status is still Waiting, you can re-submit the publish job.
• If you click Cancel Publishing after the job has already been sent to the processing engine, then the set is Canceled, meaning all options are disabled and it is unusable.
• Errors that result from a job that is canceled are given a canceled status and can't be retried.
• Once the agent picks up the cancel publish job, you can't use that processing set anymore.

9.3.2 Republishing files

You can republish a processing set any time after the previous publish job is complete and the Publish Files option is enabled again. You may want to republish if there are any documents that had ingestion or text extraction errors that have been retried.

To republish, click Publish Files on the console.

Note the following:

• All ready-to-retry errors resulting from this publish job are retried when you republish.
• Deduplication is respected on republish.
• If you change your field mapping and then republish the processing set, any documents designated to be republished have the new field mapping applied. Documents that aren't republished aren't updated with the new field mapping.

9.4 Viewing errors

You can view any errors that occur during discovery or publish.

To see errors, click View Errors on the console.

This takes you to the Errors tab. See Errors tab below for more information.

10 Errors tab

You could encounter a variety of errors while executing your processing job. To see a list of all processing errors for the workspace, navigate to the Errors tab.

Here, you can compare the errors returned with the errors displayed on the Document and/or Reports tab on page 48. You can also edit errors by adding notes about the error.

Note: The initial number of errors pulled from the processing server at one time is determined by the ProcessingErrorRetrievalInitialBatchSize value in the configuration table. See the Configuration Table guide for more information.

You can identify an error in the All Processing Errors view through any of the following fields:

• Message - the cause and nature of the error. For example, Unable to recurse embedded objects. Office document is password protected.
• Processing phase - the state of the processing job when the error occurred:
    o Discovering files
    o Extracting data
    o Publishing files
• Error status - the state of the error:
    o Ready to retry
    o Unresolvable
    o Retried
    o Ignored
• Error type - the type of error:
    o Document - occurs inside an individual file after the processing set that retrieved that file sent it through as part of the processing job. This type of error does not stop the job from completing. The job continues until the processing error limit is reached.
    o Job - stops the processing set from being submitted to the processing engine. You can't start the processing job until the error is resolved and the job is attempted again.
• Processing set - the processing set containing the error.
• Notes - any notes added by users to describe the error.
• Name - the numbering identifier of the error as assigned by the processing server.
• Identified file type - the file type of the document as it was identified by the processing engine. This is only populated for document-level errors. If the file type is unknown, this displays a value of Unknown.
• Document file location - the file path of the document in error.
• Container ID - the ID of the container from which the file in error was directly extracted, as identified by the processing engine. If you have a zip file with a .pst file inside it, and the .pst file contains an .msg file, then the .msg file has the container ID of the .pst, not the zip. This is because the .pst is the direct container of the .msg file.
• Error created on - the date/time stamp of when the error was created.
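
The Container ID rule in the list above (a file's container is the item it was directly extracted from, not an outer ancestor) can be illustrated with a short sketch. The nested-dictionary shape, the function name, and the file names are assumptions made for this example; they do not describe how the processing engine stores containers.

```python
def assign_container_ids(item, container_id=None, result=None):
    """Walk a nested container tree and record each item's direct container.

    item is a hypothetical dict of the form {"id": str, "children": [item, ...]}.
    """
    if result is None:
        result = {}
    result[item["id"]] = container_id
    for child in item.get("children", []):
        # The child's container is this item, never an outer ancestor.
        assign_container_ids(child, item["id"], result)
    return result


# A zip that holds a .pst, which in turn holds an .msg (illustrative names).
tree = {
    "id": "archive.zip",
    "children": [
        {"id": "mailbox.pst", "children": [{"id": "message.msg"}]},
    ],
}
containers = assign_container_ids(tree)
```

Here the entry for message.msg is mailbox.pst rather than archive.zip, matching the .pst and zip example in the Container ID description.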

Note: All errors that are resolvable have an initial status of Ready to Retry. This allows you to queue retries as the processing job is running.

10.1 Reading individual processing errors

Any error that occurs during the processing job is visible in the Processing Errors tab. Click the message or the name of any error in the All Processing Errors view to bring up its details.

The Error Details layout reflects the same fields found in the All Processing Errors view, in addition to the following:

• Document location in Relativity - the number identification of the error as automatically generated by the processing engine. This name is used on the Document object to associate an error with a document in Relativity.
• Document folder location - the path to the parent folder of the document in error.
• Virtual location - the folder inside the repository containing the processed files.
• Source location - the full name of the repository containing the processed files.

Note: The links for the Document file location and Document folder location fields on the error won't work if the website(s) containing them isn't added as a trusted site. See the Workstation Configuration Guide for more information.

10.2 Ignoring errors

You have the option of ignoring errors you aren't going to resolve immediately and want to filter out of your list view. Ignoring errors is a viable option when an Office document is password protected and you don’t have the password and/or you haven't entered the password in the password bank.

You must have edit permissions for processing errors in order to ignore errors.

Note: You can't ignore job-level publishing errors.

To ignore errors:

1. Navigate to the Processing Errors tab.
2. Select the check box next to each error you want to ignore. You can only ignore errors with a status of Ready to retry.
3. Move to the mass operation menu in the bottom left corner, select Ignore and click Go. You can only ignore errors through this mass operation.
4. Click Ok on the following confirmation message: “X errors are eligible to be ignored. Once you ignore these errors you can’t retry them until you have un-ignored them.”
5. Once you've ignored the selected errors, check the Error status column for each error in the view. This status should be Ignored.

10.3 Un-ignoring errors

You can un-ignore errors that you previously ignored so that you can have the option of retrying them. You must have edit permissions for processing errors in order to un-ignore errors.

To un-ignore errors:

1. Navigate to the Processing Errors tab.
2. Select the check box next to each error you want to un-ignore. You can only un-ignore errors that have a status of Ignored.
3. Move to the mass operation menu in the bottom left corner and select Un-ignore and click Go. You can only un-ignore errors through this mass operation.
4. A confirmation appears stating, “X errors are eligible to be un-ignored. Once you un-ignore these errors, they will be ready to retry.” Click Ok.
5. Once the selected errors have been un-ignored, check the Error status column for each error in the view. This status should be Ready to retry.
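
The ignore and un-ignore operations described in sections 10.2 and 10.3 behave like a two-state toggle between Ready to retry and Ignored. The sketch below models that toggle under stated assumptions: the dictionary shape of an error and the function names are hypothetical, and it assumes that ineligible errors in a selection are simply skipped, which is how the "X errors are eligible" confirmation message reads.

```python
READY = "Ready to retry"
IGNORED = "Ignored"


def mass_update(errors, from_status, to_status):
    """Move every eligible error to the new status and return the eligible count.

    errors is a list of dicts with a "status" key (an assumed shape).
    Errors in any other status are left untouched.
    """
    eligible = [error for error in errors if error["status"] == from_status]
    for error in eligible:
        error["status"] = to_status
    return len(eligible)


def mass_ignore(errors):
    # Only errors that are Ready to retry can be ignored.
    return mass_update(errors, READY, IGNORED)


def mass_un_ignore(errors):
    # Only errors that are Ignored can be un-ignored; they become Ready to retry.
    return mass_update(errors, IGNORED, READY)
```

Calling mass_ignore on a selection that mixes Ready to retry and Unresolvable errors changes only the former, and the returned count plays the role of X in the confirmation message.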

10.4 Retrying errors

You can attempt to resolve one or more processing errors from one or more processing sets through the Retry mass operation. You must have edit permissions for processing errors to retry them.

Note: Relativity automatically retries all Publish errors for a set when you are republishing that set.

To mass retry one or more errors:

1. Navigate to the Processing Errors tab.
2. Select the check box next to each error you want to submit for retry.
3. Select Retry from the mass operations menu in the bottom left corner. You can only retry errors with a status of Ready for retry.
4. Click Go.
    a. A retry is audited.
    b. If a retried error has a status of In progress, you can’t delete the processing set containing that error.

Note: Once an error retry is submitted to the processing engine for a single processing set, another retry job for the same set cannot be submitted until the first retry is complete. The subsequent retry is placed in the queue in Relativity to await retry.
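
The queuing behavior in the note above (one retry job in flight per processing set, later retries waiting their turn) can be sketched as follows. The class and method names are hypothetical and the sketch does not reflect how Relativity implements its queue; it only models the rule as stated.

```python
from collections import defaultdict, deque


class RetryScheduler:
    """Allow one in-flight retry job per processing set; queue the rest."""

    def __init__(self):
        self.in_flight = {}                # set name -> retry job being processed
        self.waiting = defaultdict(deque)  # set name -> retry jobs awaiting their turn

    def submit(self, set_name, job):
        if set_name in self.in_flight:
            # A retry for this set is already with the processing engine.
            self.waiting[set_name].append(job)
            return "queued"
        self.in_flight[set_name] = job
        return "submitted"

    def complete(self, set_name):
        """Mark the running retry as done and start the next queued one, if any."""
        del self.in_flight[set_name]
        if self.waiting[set_name]:
            self.in_flight[set_name] = self.waiting[set_name].popleft()
            return self.in_flight[set_name]
        return None
```

Retries for different processing sets do not block each other in this model; only a second retry for the same set is held back until the first completes.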

10.4.1 Reading retry status for file discovery and publishing errors

When you retry errors generated during file publishing, only one error per workspace can be worked on at a time.

Any of the following values appear in the Status field on the processing set as those errors are being retried.

| Status | What it means |
|---|---|
| Waiting | The user executed a mass retry of discovery or publishing errors and the agent has not yet picked up the job. |
| Retry – Initializing | The agent picked up the job but hasn’t yet submitted it. |
| <error message> | The agent could not submit the job; the error message is shown here. |
| Retry – Discovering Files/Publishing Files | The agent submitted the job. |
| Retry – Published/Discovered; Retrieving errors | The processing server is done discovering or publishing files, but Relativity is not done retrieving errors (whether or not there are any). |
| Retry – Published/Updating tables | The processing server is done discovering or publishing files, and Relativity is updating tables. |
| Retry – Published/Discovered with errors | The processing server is done discovering or publishing files and Relativity is done retrieving errors, and there was at least one error. |
| Retry complete | The processing server is done discovering or publishing files and Relativity is done retrieving errors, and there were no errors. |

Note: Errors resulting from discovery remain in their current state while the user publishes files and are available for retry after publish is complete.

10.5 Unresolvable errors

Errors that can’t be retried can’t be resolved and are marked by a status of Unresolvable.

Note the following:

• Only job-level errors can reach a status of Unresolvable; document-level errors can’t be unresolvable.
• Not all job-level errors are unresolvable.

You can’t resolve an unresolvable error, but you can learn more about the cause of the error by opening it and referring to the Details field in the Processing Job Level Errors layout to see when it occurred and why.

11 Reports tab

In the Reports tab, you can generate reports in Relativity to understand the progress and results of processing jobs. You can't run reports on processing sets that have been canceled. When you generate a processing report, this information is recorded in the History tab.

In each report, the Processing Sets section at the end of the report identifies the sets the report was run for.

The following Report options are available:

• Data Migration - provides information on how data was migrated into Relativity, including details about excluded files and a summary of the number of starting files, published documents, and documents included in the workspace for each custodian associated with the selected processing sets. You can run this report on processing sets that have been published.

• Discovered Files by Custodian - provides information on the file types discovered during processing for the custodians associated with the selected processing sets. This report identifies the total processable and unprocessable file types discovered and breaks down the totals by custodian. You can run this report on sets that have been discovered or published.

• Discovered Files by File Type - provides information on the file types discovered during processing for the custodians associated with the selected processing sets. This report identifies the total processable and unprocessable file types discovered and breaks down the totals by file type. You can run this report on sets that have been discovered or published. See Supported file types on page 8 for a list of file types and extensions supported by Relativity for processing.

• Document Exception - provides details on the document level errors encountered during processing, broken down by those that occurred during the discovery process and those that occurred during the publishing process. You can run this report on sets that have been discovered or published.

• Job Exception - provides details on the job level errors encountered during processing. You can run this report on sets that have been discovered or published.

• Text Extraction - provides information, broken down by custodian and file type, on the number and percentage of published files that contain and don’t contain extracted text and the total number of files published into Relativity. This also provides details on error messages encountered during processing. You can run this report on sets that have been published.

Note: If you publish processing sets without mapping the Document Extension processing field, the Text Extraction report won't accurately report document counts by file type.

11.1 Generating a processing report

1. Navigate to the Reports tab.
2. From the Available Reports section, select the report type you want to generate. Click on the thumbnail image to view a larger sample report.
3. From the Processing Sets section, select the sets you want to run the report for, then click Generate Report.

The report displays in the right pane.

At the top of the report display, you have options to print or save the report. To save, select a file type at the top of the report.

Note: If you export a report as a PDF, and the web server you're logged in to does not have the font Arial Unicode MS Regular installed (regardless of whether the server the workspace resides on has this font installed), you see blocks in the generated PDF file if Unicode characters are used in the report. To resolve this issue, you can purchase and install the font separately, or you can install Microsoft Office to the web server, which installs the font automatically.

12 Managing the processing queue

You can manage jobs in the processing queue from the Queue Management tab in Admin mode. Use the processing queue to change the priority of a job or to cancel an imaging job.

Select the Processing Queue sub-tab to display the processing queue.

The following columns appear on the Processing Queue sub-tab:

• Workspace - the workspace in which the job was created. Click the name of a workspace to navigate to the main tab in that workspace.
• Set Name - the name of the processing set. Click a set name to navigate to the Processing Set Layout on the Processing Sets tab. From here you can cancel publishing or edit the processing set.
• Job Type - the type of job running. The processing server handles Processing and Imaging jobs.
• Status - the status of the set.
• Documents Remaining - the number of documents remaining for the set.

Note: This column can display -1 if you've clicked Discover Files or Publish Files but the job hasn't been picked up yet.

• Priority - the order in which sets in the queue are processed.
• Submitted Date - the date and time the job was submitted, based on local time.
• Submitted By - the name of the user who submitted the job.
• Server Name - the name of the server performing the job. Click a server name to navigate to the Servers tab, where you can view and edit server information.

At the bottom of the screen, the following buttons appear:

• Change Priority - change the priority of processing sets in the queue.
• Cancel Imaging Job - cancel an imaging job. Only imaging jobs can be canceled from the Processing Queue sub-tab. If you have processing jobs selected and you click Cancel Imaging Job, the processing jobs are skipped.

If you click Discover or Publish on the Home tab or from the Processing Set Layout, but then cancel the job before the agent picks it up, you can return to the set and re-execute the discovery or publish job.

13 Processing workers tab

The Processing Workers tab displays all processing server workers for the active processing servers in the environment. If the processing server is not installed, the tab will not appear.

Information in this tab will only display if there is an active processing server and if the server manager agent is running.

View the tab in Admin Mode by clicking the Processing Workers tab.

The workers view displays the following fields:

• Server Name - the server name.
• Name - the worker name.
• Thread Status - how many threads in that worker are currently in use.
• Worker Status

| Status | Color | Description |
|---|---|---|
| Running | Green | The worker is running or ready for work. |
| Stopped | Gray | The worker is stopped. |
| Stopping | Gray | The worker is finishing what it is working on and will stop afterward. |

• Time - the date and time that the worker was last queried for a status.
• Version - the version of the processing server that the worker is running on. If an upgrade fails, the version of these workers may be different than your server's version.
• Is licensed for processing - indicates whether the worker is licensed for processing. Workers that aren't licensed for processing will only perform imaging jobs.

13.1 Stopping or starting a worker

To manually stop or start a worker:

1. In Admin Mode, click the Processing Workers tab.
2. Select the check box next to one or more workers.
3. Click Change Worker Status.
4. Select Started or Stopped.
5. Click Update. The worker status change will be reflected.

13.2 Enabling or disabling a processing worker

To enable or disable processing licensing on a worker:

1. In Admin Mode, click the Processing Workers tab.
2. Select the check box next to one or more workers.
3. Click Change License Status.
4. Select Processing Enabled or Processing Disabled.
5. Click Update.

Note: To review your license terms, navigate to the License tab in Admin Mode.

14 Technical considerations for deduplication

Relativity performs the following functions for loose files and emails when running deduplication of files as part of a processing job.

14.1 Loose files

Relativity calculates the SHA256 hash in a standard way: all the bits and bytes that make the content of the file are involved in hash calculation. Metadata is excluded from the hash value for loose files. Relativity then compares this hash to other loose files to identify duplicates.

The following is the standard method for computing a checksum for large and small files:

1. Open the file.
2. Read 8k blocks from the file.
3. Pass each block into an MD5/SHA1/SHA256 collator, which uses the corresponding standard algorithm to accumulate the values until the final block of the file is read. The final checksum is derived.
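
The three steps above correspond to the usual incremental hashing pattern. The following is a minimal sketch using Python's standard hashlib module; the function name and the choice of a lowercase hex digest are assumptions made for illustration, not a description of the processing engine's code.

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # 8k blocks, as in step 2 above


def file_checksum(path, algorithm="sha256"):
    """Return the hex digest of a file's content, read in 8k blocks."""
    digest = hashlib.new(algorithm)  # accepts "md5", "sha1", or "sha256"
    with open(path, "rb") as handle:
        # Feed each block to the hash object until the end of the file.
        for block in iter(lambda: handle.read(BLOCK_SIZE), b""):
            digest.update(block)
    return digest.hexdigest()
```

Because only the file's bytes are fed to the hash, file system metadata such as the file name or modified date does not affect the result, which is consistent with the statement that metadata is excluded for loose files.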

14.2 Emails

The Processing engine generates four different SHA256 hashes:

• Body hash - takes the text of the body of the e-mail and generates a hash
• Header hash - takes the message time, subject, author’s name and e-mail, and generates a hash
• Recipient hash - takes the name and e-mail of each recipient and generates a hash
• Attachment hash - takes each SHA256 hash of each attachment and hashes the SHA256 hashes together

The following is the process for computing Email HeaderHash:

1. A Unicode string containing Subject<crlf>SenderName<crlf>SenderEMail<crlf>ClientSubmitTime is constructed
2. A SHA256 hash is derived from the above
3. ClientSubmitTime is formatted with: m/d/yyyy hh:mm:ss AM/PM
4. The following is a constructed string: RE: Your last email Robert [email protected]/4/2010 05:42:01 PM

The following is the process for computing Email RecipientHash:

1. A Unicode string is constructed by looping through each recipient in the email and inserting each recipient into the string
2. Once the loop completes, the SHA256 hash is computed from the string RecipientName<space>RecipientEMail<crlf>
3. The following is an example of a constructed recipient string of two recipients: Russell Scarcella [email protected] Vercellino [email protected]
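
The HeaderHash and RecipientHash string constructions above can be sketched as follows. This is an illustration under assumptions, not the engine's implementation: the function names are hypothetical, the sample names and addresses are placeholders, the hex digest is uppercased to match the style of the attachment example in this guide, and a trailing <crlf> is assumed after every recipient, including the last.

```python
import hashlib
from datetime import datetime

CRLF = "\r\n"


def sha256_of_text(text):
    """Hash a Unicode string via its UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest().upper()


def format_submit_time(moment):
    """Format as m/d/yyyy hh:mm:ss AM/PM without relying on the locale."""
    hour12 = moment.hour % 12 or 12
    suffix = "AM" if moment.hour < 12 else "PM"
    return (f"{moment.month}/{moment.day}/{moment.year} "
            f"{hour12:02d}:{moment.minute:02d}:{moment.second:02d} {suffix}")


def header_string(subject, sender_name, sender_email, submit_time):
    # Subject<crlf>SenderName<crlf>SenderEMail<crlf>ClientSubmitTime
    return CRLF.join(
        [subject, sender_name, sender_email, format_submit_time(submit_time)]
    )


def recipient_string(recipients):
    """recipients is an iterable of (name, email) pairs."""
    # RecipientName<space>RecipientEMail<crlf>, once per recipient
    return "".join(f"{name} {email}{CRLF}" for name, email in recipients)


header_hash = sha256_of_text(
    header_string("RE: Your last email", "Sender Name", "sender@example.com",
                  datetime(2010, 10, 4, 17, 42, 1))
)
recipient_hash = sha256_of_text(
    recipient_string([("First Recipient", "first@example.com"),
                      ("Second Recipient", "second@example.com")])
)
```

Any difference in subject, sender, submit time, or recipient order produces a different string and therefore a different hash.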

The following is the process for computing Email MessageBodyHash:

1. If the PR_BODY tag is present in the MSG, capture it into a Unicode string
2. If the PR_BODY tag is not present, get the native body from the PR_RTF_COMPRESSED tag and either convert the HTML or the RTF to Unicode text
3. Construct a SHA256 hash from the above string

The following is the process for computing Email AttachmentHash:

1. Compute the loose file standard SHA256 hash from each attachment
2. Encode the hash in a Unicode string as a string of hexadecimal numbers without <crlf> separators
3. Construct a SHA256 hash from the composed string
4. The following is an example of constructed string of two attachments:

80D03318867DB05E40E20CE10B7C8F511B1D0B9F336EF2C787CC3D51B9E26BC9974C9D2C0EEC0F515C770B8282C87C1E8F957FAF34654504520A7ADC2E0E23EA
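
The AttachmentHash steps above amount to hashing a concatenation of hashes. A minimal sketch follows; the function names are hypothetical, attachments are assumed to be available as in-memory bytes, and uppercase hex is assumed because the example string above is uppercase.

```python
import hashlib


def compose_attachment_string(attachments):
    """Concatenate the SHA256 hex digest of each attachment, with no separators.

    attachments is an iterable of bytes objects, one per attachment.
    """
    return "".join(
        hashlib.sha256(content).hexdigest().upper() for content in attachments
    )


def attachment_hash(attachments):
    """Hash the composed string of per-attachment hashes."""
    composed = compose_attachment_string(attachments)
    return hashlib.sha256(composed.encode("utf-8")).hexdigest().upper()
```

For two attachments, the composed string is 128 hexadecimal characters long, the same length as the example above. Because the digests are concatenated in order, the same attachments in a different order yield a different AttachmentHash in this sketch.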

In all email scenarios, the following is the process for deriving a SHA256 from a Unicode string:

1. The string is converted to a byte array of UTF8 values
2. The resulting array of bytes is fed to a standard SHA256 subroutine which computes the SHA256 hash of the UTF8 byte array
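
The two steps above map directly onto Python's standard library, as in this short sketch. The function name is an assumption for illustration.

```python
import hashlib


def sha256_from_unicode(text):
    """Derive a SHA256 hex digest from a Unicode string."""
    utf8_bytes = text.encode("utf-8")              # step 1: string to UTF-8 byte array
    return hashlib.sha256(utf8_bytes).hexdigest()  # step 2: standard SHA256 of the bytes
```

The encoding step matters: hashing the same text as UTF-16 bytes would give a different digest, so two systems only agree on a hash if both convert the string to UTF-8 first.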

Note: If two emails or loose files have an identical body, attachment, recipient, and header hash, they are duplicates.
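
The comparison rule in the note above, combined with the None, Global, and Custodial deduplication methods described for the processing set, can be sketched like this. The data shapes and names are hypothetical, and the sketch treats the first document seen as the original, which is an assumption this guide does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmailHashes:
    body: str
    header: str
    recipient: str
    attachment: str


def find_duplicates(documents, method="Global"):
    """Return the IDs of documents whose four hashes match an earlier document.

    documents is an iterable of (doc_id, custodian, EmailHashes) tuples.
    method is "None", "Global", or "Custodial".
    """
    if method == "None":
        return []
    seen = set()
    duplicates = []
    for doc_id, custodian, hashes in documents:
        # Custodial compares only within one custodian; Global across all.
        scope = custodian if method == "Custodial" else None
        key = (scope, hashes)
        if key in seen:
            duplicates.append(doc_id)
        else:
            seen.add(key)
    return duplicates
```

With three documents that share all four hashes, two of them under the same custodian, Global flags the second and third as duplicates while Custodial flags only the later document of the shared custodian.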

Proprietary Rights

This documentation (“Documentation”) and the software to which it relates (“Software”) belongs to kCura Corporation and/or kCura’s third party software vendors. kCura grants written license agreements which contain restrictions. All parties accessing the Documentation or Software must: respect proprietary rights of kCura and third parties; comply with your organization’s license agreement, including but not limited to license restrictions on use, copying, modifications, reverse engineering, and derivative products; and refrain from any misuse or misappropriation of this Documentation or Software in whole or in part. The Software and Documentation is protected by the Copyright Act of 1976, as amended, and the Software code is protected by the Illinois Trade Secrets Act. Violations can involve substantial civil liabilities, exemplary damages, and criminal penalties, including fines and possible imprisonment.

©2013. kCura Corporation. All rights reserved. Relativity® and kCura® are registered trademarks of kCura Corporation.