Module 8 -- Automatic Separation and Form ID; Recognition Profiles
Kofax Technical Training
Slide 1
Module 8: Automatic Separation and Form ID; Recognition Profiles
Slide 2
Automatic Separation and Form ID; Recognition Profiles
• Automatic separation and form identification
• Adding form types
• Adding sample pages
• Zone types
• Modifying recognition profiles
• Image cleanup profiles
• Copy and paste
• Rubber band OCR/ICR
Slide 3
Form Identification: Topology-based
An analysis of the sample page on a pixel-by-pixel basis results in a histogram. The histogram becomes a “fingerprint” for the sample page.
When documents are scanned, Kofax Capture compares them to the sample page(s) and tries to match the “fingerprints” using two criteria: Confidence and Difference.
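The matching step can be illustrated with a minimal Python sketch. This is not Kofax Capture's actual algorithm: the histogram-intersection score, the function names, and the threshold values are hypothetical, and reading Confidence as the best match score and Difference as the gap between the best and second-best scores is an assumption made for illustration.

```python
from collections import Counter


def fingerprint(pixels, bins=4, max_value=255):
    """Build a normalized intensity histogram (the "fingerprint") from a flat list of pixel values."""
    counts = Counter(min(p * bins // (max_value + 1), bins - 1) for p in pixels)
    total = len(pixels)
    return [counts.get(b, 0) / total for b in range(bins)]


def similarity(a, b):
    """Histogram intersection as a percentage; identical histograms score 100."""
    return 100.0 * sum(min(x, y) for x, y in zip(a, b))


def identify(page_pixels, samples, min_confidence=70.0, min_difference=10.0):
    """Return the best-matching form type, or None if the match is too weak or too ambiguous.

    samples maps a (hypothetical) form type name to the pixel values of its sample page.
    """
    page = fingerprint(page_pixels)
    scores = sorted(
        ((similarity(page, fingerprint(px)), name) for name, px in samples.items()),
        reverse=True,
    )
    best_score, best_name = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    # Assumed reading: Confidence = best score, Difference = lead over the runner-up.
    if best_score >= min_confidence and best_score - runner_up >= min_difference:
        return best_name
    return None
```

In this sketch a page that resembles two sample pages equally well is left unidentified, which mirrors the case where documents are too similar for topology alone to separate them.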
Slide 4
Automatic Separation and Form ID
• Kofax Capture then automatically performs document separation in the Recognition Server using the form identification method just described.
Select “Kofax page recognition” as the Separation and Form Identification method to make use of topology-based form identification and document separation in the Recognition Server.
Slide 5
Enhancing Automatic Separation and Form ID
• However, there are instances where the documents are far too similar and the standard topology-based form identification method does not yield consistently accurate document separation results.
• In these scenarios, Form Identification zones can be added and the Confidence and Difference settings can be altered (resulting in a custom Separation and Form Identification profile) to aid in document separation.
• This custom Separation and Form Identification profile works in conjunction with the Form ID zone(s).
Each Form Identification zone uses a Recognition Profile to identify a unique set of pixels (a form number, name, logo, or shape) and is linked to a specific form type.
Slide 6
Confidence and Difference Settings
The Confidence and Difference settings can be altered, resulting in a custom Separation and Form Identification profile.
Slide 7
Disabling Topology-based Form ID
• In extreme cases, the topology-based form identification can be completely disabled, relying entirely on one or more Form Identification zones for document separation.
• This is accomplished by creating a custom Separation and Form Identification profile where the Confidence and Difference sliders are both set to “zero” percent.
• Kofax Capture then uses only the Form Identification zones for performing document separation, enhancing accuracy.
NOTE: When using this technique, proper page registration (as discussed in Module 7) can be critical to consistently attaining the required form identification.
Slide 8
Custom Separation and Form ID Profile
Disable the topology-based Form ID by setting the Confidence and Difference sliders to 0%. This will cause Kofax Capture to perform automatic document separation based entirely on Form ID zones.
Slide 9
Renaming
Objects (batch classes, document classes, folder classes, form types, field types) can be renamed to eliminate confusion. In this case, a new form type is going to be added to this document class, so the batch class, document class and form type will be renamed.
To rename an object, right-click it and select “Rename”, or click the Rename icon in the ribbon.
Slide 10
Add Form Type
Right-click the newly renamed Catalog Orders document class and select “New Form Type” from the context menu. Give the new form type a name.
Slide 11
Image Tab
Permanent Image Cleanup profiles affect the final image output and are applied to the entire form (not just on a zone-by-zone basis).
Since index zones are going to be set up on this page, Kofax Capture automatically registers the page (so “Register all pages” does not need to be checked).
The Automatic Index Field Recognition Threshold slider may be modified as necessary.
Slide 12
Page Level Bar Code Tab
The bar code on the second form type is going to be read as an index zone, so there is no need to modify any settings on the Page Level Bar Code tab.
Click [OK] when settings are complete.
Slide 13
Add Sample Pages
Don’t forget: before adding a sample page, make sure the scanner properties match those of the production environment.
Click [Add Sample Pages].
Slide 14
Sample Page Ready
The sample page can be scanned or imported.
As before, the Sample Page icon appears associated with the form type and the image appears in the Viewer panel.
Slide 15
Zone Types
• Multiple zone types are available for defining locations and activities to be performed by Kofax Capture in production. Zones are selected and configured from the Insert tab and include:
• Data or Index
• Text Registration
• Shape Registration
• Separator
• Form Identification
• Group Zones
Slide 16
Data or Index Zones
• Data or Index zones are associated with fields
• Data may be manually entered by a Validation operator (like the State field)
• Or it can be automatically read using one of the recognition engines
• Fields may also be populated using a database lookup or via custom validation scripts
• The zone drawn also defines the field of view for the operator in the Validation and Verification queues
Slide 17
Text Registration Zones
• Registration zones use optical recognition technology to recognize the first four letters of a text string
• A registration point is placed at the bottom-left corner of the first character; registration points are the reference points from which zones are registered
• Registration points help to align pages, aiding zone positioning for accurate optical data extraction
Remember: Registration is the alignment or calibration of a page being imported or scanned to its sample page, helping to ensure that zones line up, with the goal of enhancing the accuracy of optical recognition operations. Registration can be enhanced by one or more Registration Zones (Text and/or Shape).
Slide 18
Shape Registration Zones
• One or more registration points based on Shape Registration zones may also be created for registration.
• Shape Registration zones use shapes like corners, line intersections, and BLOBs (Binary Large OBjects) to establish registration points.
Slide 19
Separator Zones
• Separator zones use optical recognition technology to recognize an alphanumeric string.
• Anytime that string is found in the specified location, document separation occurs.
• Separator zones are used when there is a single form type per batch class.
• Separator zones require that a custom Separation and Form Identification profile be set up at the batch class level.
• Only one Separator zone is allowed per batch class.
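The separation rule above can be sketched in a few lines of Python. This is an illustration only, not Kofax Capture's implementation: the function name, the page representation (a page identifier paired with the text already read from the separator zone), and the exact-match comparison are all assumptions.

```python
def separate(pages, separator_text):
    """Split a batch into documents.

    pages is a list of (page_id, zone_text) pairs, where zone_text is the
    (hypothetical) recognition result for the separator zone on that page.
    A page whose zone text matches separator_text starts a new document.
    """
    documents = []
    for page_id, zone_text in pages:
        # Start a new document on a match, or for the very first page of the batch.
        if zone_text.strip() == separator_text or not documents:
            documents.append([])
        documents[-1].append(page_id)
    return documents
```

For example, a five-page batch in which pages 1 and 3 carry the separator string yields two documents: pages 1 and 2, then pages 3 to 5.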
Slide 20
Form Identification Zones
• Like Separator zones, Form Identification zones use optical recognition technology to recognize an alphanumeric string.
• Upon recognition, the zone can:
• Identify the form type
• Separate the document
• Designed for use with multiple form types in the batch class.
Slide 21
Group Zones
• Group zones allow the collection of multiple values and return the best single value (like a check box) or concatenate the multiple values (like our First Name/Last Name below) into a single field value.
• All zones within a Group zone, as well as the Group zone itself, are associated with a common Field.
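The two ways a Group zone can produce a single field value can be sketched as follows. The function names and the (value, confidence) representation of member zones are hypothetical, and treating the “best” value as the one with the highest recognition confidence is an assumption, not a statement of how Kofax Capture selects it.

```python
def best_value(members):
    """Return the value of the member zone with the highest confidence (check-box style group).

    members is a list of (value, confidence) pairs, one per member zone.
    """
    return max(members, key=lambda member: member[1])[0]


def concatenated_value(members, separator=" "):
    """Join the member values in zone order into one field value (First Name + Last Name style group)."""
    return separator.join(value for value, _ in members)
```

With members ("Jane", 0.9) and ("Doe", 0.8), concatenation gives the single field value "Jane Doe".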
Slide 22
Creating a Group Zone
1. Define each included member zone first.
2. Select the Group Zone icon from the Insert tab.
3. Draw a box around the group with the mouse.
The group zone has a magenta frame. This magenta bounding box defines the area of the image that the Validation and Verification operators will view.
Slide 23
Group Zone Properties
Configure the properties for the group zone.
Slide 24
Zones are Indicated on the Sample Page
Slide 25
Selecting Recognition Profiles
Recognition profiles use recognition engines. Kofax profiles may be edited and saved under a new name to use whenever needed.
If Auto recognition is enabled, the zone will be processed using a Recognition profile.
Slide 26
Modifying Recognition Profiles
• Remember: Kofax-supplied profiles can be used as-is, or they can be modified and saved as a new custom profile.
• When a profile is modified, it becomes a new custom profile that gets added to the pool of profiles from which to select when defining optical recognition operations.
• Changing the profile in one instance will affect the behavior of that profile in all instances of its use.
Slide 27
Recognition Profile Settings
• Recognition Profiles contain settings for maximizing data extraction.
• For example, a choice of different languages is available…
• Or the print type may be selected…
• Or the content and character set…
Notice that [Save] is disabled, but [Save As] is available.
Remember: Kofax profiles cannot be overwritten; they can only be modified and then saved as new custom profiles.
Slide 28
Image Cleanup Profiles
• Every recognition profile may use an image cleanup profile.
• This image cleanup is applied to the individual zone using the recognition profile, so the changes are temporary and do not affect the image output at export.
• Zonal image cleanup increases accuracy for automatic recognition.
• Image cleanup profiles propagate upward and can be shared by recognition profiles.
Select [Edit] to modify the Image Cleanup settings.
Slide 29
Black Border Removal & Cropping
Black border removal gets rid of black borders around white documents. The white noise gap is the number of pixels Kofax Capture must encounter before it stops border removal, on the assumption that it has reached the form itself.
Please refer to the Reference Section of your course material for examples of image cleanup.
Slide 30
Deshade
Deshading removes areas of shading on a document. Set the minimum width and height for an area of shading and tell Kofax the maximum size of a “speckle.” Remember, to Kofax, speckles (black and white pixels) make up what we see as shading.
Notice that a choice of measurement units is available: English (inches), Metric (centimeters) or Pixels.
Slide 31
Deskew
Deskewing straightens crooked documents, up to 12 degrees. Documents skewed by more than 12 degrees must be rescanned or processed with VRS, which can deskew up to 45 degrees.
Slide 32
Despeckle
Speckles up to 32x32 pixels can be removed through the despeckle settings.
Slide 33
Line Removal
Horizontal and vertical lines may be eliminated by defining what a line is (and isn’t). In the top example, any horizontal line shorter than 200 pixels (1 inch at 200 dpi) is NOT removed. Maybe it’s the crossbar on the letter “T.”
The maximum gap in a line that Kofax allows before it stops removing the line can be specified, and “Edge clean factor” can be set to smooth jagged lines.
“Repair character” fixes characters that have been damaged by line removal, like when someone doesn’t stay between the lines when they fill out a form.
Slide 34
Streak Removal
Vertical streaks may be caused by errant pixels being held in memory too long during the scan operation. A streak width of up to 10 pixels can be specified.
Slide 35
Edge Enhancement
Edge enhancement filters are either ON or OFF. There are no user-definable properties. Look at the word “Widgets” in the reference material provided for an example of what each of these filters does.
Slide 36
Selecting Units of Measure
It is possible to start with inches, but change to pixels.
A built-in conversion calculator is available.
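The arithmetic behind such a conversion is simple, as in the Line Removal example where 1 inch at 200 dpi equals 200 pixels. The following is a minimal sketch of that calculation; the function name and unit labels are hypothetical and are not part of Kofax Capture.

```python
CM_PER_INCH = 2.54


def to_pixels(value, unit, dpi):
    """Convert a measurement in inches, centimeters, or pixels to whole pixels at the given resolution."""
    if unit == "pixels":
        return round(value)
    if unit == "inches":
        return round(value * dpi)
    if unit == "centimeters":
        return round(value / CM_PER_INCH * dpi)
    raise ValueError(f"unknown unit: {unit}")
```

Because the result depends on resolution, the same 1-inch minimum line length corresponds to 200 pixels at 200 dpi but 300 pixels at 300 dpi.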
Slide 37
Testing an Image Cleanup Profile
Image cleanup settings can be tested from this window.
Slide 38
Image Cleanup Test Results
Horizontal and Vertical line removal working together
View of image without image cleanup
Results of recognition operation after image cleanup and extraction
View of image with image cleanup applied
Slide 39
Save As to Create Image Cleanup Profile
Select [Save As] to name the profile, and then [OK], and finally [Close] to exit.
Slide 40
Save As to Create Recognition Profile
Note that the image cleanup profile propagates upwards and constitutes a change to the recognition profile. So select [Save As] to name the new recognition profile, and then [Close] to exit.
Slide 41
Image Cleanup Profile Re-use
Remember: Image cleanup profiles propagate upward and can be shared by recognition profiles.
Slide 42
Recognition and Image Cleanup Profile Re-use
Look for the opportunity to reuse image cleanup and recognition profiles.
Remember: Changing the profile in one instance will affect the behavior of that profile in all instances of its use.
Slide 43
Copy and Paste – Objects Setup
Use Copy to duplicate an existing object (batch class, document class, folder class, or form type). The copy method is useful for creating a new object that is similar to an existing one.
Highlight the object and select [Copy].
Slide 44
Paste Option
When the Paste option is selected, a copied batch class is given a different name (such as Name1) with links to the existing document class(es).
[Diagram: the copied batch class links to the original document classes.]
Slide 45
Paste Special Option
Paste with a copy of the original Document Classes: a copy of all document class(es) and their related objects will be created with unique name(s) and associated with the copied batch class.
Paste with a link to the original Document Classes: a copy of the batch class will be created, linked to the original document classes (same as the Paste option).
[Diagram: original and copied batch classes and document classes.]
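The difference between linking and copying can be illustrated by analogy with shallow and deep copies in Python. This is an analogy only, not how Kofax Capture stores batch classes: the dictionary layout, the function names, and the “1” suffix applied to copied document class names are assumptions.

```python
import copy


def paste(batch_class):
    """Paste (or Paste Special with a link): new batch class that shares the original document classes."""
    pasted = copy.copy(batch_class)  # shallow copy: the document class list is the same object
    pasted["name"] = batch_class["name"] + "1"
    return pasted


def paste_special_copy(batch_class):
    """Paste Special with a copy: the document classes and their related objects are duplicated too."""
    pasted = copy.deepcopy(batch_class)  # deep copy: nothing is shared with the original
    pasted["name"] += "1"
    for document_class in pasted["document_classes"]:
        document_class["name"] += "1"  # hypothetical way of making the copied names unique
    return pasted
```

With a linked paste, a form type added through the pasted batch class also appears under the original, because both refer to the same document class. With a copied paste, the two can be changed independently.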
Slide 46
Copy and Paste – Zones
Zones can also be copied and pasted for ease and speed of administration.
Highlight a zone, right-click, and select “Copy”.
Slide 47
Paste New Zones
Right-click and select “Paste”.
Slide 48
New Zone and Index Field are Created
A new zone is created and can be placed as desired on the sample page.
A new index field is created in the document class.
Slide 49
Rubber Band OCR/ICR Setup
Rubber band OCR/ICR is set up in the Administration module as a Batch Class Advanced property.
Select a recognition profile based on one of the following profiles (this selection applies only to fields that are not already set up for automatic recognition):
• Kofax High Performance OCR Zonal
• Kofax High Performance ICR Zonal
• Kofax Advanced OCR Zonal
• Kofax ICR Zonal
• A custom profile based on one of the above profiles
Slide 50
Rubber Band OCR/ICR in Validation & Verification
Step 1: The Validation and/or Verification operator selects the rubber band OCR/ICR icon from the Home tab. The cursor changes shape.
Slide 51
Rubber Band OCR/ICR – Step 2
Step 2: With the new cursor icon, the operator draws a zone around the data to be read and extracted
Slide 52
Rubber Band OCR/ICR – Step 3
Step 3: The index field is populated with the value read.
Remember: This feature first has to be set up in the Administration module (Batch Class Advanced properties).
Slide 53
Demonstration
Adding form types and sample pages, setting up zones, modifying image cleanup and recognition profiles, copy and paste, and rubber band OCR
Slide 54
Lab
• Adding form types and sample pages
• Setting up zones
• Modifying image cleanup and recognition profiles
• Copy and paste
• Rubber band OCR
Refer to the Kofax Capture 10 Lab and Reference Guide