Kofax Transformation Modules – Advanced Track & What’s new in KTM
Technical Track
Bo Jin, Sr. Solution Architect
Fatih Karaoglu, Sr. Solution Architect
Agenda
Clustering Utility
Benchmarking
Separation
Classification
Extraction
Project Merge Tool
Localisation
Thin Client Enhancements
Q&A
Technology Enhancements: Clustering Utility
Productivity Enhancements – Design Time: Benchmarking, Project Merge Tool
Productivity Enhancements – Users: Localisation, Thin Client Enhancements
Clustering Utility
Technology Enhancements
New Utility for Clustering Unknown Documents
What it does
Requirements
Step-by-step
Importing into KTM
What does the Kofax Clustering Utility do?
When configuring KTM content classification, the customer needs to provide samples for each class.
What KTM requires: [figure]
What customers usually provide: [figure]
Pre-sorts a document set into clusters of similar documents
User labels some of these clusters
Utility learns from the labeling and pre-sorts again
Several iterations of labeling and pre-sorting
Export of the sorted documents as a learn set for KTM content classification
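The label-and-pre-sort loop above can be illustrated with a toy sketch. The slides do not describe the utility's actual algorithm, so everything here is an assumption: a bag-of-words representation, cosine similarity, and assignment to the most similar user-labeled document. The function names and the threshold are hypothetical.

```python
from collections import Counter
import math

def vectorize(text):
    """Bag-of-words term counts for one document's text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def presort(docs, labeled, threshold=0.3):
    """Assign each unlabeled document to the class of its most similar labeled document.

    docs: {doc_id: text}; labeled: {doc_id: class_name} confirmed by the user.
    Documents that do not exceed the (assumed) threshold stay undiscovered (None).
    """
    vectors = {doc_id: vectorize(text) for doc_id, text in docs.items()}
    result = {}
    for doc_id, vec in vectors.items():
        if doc_id in labeled:
            result[doc_id] = labeled[doc_id]
            continue
        best_class, best_sim = None, threshold
        for seed_id, class_name in labeled.items():
            sim = cosine(vec, vectors[seed_id])
            if sim > best_sim:
                best_class, best_sim = class_name, sim
        result[doc_id] = best_class
    return result

docs = {
    "a": "invoice number total amount due",
    "b": "invoice total amount vat",
    "c": "dear sir complaint about delivery",
    "d": "complaint letter late delivery",
}
# Iteration 1: the user has labeled one document.
first = presort(docs, {"a": "Invoice"})
# Iteration 2: the user labels one more, and the set is pre-sorted again.
second = presort(docs, {"a": "Invoice", "c": "Complaint"})
```

After the first pass only the invoice-like document is picked up; after the second label is added, the remaining complaint letter is sorted as well, which is the effect of the repeated iterations.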
New KTM project:
Customer uses the Utility to provide KPSG or a partner with sorted documents
KPSG or a partner uses the Utility to sort documents from the customer
Understanding which are the largest subsets of documents in a customer’s monthly mailroom volume
Enhancing a KTM project:
Customer adds new classes to the project and needs samples for classification
Requirements
Kofax Clustering Utility works with XDocuments
XDocuments must be created with the KTM OCR Server tool
KTM (5.5 SP2) must be installed to use the Clustering Utility
Using the KTM OCR Server reduces the KTM base volume count
Eval licenses are supported
Hardware requirements are the same as for KC/KTM
Files to be clustered should be local for performance
Write access to the file location is needed
Step by Step – KTM OCR Server
Configuring the KTM OCR Server:
Select the path to the unsorted images
Enable "Save XDoc files" and "Save text files"
Under OCR Settings, select the proper language
Leave the rest at default
Running the KTM OCR Server:
Simply press the Start button
Step by Step – Kofax Clustering Utility
1. Import
Point "Import directory" to the same directory as the unsorted documents
For each document, an .xdc file and a .txt file must exist
Select "Start Discovery"
This takes a while, about 0.5 sec per document
Converts XDocs into an internal format
Identifies initial clusters
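Before starting the import, it can help to check that every image really has its .xdc and .txt companion. The sketch below is not part of the utility. It assumes that the companion files share the image's base name, sit in the same directory, and that images are TIFF files; the function names are hypothetical.

```python
from pathlib import Path

def missing_companions(import_dir, image_suffixes=(".tif", ".tiff")):
    """Return {image file name: [missing suffixes]} for images lacking .xdc or .txt."""
    missing = {}
    for image in sorted(Path(import_dir).iterdir()):
        if image.suffix.lower() not in image_suffixes:
            continue  # skip the companion files themselves and anything else
        absent = [s for s in (".xdc", ".txt") if not image.with_suffix(s).exists()]
        if absent:
            missing[image.name] = absent
    return missing

def estimated_discovery_seconds(document_count, seconds_per_document=0.5):
    """Rough import duration based on the ~0.5 sec per document figure from the slide."""
    return document_count * seconds_per_document
```

An empty result from `missing_companions` means the directory is ready; for 1,000 documents the estimate comes to roughly 500 seconds.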
2. Discovery
Label the initial 3 clusters
You see the most representative document of each cluster
Provide a name for each cluster; it will be used as the class name in KTM
You can stop discovery when 80-90% of the documents are discovered, or continue until all documents are discovered
At 80-90% the most common document types are often known; the remaining documents are likely in very small clusters
Click "Review" to continue to the next step
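Two ideas from the Discovery step can be made concrete: the "most representative document" of a cluster and the 80-90% stopping point. The slides do not say how the utility computes either, so this sketch assumes a simple word-overlap (Jaccard) similarity and picks the medoid, the document most similar to all others in its cluster. All names are hypothetical.

```python
def jaccard(a, b):
    """Word-set overlap between two texts, from 0.0 to 1.0."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0

def most_representative(cluster):
    """Return the id of the document with the highest total similarity to the rest.

    cluster: {doc_id: text}
    """
    return max(
        cluster,
        key=lambda doc_id: sum(
            jaccard(cluster[doc_id], text)
            for other_id, text in cluster.items()
            if other_id != doc_id
        ),
    )

def discovered_share(assignments):
    """Fraction of documents that already carry a label (None = undiscovered)."""
    if not assignments:
        return 0.0
    labeled = sum(1 for label in assignments.values() if label is not None)
    return labeled / len(assignments)
```

With `discovered_share` at 0.8 or above, the remaining documents are the ones the slide describes as likely sitting in very small clusters.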
3. Review
Sort by categories (labels)
Examine the categories for consistency
Confirm some documents if you want to cluster again
4. Export
Select any directory for export
Subdirectories will be created for each category/label
.txt files (and tif/xdoc for reference) will be exported; only the .txt files are used later to train KTM content classification
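The export layout described above, one subdirectory per category, can be sketched as follows. This only mimics the described result and is not the utility's export code. It assumes files are named by a shared base name, and the real export may add further levels such as the "Discovered documents" subdirectory. The function name is hypothetical.

```python
import shutil
from pathlib import Path

def export_learn_set(source_dir, export_dir, labels, reference_suffixes=(".tif", ".xdc")):
    """Copy each labeled document into export_dir/<label>/ and return the class names.

    labels: {base file name without suffix: category label}
    The .txt file is required; reference files are copied only if present.
    """
    source, target = Path(source_dir), Path(export_dir)
    for stem, label in labels.items():
        class_dir = target / label
        class_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source / f"{stem}.txt", class_dir / f"{stem}.txt")
        for suffix in reference_suffixes:
            reference = source / f"{stem}{suffix}"
            if reference.exists():
                shutil.copy2(reference, class_dir / reference.name)
    return sorted(p.name for p in target.iterdir() if p.is_dir())
```

Each returned directory name corresponds to one class that would later be created from the exported learn set.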
Importing into KTM
In Project Builder, point the Content Classifier settings in the New Project dialog to the exported directory
Select the "Discovered documents" subdirectory
A class is created in Project Builder for each category
Training documents are imported
Select "Train" in the Project Builder main menu
Verify in the Classification Benchmark (Result Matrix)
Setting this up manually and finding/organizing the proper training documents takes hours or days. With the Kofax Clustering Utility, this example took 20 minutes.
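For readers who have not used the Result Matrix, the following sketch shows the general idea, assuming it is read like a confusion matrix of expected versus assigned classes. That reading, the function names, and the sample data are assumptions and are not taken from Project Builder.

```python
from collections import Counter

def result_matrix(expected, predicted):
    """Count (expected class, predicted class) pairs over a test set."""
    return Counter(zip(expected, predicted))

def accuracy(matrix):
    """Share of documents whose predicted class equals the expected class."""
    total = sum(matrix.values())
    correct = sum(n for (exp, pred), n in matrix.items() if exp == pred)
    return correct / total if total else 0.0

expected = ["Invoice", "Invoice", "Complaint", "Complaint"]
predicted = ["Invoice", "Complaint", "Complaint", "Complaint"]
matrix = result_matrix(expected, predicted)
```

Off-diagonal cells such as `("Invoice", "Complaint")` point to classes whose training samples may need another look.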
Benchmarking
Productivity Enhancements – Design Time
KTM 5.5 – Benchmarking
Separation Benchmarking
Classification Benchmarking
Extraction Benchmarking
KTM 5.5 – Separation Benchmarking
Separation Benchmark
Document Separation Test
Golden Batch
Golden Files – Extraction Benchmarking
How can a Golden Batch be created?
Kofax Capture (before Export Connector)
KTM Project Builder
Separation Benchmark – Quality?
Correct Documents
Rejected Documents (Document Review...?)
Incorrect Documents: incorrectly classified, additional splits, or missed splits, but confidently
False Positive: the worst of all three categories
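The three outcome categories and the two kinds of split error can be expressed as a small comparison against the Golden Batch. This is only a sketch of the idea: it assumes a batch is described by the page indices at which a new document starts, and that "rejected" means the project was not confident about its own result. All names are hypothetical.

```python
def compare_splits(golden_starts, predicted_starts):
    """Compare document start pages of a separated batch against the Golden Batch."""
    golden, predicted = set(golden_starts), set(predicted_starts)
    return {
        "correct": sorted(golden & predicted),
        "additional": sorted(predicted - golden),  # split where the Golden Batch has none
        "missed": sorted(golden - predicted),      # Golden Batch split that was not made
    }

def document_status(matches_golden, confident):
    """Sort one document into the three categories from the slide."""
    if not confident:
        return "rejected"  # assumed to be caught by a manual review step
    return "correct" if matches_golden else "incorrect"  # incorrect = false positive
```

Only the confident-but-wrong case goes unnoticed at runtime, which is why the slide calls it the worst of the three.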
KTM 5.5 – Classification Benchmarking
Classification Benchmark
KTM 5.5 – Extraction Benchmarking
Extraction Benchmark
EV = Extracted Value, GFV = Golden File Value (perfect file)
EV = GFV: Work
EV ≠ GFV: Work
EV ≠ GFV: False positives
EV = GFV: Super
Project quality
Project design
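The four EV/GFV outcomes can be read as a two-by-two grid. The slide does not name the second axis, so this sketch assumes it is whether the field was flagged for manual validation: the two "Work" cells are flagged fields, the other two passed automatically. That assumption, the function name, and the label wording beyond "work", "false positive", and "super" are hypothetical.

```python
def extraction_outcome(extracted, golden, flagged_for_validation):
    """Classify one field result against its Golden File value."""
    matches = extracted == golden
    if flagged_for_validation:
        # Both cases cost operator time; only the second one was needed.
        return "work (value was correct)" if matches else "work (value was wrong)"
    # Passed without review: either ideal or an undetected error.
    return "super" if matches else "false positive"
```

Under this reading, fewer false positives and less work on already correct values are the two things a benchmark run is used to improve.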
Extraction Benchmark - Comparison
Extraction Benchmark - Enhancements
Selection List
Sorting: by column content, by status
Open in Document Viewer
Re-arrange columns
Project Merge Tool
Productivity Enhancements – Design Time
Multiple Users – One Project
KTM 5.5 – Project Merge Tool
Project Master
Copy the Project Master for each additional user
Project Master
Copy 1
Copy 2
Merge Copy 1
Source and Destination projects
Select Classes
Summary
Save changes to destination project (Project Master)
Merge Copy 2
Source and Destination projects
Select Classes
Summary
Save changes to destination project (Project Master)
Project Master after merging
Elements that can be merged...
Classes
Fields
Locators
Validation Rules
Script
Localization
Validation Forms
Elements
Summary
Save changes
The merged project
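The merge workflow (choose source and destination, select classes, review a summary, save) can be pictured with a small data-structure sketch. It says nothing about how the Project Merge Tool works internally. The dictionary layout, the rule that a selected class replaces a class of the same name in the destination, and the function name are all assumptions made for illustration.

```python
import copy

def merge_classes(master, source, selected):
    """Return (merged project, summary) without modifying either input.

    master, source: {class name: {element type: definition}}
    selected: class names chosen from the source project
    """
    merged = copy.deepcopy(master)
    summary = {"added": [], "replaced": []}
    for name in selected:
        summary["replaced" if name in merged else "added"].append(name)
        merged[name] = copy.deepcopy(source[name])
    return merged, summary

master = {"Invoice": {"Fields": ["Number"]}}
copy_1 = {"Invoice": {"Fields": ["Number", "Total"]}, "Order": {"Fields": ["OrderNo"]}}
merged, summary = merge_classes(master, copy_1, ["Invoice", "Order"])
```

Returning the summary separately mirrors the review step that precedes saving to the Project Master.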
Localisation
Productivity Enhancements – Users
KTM 5.5 – Localisation
KTM Languages
English
German
Additional KTM Languages
#   Language Pack   Language ID
1   Brazilian       pt-BR
2   Chinese         zh-CN
3   Czech           cs
4   French          fr
5   Italian         it
6   Japanese        ja
7   Polish          pl
8   Russian         ru
9   Spanish         es
10  Swedish         sv-SE
Graphical User Interface
Project Builder and runtime modules
Component-based messages
KTM Server
Documentation (runtime modules and Userguide.pdf)
Runtime modules: 1. Document Review, 2. Correction, 3. Validation, 4. Verification
Project Settings - Localization
.NET concept
Primary language
Secondary language
English: en
English (United Kingdom): en-GB
English (United States): en-US
Fall back principle
Localise
Primary – Secondary language translation? Yes: use translation value for display name. No: check the primary language.
Primary language translation? Yes: use translation value for display name. No: use default value for display name.
End
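The fall back decision can be written as a short lookup. This is a sketch of the principle only, not KTM's implementation. The function name, the translation dictionary, and the sample values are hypothetical.

```python
def display_name(translations, active_language, default):
    """Resolve a display name using the fall back principle.

    translations: {language ID: translated display name}
    active_language: a language ID such as "en-US" or "de"
    default: the default value stored in the project
    """
    if active_language in translations:       # primary-secondary, e.g. "en-US"
        return translations[active_language]
    primary = active_language.split("-")[0]   # "en-US" -> "en"
    if primary in translations:               # primary language only
        return translations[primary]
    return default                            # no translation found

translations = {"de": "Rechnungsnummer", "en-US": "Invoice No."}
```

With these sample values, a user running under `de-CH` gets the German translation through the primary language, while a French user sees the default value.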
KTM GUI Language, Server and Active Language
The Project.ActiveLanguage overrides the Region and Language settings
Summary
KTM Graphical User Interface language
KTM Server language
Project language (Project.ActiveLanguage)
What can be localised?
KTM Element, Yes/No, Note (the Yes/No marks are not preserved)
Fields
Table Columns
Formatting Methods: component messages used
Validation Methods: Regular Expression only; component messages used
Validation Form: tab captions, field label, simple label, button captions, DB button captions, group captions
Script Resources
Fields
Tables
Project – Script Resources
Project.Resources.GetString("Error_Example")
KTM project folder structure
Default language in *.fpr file
Additional languages
Document Review
Default language
Localised languages
Localisation.xml
External editor
Language ID
Example: Field • Default value • Localised translation
XML Update
Project design language
Thin Client Enhancements
Productivity Enhancements – Users
New and Improved Functionality Inside KTM TC 5.5
Validation Form Layouts
Annotations
Additional Batch Editing Operations
User Settings
Advanced Login Capabilities
Combo-boxes With Descriptions
Combo-boxes Inside Tables
Other “Small” Things
KTM TC 5.5 Improvements
Support Validation Form Layouts
Different font types and sizes
Mini-viewers
Custom buttons
Location of fields
Anchoring
Layout localization
Support Annotations
Display annotations created by KTM modules
Create new annotations inside Thin Client
Edit annotations
Delete annotations
Move annotations
Hide/Display annotations
Additional Batch Editing Operations
Delete pages
Move, merge, delete documents
Move, merge, delete, split, create folders
Preserve User Settings
User name at login screen
Batch Open dialog box: size, columns, sorting settings
Panels: size, expanded states
Zoom settings: fit width, fit height, custom zoom
Annotation settings: hide/display annotations
Advanced Login Capabilities
Domain login for linked users
Single sign-on support for Active Directory users
Combo-boxes Inside Tables, Items With Descriptions
Display descriptions, values or both
Support empty strings consistently for all combo-boxes
Paging control for over 100 items
Type-ahead filtering capabilities
New script events to initialize scripted combo-boxes
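Type-ahead filtering combined with paging can be sketched as below. This is not Thin Client code. It assumes a case-insensitive substring match on value or description and a page size of 100, chosen because the slide mentions a paging control for more than 100 items. The function name is hypothetical.

```python
def filter_items(items, typed, page=0, page_size=100):
    """Return (items on the requested page, number of pages) for the typed text.

    items: list of (value, description) pairs shown in the combo-box
    """
    needle = typed.lower()
    matches = [
        (value, description)
        for value, description in items
        if needle in value.lower() or needle in description.lower()
    ]
    pages = max(1, -(-len(matches) // page_size))  # ceiling division, at least one page
    start = page * page_size
    return matches[start:start + page_size], pages

items = [(f"C{i:03d}", f"Customer {i}") for i in range(250)]
```

Typing narrows the list first, so paging is only needed while the remaining matches still exceed one page.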
Other “Small” Things…
Batch loading performance improvements (project caching)
PDF support
Reject/Unreject documents – support scripting on the server
Allows installing Thin Client Server on top of a previous version
Propagate user changes in config files to a new version
Q&A
For further information, please contact:
Fatih Karaoglu
Sr. Solution Architect
Phone: +41 41 799 82 36
Email: [email protected]
Bo Jin
Sr. Solution Architect
Phone: +41 41 799 82 30
Email: [email protected]