Jun 29, 2015
Copyright © gillc S.A. 2009. All rights reserved
Financial Services Advanced Analysis Insight and Intelligence
ColumboDiscovery™
September 2010
Contents
Information Challenges
Our Approach to Information Discovery
Columbo® Information Discovery Platform
Entity Extraction and Theming
Shortcomings of existing software
Dynamic Linking
Financial Applications
Summary and benefits
Information Challenges
Massive data volumes
• Petabytes+ of data (1 petabyte = 1,000 terabytes = approx. 3,000 million documents)
• Forensic analysis of hundreds of devices on large investigations
• Billions of transaction records
Diverse sources
• The Internet – www, blogs, Twitter, social networks, virtual worlds, chat-rooms
• Internal – e-mail, office systems, CRM and analyst reports
• Trading systems, accounts, card processing, transactions and billing
• FSA, watch-lists, stolen cards
• Computers, storage devices, mobile phones, cameras, sat-navs, Wi
Integration of data in multiple formats
• Structured, unstructured (text), multi-media (image, voice, video)
• Deleted and hidden
• Languages / alphabets
Dangers and shortcomings of search
• Search engine issues – ranking, relevance etc.
• Terminology, expert knowledge of subject…
• Can distort investigative approach
• Spellings / misspellings
Some spelling challenges
Mohammed, Muhammad, Mohammad, Muhammed, Mohamed, Mohamad, Mahammed, Mohammod, Mahamed, Muhammod, Muhamad, Mohmmed, Mohamud, Mohammud
Hussein, Hussain, Husain, Husayn, Husein, Husen, Huseyin, Hussayn
112 different combinations!
Merrill Lynch, Merril Lynch, Merill Lynch, Merall Lynch, Meryll Lynch, Merrill Linch, Merril Lynche, Merill Linch, Merall Lynnch, Meryll Lynsh, MerrillLynch, Merrill-Lynch, Merrill.Lynch, Merrill Lynch)a, Merrill Lynch-, Merrill Lynchs
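The figure of 112 appears to be the number of ways the two personal-name lists on this slide can be paired: 14 spellings of the first name times 8 spellings of the surname. The short Python sketch below checks that arithmetic. The variable names are illustrative only and are not part of any Columbo® interface.

```python
from itertools import product

# Spelling variants copied from the slide.
first_names = [
    "Mohammed", "Muhammad", "Mohammad", "Muhammed", "Mohamed", "Mohamad",
    "Mahammed", "Mohammod", "Mahamed", "Muhammod", "Muhamad", "Mohmmed",
    "Mohamud", "Mohammud",
]
surnames = [
    "Hussein", "Hussain", "Husain", "Husayn", "Husein", "Husen",
    "Huseyin", "Hussayn",
]

# Every first-name spelling paired with every surname spelling.
full_names = [f"{first} {last}" for first, last in product(first_names, surnames)]

print(len(first_names), "x", len(surnames), "=", len(full_names))
```

An exact-match search for one spelling would therefore miss the other 111 variants of the same name.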
Our Approach to Information Discovery
A key difference is that our solutions help users understand and explore the content:
• We get the data to tell us what is there, rather than just looking for something specific
• We don’t search for the needle hidden in the haystack – we remove the hay and find the needle, together with whatever else might be there
We identify entities, themes (subjects), links – in most cases automatically
• People, Places, Objects, Account Numbers, Telephone Numbers, Credit Cards, Companies etc.
• Themes, Concepts, Sentiment
• Hard and soft (weak) links between Entities and Themes
We present this in ways that help users understand and explore (discover) the data
• Entity / Theme Extractions
• Summaries
• Timelines and Graphs / Connection and Relationship Diagrams / Geo-location Maps
Intelligent search
• Prompted
• Sounds like / spelt like
• Semantic (find similar content to this)
Automate processes including reports, where possible
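As an illustration of what "sounds like / spelt like" searching can mean in practice, here is a minimal Python sketch. The slides do not say how Columbo® implements these options, so the sketch swaps in two well-known generic techniques: the classic American Soundex code for "sounds like", and the standard-library `difflib` similarity ratio for "spelt like". The function names `soundex` and `spelt_like` and the cutoff value are assumptions made for this example.

```python
import difflib

# Soundex digit groups; vowels, y, h and w carry no digit.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code for a name."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


def spelt_like(query, known_names, cutoff=0.75):
    """Return known names whose spelling is close to the query (hypothetical helper)."""
    return difflib.get_close_matches(query, known_names, n=5, cutoff=cutoff)


variants = ["Mohammed", "Muhammad", "Mohamad", "Mahamed", "Mohmmed", "Mohamud"]
print({name: soundex(name) for name in variants})
print(spelt_like("Merril Linch", ["Merrill Lynch", "Morgan Stanley"]))
```

All six name variants share one Soundex code, so a phonetic index would group them under a single key. A production system would likely need language-aware phonetic rules rather than plain Soundex, which was designed for English surnames.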
Columbo® Discovery - gathering
Powered by Columbo®
• Scanned Text (OCR)
• ISPs, Telcos
• Forensic Images
• Images, Video, Audio and voice
• Lotus Notes
• Documents
• RSS feeds
• Blogs, WWW, International, National and local news sites
• Accounts
• Structured: Excel, CSV
• Trading Systems, Operational and Legacy databases
• Phone Forensics
Entity Extraction & Theming
All structured and unstructured information resources can be automatically processed for entity extraction, including:
• Documents – including web pages, social media, office applications, email, databases
• Digital devices – cameras, phones, SIM cards, storage devices
Additional types can be added by Gillc or added as Custom types by the end user
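To make the idea of automatic entity extraction concrete, here is a minimal Python sketch that pulls three entity types out of free text with regular expressions. It is not Columbo®'s extraction engine, whose internals the slides do not describe. The patterns, the `PATTERNS` table and the `extract_entities` function are assumptions for illustration, and the patterns check only the rough shape of each entity. The `PATTERNS` dictionary also hints at how a custom type might be added: one more name and pattern.

```python
import re

# Illustrative shape-only patterns; real extraction needs stricter validation.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "card": re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),  # 13 to 19 digits
}


def luhn_ok(digits: str) -> bool:
    """Luhn checksum used by payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def extract_entities(text: str) -> dict:
    """Return {entity_type: [values]} for every pattern in PATTERNS."""
    found = {name: [] for name in PATTERNS}
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            value = match.group()
            if name == "card":
                value = re.sub(r"[ -]", "", value)  # normalise separators
                if not luhn_ok(value):
                    continue  # digit run that is not a plausible card number
            found[name].append(value)
    return found


sample = ("Contact j.smith@example.com about card 4111 1111 1111 1111 "
          "and IBAN GB82WEST12345698765432.")
print(extract_entities(sample))
```

The card number and IBAN in the sample are widely published test values, not real accounts.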
Metadata from applications, image files and digital devices is also extracted as entity information. For example:
• Device type and ID – for phones, cameras, computers etc.
• Author and creation date – for enterprise documents etc.
Themes and Classification - themes and sub-themes are automatically identified from textual resource information
Links - hard and soft links can be identified or uncovered by interacting with the information within Columbo®. Soft links (or weak links) can be identified by:
• Analysing the presence/popularity of entities and themes in different resources/devices
• Using Columbo® Semantic Indexing (CSI) to identify varying levels of link strength
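One simple way to picture a soft link is as a score based on how often two entities turn up in the same resources. The Python sketch below scores each pair of entities by the Jaccard overlap of the resources they appear in. This is a generic stand-in chosen for illustration: it is not Columbo® Semantic Indexing, and the resource names, entity values and `link_strengths` function are all invented for the example.

```python
from itertools import combinations

# Hypothetical resources mapped to the entities extracted from each one.
resources = {
    "email_001": {"Suspect 1", "ACME Ltd", "+44 20 7946 0000"},
    "phone_dump": {"Suspect 1", "Suspect 2", "+44 20 7946 0000"},
    "report_17": {"Suspect 2", "ACME Ltd"},
    "blog_post": {"ACME Ltd"},
}


def link_strengths(resources: dict) -> dict:
    """Score each entity pair by the Jaccard overlap of the resources they share."""
    seen_in = {}
    for resource, entities in resources.items():
        for entity in entities:
            seen_in.setdefault(entity, set()).add(resource)

    strengths = {}
    for a, b in combinations(sorted(seen_in), 2):
        shared = seen_in[a] & seen_in[b]
        if shared:  # only pairs that co-occur at least once are linked
            strengths[(a, b)] = len(shared) / len(seen_in[a] | seen_in[b])
    return strengths


for pair, score in sorted(link_strengths(resources).items(), key=lambda kv: -kv[1]):
    print(f"{score:.2f}  {pair[0]}  <->  {pair[1]}")
```

A score of 1.0 means two entities always appear together; lower scores correspond to weaker links that an analyst may still want to inspect.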
Shortcomings of existing software
• Operational System
• Search
• Pattern Recognition
• Neural Networks
• Learning Engines
• Sophisticated Security Systems
But isolated silos of data
ColumboDiscovery™: integration of all types of data
Dynamic linking of entities, themes and concepts
Entity and theme types linked to one another: Email addresses, Telephone numbers, IBAN, Credit Card, Names, Places, Companies, Concepts, Themes, Device type, Author, Locations
Dynamic linking of investigations & discrete databases together
Suspect 1, Suspect 2, Watch List, Stolen Credit Cards
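Linking separate investigations and databases can be pictured as finding entity values that appear in more than one of them. The Python sketch below does this with a simple inverted index. The slides do not describe the mechanism Columbo® uses, so this is only one plausible reading. The data is invented: the card numbers are published test values, and the `cross_links` function is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical entity values held by each investigation or database.
databases = {
    "Suspect 1": {"4111111111111111", "+44 20 7946 0001", "alias@example.com"},
    "Suspect 2": {"+44 20 7946 0001", "other@example.org"},
    "Stolen Credit Cards": {"4111111111111111", "5555555555554444"},
    "Watch List": {"other@example.org"},
}


def cross_links(databases: dict) -> dict:
    """Return {entity_value: [sources]} for values found in two or more sources."""
    index = defaultdict(set)
    for source, values in databases.items():
        for value in values:
            index[value].add(source)
    return {v: sorted(srcs) for v, srcs in index.items() if len(srcs) > 1}


for value, sources in sorted(cross_links(databases).items()):
    print(value, "links", " and ".join(sources))
```

In this toy data a shared phone number ties the two suspects together, while a card number and an email address tie each of them to a reference database.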
Applications of ColumboDiscovery™
Money-laundering
• Patterns of transactions
• Company ownership
• Associated metadata
• Pro-active and reactive
• Asset-tracking
Anti-Corruption
• Patterns of transactions
• Company ownership
• Associated metadata
• Ethics and compliance
• Transparency
• Asset-tracking
Fraud Detection
• Patterns of transactions
• Identity matching
• Company ownership
• Associated metadata
• Link Analysis
• Flexible bin-control
• Asset-tracking
Hi-Tech Investigations
• Patterns of transactions
• Compliance
• Employee activities
• Employee collusion
• Cyber-forensics
Summary and Benefits
The Columbo® group of products comprises powerful, next-generation information discovery applications
Columbo® applications are tailored towards ‘discovery’, as opposed to ‘search’
• Search implies that the user already knows what to look for
• Discovery allows the data to identify what may be relevant, and allows the user to interact with it in order to find the information contained within it
The software delivers significant efficiency savings, by both rapidly finding relevant data and automating much of the process including reporting
The software enhances effectiveness, automatically compares content and incrementally builds an intelligence repository
Columbo® is “implementation-lite” and has the capacity to readily link diverse organisations together, so that they can share critical data and collaborate on it as appropriate