Making Big Data Your Ally
Using data analytics to improve compliance, due diligence and investigations
Thursday, February 6, 2014 | 3:00 - 4:00 PM
Speakers: Raul Saccani, Dave Stewart, John Walsh
Oct 21, 2014
John Walsh
CEO, SightSpan
Charlotte, NC
Dave Stewart
Director, Fraud and Financial Crimes Practice, SAS Institute
Cary, NC
Raul Saccani
Partner, Forensic and Dispute Services, Deloitte
Buenos Aires
Copyright © 2012, SAS Institute Inc. All rights reserved.
[Figure: data size, today versus the future, along four dimensions: VOLUME, VARIETY, VELOCITY, VALUE]
THRIVING IN THE BIG DATA ERA
The Challenge
The Analytics Lifecycle
IDENTIFY / FORMULATE PROBLEM
DATA PREPARATION
DATA EXPLORATION
TRANSFORM & SELECT
BUILD MODEL
VALIDATE MODEL
DEPLOY MODEL
EVALUATE / MONITOR RESULTS
BUSINESS MANAGER: Domain Expert, Makes Decisions, Evaluates Processes and ROI
IT SYSTEMS / MANAGEMENT: Model Validation, Model Deployment, Data Preparation
DATA SCIENTIST: Data Exploration, Data Visualization, Report Creation
ANALYST / DATA MINER: Exploratory Analysis, Descriptive Segmentation, Predictive Modeling
Case Studies
• Tier I Asian Bank
– Visual analytics of Group Security Operations
– Cross-border sharing of summary data
• Tier I Global Bank
– AML model tuning & optimization
– Large volume peer group analysis
• Tier I Global Bank
– “Safety Net” approach for controlling affiliate risk
– Ad hoc builds of illicit networks
Observations
• New capabilities require new thinking about business as usual
• Variety of data & techniques requires new skills within lines of business
• Adopt a pro-active/pre-emptive analytics strategy
• Understand your company’s technology roadmap and get on board
Raul Saccani
Partner, Forensic and Dispute Services, Deloitte
Buenos Aires
Raúl Saccani's presentation contents
• Data privacy standards in Latin America, compared to US and EU standards
• How data privacy rules and limitations on cross-border data sharing can impact compliance functions and internal investigations
• Role of e-discovery in financial crime investigations, including internal investigations
• Sources of data in internal investigations, including structured and unstructured data
Privacy and Data Protection
1) The context
2) Data protection and electronic evidence
3) EU law on privacy and data protection
4) Practical considerations
(1) Context
Most personal information and most evidence are digital
Lawyers and judges need to know the significance of digital information
Need to know and understand the:
• nature of digital evidence
• data protection rules of the road
Otherwise no:
• remedy for the data subject
• fair trial for the accused
• convictions for the prosecutor
[Chart: The growth of global privacy laws. Vertical axis: No. of countries with privacy laws. Horizontal axis: Time Period]
(2) Data Protection and Electronic Evidence
• Overlapping scope
• Data protection rules apply to the courts
• Fruit of the poisonous tree
• Precautions to ensure admissibility of e-evidence
(3) EU Law on Privacy: two fundamental rights
(a) the Right to Privacy
ECHR (1950), Article 8: Everyone has the right to respect for his or her private and family life, home and correspondence
EU Charter (2000), Article 7: …and communications.
(b) the Right to Protection of Personal Data
an autonomous fundamental right to self-determination in the Information Society
Article 16, EU Treaty
EU Charter, Article 8:
1. Everyone has the right to the protection of personal data concerning him or her.
2. Such data must be processed fairly for specified purposes and on the basis of the consent of the person concerned or some other legitimate basis laid down by law. Everyone has the right of access to data which has been collected concerning him or her, and the right to have it rectified.
3. Compliance with these rules shall be subject to control by an independent authority.
Data Privacy
• What is a Data Controller?
– Person or entity who determines purpose and manner of processing
– EU Directive imposes obligation to protect personal data
– Potential liability for failure to fulfill obligations
– Responsible for directing and controlling actions of Data Processor
• What is a Data Processor?
– Processes data on behalf of and at the direction of Data Controller
– Must follow instructions of Data Controller
Practical Considerations
• Now you are in a position to make the necessary cost-benefit analysis. Ask yourself the following questions:
– What is the true value of this source of information relative to (a) other more easily accessible sources of information and (b) the litigation as a whole?
– What are the projected costs of complying with the EU Data Protection Directive?
– What are the projected costs of defending a discovery dispute?
– What are the relative strengths and weaknesses of each side on discovery issues?
Outsourcing Personal Data Processing
Contractual means:
• All practicable security measures
• Timely return, destruction or deletion of data
• Prohibition against any use or disclosure for other purposes
• Prohibition against sub-contracting
• Right to audit and inspect
Forensic Technology
Stages: Identification, Preservation, Pre-processing, Processing, Review
Preservation
* Forensic methodology
* Chain of custody
* Integrity
* Confidentiality
Forensic Technology: Pre-processing
(Pre-processing tasks)
* Integrity verification
* Format conversion and standardization
* Chain of custody
* Additional copies
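Integrity verification is commonly done by comparing cryptographic hashes of the original and of each working copy. The following is a minimal sketch of that idea, not the presenters' tooling; the file names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(original: Path, working_copy: Path) -> bool:
    """True when both files produce the same SHA-256 digest."""
    return sha256_of_file(original) == sha256_of_file(working_copy)


# Throwaway files stand in for an acquired image and its copies (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    evidence = Path(tmp) / "evidence.img"
    copy_ok = Path(tmp) / "copy_ok.img"
    copy_bad = Path(tmp) / "copy_bad.img"
    evidence.write_bytes(b"sample evidence bytes")
    copy_ok.write_bytes(b"sample evidence bytes")
    copy_bad.write_bytes(b"sample evidence bytez")
    intact = verify_copy(evidence, copy_ok)
    tampered = verify_copy(evidence, copy_bad)

print(intact, tampered)
```

Recording the digest at acquisition time and again before each later step gives the chain of custody a value that can be rechecked.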
Forensic Technology: Processing
(Obtain value from information without modifying it)
* Deleted documents or e-mails
* Information in hidden sectors or partitions
* Encrypted files
* Files with modified extensions
* Internet devices, MSN, Yahoo!, social networks
* Application and operating system (OS) audit trails
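Files with modified extensions can often be spotted by comparing the extension with the file's leading "magic" bytes. The sketch below assumes a small, hand-picked signature table and extension map; real tools use far larger tables, and the function names are hypothetical.

```python
from pathlib import PurePath

# Leading byte signatures for a few common formats (illustrative subset).
SIGNATURES = {
    b"%PDF": "pdf",
    b"PK\x03\x04": "zip",  # also the container for .docx and .xlsx
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
}

# Content type each extension is expected to carry (assumed mapping).
EXPECTED = {
    ".pdf": "pdf",
    ".zip": "zip",
    ".docx": "zip",
    ".xlsx": "zip",
    ".jpg": "jpeg",
    ".jpeg": "jpeg",
    ".png": "png",
}


def detect_type(header: bytes):
    """Return the detected content type, or None if no signature matches."""
    for magic, kind in SIGNATURES.items():
        if header.startswith(magic):
            return kind
    return None


def extension_mismatch(filename: str, header: bytes) -> bool:
    """True when recognised content does not match the file extension."""
    detected = detect_type(header)
    expected = EXPECTED.get(PurePath(filename).suffix.lower())
    return detected is not None and detected != expected


print(extension_mismatch("notes.txt", b"%PDF-1.4 sample"))   # PDF content, .txt name
print(extension_mismatch("report.pdf", b"%PDF-1.7 sample"))  # consistent
```

Only the first bytes of each file are read, so the check does not modify the evidence.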
Forensic Technology: Review
(Review platform)
* Do not modify evidence
* Eliminate duplicates
* Early Case Assessment (ECA)
* Keywords / tags
* Produce evidence
* Bates stamping
* Audit logs
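Duplicate elimination in a review platform is typically based on content hashes, with every suppressed item recorded in an audit log. This is a minimal sketch under that assumption; the document identifiers and contents are invented for illustration.

```python
import hashlib


def deduplicate(documents):
    """Keep the first document per content hash; log every suppressed duplicate.

    documents: list of (doc_id, content_bytes) tuples.
    Returns (unique_documents, audit_log), where audit_log holds
    (duplicate_id, kept_id) pairs.
    """
    first_seen = {}
    unique = []
    audit_log = []
    for doc_id, content in documents:
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            audit_log.append((doc_id, first_seen[digest]))
        else:
            first_seen[digest] = doc_id
            unique.append((doc_id, content))
    return unique, audit_log


# Hypothetical mailbox extract.
docs = [
    ("MAIL-001", b"Quarterly report attached."),
    ("MAIL-002", b"Lunch on Friday?"),
    ("MAIL-003", b"Quarterly report attached."),
]
unique_docs, dedup_log = deduplicate(docs)
print([doc_id for doc_id, _ in unique_docs])
print(dedup_log)
```

The input list is left untouched, in line with the "do not modify evidence" principle.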
Data Complexity, Variety and Velocity
[Figure: data sources arranged by volume: Megabytes, Gigabytes, Terabytes, Petabytes]
Big Data (Petabytes): Social sentiment, Audio/video, Log files, Spatial & GPS coordinates, Data market feeds, eGov feeds, Weather, Text/image, Click stream, Wikis/blogs, Sensors/RFID/devices
Web 2.0: Collaboration, Tourism, Web Logs, Digital Marketing, Citizen Engagement, Recommendations, Advertising, Mobile
ERP/CRM: Payables, Payroll, Inventory, HR People, Case Management, Inspection/Permitting
The Change is Driving Big Data
Big Data Is…
Big Data represents the trends, technologies and potential for organizations to obtain valuable insight from large amounts of structured, unstructured and fast-moving data.
20% Structured Data
80% Unstructured Data
Examples: Click Stream, Videos, Images, Text, Sensors
Where Does Big Data Come From?
• Our Data-driven World
– Science
  • Databases from astronomy, genomics, environmental data, transportation data, …
– Humanities and Social Sciences
  • Scanned books, historical documents, social interactions data, new technology like GPS, …
– Business & Commerce
  • Corporate sales, stock market transactions, census, airline traffic, …
– Entertainment
  • Internet images, Hollywood movies, MP3 files, …
– Medicine
  • MRI & CT scans, patient records, …
Structured vs unstructured data
• Structured data: information in “tables”
Employee  Manager  Salary
Smith     Jones    50000
Chang     Smith    60000
Ivy       Smith    50000
Typically allows numerical range and exact match (for text) queries, e.g., Salary < 60000 AND Manager = Smith.
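The example query can be run as-is against the three-row table from the slide. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table name `staff` is an assumption.

```python
import sqlite3

# In-memory database holding the three rows shown on the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (employee TEXT, manager TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Smith", "Jones", 50000), ("Chang", "Smith", 60000), ("Ivy", "Smith", 50000)],
)

# Numerical range condition combined with an exact text match.
rows = conn.execute(
    "SELECT employee FROM staff WHERE salary < ? AND manager = ?",
    (60000, "Smith"),
).fetchall()
conn.close()

print(rows)  # [('Ivy',)]
```

Chang is excluded because 60000 is not strictly below 60000, and Smith is excluded because his manager is Jones.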
Unstructured data
• Typically refers to free text
• Allows
– Keyword-based queries including operators
– More sophisticated “concept” queries, e.g.,
  • find all web pages dealing with drug abuse
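Keyword queries with operators are usually answered from an inverted index that maps each term to the documents containing it. The sketch below is a toy version with invented documents; "concept" queries need more than this (synonyms, classification, and so on) and are not covered.

```python
import re
from collections import defaultdict

# Invented free-text documents.
DOCS = {
    1: "Clinic opens new program for drug abuse prevention",
    2: "City council debates parking fees",
    3: "Report links alcohol abuse to traffic accidents",
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


INDEX = build_index(DOCS)


def search_and(*terms):
    """Documents containing every term."""
    sets = [INDEX.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def search_or(*terms):
    """Documents containing at least one term."""
    return set().union(*(INDEX.get(term.lower(), set()) for term in terms))


print(search_and("drug", "abuse"))               # {1}
print(search_or("drug", "alcohol"))              # {1, 3}
print(search_and("abuse") - search_or("drug"))   # abuse NOT drug: {3}
```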
Forensic Data Analytics - Definition
Core objectives: Identifying, preserving, recovering, processing, and analyzing structured, standardized and/or codified digital information in order to generate evidence that may be used in an investigation and that may ultimately support legal action in litigation.
Source of information: Company’s accounting system (ERP), proprietary or third-party vertical applications, intersystem interfaces, financial reporting worksheets.
Data Acquisition, Accounting Integrity Control and Data Mapping
Evaluation of fraud and misconduct risk indicators
Routines and tests
Identification of unusual or irregular trends and patterns
Analysis of pre-identified transactions
How does data analytics work?
– Reviews focused on the red flags detected.
– Master vendor and customer files analysis:
  • Cross analysis between company databases and public databases and records. Some examples are:
    Clients related to public bidding
    Vendors/Clients with invalid or incomplete key data
    Vendors/Clients with potential tax irregularities
    Vendors/Clients with unusual activities
    Vendors/Clients with unusual characteristics
    Vendors/Clients with unusual transactional activity
    Duplicate Vendors/Clients
    Vendors/Clients related to employees or other Vendors/Clients
    Employees related to other employees
Usual procedures - Overview
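Two of the master-file checks listed above, duplicate vendors and invalid or incomplete key data, can be sketched in a few lines. The vendor records are invented, and the tax ID pattern (two digits, eight digits, one digit) is an assumption based only on the format of the sample IDs in this deck; no check-digit validation is attempted.

```python
import re
import unicodedata
from collections import defaultdict

# Invented vendor master records.
VENDORS = [
    {"code": "V001", "name": "Acme Servicios S.A.", "tax_id": "30-11111111-1"},
    {"code": "V002", "name": "ACME SERVICIOS SA", "tax_id": "30-11111111-1"},
    {"code": "V003", "name": "Logística Norte", "tax_id": ""},
    {"code": "V004", "name": "Papelera Sur", "tax_id": "30-2222"},
]

# Assumed format only: NN-NNNNNNNN-N.
TAX_ID_PATTERN = re.compile(r"^\d{2}-\d{8}-\d$")


def normalize_name(name):
    """Strip accents, case, spaces and punctuation so spelling variants collide."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^A-Z0-9]", "", no_accents.upper())


def find_duplicates(vendors):
    """Return groups of vendor codes that share a normalized name."""
    groups = defaultdict(list)
    for vendor in vendors:
        groups[normalize_name(vendor["name"])].append(vendor["code"])
    return [codes for codes in groups.values() if len(codes) > 1]


def key_data_issues(vendors):
    """Return {code: 'missing' | 'invalid'} for vendors with bad tax IDs."""
    issues = {}
    for vendor in vendors:
        tax_id = vendor["tax_id"].strip()
        if not tax_id:
            issues[vendor["code"]] = "missing"
        elif not TAX_ID_PATTERN.match(tax_id):
            issues[vendor["code"]] = "invalid"
    return issues


print(find_duplicates(VENDORS))   # [['V001', 'V002']]
print(key_data_issues(VENDORS))   # {'V003': 'missing', 'V004': 'invalid'}
```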
How does data analytics work?
Employee - Vendor Matching: identical domicile as per external databases
Masters (apparently unrelated)
CODE    VENDOR              DOMICILE            CITY       Taxpayer ID
100911  TRANSPORTES PARANÁ  PARANÁ 1            CAP. FED.  30-70867893-0
CODE    EMPLOYEE            DOMICILE            CITY       Employee ID
502435  JUAN PEREZ          AV PUEYRREDÓN 1111  CAP. FED.  23-20667877-4
External databases (unusual relationship)
Taxpayer ID    COMPANY NAME  DOMICILE  ALTERNATIVE DOMICILE
30-70867893-0  MARÍA PEREZ   PARANÁ 1  AV. CÓRDOBA 999 PISO 3
Employee ID    NAME          DOMICILE            CITY
23-20667877-4  JUAN PEREZ    CÓRDOBA 999 PISO 3  CAP. FED.
Note: the difference between the vendor name and the company name might arise from the trade ("fantasy") name versus the registered company name.
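The employee-vendor match in this example can be reproduced by normalizing addresses and intersecting the address sets of each vendor and employee across master and external data. The sketch below uses the slide's sample records; the normalization rules (dropping accents, punctuation and the street-type token "AV") and all function names are assumptions.

```python
import re
import unicodedata

# Master data and external records taken from the slide example.
VENDORS = [{"code": "100911", "name": "TRANSPORTES PARANÁ",
            "domicile": "PARANÁ 1", "tax_id": "30-70867893-0"}]
EMPLOYEES = [{"code": "502435", "name": "JUAN PEREZ",
              "domicile": "AV PUEYRREDÓN 1111", "tax_id": "23-20667877-4"}]
EXTERNAL_ADDRESSES = {
    "30-70867893-0": ["PARANÁ 1", "AV. CÓRDOBA 999 PISO 3"],
    "23-20667877-4": ["CÓRDOBA 999 PISO 3"],
}

STREET_TYPE_TOKENS = {"AV", "AVENIDA"}  # assumed noise words


def normalize_address(address):
    """Upper-case, strip accents and punctuation, drop street-type tokens."""
    decomposed = unicodedata.normalize("NFKD", address)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch)).upper()
    text = re.sub(r"[^A-Z0-9 ]", " ", text)
    return " ".join(t for t in text.split() if t not in STREET_TYPE_TOKENS)


def known_addresses(record):
    """All normalized addresses for a record, from master and external data."""
    raw = [record["domicile"]] + EXTERNAL_ADDRESSES.get(record["tax_id"], [])
    return {normalize_address(a) for a in raw}


def match_employees_to_vendors(vendors, employees):
    """Return (vendor_code, employee_code, shared_addresses) for each overlap."""
    matches = []
    for vendor in vendors:
        for employee in employees:
            shared = known_addresses(vendor) & known_addresses(employee)
            if shared:
                matches.append((vendor["code"], employee["code"], sorted(shared)))
    return matches


print(match_employees_to_vendors(VENDORS, EMPLOYEES))
```

On master data alone the two records share no address; the link appears only once the external addresses are added.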
Examples of results per vendor
Vendor C: Provider of advertising services
• Individual
• Sequentially numbered invoices
• Related to a potentially irregular entity
• Based on external public sources, he/she would be working under a contract of employment
Vendor : Advisory services fees
• Entity showing no tax activity
• Data quality issues (incomplete information)
• Company name does not match the information filed with AFIP
• Significant number of legal actions
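The "sequentially numbered invoices" indicator can be sketched as the share of consecutive invoice numbers received from one vendor; a high share is often read as a sign that the company is the vendor's only customer. The invoice numbers and the 0.8 threshold below are invented for illustration.

```python
def sequential_share(invoice_numbers):
    """Fraction of gaps between sorted, distinct invoice numbers that equal 1."""
    numbers = sorted(set(invoice_numbers))
    if len(numbers) < 2:
        return 0.0
    gaps = [later - earlier for earlier, later in zip(numbers, numbers[1:])]
    return sum(1 for gap in gaps if gap == 1) / len(gaps)


# Hypothetical invoice numbers received from two vendors.
INVOICES = {
    "Vendor C": [101, 102, 103, 104, 105],
    "Vendor X": [2310, 2388, 2467, 2501],
}
THRESHOLD = 0.8  # assumed cut-off for raising the red flag

flagged = [
    vendor for vendor, numbers in INVOICES.items()
    if sequential_share(numbers) >= THRESHOLD
]
print(flagged)  # ['Vendor C']
```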
Vendors with higher scoring
Code | Vendor Name | High risk fraud alert | Unusual activity | Master data changes | Unusual Behaviour | Inconsistent names | Potential tax irregularities | Connected entities | Suspicious taxpayer ID | Suspicious address | Suspicious telephone | Unusual information | Other Potential irregularities | Data Quality - Invalid key data | Data Quality - Missing key data | Duplicates | TOTAL SCORING | Test 001 | Test 002 | Test 100 | TOTAL TESTS
100123 | Vendor 1 | 100 | 10 | 0 | 0 | 0 | 3 | 7 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 2 | 122 | 1 | 0 | 0 | 1
100981 | Vendor 2 | 100 | 10 | 0 | 1 | 0 | 2 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 121 | 1 | 1 | 0 | 2
100789 | Vendor 3 | 100 | 10 | 0 | 0 | 0 | 3 | 4 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 3 | 120 | 1 | 0 | 0 | 1
101000 | Vendor 4 | 100 | 0 | 0 | 0 | 0 | 2 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 109 | 0 | 0 | 1 | 1
102078 | Vendor 5 | 100 | 0 | 0 | 0 | 0 | 1 | 3 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 5 | 107 | 0 | 0 | 1 | 1
Each routine is classified into one of these groups according to the estimated risk inherent to the test.
Note: as an illustration, only three routines are shown in the chart. The complete analysis includes over 200 routines.
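One way to read the scoring idea is that each routine belongs to a risk group with a weight, and a vendor's total is the sum of the weights of the routines it trips. The sketch below illustrates that reading only; the routine-to-group mapping, the weights and the hits are hypothetical and do not reproduce the figures in the chart.

```python
# Hypothetical routines: the group each belongs to and the weight it contributes.
ROUTINES = {
    "Test 001": {"group": "High risk fraud alert", "weight": 100},
    "Test 002": {"group": "Unusual activity", "weight": 10},
    "Test 100": {"group": "Connected entities", "weight": 1},
}

# Hypothetical results: which routines flagged which vendor.
HITS = {
    "Vendor 1": ["Test 001"],
    "Vendor 2": ["Test 001", "Test 002"],
    "Vendor 4": ["Test 100"],
}


def score_vendor(hit_tests):
    """Aggregate routine weights per group and in total for one vendor."""
    by_group = {}
    for test in hit_tests:
        routine = ROUTINES[test]
        by_group[routine["group"]] = by_group.get(routine["group"], 0) + routine["weight"]
    return {
        "by_group": by_group,
        "total_scoring": sum(by_group.values()),
        "total_tests": len(hit_tests),
    }


# Rank vendors from highest to lowest total scoring.
ranking = sorted(
    HITS, key=lambda vendor: score_vendor(HITS[vendor])["total_scoring"], reverse=True
)
print(ranking)  # ['Vendor 2', 'Vendor 1', 'Vendor 4']
```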
Manual Journal Entries Ranking
• Night shifts
• Weekends
• Holidays
• Round numbers
• Unbalanced entries
• Rarely used accounts
• Adjustments
• Reversals
• Reclassifications
• Benford's Law
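Several of these ranking criteria are simple to compute per journal entry. The sketch below flags weekend, night and round-number postings and compares first digits with Benford's Law, under which digit d is expected with probability log10(1 + 1/d). The night window (22:00 to 06:00) and the round-number rule (multiples of 1000) are assumptions.

```python
import math
from collections import Counter
from datetime import datetime


def entry_flags(posted_at, amount):
    """Return the timing and amount red flags raised by one manual entry."""
    flags = []
    if posted_at.weekday() >= 5:  # Saturday or Sunday
        flags.append("weekend")
    if posted_at.hour >= 22 or posted_at.hour < 6:  # assumed night window
        flags.append("night shift")
    if amount >= 1000 and amount % 1000 == 0:  # assumed round-number rule
        flags.append("round number")
    return flags


def benford_expected(digit):
    """Expected share of amounts whose first significant digit is `digit`."""
    return math.log10(1 + 1 / digit)


def first_digit(amount):
    """Leading significant digit, read from scientific notation."""
    return int(f"{abs(amount):.15e}"[0])


def benford_chi_square(amounts):
    """Chi-square statistic of observed first digits against Benford's Law."""
    digits = [first_digit(a) for a in amounts if a != 0]
    n = len(digits)
    if n == 0:
        return 0.0
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * benford_expected(d)) ** 2 / (n * benford_expected(d))
        for d in range(1, 10)
    )


# 8 February 2014 was a Saturday; the entry below is invented.
print(entry_flags(datetime(2014, 2, 8, 23, 30), 50000.0))
# Amounts that all start with 9 deviate strongly from Benford's Law.
print(benford_chi_square([900 + i for i in range(50)]) > 15.507)  # 5% critical value, 8 d.o.f.
```

A statistic above the critical value only signals that the population deserves a closer look; it is not evidence of manipulation on its own.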
Your Questions