Analysis of REAP 1
1
Analytics and Risk
Examples from Research & Analytics Branch
Duncan Cleary dcleary@revenue.ie
http://www.linkedin.com/in/duncancleary
Research & Analytics Branch
DATA - INFORMATION - KNOWLEDGE
2
Revenue’s Business Context
‘To serve the community by fairly and
efficiently collecting taxes and duties and
implementing Customs controls.’
www.revenue.ie
Total Receipts €31.5 Billion (2010, Net)
3
Research & Analytics Branch
Conduct analyses to transform data into information primarily using SAS software.
Evidence-based projects, predictive analytics, segmentation, forecasting etc. using data from Revenue and other sources.
Enables Revenue to make better use of its data and provides an improved understanding of the taxpayer population.
The results are used to better target services to customers and to improve compliance.
4
Target
5
Rules
6
…not a duck
7
8
Not a duck…
9
+
+ =
Rules combined are better…
10
But where are the ducks?
11
Case Study: Use of Predictive Analytics in Revenue
Revenue’s risk system uses ~300 business rules to quantify risk.
Goal: use predictive analytics to extend from quantifying risk to predicting the likelihood of yield, if audited.
Pilot the model and, if successful, bring it to production.
Show how analytics can assist the development of effective business strategies for Revenue.
Optimise the use of Revenue resources.
12
Data and Variables:
Considerable effort at the data integration stage (using SAS DI Studio: scalable, semi-automated).
Data Quality! Risk system data is opportunistic.
Business Context and understanding.
Rules that fire/ don’t fire, binary and frequency.
Derived variables, such as monetary risk and behaviour scores created by risk system.
Target variables: Audit Outcomes (e.g. yield).
Demographic variables, Geography, Sector etc.
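As an illustration of the "rules that fire / don't fire, binary and frequency" variables listed above, the following is a minimal sketch of how such features might be derived for one case. The rule identifiers, the function name and the input layout are all hypothetical; the actual work was done in SAS, and this is not a reproduction of it.

```python
from collections import Counter


def rule_features(fired_rules, all_rules):
    """Build binary and frequency features from the rules fired for one case.

    fired_rules: list of rule ids, one entry per firing (hypothetical layout).
    all_rules:   the full list of rule ids known to the risk system.
    """
    counts = Counter(fired_rules)
    features = {}
    for rule in all_rules:
        # Binary variable: did the rule fire at all for this case?
        features[f"{rule}_fired"] = 1 if counts[rule] > 0 else 0
        # Frequency variable: how many times did it fire?
        features[f"{rule}_count"] = counts[rule]
    return features


# Made-up rule ids and firing history, for illustration only.
ALL_RULES = ["R001", "R002", "R003"]
example = rule_features(["R001", "R003", "R003"], ALL_RULES)
```

Rules that never fired still get an explicit 0 in both variables, so every case ends up with the same set of columns.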
13
Help for finding ducks
14
SAS Credit Scoring Module
Banking analogy: the likelihood of a case defaulting on a loan, based on its profile and the profiles of cases that have defaulted in the past.
Credit scoring techniques are applied in this model to estimate the likelihood of a case yielding, based on its profile and the profiles of cases that have yielded in the past.
The model creates a scorecard and a probability of yield for the case base.
15
Training the assistant…
16
Results: SAS Credit Scoring Module in SAS Enterprise Miner.
Target: any yield over €2,500 = ‘1’; under €2,500 = ‘0’.
Cut-off point, e.g. p = 0.65: misclassification of 23% (77% hit rate).
Number of cases: selection can continue until the quota is filled, in order of decreasing probability of yield. The scorecard can be used to assess cases.
[Chart: number of yielding and non-yielding cases (axis 0 to 2,000) by predicted probability band, from 0.05-0.10 up to 0.95-1.00]
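The target definition, cut-off and quota-based selection described on this slide can be sketched as follows. This is a minimal Python illustration rather than the SAS Enterprise Miner workflow itself; the field names (`p_yield`, `target`), the function names and the example cases are invented, and treating a yield of exactly €2,500 as ‘0’ is an assumption, since the slide does not say which side it falls on.

```python
YIELD_THRESHOLD = 2500  # euro, from the target definition on the slide


def make_target(audit_yield):
    """Return 1 if the audit yield is over the threshold, else 0.

    Assumption: a yield of exactly 2500 is treated as 0.
    """
    return 1 if audit_yield > YIELD_THRESHOLD else 0


def select_cases(scored_cases, cutoff=0.65, quota=None):
    """Keep cases at or above the cut-off, highest probability first.

    If a quota is given, stop once that many cases have been selected.
    """
    eligible = [c for c in scored_cases if c["p_yield"] >= cutoff]
    ranked = sorted(eligible, key=lambda c: c["p_yield"], reverse=True)
    return ranked if quota is None else ranked[:quota]


def hit_rate(selected):
    """Share of selected cases that actually yielded (target == 1)."""
    if not selected:
        return 0.0
    return sum(c["target"] for c in selected) / len(selected)


# Made-up cases: model probability plus the outcome of a past audit.
cases = [
    {"id": "A", "p_yield": 0.91, "target": make_target(12000)},
    {"id": "B", "p_yield": 0.72, "target": make_target(0)},
    {"id": "C", "p_yield": 0.68, "target": make_target(3100)},
    {"id": "D", "p_yield": 0.40, "target": make_target(5000)},
]
chosen = select_cases(cases, cutoff=0.65, quota=3)
```

Raising `cutoff` trades volume for precision: fewer cases are selected, but a larger share of them would be expected to yield.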
17
Scorecard Extract
All cases are assigned a score based on their profile as per the model. Cut-offs can be set to increase the likelihood of yield.
The fewer points a case scores, the more likely it is to yield if audited.
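To make the scorecard idea concrete, here is a minimal sketch of an additive scorecard in which a lower total means a higher likelihood of yield, as the slide states. The attributes, bins and point values are invented for illustration and are not taken from the actual scorecard extract.

```python
# Hypothetical scorecard: attribute -> {bin label: points}. All values invented.
SCORECARD = {
    "sector": {"construction": 10, "retail": 25, "other": 40},
    "rules_fired": {"0-2": 45, "3-5": 30, "6+": 12},
}


def rules_fired_bin(n):
    """Map a count of fired rules onto the (hypothetical) scorecard bins."""
    if n <= 2:
        return "0-2"
    if n <= 5:
        return "3-5"
    return "6+"


def score_case(case):
    """Sum the points for each attribute of the case."""
    sector_points = SCORECARD["sector"].get(
        case["sector"], SCORECARD["sector"]["other"]
    )
    rules_points = SCORECARD["rules_fired"][rules_fired_bin(case["rules_fired"])]
    return sector_points + rules_points


def flag_for_audit(case, max_points):
    """Lower score = more likely to yield, so flag cases at or below the cut-off."""
    return score_case(case) <= max_points
```

For example, a construction case with seven fired rules scores 10 + 12 = 22 points here and would be flagged at a cut-off of 30, while a retail case with one fired rule scores 25 + 45 = 70 and would not.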
18
…?
19
Extending the Model: Reduce Misclassification
Unseen data are scored, i.e. cases that have not been audited in the period.
List of cases with scores based on propensity to yield according to the model.
Cut-off set high (e.g. 0.70 probability).
20
3:1
Hits vs. Misses in Pilot Region
21
€
€
€
22
An auditor in the field…
23
So what next? ‘Operationalise’ this approach
Develop more models for the business (e.g. Yield, Sectoral, Regional, Liquidation, ‘Phoenix’ Directors, Real Time Risk, etc.).
Evaluate models through field testing, in co-operation with Revenue Regions.
Extract more value from the data and information we already have; build better training data.
Make analytics more central to how Revenue performs its work.
24
25
26
27
Other Work
Audit Yield Models, 2 stage: monetary risk as a target, cost of audit
Liquidation Models ‘Balloon’ Payments
‘Phoenix’ Directors/ Group Risk
Model Evaluation
Real Time or Look Back Prevent and Detect, Customs, VAT, Excise
Customer Segmentation
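One way to read the two-stage audit yield idea listed above (a probability of yield, then a monetary target, set against the cost of audit) is as an expected net return per case. The sketch below is an assumption about how the pieces could be combined, not a description of the model actually tested; the formula, field names and figures are all illustrative.

```python
def expected_net_return(p_yield, predicted_amount, audit_cost):
    """Stage 1 probability times stage 2 predicted amount, less the audit cost.

    Assumption: the two stages are combined as a simple expected value.
    """
    return p_yield * predicted_amount - audit_cost


def rank_by_net_return(cases):
    """Order cases from highest to lowest expected net return."""
    return sorted(
        cases,
        key=lambda c: expected_net_return(
            c["p_yield"], c["predicted_amount"], c["audit_cost"]
        ),
        reverse=True,
    )


# Made-up cases (amounts in euro).
cases = [
    {"id": "X", "p_yield": 0.9, "predicted_amount": 4000, "audit_cost": 1500},
    {"id": "Y", "p_yield": 0.6, "predicted_amount": 20000, "audit_cost": 3000},
    {"id": "Z", "p_yield": 0.8, "predicted_amount": 1000, "audit_cost": 1200},
]
ranked = rank_by_net_return(cases)
```

Under this reading, a case with a high probability of yield but a small expected amount (case Z) can rank below a less certain case with a much larger expected amount (case Y).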
28
Two Stage Model: Test
29
Two Stage Model: Test cont’d
30
Liquidation Model Assessment: Probability Distributions
31
SNA: Social Network Analysis
32
2 Directors with many companies in common
One director red according to REAP
Other director and most companies green
Methods of propagating content (risk) through a network
Linked to Associated Companies
Directors
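The slide does not say which propagation method was used. As one possible illustration, the sketch below spreads a risk value across director-company links, with each node inheriting a decayed share of its riskiest neighbour's value. The decay factor, number of rounds, node names and the numeric encoding of a "red" director as 1.0 are all assumptions.

```python
def propagate_risk(edges, seed_risk, decay=0.5, rounds=2):
    """Spread risk over an undirected director-company network.

    edges:     list of (director, company) pairs.
    seed_risk: known risk per node (0.0 assumed for nodes not listed).
    decay:     share of a neighbour's risk that is inherited per step.
    rounds:    number of propagation steps.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    risk = {node: seed_risk.get(node, 0.0) for node in neighbours}
    for _ in range(rounds):
        updated = dict(risk)
        for node, linked in neighbours.items():
            inherited = decay * max(risk[n] for n in linked)
            # A node keeps its own risk if that is already higher.
            updated[node] = max(risk[node], inherited)
        risk = updated
    return risk


# Two directors sharing two companies; only director_1 starts as risky.
edges = [
    ("director_1", "company_A"),
    ("director_1", "company_B"),
    ("director_2", "company_A"),
    ("director_2", "company_B"),
]
risk = propagate_risk(edges, {"director_1": 1.0})
```

With these settings the shared companies pick up half of the first director's risk after one round, and the second director picks up a quarter after two, which mirrors the slide's scenario of a green director closely linked to a red one.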
33
Identifying Risky Cases
Cases flagged by predictive model:
Cases flagged by algorithm:
34
Customer Segmentation for Risk
35
36
Questions?
_________________________________________________________
Dr. Duncan Cleary
Revenue, Planning Division, Research & Analytics Branch
t: 00353-(0)1-4251414 | e: dcleary@revenue.ie
http://ie.linkedin.com/in/duncancleary
_________________________________________________________