Best Practices In Predictive Analytics Keeping things simple… June 10, 2014 Ajay Gopikrishnan Lead Architect – Analytics & Bigdata
Sep 08, 2014
Business Information Management
Copyright © 2014 Capgemini. All rights reserved.
Best Practices in Predictive Analytics | June 2014
What is Predictive Analytics (PA)?
Definition: Predictive Analytics is the application of statistical techniques and BI technologies to uncover relationships and patterns from within large volumes of data that can be used to predict behavior or events of interest.
Source: TDWI.org
Predictive Analytics in action…
- Predictive analytics (PA) is used to address churn
- PA is used to predict the likelihood of response to a mailer
- PA is used to predict the risk of default on a credit card
- PA is used to predict the average time to failure for a particular industrial heavy machine
Predictive Analytics is more forward-looking than regular BI – we use past events to anticipate the future!
Why is PA not widely deployed?
- It is usually complex and calls for a combination of skills
- The value generated is often under-rated
- Software is expensive
- Dependency on good quality data
- PA is often taken up more as an experiment and not core to function
Best practices in Predictive Analytics
Best Practices
Data & Technology
Organization
Process
Structured
Unstructured / semi-structured
Best practices in Predictive Analytics
Impact
Marketing
It is important to measure the RoI of PA projects because organizational resources are committed to acting on the predictions:
- Mailers are sent
- Credit card applications are denied or downgraded
A Canadian bank uses PA to increase campaign response rates by 600%, cut customer acquisition costs in half, and boost campaign ROI by 100%. (Source: TDWI.org)
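As a rough illustration of measuring RoI, the sketch below treats RoI as the incremental profit net of project cost, divided by project cost. The function name `project_roi` and every figure are hypothetical, chosen only to show the arithmetic; none of them come from the TDWI example above.

```python
def project_roi(incremental_profit, project_cost):
    """Return RoI as a fraction: (incremental profit - cost) / cost."""
    if project_cost <= 0:
        raise ValueError("project_cost must be positive")
    return (incremental_profit - project_cost) / project_cost


# Hypothetical figures for one mailer campaign (assumptions, not source data).
baseline_profit = 40_000   # campaign profit without the PA model
modelled_profit = 70_000   # campaign profit when targeted with the PA model
project_cost = 20_000      # software, data preparation and staff time

# Only the profit attributable to the model counts towards the project's RoI.
incremental_profit = modelled_profit - baseline_profit
roi = project_roi(incremental_profit, project_cost)
print(f"RoI of the PA project: {roi:.0%}")
```

Comparing against a baseline campaign, rather than against zero, keeps the model from taking credit for profit the business would have earned anyway.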
Best practices in Predictive Analytics
It is more important to evaluate PA projects by using a set of business metrics rather than analytical metrics
Analytical Metrics
- R-square
- Lift
- ROC curve
Business Metrics
- Response rate
- Gross sales
- Net profit
No one gets a raise or a bonus based on R-square or lift! Build business metrics into the analytical plan.
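To make the contrast concrete, here is a minimal sketch that computes one analytical metric (lift in the top-scored segment) and one business metric (net profit) for the same mailer campaign. The helper names, the scores, and the margin and cost figures are all illustrative assumptions, not data from the presentation.

```python
def lift_at_top(scores, responded, top_n):
    """Response rate among the top_n highest scores divided by the overall rate."""
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(flag for _, flag in ranked[:top_n]) / top_n
    overall_rate = sum(responded) / len(responded)
    if overall_rate == 0:
        raise ValueError("lift is undefined when nobody responded")
    return top_rate / overall_rate


def net_profit(n_mailed, n_responders, margin_per_response, cost_per_mailer):
    """Margin earned from responders minus the cost of every mailer sent."""
    return n_responders * margin_per_response - n_mailed * cost_per_mailer


# Hypothetical model scores and observed responses for ten customers.
scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
responded = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

# Analytical metric: how much better the top two scores respond than average.
lift = lift_at_top(scores, responded, top_n=2)

# Business metric: assumed margin of 50 per response and cost of 2 per mailer.
profit_top_two = net_profit(2, 2, 50, 2)
profit_everyone = net_profit(10, 3, 50, 2)

print(f"lift in top two: {lift:.2f}")
print(f"net profit, top two only: {profit_top_two}")
print(f"net profit, everyone: {profit_everyone}")
```

With these made-up numbers the top segment shows a lift of about 3.3, yet mailing everyone earns the larger net profit (130 versus 96). A strong analytical metric does not settle the business question on its own.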
Best practices in Predictive Analytics
Be prepared to spend a good deal of your project time (sometimes 75%) on data management
The goal of PA is to isolate, from a large set of variables, those that best explain the event or behavior of interest; EDW tables normally cannot be used as-is
Data jobs
- Merges & joins
- Transformations
- Data quality
Process jobs
- Exploratory analysis using central tendency measures
- Detect outliers
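The two process jobs above can be sketched with Python's standard `statistics` module. The 1.5 × IQR rule used here is one common convention for flagging outliers, assumed for illustration rather than prescribed by the presentation, and the spend figures are invented.

```python
import statistics


def summarize(values):
    """Central tendency measures for a first look at a variable."""
    return {"mean": statistics.mean(values), "median": statistics.median(values)}


def iqr_outliers(values, k=1.5):
    """Flag values more than k * IQR below the first or above the third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical monthly spend per customer; one value looks like a data error.
monthly_spend = [120, 135, 128, 140, 131, 126, 138, 2500]

summary = summarize(monthly_spend)
outliers = iqr_outliers(monthly_spend)

print(summary)
print(outliers)
```

A large gap between the mean and the median, as here, is often the first hint that outliers are present and that the variable needs attention before modeling.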
Best practices in Predictive Analytics
You are trying to model human behaviour, so do not expect a silver bullet – expect incremental improvements in the current level of organizational performance
PA models have a learning curve, so be prepared to stay invested over time to improve performance and reap the benefits
Best practices in Predictive Analytics
Better is usually measured in terms of business metrics
Complex ≠ Better
A neural network is not necessarily better than linear regression if the basic assumptions of linear regression are met in the given business problem
Best practices in Predictive Analytics
PA projects can be outsourced to an expert agency after suitable due diligence based on resource availability and comparative cost-benefit analysis with respect to in-sourcing
Outsourcing models
- Leading organizations are known to set up captive analytics service centers in remote locations where skills are available
- BPOs/KPOs are known to undertake process elements in a PA project
Best practices in Predictive Analytics
There is a substantial shortage of data science skills in the market; companies are partnering with academic institutions for PA programs
PA project teams combine multiple skills; the most successful teams are set up as a separate “Information Management” function that bridges business and IT
PA teams
- Business analyst
- Quantitative expert
- Tools expert
Best practices in Predictive Analytics
Do not let PA be reduced to a research effort by an enthusiast within a line function
PA projects require executive sponsorship; it is better to adopt a top-down approach that originates from the business, even if the projects are small to begin with
Best practices in Predictive Analytics
The cost of deployment is high, hence the need to measure the RoI of PA projects
PA needs to exploit the Enterprise Data Warehouse and enabling technologies such as in-database and in-memory processing for scale, faster throughput and a more comprehensive approach (think enterprise PA!)
Example technology
- SAS + Teradata
- SPSS on Netezza
- Analytical sandboxes
Best practices in Predictive Analytics
PA is a combination of both art and science!
There is no best software for Predictive Analytics; what software mainly contributes is efficiency. It is up to the user to design the project, define KPIs, evaluate candidate models and choose the model most appropriate for the problem
PA + Big Data + Cloud
- PA applications are deployable on the cloud
- PA vendors are now building compatibility with Hadoop and related technology to handle unconventional data types
The PA Maturity Curve
PA can create business value for the organization
Ajay Gopikrishnan, Lead Architect – Big Data & Analytics
Thank you!
Reference: TDWI 2010 paper by Thomas Rathburn - 10 mistakes to avoid in predictive analytics