Rise of the Machines – Applying Intelligence Tradecraft to Marketing Data
Mumbrella Finance Marketing Summit Sydney, 22 September 2016
ADATOS
• 27 years as a U.S. Intelligence Officer
• Trained over 1000 analysts in Intelligence Tradecraft
• Co-founded Lockheed Martin Centre for Security Analysis
• Declassified Intelligence Analytic Tradecraft methodology and technology for the private sector (2005)
• Consulted with PwC, Thomson Reuters and Accenture. Clients: Gates Foundation, British Petroleum, Merck Pharma, etc.
• Operations in Singapore, USA, and Australia
• Development team composed of PhDs and Master’s graduates in IT/Computer Science (University of Pisa (Italy), Massachusetts Institute of Technology (Boston, USA), Cornell University (USA)) with experience in Machine Learning, Complex Networks, and UI/UX Development
• Management and Business Development teams with extensive experience in IT Management, Management Consulting, and Marketing
APPLIED in the Intelligence Community for over 7 decades TO Achieve SCALABILITY, SPEED & ACCURACY
GEN 0: Statistics
GEN 1.0: ‘Big Data’
GEN 2.0: AI
GEN 2.5: AI
GEN 3.0: AI
Heuristics / Rules Based; Data Driven
Machine Learning / Natural Language Processing
Deep Neural Networks; Auto-Encoders
Statisticians: “Not Necessarily True”
Quantitative Analysts: “Boil the Ocean”
Data Scientists: “Algorithm Library”
Data Scientist: “Signal Processing”
Limitations of The Current AI Resurgence
• Qualified Data Science Shortage. Machine Learning (ML) and Natural Language Processing (NLP) are dependent on Data Scientists = high labour cost
• Many companies claim they offer AI to increase ‘Marketability’
• Is this sustainable?
• Many approaches to AI are academic, in pursuit of the ‘Holy Grail’: making machines think and act like humans.
• Business to consumer (B2C) focus on enhancing the customer experience: Facial Recognition, Voice Recognition. How is AI being applied to real business problems?
• Many avoid addressing the reality of Feature Engineering and Data Quality
Cutting through the Hype
• AI has been around for decades.
• Scalability, Speed, Accuracy: OODA
• If you can’t trust the machines, you won’t achieve speed, accuracy and cost efficiencies.
• “Never send a human to do a machine’s job.”
• It is NOT ‘Judgement Day’
Crossing the Bias Chasm
• Too much data, not enough qualified analysts. Data Scientists do not scale.
• A high labour cost model promoted to sell enterprise solutions and consultancy services.
• Data Munging: The Elephant in the Room
• Despite the liberal use of the ‘Data Scientist’ title, many are unqualified.
A ‘Data Scientist’ is the last person you should hire
The AI Threat… and Opportunity
• Adoption of AI is driving the ‘new’ economy, reliant on technology and data and the ability of new, agile, disruptive business models to transform ‘old’ economies
• Asia is aware of the opportunity to leapfrog and establish competitive advantage
• Efforts are supported by Government and Industry
• In markets where talent is scarce or expensive, AI offers a stronger value proposition
So what?
• Can Australia seize the opportunities and maintain its dominance in Finance?
• Will Australian businesses trust the machines or will they continue to rely on people-centric business models?
• It’s not too late for the Australian Finance Industry to leapfrog, but the window is rapidly closing.
Lessons Learned
• Focus on applications that increase revenue and/or reduce costs
• Focus on markets where speed and accuracy are critical
• Focus on addressing the challenge of too much data, not enough qualified analysts
• Focus on applications where Data Analysis can be automated through AutoEncoders
THE ADATOS METHOD
Our valued clients in the private sector understand the urgent imperative to glean insights from data sources. Too much data, not enough analysts. Enter Adatos, with its ability to build learning machines that enhance the capabilities of your data science team, or provide them with the capability where it didn't exist before.
“The goal is to turn data into information, and information into insight.”
RETAIL: 360 view of the customer; Click-stream analysis; Real-time promotion; Location-based marketing
FINANCIAL SERVICES: 360 view of the customer; Fraud detection; Risk management
HEALTHCARE: Personal health monitoring; Remote healthcare; Epidemic early warnings
OTT PLAYERS: 360 view of the customer; Location-based services; Local advertising; Real-time services
UTILITIES & GOVERNMENT: Smart grid management; Real-time monitoring
MARKETING: Market research; Targeted advertising; Product creation
CUSTOMER SERVICE: Multi-channel CXP analysis; Mobile self-care; Cross-selling and up-selling
LOGISTICS: Supply chain management; Predictive logistics; Integrated distribution
INTELLIGENT NETWORKS: Smart grids; M2M; Video surveillance
GEOLOCATION: Location-based services; Navigation; Geospatial analytics
Verticals HORIZONTALS
OUR CORPUS OF KNOWLEDGE
Speed and Flexibility in Deployment with our partners
Because we have a structured, repeatable methodology, we are able to build new solutions rapidly. Setup becomes even faster if we’re pulling a curriculum off the shelf. Some curriculums we’ve built:
• Credit scoring
• Sales forecast
• Purchase prediction
• Market basket recommender
• Intrusion detection
• Anti-money laundering
• Fraud detection
Adatos Case Studies
IDENTIFYING LIKELIHOOD OF SETTLEMENT OF NPLs
Client:
• An innovative finance technology company in India, helping consumers improve their credit score by providing a friendly facility for settling outstanding loans (NPLs), and at the same time helping banks to collect on NPLs
Challenge: Develop a prediction engine that is able to predict the likelihood of settlement of an outstanding NPL by a borrower
Data:
• Borrower Credit History across all accounts
• Borrower Current Credit Status across all accounts
Results:
• Correctly identified 85.5% of NPLs as high probability of complete default or write-off
Adatos Case Studies
IDENTIFYING POTENTIAL NPL AT LOAN APPLICATION THROUGH A SMART CREDIT SCORING SYSTEM
Client:
• The Philippines’ largest bill payment aggregator, covering 70% of the bill payments across 200 billers in a population of 100M.
• One of the biggest retail and micro lending institutions in the Philippines, with a 33B loan book
Challenge: Develop a Credit Scoring system based on borrower bill payment behavior, borrower demographics, and historical loan payment behavior
Data:
• Loan customer data and loan repayment data of half a million accounts, spanning five years
• Bill Payment data across 200 billers, for 36 months, at about 34 million transactions per month (>1B transactions)
Results:
• Identified 80.03% of loans that eventually defaulted, went into legal proceedings or were written off. If the model had been applied at time of application, the lending company could have potentially saved PHP 720 million (US$15 million)
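The 80.03% figure reads like a recall (sensitivity) measure: of the loans that eventually went bad, the share the model had flagged. That reading is an interpretation, not something the slide states. A minimal sketch of the calculation follows; the function name `recall` and the labels are illustrative and are not the client's data.

```python
def recall(actual_bad, predicted_bad):
    """Share of truly bad loans that the model flagged as bad."""
    flagged = sum(1 for a, p in zip(actual_bad, predicted_bad) if a and p)
    total_bad = sum(1 for a in actual_bad if a)
    if total_bad == 0:
        raise ValueError("no bad loans in the sample")
    return flagged / total_bad


# Illustrative labels only: True means the loan defaulted / was flagged.
actual = [True, True, True, True, True, False, False, False]
predicted = [True, True, True, True, False, False, True, False]

print(f"recall = {recall(actual, predicted):.2%}")
```

Recall alone says nothing about how many good loans were wrongly flagged, so it is usually read alongside a false-positive measure.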
NPL RISK PREDICTION AT LOAN APPLICATION
01 DETECT: Identify Non-Performing Loans (NPLs) on historical and current loan book. Rule: Aging > 90 days? Flag as NPL.
02 PROFILE: Collect customer data and macro-economic data related to customer geography and industry.
03 BUILD: Create Curriculum & Build Cognitive Machine that Predicts NPL Probability.
04 PREDICT: Predict NPL Risk of New Applications.
Example profiled records:
Cust A, 35 yrs, City XYZ, Retail, MYR 150,000 income, … [Feature N], NPL: N
Cust B, 50 yrs, City BBB, Mining, MYR 90,000, … [Feature N], NPL: Y
Cust C, 45 yrs, City ABC, Tech, MYR 200,000, … [Feature N], NPL: N
Inputs to the NPL Prediction Machine: New Application, Macro-Economic Indicators, Income & Spending Behavior, Demographic & Lifestyle Info
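The DETECT step is a simple rule (aging over 90 days means NPL), and its output becomes the label column for the later BUILD step. A minimal sketch of that labelling follows; the `Loan` class, the `days_past_due` field and the sample values are assumptions for illustration, and only the 90-day threshold comes from the slide.

```python
from dataclasses import dataclass

NPL_AGING_DAYS = 90  # threshold from the slide: aging > 90 days


@dataclass
class Loan:
    customer: str
    days_past_due: int  # hypothetical field name for loan aging


def flag_npl(loan: Loan) -> bool:
    """Apply the DETECT rule: strictly more than 90 days overdue."""
    return loan.days_past_due > NPL_AGING_DAYS


# Illustrative loan book; the aging values are made up.
loans = [Loan("Cust A", 0), Loan("Cust B", 120), Loan("Cust C", 90)]

# Y/N labels in the same form as the example records (NPL: Y / NPL: N).
labels = {loan.customer: ("Y" if flag_npl(loan) else "N") for loan in loans}
print(labels)
```

A loan at exactly 90 days is not flagged here because the slide's rule is written as "greater than 90".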
Adatos Case Studies
PROFILING CONVENIENCE STORE CUSTOMERS
Client: One of North America’s largest convenience store operators with more than 16,000 stores across Canada, the United States, Europe, Mexico, Japan, China, and Indonesia
Challenge: Profile current customer behavior and develop a market basket recommender to offer better product bundles.
Data Size: Store data at 2,500 line items per hour.
Results:
• Identified over 50 market baskets of goods to be used in product bundling
• Identified over 30 micro-segments to be used in targeted marketing efforts
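A market basket recommender starts from items that are frequently bought together. The slide does not say which technique was used, so the sketch below shows only the simplest building block, pair support (the share of baskets containing both items). The function name `pair_support`, the threshold and the sample baskets are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets, min_support=0.5):
    """Return item pairs whose share of baskets is at least min_support."""
    n = len(baskets)
    counts = Counter()
    for basket in baskets:
        # De-duplicate and sort so each pair has one canonical form.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Illustrative baskets only (not the client's data).
baskets = [
    ["coffee", "donut"],
    ["coffee", "donut", "gum"],
    ["soda", "chips"],
    ["coffee", "gum"],
]

frequent = pair_support(baskets, min_support=0.5)
print(frequent)
```

Pairs that pass the threshold are candidates for bundling; a production recommender would also look at larger item sets and at measures such as confidence or lift.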
Adatos Case Studies
SMART DATA BLENDING BETWEEN DATASETS WITH NO UNIQUE KEY
Client:
• One of the largest global Fast Moving Consumer Goods (FMCG) companies, in over 150 countries with over 400 brands
• Data analytics function headquartered in Singapore
Challenge: Match various datasets with no unique key; datasets can have a varied number and types of fields
Data:
• Store names and addresses for multiple countries
• Items (SKUs) with different variants, packaging, and size
• Unmatched categories to items
Results:
• A novel and smart data blending solution that uses text analysis to determine the context of the dataset, and our deep learning technology to apply the appropriate string metric / string distance function; it also has a learning functionality that updates the matcher with user corrections
• Up to 97% accuracy in finding matches, depending on dataset
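Matching records without a shared key comes down to scoring how similar two strings are and accepting the best candidate above a threshold. The sketch below uses a single fixed metric, the ratio from Python's standard `difflib.SequenceMatcher`, as a stand-in; it does not reproduce the metric selection or the learning from user corrections described above. The store names, the `normalise` helper and the 0.8 threshold are illustrative assumptions.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case, drop commas and full stops, collapse whitespace."""
    cleaned = text.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity in [0, 1] between two normalised strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def best_matches(left, right, threshold=0.8):
    """Map each left-hand name to its best right-hand match, or None."""
    matches = {}
    for name in left:
        score, candidate = max((similarity(name, r), r) for r in right)
        matches[name] = candidate if score >= threshold else None
    return matches


# Hypothetical store names from two datasets with no common key.
left = ["Acme Mart, Main St.", "Best Foods Central", "Zed Grocers"]
right = ["ACME MART MAIN STREET", "Best Foods Central Branch", "Corner Shop North"]

matches = best_matches(left, right)
for name, match in matches.items():
    print(f"{name!r} -> {match!r}")
```

Returning None below the threshold keeps weak matches out of the blended dataset, so they can be reviewed by a person instead.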
USER PROFILING & MICROSEGMENTATION
Data-driven segmentation allows discovery of micro-segments by profiling customer behavior and not demographics, previously not discoverable by traditional statistics-based, model-driven market segmentation.
Usage Habits
• Time, Location and Music Profile listened to
• Download history
• Frequent users
• Merchandise and Concert Ticket Purchase History
Taste / Preferences
• Favorite Artists to Stream / Download / Watch Music Video / Share
• Favorite Music Profiles to Stream / Download / Watch Music Video / Share
External Sources
• WeChat in-App purchases: Taxi, Food Delivery, Movie Tickets, Concert Tickets
• Use WeChat feature to identify songs playing and add it to music profile
Customer Value
• Spend capacity
• Recency
• Frequency
Adatos vs. Traditional Methods
• Smarter Segmentation: Identify market segments outside of traditional demographic segmentation by finding behavioral patterns and clustering the portfolio into very specific micro-segments, developing a customer DNA that is clear and actionable.
• Enable Dynamic Segmentation: Our self-learning, self-evolving machines are able to adjust to how customers move across segments through time and provide recommendations as needed on next best action for each customer.
• Predict Customer Behavior: Our predictive machines not only profile based on historical behavior, but are able to predict future customer behavior: whether they will spend more or less, churn, etc.
Adatos Case Studies
PROFILING LENDER BEHAVIOR FOR A SMART CREDIT SCORING SYSTEM
Client:
• The Philippines’ largest bill payment aggregator, covering 70% of the bill payments across 200 billers in a population of 100M.
• One of the biggest retail and micro lending institutions in the Philippines, with a 33B loan book
Challenge: Develop a Credit Scoring system based on borrower bill payment behavior, borrower demographics, and historical loan payment behavior
Data:
• Loan customer data and loan repayment data of half a million accounts, spanning five years
• Bill Payment data across 200 billers, for 36 months, at about 34 million transactions per month (>1B transactions)
Results:
• By profiling customers based on payment behavior, false positives were reduced, thereby reducing loan default rates by 80%, potentially saving the lending company more than PHP 720 million (US$15 million)
MICROSEGMENTS – Examples
REAL WORLD USE CASE - TA FENG DATASET
• Grocery POS Data for one (1) branch
• 4 months: November 2001 – February 2002
• 817,739 Transactions
• 32,266 Customers
• 23,812 Product IDs / SKUs
Data Available
• Age Group
• Residential Area w/ relative distance to grocery
• Product Category / Sub-class (Masked: numeric, no semantic information on category nor product)
• Transaction Info: Product, Price, Quantity bought
TA FENG DATASET – CURRICULUM
Features on Dataset
• Age Group
• Residential Area w/ relative distance to grocery
• Product Category / Sub-class (Masked: numeric, no semantic information on category nor product)
• Transaction Info: Product, Price, Quantity bought
Derived Features
• Customer Total Spend
• Customer Spend per Visit
• Total Number of Visits
• Frequency of Visits (Average # days between visits)
• Lifetime (Duration from first to last visit)
• Number of Categories Bought per Visit (Variety)
• % of Weekend & Weekday Shopping Days
• High vs Low Value shopper: % of High, Medium and Low Priced items per Category bought
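Most of the derived features listed above are simple aggregations over each customer's transactions. A minimal sketch follows, assuming a "visit" means one customer on one calendar date; that definition, the row layout and the sample rows are illustrative assumptions and are not taken from the dataset.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (customer_id, visit_date, category, amount)
transactions = [
    ("c1", date(2001, 11, 3), "100", 120.0),   # Saturday
    ("c1", date(2001, 11, 3), "200", 80.0),
    ("c1", date(2001, 11, 13), "100", 100.0),  # Tuesday
    ("c2", date(2001, 12, 5), "300", 50.0),    # Wednesday
]


def derive_features(rows):
    """Aggregate transaction rows into per-customer behavioural features."""
    visits = defaultdict(lambda: defaultdict(list))  # customer -> date -> items
    for customer, day, category, amount in rows:
        visits[customer][day].append((category, amount))

    features = {}
    for customer, by_day in visits.items():
        days = sorted(by_day)
        n_visits = len(days)
        total = sum(amount for items in by_day.values() for _, amount in items)
        gaps = [(b - a).days for a, b in zip(days, days[1:])]
        variety = sum(len({cat for cat, _ in items}) for items in by_day.values())
        features[customer] = {
            "total_spend": total,
            "spend_per_visit": total / n_visits,
            "total_visits": n_visits,
            # Undefined for single-visit customers, so left as None.
            "avg_days_between_visits": sum(gaps) / len(gaps) if gaps else None,
            "lifetime_days": (days[-1] - days[0]).days,
            "categories_per_visit": variety / n_visits,
            # weekday() returns 5 for Saturday and 6 for Sunday.
            "weekend_visit_pct": sum(d.weekday() >= 5 for d in days) / n_visits,
        }
    return features


features = derive_features(transactions)
for customer, values in features.items():
    print(customer, values)
```

The high versus low value feature is left out because it depends on price bands per category, which the slide does not define.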
Identify Micro segments through clustering
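The deck does not name the clustering algorithm behind the micro-segments, and elsewhere it refers to auto-encoders. As a stand-in only, the sketch below shows plain k-means on two derived features (average days between visits, total visits). The sample points, the choice of k and the naive initialisation from the first k points are illustrative assumptions.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means; initial centroids are simply the first k points."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        assignment = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical customers: (avg days between visits, total visits)
customers = [(90, 1), (12, 14), (85, 2), (11, 13), (20, 6)]

assignment, centroids = kmeans(customers, k=2)
print(assignment)
print(centroids)
```

In practice the features would be scaled first, since otherwise the feature with the largest numeric range dominates the distance.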
Describe MICRO Segments
Frequency: Average Number of Days Between Visits

MICRO SEGMENT   Average of Avg Days Between   StdDev of Avg Days Between
cluster_0       89.88                         40.72
cluster_1       20.63                         12.00
cluster_2       82.11                         42.02
cluster_3       70.95                         46.61
cluster_4       11.89                         6.89

[Figure: Combined Dimension vs. Avg. Number of Days Between Visits, cluster 0 to cluster 4]
Describe MICRO Segments
Loyalty: Total Number of Visits over 4 Months

MICRO SEGMENT   Avg Total Visits   StdDev of Total Visits
cluster_0       1.43               0.62
cluster_1       6.44               3.86
cluster_2       1.71               0.91
cluster_3       2.19               1.66
cluster_4       13.52              9.39

[Figure: Combined Dimension vs. Total Number of Visits per Customer in 4 Months, cluster 0 to cluster 4]
Describe MICRO Segments
Variety: Unique Categories per Visit

MICRO SEGMENT   Average of Unique Cats per Visit (Mean)   StdDev of Unique Cats per Visit (Mean)
cluster_0       10.01                                     3.04
cluster_1       6.27                                      2.53
cluster_2       19.91                                     5.04
cluster_3       3.29                                      1.74
cluster_4       9.13                                      4.48

[Figure: Combined Dimension vs. Unique Categories per Visit (Mean), cluster 0 to cluster 4]
Describe MICRO Segments
Spending Capacity: Spend per Visit

MICRO SEGMENT   Average of Spend Per Visit (Mean)   StdDev of Spend Per Visit (Mean)
cluster_0       1,602.55                            858.90
cluster_1       910.19                              529.23
cluster_2       3,112.35                            1,201.52
cluster_3       495.86                              414.40
cluster_4       1,389.56                            887.94

[Figure: Combined Dimension vs. Spend per Visit (Mean), cluster 0 to cluster 4]
Describe MICRO Segments
Spending Capacity: Total Spend over 4 Months

MICRO SEGMENT   Average of Total Spend   StdDev of Total Spend
cluster_0       2,332.14                 1,721.33
cluster_1       4,937.85                 2,646.95
cluster_2       5,258.73                 3,460.76
cluster_3       998.96                   956.85
cluster_4       13,676.55                6,076.59

[Figure: Combined Dimension vs. Total Spend over 4 Months, cluster 0 to cluster 4]
DESCRIBE MICRO SEGMENTS
High Value vs Low Value Shopper: % of highest-priced category items bought over all purchases (E_pct) and % of lowest-priced category items bought over all purchases (A_pct)
[Figure: E_pct and A_pct for cluster 0 to cluster 4]
• Cluster 0, Cluster 1, and Cluster 3 have both high value and low value customers; Cluster 1 customers don’t have as high a % of highest and lowest priced items
• Cluster 2 customers generally have lower % of lowest-value purchases
• Cluster 4 customers generally have lower % of highest-value purchases
Describe MICRO Segments
Distance from Store per Cluster
[Figure: % of Customers in Area, by cluster (cluster_0 to cluster_4); legend: Area (from Farthest to Closest): 1 (Closest), 2, 3, 4, 5, Unknown]
• Cluster 1 and Cluster 4 have most customers living closest to the store
• Cluster 0 and Cluster 2 have most customers living farthest from the store
Describe MICRO Segments
Age Composition per Cluster
[Figure: % of Customers in Age Range, by cluster (cluster_0 to cluster_4); legend: Age Range 1 (25-29 yrs), Age Range 2 (30-39 yrs), Age Range 3 (40-49 yrs), Age Range 4 (50-59 yrs), Age Range 5 (60+)]
• Age doesn’t appear to have an impact on segment behavior
Describe MICRO Segments
Weekend vs Weekday Visits
[Figure: % of Visits (Average), by cluster (cluster_0 to cluster_4); legend: Average of Weekend Visit %, Average of Weekday Visits %]
• Clusters 1, 3, and 4 have mostly weekday purchases
• Cluster 0 and Cluster 2 have roughly equal weekend and weekday visits

MICRO SEGMENT   Average of Weekend Visit %   Average of Weekday Visits %
cluster_0       50.61%                       49.39%
cluster_1       37.25%                       62.75%
cluster_2       54.68%                       45.32%
cluster_3       36.20%                       63.80%
cluster_4       39.76%                       60.24%
MICRO SEGMENT SUMMARY
Frequency (Avg. # of Days Between Visits)
• Cluster 0: Infrequent (Avg 90 days)
• Cluster 1: Frequent (Avg 20 days)
• Cluster 2: Infrequent (Avg 82 days)
• Cluster 3: Infrequent (Avg 71 days)
• Cluster 4: Frequent (Avg 12 days)
Loyalty (Total # of Visits in 4 months)
• Cluster 0: Casual Shopper (Avg 1.43x in 4 months)
• Cluster 1: Loyal Shopper (Avg 6.44x in 4 months)
• Cluster 2: Casual Shopper (Avg 1.71x in 4 months)
• Cluster 3: Casual Repeat Shopper (Avg 2.19x in 4 months)
• Cluster 4: Loyal Shopper (Avg 13.52x in 4 months)
Variety (Unique Categories per Visit)
• Cluster 0: Moderate Variety (Avg 10 Cats per Visit)
• Cluster 1: Moderate Variety (Avg 6.27 Cats per Visit)
• Cluster 2: High Variety (Avg 20 Cats per Visit)
• Cluster 3: Low Variety (Avg 3.29 Cats per Visit)
• Cluster 4: Moderate Variety (Avg 9 Cats per Visit)
Spending Capacity (Spend per Visit)
• Cluster 0: Moderate Spender (Avg 1.6k per Visit)
• Cluster 1: Low Spender (Avg 910 per Visit)
• Cluster 2: High Spender (Avg 3,112 per Visit)
• Cluster 3: Low Spender (Avg 495 per Visit)
• Cluster 4: Moderate Spender (Avg 1,389 per Visit)
Spending Capacity (Total Spend over 4 Months)
• Cluster 0: Low Spender (Avg 2,332 total)
• Cluster 1: Moderate Spender (Avg 4,938 total)
• Cluster 2: Moderate Spender (Avg 5,259 total)
• Cluster 3: Low Spender (Avg 999 total)
• Cluster 4: High Spender (Avg 13,677 total)
High Value vs Low Value
• Cluster 0: High Value & Low Value Choices
• Cluster 1: Mostly Moderate Value Choices
• Cluster 2: Few Low Value Choices, some High Value Choices
• Cluster 3: High Value & Low Value Choices
• Cluster 4: Few High Value Choices
Distance from Store
• Cluster 0: Far from store
• Cluster 1: Near store
• Cluster 2: Far from store
• Cluster 3: Near to moderately near store
• Cluster 4: Near store
Weekend vs Weekday
• Cluster 0: Weekend & Weekday Shopper
• Cluster 1: Weekday Shopper
• Cluster 2: Weekend Shopper
• Cluster 3: Weekday Shopper
• Cluster 4: Weekday Shopper
THANK YOU
Questions
Australian Contact:
Paul Dovas
Regional Sales Director, Australia
M: +61 (0) 407 981 755 E: [email protected] W: www.adatos.com