Big Data Analytics @ Munich Re
Munich Re Life Forum
Wolfgang Hauner, Chief Data Officer, Munich Re
20 October 2016
Image source: Mark Moffett / Getty Images
Agenda
1 Data Analytics Framework
2 Current Analytics Activities
3 Method Example: AI
4 Advanced Analytics: MR-Examples from the field
© Munich Re
Loc-based services
Smart Home
Telematics
Virtual Assistant Systems
Haptic Technologies
Integrated Systems
Autonomous Systems and Devices
Automated Decision Taking
Cloud/Client Architecture
New Payment Models
Big Data
Internet of Things
Cybersecurity
Digitalization
Computing Everywhere
Robotics/Drones
Wearable Devices
Risk-based Security
Context-aware Computing
Open Data
Collaborative Consumption
Predictive Analytics
Industrialization 4.0
Web 4.0
Web-Scale IT
Software-defined Anything
Crowdsourcing
Mobile Health Services
3D Printing
Augmented and virtual worlds
Citizen Development
User Centered Design
Digital Identity
On-Demand-Everything
Big Data in Trend Radar
Big Data
Digitization
Internet of Things
When does it become BIG Data?
An estimated 43 zettabytes of data will be generated by 2020, 300 times the volume in 2005
Roughly 40,000,000,000,000,000,000,000 bytes
Source: IBM
Scale: Byte, Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte, Exabyte, Zettabyte
Examples along the scale:
Yes or No
4 KB Commodore VC 20
3.5 inch floppy disk
Data contained in a library floor
4 TB in-memory Big Data Platform MR
Petabyte storage Big Data Platform
Google, Facebook, Microsoft…
All words ever spoken by humans
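The orders of magnitude on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal prefixes (1 kilobyte = 1,000 bytes); the helper name `to_bytes` is illustrative and not part of the presentation.

```python
# Unit ladder as shown on the slide, from byte up to zettabyte.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte",
         "terabyte", "petabyte", "exabyte", "zettabyte"]


def to_bytes(value, unit):
    """Convert a value in the given unit to bytes (assumes 1 kB = 1000 B)."""
    return value * 1000 ** UNITS.index(unit)


volume_2020 = to_bytes(43, "zettabyte")   # forecast quoted on the slide
volume_2005 = volume_2020 / 300           # "300 times the volume in 2005"

print(f"{volume_2020:,} bytes")           # 43,000,000,000,000,000,000,000 bytes
print(f"{volume_2005 / to_bytes(1, 'exabyte'):.0f} exabytes in 2005")
```

Under these assumptions, 43 zettabytes is about 4.3 × 10^22 bytes, the same order of magnitude as the 23-digit number on the slide, and the implied 2005 volume is on the order of 140 exabytes.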
Big Data Analytics is a Combination of Methods, Technology, Data and People
Methods: Regression Models, Machine Learning Models, Text Mining
Technology: Hardware (Compute power), Software (SAS, R, Spark, …)
Data: Internal Data, External Data, Structured Data, Unstructured Data
People: Data Scientists, Data Engineers, Business People
Building the Team, and the Environment
Programming
Story-telling
Statistics
Visualization
System Implementation
DB Administration
Maths
Modelling
Data Storage
Business/Domain knowledge
Business Units
IT
Building the Infrastructure
[Architecture diagram: BI, Lab and Production environments; components SAS, HANA and Hadoop Stack, each with a User Interface; A2P]
Data Lake (HDFS): Long term unstructured and structured data
Which topics drive our clients?
Up-/Cross-Selling
Data Sources
Text Mining
Churn Analysis
Supply Chain
Social Media Analysis
Fraud Detection
Big Data Technology
Predictive UW
Telematics
Sensor Data/IoT
Geospatial
Big Data use cases in insurance
Make the uninsurable insurable: Diabetics, Wind Energy
Consolidate the information and process: Automated underwriting, Risk management platform
Artificial Intelligence supported workflow: Early Loss Detection, Visual Loss Adjustment
Agenda: 2 Current Analytics Activities
Pilot Fact Sheet: Internet Research & Intelligence System (IRIS)
Results / Benefits
Multi-dimensional searches based on standardized search technology to accelerate web research (example: Tianjin)
Extended analytics to gather further data insights, e.g., based on topic analysis and organizational grouping
Parallel processing and delta mechanism for multi-processed search requests
Results shown in different visualizations (word cloud, table, topic analysis, etc.) and exportable to Excel
Outlook
Additional analytics modules for better insights and broader application
Collaboration functionalities for more efficient case analysis
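The slide does not describe how IRIS implements parallel processing or its delta mechanism. As a minimal sketch, assuming "delta" means that a repeated search returns only hits not delivered before, the idea could look as follows. `run_search`, the queries and the result identifiers are hypothetical stand-ins, not IRIS functionality.

```python
from concurrent.futures import ThreadPoolExecutor


def run_search(query):
    """Hypothetical stand-in for a real web search call (invented results)."""
    fake_index = {
        "Tianjin explosion": ["url-a", "url-b"],
        "Tianjin port": ["url-b", "url-c"],
    }
    return fake_index.get(query, [])


def search_with_delta(queries, already_seen):
    """Run all queries in parallel and return only results not seen before."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map keeps the order of the input queries.
        result_lists = list(pool.map(run_search, queries))
    new_results = []
    for results in result_lists:
        for url in results:
            if url not in already_seen:
                already_seen.add(url)
                new_results.append(url)
    return new_results


seen = set()
first = search_with_delta(["Tianjin explosion", "Tianjin port"], seen)
second = search_with_delta(["Tianjin explosion", "Tianjin port"], seen)
print(first)    # ['url-a', 'url-b', 'url-c']
print(second)   # [] because nothing new appeared since the first run
```

Keeping the set of already delivered results outside the function lets repeated runs of the same request report only the changes.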
Text Mining and Web Crawling: Hong Kong Monetary Authority announces FinTech ‘sandbox’
Cross-Selling with Machine Learning
Analysis of pension, life and investment portfolios for product development and targeted sales
Organisation: Clients who are no longer active are removed. The remaining data is split into INSURANCE and no INSURANCE.
Separation: The data is now randomly split into 5 equally sized boxes. Each box contains both INSURANCE and no INSURANCE clients; however, the proportion within each box varies.
Testing: The first 4 boxes are aggregated again to form the first so-called “set of training data”, which is used for sampling the first buying characteristics.
Random Forest (RF): Using machine learning methods, 300 decision trees are generated, simulating customer characteristics. The simulations show chains of combinations leading to INSURANCE and no INSURANCE.
Validation: The newly created random forest is then used to back-test the remaining 5th box: how accurately can we forecast who bought INSURANCE and who did not?
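The five-box procedure can be sketched with the standard library alone. This is an illustration under loud assumptions, not the model used at Munich Re: the client data is synthetic, the three features are invented, and each "tree" is only a depth-1 decision stump trained on a bootstrap sample with a random feature subset. Only the 5 boxes and the 300 trees are taken from the slide.

```python
import random
from collections import Counter

N_TREES = 300   # number of trees, as on the slide
N_BOXES = 5     # number of boxes, as on the slide


def make_clients(n, rng):
    """Synthetic clients as (features, bought_insurance); purely illustrative."""
    clients = []
    for _ in range(n):
        bought = 1 if rng.random() < 0.3 else 0
        # Two informative features and one pure-noise feature (all invented).
        features = [bought * 2 + rng.gauss(0, 0.5),
                    bought * 2 + rng.gauss(0, 0.5),
                    rng.gauss(0, 1)]
        clients.append((features, bought))
    return clients


def fit_stump(sample, feature_ids):
    """Depth-1 tree: pick the (feature, threshold) with the fewest training errors."""
    best = None
    for f in feature_ids:
        values = sorted(x[f] for x, _ in sample)
        for q in range(1, 10):  # decile thresholds keep the sketch fast
            t = values[q * len(values) // 10]
            left = Counter(y for x, y in sample if x[f] <= t)
            right = Counter(y for x, y in sample if x[f] > t)
            left_label = left.most_common(1)[0][0] if left else 0
            right_label = right.most_common(1)[0][0] if right else 0
            errors = (sum(left.values()) - left[left_label]
                      + sum(right.values()) - right[right_label])
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]


def fit_forest(train, rng):
    """Bagging: every stump sees a bootstrap sample and 2 random features."""
    n_features = len(train[0][0])
    forest = []
    for _ in range(N_TREES):
        sample = [rng.choice(train) for _ in train]
        feature_ids = rng.sample(range(n_features), 2)
        forest.append(fit_stump(sample, feature_ids))
    return forest


def predict(forest, x):
    """Majority vote over all stumps."""
    votes = sum(left if x[f] <= t else right for f, t, left, right in forest)
    return 1 if 2 * votes > len(forest) else 0


rng = random.Random(42)
clients = make_clients(500, rng)
rng.shuffle(clients)
boxes = [clients[i::N_BOXES] for i in range(N_BOXES)]   # 5 even boxes
train = [c for box in boxes[:4] for c in box]           # boxes 1 to 4
holdout = boxes[4]                                      # box 5 for the back-test
forest = fit_forest(train, rng)
accuracy = sum(predict(forest, x) == y for x, y in holdout) / len(holdout)
print(f"hold-out accuracy: {accuracy:.2f}")
```

The point of the split is that the fifth box plays no part in building the forest, so its accuracy is an honest estimate of how well buyers can be forecast.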
Agenda: 3 Method Example: AI
Methods: Neural Network
Insurance-specific Visual Intelligence
AI Community, e.g., Google, Facebook, …: General Object Vision Intelligence
Insurance Companies, e.g., Munich Re, …: Insurance-specific Vision Intelligence
Methods: Neural Network
System of interconnected nodes exchanging information
Weights of connections can be adjusted by supervised/unsupervised “learning”
Pros: Accuracy usually high, prediction fast
Cons: “Black box” – acquired knowledge not easily comprehensible, training effort high, appropriate data needed
Application areas, e.g., speech recognition, computer vision, medical diagnosis, automated trading, game-playing (AlphaGo)
[Diagram: network with Input, Hidden and Output layers; example road images labelled “No pothole identified”, “Pothole identified”, “No pothole identified”]
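To make "weights of connections can be adjusted by supervised learning" concrete, here is a minimal sketch of a network with input, hidden and output nodes, trained by gradient descent on one labelled example. The layer sizes, learning rate and the input vector are assumptions for illustration; a real pothole detector would work on image pixels and use a much larger network.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNetwork:
    """2 input nodes -> 3 hidden nodes -> 1 output node, all with sigmoid activation."""

    def __init__(self, n_in=2, n_hidden=3, seed=0):
        rng = random.Random(seed)
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)))
                  for ws in self.w_hidden]
        output = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden)))
        return hidden, output

    def train_step(self, x, target, lr=0.5):
        """One supervised update; returns the squared-error loss before the update."""
        hidden, output = self.forward(x)
        # Gradient of 0.5 * (output - target)^2 with respect to the output node input.
        delta_out = (output - target) * output * (1 - output)
        for j, h in enumerate(hidden):
            # Hidden-node gradient uses the output weight before it is updated.
            delta_h = delta_out * self.w_out[j] * h * (1 - h)
            self.w_out[j] -= lr * delta_out * h
            for i, xi in enumerate(x):
                self.w_hidden[j][i] -= lr * delta_h * xi
        return 0.5 * (output - target) ** 2


net = TinyNetwork()
x, target = [1.0, 0.0], 1.0    # hypothetical features, label 1 = "pothole identified"
losses = [net.train_step(x, target) for _ in range(200)]
print(f"loss before training: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

The trained weights are just numbers with no direct interpretation, which is the "black box" drawback listed above.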
Methods: Potential use cases of Neural Networks
Infrastructure Insurance
Detect road damage
Categorize damage
Estimate claim
Trigger repair action
Agenda: 4 Advanced Analytics: MR-Examples from the field
Predictive Underwriting with Machine Learning
Which factors explain the underwriting outcome, which are not significant?
Only 20 of 58 fields are required to predict the underwriting result
[Bar chart, scale 0 to 100] Fields shown: Occupation Code; Q: Sports; Subsidiary; BMI; Job activity; Covers included; Job position; Q: Under treatment; Free time activity; Age; Q: Systemdis./addict./scelet.; Gender; Q: Bike; Entry year; Diff. Age to partner; Sum_insured_Life; Sum_insured_TRANS; Relationship to benef. 1; Relationship to benef. 2; Insurance cover code
Remarks
The set of explanatory variables differs based on the covers included (as expected)
Top 10 factors are mainly linked to accidental risk (occupation, activity, job position, free time activity). This is explained by the high percentage of cases with accidental covers included
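The slides do not state how the importance of each field was measured. One common, model-agnostic option is permutation importance: shuffle one field across all applications and see how much the prediction accuracy drops. The sketch below assumes that technique; the toy decision rule, the data and the second field name are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_fields, rng):
    """Accuracy drop per field when that field is shuffled across applications."""
    baseline = accuracy(model, rows, labels)
    drops = []
    for f in range(n_fields):
        column = [r[f] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:f] + [v] + r[f + 1:] for r, v in zip(rows, column)]
        drops.append(baseline - accuracy(model, shuffled, labels))
    return drops


# Toy setup: field 0 drives the decision, field 1 is ignored by the model.
FIELDS = ["Q: Sports", "hypothetical_unused_field"]
rows = [[i % 2, (i // 2) % 2] for i in range(100)]


def model(row):
    """Invented underwriting rule used only for this illustration."""
    return "loaded" if row[0] == 1 else "standard"


labels = [model(r) for r in rows]
drops = permutation_importance(model, rows, labels, len(FIELDS), random.Random(0))
selected = [name for name, drop in zip(FIELDS, drops) if drop > 0]
print(dict(zip(FIELDS, drops)))
print("fields required:", selected)
```

A field with zero drop carries no weight in the model, which is how a reduction such as "20 of 58 fields" can be read; whether the related question may be dropped is a separate underwriting decision.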
Predictive Underwriting with Machine Learning
Which application questions impact the underwriting outcome, which do not?
Impact on probability for standard or loaded/rejected decision
Legend: High, Higher, Low, Lower, No
Currently doing dangerous sports?
Currently under treatment or advised surgery?
Internal disease, skeletal condition, addiction?
Cancer or neuro-psychol. condition in last 10y?
Motorbike as competition?
HIV/AIDS?
Motocross?
Taken drugs in last 10y?
Daily use of motorbike?
Currently pregnant?
Had treatment or medical exams in last 3y?
Motorbike?
Hospitalization in last 3y?
Family history?
Smoked in previous 12 months?
Previous or advised rehabilitation for addiction?
Pregnancy complication?
Stopped usual tasks in previous year?
Plan to visit/reside abroad?
Remarks
There are no questions which are always answered with “YES” or “NO”
Some questions did not have any impact in the model (the data could not explain why)
A factor having no impact in the model does not mean that the related question can be waived (it still has an impact on selection, e.g., HIV question, rehabilitation for addiction) → careful consideration required
Machine Learning as alternative to classical modeling
Get better performance + applicable to Big Data
Compare different Machine Learning algorithms (Support Vector Machines, Random Forests, Boosted Trees, Regression Boosting, Lasso-regularized Regression) with classical GLMs
Applied to mortality data
Additionally: Clear visualization of main and interaction effects
Machine Learning helps in understanding and selecting the most relevant influential factors
Modern Machine Learning Techniques
Allow detection of more detailed patterns
[Charts: GLM (traditional approach) and Random Forest] Observed and predicted average claimed amounts in 2012 itemized by age and gender (here only women)
The traditional approach takes age and gender into account and therefore mostly performs quite well on average. Only the random forest detects the peak for women in their thirties (pregnancy treatments)
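Why a tree-based model can pick up a local peak that a model with a smooth age effect misses can be shown on synthetic data. The sketch below is an assumption-laden stand-in: a straight-line fit replaces the GLM, a single small regression tree replaces the random forest, and the claim amounts (including the bump for ages 30 to 39) are invented.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_line(xs, ys):
    """Ordinary least squares with one predictor (stand-in for the GLM)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


def fit_tree(xs, ys, depth):
    """Regression tree on one feature; each leaf predicts its mean."""
    mean = sum(ys) / len(ys)
    if depth == 0 or len(set(xs)) < 2:
        return lambda x: mean
    best = None
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, t)
    t = best[1]
    lo = fit_tree([x for x in xs if x < t],
                  [y for x, y in zip(xs, ys) if x < t], depth - 1)
    hi = fit_tree([x for x in xs if x >= t],
                  [y for x, y in zip(xs, ys) if x >= t], depth - 1)
    return lambda x: lo(x) if x < t else hi(x)


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic average claims: linear trend plus an invented peak for ages 30 to 39.
ages = list(range(20, 60))
claims = [100 + 2 * a + (300 if 30 <= a < 40 else 0) for a in ages]

line = fit_line(ages, claims)
tree = fit_tree(ages, claims, depth=2)
print(f"linear MSE: {mse(line, ages, claims):.0f}")
print(f"tree MSE:   {mse(tree, ages, claims):.0f}")
```

The line can only tilt, so it spreads the peak over all ages, while the tree isolates the 30 to 39 band in its own leaf.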
MOCA (Medical and Occupational Coding Assistant)
Tool that maps codes to medical descriptions

Claims with codes – “training dataset”
Claim ID | Description | Topic 1 | … | Topic m | Code
1 | “Heart attack” | -0.35 | … | 0.64 | 08008
… | … | … | … | … | …
n | “Breast cancer” | 0.17 | … | -0.04 | 09999

Machine learning model / Rules

Claims without codes – “scoring dataset”
Claim ID | Description | Topic 1 | … | Topic m | Predicted code
1 | “Stroke” | -0.25 | … | -0.14 | 01999
… | … | … | … | … | …
k | “Ovarian cyst” | 0.81 | … | 0.63 | 04325

Verify predictions on scoring dataset
Claim ID | Predicted code | Probability | Check
1 | 01999 | 0.83 |
… | … | … |
k | 04325 | 0.27 |
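The MOCA workflow (learn from coded claims, score uncoded ones, review uncertain predictions) can be illustrated with a very small text classifier. The sketch below is not MOCA itself: it swaps the topic features for plain word counts in a naive Bayes model, trains on only the two coded descriptions visible on the slide, and assumes that a prediction is flagged for a manual check when its probability falls below a hypothetical threshold of 0.6.

```python
import math
from collections import Counter, defaultdict

CHECK_THRESHOLD = 0.6   # hypothetical cut-off for manual review


def tokenize(text):
    return text.lower().split()


def train(claims):
    """claims: list of (description, code). Returns word and code counts."""
    word_counts = defaultdict(Counter)
    code_counts = Counter()
    for description, code in claims:
        code_counts[code] += 1
        word_counts[code].update(tokenize(description))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, code_counts, vocabulary


def predict(model, description):
    """Return (most likely code, probability) using Laplace-smoothed naive Bayes."""
    word_counts, code_counts, vocabulary = model
    total = sum(code_counts.values())
    log_scores = {}
    for code in code_counts:
        n_words = sum(word_counts[code].values())
        score = math.log(code_counts[code] / total)
        for w in tokenize(description):
            score += math.log((word_counts[code][w] + 1)
                              / (n_words + len(vocabulary)))
        log_scores[code] = score
    # Normalise the log scores into probabilities.
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())


# Training dataset: the two coded examples shown on the slide.
model = train([("Heart attack", "08008"), ("Breast cancer", "09999")])

# Scoring dataset: the first description is invented, the second is from the slide.
for description in ["Suspected heart attack", "Ovarian cyst"]:
    code, probability = predict(model, description)
    check = probability < CHECK_THRESHOLD
    print(f"{description!r}: code {code}, probability {probability:.2f}, check={check}")
```

With only two known codes, the toy model cannot code "Ovarian cyst" correctly; it returns an uninformative probability of 0.5, which is exactly the kind of case the Check column is meant to route to a human.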
From Trend to Business Value: An ongoing journey
Digitization
Big Data
IoT
Building the Infrastructure
Building the Team
Building Business Cases
Consolidate Process & Information
A.I. supported Workflow
Make the Uninsurable Insurable
…
The Big Data trend is a fact, bringing the insurance industry challenges & opportunities …
→ engaging the trend properly
… to turn the challenges into business potential.
Broad data sources
Advanced analytics
Visualization
ML/AI
Values for Insurance
Thank you!
Contact: Wolfgang Hauner, Chief Data Officer, Munich Re
[email protected]
© 2016 Münchener Rückversicherungs-Gesellschaft
© 2016 Munich Reinsurance Company
Image: Bayerische Zugspitzbahn Bergbahn AG / Lechner
Center for International Earth Science Information Network - CIESIN - Columbia University. 2016. Gridded Population of the World, Version 4 (GPWv4): Population Count. Palisades, NY: NASA Socioeconomic Data and Applications Center (SEDAC).