DATA SCIENCE AT DNB
WORLD BANK - FINSAC CONFERENCE ON FINTECH
IMAN VAN LELYVELD, 22 MAY 2019
Agenda
Process: How did it go? What have we accomplished so far?
Technology: Challenges & Solutions
Showcase: Explanation of PoCs: what are the new possibilities?
What’s next? What have we learned?
A dot on the horizon …
Data, People, IT and tools
The dream … and reality
Which elements do we need?
Tooling
• Research Area Network (RAN), data platform + analytical workspaces / datalabs / data science toolkit, memory/CPU/storage
• Cloud deployment; data (platform) connectivity, other connectivity (open data, etc.), quick scaling of datalabs
• Open source tooling (e.g. R, Python, Git, Neo4j, MongoDB, SQLite, MySQL, …)
People
• Appreciation of the scientific method
• Knowledge of statistics (descriptive, explorative, predictive, causal, ...)
• Knowledge of coding in interpreted languages (Python, R, Julia, ...) and supporting tools (Anaconda, Jupyter Notebooks, Git, ...)
Culture
• Informal: knowledge networks, lunches, seminars
• Creating a community, since many already do ‘something’ with data science: get-togethers, what do people need?, data science 101 sessions, seminars with externals, deep-dive sessions (R, Python, GitLab, Neo4j, MySQL, MongoDB, etc.), showing first results
Organisation
• Decentral vs. central
• Governance (!!!): data protection, deployment of analysis (KIVKII)
• Agile, pilots, data science as a brand
• FTEs
Process: venturing down unbeaten paths…
• Technology
• Legal: tension between experiments and a complete contract
• Means and projects
Technology: Challenges & Solutions
Modern, Flexible, Scalable
Secure, Manageable, Traceable
[Architecture diagram. Labels: DNB (DNB network, workplaces, DNB proxy); Internet (VPN, VPN tunnel, SSH tunnel, HTTPS); SURFsara (RAN, high secure: projects, log, Git repository, Ansible); multipliers ×10 and ×∞.]
Infinite number of projects
Makes governance possible through a case-oriented approach
Per VM: 100 TB, 80 cores, 512 GB
X.. VMs per project
Isolation per project
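To make the case-oriented setup concrete, here is a minimal sketch of per-project isolation with per-VM resource ceilings. It is an illustration only, not the actual RAN implementation: the class names `VM` and `Project`, the project name and the user names are hypothetical. Reading the slide's per-VM figures as upper limits, with 100 TB as storage and 512 GB as memory, is also an assumption.

```python
from dataclasses import dataclass, field

# Per-VM figures from the slide, read here as upper limits (an assumption).
MAX_STORAGE_TB = 100
MAX_CORES = 80
MAX_MEMORY_GB = 512


@dataclass(frozen=True)
class VM:
    """One virtual machine inside a project (hypothetical model)."""
    name: str
    storage_tb: int
    cores: int
    memory_gb: int

    def __post_init__(self):
        # Reject requests that exceed the assumed per-VM ceilings.
        if not 0 < self.storage_tb <= MAX_STORAGE_TB:
            raise ValueError(f"{self.name}: storage must be 1..{MAX_STORAGE_TB} TB")
        if not 0 < self.cores <= MAX_CORES:
            raise ValueError(f"{self.name}: cores must be 1..{MAX_CORES}")
        if not 0 < self.memory_gb <= MAX_MEMORY_GB:
            raise ValueError(f"{self.name}: memory must be 1..{MAX_MEMORY_GB} GB")


@dataclass
class Project:
    """An isolated project with its own members and its own VMs."""
    name: str
    members: set = field(default_factory=set)
    vms: list = field(default_factory=list)

    def add_vm(self, vm):
        self.vms.append(vm)

    def can_access(self, user):
        # Isolation per project: only members of this project get in.
        return user in self.members


# Hypothetical example project with a single VM.
credit_risk = Project("credit-risk", members={"alice"})
credit_risk.add_vm(VM("cr-vm-1", storage_tb=10, cores=16, memory_gb=128))
```

Keeping membership and machines on the project object mirrors the idea that governance decisions are taken case by case, per project, rather than per user or per machine.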
Showcase: Explanation of PoCs: what are the new possibilities?
PoC 1: Credit Risk 90
PoC 2: CCP Risk Indicators 50
PoC 3: CDS Contagion 100
PoC 4: IRS Margin Requirements 60
PoC 5: Pattern recognition in Solvency II reports 20
PoC 6: Residential Real Estate 10
PoC 7: AnaCredit 10
PoC 8: Future securities statistics 5
PoC 5: Pattern recognition in Solvency II reports
Plausibility of group reporting in Solvency II
• Sizable reports (> 4,000 dimensions per group)
• Domain knowledge required
• Specifying data quality (DQ) checks is labour intensive
Applying machine learning algorithms
• Checking causal relations within reports
• Comparing group reporting and solo reporting
Ongoing work
• Further DQ checks
• Linking with other financial data
• Other algorithms (# dimensions >> # groups)
• User interface for visualisation and feedback
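As an illustration of what an automated DQ check on such reports could look like, the sketch below flags groups whose ratio between two reported items deviates strongly from the cross-group median. It uses a modified z-score based on the median absolute deviation. This technique is chosen here for illustration and is not necessarily what the PoC used. The function name `flag_implausible`, the item names `own_funds` and `scr`, and all figures are hypothetical.

```python
from statistics import median


def flag_implausible(reports, item_a, item_b, threshold=3.5):
    """Return groups whose item_a / item_b ratio is an outlier across groups.

    Outliers are detected with the modified z-score, which is based on the
    median absolute deviation (MAD) and is robust to the outliers themselves.
    """
    ratios = {
        group: values[item_a] / values[item_b]
        for group, values in reports.items()
        # Skip groups with a missing item or a zero denominator.
        if item_a in values and values.get(item_b)
    }
    if len(ratios) < 3:
        return []  # too few groups to say what is "normal"
    med = median(ratios.values())
    mad = median(abs(r - med) for r in ratios.values())
    if mad == 0:
        # At least half of the groups sit exactly on the median.
        return sorted(g for g, r in ratios.items() if r != med)
    return sorted(
        g for g, r in ratios.items()
        if 0.6745 * abs(r - med) / mad > threshold
    )


# Made-up figures: group_E reports own funds ten times too large.
reports = {
    "group_A": {"own_funds": 150.0, "scr": 100.0},
    "group_B": {"own_funds": 160.0, "scr": 100.0},
    "group_C": {"own_funds": 140.0, "scr": 100.0},
    "group_D": {"own_funds": 155.0, "scr": 100.0},
    "group_E": {"own_funds": 1500.0, "scr": 100.0},
}
suspicious = flag_implausible(reports, "own_funds", "scr")
```

With more than 4,000 dimensions per group and far fewer groups, a pairwise check like this would only be a starting point. Domain knowledge is still needed to decide which item pairs are worth comparing.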
What’s next? What have we learned?
Lessons?
• The hard core
  • Hackathon
  • Cross-pollination
  • Lelione
  • STAT ML
• Community
  • Python & R lunches
• A whiff of Data Science
  • Hands-on case study with Jupyter Notebook for DNB board and management
• Manifest
  • What is responsible data science?
• Datapreneur
  • 22 participants from across the whole bank starting with open source and tackling their own business problems
  • Python, R, Git, Agile, co-coding
• Training
  • Data Science 101, joint with DNB Academy