Taming Brain Complexity – Computational and Data Analytics Challenges
ORAP: AI for HPC and HPC for AI
6 November 2018, CNRS, rue Michel-Ange, Paris
Prof. Dr. Katrin Amunts
Director, Institute of Neuroscience and Medicine, Research Center Jülich
Director, C. & O. Vogt-Institute for Brain Research, Heinrich-Heine-University Düsseldorf
Scientific Research Director, The Human Brain Project
Decoding the Human Brain
Multiscale in space and time, multimodal
ETHICS & SOCIETY
NEUROSCIENCE: EXPERIMENT & THEORY
The Brain: multi-scale and multi-level
RESEARCH INFRASTRUCTURE: DATA ANALYTICS & SIMULATION
[Figure: chip block diagram. Labels: four processing elements, router, SerDes links, MCU, memory interface, shared memory]
[Figure: Mahalanobis distance plotted against profile index (0 to 200), separating area Fp1 from area Fp2]
Cytoarchitectonic subdivision of the frontal pole
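A Mahalanobis distance plotted against a profile index can be illustrated with a small sketch that compares the block of profiles before each position with the block after it. The synthetic feature vectors, the block size, and the function names `mahalanobis_between_blocks` and `sliding_mahalanobis` are assumptions made for illustration; this is not presented as the pipeline used in the cited work.

```python
import numpy as np

def mahalanobis_between_blocks(block_a, block_b):
    """Mahalanobis distance between the mean vectors of two blocks.

    Rows are profiles, columns are features. The pooled covariance
    weights both blocks equally (an assumption of this sketch).
    """
    diff = block_a.mean(axis=0) - block_b.mean(axis=0)
    pooled = 0.5 * (np.cov(block_a, rowvar=False) + np.cov(block_b, rowvar=False))
    # pinv tolerates a singular covariance matrix.
    squared = diff @ np.linalg.pinv(pooled) @ diff
    return float(np.sqrt(max(squared, 0.0)))

def sliding_mahalanobis(features, block_size):
    """Distance between adjacent blocks at every admissible profile index."""
    distances = {}
    for i in range(block_size, len(features) - block_size + 1):
        before = features[i - block_size:i]
        after = features[i:i + block_size]
        distances[i] = mahalanobis_between_blocks(before, after)
    return distances

# Synthetic data: 200 profiles with 3 features and a shift at index 100.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 3))
features[100:] += 2.0

distances = sliding_mahalanobis(features, block_size=20)
peak_index = max(distances, key=distances.get)
print("largest distance at profile index", peak_index)
```

In this toy setting the distance peaks near the position where the synthetic profiles change, which is the kind of peak the plot uses to mark a border between two areas.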
3D-Reconstruction of areas Fp1 and Fp2 in 10 brains
shown: 3 of 10 brains
Bludau et al., Neuroimage, 2014
Bringing areas into a common reference brain
Bludau et al., Neuroimage, 2014; shown: 3 of 10 brains
Cytoarchitectonic probabilistic maps
JuBrain Atlas
Amunts and Zilles, Neuron, 2015; MPM: Simon Eickhoff et al., 2005
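Once the areas from several brains are in a common reference space, a probabilistic map can be read as the fraction of brains in which a voxel belongs to an area, and a maximum probability map (MPM) as the most likely area per voxel. The following is a minimal sketch under those assumptions; the toy one-dimensional "reference brain", the function names, and the simple argmax rule are illustrative and leave out the refinements of the published MPM procedure.

```python
import numpy as np

def probability_map(masks):
    """Fraction of brains in which each voxel belongs to the area.

    masks: array of shape (n_brains, ...) with 0/1 entries, assumed to be
    registered to the common reference space already.
    """
    return np.asarray(masks, dtype=float).mean(axis=0)

def maximum_probability_map(prob_maps, threshold=0.0):
    """Label each voxel with its most probable area (0 means no area)."""
    prob_maps = np.asarray(prob_maps)
    labels = prob_maps.argmax(axis=0) + 1            # area labels start at 1
    labels[prob_maps.max(axis=0) <= threshold] = 0   # background voxels
    return labels

# Toy example: 6 voxels, 10 brains, two areas (1 = Fp1, 2 = Fp2).
fp1_masks = np.zeros((10, 6), dtype=int)
fp2_masks = np.zeros((10, 6), dtype=int)
fp1_masks[:, 0:2] = 1     # voxels 0-1: Fp1 in all brains
fp1_masks[:7, 2] = 1      # voxel 2: Fp1 in 7 of 10 brains
fp2_masks[7:, 2] = 1      # voxel 2: Fp2 in the remaining 3 brains
fp2_masks[:, 3:5] = 1     # voxels 3-4: Fp2 in all brains

p_fp1 = probability_map(fp1_masks)
p_fp2 = probability_map(fp2_masks)
mpm = maximum_probability_map([p_fp1, p_fp2])
print(p_fp1, p_fp2, mpm)
```

Voxel 2 shows the point of the probabilistic approach: the two areas overlap across brains, and the MPM assigns the voxel to the area found there more often.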
Millions of image patches
Deep Learning for brain mapping
How to map the cytoarchitecture at high throughput?
Whole brain
7400 sections (2-3 PByte / brain)
Expert annotation (many hours of work)
Automatic classification (few minutes)
Deep Learning on HPC, JURECA
Spitzer, Amunts, Dickscheid et al. (2018). Improving cytoarchitectonic segmentation of human brain areas with self-supervised Siamese Networks. MICCAI
Columnar activity in area hMT depending on the perception
Castelo-Branco, Goebel, Neuenschwander, Singer (2000). Nature, 405, 685-689.
Castelo-Branco, Formisano, Backes, Zanella, Neuenschwander, Singer & Goebel (2002). Activity patterns in human motion-sensitive areas depend on the interpretation of global motion. Proc Natl Acad Sci USA, 99, 13914-13919.
Malikovic A, Amunts K, Schleicher A, Mohlberg H, Eickhoff SB, Wilms M, Palomero-Gallagher N, Armstrong E, Zilles K (2007). Cytoarchitectonic analysis of the human extrastriate cortex in the region of V5/MT+: A probabilistic, stereotaxic map of area hOc5. Cerebral Cortex 17(3): 562-574
Anatomical map of the visual cortex
Artificial Intelligence (AI) and the Brain
business.financialpost.com/technology/federal-and-ontario-governments-invest-up-to-100-million-in-new-artificial-intelligence-vector-institute
www.handelsblatt.com/politik/deutschland/koalitionsverhandlungen-groko-packt-das-megathema-kuenstliche-intelligenz-an/20927750.html
www.sciencemag.org/news/2018/04/15-billion-artificial-intelligence-research-europe-pins-hopes-ethics
en.rfi.fr/france/20180329-france-invest-1.5-billion-euros-by-2022-boost-ai-research
www.forbes.com/sites/samshead/2018/04/26/britain-france-and-germany-fight-it-out-to-be-europes-ai-leader
How to enable neuro-inspired Deep Learning?
Artificial neuronal networks
Wiring diagram of the human nervous system
Urbansky et al. 2014
[Figure: visual pathways. Labels: retina, amygdala, pulvinar, LGN, V1, V2, V3, V4, V5, dorsal stream, ventral stream]
A Building Block
He, Zhang, Ren, Sun (2015): Deep Residual Learning for Image Recognition
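[Figure: residual building block. The input x passes through two weight layers with a ReLU in between, giving F(x); an identity shortcut adds x, and a final ReLU is applied to F(x) + x]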
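The building block from He et al. adds the input x back onto the output F(x) of two weight layers before the final ReLU. A minimal sketch of that forward pass, assuming plain dense weight matrices without biases or normalization (the original block uses convolutional layers; `residual_block`, `w1`, and `w2` are illustrative names):

```python
import numpy as np

def relu(z):
    """Element-wise rectified linear unit."""
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2):
    """Compute relu(F(x) + x) with F(x) = w2 @ relu(w1 @ x)."""
    fx = w2 @ relu(w1 @ x)   # two weight layers with a ReLU in between
    return relu(fx + x)      # identity shortcut, then the final ReLU

x = np.array([1.0, -2.0, 3.0])
zero = np.zeros((3, 3))
identity = np.eye(3)

print(residual_block(x, zero, zero))          # F(x) = 0: only the shortcut contributes
print(residual_block(x, identity, identity))  # F(x) = relu(x) is added to x
```

With all weights at zero the block still passes relu(x) through the shortcut, which is why such blocks only have to learn the residual F(x) rather than the full mapping.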
Learning from the brain
Co-funded by the European Union
Learning from the brain
Neuromorphic installation
Encapsulating the properties of pyramidal neurons
Artificial neuronal networks: Perceptron
Deep learning for creating cellular models in 3D
Neuronal architecture: Cerebral cortex
A whole human brain has about 86 billion nerve cells and is cut into about 7500 sections. An image of a section at a resolution of 1 µm has a size of 200,000 x 100,000 pixels and includes 30 virtual sections, which results in ~2.1 PB per brain.
Big brains – big data
Challenge: data handling
Creating the basis for a cellular brain model @ 1 micron
[Figure: data-handling infrastructure between INM1 and JSC. Labels: microscopes LE01 to LE08; microscope gateway (buffer storage, SSD RAID); NFS gateway; 1 Gbit and 10 Gbit links; ~70 MB/s; GPFS; processing/visualization storage; archive (tape), deep storage]
Key figures for scanning:
§ 2.5 sections (whole brain, 30 optical planes) per day and microscope
§ 20 sections per day
Data volume:
§ 30 GB/h per microscope
§ 4 microscopes: 120 GB/h and 2.9 TB/day
§ 8 microscopes: 240 GB/h and 5.8 TB/day
Expected volume per brain: ca. 2.1 PB
Scanning time: approx. 1 year
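The scanning figures can be cross-checked with simple arithmetic. The sketch below assumes continuous 24-hour operation and decimal units (1 TB = 1000 GB, 2.1 PB = 2100 TB), and takes the 7500 sections per brain from the "Big brains – big data" slide; the function names are illustrative.

```python
GB_PER_HOUR_PER_MICROSCOPE = 30          # from the slide
SECTIONS_PER_DAY_PER_MICROSCOPE = 2.5    # from the slide
SECTIONS_PER_BRAIN = 7500                # from the "Big brains - big data" slide
BRAIN_VOLUME_TB = 2100                   # ca. 2.1 PB, decimal units assumed

def daily_volume_tb(n_microscopes):
    """Data produced per day in TB, assuming 24 h of scanning per day."""
    return n_microscopes * GB_PER_HOUR_PER_MICROSCOPE * 24 / 1000

def scanning_days_by_volume(n_microscopes):
    """Days needed to accumulate the expected volume of one brain."""
    return BRAIN_VOLUME_TB / daily_volume_tb(n_microscopes)

def scanning_days_by_sections(n_microscopes):
    """Days needed to scan all sections of one brain."""
    return SECTIONS_PER_BRAIN / (n_microscopes * SECTIONS_PER_DAY_PER_MICROSCOPE)

for n in (4, 8):
    print(n, "microscopes:",
          daily_volume_tb(n), "TB/day,",
          round(scanning_days_by_volume(n)), "days by volume,",
          scanning_days_by_sections(n), "days by section count")
```

Under these assumptions, eight microscopes give roughly one year by either estimate, which matches the stated scanning time; four microscopes would take about twice as long.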
[Figure labels: brain stem, striatum, cortex, thalamus; ic, sc, thc]
Neurons and their connections
BigBrain Model
Resulting in 2 PetaByte per brain
Nerve cell connections in the brain
3D-Polarized Light Imaging
Nerve cell connections in the brain
Tissue section with 1.7 TByte
Calculated on JURECA @ JSC
Fiber architecture up to the level of axons
Analysis workflow is managed by UNICORE
Workflow components utilize GPUs and/or CPUs, i.e. they optimally use JURECA and JURON resources
HDF5 file format is used for parallel I/O
Granted compute time: 335,500 CPU hours, 85,000 GPU hours
Coronal human brain section:
§ 5,000 overlapping image tiles (in total 90,000)
§ 120,000 x 100,000 pixels per stitched image
§ 1.3 µm x 1.3 µm x 60 µm voxel size
Entire human brain with 3,000 sections: 7 PByte
Workflows for 3D-Polarized Light Imaging
[Figure: workflow stages: stitching, segmentation, orientation analysis, signal analysis, structure tensor analysis]
Simulation for 3D-Polarized Light Imaging
Using a Maxwell solver to understand light interaction
inclined fibres have lower transmittance than flat ones
modelling of fibres
transmitting polarized light through modelled fibres
comparing with experiments
§ Finite-Difference Time-Domain (FDTD) algorithm
§ propagation of electromagnetic waves through brain tissue
§ approximation of Maxwell's equations by finite differences
§ Massively parallel algorithm optimized for JUQUEEN
§ Granted compute time: 19,000,000 CPU hours
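The idea of approximating Maxwell's equations by finite differences can be shown in one dimension. This is a toy sketch only: it assumes vacuum, normalized units, perfectly conducting boundaries, and a Gaussian soft source, and the function `fdtd_1d` with its parameters is hypothetical. It is not the massively parallel 3D solver referred to above.

```python
import numpy as np

def fdtd_1d(n_cells=200, n_steps=300, courant=0.5, source_cell=50):
    """Propagate a Gaussian pulse on a 1D staggered (Yee) grid.

    ez lives on integer grid points, hy on the half points between them.
    The end points of ez are never updated, so they stay at zero
    (perfect electric conductor boundaries).
    """
    ez = np.zeros(n_cells)
    hy = np.zeros(n_cells - 1)
    for step in range(n_steps):
        # Faraday's law: update H from the spatial difference of E.
        hy += courant * (ez[1:] - ez[:-1])
        # Ampere's law: update interior E from the spatial difference of H.
        ez[1:-1] += courant * (hy[1:] - hy[:-1])
        # Soft source: add a Gaussian pulse in time at one cell.
        ez[source_cell] += np.exp(-((step - 30) / 10.0) ** 2)
    return ez

field = fdtd_1d()
print("peak |Ez| after the run:", float(np.abs(field).max()))
```

The two update lines are the leapfrog scheme at the heart of FDTD; a Courant number of at most 1 keeps the 1D scheme stable. Modelling tissue would additionally require spatially varying material parameters and three dimensions.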
High Performance Analytics & Computing
Pilot systems JULIA & JURON
JULIA
• Neural network simulations
• Simulator optimization for KNL-based systems
• Deep learning: pattern recognition
• Application benchmarking
JURON
• Neural network simulations
• Simulator optimization for GPU-based systems
• Deep learning: image segmentation for 3D-PLI
• Image analysis and modelling (TVB)
• Application benchmarking
• IBM-NVIDIA and Cray developed pilot systems in Pre-Commercial Procurement (RUP)
• Based on HBP use cases
• Focus on:
  • Dense memory integration
  • Scalable visualization
  • Dynamic resource management
• Hosted at JSC
• Improvement of software stack in the last months available to all HBP scientists
• https://hbp-hpc-platform.fz-juelich.de/
Thomas Lippert (Jülich), Thomas Schulthess (CSCS Lugano) and teams in the HPAC Platform
In silico Drug Design: Emerging Role of Artificial Intelligence
Proprietary Data
Public Data
…and more
Annotate, Curate & Normalize
• Aggregate and Synthesize Information
• Understand Mechanisms of Disease
• Generate Data and Models
• Repurpose Existing Drugs
• Generate Novel Drug Candidates
• Validate Drug Candidates
• Design Drugs
Molecular Simulations
QM/MM
MM
• Nodes in a fast network: InfiniBand, Intel Omni-Path
• ~100 million core-hours per system
• Nodes with fast chips
• ~10 million core-hours per system
Artificial Intelligence
Virtual Screening
• large disk space, large number of nodes with fast I/O
• < 0.1 core-hours for 10 million compound databases
Ligand Identification
= predictive models!
§ physico-chemical and structural properties
§ verified bio-interactions
§ Machine/deep learning
Novel therapeutics
From in silico ligand design to in vitro and in-cell assays
Copyright © 2018 Molsoft LLC.
§ Molecular Simulations
§ Cheminformatics
§ In silico Virtual Screening
§ Ligand Optimization
= ML
= *predictive models!
§ physico-chemical and structural properties
§ experimentally verified bio-interactions
§ machine learning
integration of many scientific disciplines
Translational medicine
In silico
*Lima, 2016, Expert Opinion on Drug Discovery
*Chen, 2018, Drug Discovery Today
[Figure: receptor types shown per neurotransmitter system]
Adenosine: A1
Dopamine: D1, D2, D4
Glutamate: AMPA, NMDA, kainate
GABA: GABAA, bz. binding site, GABAB
Acetylcholine: nicotinic, M1, M2, M3
Noradrenaline: α1, α2
Serotonin: 5-HT1A, 5-HT2
Molecular architecture of the hippocampus
Receptor changes in Alzheimer's disease
Control
Alzheimer's disease
High receptor density
Low receptor density
cholinergic muscarinic M1 receptor
Karl Zilles et al., (Juelich), Subproject Human Brain Organization
Mouse brain hippocampus, CA1 region: ~1,000 compartments/neuron; 1 s of simulation requires 5 h on JUQUEEN and produces approx. 4 TB
Migliore et al., Palermo, Simulation platform
Modelling and model validation using quantitative cytoarchitectonic data
Whole-brain network model of human brain activity
[Figure labels: DTI, parcellation, fMRI, cytoarchitecture; constraints and validation]
§ Atlas parcellation and cytoarchitecture as bridges to use experimental data in modeling
§ Usefulness of atlas data for modeling has been verified
Simulation of mouse hippocampus, CA1: ~1,000 compartments/neuron (1 s of simulation: 5 h on JUQUEEN, generates approx. 4 TB)
Michele Migliore and colleagues, Simulation platform, HBP
Timo Dickscheid, Katrin Amunts (FZ Jülich), Jan Bjaalie (Univ. Oslo), Subproject Neuroinformatics
50 million patients with epilepsy
30% of all patients develop resistance to drugs
Epilepsy surgery is then the only alternative and aims at removing epileptogenic tissue after invasive SEEG.
The success rate of neurosurgery has been constant for about 50 years
Jirsa et al. Brain 2015; Proix et al. Brain 2017; Pillai & Jirsa Neuron 2017; Proix, Jirsa et al Nat Comm 2018
Modelling brain activity: Epilepsy
14804 registered users, June 2018
http://www.thevirtualbrain.org
Individualized intervention
New in-silico methods for personalized medicine
Neuroimaging, personalized network models
Cellular model of the subthalamic nucleus
Deep Brain Stimulation (DBS)
Amunts, Lepage, Borgeat, Mohlberg, Dickscheid, Rousseau, Bludau, Bazin, Lewis, Oros-Peusquens, Shah, Lippert, Zilles, Evans (2013) BigBrain: An ultrahigh-resolution 3D human brain model. Science, 340(6139): 1472-1475
Dataset                      High-throughput scanning   Data size    Cell segmentation   Automatic brain mapping
1 section, 1 µm resolution   ~20 min                    ~10 GByte    ~25 min             ~3 min
Whole brain, 1x1x20 µm       ~2-3 weeks                 ~70 TByte    ~4 months           ~2 weeks
Whole brain, 1x1x1 µm        ~1 year                    ~2 PByte     ~5 years            ~10 months
(4 nodes)
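The step from per-section to whole-brain processing times at 1x1x20 µm follows from linear scaling. A small sketch, assuming strictly serial processing and the 7400 sections per brain quoted on the deep-learning slide (the function name is illustrative):

```python
SECTIONS_PER_BRAIN = 7400     # from the deep-learning slide
MINUTES_PER_DAY = 24 * 60

def whole_brain_days(minutes_per_section):
    """Serial processing time for one brain in days, assuming linear scaling."""
    return minutes_per_section * SECTIONS_PER_BRAIN / MINUTES_PER_DAY

segmentation_days = whole_brain_days(25)   # ~25 min per section
mapping_days = whole_brain_days(3)         # ~3 min per section
print(round(segmentation_days), "days for cell segmentation")
print(round(mapping_days), "days for automatic brain mapping")
```

These estimates land close to the "~4 months" and "~2 weeks" given for a whole brain at 1x1x20 µm, which suggests those entries are extrapolations from single-section timings.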
Human, whole-brain data sets at cellular resolution:
High Performance Computing is critical
Fenix: Consortium of Supercomputing Centers
• Barcelona Supercomputing Center
• CEA Computing Centre TGCC
• Italian supercomputing centre CINECA
• CSCS in Lugano
• JSC at Forschungszentrum Jülich
Goal: Provide services for federated data infrastructure tightly coupled to supercomputers for HBP and other scientific communities
Thomas Lippert, Dirk Pleiter, Jülich & Thomas Schulthess, Lugano/Zürich, subproject HPAC
Special requirements regarding storage, interactive computing, visualization, flexibility for heterogeneous user profiles
High computing power for simulation & analysis of “Big Data”
Neuroscience teams up with HPC
www.humanbrainproject.eu
#HumanBrainProj
/TheHumanBrainProject
Thanks
Brain Mapping, Juelich: Katrin Amunts, Markus Axer, Sebastian Bludau, Simon Eickhoff, David Gräßel, Olga Kedo, Hartmut Mohlberg, Miriam Menzel, Nicola Palomero-Gallagher, Karl Zilles, Timo Dickscheid, Yann Leprince, Sarah Haas, Hannah Spitzer, Marcel Huysegoms, Philipp Glock, Martin Schober
HBP Brain: Francesco Pavone, Rainer Goebel, Viktor Jirsa, Jean-Philippe Lachaux, Jeff Muller, Michele Migliore, Martin Telefont, Jan Bjaalie
JSC, Juelich: Thomas Lippert, Dirk Pleiter, Oliver Bücker, Kristel Michielsen, Giulia Rosetti, Paolo Carloni, Boris Orth, Anna Lührs
McGill University: Alan Evans, Claude Lepage, Reza Adalat, Konrad Wagstyl
Supported by the European Union's Horizon 2020 Framework Programme for Research and Innovation under grant agreements (Human Brain Project SGA1, SGA2).
HBP at a glance
§ 10 years, EUR 1 billion total funding (50% core project, 50% partnering projects)
§ EUR 88 million (SGA2 core project, 2018-2020)
§ Core project: 116 institutions, 19 countries
§ 12 Subprojects
§ embedded in various initiatives: BrainScaleS, Supercomputing and Modeling the Human Brain, SpiNNaker, PRACE, BBP et al.
§ Development of a Joint Platform
§ Driven by co-design projects and “use cases”