Page 1
High Performance Computing - Physico-chemical
applications to molecular and biomolecular systems
Max von Laue
1879-1960
Paul Langevin
1879-1946
Joseph Fourier
1768-1830
National Institute for R&D of Isotopic and Molecular Technologies, 65-103 Donath Str., P.O. Box 700, RO-400293 Cluj-Napoca 5, ROMANIA
Calin Gabriel Floare
Page 2
Outline
• What is parallel and high performance computing?
• Why use parallel computing?
• IBM BG/P system @ UVT
• GPU & FPGA High Performance Heterogeneous Computing
• INCDTIM Data Center containing a Grid site & a cluster
• The story of a serendipitous discovery
• Molecular Dynamics simulations on a very big system
• HPC-Europa 2 Program
INCDTIM Seminar, June 16, 2011, Cluj-Napoca, Romania
Page 3
What is parallel computing?
• Traditionally, software is written for serial computation:
To be run on a single computer having a single CPU
A problem is broken into a discrete series of instructions
Instructions are executed one after the other
Only one instruction may execute at any moment in time
• Parallel computing is the simultaneous use of multiple compute
resources to solve a computational problem:
To be run using multiple CPUs
A problem is broken into discrete parts that can be solved concurrently
Each part is further broken down to a series of instructions
Instructions from each part execute simultaneously on different CPUs
• The compute resources can include:
A single computer with multiple processors
An arbitrary number of computers connected by a network
A combination of both
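The decomposition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the workload (a sum of squares) and the helper names `split`, `partial_sum` and `parallel_sum_of_squares` are hypothetical, and threads are used purely to keep the sketch short. In CPython, CPU-bound work would normally use processes or MPI to get a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(bounds):
    """Serial series of instructions for one part: sum of squares over [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def split(n, parts):
    """Break the range [0, n) into `parts` contiguous chunks of near-equal size."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for p in range(parts):
        stop = start + step + (1 if p < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def parallel_sum_of_squares(n, workers=4):
    """Solve each part concurrently, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, split(n, workers)))


print(parallel_sum_of_squares(1000))
```

The same three steps (split, solve the parts concurrently, combine) apply whether the workers are cores of one machine or nodes of a cluster.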
Page 4
• Parallel computing is an evolution of serial computing that attempts to emulate what has always been the state
of affairs in the natural world: many complex, interrelated events happening at the same time, yet within a
sequence.
The Universe is parallel
The Real World is massively parallel
Page 5
Why use parallel computing?
• Historically, parallel computing has been considered to be the “high end of computing”, and has been used to model
difficult scientific and engineering problems found in the real world.
• Today, commercial applications provide an equal or greater driving force in the development of faster computers.
These applications require the processing of large amounts of data in sophisticated ways.
• Why use it?
Save time and/or money
Solve larger problems
Provide concurrency
Use of non-local resources (SETI@home, Folding@home)
Limits of serial computing (transmission speeds, limits to miniaturization, economic limitations)
https://computing.llnl.gov/tutorials/parallel_comp/
Page 6
Blue Brain and Human Brain Project
http://www.neuron.yale.edu/neuron/
NEURON is a simulation environment
for modeling individual neurons and
networks of neurons.
The Blue Brain Project is an attempt to create a synthetic brain by reverse-engineering the mammalian and human brain down to the
molecular level.
Founded in May 2005 by the Brain and Mind Institute of the École Polytechnique Fédérale de Lausanne (EPFL), Switzerland, the project aims to
study the brain's architectural and functional principles. It is headed by the Institute's director, Henry
Markram.
Using a Blue Gene supercomputer running Michael Hines's NEURON software,
the simulation does not consist simply of an artificial neural network, but involves
a biologically realistic model of neurons. It is hoped that it will eventually shed
light on the nature of consciousness.
http://bluebrain.epfl.ch
• IBM Blue Gene/P Massively Parallel Computer
• 4 racks, one row, wired as a 16x16x16 3D torus
• 4096 quad-core nodes, PowerPC 450, 850 MHz
• Energy efficient, water cooled
• 56 Tflops peak, 46 Tflops LINPACK
• 16 TB of memory (4 GB per compute node)
• 1 PB of disk space, GPFS parallel file system
• OS Linux SuSE SLES 10
If selected from amongst six other candidates by the Future and Emerging Technologies
(FET) Flagship Program launched by the European Commission, the Blue Brain Project will
upgrade to become the Human Brain Project and will receive funding up to 100 million
euros a year for 10 years.
Page 7
IBM BG/P system @ UVT
• IBM Blue Gene/P Massively Parallel Computer
• 1x rack, 1024 compute cards (32 compute cards / node card)
• 1x quad-core PowerPC 450 @ 850 MHz per compute card, double FPU
• 4 TB of memory (4 GB RAM / compute card)
• 4x power servers p520
• 2x DS3524 and EXP3000, 2×48 SAS HDDs in total
• GPFS parallel file system
• One Cisco Nexus 7010 Switch with 64x10GbE and 98x1GbE
• 1x Torus Network, 1x Collective network, 1x10GbE network (for I/O)
• OS Linux SuSE SLES 10
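The peak performance and memory figures quoted for the two Blue Gene/P installations (4 racks with 4096 nodes, and the single UVT rack with 1024 compute cards) can be checked with simple arithmetic. The sketch below assumes 4 floating-point operations per cycle per core (a double FPU with fused multiply-add); that number is not on the slides and is labeled as an assumption in the code.

```python
# Assumption (not stated on the slides): each PowerPC 450 core completes
# 4 floating-point operations per cycle (double FPU, fused multiply-add).
FLOPS_PER_CYCLE = 4
CORES_PER_NODE = 4          # quad-core PowerPC 450
CLOCK_HZ = 850e6            # 850 MHz
MEMORY_PER_NODE_GB = 4      # 4 GB per compute node


def peak_tflops(nodes):
    """Theoretical peak in Tflops for a given number of compute nodes."""
    return nodes * CORES_PER_NODE * CLOCK_HZ * FLOPS_PER_CYCLE / 1e12


def total_memory_tb(nodes):
    """Aggregate memory in TB (1 TB = 1024 GB)."""
    return nodes * MEMORY_PER_NODE_GB / 1024


for label, nodes in (("4 racks", 4096), ("1 rack", 1024)):
    print(f"{label}: {peak_tflops(nodes):.1f} Tflops peak, "
          f"{total_memory_tb(nodes):.0f} TB of memory")
```

Under this assumption the 4-rack value rounds to the 56 Tflops peak quoted for that system, and the memory totals match the 16 TB and 4 TB listed.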
Blue Gene/P system overview
• System-on-a-Chip (SoC)
• PowerPC 450 CPU
850 MHz Frequency
Quad Core
• 4 GB RAM
• Network Connections
IBM BG/P Compute Card
Page 8
GPUs (Graphics Processing Units)
Octoputer Microway - 8 Tesla cards
Tesla C2050/C2070
The Tesla C2050 / Tesla C2070 delivers up to 515 GFLOPS of peak
double-precision performance.
The Tesla C2050 comes standard with 3 GB of GDDR5 memory
at 144 GB/s bandwidth. The Tesla C2070 comes standard with 6
GB of GDDR5 memory.
In the future, 2010 may be known as the year of the GPU.
http://www.nvidia.com/cuda
Fermi Architecture
The soul of a supercomputer in the body of a GPU
NVIDIA Fermi GF100 Block Diagram
CUDA (Compute Unified Device Architecture) is the
computing engine in NVIDIA GPUs
Page 9
FPGAs (Field Programmable Gate Arrays)
Reconfigurable computing uses FPGAs as attached processing elements in a computing system in order to
dramatically increase the processing speed.
Annapolis Micro Systems, Inc. (Annapolis, Maryland), the leader in Commercial
Off the Shelf (COTS) Field Programmable Gate Array (FPGA) Based High
Performance Computing, announces the availability of its new WILDSTAR 6
PCIe Card, with up to three Xilinx Virtex 6 FPGAs.
Hightech Global Xilinx Virtex
6 PCIe Development Board
A Field Programmable Gate Array (FPGA) is an integrated circuit designed to be configured by the customer
or designer after manufacturing, hence "field-programmable".
Dr. Wim Vanderbauwhede from Glasgow University
created a 1000-core processor using FPGAs
http://www.dcs.gla.ac.uk/~wim/
http://www.gannetcode.org/
The Gannet platform aims to make it easier to
design complex reconfigurable Systems-on-Chip.
Dini Group DNV6F6PCIe
Xilinx Virtex LX550T
Annapolis’s Wildstar 6 PCIe
Page 10
INCDTIM Data Center
• Hewlett Packard Blade C7000 with 16 ProLiant BL280c G6 (2 Intel quad-core Xeon X5570 @
2.93 GHz, 16 GB RAM, 500 GB HDD) running TORQUE, MAUI, GANGLIA (http://hpc.itim-cj.ro),
NAGIOS, configured from scratch on Scientific Linux 5.3 (Boron)
• We installed several Intel compilers, mathematical and MPI libraries
• We use a range of molecular dynamics and quantum chemistry codes, including AMBER, GROMACS, NAMD,
LAMMPS, CPMD, CP2K, Gaussian, NWCHEM, GAMESS, ORCA, MOLPRO, DFTB+,
Siesta, VASP, Accelrys Materials Studio
• We also host the RO-14-ITIM Grid site (http://grid.itim-cj.ro)
Page 11
http://hpc.itim-cj.ro/ganglia
Page 12
The story of a serendipitous discovery [1]
α-cyclodextrin, αCD:
the association of 6 glucose units: (C6H10O5)6
4-methylpyridine, 4MP:
C6H7N
... and a bit of water
[1] M. Plazanet, C. Floare, M. R. Johnson, R. Schweins, H. P. Trommsdorff, Freezing on heating of liquid solutions, J. Chem. Phys. 121(11),
5031 (2004); ILL Annual Report 2004, 54-55; and the papers which followed.
Page 13
[Figure: phase diagram of αCD in 4MP. Axes: temperature (°C, 40 to 80) and concentration αCD [g] / 4MP [l] (100 to 300). Regions: liquid phase, solid phase.]
200g/l ~ 1 αCD for 50 4MP
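The conversion from 200 g/l to roughly one αCD per 50 4MP molecules can be reproduced from the molecular formulas. The sketch below uses standard atomic masses and assumes a density of about 0.957 g/ml for liquid 4MP near room temperature; that density is not given on the slides and is an assumption, as are the helper names.

```python
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}  # g/mol

ALPHA_CD = {"C": 36, "H": 60, "O": 30}  # (C6H10O5)6
FOUR_MP = {"C": 6, "H": 7, "N": 1}      # C6H7N

# Assumption: density of liquid 4MP, about 0.957 g/ml, i.e. 957 g per litre.
FOUR_MP_DENSITY_G_PER_L = 957.0


def molar_mass(formula):
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def four_mp_per_alpha_cd(concentration_g_per_l):
    """Number of 4MP molecules per αCD molecule at a given αCD concentration."""
    mol_cd = concentration_g_per_l / molar_mass(ALPHA_CD)
    mol_4mp = FOUR_MP_DENSITY_G_PER_L / molar_mass(FOUR_MP)
    return mol_4mp / mol_cd


print(f"αCD: {molar_mass(ALPHA_CD):.1f} g/mol, 4MP: {molar_mass(FOUR_MP):.1f} g/mol")
print(f"4MP per αCD at 200 g/l: {four_mp_per_alpha_cd(200.0):.1f}")
```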
Page 14
http://www.ill.eu/about/movies/experiments/in16-a-liquid-paradox/
A movie by A. Filhol, Laue-Langevin Institute
Azobenzene: melts at 66 °C
CD-4MP: freezes at 66 °C
Page 15
[Figure: solubility of αCD in 4MP. Axes: concentration (mg/ml, 0 to 300) and temperature (°C, 40 to 100).]
Page 16
Characterize the changes of the structure and of the
molecular dynamics by:
• elastic and inelastic neutron scattering
• neutron and X-ray diffraction,
• low-field NMR and
• molecular dynamics simulations
How can we rationalize these surprising observations?
As temperature increases, entropy must increase. How
is this compatible with the observation that crystalline
order is established and that molecular motions are
slowed down?
Page 17
NEUTRON SCATTERING AT
THE INSTITUTE
LAUE-LANGEVIN (ILL)
X-ray SCATTERING AT
ESRF
Page 18
Page 19
a) Hysteresis-like fixed window (elastic) scan, IN10, ILL; b) Quasi-elastic neutron spectra, IN5, ILL
Page 20
F. Ding and N. Dokholyan, Trends in Biotechnology 23(9) 450 (2005)
Page 21
Model system studied:
A periodic box with dimensions 24 Å × 24 Å × 24 Å, containing:
one α-CD molecule
50 molecules of 4MP
826 atoms
2004: NPT molecular dynamics simulations using Accelrys CERIUS2 v4.6 with the
COMPASS force field, running on different SGI workstations
Page 22
20 α-CD molecules
1120 molecules of 4MP
240 water molecules
NPT ensemble MD using AMBER9
(60 Å)³ box
18920 atoms
This system will be studied at
CINECA, Italy, in a project funded by the
HPC-Europa2 program, on 256 CPUs
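The atom counts quoted for the two simulated systems (826 atoms for the single-αCD model and 18920 atoms for the large box) follow directly from the molecular formulas. A minimal check, with hypothetical dictionary and function names:

```python
# Atoms per molecule, from the formulas C36H60O30 (αCD), C6H7N (4MP) and H2O.
ATOMS_PER_MOLECULE = {
    "aCD": 36 + 60 + 30,
    "4MP": 6 + 7 + 1,
    "H2O": 2 + 1,
}


def total_atoms(composition):
    """Total atom count for a mapping of molecule name -> number of molecules."""
    return sum(ATOMS_PER_MOLECULE[name] * count for name, count in composition.items())


small_box = {"aCD": 1, "4MP": 50}
large_box = {"aCD": 20, "4MP": 1120, "H2O": 240}

print(total_atoms(small_box), total_atoms(large_box))
```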
• Initially we have to optimize the force fields using the force-matching method
• 100 ns long trajectories at different temperatures must be calculated for good statistics
• Hydrogen-bond dynamics and cluster formation analysis
• Correlation coefficients
An AMBER benchmark on an IBM SP5 cluster (IBM p575
Power 5, bassi.nersc.gov, 118 8-CPU nodes, 1.9 GHz
Power 5+ CPU, 2 MB L2 cache, 36 MB L3 cache, 32 GB
memory per node) produced 22 ns/day when using 256
cores, on a system containing around 23500 atoms.
Speed of 0.22 ns/day (1 core), 0.39 ns/day
(2 cores) and 0.69 ns/day (4 cores)
Infiniband is needed for a further scale up
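The 1, 2 and 4 core throughputs quoted above can be turned into speed-up and parallel efficiency figures, which makes the scaling loss explicit. The function name and table layout below are illustrative choices, not part of the original benchmark.

```python
def scaling_table(ns_per_day):
    """Map of core count -> ns/day; returns rows of (cores, speedup, efficiency)."""
    base_cores = min(ns_per_day)
    base_rate = ns_per_day[base_cores]
    rows = []
    for cores in sorted(ns_per_day):
        speedup = ns_per_day[cores] / base_rate
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows


measured = {1: 0.22, 2: 0.39, 4: 0.69}  # ns/day, values from the slide

for cores, speedup, efficiency in scaling_table(measured):
    print(f"{cores} core(s): speed-up {speedup:.2f}, efficiency {efficiency:.0%}")
```

With these numbers the efficiency falls from about 89% on 2 cores to about 78% on 4 cores.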
Page 23
GPU Codes
Page 24
Simulation of 1-million-atom systems is now
possible on a desktop workstation
Amber 11 GPU performance compared with that
on Kraken@ORNL: dihydrofolate reductase
(DHFR) solvated in water, 23558 atoms.
Page 25
• Milu (Miramare Interoperable Lite User Interface), a tool to easily set up a
UI on (almost) any machine
(https://eforge.escience-lab.org/gf/project/milu/)
• BEMuSE: Bias-Exchange Metadynamics Submission Environment
(https://euindia.ictp.it/bemuse/)
• EPICO – eLab Procedure for Installation and Configuration
(http://epico.escience-lab.org/)
• Training Tools: GRID Seed (http://gridseed.escience-lab.org)
Moodle Platform (http://www.moodle.org)
• Amazon Elastic Compute Cloud (EC2) - from $0.02 per hour
http://aws.amazon.com/ec2/pricing/
http://aws.amazon.com/ec2/instance-types/
Page 26
To know more about it:
• Freezing on heating of liquid solutions, M. Plazanet, C. Floare,
M.R. Johnson, R. Schweins, H.P. Trommsdorff, J. Chem. Phys.
121 (2004) 5031
• J. Chem. Phys. 125 (2005) 154504
• Chem. Phys. 317 (2006) 153
• Chem. Phys. 331 (2006) 35
• J. Phys. Cond. Mat. 19 (2007) 205108
• Phys. Chem. Chem. Phys. 12 (2010) 7026
Page 27
Page 28
Page 29
• PhysicsWeb, 24/09/2004
• Science News, 16/10/2004
• Physics World, 11/2004
• ILL bulletin, 11/2004
• Science et avenir, 12/2004
• Science et vie, 01/2005
• GEO magazine, German edition, 01/2005
• http://www.scienceinschool.org/repository/docs/defying.pdf
• …
Page 31
Thank you for your attention
Page 33
Molecular Dynamics Method
Deterministic: provides us with a trajectory of the system
Use physics to find the potential energy between all pairs of atoms
Move atoms to the next state
Repeat
“Molecular dynamics (MD) provides the methodology for detailed microscopic modeling on
the molecular scale. The theoretical underpinnings amount to little more than Newton’s laws
of motion. After all, the nature of matter is to be found in the structure and motion of its
constituent building blocks, and the dynamics is contained in the solution of the N-body
problem”*
* D. C. Rapaport, The Art of Molecular Dynamics Simulation, Cambridge University Press (2004)
The classical N-body problem lacks a general analytical solution; the only path open is the numerical
one.
• From atom positions, velocities, and accelerations, calculate atom positions and velocities at the next time step.
• Integrating over many successive small time steps yields the trajectory of the system for any desired time range.
• There are efficient methods for integrating these elementary steps, with the Verlet and leapfrog algorithms being
the most commonly used.
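As an illustration of the time-stepping loop described on this slide, here is a minimal velocity Verlet integrator for a single particle in one dimension. It is a sketch, not the integrator of any particular MD package: the harmonic spring force, the unit mass and the time step are hypothetical test values chosen because the exact solution (a cosine) is known.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation for one 1D particle; returns a list of (x, v)."""
    a = force(x) / mass
    trajectory = [(x, v)]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # positions at the next time step
        a_new = force(x) / mass              # forces from the new positions
        v = v + 0.5 * (a + a_new) * dt       # velocities at the next time step
        a = a_new
        trajectory.append((x, v))
    return trajectory


# Hypothetical test system: harmonic oscillator with k = 1 and m = 1.
K_SPRING, MASS = 1.0, 1.0


def spring_force(x):
    return -K_SPRING * x


trajectory = velocity_verlet(x=1.0, v=0.0, force=spring_force,
                             mass=MASS, dt=0.01, n_steps=1000)
x_end, v_end = trajectory[-1]
energy = 0.5 * MASS * v_end ** 2 + 0.5 * K_SPRING * x_end ** 2
print(f"x(t=10) = {x_end:.4f}, exact = {math.cos(10.0):.4f}, energy = {energy:.5f}")
```

In a real MD code the scalar `x` becomes the array of all atomic coordinates and `force` is the negative gradient of the force-field energy, but the three update lines keep the same structure.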
Page 34
Energy function
AMBER Force Field
Covalent terms: bond, angle, dihedral
Non-covalent terms: van der Waals, electrostatic, implicit solvation, polarization
• Potential energy function from which MD derives the forces acting on the atoms
• Describes the interaction energies of all atoms and molecules in the system
• Always an approximation: it comes closer to real physics (accuracy increases) with more computation time,
smaller time steps and more interactions included
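The individual terms listed on this slide can be written as small functions. The sketch below uses the commonly quoted AMBER-style functional forms (harmonic bonds and angles, a cosine dihedral term, a 12-6 Lennard-Jones term and a Coulomb term); implicit solvation and polarization are left out. All parameter values in the usage lines are hypothetical, and the Coulomb constant assumes energies in kcal/mol, distances in Å and charges in units of e.

```python
import math

# Assumption: Coulomb constant in kcal Å / (mol e^2), vacuum permittivity.
COULOMB_CONSTANT = 332.0637


def bond_energy(k_b, r, r0):
    """Harmonic bond stretch: k_b * (r - r0)^2."""
    return k_b * (r - r0) ** 2


def angle_energy(k_theta, theta, theta0):
    """Harmonic angle bend: k_theta * (theta - theta0)^2, angles in radians."""
    return k_theta * (theta - theta0) ** 2


def dihedral_energy(v_n, n, phi, gamma):
    """Torsion term: (V_n / 2) * (1 + cos(n * phi - gamma))."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))


def van_der_waals_energy(epsilon, r_min, r):
    """12-6 Lennard-Jones in the epsilon / r_min form; minimum is -epsilon at r_min."""
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 ** 2 - 2.0 * ratio6)


def electrostatic_energy(q_i, q_j, r, dielectric=1.0):
    """Coulomb interaction between two point charges."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r)


# Hypothetical parameters, only to show the calls.
print(bond_energy(300.0, 1.1, 1.0))
print(van_der_waals_energy(0.1, 3.5, 3.5))
print(electrostatic_energy(1.0, -1.0, 1.0))
```

The total energy is the sum of the covalent terms over all bonds, angles and dihedrals plus the non-covalent terms over all non-bonded atom pairs.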