Running GROMACS on GPUs: a Benchmark Study Dr. LI Jianguo Bioinformatics Institute, A*STAR & Singapore Eye Research Institute, SingHealth
Popular Molecular Dynamics Packages for Biomolecules
• AMBER
• CHARMM
• GROMACS
• NAMD
• LAMMPS
Why molecular dynamics (MD) simulations?
• Decipher the mechanisms of molecular systems
• Predict macroscopic properties from microscopic interactions
• One of the frontiers of biological research (Nobel Prize in Chemistry, 2013)
  (http://www.nobelprize.org/nobel_prizes/chemistry)
Why GROMACS?
• Flexible: many modules are available, plus more than 100 analysis tools
• Fast
• Free
GPU versions of GROMACS
Older versions of GROMACS: no GPU support, CPU only.
GROMACS 4.5:
• GPU support through the OpenMM library
• The CPU is used only for input/output
• Cannot use multiple GPUs
• Only limited features were implemented
GROMACS 4.6.x: native support for GPU acceleration.
• Requires the Verlet cut-off scheme
• Requires NVIDIA hardware with compute capability >= 2.0
• Supports most GROMACS features, such as the pull code, virtual sites, PME, and reaction-field
• The simulation work is divided between the CPU and the GPU:
  • GPU: non-bonded force calculations
  • CPU: everything else, such as bonded forces and PME
How GROMACS 4.6 works on GPU
*http://www.gromacs.org/GPU_acceleration
CPU-GPU load imbalance:
To achieve the best performance, an optimal CPU/GPU ratio is needed.
The ideal CPU/GPU load balance
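The load-balance argument can be illustrated with a toy model. This is a minimal sketch, not how GROMACS measures anything: it assumes the CPU and GPU parts of a step run fully concurrently, so the step takes as long as the slower side. The function names and the timings are hypothetical.

```python
def step_time(t_cpu_ms, t_gpu_ms):
    """Wall time of one MD step in a toy model where CPU and GPU work overlap fully."""
    return max(t_cpu_ms, t_gpu_ms)


def idle_fraction(t_cpu_ms, t_gpu_ms):
    """Fraction of the step during which the faster side waits for the slower one."""
    total = step_time(t_cpu_ms, t_gpu_ms)
    return (total - min(t_cpu_ms, t_gpu_ms)) / total


# Hypothetical per-step timings in milliseconds (illustrative only, not measured).
balanced = idle_fraction(2.0, 2.0)   # both sides finish together
cpu_bound = idle_fraction(4.0, 2.0)  # too few CPU cores: the GPU waits half the step

print(f"balanced: {balanced:.0%} idle, CPU-bound: {cpu_bound:.0%} idle")
```

In this model, adding CPU cores helps only until the CPU time drops to the GPU time; beyond that point the GPU becomes the limiting side.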
Parameters for running GPU simulation
In mdp file:
cutoff-scheme = Verlet
nstlist = 10 ; likely 10-50
coulombtype = pme ; or reaction-field
vdw-type = cut-off
rcoulomb = 1.0
fourierspacing = 0.12
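A quick sanity check before submitting a GPU run is to confirm that the mdp file selects the Verlet scheme. The sketch below is a hypothetical helper, not part of GROMACS: `parse_mdp` and `uses_verlet` are illustrative names, and the parser handles only simple `key = value` lines with `;` comments.

```python
def parse_mdp(text):
    """Parse simple 'key = value' mdp lines into a dict, dropping ';' comments."""
    params = {}
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()  # strip trailing comment
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Treat '-' and '_' in option names as interchangeable.
        params[key.strip().lower().replace("_", "-")] = value.strip()
    return params


def uses_verlet(params):
    """True if the settings select the Verlet cut-off scheme needed for native GPU runs."""
    return params.get("cutoff-scheme", "").lower() == "verlet"


# The settings listed above, as they would appear in an mdp file.
MDP = """
cutoff-scheme = Verlet
nstlist = 10 ; likely 10-50
coulombtype = pme ; or reaction-field
vdw-type = cut-off
rcoulomb = 1.0
fourierspacing = 0.12
"""

params = parse_mdp(MDP)
print(uses_verlet(params), params["nstlist"])
```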
mdrun options:
-ntmpi: number of thread-MPI threads
-ntomp: number of OpenMP threads per MPI thread
-gpu_id: list of GPU IDs to use
For example, if you have 4 GPUs and 16 CPU cores, you can use:
mdrun -ntmpi 2 -ntomp 6 -gpu_id 01 (use 2 GPUs and 12 CPU cores)
mdrun -ntmpi 4 -ntomp 4 -gpu_id 0123 (use 4 GPUs and 16 CPU cores)
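The two example commands follow one pattern: one MPI thread per GPU, a fixed number of OpenMP threads per MPI thread, and GPU IDs counted from 0. A minimal sketch that assembles such a command line is given below. `mdrun_command` is a hypothetical helper, and the one-thread-per-GPU mapping is an assumption taken from the examples, not a general rule.

```python
def mdrun_command(n_gpus, omp_threads_per_rank):
    """Build an mdrun argument list with one thread-MPI thread per GPU (assumed mapping)."""
    if not 1 <= n_gpus <= 10:
        # The -gpu_id string uses one digit per GPU, so this sketch stops at IDs 0-9.
        raise ValueError("n_gpus must be between 1 and 10")
    if omp_threads_per_rank < 1:
        raise ValueError("omp_threads_per_rank must be at least 1")
    gpu_ids = "".join(str(i) for i in range(n_gpus))
    return [
        "mdrun",
        "-ntmpi", str(n_gpus),
        "-ntomp", str(omp_threads_per_rank),
        "-gpu_id", gpu_ids,
    ]


# Reproduce the two examples for a node with 4 GPUs and 16 CPU cores.
two_gpus = " ".join(mdrun_command(2, 6))   # 2 GPUs, 12 CPU cores
four_gpus = " ".join(mdrun_command(4, 4))  # 4 GPUs, 16 CPU cores
print(two_gpus)
print(four_gpus)
```

The total number of CPU cores used is the product of the two arguments, so it should not exceed the cores available on the node.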
Benchmark 1: peptide in water
Simulation of a 23-residue peptide in water: 14165 atoms, rhombic dodecahedron box, constraints on bonds involving hydrogen (h-bonds). Both PME and reaction-field electrostatics were tested.
CPU: Intel i7 (8 cores)
GPU: GTX590 (dual-GPU card)
GPUs significantly accelerate the MD simulation. There is an optimal CPU/GPU ratio: for the GTX590, the CPU/GPU ratio should be 4 or above.
Benchmark 2: membrane system
• Bacterial membrane: 512 lipids, 23388 water molecules + 128 ions, about 100k atoms in total
• GROMOS 53A6 force field, PME, NPT
• GROMACS version 4.6.3, K20 GPU, Intel E5 CPU
[Figure: bar chart of performance in ns/day (axis 0-30) for the configurations 4cpu, 8cpu, 16cpu, 4cpu+1gpu, 6cpu+1gpu, 8cpu+1gpu, 12cpu+1gpu, 16cpu+1gpu, 8cpu+2gpu, 12cpu+2gpu, 16cpu+2gpu]
A higher-end GPU card requires more CPU cores to match its computational capacity. For the K20 card, the ideal CPU/GPU ratio is 8-12.
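The CPU/GPU ratio of each benchmarked configuration is simple arithmetic on its label. The sketch below computes the ratio for the K20 configurations and checks it against the 8-12 range quoted for that card. `cores_per_gpu` is a hypothetical helper, and no performance numbers are implied.

```python
import re


def cores_per_gpu(label):
    """Return CPU cores per GPU for a label like '12cpu+2gpu', or None for CPU-only labels."""
    match = re.fullmatch(r"(\d+)cpu\+(\d+)gpu", label)
    if match is None:
        return None
    return int(match.group(1)) / int(match.group(2))


# GPU configurations from the membrane benchmark (K20 card).
K20_CONFIGS = [
    "8cpu+1gpu", "12cpu+1gpu", "16cpu+1gpu",
    "8cpu+2gpu", "12cpu+2gpu", "16cpu+2gpu",
]

# Keep the configurations whose ratio falls in the 8-12 range quoted for the K20.
in_range = [c for c in K20_CONFIGS if 8 <= cores_per_gpu(c) <= 12]
print(in_range)
```

With two GPUs, only the 16-core run reaches a ratio of 8; the 8-core and 12-core runs provide 4 and 6 cores per GPU respectively.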
Benchmark 3: Implicit Solvent simulation of P53 tetramer
• P53 protein tetramer: 24185 atoms
• AMBER99SB force field
• GROMACS version 4.6.1, GTX590 GPU, Intel i7 CPU
[Figure: bar chart of performance in ns/day (axis 0-20) for the configurations 8cpu, 1cpu+1gpu, 2cpu+1gpu, 4cpu+1gpu, 8cpu+1gpu, 2cpu+2gpu, 4cpu+2gpu, 8cpu+2gpu]
Implicit solvent simulations require a smaller CPU/GPU ratio because there is no PME work for the CPU.
Summary and Take-home Message
GPUs can significantly accelerate MD simulations with GROMACS.
A speed-up of more than 10 times can be obtained for simulations using an implicit solvent model.
For the best performance, an optimal CPU/GPU ratio is needed. Different GPU cards need different numbers of CPU cores.
Acknowledgement:
• NOVATEE, NVIDIA & ACRC, A*STAR
• Prof Roger and Chandra
• BII IT group: Tai Pang
Thank you!