A plane wave pseudopotential density functional theory molecular dynamics code on multi-GPU machine
GPU Technology Conference, San Jose, May 17th, 2012
Weile Jia (1), Long Wang (1), Zongyan Cao (1), Jiyun Fu (1), Xuebin Chi (1), Weiguo Gao (2), Lin-Wang Wang (3)
(1) Supercomputing Center, CNIC, Chinese Academy of Science
(2) Fudan University
(3) Material Science Division, Lawrence Berkeley National Laboratory
No. of computing units       32     64     128    256
Titan CPU (1 core/node)      493s   274s   162s   106s
Titan CPU (16 core/node)     --     543s   323s   215s
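The timings above can be turned into strong-scaling figures. The sketch below is an illustration only: it takes the Titan CPU (1 core/node) row as given, treats the 32-unit run as the baseline, and computes speedup and parallel efficiency from it. The function name `strong_scaling` and the choice of baseline are assumptions, not part of the original slides, and the source does not state exactly what the timings measure.

```python
# Timings in seconds, copied from the Titan CPU (1 core/node) row.
units = [32, 64, 128, 256]
times_s = [493.0, 274.0, 162.0, 106.0]


def strong_scaling(units, times_s):
    """Return (speedups, efficiencies) relative to the smallest run.

    speedup    = T(base) / T(n)
    efficiency = speedup / (n / base_units)
    """
    base_units, base_time = units[0], times_s[0]
    speedups = [base_time / t for t in times_s]
    efficiencies = [s / (n / base_units) for s, n in zip(speedups, units)]
    return speedups, efficiencies


speedups, efficiencies = strong_scaling(units, times_s)
for n, t, s, e in zip(units, times_s, speedups, efficiencies):
    print(f"{n:4d} units: {t:6.1f} s  speedup {s:4.2f}  efficiency {e:6.1%}")
```

The same function can be applied to the 16 core/node row by passing the three available points (64, 128, 256 units), in which case the 64-unit run becomes the baseline.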
Testing results on Mole-8.5:
• Generally, more GPUs give faster runs.
• Economically, 3 GPUs per node is optimal (the product price × computation time is lowest).
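The price × computation time criterion can be written as a small selection routine. This is a minimal sketch under stated assumptions: the cost model (a fixed node price plus a per-GPU price) and every number in it are hypothetical placeholders in arbitrary units, not measurements or prices from Mole-8.5. Only the criterion itself, minimizing price times computation time, comes from the slide.

```python
def cost_time_product(node_price, gpu_price, gpus_per_node, time_s):
    """Price of one node configuration multiplied by its computation time."""
    return (node_price + gpu_price * gpus_per_node) * time_s


# HYPOTHETICAL placeholder values (arbitrary units), for illustration only.
node_price = 10.0
gpu_price = 2.0
# gpus_per_node -> computation time in seconds (made-up numbers)
times_by_gpu_count = {1: 120.0, 2: 70.0, 3: 50.0, 4: 45.0}

products = {
    g: cost_time_product(node_price, gpu_price, g, t)
    for g, t in times_by_gpu_count.items()
}
# The most economical configuration minimizes price x time.
best = min(products, key=products.get)
print(f"lowest price x time at {best} GPU(s) per node")
```

With real prices and measured times substituted for the placeholders, the same comparison shows where adding a GPU stops paying for itself: the time saved no longer offsets the added hardware cost.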
Different physical kernels
The computation-intensive kernel times and their contributions to the total time for different numbers of processors on Titan and Mole-8.5.
[Figure: two panels, x-axis "CPU/GPU No." from 50 to 250]
Different computational tasks
The times of different operations as functions of the total number of CPU/GPU units used, and their contributions to the total computational time.
[Figure: two panels, x-axis "CPU/GPU No." from 50 to 250]
1800 MD steps of GaInP
Atomic correlation functions
Conclusions
• PEtot_GPU achieved 12 s per MD step: 18x faster than PEtot_CPU and 7x faster than the fastest reported PWP-DFT MD on CPU.
• CG_AllBand on GPU has a 30x speedup compared with CPU.
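The headline numbers imply a few other timings by simple arithmetic. The sketch below assumes that "Nx faster" means a ratio of wall-clock times per MD step, and that the 1800-step GaInP run proceeded at the quoted 12 s per step. Both are assumptions, and the derived values are implied figures, not results reported on the slides.

```python
gpu_step_s = 12.0            # PEtot_GPU time per MD step (from the slides)
speedup_vs_petot_cpu = 18.0  # PEtot_GPU vs PEtot_CPU (from the slides)
speedup_vs_best_cpu = 7.0    # vs fastest reported CPU PWP-DFT MD (from the slides)
md_steps = 1800              # length of the GaInP run (from the slides)

# Implied CPU times per step, assuming speedup = T_cpu / T_gpu.
implied_petot_cpu_step_s = gpu_step_s * speedup_vs_petot_cpu
implied_best_cpu_step_s = gpu_step_s * speedup_vs_best_cpu

# Implied wall-clock time for the whole run, assuming 12 s for every step.
gpu_run_hours = md_steps * gpu_step_s / 3600.0
cpu_run_hours = md_steps * implied_petot_cpu_step_s / 3600.0

print(f"implied PEtot_CPU step time: {implied_petot_cpu_step_s:.0f} s")
print(f"implied best-CPU step time:  {implied_best_cpu_step_s:.0f} s")
print(f"implied GPU run time:        {gpu_run_hours:.1f} h")
print(f"implied PEtot_CPU run time:  {cpu_run_hours:.1f} h")
```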