National Institute of Advanced Industrial Science and Technology
Running flexible, robust and scalable grid application: Hybrid QM/MD Simulation
Hiroshi Takemiya, Yusuke Tanimura and Yoshio Tanaka
Grid Technology Research Center, National Institute of Advanced Industrial Science and Technology, Japan
National Institute of Advanced Industrial Science and Technology
Running flexible, robust and scalable grid application: Hybrid QM/MD Simulation
Grid Technology Research Center, National Institute of Advanced Industrial Science and Technology, Japan
Goals of the experiment
To clarify the functions needed to execute large-scale grid applications
Such applications require many computing resources for a long time: 1,000 to 10,000 CPUs for 1 month to 1 year
3 requirements:
Scalability: managing a large number of resources effectively
Robustness: fault detection and fault recovery
Flexibility: dynamic resource switching, since we can't assume all resources are always available during the experiment
Difficulty in satisfying these requirements
Existing grid programming models can hardly satisfy these requirements
GridRPC
Dynamic configuration: does not need co-allocation, so it is easy to switch computing resources dynamically
Good fault tolerance (detection): on a remote executable fault, the client can retry or use another remote executable
Hard to manage a large number of servers: the client becomes a bottleneck
Grid-enabled MPI
Flexible communication: possible to avoid communication bottlenecks
Static configuration: needs co-allocation and cannot change the number of processes during execution
Poor fault tolerance: one process fault makes all processes fail, and fault-tolerant MPI is still in the research phase
Gridifying applications using GridRPC and MPI
Combining GridRPC and MPI
GridRPC: allocating server (MPI) programs dynamically, supporting loose communication between a client and servers, and managing only tens to hundreds of server programs
MPI: supporting scalable execution of a parallelized server program
Suitable for gridifying applications consisting of loosely-coupled parallel programs
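The combined pattern above can be sketched as follows. This is a minimal illustration, not the Ninf-G API: each remote, internally MPI-parallel server is treated as one coarse-grained function call, and a failed call is simply retried on another server, which is the GridRPC-side fault handling described above. All names (cluster names, `qm_force`) are invented for the example; threads stand in for remote clusters.

```python
from concurrent.futures import ThreadPoolExecutor

SERVERS = ["cluster_a", "cluster_b", "cluster_c"]  # hypothetical names

def qm_force(server, region):
    """Stand-in for an RPC to a parallelized (MPI) QM server program."""
    if server == "cluster_a":          # simulate one faulty cluster
        raise ConnectionError(f"{server} unreachable")
    return sum(region) * 0.5           # dummy "force" result

def call_with_failover(region):
    # GridRPC-style fault handling: on a server fault the client
    # retries the same call on the next available server.
    for server in SERVERS:
        try:
            return qm_force(server, region)
        except ConnectionError:
            continue
    raise RuntimeError("all servers failed")

# The client manages only a handful of coarse calls, so it does not
# become a bottleneck even though each server may use many CPUs.
with ThreadPoolExecutor(max_workers=2) as pool:
    regions = [[1.0, 2.0], [3.0, 4.0]]
    forces = list(pool.map(call_with_failover, regions))

print(forces)  # [1.5, 3.5]
```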
Scalability: large-scale experiment at SC2004
Gridifying the QM/MD simulation program based on our approach; executing a simulation using ~1800 CPUs of 3 clusters; our approach can manage a large number of computing resources
Robustness: long-run experiment on the PRAGMA testbed
Executing the TDDFT program for over a month; Ninf-G can detect server faults and return errors correctly
Conducting an experiment to show the validity of our approach
Long-run QM/MD simulation on the PRAGMA testbed, implementing a scheduling mechanism as well as a fault-tolerance mechanism
Using 1793 CPUs in total on 3 clusters; succeeded in running the QM/MD program for over 11 hours; our approach can manage a large number of resources
Related Work
Scalability: large-scale experiment at SC2004
Gridifying the QM/MD simulation program based on our approach; executing a simulation using ~1800 CPUs of 3 clusters; our approach can manage a large number of computing resources
Robustness: long-run experiment on the PRAGMA testbed
Executing the TDDFT program for over a month; Ninf-G can detect server faults and return errors correctly
Conducting an experiment to show the validity of our approach
Long-run QM/MD simulation on the PRAGMA testbed, implementing a scheduling mechanism as well as a fault-tolerance mechanism
Long run Experiment on the PRAGMA testbed
Purpose: evaluate the quality of Ninf-G2 and gain experience on how GridRPC can adapt to faults
Ninf-G stability
Number of executions: 43
Execution time (total): 50.4 days, (max): 6.8 days, (ave): 1.2 days
Number of RPCs: more than 2,500,000
Number of RPC failures: more than 1,600 (an error rate of about 0.064%)
Ninf-G detected these failures and returned errors to the application
[Figure: number of alive servers (0 to 30) vs. elapsed time (0 to 150 hours) for the AIST, SDSC, KISTI, KU and NCHC clusters]
Related Work
Scalability: large-scale experiment at SC2004
Gridifying the QM/MD simulation program based on our approach; executing a simulation using ~1800 CPUs of 3 clusters; our approach can manage a large number of computing resources
Robustness: long-run experiment on the PRAGMA testbed
Executing the TDDFT program for over a month; Ninf-G can detect server faults and return errors correctly
The present experiment reinforces the evidence of the validity of our approach
Long-run QM/MD simulation on the PRAGMA testbed, implementing a scheduling mechanism for flexibility as well as fault tolerance
Necessity of Large-scale Atomistic Simulation
Modern material engineering requires detailed knowledge based on microscopic analysis
Future electronic devices, micro electro mechanical systems (MEMS)
Features of the analysis: nano-scale phenomena
A large number of atoms
Sensitive to the environment, so very high precision is needed
Quantum description of bond breaking
[ Deformation process ][ Stress distribution ]
Large-scale Atomistic Simulation
Stress enhances the possibility of corrosion?
Hybrid QM/MD Simulation (1)
Enabling large-scale simulation with quantum accuracy
Combining classical MD simulation with QM simulation
MD simulation: simulating the behavior of atoms in the entire region, based on classical MD using an empirical inter-atomic potential
QM simulation: modifying the energy calculated by the MD simulation only in the interesting regions, based on density functional theory (DFT)
[Figure: MD simulation of the entire region, with embedded QM simulations based on DFT]
Hybrid QM/MD Simulation (2)
Suitable for grid computing: additive hybridization
QM regions can be set at will and calculated independently; computation dominant
MD and QMs are loosely coupled; communication cost between QM and MD: ~O(N)
Very large computational cost of QM: computation cost of QM ~O(N^3), computation cost of MD ~O(N)
A lot of sources of parallelism: the MD simulation is executed in parallel (with tight communication); each QM simulation is executed in parallel (with tight communication); the QM simulations are executed independently (without communication); the MD and QM simulations are executed in parallel (loosely coupled)
[Figure: the MD simulation is loosely coupled to QM simulations QM1 and QM2, which are independent of each other; communication is tight within the MD simulation and within each QM simulation]
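The additive hybridization above can be sketched in a few lines. This shows a common form of the additive scheme, not the exact functional in the real code: the cheap MD energy covers the entire region, and each QM region contributes an independent correction term, which is why the QM terms can run on separate clusters without communicating with each other. The energy functions here are dummies; only the structure of the sum is the point.

```python
def e_md(atoms):
    """Dummy O(N) empirical potential over a list of atoms."""
    return 0.1 * len(atoms)

def e_qm(atoms):
    """Dummy stand-in for the O(N^3) DFT energy of one small QM region."""
    return 0.3 * len(atoms)

def hybrid_energy(all_atoms, qm_regions):
    # E = E_MD(entire system) + sum_i [ E_QM(region_i) - E_MD(region_i) ]
    e = e_md(all_atoms)
    for region in qm_regions:      # each term is independent, so parallel
        e += e_qm(region) - e_md(region)
    return e

atoms = list(range(1728))          # 1728 atoms, as in the experiment
regions = [[i] for i in range(5)]  # 5 QM regions of 1 atom each
print(round(hybrid_energy(atoms, regions), 3))  # 173.8
```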
Modifying the Original Program
Eliminating the initial set-up routine in the QM program
Adding an initialization function
Eliminating the loop structure in the QM program
Tailoring the QM simulation as a function
Replacing MPI routines with Ninf-G function calls
[Flowchart: the MD part performs the initial set-up, calculates MD forces of the QM+MD regions, and updates atomic positions and velocities; the QM part calculates the QM force of each QM region]
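The restructured program can be sketched as below. This is an illustrative outline, not the Fortran/Ninf-G code: the QM solver, stripped of its own set-up and time loop, is now a plain function the MD driver calls once per step (via Ninf-G RPC in the real code; here a thread pool mimics asynchronous calls so the QM regions run concurrently). All function names and numbers are invented.

```python
from concurrent.futures import ThreadPoolExecutor

def qm_init(region_id):
    # the added initialization function, replacing the old set-up routine
    return {"region": region_id}

def qm_force(state, positions):
    # formerly the body of the QM program's loop; now a callable function
    return [0.01 * p for p in positions]

def md_step(positions, qm_forces):
    # toy position update; the real MD also uses empirical MD forces
    return [p + f for p, f in zip(positions, qm_forces)]

states = [qm_init(i) for i in range(2)]
positions = [[1.0], [2.0]]

with ThreadPoolExecutor() as pool:
    for step in range(3):                        # MD time loop on the client
        futures = [pool.submit(qm_force, s, p)   # async RPC-style dispatch
                   for s, p in zip(states, positions)]
        forces = [f.result() for f in futures]   # wait for all QM servers
        positions = [md_step(p, fo) for p, fo in zip(positions, forces)]

print(positions)
```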
Implementation of a scheduling mechanism
Inserting a scheduling layer between the application and GRPC layers in the client program
The application does not need to care about scheduling
Functions of the layer:
Dynamic switching of target clusters, checking the availability of clusters (available period, maximum execution time)
Error detection and recovery: detecting server errors and time-outs
Time-outs prevent the application from waiting too long (long waits in the batch queue, long data transfer times)
Trying to continue the simulation on other clusters
Implemented using Ninf-G
Client program
QM/MD simulation layer (Fortran)
Scheduling layer
GRPC layer (Ninf-G system)
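The scheduling layer's behavior can be sketched as follows. All names are invented for illustration; this is not the production code. The layer sits below the application, picks an available cluster, enforces a per-call time-out, and on an error or time-out transparently retries on the next cluster, so the Fortran simulation layer never sees scheduling decisions.

```python
import time

class ClusterDown(Exception):
    pass

class Scheduler:
    def __init__(self, clusters, timeout_s=60.0):
        # clusters: list of (name, call_fn, available_fn) tuples
        self.clusters = clusters
        self.timeout_s = timeout_s

    def call(self, *args):
        for name, fn, available in self.clusters:
            if not available():        # e.g. outside its available period
                continue
            start = time.monotonic()
            try:
                result = fn(*args)
            except ClusterDown:
                continue               # error recovery: try the next cluster
            if time.monotonic() - start > self.timeout_s:
                continue               # too slow (long queue or transfer)
            return name, result
        raise RuntimeError("no cluster could run the call")

def broken(x):
    raise ClusterDown()

def healthy(x):
    return x * 2

sched = Scheduler([("AIST", broken, lambda: True),
                   ("SDSC", healthy, lambda: True)])
print(sched.call(21))  # falls back from AIST to SDSC: ('SDSC', 42)
```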
Long run experiment on the PRAGMA testbed
Goals: continue the simulation as long as possible and check the availability of our programming approach
Experiment time: started on the 18th of April; will end at the end of May (hopefully)
Target simulation: 5 QM atoms inserted in box-shaped Si, 1728 atoms in total, with 5 QM regions each consisting of only 1 atom
Entire region Central region Time evolution of the system
Testbed for the experiment
AIST: UME
NCHC: ASE
SINICA: PRAGMA
SDSC: Rocks-52, Rocks-47
UNAM: Malicia
KU: AMATA
NCSA: TGC
8 clusters of 7 institutes in 5 countries: AIST, KU, NCHC, NCSA, SDSC, SINICA and UNAM
Porting is underway for 5 other clusters: CNIC, KISTI, BII, TITECH and USM
Using 2 CPUs for each QM simulation
Changing the target cluster every 2 hours
Porting the application
5 steps to port our application: (1) check accessibility using ssh; (2) execute a sequential program using globus-job-run; (3) execute an MPI program using globus-job-run; (4) execute a Ninfied program; (5) execute our application
Troubles
jobmanager-sge had bugs in executing MPI programs; a fixed version was released from AIST
An inappropriate MPI was specified in jobmanagers: LAM/MPI does not support execution through Globus, and MPICH-G is not available due to the certificate problem; it is recommended to use the MPICH library
[Diagram: the client holds a full certificate and submits through GRAM to the front end, which launches the job on the back end via PBS/SGE and mpirun using a limited certificate]
Executing the application
Expiration of certificates: we had to take care of many kinds of Globus-related certificates
User cert, host cert, CA cert, CRL…
The Globus error message is unhelpful: "check host and port"
Poor I/O performance: programs compiled by the Intel Fortran compiler take a lot of time for I/O
2 hours to output several Mbytes of data! Remedied by specifying buffered I/O
Using an NFS file system is another cause of poor I/O performance
Remaining processes: server processes remain on the back-end nodes even when the job is deleted from the batch queue; SCMS Web is very convenient for finding such remaining processes
Preliminary result of the experiment
Succeeded in calculating ~10,000 time steps during 2 weeks
Number of GRPCs executed: 47,593; number of failures/time-outs: 524
Most of them (~80%) occurred in the connection phase
Due to connection failures, batch system downtime, or queuing time-out (the time-out for queuing is ~60 sec)
Other failures include: exceeding the max execution time (2 hours), exceeding the max execution time per time step (5 min), and exceeding the max CPU time specified by a cluster (900 sec)
Giving a demonstration!!
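The time-out limits above can be made concrete with a small sketch. The limit values come from the slide; the checking logic itself is illustrative, not the production code.

```python
LIMITS = {
    "total_execution_s": 2 * 3600,  # max execution time per cluster: 2 hours
    "per_step_s":        5 * 60,    # max execution time per time step: 5 min
    "cpu_time_s":        900,       # max CPU time some clusters enforce
}

def violated(total_s, step_s, cpu_s):
    """Return which limits a running RPC has exceeded, if any."""
    measured = {"total_execution_s": total_s,
                "per_step_s":        step_s,
                "cpu_time_s":        cpu_s}
    return [k for k, limit in LIMITS.items() if measured[k] > limit]

print(violated(total_s=7300, step_s=100, cpu_s=950))
```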
Execution Profile: Scheduling
Example of exceeding the maximum execution time
(~60 sec) (~80 sec)
Execution Profile: Error Recovery
Examples of error recovery: batch system fault, queuing time-out, execution time-out