Abhinav Bhatele, Laxmikant V. Kale University of Illinois at Urbana-Champaign Sameer Kumar IBM T. J. Watson Research Center

Dec 18, 2015

Page 1:

Abhinav Bhatele, Laxmikant V. Kale
University of Illinois at Urbana-Champaign

Sameer Kumar
IBM T. J. Watson Research Center

Page 2: Molecular Dynamics

A system of [charged] atoms with bonds
Use Newtonian mechanics to find the positions and velocities of atoms
Each time step is typically measured in femtoseconds
At each time step:
  calculate the forces on all atoms
  calculate the velocities and move atoms around

September 9th, 2008 Abhinav S Bhatele 2
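
The per-time-step loop on this slide (compute forces, then update velocities and move atoms) can be sketched as follows. This is a minimal illustration, not NAMD's code: the velocity-Verlet integrator, the 1D two-atom harmonic bond, and all names (`harmonic_bond_forces`, `md_step`, `total_energy`) are assumptions chosen to keep the example short.

```python
def harmonic_bond_forces(positions, k=1.0, rest_length=1.0):
    """Forces on two atoms in 1D joined by one harmonic bond (hypothetical force field)."""
    x0, x1 = positions
    stretch = (x1 - x0) - rest_length
    f = k * stretch  # a stretched bond pulls atom 0 toward +x and atom 1 toward -x
    return [f, -f]


def md_step(positions, velocities, masses, dt, force_fn):
    """One time step: forces -> velocities -> positions (velocity Verlet, an assumption)."""
    forces = force_fn(positions)
    # Half-step velocity update from the current forces.
    half_v = [v + 0.5 * dt * f / m for v, f, m in zip(velocities, forces, masses)]
    # Move the atoms using the half-step velocities.
    new_pos = [x + dt * v for x, v in zip(positions, half_v)]
    # Recompute forces at the new positions and finish the velocity update.
    new_forces = force_fn(new_pos)
    new_vel = [v + 0.5 * dt * f / m for v, f, m in zip(half_v, new_forces, masses)]
    return new_pos, new_vel


def total_energy(positions, velocities, masses, k=1.0, rest_length=1.0):
    """Kinetic plus bond energy, used only to sanity-check the integrator."""
    stretch = (positions[1] - positions[0]) - rest_length
    kinetic = sum(0.5 * m * v * v for m, v in zip(masses, velocities))
    return kinetic + 0.5 * k * stretch * stretch
```

Calling `md_step` repeatedly with a small `dt` advances the system; in a real simulation `force_fn` would evaluate the bonded and non-bonded terms for all atoms.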

Page 3: NAMD: NAnoscale Molecular Dynamics

Naïve force calculation is O(N^2)
Reduced to O(N log N) by calculating:
  Bonded forces
  Non-bonded forces, using a cutoff radius:
    Short-range: calculated every time step
    Long-range: calculated every fourth time step (PME)
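
To illustrate why a cutoff radius reduces the short-range work, the sketch below compares an all-pairs search with a cell-list search that only tests atoms in neighboring cells. It is an illustration under assumptions, not NAMD's implementation: there are no periodic boundaries, the cell side equals the cutoff, and the function names are hypothetical. The O(N log N) term on the slide comes from the long-range PME part, which is not shown here.

```python
import itertools
import random


def _dist2(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def pairs_naive(points, cutoff):
    """All-pairs search: tests every one of the N*(N-1)/2 pairs."""
    c2 = cutoff * cutoff
    return {(i, j)
            for i, j in itertools.combinations(range(len(points)), 2)
            if _dist2(points[i], points[j]) <= c2}


def pairs_cell_list(points, cutoff):
    """Bin atoms into cubic cells of side `cutoff`; test only the 27 surrounding cells."""
    c2 = cutoff * cutoff
    cells = {}
    for idx, p in enumerate(points):
        key = tuple(int(c // cutoff) for c in p)
        cells.setdefault(key, []).append(idx)
    found = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in itertools.product((-1, 0, 1), repeat=3):
            for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    # Record each pair once, with the smaller index first.
                    if i < j and _dist2(points[i], points[j]) <= c2:
                        found.add((i, j))
    return found


def random_points(n, box, seed):
    """Hypothetical helper: n uniformly random points in a cube of side `box`."""
    rng = random.Random(seed)
    return [tuple(rng.uniform(0, box) for _ in range(3)) for _ in range(n)]
```

Both functions return the same set of index pairs; the cell-list version simply avoids distance tests between atoms that are too far apart to interact.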

Page 4: NAMD’s Parallel Design

Hybrid of spatial and force decomposition

Page 5: Parallelization using Charm++

Static Mapping

Load Balancing

Bhatele, A., Kumar, S., Mei, C., Phillips, J. C., Zheng, G. & Kale, L. V. 2008 Overcoming Scaling Challenges in Biomolecular Simulations across Multiple Platforms. In Proceedings of IEEE International Parallel and Distributed Processing Symposium, Miami, FL, USA, April 2008.

Page 6: Communication in NAMD

Each patch multicasts its information to many computes
Each compute is a target of two multicasts only
Use ‘proxies’ to send data to different computes on the same processor
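
The effect of proxies on message counts can be sketched as below. This is a simplified model under stated assumptions, not NAMD's data structures: patches form a non-periodic 3D grid, there is exactly one compute per pair of neighboring patches (so each compute receives data from two patches), and a patch's home processor is ignored. The function names are hypothetical.

```python
import itertools


def neighbor_pair_computes(nx, ny, nz):
    """One compute per unordered pair of neighboring patches on an nx*ny*nz grid."""
    patches = set(itertools.product(range(nx), range(ny), range(nz)))
    computes = []
    for p in sorted(patches):
        for off in itertools.product((-1, 0, 1), repeat=3):
            q = tuple(a + b for a, b in zip(p, off))
            # `q > p` keeps each unordered pair once and skips the zero offset.
            if q in patches and q > p:
                computes.append((p, q))
    return computes


def message_counts(computes, compute_to_proc):
    """Return (messages without proxies, messages with proxies).

    Without proxies a patch sends one message per compute it feeds.
    With proxies it sends one message per processor hosting any of its computes,
    and the proxy forwards the data locally.
    """
    direct = 0
    procs_per_patch = {}
    for cid, pair in enumerate(computes):
        for patch in pair:
            direct += 1
            procs_per_patch.setdefault(patch, set()).add(compute_to_proc[cid])
    via_proxies = sum(len(procs) for procs in procs_per_patch.values())
    return direct, via_proxies
```

For example, with `computes = neighbor_pair_computes(3, 3, 3)` and a round-robin mapping `[cid % 4 for cid in range(len(computes))]`, the proxy count is bounded by patches times processors, while the direct count grows with the number of computes.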

Page 7: Topology Aware Techniques

Static placement of patches

Page 8: Topology Aware Techniques (contd.)

Placement of computes

Page 9: Load Balancing in Charm++

Principle of Persistence
  Object communication patterns and computational loads tend to persist over time
Measurement-based Load Balancing
  Instrument computation time and communication volume at runtime
  Use the database to make new load balancing decisions
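
A minimal sketch of the measurement-based idea: record per-object timings at runtime and, relying on the principle of persistence, predict the next load from recent measurements. The `LoadDatabase` class, its methods, and the moving-average predictor are hypothetical and are not the Charm++ load balancing API.

```python
from collections import defaultdict


class LoadDatabase:
    """Hypothetical store of measured per-object computation times."""

    def __init__(self):
        self._samples = defaultdict(list)

    def record(self, obj_id, seconds):
        """Store one measured computation time for an object."""
        self._samples[obj_id].append(seconds)

    def predicted_load(self, obj_id, window=3):
        """Predict the next load as the mean of the last `window` measurements."""
        recent = self._samples[obj_id][-window:]
        return sum(recent) / len(recent) if recent else 0.0
```

A load balancer would call `predicted_load` for every object when making a new placement decision, instead of relying on a static cost model.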

Page 10: NAMD’s Load Balancing Strategy

NAMD uses a dynamic centralized greedy strategy
There are two schemes in play:
  A comprehensive strategy (called once)
  A refinement scheme (called several times during a run)
Algorithm:
  Pick a compute and find a “suitable” processor to place it on

Page 11: Choice of a suitable processor

Among underloaded processors, try to (in order of priority):
  Find a processor with the two patches or their proxies (highest priority)
  Find a processor with one patch or a proxy
  Pick any underloaded processor (lowest priority)
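
The greedy placement and its priority order can be sketched as follows. This is an illustration, not NAMD's load balancer: processing computes heaviest-first, the `overload` threshold of 1.1 times the average load, the tie-break on current load, the fallback to the least-loaded processor, and all names are assumptions.

```python
def pick_processor(compute_patches, compute_load, proc_load, proc_patches, threshold):
    """Among processors that stay under `threshold`, prefer the one already holding
    the most of this compute's patches or proxies (2, then 1, then 0)."""
    best, best_key = None, None
    for proc, load in proc_load.items():
        if load + compute_load > threshold:
            continue  # placing the compute here would overload this processor
        hits = len(compute_patches & proc_patches[proc])
        key = (-hits, load, proc)  # more local patches first, then lighter load
        if best_key is None or key < best_key:
            best, best_key = proc, key
    return best


def greedy_balance(computes, num_procs, patch_home, overload=1.1):
    """computes: {compute_id: (load, frozenset_of_patches)}; patch_home: {patch: proc}."""
    average = sum(load for load, _ in computes.values()) / num_procs
    threshold = overload * average
    proc_load = {p: 0.0 for p in range(num_procs)}
    proc_patches = {p: set() for p in range(num_procs)}
    for patch, proc in patch_home.items():
        proc_patches[proc].add(patch)
    placement = {}
    # Heaviest computes first (an assumption; the slide only says "pick a compute").
    for cid, (load, patches) in sorted(computes.items(), key=lambda kv: -kv[1][0]):
        proc = pick_processor(patches, load, proc_load, proc_patches, threshold)
        if proc is None:
            proc = min(proc_load, key=proc_load.get)  # fallback: least-loaded processor
        placement[cid] = proc
        proc_load[proc] += load
        proc_patches[proc] |= patches  # any missing patch now needs a proxy here
    return placement, proc_load
```

Because a placed compute adds its patches to the processor's set, later computes that share those patches are drawn to the same processor, which tends to keep the number of proxies low.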

Page 12: Load Balancing Metrics

Load Balance: Bring max-to-avg ratio close to 1
Communication Volume: Minimize the number of proxies
Communication Traffic: Minimize hop-bytes
  Hop-bytes = Message size × Distance traveled by message
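
Two of these metrics can be computed directly from their definitions. The sketch below assumes a torus network with wraparound links and measures distance in hops; the torus shape, the message tuple layout, and the function names are illustrative assumptions.

```python
def max_to_avg_ratio(loads):
    """Load balance metric: maximum processor load divided by the average load."""
    return max(loads) / (sum(loads) / len(loads))


def torus_hops(a, b, dims):
    """Shortest hop count between coordinates a and b on a torus of shape `dims`."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def hop_bytes(messages, dims):
    """Sum of message size times distance traveled.

    messages: iterable of (source_coord, destination_coord, size_in_bytes).
    """
    return sum(size * torus_hops(src, dst, dims) for src, dst, size in messages)
```

A perfectly balanced run has a max-to-avg ratio of 1, and a mapping that keeps communicating objects on nearby processors lowers the hop-bytes total for the same set of messages.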

Agarwal, T., Sharma, A. & Kale, L. V. 2006 Topology-aware task mapping for reducing communication contention on large parallel machines. In Proceedings of IEEE International Parallel and Distributed Processing Symposium, Rhodes Island, Greece, April 2006.

Page 13: Results: Hop-bytes

Page 14: Results: Performance

Page 15: Simulation of WW Domain

WW: 30,591-atom simulation on NCSA’s Abe cluster

Freddolino, P. L., Liu, F., Gruebele, M., & Schulten, K. 2008 Ten-microsecond MD simulation of a fast-folding WW domain Biophysical Journal 94 L75-L77.

Page 16: Future Work

A scalable distributed load balancing strategy
Generalized scenario:
  multicasts: each object is the target of multiple multicasts
  use topological information to minimize communication
Understanding the effect of various factors on load balancing in detail

Page 17: NAMD Development Team

Parallel Programming Lab, UIUC – Abhinav Bhatele, Sameer Kumar, David Kunzman, Chee Wai Lee, Chao Mei, Gengbin Zheng, Laxmikant V. Kale
Theoretical and Computational Biophysics Group – Jim Phillips, Klaus Schulten

Acknowledgments: Argonne National Laboratory, Pittsburgh Supercomputing Center (Shawn Brown, Chad Vizino, Brian Johanson), TeraGrid