Parallelizing Legacy Applications in Message Passing Programming Model and the Example of MOPAC

Tseng-Hui (Frank) Lin
[email protected] [email protected]

Jan 08, 2016

April 7, 2000 2

Legacy Applications

• Functions performed are still useful
– Large user population
– Large investment already made
– Rewriting is expensive
– Rewriting is risky

• Changed over a long period of time
– Modified by different people
– Historical code
– Dead code
– Old concepts

• Major bugs already fixed

What Legacy Applications Need

• Provide higher resolution

• Handle larger data sets

• Graphical representation of scientific data

• Remain certified

How to Meet the Requirements

• Improve performance: parallel computing

• Keep certified: change critical parts only

• Better user interface: add a GUI

System Configuration [figure]

Distributed vs Shared Memory [figure]

Message Passing Programming

• Non-parallelizable parts
– Data dependences force sequential execution
– Not worth parallelizing

• Workload distribution

• Input data distribution

• Distributed computation
– Load balance

• Results collection
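As an illustration of the workload-distribution and load-balance steps, the sketch below splits a loop's iterations into contiguous blocks, one per process. It is not code from the talk; `block_range` and its parameters are hypothetical names, and it assumes every iteration costs about the same.

```python
def block_range(n_items, n_ranks, rank):
    """Return the half-open range [start, stop) of items owned by `rank`.

    Items are split into contiguous blocks whose sizes differ by at most
    one, which keeps the load balanced when every item costs the same.
    """
    base, extra = divmod(n_items, n_ranks)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    for r in range(4):
        print(r, block_range(10, 4, r))
```

In a message passing program each process would call this with its own rank, compute its block, and send the partial results back for collection.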

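Non-Parallelizable Parts

Amdahl's Law:

$$T_P = (1-p)\,T_S + \frac{p\,T_S}{P}$$

$$S_P = \frac{T_S}{T_P} = \frac{1}{(1-p) + \frac{p}{P}}$$

where T_S is the sequential run time, T_P the run time on P processors, S_P the speed-up, and p the parallelizable fraction of the run time.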
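Amdahl's Law bounds the speed-up at S_P = 1 / ((1 - p) + p / P), where p is the parallelizable fraction of the run time and P the number of processors. A minimal sketch of that arithmetic follows; the function names are illustrative only.

```python
def amdahl_speedup(p, n_procs):
    """Speed-up on n_procs processors when a fraction p of the run time is parallelizable."""
    return 1.0 / ((1.0 - p) + p / n_procs)


def max_speedup(p):
    """Limit of the speed-up as the number of processors grows without bound."""
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    # With 90% of the time parallelizable, 16 processors cannot reach 16x.
    print(amdahl_speedup(0.9, 16), max_speedup(0.9))
```

The second function is the quantity that matters for legacy code: whatever stays sequential caps the achievable speed-up regardless of machine size.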

MOPAC

• Semi-empirical molecular orbital package
– MNDO, MINDO/3, AM1, PM3

• MOPAC 3 submitted to QCPE in 1985

• MOPAC 6 ported to many platforms
– VMS
– UNIX (our work is based on this version)
– DOS/Windows

• MOPAC 7 is current

MOPAC input file

L1 : UHF PULAY MINDO3 VECTORS DENSITY LOCAL T=300

L2 : EXAMPLE OF DATA FOR MOPAC

L3 : MINDO/3 UHF CLOSED-SHELL D2D ETHYLENE

L41: C

L42: C 1.400118 1

L43: H 1.098326 1 123.572063 1

L44: H 1.098326 1 123.572063 1 180.000000 0 2 1 3

L45: H 1.098326 1 123.572063 1 90.000000 0 1 2 3

L46: H 1.098326 1 123.572063 1 270.000000 0 1 2 3

L5 :

• L1: Keywords

• L2: Title

• L3: Comments

• L41–L46: Molecule structure in Z-matrix (internal coordinates)

• L5: Blank line (end of data)
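To make the layout of the input file concrete, here is a small sketch that splits such a deck into its parts. It assumes the first three lines are keywords, title, and comments in that order, that the `L1 :` style labels on the slide are annotations rather than file content, and that a blank line ends the geometry. `parse_mopac_input` is a hypothetical helper, not part of MOPAC.

```python
def parse_mopac_input(text):
    """Split a MOPAC-style input deck into keywords, title, comments, geometry."""
    lines = text.splitlines()
    keywords = lines[0].split()
    title = lines[1].strip()
    comments = lines[2].strip()
    geometry = []
    for line in lines[3:]:
        if not line.strip():          # blank line marks end of data
            break
        fields = line.split()
        # First field is the element symbol; the rest are Z-matrix entries.
        geometry.append((fields[0], fields[1:]))
    return {"keywords": keywords, "title": title,
            "comments": comments, "geometry": geometry}


SAMPLE = """\
 UHF PULAY MINDO3 VECTORS DENSITY LOCAL T=300
 EXAMPLE OF DATA FOR MOPAC
 MINDO/3 UHF CLOSED-SHELL D2D ETHYLENE
 C
 C 1.400118 1
 H 1.098326 1 123.572063 1
 H 1.098326 1 123.572063 1 180.000000 0 2 1 3
 H 1.098326 1 123.572063 1 90.000000 0 1 2 3
 H 1.098326 1 123.572063 1 270.000000 0 1 2 3

"""

if __name__ == "__main__":
    deck = parse_mopac_input(SAMPLE)
    print(deck["keywords"], len(deck["geometry"]))
```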

Hartree-Fock Self Consistent Field

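Schrödinger equation:

$$\hat{H}\Psi = E\Psi$$

Matrix equation form (3.1.3):

$$FC = SC\varepsilon$$

Matrix representation of Fock matrix (3.1.4):

$$F_{\mu\nu} = H^{core}_{\mu\nu} + \sum_{a}^{n/2}\sum_{\lambda\sigma} c_{\lambda a}c^{*}_{\sigma a}\left[2(\mu\nu|\sigma\lambda) - (\mu\lambda|\sigma\nu)\right] = H^{core}_{\mu\nu} + \sum_{\lambda\sigma} P_{\lambda\sigma}\left[(\mu\nu|\sigma\lambda) - \tfrac{1}{2}(\mu\lambda|\sigma\nu)\right]$$

$$P_{\lambda\sigma} = 2\sum_{a}^{n/2} c_{\lambda a}c^{*}_{\sigma a}$$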

HF-SCF Procedure

S1: Calculate molecular integrals, O(n^4)

S2: Guess initial eigenvector C

S3: Use C to compute F, O(n^4)

S4: Transform F to orthogonal basis, O(n^3); diagonalize F to get a new C, O(n^3)

S5: Stop if C has converged

S6: Guess new C and go to S3

$$FC = SC\varepsilon$$
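The fixed-point structure of steps S2 to S6 can be sketched on a toy problem. The model below is an assumption made for illustration only: two electrons in two orthonormal orbitals (so S is the identity and the S4 transform is skipped), with a made-up mean-field term `u` standing in for the real Fock build. It iterates on orbital populations rather than on C itself and is unrelated to MOPAC's actual integrals.

```python
import math


def lowest_eigenvector(a, b, t):
    """Unit eigenvector for the lower eigenvalue of [[a, t], [t, b]]."""
    theta = 0.5 * math.atan2(2.0 * t, a - b)       # Jacobi angle
    c, s = math.cos(theta), math.sin(theta)
    candidates = [(c, s), (-s, c)]
    # Pick the candidate with the smaller Rayleigh quotient.
    return min(candidates,
               key=lambda v: a * v[0] ** 2 + 2 * t * v[0] * v[1] + b * v[1] ** 2)


def toy_scf(h11, h22, h12, u, tol=1e-10, max_iter=200):
    """Iterate S2-S6; returns (population 1, population 2, iterations used)."""
    n1, n2 = 1.0, 1.0                              # S2: initial guess
    for iteration in range(1, max_iter + 1):
        f11 = h11 + 0.5 * u * n1                   # S3: build F from the guess
        f22 = h22 + 0.5 * u * n2
        c1, c2 = lowest_eigenvector(f11, f22, h12)  # S4: diagonalize F
        new_n1, new_n2 = 2.0 * c1 * c1, 2.0 * c2 * c2
        if abs(new_n1 - n1) < tol:                 # S5: stop if converged
            return new_n1, new_n2, iteration
        n1, n2 = new_n1, new_n2                    # S6: new guess, back to S3
    raise RuntimeError("SCF did not converge")


if __name__ == "__main__":
    print(toy_scf(-1.0, 0.0, -0.5, 0.5))
```

Every pass through the loop repeats the diagonalization, which is why that step dominates once the Fock build itself becomes cheap.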

MOPAC computation

• Ab initio HF-SCF
– Evaluates all integrals rigorously
– Accuracy
– Requires high computing power
– Limited molecule size

• Semi-empirical HF-SCF
– Uses the same procedure
– Reduces computational complexity
– Supports larger molecules

Semi-empirical SCF

• Ignore some integrals

• Use experimental results to replace integrals

• Assume AO basis is orthogonal

S1, S3: O(n^4) => O(n^2)

S4: orthogonalization not needed

New bottleneck: diagonalization

Complexity: O(n^3)

Parallelization Procedure

• Sequential analysis
– Time profiling analysis
– Program flow analysis
– Computational complexity analysis

• Parallel analysis
– Data dependence resolution
– Loop parallelization

• Integration
– Communication between modules

Sequential Analysis

• Time profiling analysis
– Pick out the computationally intensive parts
– Usually uses smaller input data

• Program flow analysis
– Verify the chosen parts are commonly used
– Domain expert not required

• Computational complexity analysis
– Workload distribution changes significantly with data size

MOPAC Sequential Analysis

Share of run time (%) by data size:

Subroutine      Complexity   1X      10X      100X
DCART           O(n^2)       7.42    0.85     0.09
DENSIT          O(n^3)       15.89   18.10    18.36
DIAG            O(n^3)       64.53   73.51    74.55
HQRII           O(n^3)       6.01    6.85     6.94
Sum                          93.85   99.30    99.93
Max speed-up                 16.26   142.74   1407.57
Sum of O(n^3)                86.43   98.45    99.84
Max speed-up                 7.37    64.69    637.92

Assuming the rest of the program is O(n^2).
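The 10X and 100X columns appear to follow from the 1X profile by scaling each part according to its complexity and renormalizing. The sketch below tries that reading; it is an assumption about how the table was produced, and the names are illustrative. Under this assumption the results agree with the table to within about 0.01.

```python
def project_profile(shares, exponents, scale):
    """Rescale run-time shares (in %) when the data size grows by `scale`.

    A part with complexity O(n^k) grows by scale**k; the shares are then
    renormalized so that they again sum to 100.
    """
    grown = {name: share * scale ** exponents[name]
             for name, share in shares.items()}
    total = sum(grown.values())
    return {name: 100.0 * t / total for name, t in grown.items()}


# 1X shares from the table; "REST" is the unprofiled remainder, assumed O(n^2).
SHARES_1X = {"DCART": 7.42, "DENSIT": 15.89, "DIAG": 64.53, "HQRII": 6.01}
SHARES_1X["REST"] = 100.0 - sum(SHARES_1X.values())
EXPONENTS = {"DCART": 2, "DENSIT": 3, "DIAG": 3, "HQRII": 3, "REST": 2}

if __name__ == "__main__":
    for scale in (1, 10, 100):
        profile = project_profile(SHARES_1X, EXPONENTS, scale)
        # Upper bound on speed-up if all four profiled routines are parallelized.
        bound = 100.0 / profile["REST"]
        print(scale, round(profile["DENSIT"], 2), round(bound, 2))
```

The sequential remainder shrinks relative to the O(n^3) routines as the data grows, which is why the speed-up bound rises so steeply with data size.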

Loop Parallelization

• Scalar forward substitution: remove temporary variables

• Induction variable substitution: resolve dependences

• Loop interchange/merge: enlarge granularity, reduce synchronization

• Scalar expansion: resolve data dependences on scalars

• Variable copying: resolve data dependences on arrays
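Two of these transformations are easy to show on small loops. The functions below are generic textbook-style examples written for illustration, not loops taken from MOPAC; each pair computes the same result before and after the transformation.

```python
def with_induction_variable(a, n):
    """Original form: k carries a dependence from one iteration to the next."""
    out = [0.0] * n
    k = 0
    for i in range(n):
        k = k + 2                      # induction variable
        out[i] = a[k - 1]
    return out


def after_induction_substitution(a, n):
    """k replaced by its closed form 2*(i+1); iterations are now independent."""
    return [a[2 * (i + 1) - 1] for i in range(n)]


def with_scalar_temp(x, y):
    """Original form: the scalar t is shared by all iterations."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        t = x[i] + y[i]
        out[i] = t * t
    return out


def after_scalar_expansion(x, y):
    """t expanded into an array so each iteration owns its own element."""
    n = len(x)
    t = [0.0] * n
    out = [0.0] * n
    for i in range(n):
        t[i] = x[i] + y[i]
    for i in range(n):
        out[i] = t[i] * t[i]
    return out
```

Once no iteration reads a value written by another, the iterations can be assigned to different processes in any order.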

MOPAC Parallelization: DENSIT

• Function: compute density matrix

• Two single-level loops inside a two-level loop nest

• Triangular computational space

• Merge the outer two-level loop nest into one loop with range [1..n(n+1)/2]

• Lower comp/comm ratio (when n is small)

• Benefits from low-latency communication when n is small
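Merging the triangular two-level loop into a single loop needs a mapping from the merged index back to the original pair. The sketch below shows one such mapping; it uses 0-based indices (the slide's range is 1-based), and `unmerge` is a hypothetical name.

```python
import math


def unmerge(k):
    """Map merged index k (0-based) to (i, j) with 0 <= j <= i."""
    # Row i starts at merged index i*(i+1)/2; invert that with an exact
    # integer square root to avoid floating-point rounding.
    i = (math.isqrt(8 * k + 1) - 1) // 2
    j = k - i * (i + 1) // 2
    return i, j


def merged_pairs(n):
    """All (i, j) pairs of the lower triangle, visited by one flat loop."""
    return [unmerge(k) for k in range(n * (n + 1) // 2)]


if __name__ == "__main__":
    print(merged_pairs(3))
```

A flat range of n(n+1)/2 iterations can be cut into equal blocks, whereas splitting only the outer loop of a triangular nest gives the processes unequal amounts of work.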

MOPAC Parallelization: DIAG

• P1: Generate the Fock molecular orbital matrix (FMO)
– Higher comp/comm ratio
– Find the global maximum TINY from the local ones
– Need to redistribute matrix FMO for P2

• P2: 2×2 rotations to eliminate significant off-diagonal elements
– “if” structure causes load imbalance
– Need to interchange the innermost loop outward
– Some calculations run on all nodes to save communication
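For readers unfamiliar with the P2 step, the sketch below shows a generic Jacobi-style 2×2 plane rotation that zeroes one off-diagonal element of a symmetric matrix. It is a textbook rotation written for illustration, not the MOPAC DIAG routine, and the `tiny` threshold here is a hypothetical stand-in for a significance test.

```python
import math


def rotate_pair(a, p, q, tiny=1e-12):
    """Zero a[p][q] of the symmetric matrix `a` with one plane rotation.

    Returns False (and leaves `a` untouched) when the element is already
    insignificant; this is the kind of data-dependent branch that makes
    the work per (p, q) pair uneven.
    """
    if abs(a[p][q]) < tiny:
        return False
    theta = 0.5 * math.atan2(2.0 * a[p][q], a[p][p] - a[q][q])
    c, s = math.cos(theta), math.sin(theta)
    n = len(a)
    for k in range(n):                  # A <- A J (columns p and q)
        akp, akq = a[k][p], a[k][q]
        a[k][p] = c * akp + s * akq
        a[k][q] = -s * akp + c * akq
    for k in range(n):                  # A <- J^T A (rows p and q)
        apk, aqk = a[p][k], a[q][k]
        a[p][k] = c * apk + s * aqk
        a[q][k] = -s * apk + c * aqk
    return True


if __name__ == "__main__":
    m = [[2.0, 1.0, 0.0], [1.0, 3.0, 0.5], [0.0, 0.5, 1.0]]
    rotate_pair(m, 0, 1)
    print(m[0][1], m[1][0])
```

Because some pairs are skipped and others are rotated, a static split of the pairs across processes can leave some of them idle.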

MOPAC Parallelization: HQRII

• Function: standard eigensolver

• R. J. Allen survey

• Use PNNL PeIGS pdspevx() function

• Use MPI communication library

• Exchanges data in small chunks; good if n/p > 8

• Implemented in C, with a different way to pack the matrix (row major)
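Calling a library that packs the matrix differently means reordering the data at the interface. The slide does not spell out either layout, so the sketch below assumes, purely for illustration, a symmetric matrix whose lower triangle is packed column by column on one side and row by row on the other. All function names are hypothetical; none of them belong to PeIGS or MOPAC.

```python
def col_major_index(i, j, n):
    """Position of element (i, j), i >= j, in a column-by-column packed lower triangle."""
    return i + j * (2 * n - j - 1) // 2


def row_major_index(i, j):
    """Position of element (i, j), i >= j, in a row-by-row packed lower triangle."""
    return i * (i + 1) // 2 + j


def repack_col_to_row(packed, n):
    """Reorder a column-packed lower triangle into row-packed order."""
    out = [None] * len(packed)
    for i in range(n):
        for j in range(i + 1):
            out[row_major_index(i, j)] = packed[col_major_index(i, j, n)]
    return out


if __name__ == "__main__":
    # Lower triangle of [[1, 2, 4], [2, 3, 5], [4, 5, 6]], packed by columns.
    print(repack_col_to_row([1, 2, 4, 3, 5, 6], 3))
```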

Integration [figure]

Communication between Modules

• Parallel - sequential
– Use TCP/IP
– Automatically upgrade to shared memory if possible

• Sequential - user interface
– Input and output files
– Application/Advanced Visualization System (AVS) remote module communication

• User interface - display
– AVS

MOPAC Control Panel & Module [figure]

MOPAC GUI [figure]

Data Files and Platform

• Platforms:
– SGI Power Challenge
– IBM SP2

Data file    Light atoms   Heavy atoms   Data size (n_light + 4 n_heavy)
1crn         0             327           1308
Vcop_4       279           169           955
C60_3        0             180           720
C60_2        0             120           480
porphyrn     33            58            265
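The data size column follows directly from the atom counts. A short check of that arithmetic, using the counts from the table (the dictionary and function names are illustrative):

```python
def data_size(n_light, n_heavy):
    """Data size as defined in the table header: n_light + 4 * n_heavy."""
    return n_light + 4 * n_heavy


# (light atoms, heavy atoms) copied from the table.
MOLECULES = {
    "1crn": (0, 327),
    "Vcop_4": (279, 169),
    "C60_3": (0, 180),
    "C60_2": (0, 120),
    "porphyrn": (33, 58),
}

SIZES = {name: data_size(*counts) for name, counts in MOLECULES.items()}

if __name__ == "__main__":
    for name, size in SIZES.items():
        print(name, size)
```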

DENSIT Speed-up [figure]

DENSIT Speed-up [figures: Power Challenge, SP2]

DIAG Speed-up [figure]

DIAG Speed-up [figures: Power Challenge, SP2]

HQRII Speed-up [figure]

HQRII Speed-up [figures: Power Challenge, SP2]

Overall Speed-up [figures: Power Challenge, SP2]

Projected, assuming the sequential part is O(n^2)

Overall Speedup

Data file    16 Power Challenge    16 SP2             32 SP2
1crn         11.856 / 10.015       13.838 / 11.367    22.531 / 16.511
Vcop_4       11.170 / 9.068        12.809 / 10.092    21.380 / 14.598
C60_3        9.845 / 7.810         12.258 / 9.204     18.551 / 12.227
C60_2        8.174 / 6.309         10.156 / 7.373     13.544 / 8.928
porphyrn     5.746 / 4.549         6.042 / 4.722      6.622 / 5.049

Each cell gives two projected values, assuming the non-parallelizable part is O(1) and O(n^2).

Related work: IBM

• Application: conformational search

• Focus: throughput

Related work: SDSC

• Focus: performance

• Parallelizing:
– Evaluate electronic repulsion integrals
– Calculate first and second derivatives
– Solve eigensystem

• Platform: 64-node iPSC/860

• Results:
– Geometry optimization: speedup = 5.2
– Vibration analysis: speedup = 40.8

Achievements

• Parallelize legacy apps from CS perspective

• Keep code validated

• Performance analysis procedures

• Predict large data performance

• Optimize parallel code

• Improve performance

• Improve user interface

Future Work

• Shared memory model

• Web based user interface

• Dynamic node allocation

• Parallelization of subroutines with lower computational complexity