Page 1: An Introduction to Parallel Programming with MPI

An Introduction to Parallel Programming with MPI

March 22, 24, 29, 31, 2005

David Adams
[email protected]

http://research.cs.vt.edu/lasca/schedule

Page 2: An Introduction to Parallel Programming with MPI

Outline

Disclaimers
Overview of basic parallel programming on a cluster with the goals of MPI
Batch system interaction
Startup procedures
Quick review
Blocking message passing
Non-blocking message passing
Lab day
Collective communications

Page 3: An Introduction to Parallel Programming with MPI

Review

Functions we have covered in detail:
MPI_INIT
MPI_FINALIZE
MPI_COMM_SIZE
MPI_COMM_RANK
MPI_SEND
MPI_RECV

Useful constants:
MPI_COMM_WORLD
MPI_ANY_SOURCE
MPI_ANY_TAG
MPI_SUCCESS
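
As a quick reminder of how these fit together, here is a minimal C sketch (not from the slides; the payload, ranks, and tag are illustrative) in which rank 1 sends one integer to rank 0, assuming the program runs on at least two processes:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {
        value = 42;                                   /* message payload */
        MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else if (rank == 0) {
        MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
        printf("rank 0 of %d received %d from rank %d\n",
               size, value, status.MPI_SOURCE);
    }

    MPI_Finalize();
    return 0;
}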

Page 4: An Introduction to Parallel Programming with MPI

Motivating Example for Deadlock

[Figure: ten processes, P1 through P10, arranged in a ring; each process is labeled with a SEND to one neighbor and a RECV from the other.]
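
The slides give no source code for this example; a minimal C sketch of the kind of exchange the figure suggests (buffer names, neighbor orientation, and message size are assumptions) is:

#include <mpi.h>

/* Every process first SENDs to its right neighbor, then RECVs from its
   left neighbor.  If MPI_Send cannot buffer the message, every process
   blocks in MPI_Send waiting for a receive that has not yet been posted. */
void ring_exchange_naive(int rank, int size, double *sendbuf,
                         double *recvbuf, int count)
{
    int right = (rank + 1) % size;
    int left  = (rank + size - 1) % size;
    MPI_Status status;

    MPI_Send(sendbuf, count, MPI_DOUBLE, right, 0, MPI_COMM_WORLD);
    MPI_Recv(recvbuf, count, MPI_DOUBLE, left, 0, MPI_COMM_WORLD, &status);
}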

Pages 5-14: An Introduction to Parallel Programming with MPI

Motivating Example for Deadlock

[Figure: the same ring of processes P1 through P10 is repeated on each of these ten slides, with only the label advancing from "Timestep: 1" through "Timestep: 10!".]

Page 15: An Introduction to Parallel Programming with MPI

Solution

MPI_SENDRECV(sendbuf, sendcount, sendtype, dest, sendtag, recvbuf, recvcount, recvtype, source, recvtag, comm, status, ierror)

The semantics of a send-receive operation is what would be obtained if the caller forked two concurrent threads, one to execute the send and one to execute the receive, followed by a join of these two threads.
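
In the C binding this is MPI_Sendrecv; a sketch of the ring exchange from the previous slides rewritten with it (buffer names and message size are again assumptions) might look like this:

#include <mpi.h>

/* The send and the receive are paired inside one call, so no process can
   sit blocked in a send waiting for a receive that is never posted. */
void ring_exchange_sendrecv(int rank, int size, double *sendbuf,
                            double *recvbuf, int count)
{
    int right = (rank + 1) % size;
    int left  = (rank + size - 1) % size;
    MPI_Status status;

    MPI_Sendrecv(sendbuf, count, MPI_DOUBLE, right, 0,
                 recvbuf, count, MPI_DOUBLE, left,  0,
                 MPI_COMM_WORLD, &status);
}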

Page 16: An Introduction to Parallel Programming with MPI

Nonblocking Message Passing

Allows for the overlap of communication and computation.
Completion of a message is broken into four steps instead of two:
post-send
complete-send
post-receive
complete-receive

Page 17: An Introduction to Parallel Programming with MPI

Posting Operations

MPI_ISEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, REQUEST, IERROR)
IN <type> BUF(*)
IN INTEGER COUNT, DATATYPE, DEST, TAG, COMM
OUT INTEGER REQUEST, IERROR

MPI_IRECV(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, REQUEST, IERROR)
IN <type> BUF(*)
IN INTEGER COUNT, DATATYPE, SOURCE, TAG, COMM
OUT INTEGER REQUEST, IERROR
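
The slides show the Fortran-style bindings; in C the request comes back through an MPI_Request handle. A small sketch of the posting step (the buffers, partner rank, and tag are illustrative only) is:

#include <mpi.h>

/* Post a nonblocking send and receive; the MPI_Request handles link these
   posts to a later completion call (MPI_Wait or MPI_Test). */
void post_exchange(double *outbuf, double *inbuf, int count, int partner,
                   MPI_Request *send_req, MPI_Request *recv_req)
{
    MPI_Isend(outbuf, count, MPI_DOUBLE, partner, 0, MPI_COMM_WORLD, send_req);
    MPI_Irecv(inbuf,  count, MPI_DOUBLE, partner, 0, MPI_COMM_WORLD, recv_req);
}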

Page 18: An Introduction to Parallel Programming with MPI

Request Objects

All nonblocking communications use request objects to identify communication operations and link the posting operation with the completion operation.
Conceptually, a request object can be thought of as a pointer to a specific message instance floating around in MPI space.
Just as with pointers, request handles must be treated with care or you can create request handle leaks (like a memory leak) and completely lose access to the status of a message.

Page 19: An Introduction to Parallel Programming with MPI

Request Objects

The value MPI_REQUEST_NULL is used to indicate an invalid request handle. Operations that deallocate request objects set the request handle to this value.

Posting operations allocate memory for request objects; completion operations deallocate that memory and clean up the space.

Page 20: An Introduction to Parallel Programming with MPI

Completion Operations

MPI_WAIT(REQUEST, STATUS, IERROR)
INOUT INTEGER REQUEST
OUT STATUS, IERROR

A call to MPI_WAIT returns when the operation identified by REQUEST is complete.
MPI_WAIT is the blocking version of the completion operations, used where the program has determined it cannot do any more useful work without completing the current message. In this case, it chooses to block until the corresponding send or receive completes.
In iterative parallel code, an MPI_WAIT is often placed directly before the next post operation that intends to use the same request object variable.
Successful completion of MPI_WAIT sets REQUEST = MPI_REQUEST_NULL.
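
A small C sketch of that iterative pattern (the loop body and buffer handling are illustrative assumptions, not code from the course):

#include <mpi.h>

void iterate(double *buf, int count, int partner, int nsteps)
{
    MPI_Request req = MPI_REQUEST_NULL;
    MPI_Status status;

    for (int step = 0; step < nsteps; step++) {
        if (req != MPI_REQUEST_NULL)
            MPI_Wait(&req, &status);      /* complete the previous send just
                                             before reusing req and buf */
        /* ... fill buf with this step's data ... */
        MPI_Isend(buf, count, MPI_DOUBLE, partner, step, MPI_COMM_WORLD, &req);
    }
    MPI_Wait(&req, &status);              /* complete the final send */
}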

Page 21: An Introduction to Parallel Programming with MPI

Completion Operations

MPI_TEST(REQUEST, FLAG, STATUS, IERROR)
INOUT INTEGER REQUEST
OUT STATUS(MPI_STATUS_SIZE)
OUT LOGICAL FLAG

A call to MPI_TEST returns flag=true if the operation identified by REQUEST is complete.
MPI_TEST is the nonblocking version of the completion operations.
If flag=true, MPI_TEST cleans up the space associated with REQUEST, deallocating the memory and setting REQUEST = MPI_REQUEST_NULL.
MPI_TEST allows the user to write code that attempts to communicate as much as possible but continues doing useful work if messages are not ready.
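
A sketch of that "keep working while the message is not ready" pattern in C (do_useful_work is a hypothetical placeholder for local computation, not an MPI routine):

#include <mpi.h>

void do_useful_work(void);   /* hypothetical local computation */

void overlap_with_test(MPI_Request *req)
{
    int flag = 0;
    MPI_Status status;

    while (!flag) {
        MPI_Test(req, &flag, &status);   /* flag=true also frees the request */
        if (!flag)
            do_useful_work();            /* keep computing while not ready */
    }
}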

Page 22: An Introduction to Parallel Programming with MPI

Maximizing Overlap

To achieve maximum overlap between computation and communication, communications should be started as soon as possible and completed as late as possible.

Sends should be posted as soon as the data to be sent is available.
Receives should be posted as soon as the receive buffer can be used.
Sends should be completed just before the send buffer is to be reused.
Receives should be completed just before the data in the buffer is to be used.

Overlap can often be increased by reordering the computation, as the sketch below illustrates.
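
A C sketch of this ordering for a single halo exchange (the routine names pack_boundary, compute_interior, and compute_boundary are placeholders for application code, not MPI calls):

#include <mpi.h>

void pack_boundary(double *halo_out);
void compute_interior(void);
void compute_boundary(const double *halo_in);

void halo_step(double *halo_in, double *halo_out, int count, int neighbor)
{
    MPI_Request rreq, sreq;
    MPI_Status status;

    /* Post the receive as early as possible. */
    MPI_Irecv(halo_in, count, MPI_DOUBLE, neighbor, 0, MPI_COMM_WORLD, &rreq);

    /* Post the send as soon as the data to be sent is available. */
    pack_boundary(halo_out);
    MPI_Isend(halo_out, count, MPI_DOUBLE, neighbor, 0, MPI_COMM_WORLD, &sreq);

    /* Reordered computation: work that does not need the incoming message. */
    compute_interior();

    /* Complete the receive just before the received data is used. */
    MPI_Wait(&rreq, &status);
    compute_boundary(halo_in);

    /* Complete the send just before the send buffer would be reused. */
    MPI_Wait(&sreq, &status);
}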

Page 23: An Introduction to Parallel Programming with MPI

Setting up your account for MPI

http://courses.cs.vt.edu/~cs4234/MPI/first_exercise.html

List of 124 machine names:
http://courses.cs.vt.edu/~cs4234/MPI/124hosts.txt

Page 24: An Introduction to Parallel Programming with MPI

More Stuff

Note: to log in to the 124 Linux lab machines from the outside world, use "ssh rlogin.cslab.vt.edu". You will then be logged into one of the machines in the lab.
Set up a public/private key pair. You only have to do this once. It will allow you to launch MPI jobs from any of the McB 124 machines, and have them run on any of these machines, without having to type passwords.

First, enter the command: ssh-keygen -t dsa -N ""
The result of this command will be something like this:
Generating public/private dsa key pair. Enter file in which to save the key (/home/ugrads/NAME/.ssh/id_dsa): Your identification has been saved in /home/ugrads/NAME/.ssh/id_dsa. Your public key has been saved in /home/ugrads/NAME/.ssh/id_dsa.pub. The key fingerprint is: 89:ff:00:5f:06:fd:d0:a2:9e:51:b1:00:cd:0a:76:6f [email protected]

Then do this:
cd .ssh
cp id_dsa.pub authorized_keys2
To make sure this step worked, try ssh'ing to another machine in the lab, e.g., "ssh strawberry". You should be able to do this without being prompted for a password.

Page 25: An Introduction to Parallel Programming with MPI

Even More Stuff

Put /home/staff/ribbens/mpich-1.2.6/bin in your path.
Make a subdirectory, mkdir MPI, and cd to it.
Hello world example:
Copy hello.c from /home/staff/ribbens/MPI.
Compile and link: mpicc -o hello hello.c
Run on 4 processors: mpirun -np 4 hello
Learn more about mpirun: mpirun -help
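
The contents of hello.c are not reproduced in the slides; a typical minimal version (this is an assumption about the file, not the course's actual code) looks like:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello world from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

Built and run with the commands above (mpicc -o hello hello.c; mpirun -np 4 hello), it should print one line per process.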