Introduction to Parallel Programming with C and MPI at MCSR – Part 1
The University of Southern Mississippi, April 8, 2010

Page 1:

Introduction to Parallel Programming

with C and MPI at MCSR

Part 1

The University of Southern Mississippi

April 8, 2010

Page 2:

What is a Supercomputer?

Loosely speaking, it is a “large” computer with an architecture that has been optimized for solving bigger problems faster than a conventional desktop, mainframe, or server computer.

- Pipelining

- Parallelism (lots of CPUs or Computers)

Page 3:

Supercomputers at MCSR: mimosa

- 253 CPU Intel Linux Cluster – Pentium 4
- Distributed memory – 500 MB – 1 GB per node
- Gigabit Ethernet

Page 4:

Supercomputers at MCSR: redwood

- 224 CPU Shared Memory Supercomputer
- Intel Itanium 2
- Shared memory: 1 GB per node

Page 5:

Supercomputers at MCSR: sequoia

- 46 node Linux Cluster
- 8 cores (CPUs) per node = 368 cores total
- 2 GB memory per core (16 GB per node)
- Shared memory intra-node
- Distributed memory inter-node
- Intel Xeon processors

Page 6:

Supercomputers at MCSR: sequoia

Page 7:

What is Parallel Computing?

Using more than one computer (or processor) to complete a computational problem

Page 8:

How May a Problem be Parallelized?

Data Decomposition – e.g., each process performs the same computation on a different portion of the data

Task Decomposition – e.g., each process performs a different task or stage of the overall computation

Page 9:

Models of Parallel Programming

• Message Passing Computing
– Processes coordinate and communicate results via calls to message passing library routines
– Programmers “parallelize” the algorithm and add message calls
– At MCSR, this is via MPI programming with C or Fortran:
• Sweetgum – Origin 2800 Supercomputer (128 CPUs)
• Mimosa – Beowulf Cluster with 253 Nodes
• Redwood – Altix 3700 Supercomputer (224 CPUs)

• Shared Memory Computing
– Processes or threads coordinate and communicate results via shared memory variables
– Care must be taken not to modify the wrong memory areas
– At MCSR, this is via OpenMP programming with C or Fortran on sweetgum

Page 10:

Message Passing Computing at MCSR

• Process Creation
• Manager and Worker Processes
• Static vs. Dynamic Work Allocation
• Compilation
• Models
• Basics
• Synchronous Message Passing
• Collective Message Passing
• Deadlocks
• Examples

Page 11:

Message Passing Process Creation

• Dynamic – one process spawns other processes & gives them work
– PVM
– More flexible
– More overhead – process creation and cleanup

• Static – total number of processes determined before execution begins
– MPI

Page 12:

Message Passing Processes

• Often, one process will be the manager, and the remaining processes will be the workers

• Each process has a unique rank/identifier

• Each process runs in a separate memory space and has its own copy of variables

Page 13:

Message Passing Work Allocation

• Manager Process
– Does initial sequential processing
– Initially distributes work among the workers (statically or dynamically)
– Collects the intermediate results from workers
– Combines them into the final solution

• Worker Process
– Receives work from, and returns results to, the manager
– May distribute work amongst themselves (decentralized load balancing)

Page 14:

Message Passing Compilation

• Compile/link programs w/ message passing libraries using regular (sequential) compilers

• Fortran MPI example:
include 'mpif.h'

• C MPI example:
#include "mpi.h"
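
For reference, compiling such a program just means pointing the compiler at the MPI library. The first command below assumes an mpicc wrapper is installed, which may not be the case on every MCSR machine; the second form is the one used later in the sequoia PBS script.

mpicc -o hello_mpi.exe hello_mpi.c        # MPI compiler wrapper, if available
icc -lmpi -o hello_mpi.exe hello_mpi.c    # regular compiler linked with the MPI library (sequoia)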

Page 15:

Message Passing Compilation

Page 16:
Page 17:

Message Passing Models

• SPMD – Single Program/Multiple Data
– Single version of the source code used for each process
– Manager executes one portion of the program; workers execute another; some portions executed by both
– Requires one compilation per architecture type
– MPI

• MPMD – Multiple Program/Multiple Data
– One source code for the master; another for the slave
– Each must be compiled separately
– PVM

Page 18:

Message Passing Basics

• Each process must first establish the message passing environment

• Fortran MPI example:
integer ierror
call MPI_INIT(ierror)

• C MPI example:
MPI_Init(&argc, &argv);

Page 19:

Message Passing Basics

• Each process has a rank, or id number
– 0, 1, 2, …, n-1, where there are n processes

• With SPMD, each process must determine its own rank by calling a library routine

• Fortran MPI Example:
integer comm, rank, ierror
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)

• C MPI Example:
MPI_Comm_rank(MPI_COMM_WORLD, &rank);

Page 20:

Message Passing Basics

• Each process has a rank, or id number
– 0, 1, 2, …, n-1, where there are n processes

• Each process may use a library call to determine how many total processes it has to play with

• Fortran MPI Example:
integer comm, size, ierror
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)

• C MPI Example:
MPI_Comm_size(MPI_COMM_WORLD, &size);
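
Putting the rank and size calls together, a minimal sketch of a complete C program (variable names are illustrative):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                  /* establish the message passing environment */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's id: 0 .. size-1            */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes                 */

    printf("Process %d of %d\n", rank, size);

    MPI_Finalize();                          /* shut down the MPI environment             */
    return 0;
}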

Page 21:

Message Passing Basics

• Each process has a rank, or id number
– 0, 1, 2, …, n-1, where there are n processes

• Once a process knows the size, it also knows the ranks (id #’s) of those other processes, and can send or receive a message to/from any other process.

• C Example:
MPI_Send(buf, count, datatype, dest, tag, comm)
MPI_Recv(buf, count, datatype, source, tag, comm, status)
– buf, count, and datatype describe the DATA; dest/source, tag, and comm form the ENVELOPE; status reports the result of the receive

Page 22:

MPI Send and Receive Arguments

• Buf – starting location of data
• Count – number of elements
• Datatype – MPI_INTEGER, MPI_REAL, MPI_CHARACTER, …
• Destination – rank of the process to whom the msg is being sent
• Source – rank of the sender from whom the msg is being received, or MPI_ANY_SOURCE
• Tag – integer chosen by the program to indicate the type of message, or MPI_ANY_TAG
• Communicator – identifies the process team, e.g., MPI_COMM_WORLD
• Status – the result of the call (such as the # of data items received)
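
A hedged example of a matching send/receive pair in C; the buffer size (10), tag (0), and ranks are arbitrary choices for illustration, and the program needs at least 2 processes:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, data[10], i;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        for (i = 0; i < 10; i++) data[i] = i;
        MPI_Send(data, 10, MPI_INT, 1, 0, MPI_COMM_WORLD);           /* dest = 1, tag = 0   */
    } else if (rank == 1) {
        MPI_Recv(data, 10, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);  /* source = 0, tag = 0 */
        printf("Process 1 received %d ... %d\n", data[0], data[9]);
    }

    MPI_Finalize();
    return 0;
}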

Page 23:

Synchronous Message Passing

• Message calls may be blocking or nonblocking

• Blocking Send
– Waits to return until the message has been received by the destination process
– This synchronizes the sender with the receiver

• Nonblocking Send
– Return is immediate, without regard for whether the message has been transferred to the receiver
– DANGER: Sender must not change the variable containing the old message before the transfer is done
– MPI_Isend() is nonblocking

Page 24:

Synchronous Message Passing

• Locally Blocking Send
– The message is copied from the send parameter variable to an intermediate buffer in the calling process
– Returns as soon as the local copy is complete
– Does not wait for the receiver to transfer the message from the buffer
– Does not synchronize
– The sender’s message variable may safely be reused immediately
– MPI_Send() is locally blocking

Page 25:

Synchronous Message Passing

• Blocking Receive
– The call waits until a message matching the given tag has been received from the specified source process
– MPI_Recv() is blocking

• Nonblocking Receive
– If this process has a qualifying message waiting, retrieves that message and returns
– If no messages have been received yet, returns anyway
– Used if the receiver has other work it can be doing while it waits
– Status tells the receiver whether the message was received
– MPI_Irecv() is nonblocking
– MPI_Wait() and MPI_Test() can be used to periodically check whether the message is ready, and finally wait for it, if desired
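
A minimal sketch of a nonblocking receive with MPI_Irecv() and MPI_Wait() (needs at least 2 processes; the “other work” is only a placeholder comment here):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, value = 42, incoming = 0;
    MPI_Request request;
    MPI_Status  status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Irecv(&incoming, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);  /* returns immediately */
        /* ... other useful work could be done here while the message is in transit ... */
        MPI_Wait(&request, &status);         /* block until the message has actually arrived */
        printf("Received %d\n", incoming);
    }

    MPI_Finalize();
    return 0;
}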

Page 26:

Collective Message Passing

• Broadcast
– Sends a message from one to all processes in the group

• Scatter
– Distributes each element of a data array to a different process for computation

• Gather
– The reverse of scatter … retrieves data elements into an array from multiple processes

Page 27:

Collective Message Passing w/MPI

MPI_Bcast() Broadcast from root to all other processes

MPI_Gather() Gather values for group of processes

MPI_Scatter() Scatters buffer in parts to group of processes

MPI_Alltoall() Sends data from all processes to all processes

MPI_Reduce() Combines values on all processes to a single value

MPI_Reduce_scatter() Combines values and scatters the results
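
As an illustration (the values are arbitrary), a broadcast followed by a reduction in C:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, value = 0, sum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) value = 100;                           /* only the root has the value yet  */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);     /* now every process has a copy     */

    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM,
               0, MPI_COMM_WORLD);                        /* sum of all ranks, left on rank 0 */
    if (rank == 0) printf("value = %d, sum of ranks = %d\n", value, sum);

    MPI_Finalize();
    return 0;
}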

Page 28:

Message Passing Deadlock

• Deadlock can occur when all critical processes are waiting for messages that never come, or waiting for buffers to clear out so that their own messages can be sent

• Possible Causes
– Program/algorithm errors
– Message and buffer sizes

• Solutions
– Order operations more carefully
– Use nonblocking operations
– Add debugging output statements to your code to find the problem
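
For example, if two processes each posted a blocking receive before their send, both would wait forever. One safe ordering (a sketch; needs at least 2 processes) is for the partners to post their operations in opposite orders:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, mine, theirs = 0;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    mine = rank * 10;

    if (rank == 0) {                /* rank 0: send first, then receive */
        MPI_Send(&mine,   1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(&theirs, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &status);
    } else if (rank == 1) {         /* rank 1: receive first, then send */
        MPI_Recv(&theirs, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        MPI_Send(&mine,   1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }
    if (rank < 2) printf("Rank %d got %d\n", rank, theirs);

    MPI_Finalize();
    return 0;
}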

Page 29:
Page 30:

Sample PBS Script

sequoia% vi example.pbs

#!/bin/bash
#PBS -l nodes=4          # Mimosa
#PBS -l ncpus=4          # Redwood
#PBS -l ncpus=4          # Sequoia
#PBS -l cput=0:5:0       # Request 5 minutes of CPU time
#PBS -N example
cd $PWD
rm *.pbs.[eo]*
icc -lmpi -o add_mpi.exe add_mpi.c   # Sequoia
mpiexec -n 4 add_mpi.exe             # Sequoia

sequoia% qsub example.pbs
37537.sequoia.mcsr.olemiss.edu

Page 31:

PBS: Querying Jobs
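
The slide is a screenshot; typical PBS query commands (standard PBS usage, not copied from the screenshot) are:

qstat              # show the status of all queued and running jobs
qstat -a           # same, with more detail
qstat 37537        # status of one job, by job number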

Page 32:

MPI Programming Exercises

• Hello World
– sequential
– parallel (w/ MPI and PBS)

• Add the prime numbers in an array of numbers
– sequential
– parallel (w/ MPI and PBS)

Page 33:

Log in to sequoia & get workshop files

A. Use secure shell to login from your PC to hpcwoods:
ssh [email protected]

B. Use secure shell to connect from hpcwoods to your training account on sequoia:
ssh tracct1@sequoia
ssh tracct2@sequoia

C. Copy workshop files into your home directory by running:
/usr/local/apps/ppro/prepare_mpi_workshop

Page 34:
Page 35:

Examine, compile, and execute hello.c

Page 36:

Examine hello_mpi.c

Page 37:

Examine hello_mpi.c

Add macro to include the header file for the MPI library calls.

Page 38:

Examine hello_mpi.c

Add function call to initialize the MPI environment.

Page 39:

Examine hello_mpi.c

Add function call to find out how many parallel processes there are.

Page 40:

Examine hello_mpi.c

Add function call to find out which process this is – the MPI process ID of this process.

Page 41:

Examine hello_mpi.c

Add IF structure so that the manager/boss process can do one thing, and everyone else (the workers/servants) can do something else.

Page 42:

Examine hello_mpi.c

All processes, whether manager or worker, must finalize MPI operations.
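
The code screenshots are not reproduced in this transcript. Assembling the steps described on pages 37–42, hello_mpi.c plausibly looks something like the sketch below; the printed messages are assumptions, not the actual workshop file.

#include <stdio.h>
#include "mpi.h"                              /* header for the MPI library calls  */

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                   /* initialize the MPI environment    */
    MPI_Comm_size(MPI_COMM_WORLD, &size);     /* how many parallel processes       */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);     /* which process this one is         */

    if (rank == 0)                            /* manager/boss does one thing ...   */
        printf("Hello from the manager, process 0 of %d\n", size);
    else                                      /* ... workers/servants do another   */
        printf("Hello from worker %d of %d\n", rank, size);

    MPI_Finalize();                           /* all processes finalize MPI        */
    return 0;
}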

Page 43:

Compile hello_mpi.c

Why won’t this compile?

You must link to the MPI library.

Compile it.

Page 44:

Run hello_mpi.exe

On 1 CPU

On 2 CPUs

On 4 CPUs

Page 45:

hello_mpi.pbs

Page 46:

hello_mpi.pbs

Page 47:

hello_mpi.pbs

Page 48:

hello_mpi.pbs

Page 49:

hello_mpi.pbs

Page 50:

hello_mpi.pbs
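
The hello_mpi.pbs screenshots are likewise not in the transcript; following the pattern of example.pbs on page 30, it presumably resembles the following (the directives, compiler line, and job name are assumptions):

#!/bin/bash
#PBS -l ncpus=4                            # number of CPUs requested (sequoia)
#PBS -l cput=0:5:0                         # request 5 minutes of CPU time
#PBS -N hello_mpi
cd $PWD
rm *.pbs.[eo]*
icc -lmpi -o hello_mpi.exe hello_mpi.c     # compile, linking the MPI library
mpiexec -n 4 hello_mpi.exe                 # run on 4 processes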

Page 51:

Submit hello_mpi.pbs

Page 52:

Submit hello_mpi.pbs

Page 53:

Submit hello_mpi.pbs

Page 54:

Examine, compile, and execute add_mpi.c

Page 55:

Examine, compile, and execute add_mpi.c

Page 56:

Examine, compile, and execute add_mpi.c

Page 57:

Examine, compile, and execute add_mpi.c

Page 58:

Examine, compile, and execute add_mpi.c

Page 59:
Page 60:

Examine, compile, and execute add_mpi.c

Page 61:

Examine, compile, and execute add_mpi.c

Page 62:

Examine, compile, and execute add_mpi.c

Page 63:

Examine, compile, and execute add_mpi.c

Page 64:
Page 65:

Examine, compile, and execute add_mpi.c

Page 66:

Examine, compile, and execute add_mpi.c

Page 67:

Examine, compile, and execute add_mpi.c
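
The add_mpi.c screenshots are also not reproduced. One plausible structure for the exercise, summing the primes in an array by giving each process an equal slice, is sketched below; MAXSIZE, the is_prime() helper, and the overall layout are assumptions based on the slide titles, not the actual workshop code.

#include <stdio.h>
#include "mpi.h"

#define MAXSIZE 1000                           /* assumed array size */

static int is_prime(int x)                     /* assumed helper: 1 if x is prime */
{
    int d;
    if (x < 2) return 0;
    for (d = 2; d * d <= x; d++)
        if (x % d == 0) return 0;
    return 1;
}

int main(int argc, char *argv[])
{
    int data[MAXSIZE], rank, size, i, lo, hi, mysum = 0, total = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (i = 0; i < MAXSIZE; i++) data[i] = i; /* every process fills the same array    */

    lo = rank * (MAXSIZE / size);              /* this process's slice of the array     */
    hi = lo + (MAXSIZE / size);                /* drops elements if MAXSIZE % size != 0 */
    for (i = lo; i < hi; i++)
        if (is_prime(data[i])) mysum += data[i];

    MPI_Reduce(&mysum, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("Sum of primes = %d\n", total);

    MPI_Finalize();
    return 0;
}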

Page 68:

Examine add_mpi.pbs

Page 69:

Examine add_mpi.pbs

Page 70:

Examine add_mpi.pbs

Page 71:

Submit PBS Script: add_mpi.pbs

Page 72:

Examine Output and Errors add_mpi.c

Page 73:

Determine Speedup
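
The slide content is a screenshot; the standard definition used for this kind of exercise is speedup S(n) = T(1) / T(n), the sequential run time divided by the run time on n processes.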

Page 74:

Determine Parallel Efficiency
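
Similarly, parallel efficiency is E(n) = S(n) / n, the speedup divided by the number of processes; 1.0 would mean perfect scaling.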

Page 75:
Page 76:

How Could Speedup/Efficiency Improve?

Page 77:

What Happens to Results When MAXSIZE Not Evenly Divisible by n?

Page 78:

Exercise 1: Change Code to Work When MAXSIZE is Not Evenly Divisible by n

Page 79:

Exercise 2: Change Code to Improve Speedup