Summary of MPI commands
Luis Basurto
Jan 11, 2016
Large scale systems
• Shared Memory systems
– Memory is shared among processors
• Distributed memory systems
– Each processor has its own memory
MPI
• Created in 1993 as an open standard by users and vendors of large-scale systems.
• Each system vendor implements MPI for its own systems.
• The standard is currently at version MPI-3.1.
• Other implementations of MPI exist, such as MPICH and OpenMPI.
How many commands?
• 130+ commands
• 6 basic commands (we will cover 11)
• C and Fortran bindings
How does an MPI program work?
Start program on n processors
For i = 0 to n-1, in parallel:
    Run a copy of the program on processor i
    Pass messages between processors
End For
End Program
What are messages?
• Simplest message: an array of data of one type.
• Predefined types correspond to commonly used types in a given language
– MPI_REAL (Fortran), MPI_FLOAT (C)
– MPI_DOUBLE_PRECISION (Fortran), MPI_DOUBLE (C)
– MPI_INTEGER (Fortran), MPI_INT (C)
• Users can define more complex (derived) datatypes and send them as messages.
Before we start
Include MPI in our program
• In C/C++
#include "mpi.h"
• In Fortran
include 'mpif.h'
• In C, MPI calls are functions
MPI_Init(NULL, NULL);
• In Fortran they are subroutines
call MPI_Init(ierror)
A note about Fortran
• All Fortran MPI calls take an extra final parameter: an integer error code.
• It is used to test whether the call succeeded (i.e., the call executed correctly).
Basic Communication
• Data values are transferred from one processor to another
– One processor sends the data
– Another receives the data
• Synchronous (blocking)
– The call does not return until the message has been sent or received
• Asynchronous (non-blocking)
– The call only starts the send or receive; a second call is made later to check whether it has finished (see the sketch below)
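A minimal sketch of the asynchronous style, using the standard non-blocking calls MPI_Irecv and MPI_Wait (the buffer size and source rank here are illustrative):

MPI_Request req;
MPI_Status stat;
double buf[100];
/* start the receive; the call returns immediately */
MPI_Irecv(buf, 100, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);
/* ... do useful work while the message is in flight ... */
/* block here until the receive has actually completed */
MPI_Wait(&req, &stat);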
MPI_Init()
• Initializes the MPI environment
• Every MPI program must have this.
• C
– MPI_Init(NULL, NULL);
• If using command line arguments
– MPI_Init(&argc, &argv);
• Fortran
– call MPI_Init(ierror)
MPI_Finalize()
• Stops the MPI environment
• Every MPI program must have this at the end.
• C
MPI_Finalize();
• Fortran
call MPI_Finalize(ierr)
MPI_Comm_size()
• Returns the size of the communicator (number of nodes) that we are working with.
• C
MPI_Comm_size(MPI_COMM_WORLD, &p);
• Fortran
call MPI_COMM_SIZE(MPI_COMM_WORLD, p, ierr)
MPI_Comm_rank()
• Returns the zero-based rank (id number) of the node executing the program.
• C
MPI_Comm_rank(MPI_COMM_WORLD, &id);
• Fortran
call MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)
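Putting the four basic commands together, a minimal complete C program (compile with mpicc, launch with mpirun):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int p, id;
    MPI_Init(&argc, &argv);              /* start the MPI environment  */
    MPI_Comm_size(MPI_COMM_WORLD, &p);   /* total number of nodes      */
    MPI_Comm_rank(MPI_COMM_WORLD, &id);  /* this node's rank, 0..p-1   */
    printf("Hello from node %d of %d\n", id, p);
    MPI_Finalize();                      /* shut the environment down  */
    return 0;
}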
A note on communicators
• MPI_COMM_WORLD is the default communicator (all the nodes the program was started on)
• Communicators can be created dynamically in order to assign certain tasks to certain nodes (processors).
• Inter-communicator message passing is possible.
MPI_Send()
• C
MPI_Send(void *buf, int count, MPI_Datatype dtype, int dest, int tag, MPI_Comm comm);
• Fortran
call MPI_Send(buffer, count, datatype, destination, tag, communicator, ierr)
MPI_Recv()
• C
MPI_Recv(void *buf, int count, MPI_Datatype dtype, int src, int tag, MPI_Comm comm, MPI_Status *stat);
• Fortran
call MPI_Recv(buffer, count, datatype, source, tag, communicator, status, ierr)
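A sketch of the pair in use: rank 0 sends one integer to rank 1 (a fragment; assumes the usual includes, MPI_Init, and at least two nodes):

int rank, value;
MPI_Status stat;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (rank == 0) {
    value = 42;
    MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);         /* to rank 1, tag 0   */
} else if (rank == 1) {
    MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &stat);  /* from rank 0, tag 0 */
    printf("rank 1 received %d\n", value);
}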
MPI_Bcast()
• Sends a message from the root node to all other nodes in the communicator
• C
MPI_Bcast(void *buf, int count, MPI_Datatype dtype, int root, MPI_Comm comm);
• Fortran
call MPI_BCAST(buff, count, datatype, root, comm, ierr)
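A usage sketch (names illustrative): the root sets a parameter and every node ends up with the same value. Note that all nodes make the identical call:

int n;
if (rank == 0) n = 1000;                       /* only the root knows n beforehand */
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* afterwards every node has n      */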
MPI_Reduce()
• Combines data from all nodes with the given operation (e.g. MPI_SUM), element by element, and leaves the result on the root node.
• C
MPI_Reduce(void *sbuf, void *rbuf, int count, MPI_Datatype dtype, MPI_Op op, int root, MPI_Comm comm);
• Fortran
call MPI_REDUCE(sndbuf, recvbuf, count, datatype, operator, root, comm, ierr)
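For instance, summing every node's rank onto node 0 (a sketch):

int my_rank, sum;
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
/* each node contributes my_rank; node 0 receives 0 + 1 + ... + (p-1) */
MPI_Reduce(&my_rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
if (my_rank == 0) printf("sum of ranks = %d\n", sum);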
MPI_Barrier()
• Used as a synchronization barrier: every node that reaches this point must wait until all nodes reach it before proceeding.
• C
MPI_Barrier(MPI_COMM_WORLD);
• Fortran
call MPI_Barrier(MPI_COMM_WORLD,ierr)
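A common usage sketch: synchronize before reading the clock so all nodes time the same phase (MPI_Wtime returns wall-clock seconds; rank comes from MPI_Comm_rank):

double t0, t1;
MPI_Barrier(MPI_COMM_WORLD);   /* make sure every node starts together */
t0 = MPI_Wtime();
/* ... timed work ... */
MPI_Barrier(MPI_COMM_WORLD);   /* wait until every node is done        */
t1 = MPI_Wtime();
if (rank == 0) printf("elapsed: %f s\n", t1 - t0);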
MPI_Scatter()
• Parcels out data from the root to every member of the group, in rank order (see the combined scatter/gather sketch after MPI_Gather below)
• C
MPI_Scatter(void *sbuf, int scount, MPI_Datatype sdtype, void *rbuf, int rcount, MPI_Datatype rdtype, int root, MPI_Comm comm)
• Fortran
call MPI_SCATTER(sndbuf, scount, datatype, recvbuf, rcount, rdatatype, root, comm, ierr)
MPI_Scatter
[Diagram: the root's send buffer is split into equal chunks, one delivered to each of Node 0, Node 1, Node 2, and Node 3]
MPI_Gather()
• The inverse of MPI_Scatter: collects one chunk from every node into the root's receive buffer, in rank order.
• C
MPI_Gather(void *sbuf, int scount, MPI_Datatype sdtype, void *rbuf, int rcount, MPI_Datatype rdtype, int root, MPI_Comm comm)
• Fortran
call MPI_GATHER(sndbuf, scount, datatype, recvbuf, rcount, rdatatype, root, comm, ierr)
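A combined sketch of the two calls (sizes and names illustrative; assumes N divides evenly by the number of nodes): the root scatters an array, each node doubles its chunk, and the root gathers the results back.

#define N 16
int data[N], chunk[N], i, p, rank;
MPI_Comm_size(MPI_COMM_WORLD, &p);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (rank == 0)
    for (i = 0; i < N; i++) data[i] = i;    /* root fills the full array    */
MPI_Scatter(data, N / p, MPI_INT, chunk, N / p, MPI_INT, 0, MPI_COMM_WORLD);
for (i = 0; i < N / p; i++) chunk[i] *= 2;  /* each node works on its chunk */
MPI_Gather(chunk, N / p, MPI_INT, data, N / p, MPI_INT, 0, MPI_COMM_WORLD);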
Deadlock
• The following code may deadlock: both ranks call MPI_Send first, so if neither send can complete before its matching receive is posted, both ranks wait forever.
if (rank == 0) {
    MPI_Send(vec1, vecsize, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    MPI_Recv(vec2, vecsize, MPI_DOUBLE, 1, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
}
if (rank == 1) {
    MPI_Send(vec3, vecsize, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    MPI_Recv(vec4, vecsize, MPI_DOUBLE, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
}
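One standard remedy (a sketch) is to reverse the send/receive order on one of the ranks, or to let the library pair the two operations safely with MPI_Sendrecv:

MPI_Status stat;
if (rank == 0)
    MPI_Sendrecv(vec1, vecsize, MPI_DOUBLE, 1, 0,           /* send to rank 1   */
                 vec2, vecsize, MPI_DOUBLE, 1, MPI_ANY_TAG, /* recv from rank 1 */
                 MPI_COMM_WORLD, &stat);
if (rank == 1)
    MPI_Sendrecv(vec3, vecsize, MPI_DOUBLE, 0, 0,           /* send to rank 0   */
                 vec4, vecsize, MPI_DOUBLE, 0, MPI_ANY_TAG, /* recv from rank 0 */
                 MPI_COMM_WORLD, &stat);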
Bcast
• MPI_Bcast is a collective call: every node in the communicator must make it, receivers included, so the following code will not work
if (rank == 0) {
    /* WRONG: only the root calls MPI_Bcast; the other nodes never join */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
}
else {
    /* Do something else */
}
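The correct pattern has every node, root and receivers alike, make the same call:

/* executed by all nodes; the root's value is copied into everyone's buffer */
MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);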
Questions