Basics of Message-passing

• Mechanics of message-passing
  – A means of creating separate processes on different computers
  – A way to send and receive messages
• Single program multiple data (SPMD) model
  – Logic for multiple processes merged into one program
  – Control statements separate the blocks of logic meant for each processor
  – A compiled program is stored on each processor
  – All executables are started together statically
  – Example: MPI (Message Passing Interface)
• Multiple program multiple data (MPMD) model
  – Each processor has a separate master program
  – Master program spawns child processes dynamically
  – Example: PVM (Parallel Virtual Machine)
PVM (Parallel Virtual Machine)
• Multiple process control
  – Host process: controls the environment
  – Any process can spawn others
  – Daemon: controls message passing
• PVM system calls
  – Control: pvm_mytid(), pvm_spawn(), pvm_parent(), pvm_exit()
  – Get send buffer: pvm_initsend()
  – Pack for sending: pvm_pkint(), pvm_pkfloat(), pvm_pkstr()
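A minimal sketch of the master/worker pattern with these calls (the executable name "worker", the message tag 1, and the integer payload are illustrative, not from the slides):

    #include <stdio.h>
    #include <pvm3.h>

    int main(void)
    {
        int mytid  = pvm_mytid();             /* enroll in PVM, get our task id */
        int parent = pvm_parent();            /* PvmNoParent if we are the master */

        if (parent == PvmNoParent) {
            int child, n = 42;
            /* master: spawn one worker task, then pack and send an int */
            pvm_spawn("worker", NULL, PvmTaskDefault, "", 1, &child);
            pvm_initsend(PvmDataDefault);     /* get a fresh send buffer */
            pvm_pkint(&n, 1, 1);              /* pack one int, stride 1 */
            pvm_send(child, 1);               /* send with message tag 1 */
        } else {
            int n;
            pvm_recv(parent, 1);              /* blocking receive, tag 1 */
            pvm_upkint(&n, 1, 1);             /* unpack the int */
            printf("task %x received %d\n", mytid, n);
        }
        pvm_exit();                           /* leave the virtual machine */
        return 0;
    }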
Basic MPI Calls

• MPI_Init: Bring up the program on all computers, pass command-line arguments, establish ranks
• MPI_Comm_rank: Determine the rank of the current process
• MPI_Comm_size: Return the number of processes that are running
• MPI_Finalize: Terminate the program normally
• MPI_Abort: Terminate with an error code when something bad happens
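The standard minimal program using these calls:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);                /* start MPI on all computers */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank: 0..p-1 */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* number of processes p */
        printf("Process %d of %d\n", rank, size);
        MPI_Finalize();                        /* normal termination */
        return 0;
    }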
Standard Send (MPI_Send)
int MPI_Send(void *buf, int count, MPI_Datatype type, int dest, int tag, MPI_Comm comm)
• Input parameters
  – buf: initial address of send buffer (choice)
  – count: number of elements in send buffer (integer)
  – type: type of each send buffer element (e.g., MPI_CHAR, MPI_INT, MPI_DOUBLE, MPI_BYTE, MPI_PACKED)
  – dest: rank of destination (integer)
  – tag: message tag (integer)
  – comm: communicator (handle)
• Note: MPI_PACKED allows different data types to be sent in a single buffer using the MPI_Pack and MPI_Unpack functions.
• Note: Google MPI_Send, MPI_Recv, etc. for more information.
• MPI_Send blocks until the message is received or the data is copied to a buffer.
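A sketch of the MPI_PACKED note: rank 0 packs an int and a double into one buffer and sends it to rank 1 (the 100-byte buffer size and tag 0 are arbitrary illustrative choices):

    char buf[100];
    int position = 0, n = 5;
    double x = 3.14;

    if (rank == 0) {
        /* pack fields back to back; position advances automatically */
        MPI_Pack(&n, 1, MPI_INT, buf, sizeof(buf), &position, MPI_COMM_WORLD);
        MPI_Pack(&x, 1, MPI_DOUBLE, buf, sizeof(buf), &position, MPI_COMM_WORLD);
        MPI_Send(buf, position, MPI_PACKED, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Status status;
        MPI_Recv(buf, sizeof(buf), MPI_PACKED, 0, 0, MPI_COMM_WORLD, &status);
        position = 0;                          /* unpack from the start */
        MPI_Unpack(buf, sizeof(buf), &position, &n, 1, MPI_INT, MPI_COMM_WORLD);
        MPI_Unpack(buf, sizeof(buf), &position, &x, 1, MPI_DOUBLE, MPI_COMM_WORLD);
    }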
Matching Message Tags

• Differentiates between types of messages
• The message tag is carried within the message
• Wildcard codes allow receipt of any message from any source
  – MPI_ANY_TAG: matches any message type
  – MPI_ANY_SOURCE: matches messages from any sender
  – Sends cannot use wildcards (receiving is a pull operation, not a push)
[Figure: Process 1 executes send(&x, 2, 5) to send message type 5 from buffer x to process 2; process 2 executes recv(&y, 1, 5), which waits for a message from process 1 with a tag of 5 and places the data in buffer y.]
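The same exchange written with actual MPI calls (assuming x and y are ints, and using the ranks and tag from the figure):

    int x, y;
    if (rank == 1)
        MPI_Send(&x, 1, MPI_INT, 2, 5, MPI_COMM_WORLD);   /* to rank 2, tag 5 */
    else if (rank == 2)
        MPI_Recv(&y, 1, MPI_INT, 1, 5, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);                      /* from rank 1, tag 5 only */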
Status of Sends and Receives

    MPI_Status status;
    MPI_Recv(&result, 1, MPI_DOUBLE, MPI_ANY_SOURCE,
             MPI_ANY_TAG, MPI_COMM_WORLD, &status);

• status.MPI_SOURCE /* rank of sender */
• status.MPI_TAG /* type of message */
• status.MPI_ERROR /* error code */
• MPI_Get_count(&status, recv_type, &count) /* number of elements received */
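A common use of the status object is receiving a message of unknown length: probe first, then allocate (a sketch; MPI_Probe fills in status without consuming the message, and malloc() requires <stdlib.h>):

    MPI_Status status;
    int count;
    double *result;

    MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
    MPI_Get_count(&status, MPI_DOUBLE, &count);   /* elements in the pending message */
    result = malloc(count * sizeof(double));
    MPI_Recv(result, count, MPI_DOUBLE, status.MPI_SOURCE,
             status.MPI_TAG, MPI_COMM_WORLD, &status);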
Console Input and Output

• Input
  – Console input must be initiated at the host process:

        if (rank == 0) {
            printf("Enter some fraction: ");
            fflush(stdout);              /* flush the prompt to the console */
            scanf("%lf", &value);
        }

  – Use fgets() to read a string (gets() is unsafe and has been removed from C)
• Output
  – Any process can initiate output
  – MPI uses internal library functions to route the output to the process that initiated the program
  – Output initiated by library functions before normal application output can arrive after it, or vice versa; ordering between the two is not guaranteed
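A sketch combining the input rule with a broadcast so every process receives the value read by the host (MPI_Bcast is covered below under collective communication):

    double value;
    if (rank == 0) {
        printf("Enter some fraction: ");
        fflush(stdout);
        scanf("%lf", &value);
    }
    /* root 0 sends value to every process in the communicator */
    MPI_Bcast(&value, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);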
Groups and Communicators
• Group: A set of processes ordered by relative rank
• Communicator: The context required for sends and receives
• Purpose: Enable collective communication (including to subgroups of processors)
• The default communicator is MPI_COMM_WORLD
  – A unique rank corresponds to each executing process
  – The rank is an integer from 0 to p − 1, where p is the number of executing processes
• Applications can create subset communicators
  – Each processor has a unique rank in each sub-communicator
  – The rank is an integer from 0 to g − 1, where g is the number of processes in the group
MPI Group Communicator Functions
Typical Usage
1. Extract group from communicator: MPI_Comm_group
2. Form new group: MPI_Group_incl or MPI_Group_excl
3. Create new group communicator: MPI_Comm_create
4. Determine group rank: MPI_Comm_rank
5. Communications: MPI message passing functions
6. Destroy created communicators and groups: MPI_Comm_free and MPI_Group_free
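A sketch of these six steps, building a communicator that excludes rank 0 from MPI_COMM_WORLD (the excluded rank is an arbitrary choice):

    MPI_Group world_group, sub_group;
    MPI_Comm  sub_comm;
    int ranks[1] = {0};                                     /* ranks to leave out */
    int sub_rank;

    MPI_Comm_group(MPI_COMM_WORLD, &world_group);           /* 1. extract group */
    MPI_Group_excl(world_group, 1, ranks, &sub_group);      /* 2. form new group */
    MPI_Comm_create(MPI_COMM_WORLD, sub_group, &sub_comm);  /* 3. create communicator;
                                                               all processes must call this,
                                                               excluded ones get MPI_COMM_NULL */
    if (sub_comm != MPI_COMM_NULL) {
        MPI_Comm_rank(sub_comm, &sub_rank);                 /* 4. rank in subgroup: 0..g-1 */
        /* 5. ... message passing on sub_comm ... */
        MPI_Comm_free(&sub_comm);                           /* 6. destroy communicator */
    }
    MPI_Group_free(&sub_group);                             /* 6. destroy groups */
    MPI_Group_free(&world_group);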
Details
• MPI_Group_excl: Create a new group without certain processes from an existing group

    int MPI_Group_excl(MPI_Group group, int n, int *ranks, MPI_Group *newgroup);

• MPI_Group_incl: Create a new group with selected processes from an existing group

    int MPI_Group_incl(MPI_Group group, int n, int *ranks, MPI_Group *newgroup);
Synchronous and Asynchronous Message Passing

• Synchronous
  – Send completes when data is safely received
  – Receive completes when data is available
  – No copying to/from internal buffers
• Asynchronous
  – Copy to internal message buffer
  – Send completes when transmission begins
  – Local buffers are free for application use
  – Receive polls to determine if data is available
[Figure: Generic syntax (actual formats later). Process 1 executes send(&x, 2); process 2 executes recv(&y, 1); the data moves from buffer x in process 1 to buffer y in process 2.]
Synchronized Sends and Receives

[Figure (a): send() occurs before recv(). Process 1 issues a request to send and suspends; when process 2 reaches its recv(), it returns an acknowledgment, the message is transferred, and both processes continue.]

[Figure (b): recv() occurs before send(). Process 2 suspends in recv(); when process 1 reaches its send(), the request to send is acknowledged, the message is transferred, and both processes continue.]
Point-to-Point MPI Calls

• Buffered send (receiver gets to it when it can)
  – Completes after data is copied to a user-supplied buffer
  – Becomes synchronous if no buffers are available
• Ready send (guarantees transmission is successful)
  – A matching receive call must precede the send
  – Completion occurs when the remote processor receives the data
• Standard send (starts transmission if possible)
  – If a receive call is posted, completes when transmission starts
  – If no receive call is posted, completes when the data is buffered by MPI, but becomes synchronous if no buffers are available
• Blocking: Return occurs when the call completes
• Non-blocking: Return occurs immediately
  – The application must periodically poll or wait for completion
  – Why non-blocking? To allow more parallel processing
Buffered Send Example

Applications supply a data buffer area using MPI_Buffer_attach() to hold the data during transmission.

[Figure: Process 1 executes send(), which copies the message into the attached message buffer and continues processing; process 2 later executes recv() and reads the message from the buffer.]

• Note: transmission is between sender/receiver MPI buffers
• Note: copying in and out of buffers can be expensive
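A sketch of a buffered send (destination rank 1, tag 0, and the double payload are illustrative; the attached buffer must include MPI_BSEND_OVERHEAD per pending message):

    int size = sizeof(double) + MPI_BSEND_OVERHEAD;
    char *buffer = malloc(size);                 /* needs <stdlib.h> */
    double x = 1.0;

    MPI_Buffer_attach(buffer, size);             /* hand the buffer to MPI */
    MPI_Bsend(&x, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD); /* completes once copied */
    MPI_Buffer_detach(&buffer, &size);           /* blocks until buffered data is sent */
    free(buffer);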
Non-blocking Send Example

    MPI_Request io;
    MPI_Status stat;
    if (myrank == 0) {
        MPI_Isend(&x, 1, MPI_INT, 1, 99, MPI_COMM_WORLD, &io); /* returns immediately */
        /* ... computation overlapped with the transmission ... */
        MPI_Wait(&io, &stat);                                  /* completes the send */
    } else if (myrank == 1) {
        MPI_Recv(&x, 1, MPI_INT, 0, 99, MPI_COMM_WORLD, &stat);
    }
• MPI_Isend() and MPI_Irecv() return immediately
• MPI_Rsend() returns when the message is received by the remote computer
• MPI_Bsend() is the buffered send; MPI_Send() is the standard send
• MPI_Wait() returns after the transmission completes; MPI_Test() returns non-zero after the transmission and zero otherwise
Message Passing Order
Note: Messages originating from a processor will always be received in order. Messages from different processors can be received out of order.
[Figure (a): messages received out of order. Both processes call a library routine lib() that does its own message passing; an application send(…,1,…) in process 0 is matched by a receive inside the library rather than the intended application recv(…,0,…).]

[Figure (b): messages received in order. The library and application sends and receives each match their intended destination and source.]
Collective Communication
• Broadcast (MPI_Bcast()): Broadcast or multicast data to processors in a group
• Scatter (MPI_Scatter()): Send parts of an array to separate processes
• Gather (MPI_Gather()): Collect array elements from separate processes
• AlltoAll (MPI_Alltoall()): A combination of gather and scatter; all processes send, then sections of the combined data are gathered
• MPI_Reduce(): Combine values from all processes to a single value using some operation (function call)
• MPI_Reduce_scatter(): First reduce and then scatter the result
• MPI_Scan(): Reduce values received from processors of lower rank in the group (a prefix reduction)
• MPI_Barrier(): Pause until all processors reach the barrier call
MPI operations on groups of processes
Advantages
• MPI can use the processor hierarchy to improve efficiency
• Although collective communication can be implemented using standard send and receive calls, the collective operations require less programming and debugging
Reduce, Broadcast, Allreduce

[Figure: two Allreduce strategies side by side: reduce followed by broadcast, and a butterfly Allreduce.]
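From the caller's point of view, the two strategies look like this sketch: an explicit reduce-then-broadcast versus the single MPI_Allreduce call (the local value is illustrative):

    double local = rank + 1.0, total;        /* rank from MPI_Comm_rank */

    /* strategy 1: reduce to root 0, then broadcast the result */
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Bcast(&total, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* strategy 2: one call; implementations may use a butterfly pattern */
    MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);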
Predefined Collective Operations
• MPI_MAX, MPI_MIN: maximum, minimum
• MPI_MAXLOC, MPI_MINLOC: maximum, minimum, plus its location
  – If the output buffer is out, then for each index i, out[i].val and out[i].rank contain the max (or min) value and the rank of the processor holding it
• MPI_SUM, MPI_PROD: sum, product
• MPI_LAND, MPI_LOR, MPI_LXOR: logical &, |, ^
• MPI_BAND, MPI_BOR, MPI_BXOR: bitwise &, |, ^
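A sketch of MPI_MAXLOC using the predefined pair type MPI_DOUBLE_INT (local_value is an assumed local variable):

    struct { double val; int rank; } in, out;
    in.val  = local_value;                   /* this process's value */
    in.rank = rank;                          /* and its rank */
    MPI_Reduce(&in, &out, 1, MPI_DOUBLE_INT, MPI_MAXLOC, 0, MPI_COMM_WORLD);
    /* on rank 0: out.val is the global max, out.rank the process holding it */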
Derived MPI Data Types

/* Goal: send items, each containing a double, an integer, and a string */
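A sketch toward this goal using MPI_Type_create_struct, assuming a fixed-length string field (STR_LEN is an illustrative choice):

    #define STR_LEN 32
    typedef struct { double d; int i; char s[STR_LEN]; } Item;

    MPI_Datatype item_type;
    int          lengths[3] = {1, 1, STR_LEN};
    MPI_Aint     displs[3], base;
    MPI_Datatype types[3]   = {MPI_DOUBLE, MPI_INT, MPI_CHAR};
    Item         item;

    /* compute each field's displacement from the start of the struct */
    MPI_Get_address(&item,   &base);
    MPI_Get_address(&item.d, &displs[0]);
    MPI_Get_address(&item.i, &displs[1]);
    MPI_Get_address(&item.s, &displs[2]);
    for (int k = 0; k < 3; k++) displs[k] -= base;

    MPI_Type_create_struct(3, lengths, displs, types, &item_type);
    MPI_Type_commit(&item_type);
    /* an Item (or an array of them) can now be sent with datatype item_type */
    MPI_Type_free(&item_type);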