MPI Communicators

Group and Communicator Management Routines
Research Computing
UNC - Chapel Hill
Instructor: Mark Reed
Email: markreed@unc.edu
Groups
• A group is an ordered set of processes.
• Each process in a group is associated with a unique integer rank.
• Rank values start at 0 and go to N-1, where N is the number of processes in the group.
• In MPI, a group is represented within system memory as an object
  • accessible to the programmer only by a "handle".
• A group is always associated with a communicator object.
its.unc.edu
2
Communicators
• A communicator encompasses a group of processes that may communicate with each other.
• All messages must specify a communicator.
• Like groups, communicators are represented within system memory as objects, accessible to the programmer only by "handles".
• E.g., the handle of the communicator comprising all tasks is MPI_COMM_WORLD.
Groups - Communicators
• Communicators specify a communication domain, i.e. communicators provide a self-contained communication "world" in which to exchange messages
  • typically they bind process groups and contexts together to form a safe communication space within the group
• Intracommunicators are used for communication within a group.
• Intercommunicators are used for communication between disjoint groups.
Groups - Communicators
• From the programmer's perspective, a group and a communicator often appear the same.
• The group routines are primarily used to specify which processes should be used to construct a communicator.
Group and Communicator Objects
• Primary purposes:
  • Allow you to organize tasks, based upon function, into task groups.
  • Enable collective communication operations across a subset of related tasks.
  • Provide a basis for implementing user-defined virtual topologies.
  • Provide for safe communications.
Communicators
• Groups/communicators are dynamic - they can be created and destroyed during program execution.
• Processes may be in more than one group/communicator.
  • They will have a unique rank within each group/communicator.
• MPI provides over 40 routines related to groups, communicators, and virtual topologies.
Typical usage:
• Extract the handle of the global group from MPI_COMM_WORLD using MPI_Comm_group.
• Form a new group as a subset of the global group using MPI_Group_incl or one of the many group constructors.
• Create a new communicator for the new group using MPI_Comm_create.
• Determine the new rank in the new communicator using MPI_Comm_rank.
Typical usage cont.:
• Conduct communications using any MPI message passing routine.
• When finished, free up the new communicator and group (optional) using MPI_Comm_free and MPI_Group_free.
Group Accessors
MPI_Group_rank
• int MPI_Group_rank(MPI_Group group, int *rank)
• Returns the rank of this process in the given group, or MPI_UNDEFINED if the process is not a member.
MPI_Group_size
• int MPI_Group_size(MPI_Group group, int *size)
• Returns the size of a group - the number of processes in the group.
MPI_Group_compare
• int MPI_Group_compare(MPI_Group group1, MPI_Group group2, int *result)
  • result - returned result of comparison
• Compares two groups and returns an integer result: MPI_IDENT if the order and members of the two groups are the same, MPI_SIMILAR if only the members are the same, and MPI_UNEQUAL otherwise.
Group Constructors
MPI_Comm_group
• int MPI_Comm_group(MPI_Comm comm, MPI_Group *group)
  • group - returned handle of the group associated with comm
• Determines the group associated with the given communicator.
MPI_Group_excl
• int MPI_Group_excl(MPI_Group group, int n, int *ranks, MPI_Group *newgroup)
• Produces a group by reordering an existing group and taking only unlisted members.
  • n - the size of the ranks array
  • ranks - array listing the ranks to exclude from the new group; each should be valid and distinct
  • newgroup - new group derived from above, preserving the order defined by group (handle)
• See also MPI_Group_range_excl.
MPI_Group_incl
• int MPI_Group_incl(MPI_Group group, int n, int *ranks, MPI_Group *newgroup)
• Produces a group by reordering an existing group and taking only listed members.
  • n - the size of the ranks array
  • ranks - array listing the ranks to include in the new group; each should be valid and distinct
  • newgroup - new group derived from above, in the order defined by ranks (handle)
• See also MPI_Group_range_incl.
MPI_Group_intersection
• int MPI_Group_intersection(MPI_Group group1, MPI_Group group2, MPI_Group *newgroup)
• Produces a group as the intersection of two existing groups.
  • group1 - handle of first group
  • group2 - handle of second group
  • newgroup - handle of intersection group
MPI_Group_union
• int MPI_Group_union(MPI_Group group1, MPI_Group group2, MPI_Group *newgroup)
• Produces a group by combining two groups.
  • group1 - handle of first group
  • group2 - handle of second group
  • newgroup - handle of union group
MPI_Group_difference
• int MPI_Group_difference(MPI_Group group1, MPI_Group group2, MPI_Group *newgroup)
• Creates a group from the difference of two groups.
  • group1 - handle of first group
  • group2 - handle of second group
  • newgroup - handle of difference group
Set-like operations:
• union
  • all elements of the first group, followed by all elements of the second group not in the first.
• intersection
  • all elements of the first group that are also in the second group, ordered as in the first group.
• difference
  • all elements of the first group that are not in the second group, ordered as in the first group.
Set-like operations cont.:
• Note that for these operations the order of processes in the output group is determined primarily by order in the first group (if possible) and then, if necessary, by order in the second group.
• Neither union nor intersection is commutative, but both are associative.
• The new group can be empty, that is, equal to MPI_GROUP_EMPTY.
Group Destructors
MPI_Group_free
• int MPI_Group_free(MPI_Group *group)
• Frees a group.
• This operation marks a group object for deallocation.
• The handle group is set to MPI_GROUP_NULL by the call.
• Any ongoing operation using this group will complete normally.
Manipulating Communicators
Accessors, Constructors, Destructors
MPI_Comm_compare
• int MPI_Comm_compare(MPI_Comm comm1, MPI_Comm comm2, int *result)
• Compares two communicators and returns an integer result:
  • MPI_IDENT - contexts and groups are the same
  • MPI_CONGRUENT - different contexts but identical groups
  • MPI_SIMILAR - different contexts but similar groups
  • MPI_UNEQUAL - otherwise
MPI_Comm_create
• int MPI_Comm_create(MPI_Comm comm, MPI_Group group, MPI_Comm *newcomm)
• Creates a new communicator from the old communicator and the new group.
  • comm - communicator associated with the old group
  • group - new group to create a communicator for
  • newcomm - returned new communicator (handle)
• Note: the call is executed by all processes in comm (even if they're not in the new group).
• Returns MPI_COMM_NULL to non-members.
MPI_Comm_split
• int MPI_Comm_split(MPI_Comm comm, int color, int key, MPI_Comm *newcomm)
• Partitions the group into disjoint subgroups.
• Arguments include 2 control arguments:
  • color - nonnegative integer; selects the process subset
  • key - ranks are assigned in order of integer key value; the tiebreaker is the original rank
• A new group is created for each distinct "color".
• Use MPI_UNDEFINED as the color argument to be excluded from all groups.
MPI_Comm_dup
• int MPI_Comm_dup(MPI_Comm comm, MPI_Comm *newcomm)
• Duplicates an existing communicator with all its associated information.
• This is useful for building safe parallel libraries.
MPI_Comm_free
• int MPI_Comm_free(MPI_Comm *comm)
• Marks the communicator object for deallocation. The handle is set to MPI_COMM_NULL.
• Any pending operations that use this communicator will complete normally; the object is actually deallocated only if there are no other active references to it.
Group and Communicator Routines Example

/* NOTE: This does not work on all systems - buggy! */
/* Create two different process groups for separate collective
   communications exchange. Requires creating new communicators */
#include "mpi.h"
#include <stdio.h>

#define NPROCS 8

int main(int argc, char *argv[])
{
    int rank, new_rank, sendbuf, recvbuf,
        ranks1[4] = {0,1,2,3}, ranks2[4] = {4,5,6,7};
    MPI_Group orig_group, new_group;
    MPI_Comm new_comm;
Group and Communicator Routines Example, cont.
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    sendbuf = rank;

    /* Extract the original group handle */
    MPI_Comm_group(MPI_COMM_WORLD, &orig_group);

    /* Divide tasks into two groups based upon rank */
    /* Note: new_group has a different value on each PE */
    if (rank < NPROCS/2) {
        MPI_Group_incl(orig_group, NPROCS/2, ranks1, &new_group);
    } else {
        MPI_Group_incl(orig_group, NPROCS/2, ranks2, &new_group);
    }
Group and Communicator Routines Example, cont.
    /* Create the new communicator and then perform collective
       communications */
    MPI_Comm_create(MPI_COMM_WORLD, new_group, &new_comm);
    MPI_Allreduce(&sendbuf, &recvbuf, 1, MPI_INT, MPI_SUM, new_comm);
    MPI_Group_rank(new_group, &new_rank);
    printf("rank= %d newrank= %d recvbuf= %d\n",
           rank, new_rank, recvbuf);
    MPI_Finalize();
}
Sample program output:
rank= 7 newrank= 3 recvbuf= 22
rank= 0 newrank= 0 recvbuf= 6
rank= 1 newrank= 1 recvbuf= 6
rank= 2 newrank= 2 recvbuf= 6
rank= 6 newrank= 2 recvbuf= 22
rank= 3 newrank= 3 recvbuf= 6
rank= 4 newrank= 0 recvbuf= 22
rank= 5 newrank= 1 recvbuf= 22
Previous Example Done with MPI_Comm_split

/* this fixes the buggy Maui code by using MPI_Comm_split */
#include "mpi.h"
#include <stdio.h>

#define NPROCS 8
#define MASTER 0
#define MSGSIZE 7

int main(int argc, char *argv[])
{
    int rank, new_rank, sendbuf, recvbuf, color;
    char msg[MSGSIZE+1] = " ";
    MPI_Comm new_comm;
Split example, cont.

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    sendbuf = rank;

    /* Divide tasks into two distinct groups. First */
    /* create new group and then a new communicator.*/
    /* Find new rank in new group and set up for the*/
    /* collective communication broadcast if MASTER.*/
    /* Use integer division to split the group into */
    /* 2 "colors", 0 and 1.                         */
    color = (2*rank)/NPROCS;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &new_comm);
    MPI_Comm_rank(new_comm, &new_rank);
Split Concluded

    if (new_rank == MASTER) sprintf(msg, "Group %d", color+1);
    MPI_Bcast(msg, MSGSIZE, MPI_CHAR, MASTER, new_comm);
    MPI_Allreduce(&sendbuf, &recvbuf, 1, MPI_INT, MPI_SUM, new_comm);
    printf("rank= %d newrank= %d msg= %s sum=%d\n",
           rank, new_rank, msg, recvbuf);
    MPI_Finalize();
}