Lecture 2
Message-Passing Computing
张少强
http://bioinfo.uncc.edu/szhang
sqzhang@163.com
2.1
Software Tools for Cluster Computing
Late 1980s to mid-1990s:
Parallel Virtual Machine (PVM)
Message-Passing Interface (MPI) – a unified standard
1. Both are based on the message-passing parallel programming model.
2. Both provide sets of library routines (user-level libraries) for message passing.
3. Both use sequential programming languages (C, C++, ...).
4. Both are freely available.
2.2
MPI
(Message Passing Interface)
• A standard unified for widespread use and portability.
• MPI defines a standard, not a particular implementation.
(version 2: MPI-2)
• Several implementations of the MPI standard exist: MPICH,
LAM/MPI, and vendor implementations from HP, IBM (MPL), Sun, etc.
2.3
Message Passing Using Library Routines
2.4
Message routing between computers is handled by daemon processes
installed on each computer. Each computer may run more than one
process.

[Figure: several workstations connected by a network; each runs a
daemon process and one or more application programs (executables),
and messages pass between the workstations over the network.]
2.5
Message-Passing Programming Using MPI Libraries
Two methods are needed:
1. A method of creating separate processes for execution on
different computers
2. A method of sending and receiving messages
2.6
1. Creating Processes on Different Computers
2.7
Multiple Program, Multiple Data (MPMD) model
• Each processor executes a different program.

[Figure: separate source files, each compiled to suit its
processor, yield a different executable on each of Processor 0
through Processor p - 1.]
2.8
Single Program, Multiple Data (SPMD) model
• The same program is executed by each processor.
• Control statements select different parts for each processor
to execute.
This is the basic MPI approach.

[Figure: one source file is compiled to suit each processor,
producing the executables run on Processor 0 through
Processor p - 1.]
2.9
MPI Initialization
#include "mpi.h"
#include <stdio.h>
#include <math.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);  /* initialize MPI; &argc is the argument
                                count and &argv the argument vector */
    ...
    MPI_Finalize();          /* terminate MPI */
    return 0;
}
2.10
After MPI is initialized, all processes belong to the communication
set of a communicator, and each process is given a rank numbered
0, 1, ..., p - 1. (Rank can be read as the process number.)
The program uses control statements, IF statements in particular,
to direct each process to perform specific actions.
Example
if (rank == 0) {...}        /* process 0 does this */
else if (rank == 1) {...}   /* process 1 does this */
...
2.11
Master-Slave Approach
The computation is usually structured on a master-slave model:
one process (the master) performs one set of actions and all the
other processes (the slaves) perform identical actions, although
on different data.
i.e.
if (rank == 0) {...}   /* master does this */
else {...}             /* all slaves do this */
2.12
Static process creation
• All processes must be specified (identified) before the program
executes.
• A fixed number of processes is executed.
2.13
MPMD Model with Dynamic Process Creation
• One processor executes the master process.
• Other processes are started from within the master process, so
the number of processes need not be defined beforehand.

[Figure: Process 1 calls spawn() to start Process 2, which then
begins executing; time runs downward.]

In MPI this is done with MPI_Comm_spawn(), as sketched below.
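A minimal sketch of dynamic creation with the MPI-2 routine
MPI_Comm_spawn(); the worker executable name "worker" and the
count of 4 are illustrative assumptions, not part of these slides.

#include "mpi.h"

int main(int argc, char *argv[])
{
    MPI_Comm intercomm;  /* intercommunicator to the spawned processes */
    MPI_Init(&argc, &argv);
    /* start 4 copies of the (hypothetical) "worker" executable */
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);
    /* ... communicate with the workers through intercomm ... */
    MPI_Finalize();
    return 0;
}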
2.14
2. Sending and Receiving Messages
2.15
Basic "Point-to-Point"
Send and Receive Routines
Message passing between processes uses the send() and recv()
library routines: data x in source process 1 is sent to y in
destination process 2.

[Figure: Process 1 executes send(&x, 2); Process 2 executes
recv(&y, 1); the data moves from x to y.]

Generic (pseudocode) syntax: send(), recv(). In MPI these take
the forms MPI_?send() and MPI_?recv().
2.16
MPI message passing uses MPI_Send() and MPI_Recv().
A buffer holds the data in transit; x and y must have the same
data type.
2.17
MPI Send() and Recv()
• Blocking message passing: the call returns once its local
operations are complete.
MPI_Send() – a blocking send returns after sending the message;
the message may not yet have reached its destination, but the
process is free to continue.
MPI_Recv() – returns once the message has been received and its
data stored; the process may be held up waiting.
• Nonblocking
MPI_Isend(): execution proceeds to the next statement whether or
not the routine has locally completed.
MPI_Irecv(): returns immediately, even if no message has arrived.
I: immediate. A sketch of the nonblocking pattern follows.
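A minimal sketch: the transfer is started, useful work can overlap
it, and MPI_Wait() blocks until the operation completes (variable
names here are illustrative).

int x = 42, y;
MPI_Request req;
MPI_Status status;

if (rank == 0) {
    MPI_Isend(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
    /* ... other work while the message is in flight ... */
    MPI_Wait(&req, &status);   /* x may be safely reused after this */
} else if (rank == 1) {
    MPI_Irecv(&y, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
    /* ... other work that does not touch y ... */
    MPI_Wait(&req, &status);   /* y is valid only after the wait */
}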
2.18
Message Tag
• Used to differentiate between different types of messages being
sent.
• The message tag is carried with the message.
• If special type matching is not required, a wildcard tag can be
used; recv() will then match any send(). A sketch follows.
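In MPI the wildcard tag is MPI_ANY_TAG (MPI_ANY_SOURCE plays the
same role for the source rank); a minimal sketch in which the
actual tag is recovered from the status object afterwards:

int y;
MPI_Status status;

/* accept a message from process 1 carrying any tag */
MPI_Recv(&y, 1, MPI_INT, 1, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
printf("got tag %d from rank %d\n", status.MPI_TAG, status.MPI_SOURCE);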
2.19
Message Tag Example
To send a message, x, with message tag 5 from a source process,
1, to a destination process, 2, and assign to y:

[Figure: Process 1 executes send(&x, 2, 5); Process 2 executes
recv(&y, 1, 5), which waits for a message from process 1 with a
tag of 5; the data moves from x to y.]
2.20
Unsafe message passing – example

[Figure (a), intended behavior: Process 0 executes send(...,1,...),
matched by recv(...,0,...) in Process 1; both processes then call a
library routine lib(), whose internal send(...,1,...) in Process 0
is matched by the internal recv(...,0,...) in Process 1.]

[Figure (b), possible behavior: the recv(...,0,...) inside lib() in
Process 1 instead matches the user's send(...,1,...) from Process 0,
so the user's and the library's messages are crossed.]
2.21
MPI Solution: "Communicators"
• A communicator defines a communication domain – a set of
processes that can communicate only within the group.
• The communication domains of libraries can be separated from
that of the user program, as sketched below.
• Used in all point-to-point and collective MPI message-passing
communications.
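A common way for a library to separate its traffic is to duplicate
a communicator with MPI_Comm_dup(); a minimal sketch:

MPI_Comm lib_comm;

/* Same group of processes, but a distinct communication domain:
   messages on lib_comm can never match sends or receives posted
   on MPI_COMM_WORLD, avoiding the unsafe scenario above. */
MPI_Comm_dup(MPI_COMM_WORLD, &lib_comm);
/* ... the library communicates internally on lib_comm ... */
MPI_Comm_free(&lib_comm);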
2.22
Default Communicator: MPI_COMM_WORLD
• Exists as the first communicator for all the processes in an
application.
• Processes have a "rank" in a communicator.
2.23
Using the SPMD Computational Model
main (int argc, char *argv[]) {
    int myrank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);  /* find rank */
    if (myrank == 0)
        master();
    else
        slave();
    MPI_Finalize();
}
where master() and slave() are to be executed by the master
process and the slave processes, respectively.
2.24
Parameters of blocking send
MPI_Send(buf, count, datatype, dest, tag, comm)
    buf      – address of the send buffer
    count    – number of items to send
    datatype – datatype of each item (e.g., MPI_INT, MPI_FLOAT,
               MPI_CHAR)
    dest     – rank of the destination process
    tag      – message tag
    comm     – communicator
2.25
Parameters of blocking receive
MPI_Recv(buf, count, datatype, src, tag, comm, status)
    buf      – address of the receive buffer
    count    – maximum number of items to receive
    datatype – datatype of each item
    src      – rank of the source process
    tag      – message tag
    comm     – communicator
    status   – status after the operation (holds, e.g., the source
               and tag of the received message)
2.26
Example
To send an integer x from process 0 to process 1:

int x, myrank, msgtag = 0;
MPI_Status status;
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);  /* find rank */
if (myrank == 0) {
    MPI_Send(&x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD);
} else if (myrank == 1) {
    MPI_Recv(&x, 1, MPI_INT, 0, msgtag, MPI_COMM_WORLD, &status);
}
2.27
Sample MPI "Hello World" program
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char **argv) {
    char message[20];
    int i, rank, size, type = 99;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        strcpy(message, "Hello, world");
        for (i = 1; i < size; i++)
            MPI_Send(message, 13, MPI_CHAR, i, type, MPI_COMM_WORLD);
    } else
        MPI_Recv(message, 20, MPI_CHAR, 0, type, MPI_COMM_WORLD, &status);
    printf("Message from process =%d : %.13s\n", rank, message);
    MPI_Finalize();
    return 0;
}
2.28
The program sends the message "Hello, world" from the master
process (rank = 0) to each of the other processes (rank != 0).
Then, all processes execute a printf statement.
In MPI, standard output is automatically redirected from the
remote computers to the user's console, so the final result will be
Message from process =1 : Hello, world
Message from process =0 : Hello, world
Message from process =2 : Hello, world
Message from process =3 : Hello, world
...
except that the order of messages might be different but is
unlikely to be in ascending order of process ID; it will depend
upon how the processes are scheduled.
2.29
Setting Up the Message-Passing Environment
Usually the computers are specified in a file, called a hostfile
or machines file.
The file contains the names of the computers and possibly the
number of processes that should run on each computer.
An implementation-specific algorithm selects computers from the
list to run the user programs.
2.30
Users may create their own machines file for their
program.
Example
node01 (or IP address or domain name)
node02
node03
node04
node05
node06
If a machines file is not specified, a default machines file is
used, or the program may simply run on a single computer.
2.31
Compiling/Executing MPI Programs
• Minor differences in the command lines required
depending upon MPI implementation.
• For the assignments, we will use MPICH or
MPICH-2.
• Generally, a machines file needs to be present that lists all
the computers to be used. MPI then uses the computers listed.
Otherwise it will simply run on one computer.
2.32
MPICH and MPICH-2
• Both Windows and Linux versions
• Very easy to install on a Windows system.
2.33
MPICH Commands
Basic commands:
• mpicc, a script to compile MPI programs
• mpirun, the original command to execute an MPI program
• mpiexec – the MPI-2 standard command; mpiexec replaces mpirun,
although mpirun still exists.
2.34
Compiling/executing (SPMD) MPI program
For MPICH. At a command line:
To start MPI:
Nothing special.
To compile MPI programs:
for C
mpicc -o prog prog.c
for C++
mpiCC -o prog prog.cpp
To execute an MPI program:
mpiexec -n no_procs prog
or
mpirun -np no_procs prog
(no_procs is a positive integer)
2.35
Executing MPICH program on
multiple computers
Create a file called, say, "machines" containing the list of
machines:
node01
node02
node03
node04
node05
node06
2.36
mpiexec -machinefile machines -n 4 prog
would run prog with four processes.
Each process would execute on one of the machines in the list.
MPI cycles through the list of machines when assigning processes
to machines.
The number of processes to run on a particular machine can also
be specified by adding that number after the machine name, as
sketched below.
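For example, with MPICH's Hydra process manager the per-machine
count is written after a colon. The exact syntax is
implementation-specific, so treat this as an assumption to check
against your MPICH documentation:

node01:2
node02:2
node03:4

mpiexec -machinefile machines -n 8 prog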
2.37
Debugging/Evaluating Parallel
Programs Empirically
2.38
Visualization Tools
Programs can be watched as they execute in a space-time diagram
(or process-time diagram):

[Figure: Processes 1-3 plotted against time; bars mark computing,
gaps mark waiting, shaded segments mark message-passing system
routines, and arrows between the process lines mark messages.]
2.39
Implementations of visualization tools are
available for MPI.
An example is the Upshot program visualization
system.
2.40
Evaluating Programs Empirically
Measuring Execution Time
To measure the execution time between point L1 and point L2 in
the code, one might have a construction such as:

time_t t1, t2;           /* declared in <time.h> */
double elapsed_time;
.
L1: time(&t1);           /* start timer */
.
.
L2: time(&t2);           /* stop timer */
.
elapsed_time = difftime(t2, t1);  /* time = t2 - t1 */
printf("Elapsed time=%5.2f secs\n", elapsed_time);
2.41
MPI provides the routine MPI_Wtime() for returning the wall-clock
time (in seconds):
double start_time, end_time, exe_time;
start_time = MPI_Wtime();
.
.
end_time = MPI_Wtime();
exe_time = end_time - start_time;
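To time a section fairly across processes, a common pattern (a
sketch of a general convention, not something these slides
prescribe) synchronizes with MPI_Barrier() before reading the
clock:

double start_time, end_time;

MPI_Barrier(MPI_COMM_WORLD);   /* all processes start together */
start_time = MPI_Wtime();
/* ... parallel section being measured ... */
MPI_Barrier(MPI_COMM_WORLD);   /* wait for the slowest process */
end_time = MPI_Wtime();
if (rank == 0)                 /* report once, from the master */
    printf("Elapsed time = %f secs\n", end_time - start_time);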