Parallel Programming in MPI (part 1)

January 10, 2012
Parallel Processing: Sharing a Job among Multiple Processors

- Multi-core / multi-CPU PC
  - Works within one PC
  - For small problems
- Cluster of PCs / supercomputer
  - Interconnects multiple computers
  - For middle- to large-scale problems
  - Communication among the computers is required
How to Describe Communications in a Program?

- TCP, UDP?
  - Good: portable, since they are available on many networks.
  - Bad: the protocols for connection setup and data transfer are complicated.
  - Bad: high overhead, since they are designed for wide-area (= unreliable) networks.

Possible, but not well suited to parallel processing.
MPI (Message Passing Interface)

- A set of communication functions designed for parallel processing
- Can be called from C, C++, and Fortran programs
- "Message Passing" = Send + Receive
  - In fact, many functions other than Send and Receive are available.
- Let's look at a sample program first.
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int myid, procs, ierr, i;
    double myval, val;
    MPI_Status status;
    FILE *fp;
    char s[64];

    MPI_Init(&argc, &argv);                   /* set up the MPI environment */
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);     /* get own process ID (= rank) */
    MPI_Comm_size(MPI_COMM_WORLD, &procs);    /* get total number of processes */

    if (myid == 0) {                          /* if my ID is 0 */
        fp = fopen("test.dat", "r");
        fscanf(fp, "%lf", &myval);            /* input data for this process into myval */
        for (i = 1; i < procs; i++) {         /* i = 1 ... procs-1 */
            fscanf(fp, "%lf", &val);          /* input data into val */
            MPI_Send(&val, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD);
                                              /* use MPI_Send to send val to process i */
        }
        fclose(fp);
    } else {                                  /* processes with an ID other than 0 */
        MPI_Recv(&myval, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
                                              /* use MPI_Recv to receive data from process 0 into myval */
    }
    printf("PROCS: %d, MYID: %d, MYVAL: %e\n", procs, myid, myval);  /* print own myval */
    MPI_Finalize();                           /* end of parallel computing */
    return 0;
}
Flow of the Sample Program

Multiple "processes" execute the program, each according to its own number (= rank).

[Figure: rank 0 reads data from the file into myval, then repeatedly reads a value into val and sends it to rank 1 and rank 2 with MPI_Send; ranks 1 and 2 wait for the data to arrive and receive it from rank 0 into their own myval; finally, every rank prints its own myval.]
Sample Result of an Execution

The order of the output can differ from run to run, since each process proceeds independently.

PROCS: 4, MYID: 1, MYVAL: 20.0000000000000000   (rank 1)
PROCS: 4, MYID: 2, MYVAL: 30.0000000000000000   (rank 2)
PROCS: 4, MYID: 0, MYVAL: 10.0000000000000000   (rank 0)
PROCS: 4, MYID: 3, MYVAL: 40.0000000000000000   (rank 3)
Characteristics of the MPI Interface

- MPI programs are ordinary C-language programs
  - Not a new language
- Every process executes the same program
  - Each process does its own part of the work according to its rank (= process number); see the sketch below.
- A process cannot read or write variables of another process directly

[Figure: rank 0 reads the file into myval and into val, and sends val to ranks 1 and 2; ranks 1 and 2 receive the value into their own myval; every rank prints its own myval.]
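As a minimal sketch of this rank-based division of work (not part of the original slides; the problem size N and the loop body are hypothetical), every process runs the same loop but handles only the iterations that map to its own rank:

    #include <stdio.h>
    #include "mpi.h"

    #define N 100            /* hypothetical total amount of work */

    int main(int argc, char *argv[])
    {
        int myid, procs, i;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Comm_size(MPI_COMM_WORLD, &procs);

        /* every rank runs the same loop, but each one handles only
           the iterations whose index maps to its own rank */
        for (i = myid; i < N; i += procs)
            printf("rank %d handles element %d\n", myid, i);

        MPI_Finalize();
        return 0;
    }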
TCP, UDP vs. MPI

- MPI: a simple communication interface dedicated to parallel computing
  - SPMD (Single Program, Multiple Data-stream) model
  - All processes execute the same program
- TCP, UDP: generic communication interfaces intended for many kinds of use, such as Internet servers
  - Server/client model
  - Each process executes its own program
TCP Client (outline)

    /* initialize */
    sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
    memset(&echoServAddr, 0, sizeof(echoServAddr));
    echoServAddr.sin_family = AF_INET;
    echoServAddr.sin_addr.s_addr = inet_addr(servIP);
    echoServAddr.sin_port = htons(echoServPort);
    connect(sock, (struct sockaddr *) &echoServAddr, sizeof(echoServAddr));

    echoStringLen = strlen(echoString);
    send(sock, echoString, echoStringLen, 0);
    totalBytesRcvd = 0;
    printf("Received: ");
    while (totalBytesRcvd < echoStringLen){
        bytesRcvd = recv(sock, echoBuffer, RCVBUFSIZE - 1, 0);
        totalBytesRcvd += bytesRcvd;
        echoBuffer[bytesRcvd] = '\0';
        printf(echoBuffer);
    }
    printf("\n");
    close(sock);

TCP Server (outline)

    /* initialize */
    servSock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
    memset(&echoServAddr, 0, sizeof(echoServAddr));
    echoServAddr.sin_family = AF_INET;
    echoServAddr.sin_addr.s_addr = htonl(INADDR_ANY);
    echoServAddr.sin_port = htons(echoServPort);
    bind(servSock, (struct sockaddr *) &echoServAddr, sizeof(echoServAddr));
    listen(servSock, MAXPENDING);

    for (;;){
        clntLen = sizeof(echoClntAddr);
        clntSock = accept(servSock, (struct sockaddr *)&echoClntAddr, &clntLen);
        recvMsgSize = recv(clntSock, echoBuffer, RCVBUFSIZE, 0);
        while (recvMsgSize > 0){
            send(clntSock, echoBuffer, recvMsgSize, 0);
            recvMsgSize = recv(clntSock, echoBuffer, RCVBUFSIZE, 0);
        }
        close(clntSock);
    }

MPI (the sample program from above)

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int myid, procs, ierr, i;
        double myval, val;
        MPI_Status status;
        FILE *fp;
        char s[64];

        /* initialize */
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Comm_size(MPI_COMM_WORLD, &procs);

        if (myid == 0) {
            fp = fopen("test.dat", "r");
            fscanf(fp, "%lf", &myval);
            for (i = 1; i < procs; i++){
                fscanf(fp, "%lf", &val);
                MPI_Send(&val, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD);
            }
            fclose(fp);
        } else
            MPI_Recv(&myval, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);

        printf("PROCS: %d, MYID: %d, MYVAL: %e\n", procs, myid, myval);
        MPI_Finalize();
        return 0;
    }
Layer of MPI

MPI hides the differences between networks from the application.

[Figure: layer diagram. Applications sit on top of MPI; MPI is implemented on top of lower-level interfaces such as Sockets or XTI, which use TCP or UDP over IP and an Ethernet driver / Ethernet card, or directly on top of a high-speed interconnect (InfiniBand, etc.).]
How to Compile & Execute MPI Programs

- Compile command: mpicc
  Example)  mpicc -O3 test.c -o test.exe
    -O3: optimization option (capital letter O, not zero)
    test.c: source file to compile
    -o test.exe: executable file to create

- Execution command: mpirun
  Example)  mpirun -np 8 ./test.exe
    -np 8: number of processes
    ./test.exe: executable file to execute
Ex 0) Execution of an MPI Program

Log in to psihexa and try the following commands.

$ cp /tmp/test-mpi.c .
$ cp /tmp/test.dat .
$ cat test-mpi.c
$ cat test.dat
$ mpicc test-mpi.c -o test-mpi
$ mpirun -np 8 ./test-mpi

If you have time, try changing the number of processes, or modifying the source program.
MPI Library

- The bodies of the MPI functions are stored in the "MPI library".
- mpicc automatically links the MPI library to the program.

[Figure: mpicc compiles the source program (main() calling MPI_Init, MPI_Comm_rank, MPI_Send, ...) and links it with the MPI library, which contains MPI_Init, MPI_Comm_rank, ..., to produce the executable file.]
Basic Structure of MPI Programs

    #include <stdio.h>
    #include "mpi.h"               /* crucial line: the header file "mpi.h" */

    int main(int argc, char *argv[])
    {
        ...
        MPI_Init(&argc, &argv);    /* crucial line: function for start-up */
        ...
        /* MPI functions can be called in this area, for example: */
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Comm_size(MPI_COMM_WORLD, &procs);
        ...
        MPI_Send(&val, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD);
        ...
        MPI_Recv(&myval, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
        ...
        MPI_Finalize();            /* crucial line: function for finishing */
        return 0;
    }
MPI Functions Today

- MPI_Init: initialization
- MPI_Finalize: finalization
- MPI_Comm_size: get the number of processes
- MPI_Comm_rank: get the rank (= process number) of this process
- MPI_Send & MPI_Recv: message passing
- MPI_Bcast & MPI_Gather: collective communication (= group communication)
MPI_Init

Usage:  int MPI_Init(int *argc, char ***argv);

- Starts parallel execution in MPI
  - Starts the processes and establishes connections among them.
- Must be called once, before any other MPI function.
- Parameters:
  - Pass pointers to both of the arguments of the 'main' function.
  - They are referenced so that, at start-up, each process can share the name of the executable file and the options given to the mpirun command.

Example:
    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int myid, procs, ierr;
        double myval, val;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Comm_size(MPI_COMM_WORLD, &procs);
        ...
MPI_Finalize

Usage:  int MPI_Finalize();

- Finishes parallel execution.
  - MPI functions cannot be called after this function.
- Every process must call this function before exiting the program.

Example:
    main()
    {
        ...
        MPI_Finalize();
    }
MPI_Comm_rank

Usage:  int MPI_Comm_rank(MPI_Comm comm, int *rank);

- Gets the rank (= process number) of this process
  - Returned in the second argument.
- 1st argument = "communicator"
  - An identifier for a group of processes.
  - In most cases, just specify MPI_COMM_WORLD here.
    - MPI_COMM_WORLD: the group that consists of all of the processes in this execution.
    - Processes can also be divided into multiple groups, each given a different job.

Example:
    ...
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    ...
MPI_Comm_size

Usage:  int MPI_Comm_size(MPI_Comm comm, int *size);

- Gets the number of processes
  - Returned in the second argument.

Example:
    ...
    MPI_Comm_size(MPI_COMM_WORLD, &procs);
    ...
Message Passing (Point-to-Point Communication)

- Communication between a "sender" process and a "receiver" process.
- The sending function and the receiving function must be called in a matching manner (see the sketch below the figure):
  - the "from" rank and the "to" rank are correct,
  - the specified size of the data to be transferred is the same on both sides,
  - the same "tag" is specified on both sides.

[Figure: rank 0 calls Send (to: rank 1, size: 10 integer data, tag: 100); rank 1 calls Receive (from: rank 0, size: 10 integer data, tag: 100) and waits for the message to arrive.]
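As a small, self-contained sketch of such a correctly matched pair of calls (added for illustration, not from the original slides; it assumes at least two processes), rank 0 sends 10 integers with tag 100 and rank 1 receives them with the same source, size, and tag:

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int myid, data[10], i;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);

        if (myid == 0) {
            for (i = 0; i < 10; i++) data[i] = i;
            /* to: rank 1, size: 10 integers, tag: 100 */
            MPI_Send(data, 10, MPI_INT, 1, 100, MPI_COMM_WORLD);
        } else if (myid == 1) {
            /* from: rank 0, size: 10 integers, tag: 100 */
            MPI_Recv(data, 10, MPI_INT, 0, 100, MPI_COMM_WORLD, &status);
            printf("rank 1 received %d ... %d\n", data[0], data[9]);
        }

        MPI_Finalize();
        return 0;
    }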
MPI_Send

Usage:  int MPI_Send(void *b, int c, MPI_Datatype d,
                     int dest, int t, MPI_Comm comm);

- Information about the message to send:
  - start address of the data,
  - number of elements,
  - data type,
  - rank of the destination,
  - tag,
  - communicator (= MPI_COMM_WORLD, in most cases)
- Data types:
  - Integer: MPI_INT
  - Real (single): MPI_FLOAT
  - Real (double): MPI_DOUBLE
  - Character: MPI_CHAR
- Tag: an integer number attached to each message
  - Used in programs that handle messages arriving from unspecified processes.
  - Usually, you can just specify 0.

Example:
    ...
    MPI_Send(&val, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD);
    ...
Examples of MPI_Send

- Send the value of an integer variable 'd' (one integer):
    MPI_Send(&d, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
- Send the first 100 elements of the array 'mat' (of MPI_DOUBLE type):
    MPI_Send(mat, 100, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
- Send 50 elements of the integer array 'data', starting from the 10th element:
    MPI_Send(&(data[10]), 50, MPI_INT, 1, 0, MPI_COMM_WORLD);
MPI_Recv

Usage:  int MPI_Recv(void *b, int c, MPI_Datatype d, int src,
                     int t, MPI_Comm comm, MPI_Status *st);

- Information about the message to receive:
  - start address for storing the data,
  - number of elements,
  - data type,
  - rank of the source,
  - tag (= 0, in most cases),
  - communicator (= MPI_COMM_WORLD, in most cases),
  - status
- status: a variable (of type MPI_Status) that stores information about the arrived message
  - Contains the source rank and the tag of the message (not used in most cases; a sketch that uses it follows below).

Example:
    ...
    MPI_Recv(&myval, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
    ...
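The status argument becomes useful when the source or the tag is not known in advance. The following sketch (added for illustration, not part of the original slides) uses the standard MPI constants MPI_ANY_SOURCE and MPI_ANY_TAG, which are not otherwise covered here, and then reads the sender's rank, the tag, and the element count from the status:

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int myid, procs, val, count, i;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Comm_size(MPI_COMM_WORLD, &procs);

        if (myid != 0) {
            val = myid * 10;
            MPI_Send(&val, 1, MPI_INT, 0, myid, MPI_COMM_WORLD);
        } else {
            for (i = 1; i < procs; i++) {
                /* accept a message from any rank, with any tag */
                MPI_Recv(&val, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                         MPI_COMM_WORLD, &status);
                MPI_Get_count(&status, MPI_INT, &count);
                printf("from rank %d, tag %d, %d element(s), value %d\n",
                       status.MPI_SOURCE, status.MPI_TAG, count, val);
            }
        }

        MPI_Finalize();
        return 0;
    }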
Collective Communications

- Communications that involve all of the processes in a group.
- Examples:
  - MPI_Bcast: copies data on one process to all of the other processes
    [Figure: rank 0's array 3 1 8 2 is copied to ranks 1 and 2.]
  - MPI_Gather: gathers data from the processes into one array
    [Figure: the values 7, 5, 9 on ranks 0, 1, 2 are collected into the array 7 5 9 on rank 0.]
  - MPI_Reduce: applies a 'reduction' operation to the distributed data to produce one array
    [Figure: the arrays 1 2 3, 4 5 6, 7 8 9 on ranks 0, 1, 2 are summed element-wise into 12 15 18 on rank 0.]
MPI_Bcast

Usage:  int MPI_Bcast(void *b, int c, MPI_Datatype d,
                      int root, MPI_Comm comm);

- Copies data on one process to all of the processes.
- Parameters: start address, number of elements, data type, root rank, communicator
  - root rank: the rank of the process that has the original data

Example:
    MPI_Bcast(a, 3, MPI_DOUBLE, 0, MPI_COMM_WORLD);

[Figure: the array a on rank 0 is copied to the array a on ranks 1, 2, and 3.]
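A minimal, self-contained sketch of the call above (added for illustration; the values are hypothetical): rank 0 fills an array and MPI_Bcast copies it to every rank. Note that every rank, not only the root, makes the same MPI_Bcast call.

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int myid;
        double a[3];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);

        if (myid == 0) {        /* only the root prepares the data */
            a[0] = 1.0;  a[1] = 2.0;  a[2] = 3.0;
        }

        /* every rank calls MPI_Bcast; afterwards all ranks hold the same a[] */
        MPI_Bcast(a, 3, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        printf("rank %d: a = %f %f %f\n", myid, a[0], a[1], a[2]);

        MPI_Finalize();
        return 0;
    }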
MPI_Gather

Usage:  int MPI_Gather(void *sb, int sc, MPI_Datatype st, void *rb, int rc,
                       MPI_Datatype rt, int root, MPI_Comm comm);

- Gathers data from all of the processes to construct one array.
- Parameters:
  - send data: start address, number of elements, data type
  - receive data: start address, number of elements, data type
    (meaningful only on the root rank)
  - root rank, communicator
  - root rank: the rank of the process that stores the resulting array

Example:
    MPI_Gather(a, 3, MPI_DOUBLE, b, 3, MPI_DOUBLE, 0, MPI_COMM_WORLD);

[Figure: the array a on each of ranks 0-3 is gathered, in rank order, into the array b on rank 0.]
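A minimal sketch of this call (added for illustration; the values and the upper limit on the process count are hypothetical): every rank contributes three doubles, and rank 0 receives them all in rank order. The receive buffer must hold procs * 3 elements, so a fixed maximum of 8 processes is assumed here.

    #include <stdio.h>
    #include "mpi.h"

    #define MAXPROCS 8    /* assumed upper limit on the number of processes */

    int main(int argc, char *argv[])
    {
        int myid, procs, i;
        double a[3], b[3 * MAXPROCS];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Comm_size(MPI_COMM_WORLD, &procs);

        /* each rank prepares its own three values */
        for (i = 0; i < 3; i++)
            a[i] = myid * 10.0 + i;

        /* gather everyone's a[] into b[] on rank 0, in rank order */
        MPI_Gather(a, 3, MPI_DOUBLE, b, 3, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        if (myid == 0)
            for (i = 0; i < 3 * procs; i++)
                printf("b[%d] = %f\n", i, b[i]);

        MPI_Finalize();
        return 0;
    }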
Usage of Collective Communications

- Write the program so that every process calls the same function.
  - For example, MPI_Bcast must be called not only by the root rank but also by all of the other ranks.
- For collective communications that take separate specifications of the send data and the receive data, the specified address ranges for sending and receiving must not overlap.
  - MPI_Gather, MPI_Allgather, MPI_Gatherv, MPI_Allgatherv,
    MPI_Reduce, MPI_Allreduce, MPI_Alltoall, MPI_Alltoallv, etc.
Summary

- In MPI, multiple processes execute the same program.
  - Work is assigned to each process according to its rank (= process number).
- Each process runs in its own memory space.
  - Data held by another process can be accessed only through explicit communication between the processes.
- MPI functions covered today:
  - MPI_Init, MPI_Finalize, MPI_Comm_rank
  - MPI_Send, MPI_Recv
  - MPI_Bcast, MPI_Gather
References

- MPI Forum: http://www.mpi-forum.org/
  - Specification of the MPI standard
- MPI specification (Japanese translation): http://phase.hpcc.jp/phase/mpi-j/ml/
- RIKEN MPI training course material:
  http://accc.riken.jp/HPC/training/mpi/mpi_all_2007-0207.pdf
Ex 1) A Program That Displays Random Numbers

Write a program in which each process displays its own rank together with one integer random number.

Sample (serial version):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>

    int main(int argc, char *argv[])
    {
        int r;
        struct timeval tv;

        gettimeofday(&tv, NULL);
        srand(tv.tv_usec);
        r = rand();
        printf("%d\n", r);
        return 0;
    }
Ex 1) (cont.)

Example of the result of execution:

    1: 520391
    0: 947896500
    3: 1797525940
    2: 565917780
    4: 1618651506
    5: 274032293
    6: 1248787350
    7: 828046128
Ex 1) Sample of the Answer

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int r, myid, procs;
        struct timeval tv;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Comm_size(MPI_COMM_WORLD, &procs);

        gettimeofday(&tv, NULL);
        srand(tv.tv_usec);
        r = rand();
        printf("%d: %d\n", myid, r);

        MPI_Finalize();
        return 0;
    }
Report: Display in Order

Modify the program of Ex 1) so that the messages are printed out in order of the rank of each process, starting from rank 0.

Example of the result of the execution:

    0: 1524394631
    1: 999094501
    2: 941763604
    3: 526956378
    4: 152374643
    5: 1138154117
    6: 1926814754
    7: 156004811