System Description Document


Table of contents

1. Introduction
1.1. General
1.2. Overview
2. Software functional description
2.1. MPICH
2.2. Ox
2.3. Description of ox2mpich.dll
2.4. The different functions in the dll file
2.4.1. Init
2.4.2. Finalize
2.4.3. Comm_Size
2.4.4. Comm_rank
2.4.5. Get_processor_name
2.4.6. Wtime
2.4.7. Reduce
2.4.8. Bcast
2.4.9. Send
2.4.10. Recv
2.4.11. Probe
2.4.12. Iprobe


1. Introduction

1.1. General

The System Description Document (SDD) provides a brief summary of the hardware components and software programs used in ParaDiOx. It also gives a more detailed description of how the software programs are linked together. The main purpose of ParaDiOx is to split a time-consuming calculation in Ox into a number of smaller calculations and distribute them through the network to a number of computers, which process the calculations at the same time and then return the answers to the main computer.

1.2. Overview

ParaDiOx is based on the requirements described under headline 1.3 Demands in the Preliminary Project Specification. The systems that can be built upon ParaDiOx consist of one “master” computer and several “slave” computers, all connected via a network. All calculations are controlled from the master computer. The slave computers make no decisions of their own: they get an input from the master, process the input (calculate) and then send an output to the master.

Consider the following example (figure 1.2-1): A master computer is going to evaluate an equation using a parallel algorithm. It sends a part of the calculation (the red message) to the first slave, and when the slave has finished its calculation, the master receives the answer from it. Similar messages are sent to the other slave computers, which process their calculations at the same time. The answers are then post-processed by the master. This reduces the calculation time (although in this trivial case, the network communication will eat up all the time won through parallel calculation).


Figure 1.2-1 A trivial parallel calculation example.
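The trade-off between calculation time and network time in the example can be made concrete with a rough cost model. The sketch below is only an illustration: the function name parallel_time and the assumption that each slave adds a fixed messaging cost are ours, not part of ParaDiOx.

```c
#include <assert.h>

/* Rough estimate of the wall-clock time when `work` seconds of calculation
   are split evenly over `workers` slaves. Assumption: the master spends
   `comm` seconds on messaging per slave, one slave after the other. */
double parallel_time(double work, int workers, double comm)
{
    assert(workers > 0);
    return work / workers + comm * workers;
}
```

With 100 seconds of work, 4 slaves and 1 second of messaging per slave, the model gives 29 seconds instead of 100. With only 1 second of work, the same setup takes longer than doing the calculation on the master alone, which is the situation described for the trivial case.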

The following documents contain additional information about the software used in ParaDiOx:

• Preliminary Project Specification. This document explains the demands, purpose etc. at an early stage of the project. The document can be found at: http://www.nada.kth.se/projects/proj02/paradiox/ (2002.04.26)

• Ox Documentation. This document contains detailed information about Ox. The document can be found at: http://www.nuff.ox.ac.uk/Users/Doornik/doc/ox/index.html (2002.04.26)

• MPICH Documentation. This document contains manuals and documentation about MPICH as well as further documentation about Message Passing Interface (MPI). The document can be found at: http://www-unix.mcs.anl.gov/mpi/mpich/ (2002.04.26)


2. Software functional description

ParaDiOx is built on two third-party products, MPICH and Ox. MPICH is the foundation, which has been adapted to handle the requirements of distributed calculations in Ox.

2.1. MPICH

MPI is a library specification for message passing, proposed as a standard by a broadly based committee of vendors, developers and users. MPI was designed for high performance on both massively parallel machines and on workstation clusters. MPI is widely available, with both free and vendor-supplied implementations. MPICH is one of the freely available implementations.

2.2. Ox

Ox is an object-oriented matrix language with a comprehensive mathematical and statistical function library. Matrices can be used directly in expressions, for example to multiply two matrices, or to invert a matrix. Use of the object-oriented features is optional, but makes code easier to reuse. The syntax of Ox is similar to the C, C++ and Java languages. This similarity is most clear in syntax items such as loops, functions, arrays and classes.

2.3. Description of ox2mpich.dll

Here we describe the functions on a lower level. The Ox program calls functions in the dll file, and the dll file interprets that information and makes a new function call to the MPICH software (in turn, MPICH translates that into the MPI standard and sends it across the world of connected MPI clients). In other words, this dll file is used to connect the Ox environment to the MPICH environment. As said in earlier sections, this dll file can be further developed in order to include more functions. As a result, this document covers only the functions that exist as of this date.

As a general note, all functions must be defined in a certain way. Otherwise, the Ox programs will not be able to make a function call to them.


void OXCALL Init(OxVALUE *rtn, OxVALUE *pv, int cArg)
{
    // do stuff
}

Instead of receiving and returning values in the “ordinary” way, used by languages such as Java and C, the two arrays rtn and pv are used. The third argument, cArg, holds the number of elements in the incoming array. The incoming values can be found in the pv array, whereas the rtn array is used for returning values.

As explained in the user guide, the arguments are not the variables themselves but rather the addresses to them. This can be cleverly used in the dll file when you want to return values, since all you have to do is to write to the same address found in the incoming array!
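The two ways of handing a result back can be sketched without the Ox headers. Everything in the block below is hypothetical: MockValue stands in for OxVALUE, and mock_comm_size and mock_processor_name only imitate the shape of the dll functions. The real code uses the Ox access macros instead of struct fields.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for OxVALUE; the real type is defined by the Ox headers. */
typedef struct MockValue {
    int ival;                 /* integer payload */
    char sval[32];            /* string payload */
    struct MockValue *ref;    /* address of the caller's variable, if any */
} MockValue;

/* Returns an integer through the return array: the result is written to rtn[0]. */
void mock_comm_size(MockValue *rtn, MockValue *pv, int cArg)
{
    (void)pv;
    (void)cArg;               /* no incoming arguments are needed */
    rtn[0].ival = 4;          /* made-up process count */
}

/* Returns a string by writing to the address found in pv[0]. */
void mock_processor_name(MockValue *rtn, MockValue *pv, int cArg)
{
    assert(cArg >= 1 && pv[0].ref != NULL);
    strncpy(pv[0].ref->sval, "node-01", sizeof pv[0].ref->sval - 1);
    pv[0].ref->sval[sizeof pv[0].ref->sval - 1] = '\0';
    rtn[0].ival = 0;
}
```

The first function shows the rtn route; the second shows the trick described above, where the caller's own variable is updated through the address passed in pv.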

2.4. The different functions in the dll file

As in the function reference for the Ox programming section, we list and describe each function on the dll level one by one. If you need more capabilities, all you have to do is implement such functions, rebuild the dll file and replace the current dll file with the enhanced version.

We have chosen a naming scheme that makes the dll file transparent to the Ox programmer. For example, if the general MPI command is called “MPI_Init”, then we have named the function at the dll level simply “Init”, and this function calls MPI_Init in the MPICH software. When you call this Init function from the Ox program, you import the external function Init as “MPI_Init”, and that way you have the same syntax as “before” the dll file.


We describe the general ideas behind the functions here in this document. More precise comments on exact syntax and other details can be found in the source code of the ox2mpich.dll file.

2.4.1. Init

As specified in the MPI standard, two addresses must be sent as arguments to the initialization function. Our implementation only sends dummy addresses, which carry no useful meaning. If you create a more advanced program that requires these arguments, the dll file must be extended.

2.4.2. Finalize

As you can see in the source code, this function is very straightforward. No arguments, no nothing. Just call the MPI_Finalize.

2.4.3. Comm_Size

Here is the first example of making use of the rtn array. The function itself returns the number of computers in the domain, since it is mapped to the MPI_Comm_size function described in the previous chapter.

void OXCALL Comm_size(OxVALUE *rtn, OxVALUE *pv, int cArg)
{
    // numprocs is filled in by the MPI call before this point
    OxInt(rtn, 0) = numprocs;
}

As in the Ox code, the address of the integer is sent to the MPI system and it manipulates the value at that address. That way we simply return the value after calling the Comm_Size function.


2.4.4. Comm_rank

This function works the same way as the Comm_Size function. The only difference is that it calls another MPI command and returns that value in the return array.

2.4.5. Get_processor_name

This function returns a string of characters instead of an integer. Hence, the return array cannot be used and that is why we store the processor name at the address of the incoming variable in the pv array instead.

OxValSetString(OxArray(pv,0), processor_name);

2.4.6. Wtime

This function returns the double value returned by the MPI_Wtime function. Since it is a double, it can be returned in the rtn array instead of manipulating the incoming variable.

double wtime = MPI_Wtime();
OxDbl(rtn,0) = wtime;

2.4.7. Reduce

This function is used for "reducing" arguments into one, using the specified operation. As explained in the previous chapter, the addresses of the buffers, the root and the operation are provided by the calling function.

First we have to find out what kind of operation is specified in the incoming string. An evaluation with an if-else chain takes care of that and stores the actual operation in a variable for use in the sending stage.
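The if-else evaluation of the incoming string might look roughly like the sketch below. The enum ReduceOp and the function parse_reduce_op are hypothetical names, and the accepted strings are assumed to match the MPI operation names. The real dll would store an MPI operation handle such as MPI_SUM rather than an enum value.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical operation codes standing in for MPI operation handles. */
typedef enum { OP_UNKNOWN = 0, OP_SUM, OP_PROD, OP_MAX, OP_MIN } ReduceOp;

/* Maps the operation name received from the Ox program to an operation code.
   Assumption: the Ox program passes the MPI operation name as a string. */
ReduceOp parse_reduce_op(const char *name)
{
    assert(name != NULL);
    if (strcmp(name, "MPI_SUM") == 0) {
        return OP_SUM;
    } else if (strcmp(name, "MPI_PROD") == 0) {
        return OP_PROD;
    } else if (strcmp(name, "MPI_MAX") == 0) {
        return OP_MAX;
    } else if (strcmp(name, "MPI_MIN") == 0) {
        return OP_MIN;
    }
    return OP_UNKNOWN;    /* caller should treat this as an error */
}
```

Returning a distinct "unknown" value lets the caller refuse to send anything when the operation string is misspelled.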


When the operation is known, we check what kind of data is provided. Depending on the result of this if-else chain, different strategies are chosen. If memory needs to be allocated dynamically (string, matrix and array), the memory is allocated, used and then freed.

When everything has been set, the message is sent with the correct data size, world definition and so on.

2.4.8. Bcast

This function is used to broadcast the incoming message. In order to do that, we must first determine what kind of message is being broadcast. The two functions OxLibCheckType and OxValType do this. Then we check the result with if statements. In the current release, we can manage int, double, ox_matrix, ox_array and ox_string.

Once we have the type, it is simply a matter of sending the information in the correct way. In the int and double cases, we only need to specify the data, its size, who sends it and where.

In the other three cases, sending the data is a more complex procedure. Since the size of the matrices etc. is unknown, one has to loop through the data structure, allocate memory and then send the data. We have limited this function to handling doubles only in the cells of the array/matrix, since that is probably what needs to be broadcast.
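The loop-and-allocate step for a matrix of doubles can be sketched as follows. The helper flatten_matrix is a hypothetical name, and the sketch assumes the matrix is reachable as an array of row pointers. The resulting contiguous buffer is what would then be handed to the MPI send or broadcast call.

```c
#include <assert.h>
#include <stdlib.h>

/* Copies a rows x cols matrix, given as an array of row pointers, into one
   newly allocated contiguous buffer. Stores the element count in *count.
   Returns NULL on bad input or failed allocation. The caller frees the buffer. */
double *flatten_matrix(double *const *matrix, int rows, int cols, int *count)
{
    assert(count != NULL);
    *count = 0;
    if (matrix == NULL || rows <= 0 || cols <= 0) {
        return NULL;
    }
    double *buf = malloc((size_t)rows * (size_t)cols * sizeof *buf);
    if (buf == NULL) {
        return NULL;
    }
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            buf[(size_t)r * (size_t)cols + (size_t)c] = matrix[r][c];
        }
    }
    *count = rows * cols;
    return buf;
}
```

Sending one contiguous buffer keeps the message to a single block of doubles with a known element count, which matches the restriction to doubles described above.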

2.4.9. Send

This function works in the very same way as the broadcast function. The only two differences are that a receiver is specified, instead of sending to the entire domain, and an information tag is also required to be sent along with the message. This information can then be used by the receiver in order to find the desired message.


2.4.10. Recv

This function mirrors the send function. The only difference is that this receiving function specifies which source the message must have come from.

2.4.11. Probe

This function makes it possible to probe the environment for messages sent by the send functions. It waits until a message has been detected before continuing. It simply uses the MPI_Probe command with correct arguments received in the pv array from the Ox program.

2.4.12. Iprobe

The only difference between the Iprobe function and the Probe function above is that it merely checks whether there is a message to be received and then returns an integer to the user. If there is no new message, the return value will be zero.
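The non-blocking check, and the way source and tag pick out one message, can be illustrated with a small in-memory mock. PendingMessage and mock_iprobe are hypothetical names. In the real dll the pending messages live inside MPICH and the check is done by the MPI probe call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of a message that has been sent but not yet received. */
typedef struct {
    int source;   /* rank of the sender */
    int tag;      /* information tag attached by the sender */
} PendingMessage;

/* Non-blocking check in the spirit of Iprobe: returns 1 if a message from
   `source` carrying `tag` is waiting, 0 otherwise. It never waits. */
int mock_iprobe(const PendingMessage *queue, size_t len, int source, int tag)
{
    assert(queue != NULL || len == 0);
    for (size_t i = 0; i < len; ++i) {
        if (queue[i].source == source && queue[i].tag == tag) {
            return 1;
        }
    }
    return 0;
}
```

A blocking Probe would correspond to repeating this check until it returns a non-zero value.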

ParaDiOx is distributed as installation programs. These installation programs are created with GP-install, which is a free-to-use product. The installation program contains a program written in Visual Basic (VB) that adds a user to the local machine.
