Parallel Computing with BOINC - lists.ssl.berkeley.edu Mailing Lists

A Model and API for Communication between BOINC Workunits
Nagarajan Kanna (nkanna@cs.uh.edu)
Jaspal Subhlok (jaspal@uh.edu)
Department of Computer Science
University of Houston, Houston, TX 77006
(DRAFT)
1. Motivation
The current BOINC platform supports execution of parallel applications composed of independent work
units. The work units cannot communicate with each other. This document presents a method and API for
data communication between work units.
2. Synopsis
A parallel application consists of communicating processes. In BOINC, processes are executed as work
units. Our basic model for communication between work units is based on reading and writing data
objects to and from a logical “dataspace”. Data objects in the dataspace are identified by unique tags. The
fundamental concept is similar to the LINDA model and some “publish/subscribe” systems.
In order to enable some forms of communication between work units, it is essential that work units be
identifiable from user code. We introduce the notion of a user-managed workunit Id, which is analogous
to a process Id in other parallel systems. A workunit Id is a parameter that is provided when a work unit
is created. An executing work unit can find its own Id using an API call. Multiple work units can have the
same Id if they represent the same computation. This can happen when redundant computation is
employed to improve robustness or when a new work unit is created to replace an existing work unit that
may have failed.
Work units execute read/write operations with identification tags for communication. A write operation
will store a data object in an abstract shared dataspace with an identification tag. Once a data object is
written, any process can retrieve the data object with a read operation with a matching tag. The data
object will continue to exist in the dataspace after a read operation. Thus, multiple work units can receive
a data object written by a single sending work unit. An explicit clear operation deletes a data object from
the dataspace. Since there is a single shared dataspace for the entire application and identification tags are
unique, redundant work units read and write the same data objects under the same tags, so communication
is unaffected by redundancy.
The workunit Id is typically a component of the identification tag when one work unit needs to
communicate with another workunit. As a trivial example, if each workunit has only one data object to
share, it can use its own workunit Id as the tag to index the data object in the dataspace.
This document focuses on the model and API for communication, and not the implementation. However,
the simplest implementation of such a dataspace is on a single data server node (which can be the BOINC
server), although distributed implementations are also planned.
3. API
We introduce the API proposed for discovery of workunit Ids and communication between
workunits:
int boinc_write(int tag, int dataSize, byte *buffer)
int boinc_read(int tag, int dataSize, byte *buffer)
int boinc_getWorkunitId()
int boinc_getNumWorkunits()
int boinc_clear(int tag)
int boinc_clearall()
tag      - Identifies (or indexes) each data object in the dataspace for read/write.
dataSize - Number of bytes to be written to or read from the dataspace.
buffer   - Pointer to the data being written to or read from the dataspace.
boinc_write
Writes given data object (buffer) indexed with the tag into the dataspace. It is
a non-blocking operation. An ERROR return implies the operation cannot be
completed.
boinc_read
Reads a data object (with size less than or equal to dataSize) with the given
tag from the dataspace. It is a blocking operation that will wait until data is
available. A successful operation will return the actual size of data object that
is read. An ERROR return implies the operation cannot be completed.
boinc_getWorkunitId
Returns the current workunit Id within the application. If the workunit cannot
be identified, then it returns -1.
boinc_getNumWorkunits
Returns the number of workunits in the given application. If the number of
workunits cannot be identified, then it returns -1.
boinc_clear
Deletes a given tag along with the corresponding data object.
boinc_clearall
Clears the dataspace for the given application.
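The semantics described above (a write stores or overwrites an object under a tag, a read leaves the object in place, a clear removes it) can be illustrated with a minimal single-process, in-memory sketch. This is not the proposed implementation: the ds_ names, the fixed table sizes, and the choice to return an error instead of blocking when a tag is absent are all assumptions made so that the example runs without a server.

```c
#include <assert.h>
#include <string.h>

#define DS_MAX_OBJECTS 64    /* assumed capacity of the toy dataspace */
#define DS_MAX_BYTES   256   /* assumed maximum object size */
#define DS_ERROR       (-1)  /* stand-in for the ERROR return value */

typedef unsigned char byte;

struct ds_entry { int used; int tag; int size; byte data[DS_MAX_BYTES]; };
static struct ds_entry ds_table[DS_MAX_OBJECTS];

/* Find the slot holding `tag`, or NULL if the tag is not present. */
static struct ds_entry *ds_find(int tag) {
    for (int i = 0; i < DS_MAX_OBJECTS; i++)
        if (ds_table[i].used && ds_table[i].tag == tag) return &ds_table[i];
    return NULL;
}

/* Store (or overwrite) the object indexed by `tag`. Returns 0 or DS_ERROR. */
int ds_write(int tag, int dataSize, const byte *buffer) {
    if (buffer == NULL || dataSize < 0 || dataSize > DS_MAX_BYTES) return DS_ERROR;
    struct ds_entry *e = ds_find(tag);
    for (int i = 0; e == NULL && i < DS_MAX_OBJECTS; i++)
        if (!ds_table[i].used) e = &ds_table[i];
    if (e == NULL) return DS_ERROR;  /* dataspace is full */
    e->used = 1;
    e->tag = tag;
    e->size = dataSize;
    memcpy(e->data, buffer, (size_t)dataSize);
    return 0;
}

/* Copy the object into `buffer` and return its actual size. The object stays
 * in the dataspace. Unlike the blocking boinc_read, an absent tag is reported
 * as DS_ERROR here because nothing else can write in a single process. */
int ds_read(int tag, int dataSize, byte *buffer) {
    struct ds_entry *e = ds_find(tag);
    if (e == NULL || buffer == NULL || e->size > dataSize) return DS_ERROR;
    memcpy(buffer, e->data, (size_t)e->size);
    return e->size;
}

/* Delete the tag and its data object. */
int ds_clear(int tag) {
    struct ds_entry *e = ds_find(tag);
    if (e == NULL) return DS_ERROR;
    e->used = 0;
    return 0;
}

/* Delete every object in the dataspace. */
int ds_clearall(void) {
    memset(ds_table, 0, sizeof ds_table);
    return 0;
}
```

Because a second write with the same tag simply replaces the stored object, redundant work units that write identical data leave the dataspace in the same state as a single writer.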
3.1 Work Unit naming convention
In BOINC a work unit name can be specified during workunit creation. To enable
communication, the workunit Id and the number of workunits have to be encoded in the workunit
name as follows:
<WorkUnitName>_<WorkunitId>_<NumWorkunits>
The runtime library will automatically extract the required information from the text string. For
example, if a work unit is named ‘vip_run1_0_16’, it indicates this is workunit Id 0 among 16
workunits.
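As an illustration of the extraction step, the following sketch parses the two trailing fields of a workunit name. The function name parse_workunit_name and its error convention are hypothetical; the document does not specify how the runtime library performs this parsing.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Parse the trailing "_<WorkunitId>_<NumWorkunits>" fields of a workunit name.
 * Returns 0 on success and -1 if the name does not follow the convention. */
int parse_workunit_name(const char *name, int *workunitId, int *numWorkunits) {
    assert(name != NULL && workunitId != NULL && numWorkunits != NULL);

    const char *last = strrchr(name, '_');   /* start of _<NumWorkunits> */
    if (last == NULL || last == name) return -1;

    /* Scan backwards for the underscore that starts _<WorkunitId>. */
    const char *prev = last - 1;
    while (prev > name && *prev != '_') prev--;
    if (*prev != '_' || prev == name) return -1;   /* base name must be non-empty */

    char *end;
    long id = strtol(prev + 1, &end, 10);
    if (end == prev + 1 || end != last) return -1;    /* Id field must be fully consumed */
    long num = strtol(last + 1, &end, 10);
    if (end == last + 1 || *end != '\0') return -1;   /* count must end the string */

    if (id < 0 || num <= 0 || id > INT_MAX || num > INT_MAX) return -1;
    *workunitId = (int)id;
    *numWorkunits = (int)num;
    return 0;
}
```

For the name 'vip_run1_0_16' this yields workunit Id 0 and 16 workunits; underscores inside the base name (here 'vip_run1') are left untouched because only the last two fields are parsed.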
4. Example: Solving a 2-D Laplace equation
4.1 Problem Description
2-D Laplace equation:
\[
\frac{\partial^2 u(x, y)}{\partial x^2} + \frac{\partial^2 u(x, y)}{\partial y^2} = 0
\]
(Equation 1)
Central discretization leads to
u i 1, j  2u i , j u i , j 1
h
2

u i , j 1  2u i , j u i , j 1
h2
0
(Equation 2)
This leads to a set of linear equations, which can be solved by the BiConjugate Gradient Stabilized method
(Bi-CGSTAB). It is an iterative method, which creates an approximate solution and improves it on
successive iterations. In a parallel implementation, boundary rows/columns (ghost cells) are
exchanged among processes between iterations as illustrated below.
[Figure: Ghost cells (copies of a row/column of data from a neighbor process) are
exchanged between iterations at process boundaries. Refer to Equation 2.]
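The need for ghost cells follows directly from Equation 2: the value at each grid point is coupled to its four nearest neighbors, so points on the edge of a process's block depend on data owned by the adjacent process. The sketch below evaluates the left-hand side of Equation 2 on a local block stored with a one-cell ghost layer. The function name, the row-major layout, and the ghost-layer convention are assumptions for illustration, and the Bi-CGSTAB solver itself is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Index into a row-major local block of (nx + 2) x (ny + 2) values.
 * Rows 0 and nx + 1, and columns 0 and ny + 1, are the ghost layers. */
static size_t idx(int i, int j, int ny) {
    return (size_t)i * (size_t)(ny + 2) + (size_t)j;
}

/* Evaluate the left-hand side of Equation 2 at every interior point:
 *   out(i,j) = (u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1) - 4 u(i,j)) / h^2
 * The ghost cells of `u` must already hold the neighbors' boundary values. */
void apply_laplacian(const double *u, double *out, int nx, int ny, double h) {
    assert(u != NULL && out != NULL && nx > 0 && ny > 0 && h > 0.0);
    for (int i = 1; i <= nx; i++) {
        for (int j = 1; j <= ny; j++) {
            out[idx(i, j, ny)] =
                (u[idx(i - 1, j, ny)] + u[idx(i + 1, j, ny)] +
                 u[idx(i, j - 1, ny)] + u[idx(i, j + 1, ny)] -
                 4.0 * u[idx(i, j, ny)]) / (h * h);
        }
    }
}
```

An iterative solver applies an operator of this kind once or more per iteration, which is why the ghost layers have to be refreshed between iterations.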
4.2 Using BOINC
This computation can be implemented in a straightforward way in BOINC with the API
discussed in this paper, as illustrated below.
Creating Work Units
$ create_work \
    -appname laplace \
    -wu_name laplace_run1_0_11 \
    -wu_template templates/laplace_wu \
    -result_template templates/laplace_result \
    -min_quorum 1 \
    -target_nresults 1 \
    inputfile
Similarly, 11 more work units have to be created, each with its own workunit Id in the name.
Process mapping and determining neighbor processes
The workunits are assigned parts of the global array as shown here:
[Figure: assignment of parts of the global array to workunits]
Code Snippet
…
boinc_init();
my_workunitId = boinc_getWorkunitId();
num_workunits = boinc_getNumWorkunits();
…
// The neighboring workunit Ids nup, ndown, nleft and nright are computed
// based on the distribution illustrated above.
for (iterationCount = 1; iterationCount <= N; iterationCount++) {
    // Communication in the Y direction.
    // The getTag function returns a unique tag for a given list of arguments.
    // This gives a well-defined index in the dataspace for exchanging data
    // with the up/down workunits in each iteration.
    sendTagUp = getTag(my_workunitId, nup, iterationCount);
    recvTagDown = getTag(ndown, my_workunitId, iterationCount);
    // Send data to the workunit above and receive data from the workunit below.
    // Separate buffers keep the received data from overwriting the data sent.
    boinc_write(sendTagUp, sizeof(sendBuffer), sendBuffer);
    boinc_read(recvTagDown, sizeof(recvBuffer), recvBuffer);
    // The above steps are repeated in each direction.
}
……
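The snippet above leaves the computation of nup, ndown, nleft and nright open. A minimal sketch is given here under the assumption that workunit Ids are laid out in row-major order on a rows x cols grid, with "up" meaning the previous row; the mapping actually intended by the distribution may differ. The names find_neighbors and NO_NEIGHBOR are hypothetical.

```c
#include <assert.h>

#define NO_NEIGHBOR (-1)  /* marks a side that lies on the global boundary */

struct neighbors { int up, down, left, right; };

/* Neighbor Ids of `workunitId` on a row-major rows x cols grid of workunits. */
struct neighbors find_neighbors(int workunitId, int rows, int cols) {
    assert(rows > 0 && cols > 0);
    assert(workunitId >= 0 && workunitId < rows * cols);

    int r = workunitId / cols;  /* grid row of this workunit */
    int c = workunitId % cols;  /* grid column of this workunit */

    struct neighbors n;
    n.up    = (r > 0)        ? workunitId - cols : NO_NEIGHBOR;
    n.down  = (r < rows - 1) ? workunitId + cols : NO_NEIGHBOR;
    n.left  = (c > 0)        ? workunitId - 1    : NO_NEIGHBOR;
    n.right = (c < cols - 1) ? workunitId + 1    : NO_NEIGHBOR;
    return n;
}
```

A workunit on the edge of the global array would skip the write and the read for any side reported as NO_NEIGHBOR, since a blocking read on a tag that nobody writes would never return.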
Normal Program Execution
The values of ghost cells will be exchanged with neighboring workunits in each iteration. We have to
make sure that the tag values used in successive iterations and for different communication pairs are unique.
We also have to construct the data objects to be transmitted, as the underlying data may not be contiguous.
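Constructing a data object from non-contiguous data can be sketched as follows for the common case of a column in a row-major array. The function names pack_column and unpack_column are hypothetical helpers, not part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>

/* Copy column `col` of a row-major rows x cols array into a contiguous buffer
 * of `rows` values, ready to be written to the dataspace as one data object. */
void pack_column(const double *grid, int rows, int cols, int col, double *out) {
    assert(grid != NULL && out != NULL);
    assert(rows > 0 && col >= 0 && col < cols);
    for (int r = 0; r < rows; r++)
        out[r] = grid[(size_t)r * (size_t)cols + (size_t)col];
}

/* Inverse operation: scatter a received contiguous buffer into column `col`,
 * for example into a ghost column of the local block. */
void unpack_column(double *grid, int rows, int cols, int col, const double *in) {
    assert(grid != NULL && in != NULL);
    assert(rows > 0 && col >= 0 && col < cols);
    for (int r = 0; r < rows; r++)
        grid[(size_t)r * (size_t)cols + (size_t)col] = in[r];
}
```

Rows of a row-major array are already contiguous and can be passed to the write call directly; only the left/right exchanges need this packing step.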
Failure scenario
Suppose a workunit (say B) fails to complete its operation. Other workunits (say A) may then continue to
wait for data that will never become available in the dataspace. When work units are created in the
BOINC server, an expiry time is set for each work unit. Ultimately workunit B will be considered to
have failed and will be reassigned. Workunit A may also fail because of the wait and have to be reassigned. This
may lead to repeated writes to the same dataspace location, but it does not affect the final results, because a
replacement work unit writes the same data under the same tags. The
application will ultimately complete when a work unit associated with each process completes.
Redundant computation
Suppose two identical work units are created for each process, say A1, A2, B1, B2, C1, C2. Each of
B1/B2 (and C1/C2) will compute the same result and write/overwrite to the dataspace with the same tag.
Work units A1/A2 are able to receive the data objects from the dataspace once B1 or B2, and C1 or C2,
have written into the dataspace. The computation will terminate successfully as long as one copy of each
of A1/A2, B1/B2 and C1/C2 completes normally.
5. Discussion
This document is a basic draft, and many issues are not addressed explicitly. Some of these
are listed here.
5.1 Verification API
int boinc_write_verify(int tag, int dataSize, byte *buffer, int (*checkData)())
The objective of boinc_write_verify is to check whether redundant work units are computing identical data objects
with the same tag. The basic boinc_write writes the given data directly to the shared dataspace. boinc_write_verify is an
extension to the basic API that requires a user-created verification function, which is passed in
as a function pointer. This function performs the actual data comparison or
verification.
5.2 Helper API
The basic API that we have listed will be sufficient to accomplish our goals. In addition, we will provide
a set of helper functions that will enhance the productivity of application developers. Some examples:
• Conversion functions to/from primitive data types and byte arrays.
• Converting local indexes and process Ids to tag values. This function will take an n-tuple of
indexing integers and create a unique global tag. As an example, the programmer provides the
row and column of an array element, along with its own process Id, and the helper function will
convert it to a single unique tag (similar to getTag in the example).
• A command to initiate the execution of parallel applications (say ‘boincrun’). The input will be
the number of processes to be used and the input files to be used for starting the application. This
program (boincrun) is responsible for creating the various work units to be executed on BOINC
clients. [boincrun is similar to mpirun in MPI.]
All of these functions are simply shortcuts for activities that a programmer can also implement directly.
They do not require any changes to the execution infrastructure.
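The first two helpers listed above can be sketched as follows. The names make_tag, int32_to_bytes and bytes_to_int32 are hypothetical, and the choices of a mixed-radix tag encoding and big-endian byte order are assumptions; the document does not prescribe either.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical getTag-style helper: mixed-radix encoding of
 * (source Id, destination Id, iteration) into one int. Distinct triples give
 * distinct tags as long as both Ids lie in [0, numWorkunits) and the result
 * fits in an int. Returns -1 otherwise. */
int make_tag(int srcId, int dstId, int iteration, int numWorkunits) {
    if (numWorkunits <= 0 || iteration < 0) return -1;
    if (srcId < 0 || srcId >= numWorkunits) return -1;
    if (dstId < 0 || dstId >= numWorkunits) return -1;

    long long n = numWorkunits;
    long long t = (long long)iteration * n + srcId;
    if (t > INT_MAX) return -1;  /* guard before the second multiplication */
    t = t * n + dstId;
    return (t <= INT_MAX) ? (int)t : -1;
}

/* Serialize a 32-bit integer into 4 bytes, most significant byte first, so
 * that clients with different native byte orders agree on the encoding. */
void int32_to_bytes(int32_t value, unsigned char out[4]) {
    uint32_t u = (uint32_t)value;  /* well-defined modular conversion */
    out[0] = (unsigned char)((u >> 24) & 0xFFu);
    out[1] = (unsigned char)((u >> 16) & 0xFFu);
    out[2] = (unsigned char)((u >> 8) & 0xFFu);
    out[3] = (unsigned char)(u & 0xFFu);
}

/* Inverse of int32_to_bytes. */
int32_t bytes_to_int32(const unsigned char in[4]) {
    uint32_t u = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                 ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    int32_t value;
    memcpy(&value, &u, sizeof value);  /* reinterpret the two's complement bits */
    return value;
}
```

Encoding the iteration number in the tag keeps the objects of successive iterations distinct, at the cost of limiting the number of iterations that fit in an int tag for a given number of workunits.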
5.4 Implementation
This document focuses only on the API. We list some of the higher-level implementation issues.
• The basic implementation will have the BOINC server or a designated node as the
communication server. However, future implementations may employ direct client-to-client
communication.
• The API intentionally does not specify how long an object remains in the dataspace after it is
written and read but never cleared. An implementation may choose to automatically clear
outdated data items, loosely analogous to garbage collection in language runtimes.
5.5 Extensions
This basic communication framework is designed to increase the flexibility of BOINC applications.
It can also be used to implement a message passing framework like MPI. An implementation could
conceivably support inter-application communication in addition to inter-process communication within an
application.