PT-PVM: A Portable Platform for Multithreaded Coordination Languages

Oliver Krone, Béat Hirsbrunner (Institut d'Informatique, Université de Fribourg, Fribourg, Switzerland)
Vaidy Sunderam (Computer Science Department, Emory University, Atlanta, GA, USA)

(Research supported by grant 5003-034409 of the Swiss National Science Foundation; part of the work was done while the first author was on a visit at Emory University. Research supported in part by NSF grant ASC-9527186, NASA grant NAG 2-828, and DoE grant DE-FG05-91ER25015.)

ABSTRACT. This paper describes a portable message passing and process management platform for multithreaded applications. PT-PVM is based on the PVM system and provides message passing and process management facilities at the thread level for a cluster of workstations running the UNIX operating system, without changing the PVM system itself. Moreover, PT-PVM introduces advanced programming abstractions such as generative communication and client/server programming at the thread level. The platform is primarily suited to the implementation of coordination languages such as Linda or CoLa.

1. Introduction

The coordination of distributed applications has recently become a scientific discipline in its own right; it has been described, for example, as "managing dependencies among independent entities" [1]. Coordination can be achieved using a coordination language, "the linguistic embodiment of a coordination model" [2], to express and describe the relationships between the active entities running in parallel. This research has led to the development of several coordination models and corresponding coordination languages such as Linda [3], Linear Objects [4], and Gamma [5].

A coordination language like CoLa [6] focuses on the coordination of highly asynchronous concurrent processes running in a (massively) parallel environment. The CoLa computation space consists of a large collection of fine-grained processes communicating via the asynchronous message passing paradigm. CoLa processes are dynamic in the sense that new processes can be created or destroyed at any time during the run of an application. The runtime behavior of CoLa processes is not predetermined: once started, they may compute forever or block immediately on a coordination feature. This implies that the underlying process model of the software platform on which CoLa is built should support preemptive, fine-grained process scheduling. This was the main motivation for the development of PT-PVM. (PT-PVM stands for Preemptive Threads and PVM; it is an extended version of the platform described in [7].) Another important point is portability and heterogeneity: CoLa should run on a variety of hardware platforms, which demands a portable message passing platform independent of the actual hardware. We therefore chose PVM as the machine-independent basic communication platform for the implementation of CoLa and hence of PT-PVM.

Although PT-PVM has been developed to implement the coordination language CoLa, its programming model is general enough to be used in other application areas such as document recognition [8]. However, we will explain the different properties of PT-PVM in the context of its primary application area, the implementation of multithreaded coordination languages, with CoLa as our main example.

The rest of this paper is organized as follows: Section 2 introduces the subject of coordination in general, and Section 3 presents the coordination features of PT-PVM in particular.
Section 4 explains the system model the platform is built on, followed by Section 5, which briefly sketches a graphical user interface used for basic visualization purposes. In Section 6 we focus on performance issues of the platform. Section 7 finally describes an application built on top of PT-PVM, concludes the paper, and gives an outlook on our plans for the future.

2. Coordination Models and Languages

A programming model for distributed applications demands that coordination be treated orthogonally to computation, which means that one cannot use a coordination language without a computation language [2]. In general, coordination theory deals with integral problems such as the delegation of requests (known from the client/server programming model), the specification of the temporal behavior of processes, and the realization of interactions as the abstract description of causal dependencies between processes. To fulfill these typical coordination tasks, a general coordination model has to be composed of four components: (1) coordination entities, the processes or agents running in parallel which are the subject of coordination; (2) a coordination medium, the actual space where coordination takes place; (3) coordination laws to specify interdependencies between the active entities [9]; and finally (4) a set of coordination tools.

A coordinated distributed application requires (at least) that processes: have knowledge of their communication partners; can establish communication channels to other processes; can select a protocol and language to realize a dialog; and have the possibility to join or leave a running parallel application. It is one of the main tasks of the coordination model to address these problems. To achieve coordination, some sort of communication has to be established between the active entities. One should distinguish two different communication metaphors: coupled communication and uncoupled communication.

Coupled Communication versus Uncoupled Communication

Using communication as a method to achieve coordination implies some sort of exchange of information. This exchange can either be realized by means of specific message passing routines (coupled communication), as in [10, 11], or by the introduction of a blackboard system (uncoupled communication), where information is stored and hence made visible not only to a specific destination but also to other entities. Information becomes "public", compared to message passing systems (where it is "private"), and all active entities which have access to the blackboard system can eventually read it. Uncoupled communication, or generative communication [12], has several advantages over message passing systems. First of all, the information becomes a first-class entity and therefore an object to reason about. Furthermore, information can become persistent, allowing several agents to consume it even if the originator of the messages no longer exists. Figure 1 illustrates pure message passing systems and generative communication; the arrows denote the possible interactions, ranging from point-to-point interaction over a centralized mail delivery system to totally uncoupled communication using a Coordination-Space (C-Space) [13] similar to Linda's tuple space.

[Figure 1: Coupled communication and uncoupled communication. From left to right: coupled communication (sender must know receiver); anonymous communication via a central mail server (blackboard model, centralized C-Space); uncoupled communication, where anonymous senders and receivers interact with several C-Spaces.]
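To make the contrast concrete, the following minimal sketch expresses uncoupled communication with the Name Space primitives csBasePut() and csBaseGet(), whose call signatures follow the example code in Figures 7 and 8 (the name "temperature" and the two functions wrapping the calls are purely illustrative):

/* Producer: post a named datum into the coordination space;
 * the producer may terminate afterwards. */
void producer(void)
{
    int value = 42;
    csBasePut("temperature", &value, sizeof(int));
}

/* Consumer: read the datum by name, at any later time, without
 * knowing who produced it (the read is non-destructive). */
void consumer(void)
{
    int v;
    csBaseGet("temperature", &v, 0);
}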
3. Coordination and PT-PVM

To meet the requirements imposed by the needs of parallel languages, PT-PVM includes several high-level abstractions which allow the implementation of new parallel languages. These are: (1) a global Name Space for processes, services, and messages; (2) an instance called "Post" which is responsible for high-level message passing; and finally (3) abstractions to support the client/server programming model.

3.1. Global Name Space

We have implemented a global name service in PT-PVM. New primitives are provided to register and unregister names in this name space. Names are used to identify processes, messages, or services.

Processes

Processes can register and unregister to and from the global name space of PT-PVM. By registering, the user can specify a symbolic name by which the process will be identified in the future. Names can be transferred to other processes, so that they can communicate with formerly unknown processes. Registration is dynamic, allowing the configuration of the parallel machine represented in the Name Space to change at runtime. Using symbolic names has the advantage that process identification can be done independently of the underlying process management (e.g. POSIX threads) and communication platform (e.g. UNIX sockets).

Messages

The extension for messages is twofold: (1) messages can be identified by a symbolic name, and (2) the semantics of messages is extended towards uncoupled communication by allowing new message types such as persistent messages. By giving messages a name we considerably extend the message semantics known from classical message passing systems like PVM. However, there is a problem with the naming of messages as far as the reuse of message names is concerned: if a message name can only be reused after the destination has performed the corresponding receive, a sender process could block in a loop when it uses the same message name for the second send. There are two solutions to this problem: either the sender uses a fresh name for each message sent in the loop, or we define stream semantics on named messages. Streams have the advantage of being a well-known programming model; processes receive the messages of a stream in FIFO order.

Services

Adopting a client/server model for PT-PVM allows us to combine two widely used coordination paradigms, message passing and client/server programming, in one programming tool. We do not present a classical synchronous remote procedure call interface as in [14], but a combination of service invocation and service request calls combined with classical message passing. Again, we use the global Name Space as a repository for symbolic service identifiers: processes providing a service register a service name in this name space, and processes requesting a service can obtain it by addressing a service name and process name in the Name Space.
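A minimal sketch of how a thread might register itself and then use stream semantics on a named message; csBaseRegister(), csBasePut(), and the EOF_SERVICE flag follow the usage in Figure 7, while the assumption that reusing a message name yields a FIFO stream (rather than a fresh datum) reflects the stream semantics described above:

void producer_thread(struct _proc_env *p, struct _cs_corr self)
{
    int k;

    /* Register this thread under a symbolic name in the
     * global Name Space. */
    csBaseRegister("producer", self, EOF_SERVICE);

    /* Stream semantics on a named message: reusing the name
     * "results" in a loop yields a FIFO stream instead of
     * blocking the sender until each message is consumed. */
    for (k = 0; k < 100; k++)
        csBasePut("results", &k, sizeof(int));
}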
3.2. Advanced Message Passing: Postal Mail Delivery

Using a postal mail delivery system [15] for message passing allows for several extensions with regard to advanced message passing. Messages can now be identified, like processes or services, by a logical name. Moreover, the sending of messages is no longer restricted to point-to-point communication, but supports several modes:

Persistent: the message is kept by the mail delivery server and can be accessed by anybody who knows its symbolic name;
Hold: the message is held for a certain amount of time, after which it expires and is removed from the Name Space;
Instant: the message is sent to its destination and not kept;
Multiple-Copies: multiple copies of the message are kept, allowing several (but determined) processes to receive it.

Combinations of these modes are allowed, e.g. Multiple-Copies and Hold. Processes can retrieve information by specifying the sender and the message name, only the message name, only the sender name, or a wild card.

Besides the extensions with regard to message passing, a postal mail delivery system has several other advantages: it supports migration of processes, because it maps logical, site-independent process names to physical, site-dependent process identifiers; it hides the physical site of a process from the user; and it allows for highly portable and scalable algorithms, since they do not depend on particular physical communication structures of the target machine. Note that the Post can be seen as a system server which offers system services to send and receive messages using a specialized protocol; for example, the Post could provide services for reliable communication or atomic broadcasts.

To summarize, the extended control instance of PT-PVM consists of an advanced message passing server which allows messages to be treated in a high-level way. In particular, the possibility to keep messages in the server bridges the gap between coupled and uncoupled communication and allows uncoupled communication in both time and space, as well as point-to-point communication.

3.3. Services

Client/server programming is made possible by introducing two new primitives, csBaseSvcBind() and csBaseSvcReq(). The former is used on the server side to bind a service to a service name, whereas the latter is used on the client side to invoke the service. In order to request a service, the user calls csBaseSvcReq() with the specification of the requested service, e.g. parameters and service identification. The call is non-blocking, so the process can continue to compute while the service provider is working. To finally receive the result, a receive call is used whose message name corresponds to the service name of the requested service. Note the advantage of the multithreaded approach: a heavy-weight UNIX process can now be both client and server for a specific service. Also, in a running PT-PVM application it is not determined where (i.e. on which process environment, see Section 4) a service will be invoked, because the actual invocation of the service is done transparently to the user on some PE. This allows for sophisticated load balancing schemes that distribute service load over available idle PEs. A similar approach with promising results has recently been developed on top of PVM, with the disadvantage that it does not support threaded applications [16].
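As an illustration of the intended usage; the paper does not give the exact signatures of the two primitives, so the argument lists below, the handler name, and the use of csBaseGet() as the result-collecting receive call are assumptions:

/* Server side: bind a handler to the symbolic service name
 * "invert" (handler type and signature assumed). */
csBaseSvcBind("invert", invert_handler);

/* Client side: non-blocking service request (signature assumed);
 * the client keeps computing while the service provider works. */
csBaseSvcReq("invert", &matrix, sizeof(matrix));
do_other_work();

/* The result arrives as a message whose name corresponds to the
 * service name of the requested service. */
csBaseGet("invert", &inverse, 0);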
4. System Model

This section describes the basic programming abstractions which are used to build PT-PVM, followed by an overview of its implementation (see Figures 2 and 3).

[Figure 2: Overview of the programming abstractions. Each CPU hosts a PE (a heavy-weight process) running a message router thread, which handles messages for this PE, and a number of PT-PVM user threads (light-weight processes).]

4.1. Programming Abstractions

The programming model of PT-PVM is based on three abstractions which follow directly from the implementation of the different thread packages available for UNIX platforms. These abstractions determine where on a cluster of workstations a PT-PVM thread can run (see Figure 2): the parallel virtual machine, controlled by PVM [11], consisting of several hosts; one or more process environments (PEs) running on each host, where a PE serves as the "heavy-weight" host of a PT-PVM thread (similar to the "pods" of [17]); and one or more PT-PVM threads running on each PE, each denoted by a tid (thread identifier). A PT-PVM thread is therefore specified by a triple (host, PE, tid). The user of PT-PVM has to specify, at the initialization phase of the environment, how many hosts should be used and how many PEs should be invoked on these hosts; the distribution of the threads onto the (host, PE) space is done by the system or by the programmer. (Strictly speaking, a PE is a thread environment; but because we developed this platform for the implementation of the coordination language CoLa, where an active unit is called a process, we keep the name process environment.)

In order to extend the computing model of PVM to the thread level, we borrow the concept of Correspondents from CoLa [18]. A Correspondent is a high-level identification concept for the communication of processes; in this concrete implementation it consists of a tuple (PE, tid). We distinguish two types of Correspondents, depending on the invocation strategy of the new thread: a local Correspondent for local invocation (on the PE of the creator) and a non-local Correspondent for distributed invocation.

4.2. Thread Handling

Thread management is somewhat different from that described in [19, 17]. Threads have to be declared using a PT-PVM primitive; we do not use an export mechanism as in [17], but can spawn a thread directly once it is declared. For a detailed description of the export mechanism we refer to [17]; for possible improvements and alternatives see [20]. By default, each newly created thread runs in a thread environment which is used to store information about its creator and local parameters passed during the creation of this thread. The user can specify whether a newly created thread should run on the same PE as the invoker (local invocation) or whether it should be distributed onto the available PEs (distributed invocation). The creation of massively concurrent threads can also be seen as a software implementation of active messages [21].

First Thread

In order to start a distributed PT-PVM application, the user has to specify one special thread in the user's first PE. This thread is called m_main(), following the convention of ordinary C programs, where main() is the first function called by the C runtime system; here, m_main() is the first thread invoked by the PT-PVM runtime system.

Parameter Passing

PT-PVM allows parameters to be passed directly to a thread. The most convenient way to do so is to use predefined macros which allow standard C-type variables to be passed to a thread (see [22] for details).

Thread Environment

Each newly created thread runs in a private thread environment. This environment is used to store information about the creator of this thread (denoted by father), its own identity (self), and local parameters, as well as some administrative information needed by the PT-PVM runtime system. It also serves as a container for information about the current Point of View and Range of Vision of a CoLa process.
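Putting these pieces together, a minimal program skeleton looks roughly as follows. The calls mirror the example in Figure 7; csSpawn_1() is one of the predefined parameter-passing macros (here passing one int), and the "DISTRIBUTED" invocation string is taken from that figure:

csThreadDecl(worker);            /* declare the thread before spawning it */

void m_main(struct _proc_env *p) /* first thread, invoked by the runtime */
{
    struct _cs_corr c;           /* Correspondent of the new thread */
    int id = 0;

    /* Distributed invocation: the runtime picks a PE for the new
     * thread and returns its Correspondent in 'c'. */
    csSpawn_1(worker, &id, CS_THREAD, "DISTRIBUTED", &c, p);
}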
Mixed Approach

PT-PVM not only allows light-weight threads to represent a CoLa process, but also heavy-weight UNIX processes. Light-weight and heavy-weight processes can cooperate transparently using the standard PT-PVM primitives. A newly created heavy-weight UNIX process is not a PE, which means that it cannot host light-weight threads. Arbitrary UNIX processes can be invoked from within the PT-PVM environment; a user-specified interpreter thread handles the output of such a UNIX process, and a Correspondent (returned as a result of the creation of the UNIX process) is used to send messages to the process. For performance reasons the interpreter thread could run on an idle PE.

4.3. Message Handling

In this section we give an overview of the message passing facilities provided by PT-PVM. A detailed description of the application user interface can be found in [22].

Receiving Messages

Threads communicate using several PT-PVM primitives, analogous to the original PVM communication primitives. To buffer messages for the threads on a PE we use a system communication thread (the message router thread), which waits for incoming messages designated for this PE. Once it has received a message, it either forwards it to the appropriate user thread (if that thread is already waiting for it) or saves it in a message queue (in LIFO or FIFO order). This buffer management scheme has the advantage that it is easy to implement and that it fits into the general philosophy of the PVM and PT-PVM system model.

Sending Messages

Because the standard distribution of PVM (version 3.3.7) is not thread safe (not reentrant), and because its computing model is restricted to heavy-weight UNIX processes, one has to distinguish two cases for the sending of messages: communication with (1) a non-local Correspondent or (2) a local Correspondent. To send a message to a non-local Correspondent, the sender cannot call the appropriate PVM primitive directly, but invokes a PT-PVM send primitive which consists of an asynchronous RPC call carrying the contents of the message. A second heavy-weight UNIX process serves as a "send-server" for the system and finally calls the PVM primitive. This guarantees that on each PE only one thread uses PVM primitives at a time (the message router thread), which is necessary because the PVM library is not reentrant. Communication with a local Correspondent is straightforward (the corresponding thread runs on the same PE, so both have access to a common address space) and is realized using shared variables. This asymmetric approach, using a second process only for the sending of messages, reduces the overall overhead of the system: only on the sender side is an additional communication and context switch necessary to finally send a message, whereas on the receiver side messages arrive directly at the PE (see Figure 3).

[Figure 3: Overview of the runtime system. On each CPU, a PE (heavy-weight process) hosts PT-PVM user threads (light-weight processes) and a message router handling messages for this PE; message sends travel via an asynchronous RPC to an "RPC-sender" process and the local PVM daemon (PVMD), while message receives arrive directly at the destination PE.]

Interaction with the Name Space

Each PE has its own local Name Space. As long as threads put information into their local space, no additional communication is necessary. However, to keep a consistent view of a distributed application consisting of several PEs, information placed into a local Name Space is broadcast to all other PEs and cached locally. The distribution of this information to the PEs is part of the PT-PVM runtime system and is therefore transparent to the user.
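The following sketch outlines the structure of the message router thread as described above. The PVM receive call is the real pvm_recv() interface, but the message and thread types and the bookkeeping helpers are hypothetical stand-ins for the internal implementation:

#include <pvm3.h>

struct msg;     /* internal message representation (hypothetical) */
struct thread;  /* internal thread descriptor (hypothetical) */
/* Internal bookkeeping helpers, assumed to exist in the runtime: */
extern struct msg    *unpack(int bufid);
extern struct thread *waiting_receiver(struct msg *m);
extern void           hand_over(struct thread *t, struct msg *m);
extern void           enqueue(struct msg *m);

/* One router instance runs per PE; it is the only thread on the PE
 * that calls the non-reentrant PVM receive primitives. */
void message_router(void)
{
    for (;;) {
        int bufid = pvm_recv(-1, -1);           /* any sender, any tag */
        struct msg *m = unpack(bufid);
        struct thread *t = waiting_receiver(m); /* receiver blocked?   */
        if (t != NULL)
            hand_over(t, m);    /* forward directly to the user thread */
        else
            enqueue(m);         /* buffer in the PE's message queue    */
    }
}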
5. Graphical User Interface

The PT-PVM programming environment includes a graphical user interface which allows users to monitor the message traffic between all threads running in the PT-PVM computation space.

5.1. Principles

Colors are used to indicate whether a thread is blocked (waiting for a message: red), performing some computation (green), or has not used the message passing facilities at all (a color other than red or green). Topology information can also be displayed if the communication behavior of the processes is predefined (as in the example of Figures 4 and 5, where the interface is shown while running a distributed leader election algorithm). At a glance, the following information is provided:
- the location of a thread, i.e. its (host, PE, tid) triple;
- the number of messages sent and received, in total and per thread;
- the total number of hosts and PEs involved in the application;
- the total number of threads running;
- the status of each thread (blocked/running);
- the type of each process (light weight, heavy weight);
- topology information about the communication structure.

[Figure 4: Creation of the ring; thin circles represent light-weight processes.]
[Figure 5: After creation of the ring; fat circles denote heavy-weight processes.]

5.2. Centralized Approach

In order to display global information, we use a centralized approach. A designated PE serves as the graphical user interface server and communicates via standard UNIX pipes with a display server (a Tcl/Tk application). Processes which want to display information on the display server pass it to the graphical user interface server using standard PT-PVM communication primitives; this special PE then forwards the request to the display server.

6. Performance Results

Although performance was not our primary goal in the development of PT-PVM, we carried out two experiments: the first compares message round-trip times using the Ping-Pong program, and the second compares the performance of two different implementations of the Sieve of Eratosthenes algorithm.

6.1. Point-to-Point Communication

In the first experiment we compared two versions of the Ping-Pong program to obtain an idea of how much overhead is incurred by PT-PVM as compared to PVM. Figure 6 gives an overview of the results obtained for this program: it shows the difference in message round-trip times obtained using two PVM processes and two PT-PVM threads running on two PEs on different hosts. As expected, we observe a considerable overhead for small messages (up to 40 %), which drops to about 20 % for relatively big messages. (Sometimes PT-PVM is even faster, but this is related to the process scheduling scheme of the UNIX platform and to implementation aspects of the TCP/IP stack. It is also interesting to note that, comparing two runs of the same pure PVM Ping-Pong program, we observed up to 50 % differences in message round-trip times.) The values were obtained using the PvmDataRaw option and the PvmRouteDirect routing policy.

[Figure 6: Performance results. Overhead of PT-PVM over PVM in percent (0-100 %) as a function of message size (0-30000 bytes).]

Table 1 compares the time to create distributed processes/threads using either PVM directly or PT-PVM; for PT-PVM we consider both local invocation and distributed invocation. Clearly, the fast invocation of fine-grained processes (threads) is one of the advantages of PT-PVM. All experiments were carried out on a cluster of Sun SparcStation-5 machines running Solaris 2.4 and Solaris threads [24].

Threads   PVM    PT-PVM (dist.)   PT-PVM (local)
  1         58         18              0.8
 32       5286        496             48
128          -       2044            416

Table 1: Spawning of threads (msec)
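For reference, the PVM half of the Ping-Pong measurement can be sketched as follows; the PT-PVM variant replaces the pvm_* calls with the corresponding thread-level primitives. The timing code and the message tag are illustrative, and direct routing is assumed to have been enabled once at startup via pvm_setopt(PvmRoute, PvmRouteDirect):

#include <pvm3.h>
#include <sys/time.h>

/* Measure one round trip of 'len' bytes with partner task 'partner',
 * using the PvmDataRaw encoding as for Figure 6. */
double round_trip_usec(int partner, char *buf, int len)
{
    struct timeval t0, t1;

    gettimeofday(&t0, NULL);
    pvm_initsend(PvmDataRaw);
    pvm_pkbyte(buf, len, 1);
    pvm_send(partner, 0);      /* ping */
    pvm_recv(partner, 0);      /* pong */
    pvm_upkbyte(buf, len, 1);
    gettimeofday(&t1, NULL);

    return (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
}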
6.2. Glenda and PT-PVM

This experiment was carried out to compare PT-PVM with Glenda [23], a programming environment offering a Linda-like programming model. We implemented a Glenda version and a PT-PVM version of a data-parallel Sieve of Eratosthenes algorithm (a slightly modified version of an implementation published in [25]) which computes the number of primes in the interval from 1 to MAX (see Figures 7 and 8). For this experiment we used 4 SparcStation-20 machines, each of them hosting a PE in the PT-PVM case and a worker process in the Glenda case, respectively.

The algorithm is a classical master/worker application: one master process spawns N worker processes which perform the actual computation. The search space is divided into N stripes of memory; each worker works only on MAX/N of the total search space. A special worker (#0) generates prime candidates, which it puts into the Name Space of PT-PVM (using csBasePut()). Each worker other than #0 gets the candidate out of the Name Space (using csBaseGet()), strikes out all multiples of the candidate in its portion of the search space, and either waits for another candidate or computes the number of primes found in its local stripe of memory and posts it into the Name Space using csBasePut(). The master process finally collects the partial sums to compute the result. There are three important points to note here: (1) the call to csBaseGet() is non-destructive for each worker process, so the prime candidate is read not just by one worker process but by all of them; (2) worker #0 is not addressing a particular destination but an unknown set of destinations (uncoupled communication in space); and (3) the produced candidates remain persistent even after worker #0 may already have terminated (uncoupled communication in time).

Table 2 gives an overview of the achieved performance for the Glenda version and the PT-PVM version of the algorithm described above.

Processes/Threads   Glenda   PT-PVM
  4                  13.6     10.4
  8                  24.7     11.0
 32                  75.4     13.2
256                     -     25.2

Table 2: PT-PVM and Glenda (secs)

Clearly PT-PVM outperforms Glenda because of the distributed implementation of its Name Space. Glenda uses a centralized tuple server approach, which induces a lot of communication between the worker processes: each time a worker process gets a candidate out of Glenda's tuple space, it has to contact the tuple server, whereas in PT-PVM each PE has its own local Name Space, so locally cached information can be retrieved directly, without any additional communication.

7. Conclusion and Outlook

This paper gave an overview of PT-PVM, a communication platform suitable for multithreaded fine-grained applications. PT-PVM is unique in that it offers a portable multithreaded programming environment with an advanced programming model.
The facilities provided by PT-PVM include a Name Space for processes, services, and messages, as well as a multithreaded client/server model. Heterogeneous communication between heavy-weight and light-weight processes allows for open distributed applications. Portability is guaranteed through the use of two widely accepted standards for message passing and thread handling, namely PVM and POSIX threads [26].

csThreadDecl(worker);

/* Master process starts here */
void m_main(struct _proc_env *p)
{
  int i, sum, partsum;
  char p_name[32], sum_name[32];
  struct _cs_corr new;

  for (i = 0; i < NUMPROCS; i++) {
    csSpawn_1(worker, &i, CS_THREAD, "DISTRIBUTED", &new, p);
    sprintf(p_name, "worker %d", i);
    csBaseRegister(p_name, new, EOF_SERVICE);
  }
  sum = 0;
  for (i = 0; i < NUMPROCS; i++) {
    sprintf(sum_name, "localsum %d", i);
    csBaseGet(sum_name, &partsum, 0);
    sum += partsum;
  }
  printf("[0, %d] total primes: %d\n", N, sum);
  csBaseShutdown();
}

Figure 7: Sieve of Eratosthenes programmed using PT-PVM: master process

#define N 10000000
#define NUMPROCS 32
#define SIZE N/NUMPROCS

void worker(struct _proc_env *p, int id)
{
  int localsum, i, candidate, ci, lowvalue;
  char prime[SIZE], l_sum_name[32], new_candidate[32];

  for (i = 0; i < SIZE; i++)
    prime[i] = TRUE;
  /* Identification of worker #0 */
  if (!id)
    prime[0] = prime[1] = FALSE;
  lowvalue = SIZE * id;
  candidate = 2;
  ci = 0;
  while (candidate * candidate < N) {
    if (!id)                         /* worker #0 */
      i = candidate + candidate;
    else {
      i = lowvalue % candidate;
      if (i)
        i = candidate - i;
    }
    for (; i < SIZE; i += candidate)
      prime[i] = FALSE;
    /* Generate a new name for the Name Space */
    sprintf(new_candidate, "candidate %d", ci);
    if (!id) {
      /* Worker #0: search for a new candidate */
      while (!prime[++candidate])
        ;
      /* Post the new candidate into the Name Space */
      csBasePut(new_candidate, &candidate, sizeof(int));
    } else {
      /* Get the new candidate out of the Name Space */
      csBaseGet(new_candidate, &candidate, 0);
    }
    ci++;
  }
  localsum = 0;
  for (i = 0; i < SIZE; i++)
    localsum += prime[i];
  /* Name your local sum and post it */
  sprintf(l_sum_name, "localsum %d", id);
  csBasePut(l_sum_name, &localsum, sizeof(int));
  csBaseThreadExit(p);
}

Figure 8: Sieve of Eratosthenes programmed using PT-PVM: worker process

The Software and Document Engineering group at the University of Fribourg uses PT-PVM as its basic process management and communication platform for the development of a structured document recognition system [27]. The system emphasizes user intervention and is based on a cooperative distributed architecture. In this context, PT-PVM is used to implement three different kinds of software agents: (1) data managers: each part of a document (pages, blocks, lines) is managed by a dedicated thread whose behavior "drives" the recognition; (2) specialists: each domain expertise (OCR, segmentation, font recognition) is represented by a server thread, which is invoked by data managers and which can master a set of workers (as separate UNIX processes, or on remote PEs); and (3) "twins" as the counterparts of the computational agents, which handle the interaction with the user.

The PT-PVM system is fully operational, and a first prototype has been implemented on a cluster of SUN workstations running Solaris 2.5 and POSIX threads [26]. The prototype is available on ftp-iiuf.unifr.ch. There are still numerous open questions to improve the system: first of all, make PVM itself thread safe; improve the mapping of threads onto the (host, PE) space.
This could be done using a sophisticated scheduling scheme such as work stealing [28]. Further items are: improve the message buffering and forwarding mechanisms of the router thread running on each PE; dynamically create new PEs; support thread migration and load balancing; and receive messages via interrupts to support active messages. Our plans for the future include a port of the system onto the Parsytec PowerExplorer to further develop our CoLa prototype. We expect a significant performance gain as soon as the thread-safe version of PVM is available, which will then allow us to remove the "send-server" and hence save one process context switch.

References

[1] T. W. Malone and K. Crowston. The Interdisciplinary Study of Coordination. ACM Computing Surveys, 26(1):87–119, March 1994.
[2] N. Carriero and D. Gelernter. Coordination Languages and Their Significance. Communications of the ACM, 35(2):97–107, February 1992.
[3] N. Carriero and D. Gelernter. Linda in Context. Communications of the ACM, 32(4):444–458, 1989.
[4] J.-M. Andreoli, P. Ciancarini, and R. Pareschi. Interaction Abstract Machines. In G. Agha, P. Wegner, and A. Yonezawa, editors, Research Directions in Concurrent Object-Oriented Programming. MIT Press, Cambridge, Mass., 1993.
[5] J.-P. Banâtre and D. Le Métayer. The Gamma Model and its Discipline of Programming. Science of Computer Programming, 15:55–77, 1990.
[6] B. Hirsbrunner, M. Aguilar, and O. Krone. CoLa: A Coordination Language for Massive Parallelism. In Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC), Los Angeles, California, August 14–17, 1994.
[7] O. Krone, M. Aguilar, and B. Hirsbrunner. PT-PVM: Using PVM in a multi-threaded environment. In 2nd PVM European Users' Group Meeting, Lyon, September 13–15, 1995.
[8] Frédéric Bapst, Rolf Brugger, Abdelwahab Zramdini, and Rolf Ingold. L'intégration des données dans un système de reconnaissance de documents assistée. In CNED'96, Nantes, France, 1996. To appear.
[9] Thilo Kielmann and Guido Wirtz. Coordination Requirements for Open Distributed Systems. In PARCO'95. Elsevier, 1996.
[10] C.A.R. Hoare. Communicating Sequential Processes. Communications of the ACM, 21(8), August 1978.
[11] V.S. Sunderam. PVM: A framework for parallel distributed computing. Concurrency: Practice and Experience, 2(4):315–339, December 1990.
[12] David Gelernter. Generative Communication in Linda. ACM Transactions on Programming Languages and Systems, 7(1):80–112, 1985.
[13] O. Krone and M. Aguilar. Bridging the Gap: A Generic Distributed Hierarchical Coordination Model for Massively Parallel Systems. In Proceedings of the '95 SIPAR-Workshop on Parallel and Distributed Computing, Biel, Switzerland, October 1995.
[14] M. Bever, K. Geihs, L. Heuser, M. Mühlhäuser, and A. Schill. Distributed Systems, OSF DCE and Beyond. In A. Schill, editor, International DCE Workshop, number 731 in LNCS, Karlsruhe, October 7–8, 1993. Springer Verlag.
[15] Marc Aguilar and Béat Hirsbrunner. Post: A New Postal Mail Delivery Model. In K.M. Decker and R.M. Rehmann, editors, Programming Environments for Massively Parallel Distributed Systems, pages 231–237. IFIP WG 10.3, Birkhäuser, April 1994.
[16] A. T. Krantz, A. Zadroga, S. E. Chodrow, and V. S. Sunderam. An RPC Facility for PVM. In HPCN Europe '96, LNCS. Springer Verlag, 1996. Accepted for publication.
[17] A. Ferrari and V.S. Sunderam. TPVM: Distributed Concurrent Computing with Lightweight Processes. Technical report, University of Virginia, 1995.
[18] O. Krone, M. Aguilar, and C. Renevey.
The CoLa Approach: a Coordination Language for Massively Parallel Systems. In Marc Aguilar, editor, Proceedings of the '94 SIPAR-Workshop on Parallel and Distributed Computing, pages 29–33. University of Fribourg, October 1994.
[19] M. Christaller. Athapascan-0b for PVM: Adding threads to PVM. In European PVM Users' Group Meeting, Rome, October 9–11, 1994.
[20] Roman Marxer. Ein Kommunikations- und Prozessmanagement für die Koordinationssprache CoLa. Master's thesis, University of Fribourg, 1995.
[21] T. von Eicken, D. E. Culler, S. C. Goldstein, and K. E. Schauser. Active Messages: a Mechanism for Integrated Communication and Computation. Technical Report 92-675, University of California at Berkeley, March 1992.
[22] O. Krone and B. Hirsbrunner. PT-PVM: A Communication Platform for the Coordination Language CoLa. Technical report, University of Fribourg, 1995. Available via ftp://ftp-iiuf.unifr.ch/pub/pai/pt-pvm/ptpvm.ps.
[23] Ray Seyfarth, Suma Arumugham, and Jerry Bickham. Glenda 1.0, 1994. Available via ftp://seabass.st.usm.edu/pub/glenda.tar.Z.
[24] Sun Microsystems, Mountain View, California. SunOS 5.3 Guide to Multithread Programming, November 1993.
[25] N. Carriero and D. Gelernter. Case studies in asynchronous data parallelism. International Journal of Parallel Programming, 22(2):129–149, 1994.
[26] POSIX System Application Program Interface: Threads Extension [C Language], POSIX 1003.4a Draft 8. Available from the IEEE Standards Department.
[27] Frédéric Bapst, Rolf Brugger, Abdelwahab Zramdini, and Rolf Ingold. Integrated multi-agent architecture for an assisted document recognition system. In DAS'96, 1996. Submitted.
[28] R.D. Blumofe and C.E. Leiserson. Scheduling Multithreaded Computations by Work Stealing. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science, pages 356–368, Santa Fe, New Mexico, November 1994.