A Communication Library to Support Concurrent Programming Courses*

Steve Carr, Changpeng Fang, Tim Jozwowski, Jean Mayo and Ching-Kuang Shene
Department of Computer Science
Michigan Technological University
Houghton, MI 49931
Email: {carr, cfang, trjozwow, jmayo, shene}@mtu.edu

*This work supported in part by the National Science Foundation under grants DUE-9752244 and DUE-9952509. The fourth author was also supported by National Science Foundation CAREER award CCR-9984862.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGCSE '02, February 27 - March 3, 2002, Covington, Kentucky, USA. Copyright 2002 ACM 1-58113-473-8/02/0002...$5.00.

Abstract

A number of communication libraries have been written to support concurrent programming. For a variety of reasons, these libraries generally are not well-suited for use in undergraduate courses. We have written a communication library uniquely tailored to an academic environment. The library provides two levels of communication abstraction (topology and channel) and supports communication among threads, processes on the same machine, and processes on different machines, via a unified interface. The routines facilitate controlled message loss along channels and can be integrated with an existing graphical tool that supports visualization of the communication that occurs. An editor has been developed for automatic code generation for arbitrary topologies via a graphical interface. All these tools run over Solaris, Linux, and Windows.

1 Motivation

Concurrent programming is increasingly fundamental to undergraduate Computer Science education [1]. Correspondingly, courses dedicated to, or containing a component in, this area are moving ever earlier into the undergraduate curriculum. Yet this remains a very challenging subject to teach. Aside from the difficulty of the material, available tools generally are not tailored to an academic environment.

In our experience, a significant hurdle to student understanding, especially among lower-level students, is the complexity and diversity of communication interfaces. Students likely learn separate interfaces for synchronized communication among threads (lightweight processes), processes (heavyweight processes) on the same machine, and processes on different machines. Another difficulty faced in teaching networked communication in particular is the introduction of message loss in some controlled fashion.

In order to address these issues, we have developed a library to support communication among threads, processes on the same machine, or processes on different machines, via a unified interface. These routines implement an abstraction of the primary overarching characteristics of IPC (interprocess communication). They facilitate the study of concurrent application design and can serve as a starting point for study of the implementation of IPC within a particular paradigm: threads, processes on the same machine, or processes on different machines. The library abstracts the passing of messages at two levels: topology (the highest level) and channel. Additionally, the routines provide a mechanism for introducing message loss in a controlled fashion. These routines can be integrated into an existing visualization system that depicts the communication that takes place. A topology editor has been developed that facilitates automated generation of code for arbitrary topologies using a graphical interface. The routines, visualization system, and topology editor run over Solaris, Linux, and Windows.

2 Related Work

Arnow developed the XDP message passing library for teaching distributed programming [2]. The goals of this library were more narrow than our goals in developing the tools described in this paper. The XDP library abstracts away some of the complexity of the BSD socket interface, in order to reduce the course time required to cover a network programming interface while requiring that students still address fundamental problems such as buffering, race conditions, synchronization, and reliability. The library does not attempt to provide multiple levels of abstraction, controlled message loss, or integrated visualization support.

    char msg[]="False pearls before real swine";
    channel1 = new AsynOnetoOneChannel(1, myID,dropSome(rand()));
    channel1.send((void *)msg,sizeof(msg));

Other, commercially used message passing libraries are available, e.g. PVM and MPI.
MPI is perhaps the most widely used, and structuring our message passing library around the MPI interface (adding support for visualization and maintenance of vector time over the MPI primitives) was considered. However, at the time our development began, publicly available implementations of MPI, like XDP, required that the same code be executed for each process comprising an application. This made it unsuitable for distributed or threaded application development. Additionally, we did not want to make installation and maintenance of MPI or PVM a requirement for the use of our system.

(a) Sender

    char msg[MSGLEN];
    channel0 = new AsynOnetoOneChannel(0,myID,0.5);
    channel0.recv((void *)msg,sizeof(msg));

(b) Receiver

Figure 1: Message Transmission along Asynchronous Channel

McDonald and Kazemi have extended the PVM and MPI message passing environment to support virtual process topologies [10]. Several core functions have also been developed to enable a parallel program to request use of a standard process topology, to spawn and instantiate all tasks participating in a topology, and to specify transmission, reception, and synchronization in terms of logical communication patterns, eliminating, for example, the need for students to compute process identifiers. They also provide a graphical interface for specifying, verifying, and viewing topologies. Hence, their tools are similar to what is achieved by our topology classes and editor. Our tools additionally provide controlled message loss and execution visualization.

The second class implements an asynchronous one-to-one channel.
Along these channels, sends are nonblocking; two receive primitives are provided, one blocking and one non-blocking [5]. Message loss can be introduced along any asynchronous one-to-one channel between processes. The loss can be specified either as a value between zero and one or via an integer function, with a single integer input, at the time the channel is created. When the loss is specified as a value between zero and one, the communication layer drops messages according to a uniform distribution, immediately prior to the point at which they would be sent along a totally reliable channel. When loss is specified as a function, the function is evaluated at this same point immediately prior to the send operation. If the return value from the user-supplied function is greater than or equal to one, the message is sent; otherwise the message is dropped. (Hence, the loss function is directional.) Figure 1 depicts code that creates a channel between the processes with identifiers zero and one; process zero then sends a message to process one. Messages sent by process zero are dropped according to the function dropSome() (which is assumed to be defined elsewhere). Half the messages sent by process one (to process zero) are dropped according to a uniform distribution.

3 Communication Library

Communication is abstracted at two levels: channel and topology. The two abstractions are described, in turn, below.

Channel

The goal of the channel classes is to provide an abstraction of communication that ties closely to that encountered in the literature. Three channel types have been implemented. The first class is a synchronous one-to-one channel. Along this channel, both send and receive are blocking [5].
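The two ways of specifying message loss on an asynchronous channel, described above, can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: the class name LossPolicy, the method shouldSend(), the helper dropEveryThird(), and the choice of a message sequence number as the integer passed to the loss function are all hypothetical, since the text does not say which integer the library supplies.

```cpp
#include <cassert>
#include <functional>
#include <random>
#include <utility>

// Hypothetical helper, not part of the library described in the paper.
// It decides, immediately before a send, whether a message is delivered.
class LossPolicy {
public:
    // Loss given as a value in [0, 1]: each message is dropped
    // independently with that probability (uniform distribution).
    explicit LossPolicy(double dropProbability, unsigned seed = 1)
        : dropProbability_(dropProbability), rng_(seed) {
        assert(dropProbability >= 0.0 && dropProbability <= 1.0);
    }

    // Loss given as an integer function with a single integer input:
    // the message is sent when the function returns >= 1, dropped otherwise.
    explicit LossPolicy(std::function<int(int)> decide)
        : decide_(std::move(decide)) {}

    // Returns true when the message should be sent.
    // Assumption: the integer handed to the loss function is a sequence number.
    bool shouldSend(int sequenceNumber) {
        if (decide_) return decide_(sequenceNumber) >= 1;
        std::uniform_real_distribution<double> uniform(0.0, 1.0);
        return uniform(rng_) >= dropProbability_;
    }

private:
    double dropProbability_ = 0.0;
    std::mt19937 rng_{1};
    std::function<int(int)> decide_;
};

// Example loss function (hypothetical): drop every third message.
inline int dropEveryThird(int sequenceNumber) {
    return (sequenceNumber % 3 == 2) ? 0 : 1;
}
```

Because the decision is made on the sending side just before the send, attaching different policies to the two endpoints of a channel gives the directional loss that the text describes.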
Addressing in this class, as in all the classes, among threads is by Pthreads thread identifier (Pthreads refers to thread implementations that adhere to the POSIX standard P1003.1c) and among processes is by integer identifier. Process identifiers are either assigned implicitly when application processes are started by a control process, described later, or can be assigned explicitly by the user when a channel is created. No attempt is made to prevent deadlock caused by application communication patterns, and the routines will block indefinitely. Message loss cannot be introduced artificially into synchronous channels.

The final class is a many-to-many channel, and is currently available only for threads. This class essentially implements a bounded buffer.

Each class has a method that allows a channel to be queried for data available to read.

Implementation of the sender code in Figure 1 using a BSD socket interface would require approximately twice as many calls as the two (constructor and send) required here. This estimate neglects the additional, non-trivial complexity required to support process addressing via integer identifier and to drop messages in a controlled fashion.

    theGrid = new Grid(ROWS,COLS,myID);
    if ((myID % COLS) != 3)
        theGrid.send(RIGHT,(void *)&myID,sizeof(myID));
    if ((myID % COLS) != 0)
        theGrid.send(LEFT,(void *)&myID,sizeof(myID));
    if (myID >= COLS)
        theGrid.send(UP,(void *)&myID,sizeof(myID));
    if (myID < (ROWS-1)*COLS)
        theGrid.send(DOWN,(void *)&myID,sizeof(myID));

Figure 2: Exchange of ID Among Grid Neighbors

Figure 3: History Window

A Reduce method collects data from each application process and stores the result in a specified location. The user-specified function is then run to reduce the collected data.
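As a rough illustration of what the collective operations compute, the sketch below models Scatter, Gather, and Reduce sequentially over consecutive storage, with no communication involved. The function names scatterBlock, gatherPieces, reducePieces, and sumAll are hypothetical, and the requirement that the block divide evenly among the processes is an assumption made to keep the example short; the text only states that the number of pieces equals the number of processes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

using Block = std::vector<int>;

// Scatter: split a consecutive block into one piece per process.
// Assumption: the block size is a multiple of the process count.
std::vector<Block> scatterBlock(const Block& input, std::size_t processes) {
    assert(processes > 0 && input.size() % processes == 0);
    const std::ptrdiff_t piece =
        static_cast<std::ptrdiff_t>(input.size() / processes);
    std::vector<Block> pieces;
    for (std::ptrdiff_t p = 0; p < static_cast<std::ptrdiff_t>(processes); ++p)
        pieces.emplace_back(input.begin() + p * piece,
                            input.begin() + (p + 1) * piece);
    return pieces;
}

// Gather: collect the pieces, in process order, into one block.
Block gatherPieces(const std::vector<Block>& pieces) {
    Block collected;
    for (const Block& piece : pieces)
        collected.insert(collected.end(), piece.begin(), piece.end());
    return collected;
}

// Reduce: collect the data, then run a user-specified function on it.
int reducePieces(const std::vector<Block>& pieces,
                 const std::function<int(const Block&)>& combine) {
    return combine(gatherPieces(pieces));
}

// Example user-specified reduction function: the sum of all elements.
int sumAll(const Block& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}
```

In the library itself each piece would travel along a channel to a different process; here the vector index simply stands in for the process identifier.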
Scatter, Gather, and Reduce currently operate on one-dimensional arrays (consecutive storage). Support for operations on some non-consecutive storage in two-dimensional arrays (e.g., submatrix) is under development.

A channel query returns a value of one when data is available, but has no effect on the channel itself. If no message is available, the return value is zero.

Vector time is maintained within user applications. Vector time is used to determine the happened-before relation [8] among events that occur within a distributed computation [9]. Users can query, and increment the local component of, the current local vector time within an application.

Implementation

Applications comprised of multiple processes are spawned via a central control process. Specification of the user programs and the machines on which they should execute takes place either via the command line or from user-specified configuration files. The control process also spawns the visualization process, when requested. Visualization data is passed along TCP channels between the user processes and the visualization process. When channels (or topologies) are used, a TCP connection between the user processes is created to effect a channel. Hence, the control process is not involved in communications between user processes.

Topology

One-to-one channels can be joined into topologies. The primary function of the topology class is to facilitate creation of multiple channels via instantiation of a single class.
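The idea that one object stands for a whole set of one-to-one channels can be sketched as below. This is a hypothetical model, not the library's topology class: SketchTopology, star(), ring(), connected(), and channelCount() are names invented for illustration, and a channel is represented only by the pair of integer identifiers it joins.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical stand-in for a topology: a set of bidirectional
// one-to-one channels between integer process identifiers.
class SketchTopology {
public:
    // Star: node 0 is the center and is linked to every outer node.
    static SketchTopology star(int nodes) {
        assert(nodes >= 2);
        SketchTopology t;
        for (int outer = 1; outer < nodes; ++outer) t.link(0, outer);
        return t;
    }

    // Ring: node i is linked to node (i + 1) mod nodes.
    static SketchTopology ring(int nodes) {
        assert(nodes >= 3);
        SketchTopology t;
        for (int i = 0; i < nodes; ++i) t.link(i, (i + 1) % nodes);
        return t;
    }

    // True when a and b are directly connected, i.e. a send from
    // one to the other would be permitted in this model.
    bool connected(int a, int b) const {
        return channels_.count(std::make_pair(std::min(a, b), std::max(a, b))) > 0;
    }

    std::size_t channelCount() const { return channels_.size(); }

private:
    // Store each channel once, with the smaller identifier first.
    void link(int a, int b) {
        channels_.insert(std::make_pair(std::min(a, b), std::max(a, b)));
    }

    std::set<std::pair<int, int>> channels_;
};
```

A single call such as SketchTopology::star(5) creates four channels at once, which is the convenience the topology class is meant to provide.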
Several standard topologies, derived from the topology class, are also provided including: fully-connected, star, linear array, ring, grid, and torus. Message sends and receives are restricted to directly connected nodes within the topology. For example, the center node of a star network is the only node able to send to, and receive from, outer nodes; outer nodes can only send to, and receive from, the center node. Topologies are built with reliable asynchronous channels. Figure 2 depicts exchange of identifiers among all neighbors in a grid topology. Note that macros RIGHT, LEFT, UP, DOWN are defined within our system. Similar macros are defined as appropriate to a given topology.

4 Topology Editor

A topology editor has been created to facilitate rapid development of complex topologies via a graphical interface. The editor allows creation of connections among single nodes or among topologies. The editor output is a file containing specification of a class derived from the topology class. (The derived class name can optionally be specified by the user.) The interface for this class is equivalent to that for the topology class. This file can be included by the user in her code to easily create the constructed topology. The derived class supports broadcast, scatter, gather, and reduce functions for each custom topology.

Within the topology class, and each standard topology, the methods Send(), Receive(), Broadcast(), Scatter(), Gather(), and Reduce() are provided. The Send (Receive) routine sends (receives) a message to (from) a specified process.
Broadcast effects a broadcast, to all processes, of a specified message from a specified source process. Scatter partitions an input block of data, from a specified source process, into a number of pieces equal to the number of processes and sends a unique piece to each of the application processes. Gather complements Scatter and collects data from application processes to the process with a specified identifier.

5 Visualization

A visualization system has been developed for visualizing synchronization among the threads of an executing application [3]. This system has been extended to depict the communication that occurs among threads or among processes. This visualization is linked to use of the channel class, and hence is available when the channel, topology, and specific topology classes are used.

This set of exercises was designed to demonstrate three fundamental aspects of concurrent application design: (1) the use of synchronous versus asynchronous communication, (2) deterministic versus non-deterministic communication, and (3) the use of a client-server architecture versus a fully distributed one. The basis of the exercises is Soundararajan's CSP [7] implementation [11] of a set partitioning problem [6]. The problem is to partition a set of integers into two sets according to the element values. Each process (heavyweight or lightweight) initially has half of the set. One process P0 will end up with the lower half of the elements and the other process P1 ends up with the upper half of the elements.
A History Graph window (see Figure 3) depicts the sends and receives that occur within each process or thread, and connects corresponding send and receive operations between threads or processes. Clicking on any channel in this window opens the Channel window. This window displays all recent activity along this channel, including channel type, messages in the channel, messages received since the window was opened, and current status, either Sending Message or Receive Message.

6 Experience

The channel classes closely follow the abstractions of communication found in the literature and they are easily incorporated into existing assignments. Their use provides the additional advantage of allowing students to visualize the communication that takes place.

The first exercise requires that the problem be solved using synchronous message passing. P0 sends the maximum value max(S0) from its set S0 to P1 and removes max(S0) from S0. P0 then waits to receive the minimum value min(S1) from the set S1 of P1 and adds the received value to S0. This continues until P0 receives a value that is greater than or equal to the one it sent. Upon receiving a value from P0, P1 adds the received value to S1. P1 then sends min(S1) to P0 and removes min(S1) from its set. This continues until P0 notifies P1 that the set is partitioned. A solution is given in Figure 4. This exercise illustrates development of a simple application protocol. (P0 and P1 must agree on the format of messages, and agree on a sentinel message that lets P1 know that no further messages will be sent.) It also serves to familiarize students with the (synchronous) message passing interface.
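The exchange described for the first exercise can be checked with a short sequential simulation, in which each loop iteration stands for one send/receive round trip between P0 and P1. This is a sketch for reasoning about the protocol, not a solution that uses the channel classes; the function name partitionSets is hypothetical, and the sketch assumes both sets are non-empty and that all values across the two sets are distinct.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Sequential simulation of the two-process set partition protocol.
// Assumptions: both sets are non-empty and all values are distinct.
std::pair<std::set<int>, std::set<int>> partitionSets(std::set<int> s0,
                                                      std::set<int> s1) {
    assert(!s0.empty() && !s1.empty());
    int sent = 0;
    int received = 0;
    do {
        sent = *s0.rbegin();      // P0 sends max(S0) ...
        s0.erase(sent);           // ... and removes it from S0
        s1.insert(sent);          // P1 adds the received value to S1
        received = *s1.begin();   // P1 sends min(S1) ...
        s1.erase(received);       // ... and removes it from S1
        s0.insert(received);      // P0 adds the received value to S0
    } while (sent > received);    // stop once received >= sent
    return {std::move(s0), std::move(s1)};
}
```

Starting from S0 = {1, 5, 9} and S1 = {2, 3, 7}, the loop runs three rounds and ends with S0 = {1, 2, 3} and S1 = {5, 7, 9}; in the final round P0 receives back the value it just sent, which is the termination condition.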
    with1 = new SynOneToOneChannel(1,0);
    do {
        msg.value = max(mySet);
        sentValue = msg.value;
        with1.send((void *)msg,sizeof(msg));
        remove(mySet,msg.value);
        with1.recv((void *)msg,sizeof(msg));
        add(mySet,msg.value);
    } while (sentValue > msg.value);
    msg.done = 1;
    with1.send((void *)msg,sizeof(msg));

(a) Solution for Process Zero

    with0 = new SynOneToOneChannel(0,1);
    do {
        with0.recv((void *)msg,sizeof(msg));
        if (msg.done == 0) {
            add(mySet,msg.value);
            msg.value = min(mySet);
            with0.send((void *)msg,sizeof(msg));
            remove(mySet,msg.value);
        }
    } while (msg.done == 0);

(b) Solution for Process One

Figure 4: Set Partition

We present a sequence of exercises that result from our experience with teaching network programming over the past several years in an upper-level undergraduate course. These exercises were recently developed to serve as the first set of network programming exercises. Initially, our first network programming assignment was more complex. We typically required an application that contained multiple clients and a server for all client types, similar to that described in [4]. While they report large-scale success, we have found that, while many students are able to complete the assignment, a significant number of students have difficulty. We hope that completion of these exercises will lead to greater success in the more complex client-server exercise.

Figure 5: Simultaneous Synchronous Sends - History Window

The second exercise requires a solution similar to that for exercise one, with the exception that, at each step, P0 and P1 send their maximum and minimum values, respectively, simultaneously.
When P0 (P1) receives a value less than (greater than) the one it sent out, the sent value is removed from its set and the received value is added. When P0 (P1) receives a value greater than (less than) the one it sent out, the partition is complete and no set modifications are made. We do not specify the use of a particular channel class. Development of a solution requires that students come to the realization that only asynchronous message passing facilitates simultaneous execution of a send by both P0 and P1. If synchronous communication is chosen by the student, the problem quickly presents itself within the visualization system, as depicted in Figure 5.

The final exercise of this sequence incorporates N processes with portions of the set and requires a centralized solution. Students use a supplied non-deterministic receive function (or can create this function for themselves) that listens for data on any incoming channel. A server collects a set of integers from all clients, computes the set partition, and returns the resulting sets to the clients.

7 Conclusions and Future Work

We have developed a message passing library that provides two levels of abstraction, channel and topology, for the communication that occurs among processes and threads. Tightly integrated visualization support is available, as is support for controlled message loss. A topology editor allows development of custom topologies via a graphical interface. We anticipate that these communication classes and associated tools will support the instruction of concurrent programming by reducing the overhead associated with learning message passing interfaces, by providing a uniform interface for communication both among threads and among processes, and by providing integrated visualization support without the need for instrumenting user programs.

These message passing classes are part of a larger system that provides a class library for threads, thread synchronization, and message passing. The system currently also has support for visualizing the synchronization of threads and the message passing that occurs among threads and processes. We are currently adding support for synchronization of processes (barrier and mutual exclusion), implementing well-known parallel and distributed algorithms, adding support for distributed arrays, and adding additional visualization support specifically for parallel and distributed programming.

We believe the tools can be used at any level in which students have the programming sophistication and background sufficient to cover concurrency. We have taught thread and network programming in a course populated predominantly by sophomores and juniors. The system has not been used at a lower level.

Comprehensive, detailed information on our work is available at http://www.cs.mtu.edu/~shene/NSF-3/index.html.

References

[1] ACM. Computing Curricula 2001 (Steelman Draft, August 1, 2001). http://www.acm.org/sigs/sigcse/cc2001/steelman/, 2001.

[2] Arnow, D. M. A simple library for teaching a distributed programming module. In Twenty-Sixth SIGCSE Technical Symposium on Computer Science Education (Nashville, Tennessee, Mar. 2-4 1995), pp. 82-86.

[3] Bedy, M., Carr, S., Huang, X., and Shene, C.-K. A visualization system for multithreaded programming. In Proceedings of the 31st Annual SIGCSE Technical Symposium on Computer Science Education (Austin, TX, March 2000), pp. 1-5.

[4] Bryant, R. E., and O'Hallaron, D. R. Introducing computer systems from a programmer's perspective. In Proceedings of the Thirty-second SIGCSE Technical Symposium on Computer Science Education (SIGCSE-01) (New York, Feb. 21-25 2001), pp. 90-94.

[5] Coulouris, G., Dollimore, J., and Kindberg, T. Distributed Systems: Concepts and Design, third ed. Addison-Wesley, 2001.

[6] Dijkstra, E. W. A correctness proof for networks of communicating sequential processes - a small exercise. EWD-607, 1977.

[7] Hoare, C. Communicating sequential processes. Commun. ACM 21, 8 (1978), 666-677.

[8] Lamport, L. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7 (1978), 558-565.

[9] Mattern, F. Virtual time and global states of distributed systems. In Parallel and Distributed Algorithms: Proceedings of the International Workshop on Parallel and Distributed Algorithms, M. C. et al., Ed. Elsevier Science Publishers B. V., 1989, pp. 215-226.

[10] McDonald, C., and Kazemi, K. Teaching parallel algorithms with process topologies. In Thirty-first SIGCSE Technical Symposium on Computer Science Education (Austin, Texas, Mar. 7-12 2000), pp. 70-74.

[11] Soundararajan, N. Axiomatic semantics of communicating sequential processes. ACM Transactions on Programming Languages and Systems 6, 4 (1984), 647-662.