
DISTRIBUTED MULTIPROCESSOR ENVIRONMENTS
Timothy Rolfe
Computer Science Department
Eastern Washington University
202 Computer Sciences Building
Cheney, WA 99004-2412
(509) 358-2065
Timothy.Rolfe@mail.ewu.edu
http://penguin.ewu.edu/~trolfe/
The slide set from the presentation at CCSC-NW 2002 is available through this link
ABSTRACT
This paper discusses the material included in a seminar for seniors and graduate
students on distributed parallel processing offered as an experimental course at
Eastern Washington University in the Fall 2000 and Winter 2002 quarters, along
with the difficulties discovered as the courses progressed.
INTRODUCTION
Since the 1991 public release of PVM (Parallel Virtual Machine) [1] the teaching of
parallel processing has been an option, even in schools with a limited budget, provided that the
educational institution supports Unix or a Unix variant. [2] Recent developments in PC
workstations and freeware operating systems like Linux have greatly facilitated offering such
courses, as has the availability of freeware implementations of the more recent MPI (Message
Passing Interface) such as MPICH and LAM/MPI.
This paper will present the resources gathered together for a four-credit seminar course on
parallel processing offered at Eastern Washington University in the Fall 2000 and Winter 2002
quarters, in the hopes that they may be of use to others developing similar courses. The Fall
2000 class had one undergraduate senior, five registered graduate students, and one graduate
student auditor; the Winter 2002 class had two graduate and five undergraduate students.
The computers used all had x86/Pentium processors running under the Linux operating
system. The author (while at Dakota State University in Madison, SD) has also used similar
machines under the FreeBSD operating system as well as a Digital Equipment Corporation
Alpha computer running under Ultrix. Some of the EWU computers were dual-processor
machines, making speed-ups under a simple fork-based parallelism possible at the beginning of
the term before any of the message-passing systems were covered. (They could also have
allowed exploration of threads under Java or POSIX, though that was not done.)
The programming language used was C, with C++ elements added as convenient.
The Fall 2000 course used the following message-passing environments for distributed
parallel processing: PVM (version 3.4.3) and MPI as MPICH (version 1.2.1). The Winter 2002
course used MPI as LAM/MPI (version 6.4-a3) and PVM (version 3.4.4). The instructor had
hoped to include some explicit socket programming to show message passing under direct
programmer control. Due, however, to his own inexperience, he was not able to develop that
component of the course within the time available. Highlighting the difficulties of socket
programming, a highly competent graduate student developed a socket example program for
demonstration in the Winter 2002 course, but the application failed to port from his computer’s
version of the Linux operating system to Linux as installed on the course computers.
The PVM Users’ Guide is available on-line in HTML format [3], as well as PostScript
format, [4] while there is no comparable free source for the MPI Users’ Guide [5] (at least that
the author is aware of). Consequently, the MPI Users’ Guide was assigned as the required text
for the Fall 2000 course. Towards the end of that quarter the instructor discovered Peter
S. Pacheco’s book Parallel Programming with MPI [6], and that was used as the text for the
Winter 2002 course. Now another book has come to light that looks even more attractive:
Parallel Programming: Techniques and Applications Using Networked Workstations and
Parallel Computers, by Barry Wilkinson and Michael Allen. [7]
PARALLEL PROCESSING BACKGROUND
Since the students could not be expected to have a background in parallel computing, the
course began with an overview of parallel programming based on journal and web articles.
• Michael Flynn’s original papers proposing the SISD/SIMD/MISD/MIMD taxonomy for
high-speed computers. [8]
• A very useful survey of parallel computing from the [IEEE] Computer magazine. [9]
• A more recent survey of parallel computing (1997) discovered on the World-Wide
Web. [10]
• Several papers on Beowulf Clusters — commodity off-the-shelf computers running a
Unix variant and networked for distributed parallel processing. One of these is from an
internal NASA publication, made available by one of the graduate students who had
worked the previous summer within NASA. [11] Among the articles referenced is one
from LinuxWorld that makes mention of the “Stone Souper Computer” — a name
conflating the folk tale of “stone soup” with the idea of cooperatively assembling
computing resources to generate a powerful parallel computational tool.
• “Queens on a Chessboard: Making the Best of a Bad Situation”, [12] the instructor’s
own paper that includes a discussion of parallel processing based on the Unix fork,
shared memory processing on a Silicon Graphic multiprocessor, and distributed
processing based on message passing under PVM.
• Examples of massively distributed and extremely loosely coupled processing can be
found in the on-going SETI-at-home project, as well as analogous projects for prime
number discovery and configurational studies of potential cancer medicines. [13]
• As the quarter progressed, further web articles were encountered on “grid computing”
and passed along to the students.
PROGRAMMING ENVIRONMENTS
The initial programming environment for the Fall 2000 course was the departmental Linux
computer lab, in which several computers have dual processors. These machines run with “rsh”
disabled, requiring access through “ssh.” This restriction caused some problems as the quarter
progressed. One of the class members noticed some surplus 486 computers stacked to one side
of a lab awaiting disposal, and suggested that the class build its own “Stone Souper Computer.”
Thus was launched the “Boat Anchor Armada” [14]: five 486-66 computers running under Linux
on a local network in which it was possible to allow use of “rsh” without security
problems. The generation of this system was addressed in a paper presented at the CCSC-NW
conference in 2001 by Stuart Steiner, one of the graduate students in the class. [15]
Thanks to a grant from the Washington State Higher Education Coordinating Board for the
Eastern Washington University Center for Distributed Computing Studies, four high-speed dual
processor computers were added to the EWU Armada (and nicknamed the “hydroplanes”) along
with a somewhat slower administrative computer to provide an external connection — the
administrative machine is the only one with access to the Internet. This amplified network was
used for the Winter 2002 course.
In the Fall 2000 course, the students generated their own message-passing environments by
installing first PVM and then MPICH in their own accounts. For the Winter 2002 course the
LAM/MPI environment was already installed on the EWU Armada computers. The environment
definitions for PVM, however, were removed and students installed their own copies of PVM in
their own accounts. This allowed each student to have the experience of bringing up a message-passing environment as a totally unprivileged user.
FIRST ENVIRONMENT: UNIX FORK
The simplest parallel programming addresses what are called “embarrassingly parallel
problems” [16] (those requiring minimal interprocess communications), run on a computer with
more than one processor. On such a system, one may use the simple Unix “fork” to generate
multiple copies of the same program, all of them sharing the files open at the time of the “fork”
and usable as a communications channel. The actual parallel processing is then handled by the
operating system’s assignment of processes to available processors.
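As a minimal sketch of this mechanism (not one of the course programs), the following C program opens a file before the fork so that parent and child share it and can use it as a communication channel; the file name and the messages are arbitrary:

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Tiny demonstration that a file opened before the fork is shared by */
/* parent and child and so can serve as a communication channel.      */
int main(void)
{
    FILE *shared = fopen("/tmp/shared.txt", "w+");   /* open BEFORE forking */

    if (fork() == 0) {                /* child process                      */
        fprintf(shared, "line written by the child (pid %d)\n",
                (int)getpid());
        fflush(shared);
        _exit(0);
    }

    fprintf(shared, "line written by the parent (pid %d)\n",
            (int)getpid());
    fflush(shared);
    wait(NULL);                       /* let the child finish               */

    rewind(shared);                   /* both lines are now in the file     */
    char buffer[80];
    while (fgets(buffer, sizeof buffer, shared) != NULL)
        fputs(buffer, stdout);
    fclose(shared);
    return 0;
}

The order of the two lines in the file depends on which process writes first; all that matters for the pattern is that both writes land in the same file.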
A paper on the “NOW Sort” [17] suggested to the instructor a particularly simple problem
that might be used as a teaching example — one so simple that the bulk of the code developed
would actually be related to the parallel processing rather than the problem solution. The NOW
sort partitions the data to be sorted into k segments (where, for all j from 1 to k–1, all data found
in segment j are less than any data found in segment j+1), after which those segments are sorted
in parallel. This suggests a preliminary problem: determination of the sizes of the partitions, a
problem that amounts to determining the values for a histogram.
In the context of a fork-based approach, the data are provided to the child process or
processes by the fork itself, since each child inherits a copy of the parent’s address space. The loop logic generating child processes and the values returned by
the fork function itself allow each process to determine which instance it is, and thus which array
segment it is responsible for. The child process or processes send their data back by means of a
shared binary output file in the /tmp directory (to avoid network overhead, a local disk is
preferable to a shared NFS disk). Once all child processes have terminated, the original parent
moves to the front of that file, and from it accumulates the sums for the k segments in the
histogram.
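The course programs themselves are on the web [20]; the fragment below is only a sketch of the pattern just described, with invented sizes (NSEG, NDATA, NPROC) and artificial test data. Each child writes its partial counts as one binary record to a scratch file opened before the fork; the parent then waits, moves to the front of the file, and accumulates the records:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define NSEG  4          /* histogram segments (hypothetical)             */
#define NDATA 100000     /* size of the data array (hypothetical)         */
#define NPROC 2          /* child processes to fork (hypothetical)        */

int main(void)
{
    static double data[NDATA];
    long hist[NSEG] = {0};

    srand(12345);                             /* fill the array with test data */
    for (int i = 0; i < NDATA; i++)
        data[i] = rand() / (RAND_MAX + 1.0);  /* values in [0, 1)              */

    /* Scratch file on a local disk, opened before the fork so that every      */
    /* process shares the same open file.                                      */
    int fd = open("/tmp/hist.bin", O_RDWR | O_CREAT | O_TRUNC, 0600);

    for (int p = 0; p < NPROC; p++)
        if (fork() == 0) {                    /* child p counts its slice      */
            long local[NSEG] = {0};
            int  chunk = NDATA / NPROC;       /* assume NDATA divisible        */
            for (int i = p * chunk; i < (p + 1) * chunk; i++)
                local[(int)(data[i] * NSEG)]++;
            write(fd, local, sizeof local);   /* one binary record per child   */
            _exit(0);
        }

    while (wait(NULL) > 0)                    /* wait for every child          */
        ;

    lseek(fd, 0, SEEK_SET);                   /* move to the front of the file */
    long partial[NSEG];
    for (int p = 0; p < NPROC; p++) {         /* accumulate the child records  */
        read(fd, partial, sizeof partial);
        for (int s = 0; s < NSEG; s++)
            hist[s] += partial[s];
    }
    close(fd);

    for (int s = 0; s < NSEG; s++)
        printf("segment %d: %ld\n", s, hist[s]);
    return 0;
}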
For the Winter 2002 course, Pacheco’s Chapter 4 example of numerical integration [18]
suggested to the instructor another embarrassingly parallel problem to exemplify fork-based
parallelism with a shared file as the data communication medium: something nicknamed the
“world’s worst way” to calculate the natural logarithm of N, namely the Monte Carlo integration
of the function “f(t) = 1/t” from 1 to N — and this program quite naturally leads to the “world’s
worst way” to calculate π, namely a Monte Carlo run counting the number of (x,y) pairs in a
uniform random distribution in the range (0..1, 0..1) that meet the constraint “x² + y² < 1.” For
both of these, if the child processes take steps to ensure that they use different seeds for the random
number generator, their Monte Carlo runs are presumed to be independent of each other’s
and of the parent’s run for the purposes of this class. Data flow is extremely minimal, since all that
is required is the communication of the number of (x,y) points falling under the curve during the
integration. (Wilkinson and Allen provide a discussion of parallel random number generation to
guarantee independent sequences of numbers. [19])
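Purely to illustrate the seeding issue just described, the sketch below (with an invented per-process trial count) has two children and the parent each make their own Monte Carlo run of the π estimate, mixing the process ID into the seed; for brevity it simply prints each estimate instead of combining the counts through the shared file:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Count how many of nTrials random (x,y) points in (0..1, 0..1) satisfy */
/* x*x + y*y < 1; pi is then roughly 4 * hits / nTrials.                 */
static long quarterCircleHits(long nTrials)
{
    long hits = 0;
    for (long i = 0; i < nTrials; i++) {
        double x = rand() / (RAND_MAX + 1.0);
        double y = rand() / (RAND_MAX + 1.0);
        if (x * x + y * y < 1.0)
            hits++;
    }
    return hits;
}

int main(void)
{
    long nTrials = 1000000L;          /* per-process trial count (hypothetical) */

    for (int p = 0; p < 2; p++)       /* two children plus the parent           */
        if (fork() == 0) {
            /* Mix the PID into the seed so that each child draws a             */
            /* (presumably) different random sequence than its siblings         */
            /* and the parent.                                                  */
            srand((unsigned)(time(NULL) ^ getpid()));
            long hits = quarterCircleHits(nTrials);
            /* In the full program this single count would travel back          */
            /* through the shared binary file, as in the earlier sketch.        */
            printf("child %d: pi is about %f\n", (int)getpid(),
                   4.0 * hits / nTrials);
            _exit(0);
        }

    srand((unsigned)(time(NULL) ^ getpid()));          /* parent's own run      */
    long hits = quarterCircleHits(nTrials);
    printf("parent %d: pi is about %f\n", (int)getpid(), 4.0 * hits / nTrials);

    while (wait(NULL) > 0)
        ;
    return 0;
}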
If there is any interest, these simple examples are available on the World-Wide Web. [20]
That same link also provides an example of realistic use of fork-based parallelism: a program to
characterize two algorithms for one-time balancing of Binary Search Trees as compared with
AVL trees. It samples varying sizes, and uses a fork to have two processors simultaneously
generating search trees and accumulating statistics, each for a different tree size. The shared
output file then accumulates these results.
MESSAGE-PASSING ENVIRONMENTS (PVM [21] AND MPI [22])
In the Fall 2000 course, the first PVM program covered was the analog to the standard
“Hello, World” program — the master program starts the slave programs under PVM, sends each
one a message, and receives back a message from each one. It does, however, show all the PVM
components needed to do significant parallel programming. This was also used as the first PVM
program in the Winter 2002 course, following the consideration of MPI.
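The course program itself is not reproduced here; the sketch below (with a hypothetical slave executable name and message tag) shows the handful of PVM calls such a master/slave pair rests on: pvm_spawn to start the slaves, pvm_initsend/pvm_pkstr/pvm_send to send, and pvm_recv/pvm_upkstr to receive.

/* master.c -- a sketch, not the course program: spawn NSLAVE copies of   */
/* a slave executable, send each a greeting, and collect the replies.     */
#include <stdio.h>
#include "pvm3.h"

#define NSLAVE 3        /* number of slave tasks (hypothetical)           */
#define MSGTAG 1        /* message tag used in both directions            */

int main(void)
{
    int  tids[NSLAVE];
    char buffer[100];

    /* "slave" is a hypothetical executable name; PvmTaskDefault with an  */
    /* empty "where" string lets PVM place the tasks anywhere in the      */
    /* virtual machine.                                                   */
    int started = pvm_spawn("slave", NULL, PvmTaskDefault, "", NSLAVE, tids);

    for (int i = 0; i < started; i++) {       /* send each slave a message */
        pvm_initsend(PvmDataDefault);
        sprintf(buffer, "Hello, slave %d", i);
        pvm_pkstr(buffer);
        pvm_send(tids[i], MSGTAG);
    }
    for (int i = 0; i < started; i++) {       /* receive each reply        */
        pvm_recv(-1, MSGTAG);                 /* -1 matches any sender     */
        pvm_upkstr(buffer);
        printf("master received: %s\n", buffer);
    }
    pvm_exit();                               /* leave the virtual machine */
    return 0;
}

/* slave.c -- receive the master's message and send one back.             */
#include <stdio.h>
#include "pvm3.h"

int main(void)
{
    char buffer[100];
    int  ptid = pvm_parent();                 /* task id of the master     */

    pvm_recv(ptid, 1);
    pvm_upkstr(buffer);
    printf("slave got: %s\n", buffer);
    pvm_initsend(PvmDataDefault);
    sprintf(buffer, "Hello back from slave t%x", (unsigned)pvm_mytid());
    pvm_pkstr(buffer);
    pvm_send(ptid, 1);
    pvm_exit();
    return 0;
}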
In the Winter 2002 course, Pacheco’s book provided the first specimen MPI programs,
which also show the MPI components needed to do significant parallel programming.
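Pacheco’s examples are not reproduced here, but a minimal SPMD program in their spirit shows those components: initialization, rank and size inquiry, point-to-point send and receive, and finalization.

/* A minimal SPMD greeting program in the spirit of Pacheco's opening     */
/* example (not his exact code): every non-root process sends a line of   */
/* text to process 0, which prints the lines in rank order.               */
#include <stdio.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int  rank, size;
    char message[100];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which process am I?        */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* how many processes total?  */

    if (rank != 0) {
        sprintf(message, "Greetings from process %d of %d", rank, size);
        MPI_Send(message, (int)strlen(message) + 1, MPI_CHAR,
                 0, 0, MPI_COMM_WORLD);
    } else {
        for (int src = 1; src < size; src++) {
            MPI_Recv(message, 100, MPI_CHAR, src, 0,
                     MPI_COMM_WORLD, &status);
            printf("%s\n", message);
        }
    }

    MPI_Finalize();
    return 0;
}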
The histogram program developed earlier under “fork” was brought into the message-passing environments. This allowed modeling the passing of arrays as messages since the
master/root process needs to send the data segments for processing to the other processes, and
they need to return the frequency-count arrays for their portions of the data array. The natural
development under PVM is as a pair of programs running in MIMD mode, while under MPI the
natural development is as a single program running in SPMD mode. Of course, an environment
allowing MIMD applications necessarily supports SPMD applications, and so the histogram
program was also developed as an SPMD application under PVM. It also provided a means of
exemplifying the use of “groups” under PVM to approximate the environment provided by the
MPI “communicator.” These various parallel implementations of the histogram calculation are
available on the World-Wide Web. [23]
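The programs at [23] are the course versions; purely as a sketch of the data flow just described, an SPMD MPI histogram can be compressed into the two collective operations MPI_Scatter and MPI_Reduce, with invented sizes and test data assumed to lie in [0, 1):

/* SPMD sketch of an MPI histogram (not one of the course programs [23]): */
/* the root scatters equal slices of the data array, every process counts */
/* its slice, and a reduction sums the counts back at the root.  NDATA is */
/* assumed divisible by the number of processes.                          */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define NSEG  4
#define NDATA 1000000

int main(int argc, char *argv[])
{
    int rank, nProc;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nProc);

    int     nLocal = NDATA / nProc;
    double *data   = NULL;
    double *slice  = malloc(nLocal * sizeof *slice);
    long    local[NSEG] = {0}, total[NSEG];

    if (rank == 0) {                          /* root generates test data   */
        data = malloc(NDATA * sizeof *data);
        for (int i = 0; i < NDATA; i++)
            data[i] = rand() / (RAND_MAX + 1.0);   /* values in [0, 1)      */
    }

    /* Root sends each process its slice of the data array.                */
    MPI_Scatter(data, nLocal, MPI_DOUBLE,
                slice, nLocal, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    for (int i = 0; i < nLocal; i++)          /* count the local slice      */
        local[(int)(slice[i] * NSEG)]++;

    /* Element-wise sum of every process's counts arrives at the root.     */
    MPI_Reduce(local, total, NSEG, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int s = 0; s < NSEG; s++)
            printf("segment %d: %ld\n", s, total[s]);
        free(data);
    }
    free(slice);
    MPI_Finalize();
    return 0;
}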
To further exemplify cooperating processes, the instructor programmed an implementation
of the classical “Bakery algorithm” in the Winter 2002 course. While the MPI version was
initially developed as an SPMD application, it was then transformed into a MIMD application
since the LAM/MPI environment provides the “application schema” as a way of programming in
that fashion. As implemented, the application has four categories of cooperating programs: the
user interface (notifying the number server and clerk processes of the first number to be
dispensed and served, and then passing along pastry information to the customer processes), the
number server process, the clerk process, and then the various customer processes. For each
purchase, a customer process receives a number from the number server, and then uses that
number in its interaction with the clerk process to obtain its pastry.
Since the MPI version has a static number of processes, the MPI application was developed
with a fixed number of customer processes — one could say, a limited lobby area in which the
customers can wait. PVM, with its dynamic creation of processes, does not have this restriction,
and customer processes are immediately created for each pastry request.
These programs are also available on the World-Wide Web. [24]
STUDENT PARALLEL PROGRAMMING EXERCISE
The problem assigned to the students for their own programming was generated by taking
Dijkstra’s railroad problem of two-way travel on a single track (transformed by Tanenbaum into
baboons crossing a rope, [25] but transformed back to a railroad problem) and formulating it for
solution based on message passing rather than semaphores. The architecture suggested to the
students for the solution had three types of cooperating processes:
• The user interface, which under PVM acts as master process spawning the slave
processes. The user interface also determines passage of time (since a user
“synchronization” command denotes the end of each time unit).
• A single track controller that communicates with the user interface (receiving commands
and returning information) and with the individual train processes (receiving track entry
requests and track exit notifications, and sending track entry permission).
• A variable number of train processes, activated by the user interface and communicating
with the track controller to cross the guarded track segment. Once the train enters that
segment, it requires three time units to clear it. The train process is finished once it has
cleared the track segment and has notified the track controller. In the PVM environment,
the process is created at need, and can simply terminate. In the MPI environment,
however, the number of train processes is fixed — one is tempted to call them the
“switch engine” processes. In that case, the train process notifies the user interface that it
has completed its current assignment and that it is available for the next train request.
Some students in the Winter 2002 course chose different allocations of process
responsibilities. For instance, one chose to consolidate the user interface and the track controller
functionalities into one process, and to have a second cooperating process handle all of the train
interactions.
The instructor’s implementations in both environments are available on the World-Wide
Web [26]. The MPI version is specific to the LAM/MPI environment since it takes advantage of
the “application schema” to develop a MIMD application. For the MPICH environment, there is
code available that uses a small driver to start up the several processes, along with the minor
changes needed to make the three programs callable from that driver.
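The instructor’s implementations at [26] are the reference versions; purely to illustrate the message structure suggested to the students, the sketch below shows one way the track controller’s receive loop might be organized under MPI. The tags are invented, the three-time-unit bookkeeping handled through the user interface is ignored, and the problem is reduced to mutual exclusion of a single train at a time with a one-train waiting slot.

/* Track controller only; the user interface and train processes are      */
/* separate programs.  All tags and the shutdown convention below are     */
/* invented for illustration; the course programs [26] differ.            */
#include <stdio.h>
#include <mpi.h>

#define TAG_ENTER_REQ 1   /* train asks to enter the guarded segment      */
#define TAG_ENTER_OK  2   /* controller grants entry                      */
#define TAG_EXIT      3   /* train reports that it has cleared the track  */
#define TAG_SHUTDOWN  4   /* user interface tells the controller to stop  */

int main(int argc, char *argv[])
{
    int busy = 0, waitingTrain = -1, dummy = 0, running = 1;
    MPI_Status status;

    MPI_Init(&argc, &argv);

    while (running) {
        /* Accept whatever arrives next from any cooperating process.     */
        MPI_Recv(&dummy, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
        switch (status.MPI_TAG) {
        case TAG_ENTER_REQ:              /* entry request from a train     */
            if (!busy) {
                busy = 1;
                MPI_Send(&dummy, 1, MPI_INT, status.MPI_SOURCE,
                         TAG_ENTER_OK, MPI_COMM_WORLD);
            } else {
                waitingTrain = status.MPI_SOURCE;  /* one waiting slot only */
            }
            break;
        case TAG_EXIT:                   /* track segment is clear again   */
            busy = 0;
            if (waitingTrain >= 0) {
                busy = 1;
                MPI_Send(&dummy, 1, MPI_INT, waitingTrain,
                         TAG_ENTER_OK, MPI_COMM_WORLD);
                waitingTrain = -1;
            }
            break;
        case TAG_SHUTDOWN:               /* user interface ends the run    */
            running = 0;
            break;
        }
    }
    MPI_Finalize();
    return 0;
}

A real solution must also track the travel direction and keep a proper queue of waiting trains; the sketch shows only the receive-and-dispatch structure of the controller.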
SELF-CHOSEN STUDENT PROJECTS
To finish the quarter, the students formed programming teams to develop applications of
their own choosing as parallel applications, within the message-passing environment of their
own selection. In the Fall 2000 course both a three-person team and a two-person team chose to
do the graphical problem of ray tracing in parallel, while the remaining student chose to
begin the development of an interactive combat simulation. The two ray-tracing teams chose to
develop their programs under PVM, while the combat simulation was based on explicit socket
communications among the cooperating processes. A mandated intermediate design presentation
did provide some useful cross-fertilization in the two ray-tracing projects.
The Winter 2002 student projects were all lost when the Armada system administrator
accidentally destroyed all user data from the Winter-quarter /home directory. It is only by
chance that the instructor’s own files were copied to a different location before the disk
partitioning that destroyed the earlier information.
SECURITY CONFLICT PROBLEM: RSH (REMOTE SHELL)
The PVM and MPI message passing systems accomplish their tasks in part by remotely
initiating processes on other computers. The classical Unix command for this is “rsh” — and
that is a major security hole within Unix; consequently it is commonly disabled on computers
connected to the Internet. (Since the Armada has computers communicating only on a local
network, there was no problem using “rsh” under that environment.) While an alternative is
available in “ssh” (secure shell), in 2000 that utility was not initially implemented on the
available computers. Even after “ssh” became available on the Internet-accessible computers
used for the course, we never found a way to avoid the password prompt when issuing an “ssh”
command; the documentation available from several sources does not in fact work as indicated
on the Linux lab machines as they are currently configured.
There is, however, a work-around for PVM, allowing use of those machines for PVM
applications. The PVM system is based on daemons running on each computer within the virtual
machine. Though these daemons are typically started by “rsh” or “ssh” commands issued from
the computer that starts the virtual machine, there is a maintenance alternative: the “manual
start” option, whereby the user expressly starts each of the daemons in the virtual machine from
a command string provided by the initial daemon.
MPI, at least under MPICH, did have a significant problem. MPICH initializes the
cooperating processes (without any intervening daemons) by issuing “rsh” or “ssh” commands to
the computers in use. Consequently the Armada can be used easily (since “rsh” is available
there), but the departmental networked computers need to be accessed through “ssh.” As
mentioned above, these connections require that the user provide passwords for each of the “ssh”
commands issued. A significant side effect of this is that C’s stdin and C++’s cin are not
available as communication channels, even for instance zero of the processes running under
MPICH. This was confirmed by developing a program attempting keyboard input that runs
without any problem in the Armada environment, but fails on the departmental networked
computers.
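The confirming program is not reproduced in the paper; a test of that sort might look like the following sketch, in which process zero attempts keyboard input and broadcasts the value it reads.

/* A test of the sort described: process 0 attempts to read an integer    */
/* from the keyboard and broadcasts it to the other processes.  On the    */
/* Armada (rsh start-up) it behaves normally; on the ssh-accessed lab     */
/* machines the keyboard input never arrives.                             */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        printf("Enter an integer: ");
        fflush(stdout);
        if (scanf("%d", &value) != 1)   /* fails where stdin is unavailable */
            value = -1;
    }
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("process %d received %d\n", rank, value);

    MPI_Finalize();
    return 0;
}

Replacing the scanf with a value taken from the command line (still broadcast from process zero) is the sort of change meant below by avoiding dependence on keyboard input.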
RECOMMENDATION
The rsh/ssh problem makes it highly desirable to teach the course using a private network
within which rsh is available. Those wishing to develop such a system may find Stuart Steiner’s
paper useful. [15] In addition, there is an entire book available on constructing a Beowulf
cluster, which can be used to supplement Stuart Steiner’s comments. [27]
Such a cluster can be constructed with surplus computers (as was the original Boat Anchor
Armada) and still provide a usable instructional environment. Parallel applications do not need
to be computationally intensive to allow students to learn how to generate them. Further, the
private network can provide a useful debugging environment. Once the parallel application has
been debugged, it can be moved to publicly networked computers using one of the ssh
work-arounds (either the “manual start” option under PVM, or avoiding dependence on keyboard
input under MPI).
Programs developed by the instructor for this course are available through the author’s web
site at Eastern Washington University. [28]
NOTES
Web copies of this paper, with hyperlinks for all URLs, are available:
MS Word: http://penguin.ewu.edu/~trolfe/CCSC2002/Distrib.doc
HTML: http://penguin.ewu.edu/~trolfe/CCSC2002/Distrib.html
RTF: http://penguin.ewu.edu/~trolfe/CCSC2002/Distrib.rtf
[1] Al Geist and others, PVM: Parallel Virtual Machine — a Users’ Guide and Tutorial for Networked Parallel Computing (MIT Press, 1994), p. xiv.
[2] Timothy Rolfe, “PVM: an affordable parallel processing environment,” SCCS Proceedings: 27th Annual Small College Computing Symposium (SCCS, 1994), pp. 118-125. Available through http://penguin.ewu.edu/~trolfe/SCCS-94/SCCS-94.html
[3] http://www.netlib.org/pvm3/book/pvm-book.html
[4] http://www.netlib.org/pvm3/book/pvm-book.ps
[5] William Gropp, Ewing Lusk, and Anthony Skjellum, Using MPI — Portable Parallel Programming with the Message-Passing Interface (2nd edition; MIT Press, 1999).
[6] Peter S. Pacheco, Parallel Programming with MPI (Morgan Kaufmann Publishers, Inc., 1997). It is discussed in http://fawlty.cs.usfca.edu/mpi/ — and is an extensive revision and expansion of A User’s Guide to MPI, available through ftp://math.usfca.edu/pub/MPI/mpi.guide.ps.Z
[7] Barry Wilkinson and Michael Allen, Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers (Prentice-Hall, Inc., 1999).
[8] Michael J. Flynn, “Very High-Speed Computing Systems,” Proceedings of the IEEE, Vol. 54, No. 12 (December 1966), pp. 1901-09.
Michael J. Flynn, “Some Computer Organizations and Their Effectiveness,” IEEE Transactions on Computers, Vol. C-21, No. 9 (Sep 1972), pp. 948-960.
[9] Ralph Duncan, “A Survey of Parallel Computer Architectures,” [IEEE] Computer, Vol. 23, No. 2 (Feb 1990), pp. 5-16.
[10] Thuy Trong Le and Tri Cao Huu, “Advances in Parallel Computing For the Year 2000 and
Beyond,” available at http://www.vacets.org/vtic97/ttle.htm — with HTML running title
“A Survey of Parallel Computing: From the Past to the Future”
[11] Thomas Sterling, Donald Becker, Daniel Savarese, et al., “BEOWULF: A Parallel
Workstation for Scientific Computation,” Proceedings of the 1995 International
Conference on Parallel Processing, Vol. 1 (August 1995), pp. 11-14. Available in
PostScript at http://www.beowulf.org/papers/ICPP95/icpp95.ps — the HTML version of
the paper at the same location appears unable to access the .gif files for its figures.
Jarrett Cohen, "Beowulf Lives On — As a Build-It-Yourself Computer", [NASA]
InSights, November 1998, pp. 2-9.
Rick Cook, "Fast and Cheap", LinuxWorld, April 2000 —
http://www.linuxworld.com/linuxworld/lw-2000-04/lw-04-parallel_p.html
The principal web site for the Beowulf Project is at http://www.beowulf.org/, and many
more resources are available through that site.
[12] Timothy Rolfe, “Queens on a Chessboard: Making the Best of a Bad Situation”, SCCS:
Proceedings of the 28th Annual Small College Computing Symposium (SCCS, 1995),
pp. 201-10. Available through http://penguin.ewu.edu/~trolfe/SCCS-95/SCCS-95.html
[13] SETI-at-home:
articles: http://www.computer.org/cise/articles/seti.htm
http://www.discovery.com/news/features/setiathome/setiathome.html
home page: http://setiathome.ssl.berkeley.edu/
GIMPS (prime number search):
articles: http://www.utm.edu/research/primes/mersenne/index.html
http://www.utm.edu/research/primes/notes/13466917/
home page: http://www.mersenne.org/
Cancer drug configurational studies:
articles: http://www.the-scientist.com/yr2001/may/hand_p1_010514.html
http://more.abcnews.go.com/sections/scitech/DailyNews/screensaver010524.html
home page: http://www.chem.ox.ac.uk/curecancer.html
[14] The instructor referred to the 486-66 computers as “boat anchors”; when the network was
generated, Prof. Steve Simmons suggested extending the nautical reference by using
“armada” as the private network name.
[15] Stuart Steiner, “Building and Installing a Beowulf Cluster,” The Journal of Computing
in Small Colleges [Proceedings of the Third Annual CCSC Northwestern Conference],
Vol. 17, No. 2 (December 2001), pp. 75-83.
[16] A complete chapter is devoted to “embarrassingly parallel computations” in Wilkinson and
Allen, op.cit., pp. 82-106.
[17] Home page for NOW Sort: http://now.cs.berkeley.edu/NowSort/
Conference presentation at SIGMOD ’97 (Tucson, Arizona, May, 1997):
Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, David E. Culler, Joseph M.
Hellerstein, and David A. Patterson, “High-Performance Sorting on Networks of
Workstations”, http://now.cs.berkeley.edu/NowSort/nowSort.ps (PostScript).
[18] Pacheco, op.cit., pp. 53 ff.
[19] Wilkinson and Allen, op.cit., pp. 99-100.
[20] http://penguin.ewu.edu/~trolfe/CCSC2002/ForkBased/index.html
[21] The main web page for PVM is at http://www.epm.ornl.gov/pvm/. The current version is
available through http://www.netlib.org/pvm3/index.html
[22] Information on MPI itself is available through http://www-unix.mcs.anl.gov/mpi/.
Information on MPICH is available through ftp://ftp.mcs.anl.gov/pub/mpi.
Information on LAM/MPI is available through http://www.mpi.nd.edu/lam/.
MPI — The Complete Reference is available in HTML format through
http://www.netlib.org/utk/papers/mpi-book/mpi-book.html, and can also be downloaded in
PostScript format through http://www.netlib.org/utk/papers/mpi-book/mpi-book.ps.
The Joint Institute for Computational Science at the University of Tennessee (Knoxville)
has a very useful “Beginners Guide to MPI” in 22 pages. The PostScript version is
available at http://www-jics.cs.utk.edu/MPI/MPIguide/MPIguide.ps, while the entry point
for an HTML version is at http://www-jics.cs.utk.edu/MPI/MPIguide/MPIguide.html
(though there seem to be some problems with its figure files).
[23] http://penguin.ewu.edu/~trolfe/CCSC2002/Histogram.html
[24] http://penguin.ewu.edu/~trolfe/CCSC2002/Bakery.html
[25] Andrew S. Tanenbaum, Modern Operating Systems (Prentice-Hall Inc., 1992), p. 264.
[26] http://penguin.ewu.edu/~trolfe/CCSC2002/Train.html
[27] Thomas Sterling, John Salmon, Donald J. Becker and Daniel F. Savarese, How to Build a
Beowulf (MIT Press, 1999), 261 pp.
[28] http://penguin.ewu.edu/~trolfe/CCSC2002/