InHouse: A User-Oriented Monitoring Approach

G. Haring, G. Kotsis, S. Musil
Institut für Angewandte Informatik und Informationssysteme
Universität Wien
Lenaugasse 2/8, A-1080 Vienna, Austria

This work is supported by the Austrian Ministry for Science, Research and Arts (research grant GZ 613.542/1-26/92).

Abstract

In this paper we present a new approach to the monitoring and visualization of parallel processing systems. Instead of measuring the system and afterwards presenting the user with a variety of performance figures and visualization displays, this user-oriented approach lets the user select the events and features of interest before instrumentation and measurement. Different techniques for this a priori filtering of events are provided, ranging from direct manipulative selection in a graphical, hierarchical representation of the program and the architecture to textual, rule-based selection. Real-life metaphors in the visualization make this approach particularly suited to so-called novice (i.e. non-expert) users, which suggests teaching as one of the major fields of application.

1 Introduction

The increasing availability of parallel processing facilities in recent years raises the question of how to develop parallel software effectively and efficiently in many fields of application. The critical point in the development process is tuning the parallel application to the characteristics of the underlying architecture so that maximum efficiency is gained. This tuning requires several iterations of program development and modification on the one hand, and observation of the program behavior during execution on the target architecture on the other. Monitoring is therefore a proper methodology to support this performance-oriented development of parallel programs.

The monitoring process is meant to give the user insight into the performance behavior of the system, covering both hardware and software aspects. The methodological approach focuses on software monitoring, using event- and time-oriented instrumentation techniques. The event specification (both predefined and ad hoc user-defined) covers various levels of abstraction, ranging from high-level language instructions up to the task level.

Much has been published in the area of measuring and visualizing the performance and behavior of parallel programs and systems (see e.g. [2], [8]). The shortfalls and problems of existing (software) monitoring and visualization approaches can be formulated as follows:

- Instrumenting the program code by hand is an error-prone and tedious task. Therefore most monitoring tools instrument the code automatically, but this causes a loss in flexibility.
- The user typically has no influence on what is measured. Selection is done only after monitoring, in the visualization phase.
- The execution of parallel programs typically results in large trace files. Keeping the amount of information contained in the trace file as small as possible is crucial for fast visualization and analysis of the data.
- The user interfaces of these tools very often seem like an unwanted add-on that is done quickly at the end of the development process. Many of these interfaces are hard to use, especially for inexperienced users.
- The visualization techniques are not exploited satisfactorily. Often only two-dimensional techniques are used to display multidimensional data.

To overcome these shortfalls we propose a user-oriented monitoring and visualization methodology, which has been implemented in a tool called InHouse. InHouse differs from other monitoring tools in that it is directed towards a specific user group, namely novices to, or learners of, parallel concepts. This user group calls for a different approach to monitoring and to representing a parallel system on the screen. Intuitive displays are a must, and an easy-to-understand representation of the parallel system ensures that the user will be able to use the tool and explore the concepts of parallelism very soon.
Data displays using traditional visualization techniques such as Gantt charts or Kiviat diagrams will not be clear to this kind of user, as they use identical visual forms for conveying different kinds of information. Furthermore, they are very abstract and do not help at all in getting to know the concepts of parallelism. Existing monitoring tools ([17], ParaGraph [9], Traceview [16], ParMon [19]) are directed more towards the expert user who has a clear understanding of the parallel system. In this case traditional visualization techniques serve the purpose of conveying information to the user well, as they show a lot of information in a compressed form. The experienced user will know that the various displays show different information and will interpret them accordingly.

Still, there are some monitoring or algorithm animation tools that make use of alternative visualization techniques. For example, Brown [3] makes use of sound for algorithm animation. Different sounds denote different states of processors, and empirical tests have shown that users are capable of distinguishing many sound patterns after some time. Zabala [22, 23] uses visualization techniques called cave and multiple icons. Both techniques are good for displaying multidimensional data.

2 Studying the User Demands

The learners of parallel concepts have no, or only a vague, mental model [1] of a parallel system. They may have been told that a parallel system has many processors and that things happen faster, but they still lack a clear picture of what this system will be like. The mental model is only loosely coupled to known concepts of computers. The only thing these people know is that one can run programs on such machines and that it is possible to measure certain things in order to check whether the program really runs faster than before.

Therefore the users expect the monitoring tool to be usable in two respects: on the one hand the tool must give a clear description of a parallel system, and on the other hand it must be easy to use, so that the users can concentrate on fulfilling their tasks and do not have to think about the usage of the system:

Clear Description: As the users, who as stated before are not familiar with parallel concepts, need to build up a model of the parallel system with the help of the tool, it is important to provide them with a clear description of what is shown on the screen. One of the most intuitive ways to convey new information is by means of known concepts that share the main features of the new information. By making comparisons and drawing analogies, new concepts are learned with a minimum of effort. Users want this description to be easily understandable and to give insight quickly. Nothing is more frustrating than not being able to use a tool after half an hour.

Ease of Use: The user is already occupied enough with learning the new concepts and selecting measures and objects. It absolutely must not be difficult to use the system. Navigation in the parallel system must be as easy as walking around in one's own flat. The selection mechanisms must be intuitive and well known in order to reduce cognitive workload. Furthermore, the elements of the user interface must be used consistently and without any modes. Only an easy-to-use human-computer interface will ensure that the user can focus on the task and does not have to learn to tackle the system first.

3 Principles and Methods

We have designed the InHouse system according to the following principles:

Hierarchy: Since the systematic approach to developing parallel programs implicitly possesses hierarchical features, the best idea is to unify all performance information in a regular structure which matches the structure of both the program and the development process. If these requirements are fulfilled, the proposed structure will allow fast access to the performance information as well as to the source code of the program and will support the tuning of the program. The identification of performance problems will also be facilitated. The hierarchy-based model has multiple advantages: it has a regular structure, provides multiple levels of abstraction, supports multiple views, and allows an easy way of moving through the different levels.

Architecture Independence: In this project a transputer system has been used as the parallel architecture and OCCAM as the programming language, but the concepts are based on CSP [10, 11, 12] and can therefore be applied to many parallel programming languages and distributed memory multicomputers. The trace data file contains performance data in a tool-independent format. This independence makes the tool concept robust against future developments and against changes in the user's visualization requirements or in the display techniques of the workstation equipment. It is even suitable for supporting the use of hardware monitors for collecting measurement data. The proposed information structure was designed to support an object-driven software monitor. The approach to solving the monitoring problem is time-based, i.e. after the run of the program a set of time stamps corresponding to the program objects is available. By a simple mapping of the time stamps to the developed structure, the dynamic behavior of the program can be revealed. This structure includes a five-level hierarchy representation. Two basic data structures are used for representing the program on each of the hierarchy levels: graph and tree. The proposed structure is independent of the method of monitoring and may be used in software monitors as well as in hardware and hybrid monitors. It is also language and environment independent.

Information Manipulation Environment: This is a specific kind of user interface that has the following features [18]: it is highly visual, metaphor-oriented, application-specific and direct manipulative. An IME ensures the usability of the InHouse system with all its four features. A highly visual interface is easy to understand. The user is presented with information in graphical form and does not see only text on the screen. Graphic representations provide the biggest mental bandwidth [20]. The human brain is specialized for perceiving visual information and can therefore take up more information at once. Furthermore, visual concepts are very general concepts and are not constrained by standard look-and-feel components. A metaphor-oriented interaction environment is superbly suited to the purposes of a tool for learners. Metaphors [4, 5, 14, 15] work by explaining facts that are completely new or rather unusual for the user by means of other facts that are very well known to the user and that share the main features of the facts they are to explain. Empirical tests [21] have shown that so-called real-world metaphors are best suited for conveying new knowledge. A problem-oriented user interface takes into account the tasks that have to be accomplished and the target users, and offers the corresponding representations and interaction techniques. User interfaces for novices will mostly differ from user interfaces for experts, and not every application can be pressed into the corset of the desktop metaphor. Furthermore, the input and output devices have to fit the visual representations used. What good is a 3D graphic if it is used solely with a 2D input device and is thus awkward to handle? Direct manipulation is the key interaction technique in Information Manipulation Environments. The user has to be able to interact with the things in the environment as directly as possible. This is the most intuitive way of dealing with the world and should be exploited in a human-computer interface, too.

Usability: The tool must be easy to use. It must assist the user in understanding the parallel concepts and in accomplishing the tasks without being bothered by a complicated navigation or selection mechanism. Furthermore, the look and feel must be consistent.

4 Description of InHouse

4.1 Hierarchical Concept

An overview of a system decomposition hierarchy is shown in figure 1. Every level allows a given program to be examined at the level's resolution, which is defined through the objects of interest.

The system level is the highest level in the hierarchy. On this level the program is treated as a black box, i.e. the resolution on this level is the program as a whole.

The network level is the level below the system level. On this level the system is defined through the set of processors and the set of their interconnecting physical links. Thus, the processors and their links define the granularity on this level. All the details about the processes inside given processors are hidden from the user. The processor is treated as a black box, which may or may not communicate with its neighbors through its links. Only the static physical communication channels are represented (i.e. the topology of the network).

On the process level the parallel program is represented in terms of communicating processes. The level resolution is defined through the processes, treated as black boxes, and their channels. The details about the process structure are hidden from the user. Groups of processes or individual processes are objects of interest on this level. In fact, this level implies the conceptual solution of the application problem.

Figure 1: System Hierarchy and Metaphorical Representation (the System consists of Processors 1 to n, containing Processes 1 to n, containing Statements 1 to n; correspondingly, the Hotel has Rooms containing Suitcases containing Pieces)
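The hierarchy-based structure and the time-stamp mapping described under the Hierarchy and Architecture Independence principles can be illustrated with a small sketch. It is not the InHouse implementation: the class `Node`, the functions `map_stamps` and `span`, the level names (taken from Figure 1) and the trace record layout `(time, object name, event)` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One object of interest: the system, a processor, a process or a statement."""
    name: str
    level: str
    children: List["Node"] = field(default_factory=list)
    stamps: List[Tuple[float, str]] = field(default_factory=list)  # (time, event)

    def add(self, child: "Node") -> "Node":
        """Attach a child node one level further down and return it."""
        self.children.append(child)
        return child

    def find(self, name: str) -> Optional["Node"]:
        """Depth-first search for the node called `name`."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None


def map_stamps(root, trace):
    """Attach (time, object, event) records to their nodes; return unmatched records."""
    unmatched = []
    for time, obj, event in trace:
        node = root.find(obj)
        if node is None:
            unmatched.append((time, obj, event))
        else:
            node.stamps.append((time, event))
    return unmatched


def span(node):
    """Earliest and latest time stamp in the subtree, or None if it has none."""
    times = [t for t, _ in node.stamps]
    for child in node.children:
        sub = span(child)
        if sub is not None:
            times.extend(sub)
    return (min(times), max(times)) if times else None


# Hypothetical two-processor system and trace, invented for this example.
system = Node("system", "system")
cpu1 = system.add(Node("processor 1", "processor"))
cpu2 = system.add(Node("processor 2", "processor"))
cpu1.add(Node("process 1", "process"))
cpu2.add(Node("process 2", "process"))

trace = [(0.0, "process 1", "start"), (1.5, "process 2", "start"),
         (4.0, "process 1", "end"), (6.0, "process 2", "end")]
map_stamps(system, trace)
print(span(cpu1), span(system))
```

Because the time stamps are stored at the node they belong to, the same data can be read at any resolution: `span` on a processor hides its processes as a black box, while `span` on the root treats the whole program as one.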
At the lowest level, the command level, every process is a spatially ordered set of commands. The structure of the process representation on this level should reflect this order. All the details about the process structure are visible to the user. Any command may be treated as an object of interest. Since this granularity would be too fine in most cases, several rules for "packing" sequences of commands into larger objects can be applied. For example, a sequential set of commands which does not include any input or output command may be treated as a single object in order to reduce the number of time measurements.

4.2 Scenario, Walk through

Interaction with InHouse can be split into three phases:

1. Object Specification Phase: In this phase the users select which parts of the program shall be monitored and which can be left out. Users shall be able to select the information they are interested in in two ways (the meaning of the third way will be explained later). On the one hand there should be a possibility to navigate through a representation of the program and directly select the pieces of interest. On the other hand users should be allowed to select the measures they are interested in without having to think about how these measures will be obtained; this is done by the program. We call the first way of selection "object selection" and the second "measure selection".

2. Monitoring Phase: The source code is instrumented with the probes and run on the target system, thus producing a trace file for later analysis.

3. Visualization Phase: The trace file is analyzed and the results are shown in various displays. Different views of the program shall provide the deepest insights and allow for easy detection of errors or bottlenecks. Again, users shall be allowed to select, from a variety of options, the displays that seem most useful to them.
It is important to note that only information that has been requested during the object specification phase will be available in this phase. So it can happen that some displays are not available for a certain program run, because the information they need has not been collected.

Only two of these three phases call for user interaction, namely Object Specification and Visualization. In the following, the InHouse system is introduced from the user's point of view. First some general discussion of the user interface is given, and then the two phases most important for the user are explained in more detail.

Figure 2: Selection by Direct Manipulation

We decided to design a metaphor-based, WIP (Windows, Icons, Pointing), 2 1/2 D IME for the monitoring tool. We used metaphors because we had our design target in mind: explaining something new or unfamiliar to users. As stated before, metaphors are a useful means of conveying new knowledge, if not the only way of learning something radically new. Our interaction environment is based on the premise that visual information is memorized more easily and is an efficient part of intelligent behavior. Therefore most of the system's functions are presented by visual means. The goal was to use active and flexible visual elements based on metaphorical semantics which are not constrained to specific sets of widgets. It should be possible to use the tool with what is nowadays standard hardware equipment, so that most students can use it. A one-button input device is used to allow usage over a variety of platforms. We use 2 1/2 D only, because real 3D environments require a real 3D input device to be used appropriately; using a 3D environment with a mouse is not appropriate.

The first step in designing the interaction environment was to find an appropriate metaphor for the parallel system.
We arrived at a pool of metaphors for representing this structure, which included, for example, a tree, a city, a prison, a school and a hotel. We then concentrated on finding out which metaphor would represent our problem best, and whether and how we could bring in the missing parts of the functionality. After some internal discussions assisted by certain techniques and some informal interviews with future users, we chose the hotel metaphor, because all other metaphors either lacked functions or had too many features that were not part of the problem. The hotel was accepted because it resembled the parallel system hierarchy best and because it provided all the demanded functions in an easy and understandable way.

The hotel now stands for the complete parallel system, and the rooms represent processors. The size and location of a room currently have no meaning, but could have: size could show whether a processor has small or large computing capability, and location on floors could help in representing groups of processors. Processors are connected via hard links that allow them to exchange messages. Correspondingly, the rooms are connected via in-house telephones. Every room has a phone. During the selection phase a phone, when selected, automatically calls all its connected partners, thus showing the hardware layout.

The contents of a room are suitcases, which represent the processes that are mapped onto a processor and will be executed there during runtime. We preferred suitcases to pieces of furniture, because furniture is normally placed in a room only once (in a while), whereas the processes (now suitcases) change their location more often. As there is no fixed order of execution of processes in our parallel system, we can distribute the suitcases randomly in the room and can also allow the user to move them around. Suitcases are currently all the same size, but we could use size to indicate the number of code lines of a process.
Processes are connected via soft links. We represent the soft links with suitcase labels of the sort used at airports to send suitcases to the right place. The idea behind this is that suitcases with the same destination can talk to each other.

A literature survey brought us to jigsaw puzzle pieces, which are successfully used in visual programming [6, 7, 13], as the contents of the suitcases. The decision to pack jigsaw pieces into the suitcases might seem strange at first sight, but their characteristics make the choice obvious: they have a fixed sequence and various shapes for indicating types of code constructs. As specified in the system architecture, there should also be a means of regulating the degree of detail shown in the code representation. This was also provided for with the puzzle pieces. The level of detail can be changed interactively in the program, but only for all puzzle piece windows at once. The sets are then updated accordingly. The current level of detail is also shown in the iconized version of the detail selection panel, so that users always know on which level they operate.

After a representation for the parallel system had been found, the selection mechanisms had to be worked out. Object selection means browsing through the representation of the parallel system and selecting the pieces of interest. Objects therefore have to provide two functions: open (if necessary) and select. To open a room (i.e. to enter it), the window blind must be clicked; to select the room, the light inside has to be turned on by clicking on the lower part of the window. Suitcases are opened by clicking the lock and are selected by putting them into an upright position. The complete set of puzzle pieces is always shown in the suitcase window when the suitcase is opened. Users can zoom and move around in the set using a knob and arrows. Puzzle pieces currently cannot be opened to see the underlying code lines.
They are selected by simply clicking on them, which makes them blink for a certain time. As stated before, we purposely did not apply any additional text and relied on colour coding only. We realize that for programs with more processes on more processors (massively parallel programs) labeling the items will be necessary, but we are interested in how long users are still able to remember the colours. Still, we aid memory by showing the parent node(s) (the corresponding suitcase and/or room) in the left-hand corner of each window. Telephones and suitcase labels can only be selected. A selected telephone has a lifted receiver; a selected suitcase label is highlighted. Figure 2 shows a typical screen of direct selection.

Measure selection is done with a selection bar that works like a combination lock for bicycles. Every measure is a combination of three items. The first determines the type of the measure, which can be duration or count. The second determines the state of an object (if it can have more than one), which can be idle, waiting, active or communicating. The third determines from which object, or which group of objects, this measure shall be calculated. If an object has to be specified more precisely, an extra window pops up that shows all options. For example, if a measure for one process (suitcase) is selected, all available suitcases, sorted by processor, are displayed. Senseless combinations are detected by the system and cannot be selected; a red sign is shown over the checkbox. If a certain measure has already been selected, a green mark is shown over the checkbox when this combination is chosen. All items are represented by visual objects.

Measure selection and direct selection overlap semantically. Selecting an object directly is like selecting a lot of measures for this object via measure selection, whereas selecting an object via measure selection means selecting the object partially via direct selection.
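The three-item measure selection can be sketched as a small validity check over (type, state, object) triples. The measure types and states are those named in the text; everything else is an assumption for illustration. In particular, the paper does not list which combinations count as senseless, so the table `ALLOWED_STATES`, the object kinds in it, and the names `is_sensible` and `MeasureSelection` are hypothetical, and the third item is simplified to an object kind rather than a concrete object or group.

```python
from itertools import product

MEASURE_TYPES = ("duration", "count")
STATES = ("idle", "waiting", "active", "communicating")

# ASSUMPTION: which states each kind of object can take. The source only says
# that senseless combinations are detected, not which ones they are.
ALLOWED_STATES = {
    "processor": set(STATES),
    "process": set(STATES),
    "link": {"idle", "communicating"},
    "statement": {"active"},
}


def is_sensible(mtype, state, kind):
    """True if the (type, state, object kind) triple describes a measurable quantity."""
    return mtype in MEASURE_TYPES and state in ALLOWED_STATES.get(kind, set())


class MeasureSelection:
    """Keeps the set of chosen measures and reports the mark shown over the checkbox."""

    def __init__(self):
        self.selected = set()

    def mark(self, mtype, state, kind):
        """'red' for a senseless combination, 'green' if already selected, else 'none'."""
        if not is_sensible(mtype, state, kind):
            return "red"
        if (mtype, state, kind) in self.selected:
            return "green"
        return "none"

    def select(self, mtype, state, kind):
        """Add a measure; senseless combinations cannot be selected."""
        if not is_sensible(mtype, state, kind):
            raise ValueError(f"senseless combination: {(mtype, state, kind)}")
        self.selected.add((mtype, state, kind))


def sensible_combinations():
    """All triples the selection bar would accept under the assumed table."""
    return [c for c in product(MEASURE_TYPES, STATES, ALLOWED_STATES)
            if is_sensible(*c)]


bar = MeasureSelection()
bar.select("duration", "waiting", "process")
print(bar.mark("duration", "waiting", "process"))   # already selected
print(bar.mark("count", "waiting", "statement"))    # rejected under the assumed table
```

Checking validity in one place means the red and green marks and the refusal to select are always consistent with each other.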
This overlap has to be made visible to the users. We do this on the one hand by "locking" already selected objects in measure selection, which means that they are shown in the selected state but cannot be deselected; they are marked with a lock. On the other hand we show in an animation which measures have already been selected for a given object. The animation can be seen by clicking on the icon in the lower left corner of the desired object's window. This small icon is in the selected state if measures have been selected and in the unselected state if not. A small separate window then pops up under the icon, which looks like the icon bar without the handles, and continuously displays the selected measures until exited.

Figure 3: Sample Screen of an InHouse Session

For the visualization phase the visual elements, and thus the representation of the parallel system, were reused. The displays should be as intuitive and as informative as possible, and the users should be provided with as many displays as possible. It turned out that the displays could be categorized into five different groups.

Status Information: The status of an object at every time during the run is displayed dynamically. The display of communication partners could also fall within this group.

Duration Information: The duration of idle, waiting etc. periods of a certain object is displayed in graphical form, to be viewed statically.

Counting Information: This kind of display gives information about how many objects of a kind were active at a time, again dynamically.

Statistical Information: These displays inform the user about the average performance of an object. All data that has been gathered for an object is shown in numerical form, assisted by a graphical form where useful. These displays are static.

Load Information: These displays show how much a certain object was used, how large the passed messages were, etc. These displays are both static and dynamic.
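To make the Duration and Counting groups concrete, the following sketch derives both kinds of information from a list of state changes. The trace layout `(time, object, new state)`, the function names `durations` and `count_in_state`, and the sample data are assumptions for illustration; the paper does not specify the trace file format.

```python
from collections import defaultdict


def durations(trace, end_time):
    """Total time each object spent in each state.

    `trace` is a list of (time, object, new_state) records sorted by time;
    `end_time` closes the last state of every object.
    """
    totals = defaultdict(float)
    current = {}  # object -> (state, time at which the state was entered)
    for time, obj, state in trace:
        if obj in current:
            prev_state, since = current[obj]
            totals[(obj, prev_state)] += time - since
        current[obj] = (state, time)
    for obj, (state, since) in current.items():
        totals[(obj, state)] += end_time - since
    return dict(totals)


def count_in_state(trace, state, at_time):
    """Number of objects that are in `state` at `at_time`."""
    latest = {}
    for time, obj, new_state in trace:
        if time > at_time:
            break  # the trace is sorted, so later records cannot matter
        latest[obj] = new_state
    return sum(1 for s in latest.values() if s == state)


# Hypothetical state changes of two processes, invented for this example.
trace = [
    (0, "process 1", "active"),
    (0, "process 2", "idle"),
    (3, "process 1", "waiting"),
    (4, "process 2", "active"),
    (5, "process 1", "active"),
]

totals = durations(trace, end_time=8)
print(totals[("process 1", "active")])     # time process 1 spent active
print(count_in_state(trace, "active", 6))  # objects active at time 6
```

The duration totals correspond to a static display, while evaluating `count_in_state` for successive values of `at_time` gives the values a dynamic counting display would step through.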
The dynamic displays are controlled with the tape tool. This tool is used like a normal VCR control panel. A jog shuttle is used for regulating the speed of slow motion. The tape tool is an extension to our hotel metaphor, but does not spoil the effect, as VCRs can be expected to be found in a hotel nowadays.

We have implemented the status displays. Figure 3 shows a sample screen with the tape tool in use and some status displays open. On the left-hand side there are the icons for the five display groups. The users can select status displays on all levels, for grouped or single items, and then replay the program as often as they want.

5 Conclusions

This project has successfully developed a system for the monitoring of parallel programs that is centered around the user. Users have control over the whole monitoring and visualization process. They actively control what is measured, what is visualized, and the dynamics of this visualization. An alternative user interface approach has been chosen that focuses on so-called novice users, whose needs are explicitly supported. Easy comprehension of the parallel concepts is provided by a metaphor-based approach, and good retention by a small visual alphabet that is used in both the selection and the visualization phase.

Further activities in the realm of monitoring and visualization of results could include the development of an agent-based visualization system, in which an agent automatically detects bottlenecks and points the user to interesting displays and execution phases. Furthermore, one could try to reuse the globally defined system architecture with another underlying parallel architecture. A design solution for massively parallel systems could also be developed.
From the human-computer interaction designer's point of view this project has shown that there is still a need to define design processes for non-standard, metaphor-based interaction environments such as the user interface of InHouse. There are no real methods to be found, only some vague guidelines, and it would be a big challenge to work out a design process that shows not only what to do, but also how to accomplish the design goals.

References

[1] J.R. Anderson. Kognitive Psychologie. Spektrum der Wissenschaft, Germany, 1989.
[2] H. Blaschek, G. Drsticka, A. Ferscha, and G. Kotsis. Visualization of parallel program behavior. Technical report, University of Vienna, Dept. of Appl. Comp. Sci., 1992. Internal Report.
[3] Marc H. Brown and John Hershberger. Color and sound in algorithm animation. IEEE Computer, pages 52-63, December 1992.
[4] J.M. Carroll. Metaphors and user interface design. In M. Helander, editor, Handbook of Human Computer Interaction, page ??. Elsevier Science Publisher, 1988.
[5] T.D. Erickson. Working with user interface metaphors. In B. Laurel, editor, The Art of Human Computer Interface Design, pages 65-73. Addison Wesley, 1990.
[6] E.P. Glinert. Out of Flatland: Towards 3-d visual programming. In E.P. Glinert, editor, Visual Programming Environments, Applications and Issues, pages 547-554. IEEE Computer Society Press, 1990.
[7] E.P. Glinert, M.M. Blattner, and C.J. Frerking. Visual tools and languages: Directions for the 90ies. In Proceedings of the IEEE Workshop on Visual Languages, pages 89-95. IEEE Computer Society Press, 1991.
[8] G. Haring and G. Kotsis, editors. Performance Measurement and Visualization of Parallel Systems, volume 7 of Advances in Parallel Computing, G. R. Joubert, Udo Schendel (Series Eds). North Holland, 1993.
[9] Michael T. Heath and Jennifer A. Etheridge. Visualizing the performance of parallel programs. IEEE Software, 8(5):29-39, September 1991.
[10] C. A. R. Hoare. Notes on Communicating Sequential Processes, volume 14 of NATO ASI Series F. Springer, Berlin, Heidelberg, August 1985.
[11] Charles Anthony Richard Hoare. Communicating sequential processes. Communications of the ACM, 21(8), August 1978.
[12] Charles Anthony Richard Hoare. Communicating Sequential Processes. Series in Computer Science. Prentice Hall International, UK, 1985.
[13] C.M. Holt. Visualizing an algebra of objects. In Proceedings of IEEE Workshop on Visual Languages, pages 225-230. IEEE Computer Society Press, 1991.
[14] B. Indurkhya. Metaphor and Cognition. Kluwer Academic Publishers, NL, 1992.
[15] G. Lakoff and M. Johnson. Metaphors We Live By. University of Chicago Press, USA, 1980.
[16] Allen D. Malony, David H. Hammerslag, and David J. Jablonowski. Traceview: A trace visualization tool. IEEE Software, 8(5):19-28, September 1991.
[17] M. A. Mayoski, E. Luque, R. Suppi, J. Sorribes, and M. A. Senar. Simulation and visualization tools for link-based parallel architectures. In Proceedings of EUROMICRO 91, 1991.
[18] S. Musil and M. Tscheligi. How to Design Non Standard Interaction Environments. CHI '94 Tutorial Notes, USA, 1994.
[19] E. Pozzetti, G. Serazzi, and V. Vetland. Parmon: Un monitor per programmi paralleli. Technical Report 3/148, Politecnico di Milano, November 1994.
[20] G. Rohr. Using visual concepts. In S. Chang, T. Ichikawa, and P. Ligomenides, editors, Visual Languages, pages 325-348. Plenum Press, 1986.
[21] K. Väänänen. Metaphor based user interfaces for multimedia environments. Position paper, Interchi '93 Workshop on Spatial Metaphors for User Interfaces, January 1993.
[22] E. Zabala and R.W. Taylor. Maritxu: Generic visualization of highly parallel processing. In Programming Environments, pages 171-180. North Holland, 1992.
[23] E. Zabala and R.W. Taylor. Power tools: New generation data presentation tools. In Proceedings of HCI'93 People and Computers, page ??. Cambridge University Press, 1993.