A Software Model and Specification Language for Non-WIMP User Interfaces

Robert J.K. Jacob, Leonidas Deligiannidis, Stephen Morrison
Department of Electrical Engineering and Computer Science
Tufts University, Medford, Mass.

Abstract

We present a software model and language for describing and programming the fine-grained aspects of interaction in a non-WIMP user interface, such as a virtual environment. Our approach is based on our view that the essence of a non-WIMP dialogue is a set of continuous relationships—most of which are temporary. The model combines a data-flow or constraint-like component for the continuous relationships with an event-based component for discrete interactions, which can enable or disable individual continuous relationships. To demonstrate our approach, we present the PMIW user interface management system for non-WIMP interactions, a set of examples running under it, a visual editor for our user interface description language, and a discussion of our implementation and our restricted use of constraints for a performance-driven interactive situation. Our goal is to provide a model and language that captures the formal structure of non-WIMP interactions in the way that various previous techniques have captured command-based, textual, and event-based styles and to suggest that using it need not compromise real-time performance.

CR Categories and Subject Descriptors: D.2.2 [Software Engineering]: Tools and Techniques—user interfaces; H.1.2 [Models and Principles]: User/Machine Systems—human factors; H.5.2 [Information Interfaces and Presentation]: User Interfaces; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism—virtual reality; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs—specification techniques

General Terms: Human Factors, Languages, Design

Additional Key Words and Phrases: User interface management system (UIMS), interaction techniques, specification language, state transition diagram, virtual reality, non-WIMP interface, PMIW

1. INTRODUCTION

“Non-WIMP” user interfaces, such as virtual environments, are characterized by parallel, continuous interactions with the user. However, most current user interface description languages (UIDLs) and software systems are based on serial, discrete, token-based models. This paper proposes and tests a two-component model for describing and programming the fine-grained aspects of non-WIMP interaction. The model combines a data-flow or constraint-like component for the continuous relationships with an event-based component for discrete interactions, which can enable or disable individual continuous relationships. Its key ingredients are the separation of non-WIMP interaction into two components and the framework it provides for communication between the two.

As will be seen in the paper, our model abstracts away many of the details of specific input devices and treats them only in terms of the discrete events they produce and the continuous values they provide. It thus provides a high-level framework that can reduce the current difficulties of integrating novel devices into virtual reality applications, provided the devices can be fit into our model of discrete events and/or continuous variables. Thus far we have not found one that does not fit. It will also be seen, particularly in our first example, that the model is applicable for describing elements of WIMP-style user interfaces that have continuous inputs and outputs, such as a scrollbar or slider.
This paper discusses our model and our implementation of it as follows:

• Our software model for capturing non-WIMP style interactions (Section 2), as first introduced in[35]
• A user interface description language that embodies it (Section 3)
• A programming environment we have developed for this language (Section 4)
• Some examples to illustrate the expressiveness or usefulness of the language for describing non-WIMP interactions (Section 5)
• Implementation issues and our use of constraints in virtual environments and similar high-performance interactive situations (Section 6).

1.1. Background

“Non-WIMP” user interfaces provide “non-command,” parallel, continuous, multi-mode interaction—in contrast to current GUI or WIMP (Window, Icon, Menu, Pointer) style interfaces[19]. This interaction style can be seen most clearly in virtual reality interfaces, but its fundamental characteristics are common to a more general class of emerging user-computer environments, including new types of games, musical accompaniment systems, intelligent agent interfaces, interactive entertainment media, pen-based interfaces, eye movement-based interfaces, and ubiquitous computing[33, 47]. They share a higher degree of interactivity than previous interfaces: continuous input/output exchanges occurring in parallel, rather than one single-thread, discrete-event dialogue.

Our goal is to develop and implement a model and abstraction that captures the formal structure of non-WIMP interaction in the way that existing techniques have captured command-based, textual, and event-based interaction. Most current (WIMP) user interfaces are inherently serial, turn-taking (“ping-pong style”) dialogues with a single input/output stream. Even where there are several devices, the input is treated conceptually as a single multiplexed stream, and interaction proceeds in half-duplex, alternating between user and computer. Users do not, for example, meaningfully move a mouse while typing characters; they do one at a time. Non-WIMP interfaces are instead characterized by continuous interaction between user and computer via several parallel, asynchronous channels or devices.

Because interaction with the new systems can draw on the user's existing skills for interacting with the real world, they offer the promise of interfaces that are easier to learn and to use. However, they are currently making interfaces more difficult to build. Advances in user interface design and technology have outpaced advances in models, languages, and user interface software tools. The result is that, today, previous-generation command language interfaces can be specified and implemented very effectively; current-generation direct manipulation or WIMP interfaces are moderately well served by user interface software tools; and the emerging concurrent, continuous, multi-mode non-WIMP interfaces are hardly handled at all.

Most of today's examples of non-WIMP interfaces, such as virtual reality systems, have of necessity been designed and implemented with event-based models more suited to previous interface styles. Because those models fail to capture continuous, parallel interaction explicitly, the interfaces have required considerable ad-hoc, low-level programming. While some of these approaches are very inventive, they have made such systems difficult to develop, share, and reuse.
We seek techniques and abstractions for describing and implementing these interfaces at a higher level, closer to the point of view of the user and the dialogue, rather than to the exigencies of the implementation.

1.2. Specifying the New Interfaces

It is not difficult to see why current specification languages and user interface management systems (UIMSs) have not been applicable to non-WIMP interaction. Consider the characteristics of current vs. non-WIMP interfaces:

• Single-thread input/output vs. parallel, asynchronous, but interrelated dialogues
• Discrete tokens vs. continuous and discrete inputs and responses
• Precise tokens vs. probabilistic input, which may be difficult to tokenize
• Sequence, not time, is meaningful vs. real-time requirements, deadline-based computations
• Explicit user commands vs. passive (“non command-based”) monitoring of the user.

On each of these counts, the characteristic of the traditional interaction styles corresponds to a characteristic of non-interactive programming languages processed by compilers. Indeed, much of current UIMS technology is built around compiler technology—processing of a single stream of discrete tokens via a single BNF (Backus-Naur Form) or ATN (Augmented Transition Network) syntax specification. Non-WIMP interfaces violate each of these assumptions and thus are not well served by compiler-based approaches.

For example, current UIMS technology typically handles multiple input devices by serializing all their inputs into one common stream. This is well suited to conventional dialogue styles but is less appropriate for styles where the inputs are logically parallel (that is, where the user thinks of what he or she is doing as two simultaneous actions). Parallel dialogues could still be programmed within the old model, but it would be preferable to be able to describe and program them in terms of logically concurrent (but sometimes interrelated) inputs, rather than a single serialized token stream. In a similar vein, it could be said that parallel processes can be programmed by explicitly time slicing each process. But it is unusual today to write parallel processes with explicit time slicing. Instead we write our parallel programs on top of the process abstraction, assuming parallelism. A separate layer then handles the transformation onto a single physical processor. We seek similar kinds of abstractions for user interface software.

Our model combines the applicable aspects of constraint-based and event-based user interface description languages into a framework intended to match the fine-grained properties of non-WIMP dialogues. Existing techniques could have been extended in various ad-hoc ways to describe the unusual aspects of non-WIMP dialogues. However, the real problem is not just to find some way to describe the user interface (since, after all, nearly any programming language could do that), but to find a language that captures the user's view of non-WIMP interaction as perspicuously as possible.

1.3. Underlying Properties of Non-WIMP Interactions

To proceed, we need to identify the basic structure of non-WIMP interaction as the user sees it. What is the essence of the sequence of interactions in a non-WIMP interface? We posit that it is a set of continuous relationships, most of which are temporary.

For example, in a virtual environment, a user may be able to grasp, move, and release an object.
The hand position and object position are thus related by a continuous function (say, an identity mapping between the two 3-D positions)—but only while the user is grasping the object. A scrollbar in a conventional graphical user interface can also be viewed this way. The y coordinate of the mouse and the region of the file being displayed are related by a continuous function (a linear scaling function, from 1-D to 1-D), but only while the mouse button is held down (after having first been pressed within the scrollbar handle). The continuous relationship ceases when the user releases the mouse button.

Some continuous relationships are permanent. In a conventional physical control panel, the rotational position of each knob is permanently connected to some variable by a continuous function (typically a linear function, mapping 1-D rotational position to 1-D). In a cockpit flight simulator, the position of the throttle lever and the setting of the throttle parameter are permanently connected by a continuous function.

The essence of these interfaces is, then, a set of continuous relationships, some of which are permanent and some of which are engaged and disengaged from time to time. These relationships accept continuous input from the user and typically produce continuous responses or inputs to the system. The actions that engage or disengage them are typically discrete (pressing a mouse button over a widget, grasping an object).

2. SOFTWARE MODEL

Most current specification models are based on tokens or events. Their top-down, triggered quality makes them easy to program. But we have seen that events are the wrong model for describing some of the interactions we need; they are more perspicuously described as declarative relationships among continuous variables. Non-WIMP interface styles tend to have more of these kinds of interactions. Therefore, we need to address the continuous aspect of the interface explicitly in our specification model. Continuous inputs have often been treated by quantizing them into a stream of “change-value” or “motion” events and then handling them as discrete tokens. Instead we want to describe continuous user interaction as a first-class element of our model.

We describe these types of relationships with a data-flow graph, which connects continuous input variables to continuous application (semantic) data and, ultimately, to continuous outputs, through a network of functions and intermediate variables. The result resembles a plugboard or wiring diagram or a set of one-way constraints. Such a model also supports parallel interaction implicitly, because it is simply a declarative specification of a set of relationships that are in principle maintained simultaneously. (Maintaining them all on a single processor within required time constraints is an issue for the implementation and is discussed below, but it should not arise at this level of the specification.)

Note that trying to describe the whole interface in purely continuous terms or purely discrete terms would be entirely possible, but inappropriate. For example:

• In the extreme, all physical actions can be viewed as continuous, but we quantize them in order to obtain discrete inputs. For example, the pressing of a keyboard key is a continuous action in space.
We quantize it into two states (up and down), but there is a continuum of underlying states; we have simply grouped them so that those above some point are considered “up” and those below, “down.” We could thus view a keyboard interface in continuous terms. However, we claim that the user model of keyboard input is as a discrete operation; the user thinks simply of pressing a key or not pressing it.

• Similarly, continuous actions could be viewed as discrete. All continuous inputs must ultimately be quantized in order to pass them to a digital computer. The dragging of a mouse is transmitted to the computer as a sequence of discrete moves over discrete pixel positions and, in typical window systems, processed as a sequence of individual discrete events. However, again, we claim that the user model of such input is a smooth, continuous action; the user does not think of generating individual “motion” events, but rather of making a continuous gesture.

Non-WIMP interactions (as well as some dragging interactions in WIMP interfaces, see Section 3.1) convey a sense of continuous interaction to the user, and our concern is capturing this continuous quality directly in the UIDL. At the implementation level, the hardware inputs and outputs are still realized as a series of discrete events. For example, grasping an object and moving it in 3-space appears to be a continuous interaction and ought to be specified that way in the UIDL; but, to the underlying software, it is ultimately implemented as a discrete series of input events.

This leads to a two-part model of user interaction. One part is a graph of functional relationships among continuous variables. Only a few of these relationships are typically active at one moment. The other part is a set of discrete event handlers. These event handlers can, among other actions, cause specific continuous relationships to be activated or deactivated. A key issue is how the continuous and discrete domains are connected, since a modern user interface will typically use both. The main connection between the two in our model is the way in which discrete events can activate or deactivate the continuous relationships. Purely discrete controls (such as pushbuttons, toggle switches, menu picks) also fit into this framework. They are described by traditional discrete techniques, such as state diagrams, and are covered by the discrete event handler part of our model. That part serves both to engage and disengage the continuous relationships and to handle the truly discrete interactions.

Our contribution, then, is a model for combining data-flow or constraint-like continuous relationships and token-based event handlers. Its goal is to provide a language that integrates the two components and maps closely to the user's view of the fine-grained interaction in a non-WIMP interface. Our model and language are intended to be independent of the choice of constraint solving mechanism used to implement it. In fact, we have built several different solvers and can use them interchangeably, as discussed in Section 6. The model, UIDL, and examples given from here through Section 5 are intended not to depend on the solver (but see Section 5.5 for discussion of one potential type of solver dependency at the UIDL level).

The basic model is:

• A set of continuous user interface Variables, some of which are directly connected to input devices, some to outputs, and some to application semantics.
Some variables are also used for communication within the user interface model (within or between the continuous and discrete components); and some variables are simply interior nodes of the graph containing intermediate results.

• A set of Links, which contain functions that map from continuous variables to other continuous variables. A link may be operative at all times or may be associated with a Condition, which allows it to be turned on and off in response to other user inputs. This ability to enable and disable portions of the data-flow graph in response to user inputs is a key feature of the model.

• A set of EventHandlers, which respond to discrete input events. The responses may include producing outputs, setting syntactic-level variables, making procedure calls to the application semantics, and setting or clearing the conditions, which are used to enable and disable groups of links.

The model is cast in an object-oriented framework. Link, Variable, and EventHandler each have a separate class hierarchy. Their fundamental properties, along with the basic operation of the software framework (the user interface management system), are encapsulated in the three base classes; subclasses allow the specifier to define particular kinds of Links, Variables, and EventHandlers as needed. While Links and Variables are connected to each other in a graph for input and output, they comprise two disjoint trees for inheritance; this enhances the expressive power of the model.

The model provides for communication between its discrete (event handlers) and continuous (links and variables) portions in several ways:

• As described, communication from discrete to continuous occurs through the setting and clearing of Conditions, which effectively re-wire the data-flow graph.

• In some situations, analogue data come in, are processed and recognized, and are then turned into a discrete event. This is handled by a communication path from continuous to discrete: a link may generate tokens, which are then processed by the event handlers. A link function might generate a token in response to one of its input variables crossing a threshold, or it might generate a token when some complex function of its inputs becomes true. For example, if the inputs were all the parameters of the user's fingers, a link function might attempt to recognize a particular hand posture and fire a token when it was recognized.

• Finally, as with augmented transition networks and other similar schemes, we provide the ability for the continuous and discrete components to set and test arbitrary user interface variables, which are accessible to both components.

A further refinement expresses the event handlers as individual state transition diagrams. Such state diagram-based event handlers may be intermixed with other, arbitrary forms of event handlers. Using state diagram event handlers also leads to another method of integrating the continuous and discrete components. Imagine that each state in the state transition diagram had an entire data-flow graph associated with it. When the system enters that state, it begins executing that data-flow graph and continues until it changes to another state. The state diagram can then be viewed as a set of transitions between whole data-flow graphs. We have already provided the ability to enable and disable sets of links in a data-flow graph by explicit action.
If we associate such sets with states, we can automatically enable and disable the links belonging to a state whenever that state is entered or exited. This is simply a shorthand for setting and clearing the conditions with explicit actions, but it provides a particularly apt description of moded continuous operations (such as grab, drag, and release) and will be the basis for an alternate form of our user interface description language.

3. USER INTERFACE DESCRIPTION LANGUAGE

We have developed a language based on this model and are implementing it in several forms. The main form of the language is a visual one, for which we show an interactive graphical editor in Section 4. The language can also be used in an SGML-based text form (Section 4.2), which is intended both for user input and as an intermediate language for use by our graphical tools, or directly as a set of C++ classes.

3.1. Expository Example

To introduce the elements of the graphical version of our language, we begin by considering a simplified slider widget from a conventional WIMP interface. If the user presses the mouse button down on the slider handle, the slider will begin following the y coordinate of the mouse, scaled appropriately. It will follow the mouse continuously, truncated to lie within the vertical range of the slider area, directly setting its associated semantic-level application variable as it moves. We view this as a functional relationship between the y coordinate of the mouse and the position of the slider handle, two continuous variables (disregarding their ultimate realizations in pixel units). This relationship is temporary, however; it is only enabled while the user is dragging the slider with the mouse button down. Therefore, we provide event handlers in the form of a state transition diagram to process button-down and button-up events and enable and disable the continuous relationship.

Figure 1 shows the specification of this simple slider in our visual notation, running on our graphical editor, VRED. The upper portion of the screen shows the continuous portion of the specification, using ovals to represent variables, rectangles for links, and arrows for data flows. The name of each variable is shown under its oval, and, below that, in upper case letters, its kind. The kind can be one of INPUT, OUTPUT, SEM, SYNT, CONST, or INT to indicate its role in the user interface as, respectively: an actual device input, a variable that affects the display directly, semantic data shared with the application, syntactic data shared among components within the user interface, a constant, or a simple interior node of the graph. The name of each link is shown under its rectangle and, below that, in upper case letters, the name of the condition under which it will be activated, or else ALWAYS, meaning it is always active.

The lower portion shows the event handler in the form of a state diagram, with states represented as circles and transitions as arrows; further details of this state diagram notation itself are found in[30, 31]. The state diagram shows, inside each state, in upper case letters, the name of a condition that is activated when this state is entered; names in lower case letters are just state names not associated with conditions. Each arc has a token and, optionally, a Boolean expression that must be true to take this transition and an action that will be executed if the transition is taken.
Figure 1. Specification of a simple slider, running in the VRED editor, to illustrate our graphical notation. The upper half of the screen shows the continuous portion of the specification, using ovals to represent variables, rectangles for links, and arrows for data flows. The lower portion shows the event handler in the form of a state diagram, with states represented as circles and transitions as arrows.

There is additional information, such as the types of each of the variables, the different input and output slots of each link in case it has more than one, and the body of the link itself (which is usually written as several lines of C++ code to be evaluated on demand). This information is entered and viewed through dialogue boxes for each link, variable, flow, state, and transition. The layout of the elements on the screen is at the user's discretion, much like the arrangement of white space in a conventional programming language. Only the topology or connectivity of the elements in the diagram is meaningful to the run-time system.

The continuous relationship for this slider is divided into two parts. The relationship between the mouse position and the value variable in the application semantics is temporary, holding only while dragging; the relationship between value and the displayed slider handle is permanent. Because value is a variable shared with the semantic level of the system, it might also be changed by the application or by function keys or other input, and the slider handle would still respond. The variable mouse is an input variable, which always gives the current location of the mouse; handlepos is an output variable, which controls where the slider handle is drawn on the display. The underlying user interface management system will automatically keep the mouse variable updated based on mouse inputs and the position of the slider handle updated based on changes in handlepos. The link mousetoval contains a simple scaling and truncating function that relates the mouse position to the value of the controlled variable; the link is associated with the condition name DRAGGING, so that it can be enabled and disabled by the state transition diagram. The link valtoscrn scales the variable value back to the screen position of the slider handle; it is always enabled.

The discrete portion of this specification is given in the form of a state transition diagram, although any other form of event handler specification may be used interchangeably in the underlying system. In the start state (st) it accepts a MOUSEDN token that occurs while the mouse is within the slider handle and makes a transition to a new state, in which the DRAGGING condition is enabled. As long as the state diagram remains in this state, the mousetoval link is enabled, and the mouse is connected to the slider handle, without the need for any further explicit specification. The MOUSEUP token will then trigger a transition back to the initial state, causing the DRAGGING condition to be disabled and hence the mousetoval relationship to cease being maintained automatically. (The condition names like DRAGGING provide a layer of indirection that is useful when a single condition controls a set of links; in this example there is only one conditional link, mousetoval. A link can also be associated with more than one condition; it would then be enabled when any of those conditions was enabled.)
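To make the behavior this specification denotes concrete, here is a hypothetical hand-coded equivalent of the slider's continuous part in plain C++. The scaling constants follow the link bodies shown later in Figure 4; the struct and member names are illustrative stand-ins, not part of the PMIW API.

    #include <algorithm>

    // Hypothetical hand-coded equivalent of the slider's continuous part.
    // "dragging" stands for the DRAGGING condition set and cleared by the
    // state diagram; mouseY, value, and handleY stand in for the mouse,
    // value, and handlepos variables of Figure 1.
    struct SliderSketch {
        bool  dragging = false;  // set on MOUSEDN inside the handle, cleared on MOUSEUP
        float mouseY   = 0.0f;   // INPUT: current mouse y coordinate
        float value    = 0.0f;   // SEM: shared application value, 0..100
        float handleY  = 0.0f;   // OUTPUT: screen position of the slider handle

        void Update() {
            if (dragging) {
                // mousetoval: scale the mouse y coordinate into 0..100 and truncate
                float v = (0.250f - mouseY) * (100.0f / 0.050f);
                value = std::min(100.0f, std::max(0.0f, v));
            }
            // valtoscrn: always enabled; map value back to a screen coordinate
            handleY = 0.250f - (value / 100.0f) * 0.050f;
        }
    };

The point of the UIDL is that this bookkeeping (deciding when to maintain which relationship) is expressed declaratively by the diagram rather than coded by hand as above.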
This very simple example illustrates the use of separate continuous and discrete specifications and the way in which the enabling and disabling of the continuous relationships by the state diagram provides the connection between the two. Abowd[1] and Carr[4, 5] also present specifications of sliders which separate their continuous and discrete aspects in different ways (see Section 7), and the Kaleidoscope constraint language[15] can support temporary constraints roughly similar to the one in this example.

Figure 2 shows an alternative form of the visual language. (Unlike Figure 1, a visual editor for this form is not implemented; Figures 2 and 3 are illustrations to describe this language, rather than editor screendumps.) This form of the language unifies the two components into a single representation by considering each state in the state transition diagram to have an entire data-flow graph associated with it. When the system enters a state, it begins executing that data-flow graph and continues until it reaches another state. The state diagram can then be viewed as a set of transitions between whole data-flow graphs. As noted previously, this provides a particularly apt description of moded continuous operations like engaging, dragging, and releasing the slider handle. The nested diagram approach follows that of Citrin[7], although in this case it is confined to two levels, and each level has a different syntax. One obvious drawback of this type of language is that it is difficult to scale the graphical representation to fit a more complex interface into a single static image. For interactive use, a zoomable editor would address this problem, as would rapid continuous zooming, such as provided by the PAD++ system[2], or head-coupled zooming, as in the pre-screen projection technique[25]. Figure 3 shows the interface from Figure 2, zoomed in on the first state, with its enclosed data-flow diagram now visible for editing.

Figure 2. The same slider as in Figure 1, illustrating an alternate form of our graphical notation. Here, the large circles represent states, and the arrows between them represent transitions. Each state contains a data-flow graph showing the data flows that are operational while the system is in that state.

Figure 3. The example interface from Figure 2 after zooming in to edit the data-flow graph within one of the states.

4. PROGRAMMING ENVIRONMENT

4.1. The VRED Editor

The VRED editor provides a visual programming environment for our language. It implements both a data-flow graph editor (also called a plugboard for its similarity to a digital logic prototyping bench) and a state diagram editor. Simple forms and menus are provided to facilitate the programmer's entry of text-based information such as element names and programmer annotations. Figure 1 as well as the UIDL figures below are all screendumps from the VRED editor, showing its base screen.

The plugboard editor is the upper of the two drawing areas. Variables are represented by ellipses, links by rectangles, and data flows by arrows. Each flow also has three small rectangular handles for controlling the curvature of the arc. The first and third handles serve as smooth anchor points for splines, while the middle handle can generate a cusp in the arc. The state diagram editor is the lower of the two drawing areas. States are represented by circles (with resizing handles), and transitions by arcs, labeled with their tokens, conditions, or actions.
A dialogue box can be brought up for a selected variable, link, flow, state, or transition. For a link, the dialogue box allows the user to enter a C++ code fragment as the link body; this becomes the body of the Evaluate() method. For a state transition, the dialogue box allows an optional Boolean condition (in the form of a C++ expression that returns true or false) and/or an optional action to be taken if the transition is made (as a C++ code fragment). New variables, links, flows, states, and transitions are added with menu commands and given placements (or routings, for arcs), which may be modified by dragging. When a new data flow is created, the user is prompted to select the start and end nodes for the data flow. If one of these is a link with more than one input or output slot, a dialogue box will ask the user which of those slots the flow should be attached to. A smart delete facility automatically eliminates dangling elements.

4.2. Text Language

While the main form of our language is a visual one, Figure 4 shows its text-based form. We also use this form for dumping and restoring the information from the graphical editor and for interoperating with other tools. The language uses SGML for its meta-syntax in order to avoid introducing yet another incompatible meta-syntax into the world; SGML is reasonably human-readable and is increasingly supported by parsing and editing tools. Figure 4 illustrates the SGML-based intermediate language by showing the same example as Figure 1 in that form. It defines the variables, links, data flows between them, and state diagram(s) that make up the interface in a fairly straightforward way. Figure 4 also shows some of the information that is entered via dialogue boxes and therefore not visible in the other figures. If the kind attribute of a link is “custom”, then the body of its Evaluate() routine is given in C++ directly, between the <body> and </body> delimiters. The kind field can also be the name of a predefined link, chosen from a library of links that perform common mathematical and geometric operations; in that case the <body> element would be absent. The actual contents of a variable can be of various data types, as this example shows. The state transition diagram portion is expressed as a list of transitions from each state, similar to[30, 31]. This form of our language is translated into C++ code, which is loaded along with base classes to implement the interface. The language also allows an optional <render> tag, not shown here, which can retain layout information generated and used only by the graphical editor; it does not affect the meaning of the non-visual language.
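Before the Figure 4 listing itself, a rough sketch may help indicate the kind of C++ such a translation could produce for the custom mousetoval link. The class and helper definitions below are simplified stand-ins for the PMIW base classes of Section 6.2, not the actual generated code or class interfaces.

    // Simplified stand-ins for the Pos type, Variable<T>, Link, and the Scale
    // helper; the real PMIW classes are richer (see Section 6.2).
    struct Pos { float x = 0.0f, y = 0.0f, z = 0.0f; };

    template <class T>
    struct Variable {
        T value{};
        const T& GetI() const { return value; }   // internal access, used inside Evaluate()
        void SetI(const T& v) { value = v; }
    };

    struct Link {
        virtual void Evaluate() = 0;
        virtual ~Link() = default;
    };

    // Linear rescaling of v from [inLo, inHi] to [outLo, outHi].
    static float Scale(float v, float inLo, float inHi, float outLo, float outHi) {
        return outLo + (v - inLo) * (outHi - outLo) / (inHi - inLo);
    }

    // Hypothetical subclass for the custom link: its Evaluate() body is the
    // <body> text of mousetoval in Figure 4.
    class MouseToVal : public Link {
    public:
        MouseToVal(Variable<Pos>* s, Variable<float>* d) : src(s), dst(d) {}
        void Evaluate() override {
            dst->SetI(Scale(0.250f - src->GetI().y, 0.0f, 0.050f, 0.0f, 100.0f));
        }
    private:
        Variable<Pos>*   src;   // input slot: mouse position
        Variable<float>* dst;   // output slot: application value
    };

A predefined link from the library would instead supply its own Evaluate() body, selected by the kind field.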
<def_system> slider1.UID
  <def_var type=Pos kind=INPUT> mouse </def_var>
  <def_var type=float kind=SEM> value </def_var>
  <def_var type=Area kind=OUTPUT> handlepos </def_var>

  <def_link kind=custom enableflag=DRAGGING> mousetoval
    <in type=Pos> src
    <out type=float> dst
    <body>
      dst->SetI (Scale (0.250 - src->GetI().y, 0., 0.050, 0., 100.));
    </body>
  </def_link>

  <def_link kind=custom enableflag=ALWAYS> valtoscrn
    <in type=float> src
    <out type=Area> dst
    <body>
      dst->SetI (Area ((dst->GetI()).x,
                       0.250 - Scale (src->GetI(), 0., 100., 0., 0.050),
                       (dst->GetI()).w, (dst->GetI()).h));
    </body>
  </def_link>

  <def_flow> <source> mouse <destination> mousetoval.src </def_flow>
  <def_flow> <source> mousetoval.dst <destination> value </def_flow>
  <def_flow> <source> value <destination> valtoscrn.src </def_flow>
  <def_flow> <source> valtoscrn.dst <destination> handlepos </def_flow>

  <def_state> st
    <transition token=MOUSEDN condition=Inside(mouse,handlepos)> DRAGGING </transition>
  </def_state>
  <def_state> DRAGGING
    <transition token=MOUSEUP> st </transition>
  </def_state>
</def_system>

Figure 4. Specification of the slider from Figure 1, illustrating the SGML-based intermediate language, which is used for saving and restoring files and can also be input directly by the user.

Our UIMS can also be used simply as a set of C++ classes (Link, Variable, etc.). This provides an object-oriented implementation of the underlying user interface management system. It can be used to build interfaces directly in C++ by subclassing to define the links needed for a particular interface and writing C++ code for their Evaluate() methods. Each form of the language (graphical, SGML, and C++) is ultimately translated into this C++ code, which runs on our PMIW user interface software testbed. The editor dumps the UIDL in its SGML-based text form. That form can be translated into C++ code that uses our UIMS classes and can then be compiled and run directly. (Translation from SGML to C++ was formerly automatic, but at this writing it requires manual intervention, since we have updated the language but have not yet updated the translator to correspond; the examples below were generated, dumped, and then manually edited before compiling.) Our overall UIMS design is intended not to preclude run-time editing of the UIDL, though the current implementation requires a compile cycle after a change in the UIDL. The run-time UIMS actually allows adding links, variables, states, and transitions at any point during execution (and deleting them, provided they are not currently in use).

5. EXAMPLES

5.1. Grabbing an Object

Figure 5 shows the UIDL for a common, very simple interaction in VR: grabbing and dragging a (weird-looking) object with the hand in 3-D. Figure 6 shows the object and the hand cursor running under our PMIW system. The diamond-shaped cursor is permanently attached to the user's hand. The user can grab the object by holding Button 1 down (for simplicity, regardless of where the cursor is at the time; this is refined in the next example). While the button is held, the object position follows the cursor position; when the button is released, that relationship ceases, though the cursor continues following the user's hand.

Figure 5. Grabbing and dragging an object with the hand in 3-D, a common, simple interaction in virtual reality. The user can grab the object by holding Button 1 down (for simplicity, regardless of where the cursor is at the time; see Figure 7).
While the button is held, the object position follows the cursor position because the DRAGGING condition is enabled. When the button is released, that relationship ceases, though the cursor continues following the user's hand.

Figure 6. The object and hand cursor of Figure 5, running under our PMIW user interface management system. The diamond-shaped cursor is permanently attached to the user's hand; the other object can be grabbed and moved in 3-D.

The UIDL for this, in Figure 5, is quite straightforward. The hand (that is, the INPUT variable polhemus1, the first of the four Polhemus sensors) controls cursorpos, the position of the cursor, at all times; the link cursor is enabled ALWAYS. The cursor position, in turn, controls the position posn of the object with an identity function, but only when the condition DRAGGING is engaged. The condition is engaged by pressing Button 1, which causes a state transition, and disengaged by releasing Button 1, which causes another transition.

It is worthwhile to note how this and subsequent examples do not require the user to write code for maintaining the relationships defined in the links or for responding to change-value events. Because we use constraints, the user simply provides a declarative specification of the desired relationship and, if applicable, indicates how it will be turned on and off.

5.2. Arm

Figure 7 shows the UIDL for a simple movable arm, attached to a base column, and Figure 8 shows its appearance. The user can grab the arm and move it in three dimensions. The left end is always constrained to be fixed to the base column, as if it were attached by a doubly-hinged joint, while the rest of the arm can pivot to follow the user's hand cursor. Linkc1 performs the calculations to relate the hand cursor position to the Performer rotation matrix for the movable portion of the arm. This link is active only while the user is grasping the arm; when the user lets go of the arm, the link ceases to operate and the arm remains where it was left. The state diagram shows the state change that occurs when the user grabs the arm (it activates the link) and releases the arm (deactivates the link). Unlike the simplified example in Figure 5, here the user must press the button while the hand cursor is touching the arm segment.

Figure 7. A simple movable arm, attached to a base column. The state diagram shows the state change that occurs when the user grabs the arm (it activates condition GRASPED1) and releases the arm (deactivates it). Linkc1 relates the hand cursor position to the arm position continuously and is active only while the user is grasping the arm.

Figure 8. The arm specified in Figure 7, running on our system. The user can grab the arm and move it in three dimensions. The left end is always constrained to be fixed to the base column, as if it were attached by a doubly-hinged joint, while the rest of the arm pivots to follow the user's hand cursor.

The pivot1 variable is a constant that contains information about the size and shape of the arm; it was generated along with the initial geometry specification for the arm. Variable rot1 is the transform matrix that rotates the movable portion of the arm to follow the hand cursor. It is tagged as an OUTPUT variable because its value is directly reflected in the appearance of the screen—not necessarily because it is a sink in the data-flow graph (as will be seen in subsequent examples).
The UIMS automatically obtains updated values of all output variables and periodically redisplays the scene using their values. That relationship between an output variable and its screen display is not represented in the visual language, but it can only be a very simple, straightforward relationship (anything more complicated should first be calculated via appropriate links and variables and then fed to a simple output variable). In this example, rot1 is simply the value of the actual 4 x 4 transform matrix contained in a DCS node of our Performer scene graph.

Although they are all shown as simple data flows, the variables in this diagram may be of different data types. For example, polhemus1 and cursorpos are 3-D vectors giving (x,y,z) position as in Figure 5, and rot1 and pivot1 are 4 x 4 transform matrices. These types match the corresponding slots in the links to which they are connected. The variable type information and the names and types of the input and output slots in the links are entered and displayed in pop-up dialogue boxes in the editor, described in Section 4.1.

5.3. More Complex Arm

Figure 9 shows a two-jointed arm, to give a more interesting use of the links and variables; Figure 10 shows its appearance. To reduce clutter, polhemus1 and cursor are not shown in Figure 9. As discussed below, they would be a candidate for encapsulation in a separate interaction object. The user can grab and move the first (proximal) segment of the arm as in the previous example. The second (distal) segment is attached by a hinge to the tip of the proximal segment. The user can grab the distal segment and rotate it with respect to its joint at the tip of the proximal segment. Linkc1 is active when the hand cursor is controlling the rotation of the proximal segment of the arm (GRASPED1 condition), and linkc2 is active when the hand controls the distal segment (GRASPED2). The diagram should clearly show that, depending on the state, the hand position controls rot1 at some times and rot2 at other times. Calculating the rotation of the distal segment of the arm requires knowledge of the rotation of the proximal portion of the arm. Note how rot1 is therefore no longer a sink in the graph, but it is still an OUTPUT variable from our point of view because its value directly drives an element of the graphic display.

Figure 9. A two-jointed arm, using more links and variables. Here, linkc1 is active when the hand cursor is controlling the rotation of the proximal segment of the arm (GRASPED1 condition), and linkc2 is active when the hand controls the distal segment (GRASPED2). The figure should show that, depending on the state, the hand position sometimes controls rot1 and sometimes rot2. (To reduce clutter, polhemus1 and cursor are not shown in this and subsequent figures; they would be the same as in Figure 7.)

Figure 10. The arm specified in Figure 9. The user can grab and move the first (proximal) segment of the arm as in the previous example. The second (distal) segment is attached by a hinge to the tip of the proximal segment. The user can grab the distal segment and rotate it with respect to its joint at the tip of the proximal segment.

The UIDL in Figure 11 gives the same behavior as that of Figure 9 but adds two new unused variables to the graph (and stretches the limit of what can be seen on a single screen; the sub-assembly mechanism discussed below would be called for here).
The new variables, tip1 and tip2, are 3-D vectors that give the absolute location of the tips of this object's proximal and distal segments, respectively. This object makes them available for the use of other user interface objects, though they have no direct bearing on its own visual appearance or calculations. The links that calculate these variables are always active; that is, the variables are, conceptually, always kept up to date. However, if no other links in the user interface use them, our UIMS will not devote any resources to recalculating them until they are requested. The next example makes use of these “exported” variables.

Figure 11. This object gives the same behavior as that of Figure 9 but also exports two additional variables, tip1 and tip2, which are 3-D vectors giving the absolute location of the tips of this object's proximal and distal segments, respectively. They are made available for the use of other user interface objects, as seen in Figure 12, but have no direct bearing on the visual appearance or calculations of this object. (This figure obviously stretches the limit of what can be seen on the screen; the sub-assembly mechanism discussed in Section 8 is called for here.)

5.4. World of Arms

Note that the position input to the simple arm of Figure 7, cursorpos, is simply a 3-D position vector in world coordinates. It need not be generated by the hand cursor. For example, it could be the tip position variable produced by yet another arm. In that way, we could create an arm that tries, within the constraint of the joint at its base, to point to the tip of a different arm. We have used this to create a more complex world for investigating performance, as discussed in Section 6.6.

Figure 12 shows a virtual world with two instances of the two-jointed arms, each exactly like the UIDL in Figure 11. Each can be grabbed and released separately, using the hand cursor, and each generates its own rotations and its own tip1 and tip2 variables. We then provide a collection of 24 single-jointed arms in the background. Each of them has a link that takes one of the tips of the foreground arms as its input position variable instead of cursorpos. Half of the background arms point to tips of the rightmost foreground arm; the others point to the arm in the lower center foreground. Alternating background arms point to the proximal tip of their foreground arm, or to its distal tip. This could be seen by minute examination of Figure 12, but it is immediately apparent when one of the arm segments moves in the live application.

Figure 12. A world containing two instances of Figure 11 and 24 of Figure 7, with the input position variable slots of the latter set to point to the tips of the former. The two foreground arms can be grabbed and released separately, using the hand cursor, and each generates its own rotations and its own tip1 and tip2 variables. Each of the 24 single-jointed arms in the background takes one of the tips of the foreground arms as its input position variable instead of cursorpos. Some of the background arms point to tips of one of the foreground arms, some to the other. Alternating background arms point to the proximal tip of their foreground arm, or to its distal tip.

The world shown in Figure 12 was created simply by instantiating two of Figure 11 and 24 of Figure 7 and changing the input position variable pointers of the latter. In addition, the background arms were modified to add an extra feature that is not apparent from the screendump.
Each of the background arms has a state diagram that allows its linkc1 to be turned on or off with a different keyboard key, color-coded to the colors of the base columns of the arms. This is helpful for demonstrating the performance of our system, since turning the arms on and off can be done while a foreground arm segment is moving, and it substantially changes the number of constraints to be solved without changing the amount of rendering to be done.

5.5. Virtual Environment

Figure 13 shows a more complex example of a small virtual world, containing various objects and widgets that the user can manipulate. Each object was specified individually in our UIDL, and all were then loaded together into the main program. Some of these objects have been described above: the cube cursor for the hand position at the upper right (Figure 5); a single-jointed arm in the foreground, far left (Figure 7); a two-jointed arm in the foreground (Figure 11); two more single-jointed arms (Figure 7) that point to two different parts of the two-jointed arm; and various grabbable objects, including the cylinders, cones, arrows, and most of the spheres visible on the display (Figure 5, with different geometry but identical UIDL). Some additional objects are introduced and discussed here: the large throwable ball at the center right of the display; the orbiting pyramid near the top of the display; and the 3-D slider, seen head-on in the lower foreground center.

As we saw with the interconnected arms, objects can share constraint graph variables so that, for example, one object can control another. In this example, two of the single-jointed arms are connected to points on the two-jointed arm, and the slider is connected to control the speed of the orbiting pyramid. The 3-D slider highlights (by changing color) when the user's 3-D cursor touches it, and it further highlights (by extending its handle outward) when the user engages it and drags the handle. The UIDL for this is a straightforward extension of Figure 1. Its state diagram keeps track of the four possible highlighting states as well as turning the dragging on and off; the actual dragging is handled by the data-flow graph, just as in Figure 1.

Figure 13. Screendump from a more complex virtual world, including the cube cursor for the hand position at the upper right; a single-jointed arm in the foreground, far left; a two-jointed arm in the foreground; two single-jointed arms that point to different parts of the two-jointed arm; grabbable cylinders, cones, arrows, and spheres; a throwable ball, center right; a 3-D slider, lower foreground center, viewed head-on; and an orbiting pyramid near the top.

The large sphere is a throwable ball; the user can not only drag it but also toss it into the air. Once thrown, the object maintains its course and speed, without friction or gravity; the user can catch it and throw it again. The pyramid near the top of the frame orbits around the user. Its speed is continuously adjustable, controlled by the slider in the foreground. The orbiting pyramid and the throwable ball are simple examples of physical simulations, introduced here in order to discuss this topic. In both cases, the output variable we send to the graphics system must be the location of the object, not its speed or direction, but our user controls are in terms of speed and direction. Our links must thus perform the calculations to convert the available data into a location, in one of two ways.
The Toss object illustrates the easier case, where the location of the object is a simple function of the current clock time and the position, velocity, and time at the moment it was tossed. From those inputs, we can always calculate the current location anew, without any need for integrating over other history data. If CPU time becomes scarce and we can only evaluate this link occasionally, the motion of the ball will become jerky, but its speed will be correct; that is, it will always jump to the correct position for the current time, which is usually the preferred degradation strategy in a virtual environment[17, 52].

The orbiting object is an example of the more difficult case, because its current location depends on integrating over the entire history of the previous settings of the slider that controls its speed. Unlike the ball, we cannot calculate its correct location from time plus initial position and velocity without this history. This gives us an example of one potential dependency on the choice of constraint solver used. A data-driven or forward-chaining solver, run at frequent intervals, will handle this situation without special provision. A demand-driven or lazy evaluation solver may not, particularly if the user looks away from the pyramid, so that its output is not requested for some time period. At the UIDL level, we handle this by tagging those constraints that need this special forward-chaining treatment as instances of the subclass LinkStep. We discuss LinkStep further in Section 6.4, but note that it is a no-op unless demand-driven evaluation is being used.

Figure 14 shows the UIDL for the orbiting object. The link linkrot is a subclass of LinkStep (though the graphical notation does not reveal this). It reads the current value of speed, the SYNT variable that is exported by the slider object, along with the current time, obtained from the timer input variable, which is connected to the system clock (see Section 6.2.5). The linkrot link contains internal variables that accumulate the inputs and integrate them over time, continuously generating an up-to-date rotation matrix, rot, which gives the transform to the correct current location for the orbiting pyramid.

Figure 14. Specification for the orbiting pyramid, visible near the top of Figure 13. The speed variable is exported by the slider; the timer variable is connected to the system clock; and the rot variable contains a transform to the current location for the pyramid. This object has no discrete component.

Figure 15 shows the UIDL for the Toss example; it also shows an additional use of LinkStep. This example has two distinct modes, one (TOSSING) while the ball is flying through the air and the other (DRAGGING) while the user is grasping it with the hand cursor. While TOSSING, we can simply calculate the position (posn) of the ball anew at any time, from the current time (timer) and the initial conditions when it was last tossed (initPos, initTime, and velocity); forward chaining with LinkStep is thus not necessary during TOSSING. However, we also need to measure the velocity of the ball at the moment it is tossed, just before we make the transition from DRAGGING to TOSSING. The Polhemus sensor measures the hand position, not its velocity.
Therefore, whenever the user is DRAGGING the ball, we use a LinkStep (called savelast) to save the last two positions of the user's hand and their timestamps (in the four variables last1Pos, last2Pos, last1Time, and last2Time, which are shared between the plugboard and the state transitions and used to communicate between them). At the moment the user tosses the object, the state transition calls the StartToss() action, which uses those most recent two saved positions and times to determine and set the initial conditions for the next toss (in initPos, initTime, and velocity, which are also shared between the plugboard and the state diagram).

Finally, not explicitly visible is the PMIW object that takes head position and orientation from the Polhemus sensor and controls the viewpoint. It uses a simple set of links and variables to do this job, in the obvious way. This object is normally Update()d out of the usual sequence, at the last moment before rendering, as described in Section 6.3. It can also be interchanged with a mouse-driven viewpoint controller, simply by instantiating that object instead. Both produce the same output variables for the viewpoint.

Figure 15. Specification for the throwable ball, visible in the center right of Figure 13. This interaction has two distinct modes, one (TOSSING) while the ball is flying through the air and the other (DRAGGING) while the user is grasping it with the hand cursor. While TOSSING, we continuously calculate the position (posn) from the current time (timer) and the initial conditions when it was last tossed (initPos, initTime, and velocity). While DRAGGING, we update the position of the ball based on the hand cursor. We also need to measure the velocity of the ball at the moment it is tossed. To do this, the savelast link saves the last two positions of the user's hand and their timestamps (in the variables last1Pos, last2Pos, last1Time, and last2Time); the state transition action, StartToss(), then uses them to determine the initial conditions for the next toss.

5.6. Interaction Techniques in Virtual Reality

In addition to those presented above, we have experimented with sketching UIDL specifications for other interaction techniques in virtual reality and, thus far, they have turned out to be surprisingly straightforward in our UIDL, like the examples shown above. The geometrical calculations themselves within a link may be complex, but the interaction sequence or syntax, that is, what input actions cause what results, has thus far been easy to express. Most grabbing or dragging interactions resemble the examples above. Some menu or selection interactions are almost entirely discrete, represented by the state transition diagram portion of the UIDL.

Figure 16 sketches the UIDL for one of the more innovative interaction techniques for VR, the daisy menu, a new type of 3-D menu developed by Jiandong Liang and Mark Green[39]. (We have not implemented the graphics for this object, just the UIDL shown in Figure 16.) This menu pops up a sphere containing the command icons around the position of the Polhemus sensor held in the user's hand. “These primitives [i.e., the menu command items] can be chosen from the shell of a 3-D spherical menu, called a daisy, using the bat [a Polhemus sensor with 3 buttons attached].... Primitives are selected by rotating the menu until the desired primitive enters a selection cone that always faces the user.”[20] The state transition diagram shows how the menu is activated and deactivated by Button 3.
While it is activated, the links shown in the figure are all enabled, causing the sphere and the highlighted selection cone to move. The data-flow diagram shows clearly how the menu sphere and the selection cone both follow the position of the bat in the user's hand (identity1 and identity2). However, the sphere also rotates with the bat (identity3), while the cone does not; it just keeps pointing toward the eye (calcvector).

Figure 16. Specification of a 3-D “daisy menu” developed by Jiandong Liang and Mark Green for selecting commands in VR. The menu pops up a sphere containing the command icons around the position of the Polhemus sensor held in the user's hand. The state transition diagram shows how the links are all activated and deactivated by Button 3. The data-flow diagram shows that the menu sphere and the selection cone both follow the position of the user's hand (identity1 and identity2). However, the sphere also rotates with the hand (identity3), while the cone does not; it just keeps pointing toward the eye (calcvector).

The action for the BUTTON3DN transition would be:

    Show daisy and selection cone;

For the BUTTON3UP transition, it is:

    If (intersection of cone and daisy covers a menu item) {
        Select that item;
    }
    Hide daisy and selection cone;

Even the daisy menu is fairly straightforward in our UIDL. This may reflect the power of our approach, but it also reflects the relative infancy of this area of non-WIMP interaction techniques. While WIMP interfaces seem to have stabilized around a set of standard widgets, to the point where developing new interaction techniques or widgets is a rare activity, non-WIMP interfaces are a long way from this type of codification. They provide a far richer range of mechanisms to engage and communicate with the user, and the search for the right set of interaction techniques in this realm has just begun[24, 60]. For this reason, languages to enhance programming of WIMP widgets are not a major practical concern, since so few genuinely new widgets are developed. (This may be a cause-and-effect problem: programming new widgets with current tools is much more difficult than plugging in existing widgets, creating an incentive to make do with existing ones.) We believe there will be a growing need for usable languages to help facilitate inventing and experimenting with new interaction techniques for virtual environments.

6. IMPLEMENTATION ISSUES AND THE USE OF CONSTRAINTS

6.1. UIMS Implementation

The software for our PMIW user interface management system is written in C++ for Unix. We use Silicon Graphics Performer software[51] for graphics and rendering (but not for input). PMIW is separated into portions dependent on and independent of the SGI Performer 3-D graphics system; the latter can be run on other flavors of Unix by providing different means of drawing output on the screen. For example, the simple slider in Figure 1 uses X graphics only and runs on Sun Unix. We use the MR toolkit[52] for communicating with the Polhemus tracker. Our main loop reads X window input (from a conventional X window or a Performer GLX window) and then input from our own additional devices, dispatches the inputs (either to the plugboard as changes in its input variables or to the event handler as tokens), allows the plugboard to recalculate as needed, propagates changes in output variables to the Performer scene graph data, reads the head tracker, and then calls Performer to render the scene graph.
The state diagram interpreter is based on one developed in previous work[31, 32]. The constraint solvers are discussed in this section. The constraint solver is required because the system is implemented on a conventional digital computer with serial input devices, where all inputs, processing, and outputs are ultimately discrete, not continuous. Our goal is to provide a model that allows the user interface designer to think of continuous variables and a language that allows the designer to program them as though they were continuous—because we posit that the user and designer will think of some aspects of the interaction as continuous. Our runtime system ultimately merges the continuous and event-based portions of the specification and runs them in discrete steps in a single-threaded main loop, which includes the constraint solver. The point is that the user interface specifier need not be concerned with this level (as long as we can provide sufficiently good performance).

6.2. Fundamental Classes

Our runtime UIMS functionality is encapsulated in a set of base classes. These are described here, along with lists of the principal methods of their public interfaces. The discrete event handler is contained in the EventHandler class; the plugboard or constraint solver is contained in the Variable, Link, and Condition classes; the connection from input devices to events and plugboard variables is contained in DeviceBase; and the connection from output variables to the screen is provided (by the user) in the IO (interaction object) class.

6.2.1. Variable

    Variable (kind)
    GetE ()
    SetE (value)
    GetI ()
    SetI (value)

These are the nodes in the plugboard; they represent continuous user interface Variables, some of which are directly connected to input devices or outputs. The Variable class provides a container for holding a value, with code for determining when it needs recalculation. Access to the value of a Variable is different depending on whether you are calling from inside a constraint solver (that is, from within the Evaluate() routine of a Link) or not. GetE() returns the value of the variable, and SetE(value) sets it. Inside the solver (that is, within the body of the Evaluate() routines), GetI and SetI are used instead. The external routine (GetE) always returns an up-to-date value, recalculated if necessary (but subject to the time management degradation features). The internal routine (GetI) simply accesses the value field without triggering further calculation; it is only used within a recalculation cycle. Internally, the implementation of Variable is divided into a VariableBase class containing common routines and a template for instantiating classes to hold different data types, such as Variable<int> and Variable<float>, or aggregate data types such as Variable<pfVec3> and Variable<pfMatrix>.

The kind tag in the constructor indicates how this variable is used. It is currently mainly for documentation; the system makes limited use of this information. The choices are:

• INPUT: Input coming directly from a device
• OUTPUT: Output destined for a device (or for the Performer scene graph)
• SEM: Semantic data, that is, data shared with the application or semantic layer, outside of the UIMS
• SYNT: Syntactic data, that is, data used within the UIMS to keep track of state or other user interface information
• INT: Other intermediate node in the plugboard
• CONST: Constant data (could be semantic, syntactic, or intermediate).
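For illustration, the following is a minimal C++ sketch, not the actual PMIW source, of how VariableBase and the Variable<T> template might be declared. The method names and the kind tag come from the description above; the dirty flag and the way SetI clears it are assumptions based on the solver behavior described in Sections 6.2.2 and 6.4.

    enum VarKind { INPUT, OUTPUT, SEM, SYNT, INT, CONST };

    class VariableBase {
    public:
        VariableBase(VarKind kind) : kind(kind), dirty(true) { }
        void MarkDirty()     { dirty = true; }    // an input changed; dependent links must rerun
        bool IsDirty() const { return dirty; }
    protected:
        VarKind kind;
        bool dirty;
    };

    template <class T>
    class Variable : public VariableBase {
    public:
        Variable(VarKind kind) : VariableBase(kind) { }
        T GetE() {
            // External access: the solver would first bring dirty values up to
            // date here (e.g., via the static Link::Recalc() of Section 6.2.2).
            return value;
        }
        void SetE(const T &v) { value = v; MarkDirty(); }   // record change; solve later
        T GetI() const        { return value; }             // internal: no recalculation
        void SetI(const T &v) { value = v; dirty = false; } // internal: used inside Evaluate()
    private:
        T value;
    };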
6.2.2. Link

    Enable ()
    Disable ()
    static Recalc ()
    abstract Evaluate ()

A Link is a part of a plugboard; it connects the nodes (Variables) and contains a function that maps from one or more variables to one or more other variables. The base constructor adds the Link to the plugboard by adding each instance of any of its subclasses to a (static, class variable) list of all Links to be evaluated when the solver is run. The base destructor removes it from this list. A library of common, specific kinds of Links is provided. For others, the user subclasses Link and overloads Evaluate(). The Evaluate routine accesses Variables in the plugboard via GetI() and SetI(). This syntactic sourness can be avoided by using the overloadings we provide for “=” and “*” for Variables or by using the SHADOW precompiler, mentioned in Section 6.7. A Link may be Enable()d or Disable()d by calling the respective methods; this is usually done via a Condition, described in the next section. Implementation is divided into the LinkBase class, which contains the basic functionality of Links, irrespective of the solver used, and Link, which adds a specific solver implementation. The solver is triggered by calling GetE on a Variable. For a forward-chaining solver, this causes a complete recalculation of all dirty nodes in the graph (via the static method Link::Recalc). For a backward-chaining solver, only those Links needed to satisfy the particular GetE request are recalculated, and there would be no Link::Recalc method.

6.2.3. Condition

    Add (link)
    Enable ()
    Disable ()

Condition is one of the principal communication paths from the discrete to the continuous portion of the UIDL; it effectively re-wires the data-flow graph in response to user inputs. A Link may be associated with one or more Boolean flags or Conditions, which allow groups of Links to be turned on and off (usually in response to state transitions). A Link may belong to any number of such groups, or none. Links may be added to a Condition with the Add method (or optionally in the Condition constructor). The Condition may then be Enable()d or Disable()d via the corresponding methods. Enable and Disable are not normally called by the user. Instead, a state in a state transition diagram may be associated with a Condition. Whenever the state diagram interpreter enters or exits that state, it will make the corresponding Enable() or Disable() call automatically.

6.2.4. EventHandler

    static IH (token)
    SendTok (token)
    IhIo (token)

Each object that can receive tokens from discrete inputs inherits from this class. The UIMS sends each token to EventHandler::IH, which dispatches it to the IhIo methods of its individual instances. The user supplies an IhIo method for each instance, to receive tokens and respond to them. The responses may include making state transitions, setting syntactic-level variables, or making procedure calls to the application semantics. If the EventHandler is defined by a state machine, a precompiler generates the body of the IhIo method from the state transition diagram[30, 31]. Each EventHandler object remembers its state in its state transition diagram as an instance variable. Finally, EventHandlers, Links, and other objects can generate their own tokens by calling SendTok.
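Pulling these classes together, the following is a minimal C++ sketch, not taken from PMIW, of how a designer might define a custom Link and group it under a Condition so that the discrete component can switch it on and off. The names ScaleLink, handPos, objPos, and dragging, and the exact constructor and argument signatures, are illustrative assumptions; the Evaluate()/GetI()/SetI() usage and the Add()/Enable()/Disable() pattern follow the descriptions above.

    Variable<float> handPos(INPUT), objPos(OUTPUT);

    class ScaleLink : public Link {
    public:
        ScaleLink(Variable<float> *in, Variable<float> *out, float k)
            : in(in), out(out), k(k) { }       // the Link base constructor registers this
                                               // instance on the solver's list of Links
        void Evaluate() {
            out->SetI(k * in->GetI());         // inside the solver, so GetI/SetI are used
        }
    private:
        Variable<float> *in, *out;
        float k;
    };

    Condition dragging;                        // associated with a state in the state diagram
    ScaleLink follow(&handPos, &objPos, 1.0f);

    void SetUpGrab() {
        dragging.Add(&follow);                 // entering the associated state makes the
    }                                          // interpreter call dragging.Enable(); leaving
                                               // it calls dragging.Disable()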
6.2.5. DeviceBase

    Read ()
    Dispatch ()

Subclasses of this class receive inputs and dispatch them to EventHandlers and/or input Variables, as discussed in more detail in Section 6.3. Subclasses are provided for DeviceXWindow (which reads mouse and keyboard events for regular X windows or Performer GLX windows, which share the same input mechanism), DevicePolhemus, and DeviceEye (an ISCAN eye tracker over a serial port). A DeviceTimer is also provided to feed the current time (read from the system clock) into the plugboard.

6.2.6. Interaction Object (IO)

    static UpdateAll ()
    abstract Update ()

Objects that produce output Variables that must be propagated to the scene graph inherit from IO and supply the body of the Update method. The IO constructor maintains a list of all instances of its subclasses created; the IO::UpdateAll() static method then Update()s each of them.

6.3. Connecting with Inputs and Outputs

Our constraint solvers receive input variables, process the constraints as needed, and ultimately generate output variables. We discuss here how the input variables are connected to the actual input devices and how the output variables are connected to outputs on the screen. These aspects are outside the UIDL and hence not visible to the user interface designer; they appear as basic capabilities hard-coded into the UIMS. There is also provision for extending the UIMS to include additional devices, by subclassing DeviceBase.

Each input device (including the input half of a window) is encapsulated in a subclass of DeviceBase. While our current input devices all communicate serially, some of them are fed into the UIMS as discrete events and others as continuous variables. Each of the subclasses of DeviceBase provides a Read() method and a Dispatch() method; the separation is to accommodate unusual devices that might require scheduling reading and dispatching separately. Read() is optional; it does any required periodic servicing of the device that might be needed. For example, if a device streams data continuously, this routine could drain the serial port input queue and stash the data away for us to use later. Dispatch() then processes the input. It may read all the device data, just the latest value of the device data, or use data saved by Read(). If the input generates an event (keyboard key, mouse button), Dispatch simply converts it to a Token and calls EventHandler::IH(token), which processes the event synchronously as it is called. If, however, the input is destined for a continuous variable (mouse position, Polhemus position, eye point of regard), Dispatch takes the input and sends it to the SetE() method of the corresponding input variable. It follows that the mapping of input device data to our input Variables is coded in these Dispatch methods. Performing the SetE sets dirty flags as needed, but does not cause any other calculation until later, when the constraint solver is run. As noted in Section 6.4, there is also a provision for designated Links to be re-evaluated for every new input value.
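The following minimal sketch, illustrative rather than actual PMIW source, shows a hypothetical DeviceBase subclass whose Dispatch() follows this routing rule: discrete records become tokens sent to EventHandler::IH(), while continuous records go to the SetE() of an input Variable. The record structure, the queue, the integer token type, and the names DeviceExample and position are all assumptions.

    #include <deque>

    class DeviceExample : public DeviceBase {
        struct RawRecord { bool isButton; int token; pfVec3 xyz; };
        std::deque<RawRecord> saved;            // filled by Read(), consumed by Dispatch()
        Variable<pfVec3> position;              // the continuous input this device feeds
    public:
        DeviceExample() : position(INPUT) { }
        void Read() {
            // Optional periodic servicing: drain the serial port here and append
            // decoded records to 'saved' for Dispatch() to use later.
        }
        void Dispatch() {
            while (!saved.empty()) {
                RawRecord r = saved.front();
                saved.pop_front();
                if (r.isButton)
                    EventHandler::IH(r.token);  // discrete path: handled synchronously
                else
                    position.SetE(r.xyz);       // continuous path: just marks the Variable dirty
            }
        }
    };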
The other half of the interface to the outside world connects our continuous variables to outputs on the screen or head-mounted display (these are our only output devices at present). Under Performer, this means each of our output variables ultimately controls some data element in the Performer scene graph. Again, since the underlying output hardware is digital, a continuous variable must ultimately be read out at a discrete time and sent to the frame buffer. Unlike the input variables, which are permanently connected to their corresponding hardware devices, the output variables differ for each application. For example, the Grab object (Figure 5) outputs the 3-D position of the grabbed object in world coordinates; Arm2 (Figure 9) produces a 4x4 rotation matrix for each of its two movable joints. For each object with output variables, the user must thus provide an Update() routine, which takes the values of its variables and sets the corresponding pieces of data in the Performer scene graph. Under our approach, this code should simply GetE() the variable and copy its value into the appropriate element of the scene graph (to which it has retained a pointer); a sketch of such a routine appears at the end of this section. If more substantial calculation were needed, it ought to be done in a link, where we can manage and schedule execution; the Update() simply copies the data to its final destination. Objects that need to be Update()d are all subclasses of IO, as described above.

All that now remains is for the main loop to perform the following steps repeatedly. Each step is described in English and also shown in C++ below:

• Call the Read() and then Dispatch() methods for each of the input devices you are using. Three devices are shown here: XWindow, for mouse and keyboard events; the eye tracker, which includes the Polhemus data; and the timer, a pseudo-device that reads the system clock. (See Section 6.4 for the special callback argument to deviceEye->Dispatch.)

    pfwindow->GetDeviceXWindow()->Read();
    deviceEye->Read();
    deviceTimer->Read();
    pfwindow->GetDeviceXWindow()->Dispatch();
    deviceEye->Dispatch(&LinkStep::Step);
    deviceTimer->Dispatch();

• Process any LinkSteps specially (see Section 6.4). This iterates over all instances of LinkStep and allows them to recalculate as needed:

    LinkStep::Step();

• Update() all the IOs. This iterates over all instances of IO and calls the Update method. These in turn trigger recalculation as a side effect:

    IO::UpdateAll();

• Update the head position just before rendering the frame:

    pfSync();
    headCoupler->UpdateManual();
    pfFrame();

Observe that the Update() routines will call GetE() for any variables they need to output; this triggers the constraint solver to do its work. The main loop is written to run no more than once per video frame (via the pfSync() and pfFrame() calls), but may well run less frequently. As discussed below, there is also a special provision for LinkSteps that should be evaluated more frequently than once per frame (those that accumulate input history from streaming input devices).
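The promised sketch of an Update() routine follows. It is a minimal illustration, not the PMIW source: the class name BallIO, the member names, and the use of a Performer pfDCS node and its setTrans call are assumptions; only the pattern of calling GetE() and copying the value into a retained scene graph element comes from the text above.

    class BallIO : public IO {
    public:
        BallIO(Variable<pfVec3> *posn, pfDCS *dcs) : posn(posn), dcs(dcs) { }
        void Update() {
            pfVec3 p = posn->GetE();           // triggers the solver if the value is stale
            dcs->setTrans(p[0], p[1], p[2]);   // copy to its final destination; nothing more
        }
    private:
        Variable<pfVec3> *posn;                // output variable computed by the plugboard
        pfDCS *dcs;                            // scene graph node we retained a pointer to
    };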
6.4. Constraints

We find thus far that constraints are indeed a good element of a UIDL for virtual reality and other non-WIMP interfaces—with the following modifications:

• They are restricted to simple one-way, non-ambiguous constraints; that is, the constraint graph is simultaneously solvable. (We will consider cases where, for execution speed, we do not solve all of it; but given enough execution time, the constraints are written so that they could all be satisfied simultaneously.)

• Constraints are permitted to be temporary; that is, there is an efficient mechanism for turning constraints on and off, and it can be triggered by the discrete user interface.

• The constraint specification is combined with an additional mechanism for discrete interactions and incorporated into our overall UIDL.

While we introduced the plugboard constraint graph for its expressive power in describing parallel, continuous interaction, we found that, although constraints are often viewed as introducing performance penalties compared to conventional coding, our approach provides leverage for improving performance or interactive responsiveness. This is because our constraint-based formalism allows a separation of concerns between the desired interactive behavior and the implementation mechanism. The user interface designer can thereby concentrate on and express the former, in a high-level, declarative, continuous-oriented way, while the underlying runtime system can perform optimization, tradeoffs, and conversion into discrete steps independently, beneath the level of the UIDL. One could thus tailor the response speeds of different elements of the user interface (specifically, the individual constraints in the constraint graph) within the available computing resources from moment to moment[58]. All this would be specified separately from the user interface description; the UIDL need only describe the desired behavior (i.e., the behavior if infinite computing resources were available).

By the same token, because the Links and Variables that comprise the continuous portion of our system are only a declarative specification of the desired user interface behavior, a run-time constraint solver is required to implement the interface. The language described up to this point is intended to be independent of the particular constraint solver used, and we see our principal contribution as this model and language, rather than as introducing a new constraint solver into the world. The semantics of our UIDL make relatively modest demands on a constraint solver: a set of non-ambiguous one-way constraints, executed once per video frame, or less frequently if necessary. We have therefore implemented several quite different constraint solvers. We will discuss some of them briefly here, but since we intend the language to work with other solvers as well, the details of our current solvers are not an integral ingredient of our approach. They implement the same semantics, but have different real-time performance characteristics. They are each implemented as alternate definitions for the Link and Variable subclasses, derived from the common LinkBase and VariableBase classes. The examples here can run on any of them, with no changes to the UIDL or to any other code shown here, except for actually loading the subclasses that contain the desired solver. Note that the tagging of certain links as LinkStep (Section 5.5) provides extra runtime information for backward-chaining solvers, and is a no-op for forward-chaining ones.

Our first implementation is a correct but simple-minded forward-chaining constraint solver. Whenever the value of a variable is requested, it performs a complete recalculation of all dirty nodes in the graph. Setting a variable to a new value sets its dirty flag, indicating that links that depend on this variable must be recalculated. Recalculation iterates until all flags are cleared (or a maximum count is reached, to break infinite cycles). Of course, if several variables are requested in succession, and the inputs have not changed, only one full recalculation would be executed, because it would have cleared the dirty flags.
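A minimal sketch of this forward-chaining recalculation, written against the classes of Section 6.2 but not taken from PMIW, is shown below. The static list allLinks and the helper names enabled, InputsDirty(), and ClearInputDirtyFlags() are assumptions; the iterate-until-clean loop with a cycle-breaking cap follows the description above.

    void Link::Recalc() {
        const int maxPasses = 100;                 // cap to break infinite cycles
        for (int pass = 0; pass < maxPasses; pass++) {
            bool changed = false;
            for (Link *l : allLinks) {             // every Link registered itself here
                if (l->enabled && l->InputsDirty()) {
                    l->Evaluate();                 // uses GetI()/SetI() internally
                    l->ClearInputDirtyFlags();     // these inputs are now accounted for
                    changed = true;
                }
            }
            if (!changed) break;                   // all dirty flags are cleared
        }
    }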
Our next solver is LoVe, a backward-chaining system using incremental, lazy evaluation. Its algorithm is based on that of Hudson's Eval/Vite system[26, 28]. This is an optimal algorithm, which evaluates the minimal set of links for each request. LoVe accepts cyclic graphs; i.e., one can define a relationship x = x' + 4, where x' is the previous value of x. When the value of a variable is requested, LoVe recursively finds all the links that need to be re-evaluated and evaluates them in the correct order (so that each link is evaluated at most once). Because the algorithm is incremental, there may be some links that don't need to be re-evaluated because their dependencies are up to date. As LoVe finds links to be re-evaluated, it also marks them as visited so that it will know when a cycle exists in the graph, and cease recursion. The algorithm clears the dirty flag of the variables that depend on each link that is re-evaluated, to record the fact that these variables are now up to date. LoVe adds some new aspects to the algorithm of Eval/Vite, which are specific to our application, such as the ability to turn Conditions on or off without reinitializing and a set of hooks to accommodate time management features. LoVe is set up to allow its user to change the constraint graph at run time via Conditions, thereby redefining the relationships of the variables and modifying the behavior of the system dynamically. When it is done, the algorithm and the data structure remain intact; there is thus no penalty for changing the graph as often as desired at run time.

Despite the performance benefits of a demand-driven solver, there are a few cases where a formula is easier, or simply becomes possible, to express if a forward-chaining solver could be assumed. With lazy evaluation, we cannot be sure that a particular output variable will be calculated on every frame and, therefore, that those constraints that feed it directly or indirectly will be evaluated every frame. We have seen examples of cases where it is more straightforward to express an algorithm if we can assume that certain constraints are called fairly regularly, regardless of the state of the demand-driven solver. As mentioned in Section 5.5, we support this by defining the LinkStep subclass; instances of this subclass are called more regularly than ordinary links, that is, regardless of whether the solver needs their output. There are three situations where this is necessary:

• Input history-dependent calculations, such as the Orbit example in Section 5.5.

• Simulations. For example, a simple physical simulation involving colliding objects is often easier to write if you do not have to predict ahead when the moving objects are going to touch, but can just write code to step through the simulation at a fixed time step.

• Recognizer links, which scan the input stream for patterns (gestures, eye fixations) and fire tokens when they are recognized.

As seen in the main loop above, LinkSteps are evaluated explicitly once on each cycle. Finally, we allow for LinkStep processing that should be done more often than once per frame. Links that process streaming input and search it for patterns (such as gesture or eye fixation recognizers) or accumulate input history should be executed for every new input value that is handled. They are handled by the optional callback argument to Dispatch (note the call to deviceEye->Dispatch(&LinkStep::Step) above). Dispatch then calls the given routine on every new input value it handles; LinkStep::Step() evaluates the LinkStep links as needed for it.
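The following minimal sketch, again illustrative rather than the PMIW source, shows how LinkStep::Step() might force every enabled LinkStep to be evaluated regardless of demand; under a forward-chaining solver, where everything is recalculated anyway, the tag is a no-op as noted earlier. The static instance list and the enabled flag inherited from Link are assumptions.

    #include <list>

    class LinkStep : public Link {
    public:
        LinkStep()  { instances.push_back(this); }   // register for Step()
        ~LinkStep() { instances.remove(this); }
        static void Step() {
            // Called once per main-loop cycle, and also from Dispatch() for each
            // new input value when passed as the callback (&LinkStep::Step).
            for (LinkStep *ls : instances)
                if (ls->enabled)
                    ls->Evaluate();                   // run even if no output was demanded
        }
    private:
        static std::list<LinkStep *> instances;
    };
    std::list<LinkStep *> LinkStep::instances;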
We are also experimenting with DLoVe, a solver that distributes the constraint-solving workload over several computers to improve speed. As with the other solvers, it is designed to provide identical semantics but use parallel processing to improve performance. Finally, the SHADOW system[41, 42] incorporates its own solver along similar lines, along with features for time management and runtime decimation of link bodies.

6.5. Multi-user Interfaces

The principal use of the distributed constraint solver, DLoVe, is to exploit the computational power of additional workstations to solve the Links and Variables faster. However, it can also be used to support multi-user, collaborative virtual worlds. Using DLoVe, the dataflow graph (that is, all Links, Variables, and Conditions) is automatically shared among all workstations. Updates made on one workstation will (eventually) be seen by all of them. This makes it easy to implement a multi-user interface. The UIDL consists of a single set of state diagrams and dataflow graphs, which together describe what happens in response to inputs from users on all of the devices and workstations. From the interface designer's point of view, the only difference is that the UIDL for a meaningful multi-user interface will typically need to refer to inputs from the different users individually. A mouse or Polhemus on one workstation must thus be named differently from those on another; the UIDL may refer to either or both as required to describe a two-person interaction. In some cases, the users may have different roles in the collaborative interaction (pitcher vs. catcher), so the UIDL code must be able to refer to the mouse of each user specifically. To accommodate this, the (shared) dataflow graph contains an array of Variables, mice[0..N], one for the mouse of each workstation, and the tokens for mouse buttons on the different mice are named LEFTDN_1, LEFTDN_2, etc.

For example, we have developed two-person versions of both Grab and Toss. In the Grab example, the state transition diagram is written so that while one user is dragging the object, the other user can grab it away. The state diagram keeps track of whose turn it is. For variety, in the two-person Toss example, when one user is dragging the object, the other user cannot grab it away. One user can throw the object and either can then catch it. These were simple extensions of the UIDL shown in Figures 5 and 15. No other special coding was required, other than handling the multiple input devices as described above and launching the distributed version of the system on at least two workstations. (Additional workstations may participate in the calculation workload, but this UIDL calls for only two mice, so any other mice would be ignored.)
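As an illustration of how the discrete component might distinguish the two users, the following is a minimal sketch, not the actual UIDL or generated code, of a hand-written IhIo token handler for the two-person Grab. The state names (IDLE, DRAGGING1, DRAGGING2), the Condition names drag1 and drag2, and the GrabObject class are assumptions; the per-user tokens follow the LEFTDN_1/LEFTDN_2 naming convention described above. In PMIW the precompiler would generate this body from the state transition diagram, and the explicit Enable()/Disable() calls stand in for the automatic calls the interpreter makes on entering and leaving the states associated with those Conditions.

    void GrabObject::IhIo(int token) {
        switch (state) {
        case IDLE:
            if (token == LEFTDN_1)      { state = DRAGGING1; drag1.Enable(); }
            else if (token == LEFTDN_2) { state = DRAGGING2; drag2.Enable(); }
            break;
        case DRAGGING1:                     // user 2 may grab the object away
            if (token == LEFTUP_1)      { state = IDLE;      drag1.Disable(); }
            else if (token == LEFTDN_2) { state = DRAGGING2; drag1.Disable(); drag2.Enable(); }
            break;
        case DRAGGING2:                     // symmetric: user 1 may grab it back
            if (token == LEFTUP_2)      { state = IDLE;      drag2.Disable(); }
            else if (token == LEFTDN_1) { state = DRAGGING1; drag2.Disable(); drag1.Enable(); }
            break;
        }
    }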
6.6. Performance

Many non-WIMP interfaces must meet severe performance requirements in order to maintain their perceptual illusions. For virtual reality, in particular, these requirements are the driving force behind the design of most current implementations[51]. We want to introduce higher level, cleaner user interface description languages into this field, but we must not compromise performance. We claim that our underlying model contains nothing that adversely impacts run-time performance—all penalties are paid at compile-time—because the links or constraints are one-way and the event handler technology is straightforward. Keeping the model conceptually simple leaves some degrees of freedom available to our run-time system to manage CPU resources in a specialized way tuned to the peculiarities of a video-driven VR system. The simple, homogeneous system of links also allows us to build optimized constraint systems underneath it, providing the same semantics and incorporating optimizations invisible at the UIDL level, as seen here. We thus try to separate the concerns of interface modeling and runtime optimization.

We obtained a rough subjective impression of performance under LoVe by modifying the world in Figure 12 to increase the complexity of its constraint graph, making each arm point to its immediate neighbor in a chain instead of to one of the foreground arms. Running on an SGI Indigo2 Extreme workstation with a single 200 MHz IP22 CPU, under IRIX 5.3 and Performer 1.2 with our non-distributed LoVe solver, we grasp one of the foreground arms with the cursor (attached to the mouse) and move it rapidly. We found that the frame rate (reported by the pfDrawChanStats() call to Performer) varied between 24 and 36 Hz, and the subjective effect was of very rapid response. This rate was maintained with the input cursor stationary or moving rapidly, and with some Conditions turned off or all turned on. The response to turning a Condition on or off seems subjectively instantaneous.

6.7. SHADOW: Scaling Up to Larger Worlds

We are developing a more ambitious system based on this approach[41, 42]. SHADOW uses the same two-part model, combining one-way constraints for its continuous part with state transition diagrams for its discrete part. It adds further enhancements to the UIDL, particularly in the area of specifying a hierarchy of objects within other objects. It allows entire subsystems defined in their own UIDL specifications to be plugged into a larger constraint graph and activated or deactivated by its event handler. We have demonstrated its scalability by developing a nontrivial “rookery” VR world with it. This fairly complex world of penguins and ice floes required 24 visual program diagrams and a total of 4000 lines of C++ code in our language. Each diagram is a relatively simple, self-contained specification, containing an average of 2.3 states, 1.9 state transitions, and 4.1 constraints. Of the 4000 lines of code, 2000 were array initialization for the vertices of the graphic objects (which would normally have been generated with a 3-D modeling tool) and 1000 were library modules for device and window handling (which would not be written anew for another world). The remaining 1000 lines of “real” code were contained in relatively small, self-contained modules (average procedure body = 24.5 lines, average constraint body = 11.7 lines) and had an average Cyclomatic Complexity measure of 3.0 (a lower value indicates better code maintainability, and values under 10 to 20 are generally considered good).

7. RELATED WORK

The contribution our model makes for non-WIMP interaction is its separation of the interaction into two components, continuous and discrete, and its framework for communication between the two spheres—more than the internals of the two components themselves, which draw on existing techniques. We first separate non-WIMP interaction into continuous and discrete components; then, within each of the two spheres, we build on different threads of previous user interface software research. The discrete component draws on research in event-driven user interface management systems[48].
The continuous component is similar to a dataflow graph or a set of one-way constraints between actual inputs and outputs and draws on research in constraint systems[26, 28]. The model provides the ability to “re-wire” the graph from within the dialogue.

A variety of specification languages for describing WIMP and other previous generations of user interfaces has been developed, and user interface management systems have been built based upon them[45], using approaches such as BNF or other grammar-based specifications[48-50, 53], state transition diagrams[29, 46], event handlers[13, 21], declarative specifications[48], frames[57], and others[14, 18, 36, 43, 54, 61]. For example, although BNF had been a good match for programming languages or batch command interfaces, interactive, moded graphical interfaces were perhaps better captured by state transition diagram-based approaches[30]; and modern modeless WIMP interfaces fit a coroutine-based model[32].

In the continuous domain, several researchers are using constraints for 2-D graphical interfaces[22, 23, 27, 44, 59]. Kaleidoscope[15] is a constraint-based language motivated by 2-D WIMP interfaces, and it explicitly supports temporary constraints. The CONDOR system uses a constraint or data-flow model to describe interactive 3-D graphics[37]. TBAG also uses constraints effectively for graphics and animation in the interface[11]. Gleicher provides constraints that are turned on and off by events[16]. Other recent work in 3-D interfaces uses a continuous approach[55] or a discrete, but data-driven, approach[3]. Mackinlay, Card, and Robertson address the description of interfaces by continuous models by discussing interface syntax as a set of connections between the ranges and domains of input devices and intermediate devices[40]. While their focus is on widgets found in current WIMP interfaces, Abowd[1] and Carr[4, 5] both present specification languages that separate the discrete and continuous spheres along the same lines as this model. Both approaches support the separation of interaction into continuous and discrete as a natural and desirable model for specifying modern interactive interfaces. Carr provides an expressive graphical syntax for specifying this type of behavior, with different types of connections for transmitting events or value changes. Abowd provides an elegant formal specification language for describing this type of behavior, and uses the specification of a slider as a key example. He strongly emphasizes the difference between discrete and continuous, which he calls event and status, and aptly refers to temporary, continuous relationships as interstitial behavior, i.e., occurring in the interstices between discrete events. Kearney and Cremer[9, 10] also use an approach that combines discrete events with continuous data flows to program a sophisticated virtual environment automobile driving simulator.

Other work from the formal specifications area is also relevant, such as that of Sufrin and He[56]; and Zave and Jackson[62], who provide a formal basis for combining multiple specification techniques to describe a single system in a coherent way. Myers' Interactors[34] also combine discrete state changes with what might be viewed as continuous actions. The continuous actions are handled as sequences of state transitions by providing a transition from a state back to itself, which accepts a “mouse-motion” or “value-changed” input token.
Hill[21] and Chatty[6] have both provided UIDLs that handle parallel input from multiple input devices, particularly suited to two-handed input; their approaches are both based on discrete events. Software architectures for virtual reality interfaces have been developed by Feiner and colleagues[12] and by Pausch and colleagues[8]. Green and colleagues developed a toolkit for building virtual reality systems[52]. Most of this work has thus far concentrated on the architecture or toolkit level, rather than the user interface description language. Lewis, Koved, and Ling addressed non-WIMP interfaces with one of the first UIMSs for virtual reality, using concurrent event-based dialogues[38].

8. CONCLUSIONS

We have presented a software model for describing and programming the fine-grained aspects of non-WIMP style interactions (Section 2) and a UIDL that embodies it (3). It is based on the notion that the essence of a non-WIMP dialogue is a set of continuous relationships, most of which are temporary. The underlying model combines a data-flow or constraint-like component for the continuous relationships with an event-based component for discrete interactions, which can enable or disable individual continuous relationships. The language thus separates non-WIMP interaction into two components and provides a framework for connecting the two. To exercise our new model, we have then presented:

• The VRED visual editor for this language (Section 4)

• Some simple examples, running under our PMIW UIMS, to illustrate the applicability of the language for describing non-WIMP interactions (5)

• Our use of constraints in a performance-driven interactive situation and our LoVe and DLoVe constraint systems (6).

Software for virtual environments usually uses one or both of two models: event queues or device polling. Our constraint- or plugboard-based model is slightly different from both of these. Writing a constraint or data-flow graph is different in a small but important way from writing code that says “Whenever variable X changes, do the following calculations.” Instead, the constraint expresses something closer to: “Try to maintain the following relationship to input X, whenever you have time, doing the best you can.” The distinction becomes meaningful when the system is short on time, which is often the case in a virtual environment. We believe this is closer to the user's view of the situation and closer to what the programmer would like to express. Experience with our examples showed the straightforwardness of this declarative specification. The interface designer writes no code to maintain the relationship or handle the change-value events, but simply declares a relationship and then turns it on or off as desired. Our restricted use of such declarative specification does not hurt performance; in fact, it makes it possible to introduce time management and optimization designed specifically for the needs of a video-driven virtual environment.

Finally, we note some areas of non-WIMP interfaces that our language specifically does not address, and how they would be connected to this work:

• 3-D Modeling: We treat this as a separate issue, to be done offline from our system and imported into it (specifically, we can import into Performer from DXF, from Open Inventor, which is similar to VRML, and from a variety of other formats). Figure 5 shows an example of this process using an object that we modeled in IRIS Inventor 1.0 and then imported.
Figure 13 makes clear that our research focuses on interactive behavior, rather than attractive graphical appearance!

• Animation: Our UIDL is not an animation language; it is a language for interaction with the user. These two aspects of the interface would be integrated by having the underlying animation system provide parameters or “levers” that the user's inputs can control. Our system then connects user actions to changes in these animation parameters or levers via variables in our graph or actions taken on state transitions. A user interaction might thereby start or stop an animation, or change its acceleration, path, or destination. PMIW provides the user interface to the “levers”; the animation software provides the levers.

• Physical Modeling: Many VR systems include extensive geometric and physical models and simulations, in contrast to typical WIMP applications. Most systems also require some capabilities that do not mimic the real world and cannot be described simply by their physical properties. There are usually ways to fly or teleport, to issue commands, to create and delete objects, or to search and navigate, along with other facilities beyond the physical world analogy. These behaviors are where our UIDL will be most useful, since they cannot be described by relying only on the real-world analogy. We therefore view physical modeling in a similar way to animation. That is, there might be an underlying physics engine or simulation engine with controllable parameters, and we provide the means for the user's inputs to control parameters of the physical simulation. (Our data-flow language might also be good for the sort of ad-hoc, purpose-built physical models found in many VR systems today.) In the future, we envisage a more general, separate system for handling physical simulation in much the same way that rendering 3-D geometry is now usually handled by a separate graphics system. Our UIDL will specify the aspects of the interface that do not follow directly from physical laws (fly, teleport, delete, pop up a menu) and turn them into inputs to the simulation system; the normal action of the simulation will handle those aspects that follow directly from physical laws.

• Sub-Assemblies: Our current system will obviously explode with complexity as we build bigger worlds; Figure 11 is already too cluttered. To address this, PMIW is designed to allow higher-level sub-assemblies to be defined in a straightforward way. Any set of links and variables can be packaged into a single component, with some of its inputs and outputs exposed and others encapsulated internally, much as an electronic sub-assembly encapsulates a set of components and provides a reduced set of input and output pins. This is currently available in our system, but not yet accessible from the editor. Note that such encapsulation has no bearing on run-time operation. At run time, the components are exploded back into their individual links and variables and executed by the constraint system; the higher-level encapsulations have no performance impact. The Polhemus-to-cursor function that appears in some of the figures is an example of a simple candidate for such an assembly.
In fact, we have implemented it as a plug-compatible module which can be replaced by one that lets a moded mouse control the 3-D position of the cursor instead; both export the same cursorpos variable and thus can be used interchangeably.

Our goal is to provide a model and abstraction that captures the formal structure of non-WIMP dialogues in the way that various previous techniques have captured command-based, textual, and event-based dialogues. We seek to bring higher level, cleaner user interface description language constructs to the problem of building non-WIMP interfaces. We have demonstrated how such a language can be used and implemented and shown that it need not compromise real-time performance.

ACKNOWLEDGMENTS

We thank Alan Bryant, David Copithorne, Ying Huang, Konstantina Kakavouli, Hansung Kim, Quan Lin, George Mills, Eric Reuss, Vildan Tanriverdi, and Daniel Xi, who worked with us; they are students and recent alumni of the Electrical Engineering and Computer Science Department at Tufts. And we thank Linda Sibert and James Templeman of the Naval Research Laboratory for valuable discussions about these issues. This work was supported by National Science Foundation Grant IRI-9625573, Office of Naval Research Grant N00014-95-1-1099, and Naval Research Laboratory Grant N00014-95-1G014. We gratefully acknowledge their support.

REFERENCES

1. G.D. Abowd and A.J. Dix, “Integrating Status and Event Phenomena in Formal Specifications of Interactive Systems,” Proc. ACM SIGSOFT'94 Symposium on Foundations of Software Engineering, Addison-Wesley/ACM Press, New Orleans, La., 1994.

2. B.B. Bederson, L. Stead, and J.D. Hollan, “Pad++: Advances in Multiscale Interfaces,” Proc. ACM CHI'94 Human Factors in Computing Systems Conference Companion, pp. 315-316, 1994.

3. L.D. Bergman, J.S. Richardson, D.C. Richardson, and F.P. Brooks, “VIEW - An Exploratory Molecular Visualization System with User-Definable Interaction Sequences,” Proc. ACM SIGGRAPH'93 Conference, pp. 117-126, Addison-Wesley/ACM Press, 1993.

4. D. Carr, “Specification of Interface Interaction Objects,” Proc. ACM CHI'94 Human Factors in Computing Systems Conference, pp. 372-378, Addison-Wesley/ACM Press, 1994.

5. D.A. Carr, N. Jog, H.P. Kumar, M. Teittinen, and C. Ahlberg, “Using Interaction Object Graphs to Specify and Develop Graphical Widgets,” Technical Report ISR-TR-94-69, Institute For Systems Research, University of Maryland, 1994.

6. S. Chatty, “Extending a Graphical Toolkit for Two-Handed Interaction,” Proc. ACM UIST'94 Symposium on User Interface Software and Technology, pp. 195-204, Addison-Wesley/ACM Press, Marina del Rey, Calif., 1994.

7. W. Citrin, M. Doherty, and B. Zorn, “Design of a Completely Visual Object-Oriented Programming Language,” in Visual Object-Oriented Programming, ed. by M. Burnett, A. Goldberg, and T. Lewis, Prentice-Hall, New York, 1995.

8. M. Conway, R. Pausch, R. Gossweiler, and T. Burnette, “Alice: A Rapid Prototyping System for Building Virtual Environments,” Adjunct Proceedings of ACM CHI'94 Human Factors in Computing Systems Conference, vol. 2, pp. 295-296, 1994.

9. J. Cremer, J.K. Kearney, and Y. Papelis, “HCSM: a Framework for Behavior and Scenario Control in Virtual Environments,” ACM Transactions on Modeling and Computer Simulation, vol. 5, no. 3, pp. 242-267, July 1995.

10. J. Cremer, J.K. Kearney, and Y. Papelis, “Driving Simulation: Challenges for VR Technology,” IEEE Computer Graphics and Applications, pp. 16-20, September 1996.
11. C. Elliot, G. Schechter, R. Yeung, and S. Abi-Ezzi, “TBAG: A High Level Framework for Interactive, Animated 3D Graphics Applications,” Proc. ACM SIGGRAPH'94 Conference, pp. 421-434, Addison-Wesley/ACM Press, 1994.

12. S.K. Feiner and C.M. Beshers, “Worlds within Worlds: Metaphors for Exploring n-Dimensional Virtual Worlds,” Proc. ACM UIST'90 Symposium on User Interface Software and Technology, pp. 76-83, Addison-Wesley/ACM Press, Snowbird, Utah, 1990.

13. M.A. Flecchia and R.D. Bergeron, “Specifying Complex Dialogs in ALGAE,” Proc. ACM CHI+GI'87 Human Factors in Computing Systems Conference, pp. 229-234, 1987.

14. J. Foley, W.C. Kim, S. Kovacevic, and K. Murray, “Defining Interfaces at a High Level of Abstraction,” IEEE Software, vol. 6, no. 1, pp. 25-32, January 1989.

15. B. Freeman-Benson and A. Borning, “The Design and Implementation of Kaleidoscope'90, a Constraint Imperative Programming Language,” Proc. IEEE Computer Society International Conference on Computer Languages, pp. 174-180, April 1992.

16. M. Gleicher, “A Graphics Toolkit Based on Differential Constraints,” Proc. ACM UIST'93 Symposium on User Interface Software and Technology, pp. 109-120, Addison-Wesley/ACM Press, Atlanta, Ga., 1993.

17. R. Gossweiler, C. Long, S. Koga, and R. Pausch, “DIVER: A Distributed Virtual Environment Research Platform,” IEEE Symposium on Research Frontiers in Virtual Reality, San Jose, Calif., October 25-26, 1993.

18. M. Green, “The University of Alberta User Interface Management System,” Computer Graphics, vol. 19, no. 3, pp. 205-213, 1985.

19. M. Green and R.J.K. Jacob, “Software Architectures and Metaphors for Non-WIMP User Interfaces,” Computer Graphics, vol. 25, no. 3, pp. 229-235, July 1991.

20. M. Green and S. Halliday, “Geometric Modeling and Animation System for Virtual Reality,” Comm. ACM, vol. 39, no. 5, pp. 46-53, May 1996.

21. R.D. Hill, “Supporting Concurrency, Communication and Synchronization in Human-Computer Interaction-The Sassafras UIMS,” ACM Transactions on Graphics, vol. 5, no. 3, pp. 179-210, 1986.

22. R.D. Hill, “The Rendezvous Constraint Maintenance System,” Proc. ACM UIST'93 Symposium on User Interface Software and Technology, pp. 225-234, Atlanta, Ga., 1993.

23. R.D. Hill, T. Brinck, S.L. Rohall, J.F. Patterson, and W. Wilner, “The Rendezvous Architecture and Language for Constructing Multiuser Applications,” ACM Transactions on Computer-Human Interaction, vol. 1, no. 2, pp. 81-125, June 1994.

24. K. Hinckley, R. Pausch, J.C. Goble, and N.F. Kassell, “A Survey of Design Issues in Spatial Input,” Proc. ACM UIST'94 Symposium on User Interface Software and Technology, pp. 213-222, Marina del Rey, Calif., 1994.

25. D. Hix, J.N. Templeman, and R.J.K. Jacob, “Pre-Screen Projection: From Concept to Testing of a New Interaction Technique,” Proc. ACM CHI'95 Human Factors in Computing Systems Conference, pp. 226-233, Addison-Wesley/ACM Press, 1995. http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/dh_bdy.htm [HTML]; http://www.eecs.tufts.edu/~jacob/papers/chi95.txt [ASCII].

26. S. Hudson and I. Smith, “Practical System for Compiling One-Way Constraint into C++ Objects,” Technical Report, Georgia Tech Graphics, Visualization, and Usability Center, 1994.

27. S.E. Hudson, “Graphical Specification of Flexible User Interface Displays,” Proc. ACM UIST'89 Symposium on User Interface Software and Technology, pp. 105-114, Addison-Wesley/ACM Press, Williamsburg, Va., 1989.
28. S.E. Hudson, “Incremental Attribute Evaluation: A Flexible Algorithm for Lazy Update,” ACM Transactions on Programming Languages and Systems, vol. 13, no. 3, pp. 315-341, July 1991.

29. R.J.K. Jacob, “Executable Specifications for a Human-Computer Interface,” Proc. ACM CHI'83 Human Factors in Computing Systems Conference, pp. 28-34, 1983.

30. R.J.K. Jacob, “Using Formal Specifications in the Design of a Human-Computer Interface,” Communications of the ACM, vol. 26, no. 4, pp. 259-264, 1983. Also reprinted in Software Specification Techniques, ed. N. Gehani and A.D. McGettrick, Addison-Wesley, Reading, Mass., 1986, pp. 209-222. http://www.eecs.tufts.edu/~jacob/papers/cacm.txt [ASCII]; http://www.eecs.tufts.edu/~jacob/papers/cacm.ps [Postscript].

31. R.J.K. Jacob, “An Executable Specification Technique for Describing Human-Computer Interaction,” in Advances in Human-Computer Interaction, Vol. 1, ed. by H.R. Hartson, pp. 211-242, Ablex Publishing Co., Norwood, N.J., 1985.

32. R.J.K. Jacob, “A Specification Language for Direct Manipulation User Interfaces,” ACM Transactions on Graphics, vol. 5, no. 4, pp. 283-317, 1986. http://www.eecs.tufts.edu/~jacob/papers/tog.txt [ASCII]; http://www.eecs.tufts.edu/~jacob/papers/tog.ps [Postscript].

33. R.J.K. Jacob, “Eye Movement-Based Human-Computer Interaction Techniques: Toward Non-Command Interfaces,” in Advances in Human-Computer Interaction, Vol. 4, ed. by H.R. Hartson and D. Hix, pp. 151-190, Ablex Publishing Co., Norwood, N.J., 1993. http://www.eecs.tufts.edu/~jacob/papers/hartson.txt [ASCII]; http://www.eecs.tufts.edu/~jacob/papers/hartson.ps [Postscript].

34. R.J.K. Jacob, J.J. Leggett, B.A. Myers, and R. Pausch, “Interaction Styles and Input/Output Devices,” Behaviour and Information Technology, vol. 12, no. 2, pp. 69-79, 1993. http://www.eecs.tufts.edu/~jacob/papers/bit.txt [ASCII]; http://www.eecs.tufts.edu/~jacob/papers/bit.ps [Postscript].

35. R.J.K. Jacob, “A Visual Language for Non-WIMP User Interfaces,” Proc. IEEE Symposium on Visual Languages, pp. 231-238, IEEE Computer Society Press, 1996. http://www.eecs.tufts.edu/~jacob/papers/vl.txt [ASCII]; http://www.eecs.tufts.edu/~jacob/papers/vl.ps [Postscript].

36. D.J. Kasik, “A User Interface Management System,” Computer Graphics, vol. 16, no. 3, pp. 99-106, 1982.

37. M. Kass, “CONDOR: Constraint-Based Dataflow,” Proc. ACM SIGGRAPH'92 Conference, pp. 321-330, Addison-Wesley/ACM Press, 1992.

38. J.B. Lewis, L. Koved, and D.T. Ling, “Dialogue Structures for Virtual Worlds,” Proc. ACM CHI'91 Human Factors in Computing Systems Conference, pp. 131-136, Addison-Wesley/ACM Press, 1991.

39. J. Liang and M. Green, “JDCAD: A Highly Interactive 3D Modeling System,” Computers and Graphics, vol. 18, no. 4, pp. 499-506, Pergamon Press, London, 1994.

40. J.D. Mackinlay, S.K. Card, and G.G. Robertson, “A Semantic Analysis of the Design Space of Input Devices,” Human-Computer Interaction, vol. 5, pp. 145-190, 1990.

41. S.A. Morrison and R.J.K. Jacob, “A Specification Paradigm for Design and Implementation of Non-WIMP User Interfaces,” ACM CHI'98 Human Factors in Computing Systems Conference, pp. 357-358, Addison-Wesley/ACM Press, Poster paper, 1998.

42. S.A. Morrison, “A Specification Paradigm for Design and Implementation of non-WIMP Human-Computer Interactions,” Doctoral dissertation, Tufts University, 1998.

43. B.A. Myers, Creating User Interfaces by Demonstration, Academic Press, Boston, 1988.
44. B.A. Myers, D.A. Giuse, R.B. Dannenberg, B. Vander Zanden, D.S. Kosbie, E. Pervin, A. Mickish, and P. Marchal, “Garnet: Comprehensive Support for Graphical, Highly-Interactive User Interfaces,” IEEE Computer, vol. 23, no. 11, pp. 71-85, November 1990.

45. B.A. Myers, “User Interface Software Tools,” ACM Transactions on Computer-Human Interaction, vol. 2, no. 1, pp. 64-103, March 1995.

46. W.M. Newman, “A System for Interactive Graphical Programming,” Proc. Spring Joint Computer Conference, pp. 47-54, AFIPS, 1968.

47. J. Nielsen, “Noncommand User Interfaces,” Comm. ACM, vol. 36, no. 4, pp. 83-99, April 1993.

48. D.R. Olsen, User Interface Management Systems: Models and Algorithms, Morgan Kaufmann, San Mateo, Calif., 1992.

49. P. Reisner, “Formal Grammar and Human Factors Design of an Interactive Graphics System,” IEEE Transactions on Software Engineering, vol. SE-7, no. 2, pp. 229-240, 1981.

50. J.R. Rhyne, “Extensions to C for Interface Programming,” Proc. ACM SIGGRAPH Symposium on User Interface Software, pp. 30-45, 1988.

51. J. Rohlf and J. Helman, “IRIS Performer: A High Performance Multiprocessing Toolkit for Real-Time 3D Graphics,” Proc. ACM SIGGRAPH'94 Conference, pp. 381-394, Addison-Wesley/ACM Press, 1994.

52. C. Shaw, M. Green, J. Liang, and Y. Sun, “Decoupled Simulation in Virtual Reality with the MR Toolkit,” ACM Transactions on Information Systems, vol. 11, no. 3, pp. 287-317, 1993.

53. B. Shneiderman, “Multi-party Grammars and Related Features for Defining Interactive Systems,” IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-12, no. 2, pp. 148-154, 1982.

54. J.L. Sibert, W.D. Hurley, and T.W. Bleser, “An Object-Oriented User Interface Management System,” Computer Graphics, vol. 20, no. 4, pp. 259-268, 1986.

55. M.P. Stevens, R.C. Zeleznik, and J.F. Hughes, “An Architecture for an Extensible 3D Interface Toolkit,” Proc. ACM UIST'94 Symposium on User Interface Software and Technology, pp. 59-67, Addison-Wesley/ACM Press, Marina del Rey, Calif., 1994.

56. B.A. Sufrin and J. He, “Specification, Analysis and Refinement of Interactive Processes,” in Formal Methods in Human-Computer Interaction, ed. by M. Harrison and H. Thimbleby, pp. 153-200, Cambridge University Press, Cambridge, UK, 1990.

57. P. Szekely, “Modular Implementation of Presentations,” Proc. ACM CHI+GI'87 Human Factors in Computing Systems Conference, pp. 235-240, 1987.

58. S.L. Tanimoto, “VIVA: A Visual Language for Image Processing,” Journal of Visual Languages and Computing, vol. 1, no. 2, pp. 127-139, June 1990.

59. B. Vander Zanden, B.A. Myers, D.A. Giuse, and P. Szekely, “Integrating Pointer Variables into One-Way Constraint Models,” ACM Transactions on Computer-Human Interaction, vol. 1, no. 2, pp. 161-213, June 1994.

60. C. Ware and R. Balakrishnan, “Reaching for Objects in VR Displays: Lag and Frame Rate,” ACM Transactions on Computer-Human Interaction, vol. 1, no. 4, pp. 331-356, December 1994.

61. T. Yunten and H.R. Hartson, “A Supervisory Methodology and Notation (SUPERMAN) for Human-Computer System Development,” in Advances in Human-Computer Interaction, Vol. 1, ed. by H.R. Hartson, pp. 243-281, Ablex Publishing Co., Norwood, N.J., 1985.

62. P. Zave and M. Jackson, “Conjunction as Composition,” ACM Transactions on Software Engineering and Methodology, vol. 2, no. 4, pp. 379-411, October 1993.