SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts Sudarshan Murthy, David Maier Department of CSE, OGI School of Science & Engineering at OHSU {smurthy, maier}@cse.ogi.edu May 5, 2003 Abstract People often impose new interpretations onto existing information. In the process, they work with information in two layers: A base layer, where the original information resides, and a superimposed layer, where only the new interpretations reside. Referring to base information in this manner reduces redundancy and preserves context. Superimposed information management (managing layered information) preserves these layers of information. It facilitates superimposed applications in which people can interact with heterogeneous information sources. Superimposed application developers like to interact with base layers using a common interface regardless of base-layer types or access protocols. Also, users like to perform operations such as navigation and querying with any base layer. Unfortunately, base layers differ in their support for these operations. Appropriate architectural abstractions placed between these layers can ease communication between the two layers and make up for some of the deficiencies of base layers. They can also allow independent evolution of components in these layers. An architecture called SLIM (Superimposed Layer Information Management) has already been defined to demarcate and revisit information inside base layers. However, some superimposed applications need to do more; they also need to access content and context of information inside base layers. We have defined a successor to SLIM, called SPARCE (Superimposed Pluggable Architecture for Contexts and Excerpts), for this purpose. In this paper, we describe the goals, design and implementation of SPARCE. We also present results of evaluating SPARCE. Keywords: Information modeling, superimposed information, software architecture, hypertext, compound documents, excerpts, context, SLIM, SPARCE. 1. Motivation and Introduction The United States Department of Agriculture, Forest Service (USFS) routinely makes decisions to solve (or prevent) problems concerning forests. The public may appeal any USFS decision [6]. The appeal process begins when an appellant sends in an appeal letter. An appeal letter raises issues with a USFS decision and the decision-making process. It frequently cites documents such as the Decision Notice (DN) and Environmental Assessment (EA). A USFS Editor processes all appeal letters pertaining to a decision and prepares an appeal packet for a reviewing officer. An appeal packet contains all documents a reviewing officer might need to recommend a decision about issues raised in appeal letters. The RID letter (RID stands for Records, Information, Documentation) is one of the documents in an appeal packet. This letter lists the issues raised in appeal letters and a summary response for each issue. An Editor synthesizes a RID letter using information in documents such as the DN, EA, and specialists’ reports. In this letter, the editor presents information from other documents in a variety of forms such as excerpts, summaries, and commentaries. In addition, the editor documents the location and identity of the information sources used in synthesizing the letter. Composing a RID letter requires an editor to maintain large working sets. Since it is not unusual for an editor to be charged with preparing appeal packets for several decisions simultaneously, the editor may need to maintain several threads of organization. Though using documents in electronic form can be helpful, such use does not necessarily alleviate all problems. For example, the editor still needs to SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts document the identity and location of information. In using electronic documents, the editor has to cope with multiple document windows open at once. Also, locating information in large documents can be tedious (using traditional search utilities available in applications). We hypothesize that the USFS appeal process (and similar processes in other domains) can benefit from superimposed information management. Superimposed information is information placed over existing sources (called base layers). Word-processor documents, spreadsheets, presentations, databases, and web pages are examples of base layers. Delcambre and others [8] have described the Superimposed Layer Information Management architecture (SLIM) to manage superimposed information. SLIM facilitates addressing of information inside base layers, but it does not facilitate retrieval of content or context from base layers. Some superimposed applications (such as those that might be developed to assist in preparation of RID letters) need such information. We have defined the Superimposed Pluggable Architecture for Contexts and Excerpts (SPARCE) to allow access to content and context information in base layers. We have built two applications—RIDPad and Schematics Browser—to exercise this new architecture. RIDPad is intended to assist a USFS editor to collect and organize material to prepare RID letters. The Schematics Browser allows USFS personnel to view past appeal decisions as instances of entities and relationships in a structured data model. In this paper, we describe the goals for SPARCE, its design and implementation, and the two SPARCE applications. We also present results of architecture evaluation. The rest of this paper is organized as follows: Section 2 gives an overview of superimposed information and SLIM. Section 3 demonstrates the need for excerpts and contexts. In Section 4 we describe the goals, design, and an implementation of SPARCE. In Section 5 we describe two SPARCE applications and present results of evaluating SPARCE. We present related work, future work, and a summary in Sections 6, 7, and 8 respectively. 2. Superimposed Information and Applications Maier and Delcambre [11] define superimposed information as “data placed over existing information sources to help organize, access, connect and reuse information elements in those sources.” They call existing information sources base layers, and data placed over the base layers the superimposed layer. Figure 1 shows these two layers of information. An application that manipulates base information is called a base application; an application that manipulates superimposed information is called a superimposed application. Maier and Delcambre note that superimposed information has been used since before the advent of computers (for example, concordances and commentaries), but that there is a new need to reexamine this concept in the electronic setting. Improved addressability and accessibility of base layers, and increased digitization of information are some reasons for the renewed interest. The rest of this section briefly describes an earlier architecture and application for superimposed information management. Complete details are available in the work of Delcambre, Maier, and others [7, 8, 11]. 2.1 SLIM and SLIMPad Delcambre and others [8] have defined an architecture called Superimposed Layer Information Management (SLIM) for management of superimposed information and applications that use it. SLIM defines an abstraction called a mark to represent an occurrence of information inside base layer. Figure 1 shows how a superimposed layer uses marks to address base information. This figure is adapted from a description of SLIM [8]. Marks provide a uniform interface to address base information, regardless of base-layer type or access protocol. Several mark implementations, typically one per base-layer type, exist in SLIM. A mark implementation decides the addressing scheme appropriate to the base layer it supports. For example, a Page 2 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts mark for a selection in a spreadsheet might contain the row and column numbers for the first and last cell in the selection, whereas a mark for a selection in an XML document might contain an XLink. Superimposed Layer marks Base Layer Information Source1 Information Source2 … Information Sourcen Figure 1: Marks in SLIM architecture. Figure 2: SLIM reference model. Superimposed applications may represent superimposed information in any model, regardless of the base-layer types they interact with. They only need to store IDs of marks, which they can exchange for mark objects (from a Mark Management module as we will see shortly). Superimposed applications work seamlessly with any base-layer type due to the uniform interface supported by all marks. Figure 2 shows a reference model for SLIM as a UML Conceptual Diagram. The Mark Management module (subsystem) is responsible for creating and storing marks, retrieving marks (given a mark ID), and for navigating back to the base layer. The Superimposed Information Management module provides storage service to superimposed applications. Use of this module is optional. The Clipboard is used for inter-process communication (at mark creation time). SLIMPad is a superimposed application that employs SLIM. It allows users to create marks in a variety of base layers and paste them as scraps in a SLIMPad document. Scraps are superimposed information elements associated with a mark ID and a label. Users can also create bundles, which are named groups of scraps and other bundles. Figure 3 shows a SLIMPad document. The rectangles within the workspace denote bundles. The first label inside a bundle (left, top corner of the bundle) is its name. Other contents of the workspace are scraps. A user can activate (double click) a scrap to return to the corresponding base layer. Figure 3: A SLIMPad document. 2.2 Limitations of SLIM Mark management in SLIM consists of mark creation, retrieval, and resolution (navigation) operations. These operations are necessary, but not sufficient for superimposed applications such as those we envision for the USFS appeal process. As we will show in Sections 3 and 4, users sometimes like to see contents and context of a mark (such as the containing paragraph and section heading), from within a superimposed application. SLIM does not support these operations. The SLIM implementation also has some packaging and deployment limitations. For example, the mark management module must run within the process of any superimposed application. As a result, several instances of the mark management module may be loaded at once (one for each instance of a superimposed application) even though one instance would suffice. Some usability issues also exist in SLIM implementation. For example, the mark creation process is modal. That is, a user must create a scrap immediately after creating a mark in a base layer. This limitation is mainly due to the way SLIM uses the shared Clipboard service. The USFS editors we have Page 3 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts interacted with would like to create multiple marks in a base layer, without having to use them immediately. Our representative users have also asked for new facilities (such as reusing marks) in superimposed applications. Providing some of these facilities requires new architecture and application capabilities. 3. Excerpts and Contexts An Excerpt is the content of a marked base-layer element. An excerpt can be of various types. For example it may be plain text, formatted text, or an image. An excerpt of one type could be transformed into other types. For example, formatted text in a word processor could also be seen as plain text, or as a graphical image. Context is information concerning a base-layer element. Presentation information such as font name and color are examples of context. Information such as the containing paragraph and section heading are also examples of context. Because we use the same mechanism to support both excerpts and contexts, we will often use the term “context” to refer to both kinds of information about a mark. 3.1 Need for excerpts and contexts A USFS editor excerpts contents from base layers in the process of preparing a RID letter. In addition, the editor sometimes examines features related to excerpts. For example, the editor may want to determine the heading of the section an excerpt is in or examine the information (such as previous or next sentence) surrounding an excerpt in the base layer. USFS editors would like to see both excerpts and contexts from within a superimposed application. Fulfilling some user needs might require superimposed applications to access context information that a user might not explicitly access. We demonstrate such needs using an example. We have noted some variations in how editors use excerpts in RID letters. An editor sometimes reproduces excerpts exactly as they are in base layers. At other times, the editor retains only parts of the original formatting, or no formatting at all. Compliance with documentation guidelines, consistency, and personal preferences are some of the factors that influence the choice. The need to display an excerpt (formatted exactly or partly as in its base layer), requires a superimposed application to examine the excerpt’s context. Consider the fragment of a web page as displayed by a web browser shown in Figure 4(a). The highlighted portion of the fragment is the marked region. Figure 4(b) shows three possible HTML markups corresponding to the marked region. The first markup specifies the font name and size, the second markup specifies only the font size, and the third markup does not specify any font characteristic. Font characteristics not specified are inherited from an enclosing element (transitively) in the second and third markups. If no enclosing element specifies the necessary font characteristic, the web browser uses a default value, possibly from an application setting. That is, a superimposed application might need to examine markup all the way up to the start of the web page, or even examine an application setting to display the marked region. <font name=”Times New Roman” size=”3”>Cheatgrass, &nbsp;<i>Bromus tectorum</i>,&nbsp;grows near many caves in this project area.</font> <font size=”3”>Cheatgrass, &nbsp;<i>Bromus tectorum</i>,&nbsp;grows near many caves in this project area.</font> Cheatgrass,&nbsp;<i>Bromus tectorum</i>,&nbsp; grows near many caves in this project area. (a) Web-page fragment. (b) Three possible HTML markups for the marked region. Figure 4: A fragment of a web page and some possible markups. Page 4 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts 3.2 Architectural considerations in supporting excerpts and contexts We observe that several kinds of context are possible for a mark. The following is a representative list of context kinds along with examples for each kind. We also note that the context of a mark may include elements of many kinds. Content: Text, graphics. Presentation: Font name, color. Placement: Line number, section. Sub-structure: Rows, sentences. Topology: Next sentence, next paragraph. Container: Containing paragraph, document. Application: Options, preferences. Contexts can vary across base-layer types. For example, the context of a mark to a region on a painting (in a graphics-format base layer) includes background color and foreground color, but it does not include font name. However, the context of a mark to a selection in a web page includes all these three elements. Contexts can also vary between marks of the same base-layer type. For example, an MS Word mark to text situated inside a table may have a “column heading,” but a Word mark to text not situated in a table does not include that kind of context. Lastly, the context of a mark itself may change with time. For example, the context of a mark to a figure inside a document includes a “caption” only as long as a caption is attached to that figure. Thus, architecture to manage excerpts and contexts must support a variety of excerpt types, variety of contexts among and within base-layer types, and the variable context of a mark. In addition, we predict that superimposed application developers prefer a uniform interface to context of any base-layer element. We predict this preference from our experience in architecture design in general, and that of our colleagues (specifically with mark management in SLIM). 4. SPARCE We have designed and implemented the Superimposed Pluggable Architecture for Contexts and Excerpts (SPARCE) to facilitate mark and context management. The desirable qualities1 of SPARCE are: Functionality: Manage marks, contexts, and excerpts. Reusability: Many superimposed applications must be able to use the same SPARCE implementation. More than one superimposed application must be able to run simultaneously on the same computer, and be able to interact with multiple base layers. Modifiability: We must be able to improve SPARCE and its applications independently with minimal affect on each other. Extensibility: We must be able to support new base-layer types and context elements without affecting existing applications. Usability: The implementation must use familiar metaphors, and follow relevant development and user-interface conventions. It must also aid usability of applications developed. Package flexibility: We must be able to customize deployment units (process packaging) to meet application and user needs. Testability: The implementation must aid verification and validation of itself, and that of its applications. The reference model for SPARCE is derived from the model for SLIM (shown in Figure 2). Figure 5 shows the SPARCE reference model as a UML Conceptual Diagram. The Context Management module is the major addition to the model. Superimposed applications depend on this module to obtain the 1 An overview of qualities of software architectures is available in the work of Bass and others [Bass 1998]. Page 5 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts context of a mark. This module depends on Mark Management module to locate information inside base layers. It then interacts with base layers to determine contexts. The Clipboard module is similar to the Clipboard in operating systems such as Macintosh and MS Windows. An implementation might just reuse the available Clipboard object or emulate a clipboard using a shared data structure. SPARCE differs subtly from SLIM in its use of the Clipboard. SLIM does not utilize a base application’s ability to copy information to the Clipboard, but SPARCE does. This change improves usability in some cases (as illustrated in Section 4.1). Figure 5: SPARCE reference model. Figure 6: Two mark creation scenarios. The Superimposed Application, Mark Management, and Superimposed Information Management modules function just as they do in SLIM. SPARCE reuses the Superimposed Information Management module of SLIM to provide storage service to superimposed applications. Consequently, we do not describe that module in this paper. In the remainder of this section, we describe the design of other modules of SPARCE and their implementation. 4.1 Mark management The Mark Management module supports three operations for marks: creation, retrieval, and resolution. Mark creation is the operation of generating a new mark corresponding to a selection in a base layer and storing the mark in a repository. The retrieval operation returns a mark from the mark repository. Resolution is the operation of navigating to a location inside the base layer. Mark creation is the most complex of these operations due to a variety of creation means and base application constraints. Consequently, we dedicate the rest of this sub-section to that operation. The mark creation process consists of two steps: generating the address of base information (and other auxiliary information), and using the information generated to create a mark object in the mark repository. The information generated in the first step is called mark fodder. The address contained in mark fodder depends on the addressing mechanism needed to support the base-layer type. For example, the fodder for a mark to information inside a PDF document contains the starting and ending word indexes; the fodder for a mark to a selection in a spreadsheet contains the row and column numbers for the first and last cell in the selection. There are many possible scenarios for creation of marks, but we describe only two due to space constraints. Figure 6 depicts the two scenarios in a UML Use Case Diagram. The boxes in the figure denote system boundaries. The broken arrows denote object flows. In both the scenarios depicted, a user starts mark creation in a base application and completes it in a superimposed application. The superimposed application in which the user completes mark creation retrieves the mark fodder from the Clipboard, and passes it to the Mark Management module. The Mark Manager creates a mark object from the fodder. It assigns a unique ID to the mark object and stores it in the mark repository. The Copy use case in Figure 6 demonstrates an ideal scenario. Ideally, a user should be able to select base information in a preferred base application, copy it to the Clipboard, and complete mark creation in a superimposed application. This scenario is ideal because it does not require the user to learn any new application, tool, or process to create marks. However, supporting this scenario requires cooperative base Page 6 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts applications: base applications must either copy sufficient information to the Clipboard (to generate mark fodder), or they should allow other applications to hook into the copy process. Microsoft Word and Excel [13] are examples of such cooperative applications. Some base applications do not facilitate tapping into Clipboard operations, but they might provide mechanisms (such as plug-ins or add-ins) to extend their environments. A special mark creation tool can be inserted in to the user interface of such an application. The user operates this special tool to copy mark fodder to the Clipboard, and then completes mark creation in a superimposed application (as in the first scenario). The Mark use case in Figure 6 demonstrates this scenario. Early versions of Adobe Acrobat and Netscape Navigator are examples of base applications in this category. 4.2 Context management SPARCE uses a set of classes and interfaces for context management. Figure 7 shows them in a UML class diagram. Table 1 provides a brief description of the classes and interfaces. SPARCE supports context retrieval for three classes of objects: marks, containers, and applications. SPARCE uses the classes Mark, Container, and Application respectively to model these elements. A Container is an abstraction for a base document (or a portion of that document). An Application is an abstraction for a base application. SPARCE also defines the interface Context-Aware Object to any baselayer element that supports context. The classes Mark, Container, and Application implement this interface. As we mention in Section 3.2, architecture for contexts must accommodate the variety of contexts both across and within base layers. Also, the context of any base-layer element must be accessible through a common interface. SPARCE fulfils these requirements by treating context as a property set (a collection of name-value pairs). In SPARCE, context is the entire set of properties of a base-layer element and a context element is any one property. For example, the text excerpt and font name of a mark are context elements. SPARCE uses the class Context to model a property set and the class Context Element to model a property. Figure 7: Classes and interfaces for context management. The class SPARCE Manager is at the root of SPARCE. It maintains a repository of marks. Superimposed applications use this class to create new marks from mark fodder, and to retrieve existing marks. This class can also provide information about containers and applications where marks are made. The interface Context Agent helps SPARCE achieve its extensibility goal. A class that implements this interface takes a context-aware object and returns its context. That is, SPARCE does not access base-layer elements or their contexts directly. It uses external agents to do so on its behalf. However, SPARCE is responsible for associating a context-aware object with an appropriate context agent. SPARCE obtains the name of the class that will be the context agent for a mark from the fodder used to create the mark. The same agent is used to retrieve context for the mark’s container and application2. SPARCE instantiates the 2 Sharing the same agent can reduce development time because the code needed to retrieve context of a mark, its container, and its application tend to be similar. Page 7 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts context agent class by name whenever a superimposed application accesses the context of a context-aware object. Class/Interface Description Mark A mark to base-layer information. Container The base document (or a portion of document) in which a mark is made. Application The base application in which a mark is made. Context-Aware Object (interface) Interface to any base-layer element that is able to provide its context. The classes Mark, Container, and Application implement this interface. Context The context of a context-aware object. It is a collection of context elements. Context Element A single piece of context information about a context-aware object. Context Agent (interface) Interface to any base-layer. An implementation will retrieve context from a context-aware object. There is usually one implementation per base-layer type. SPARCE Manager Creates, stores, and retrieves marks; associates context-aware objects with appropriate context agents. Word Agent The context agent for MS Word. PDF Agent The context agent for Adobe Acrobat. Table 1: Classes and interfaces for context management. 4.3 Implementation We have implemented SPARCE for Microsoft Windows operating systems. The implementation currently includes support for the following base applications: MS Word, MS Excel, and Adobe Acrobat (PDF files). The agents for these base applications support the following kinds of context: content, presentation, containment, placement, sub-structure, topology, document, and application. We used the following tools to implement SPARCE and support the base applications mentioned above: MS Visual Studio 6 (C++ and Visual Basic), MS XML Parser 4.0, MS Office XP, and Adobe Acrobat 5 SDK. We used ActiveX technology [12] to implement the classes in Figure 7. In addition to implementing the classes shown in Figure 7, we have also implemented the following reusable view facilities. All these facilities, except the Clipboard Viewer, are superimposed applications. Context Browser: A window that displays the current context of a context-aware object. Context Viewer: A window that displays the value of one context element of a context-aware object. Clipboard Viewer: A window that lists mark fodders (available on the Clipboard) that could be used to create new marks. With this facility, users can start many instances of the mark creation process in base layers, and complete those instances later in one or more superimposed applications (as described in Section 4.1). Property Pages: Tabbed windows to display properties of existing context-aware objects. Selection Dialogs: Dialog windows to select existing context-aware objects. For example, a dialog can be used to select a mark already in the marks repository for reuse in a superimposed application. Package Type Package Type Mark and context management ActiveX EXE Context Browser and Viewers ActiveX DLL Mark fodder generation and Clipboard monitoring ActiveX EXE Adobe Acrobat plug-in Regular DLL Clipboard Viewer ActiveX DLL MS Office and PDF context agents ActiveX DLL Clipboard access helpers Regular DLL MS Office add-in ActiveX DLL Table 2: SPARCE packages. Page 8 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts The implementation resulted in eight physical packages (deployment units or process packages). Table 2 lists the packages and their types. ActiveX EXE servers run out of process, whereas ActiveX DLL servers and regular DLLs both run within a process. We note that the packaging shown in Table 2 might need to be changed based on deployment and application needs. We mention some packaging considerations in Section 5.3. 4.4 Supporting new base-layer types and contexts The SPARCE implementation includes the infrastructure needed to interact with any base-layer type. Enabling support for a new base-layer type is as easy as following the five steps shown below. 1. Study the base layer to understand support for marking. This study should include understanding the addressing scheme for the base layer. 2. Decide which mark creation scenarios to support. 3. Implement mark fodder generation. Mark fodder generation is done when mark fodder is copied to the Clipboard. 4. Decide what context elements to support. 5. Implement a context agent for the base layer. The context agent must be packaged in an ActiveX server and registered with the operating system for the current SPARCE implementation. We have also implemented some testing aids. A context agent that does nothing is included so that the development process can be incremental. We recommend using this agent until Step 3 is completed successfully. This proxy agent can be later replaced with the agent developed in Step 5. We have also included a superimposed application (presented in Section 5.1) that can be used to test support for new base-layer types. This application can display properties and contexts of any context-aware object. Changing the context elements supported for a base-layer type only requires changing the property set population in its context agent. We note that adding support for new base-layer types or changing context elements supported does not require any change (or recompilation) to the SPARCE implementation or superimposed applications. However, the context agent will need compilation. 5. Evaluation We have evaluated SPARCE by implementing it and by building superimposed applications to use that implementation. We have also added support for a few popular base-layer types. Table 3 shows the procedure we followed to evaluate SPARCE. The table also shows the qualities each step was aimed at evaluating. The qualities desirable in SPARCE are described in Section 4. We have already described the implementation of SPARCE in Section 4.3. In the remainder of this section we describe the superimposed applications we have developed and the results of evaluation of SPARCE. Evaluation Step Qualities Evaluated Implement SPARCE, support one base-layer type (MS Word) Functionality, usability Implement one superimposed application (RIDPad) Functionality, usability Support new base-layer types and new context elements (MS Excel, Adobe Acrobat) Modifiability, reusability, extensibility Implement viewers Functionality, usability, reusability, testability Implement second superimposed application (Schematics Browser) Functionality, reusability Change deployment scenarios Package flexibility Table 3: Evaluation procedure. Page 9 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts 5.1 RIDPad RIDPad is a superimposed application for the USFS appeal process (described in Section 1). A USFS editor can use this application to collect and organize information needed to prepare a RID letter. A RIDPad instance is a collection of items and groups. An item is a superimposed information element associated with a mark. It has a name and a description. The name is user-defined and the description is the text excerpt from the associated mark. A group is a convenient collection of items and other groups. RIDPad uses a simple information model to store mark IDs and text excerpts. Figure 8(a) shows a RIDPad instance with information concerning the “Road 18 Caves” decision (taken in the Pacific Northwest Region of USFS). The instance shown has eight items in four groups. The group titled “Environmental Assessment” contains two groups. The information in the instance shown comes from three distinct base documents in two different base applications. (The item labeled “Comparison of Issues” contains an MS Excel mark; all other items contain MS Word marks.) All items were created using base-layer support included in the implementation of SPARCE. (a) A RIDPad instance (b) A mark resolved to its base layer. (c) Context of a mark. Figure 8: A RIDPad instance, a base layer, and the Context Browser. RIDPad affords many operations for items and groups. A user can create new items and groups, and move items between groups. The user can also rename, resize, and change visual characteristics such as color and font for items and groups. With the mark associated with an item, the user can examine its properties and browse its context from within RIDPad. The user can navigate to the base layer if necessary. When examining context, the user can copy a context element’s value to the Clipboard and paste it into other applications. For example, a USFS editor can copy information to a RID letter being composed in a word processor. (We are in the process of developing an application to build RID letters based on RIDPad instances.) Figure 8(b) shows the base layer opened for the mark associated with the item titled “FONSI.” The highlighted region corresponds to the mark. Figure 8(c) shows the context browser for the same mark. On the right, the browser shows the text of the mark with full formatting. The hierarchy of the mark’s context is displayed to the left. 5.2 Schematics Browser Appeal letters from different appellants in the USFS appeal process tend to share features. They all contain appellant names and addresses, refer to a DN, and raise issues. Such similarities suggest a schema for appeal letters. A superimposed schematic is a schema superimposed over base information in a model similar to the Entity-Relation model [4]. The Schematics Browser is a superimposed application developed by our colleague Shawn Bowers to demonstrate the use of superimposed schematics. We altered this application to use SPARCE to access base layers. We also enhanced this application so a user can view contexts of base-layer elements. Page 10 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts The Schematics Browser lists superimposed schematics and allows a user to browse an instance of a selected schematic. (At present this application does not support creation of new schematic instances.) Entities and attributes in a schematic instance may be distributed over any number of base layers. Marks can be associated with entities and attributes. The Browser stores mark IDs and other information related to schematic instances in a relational database. Figure 9 shows an instance of a USFS appeal decision (decision on issues raised in appeals) schematic opened in the Schematics Browser. The numbered circles in the figure are annotations used to describe the figure and not part of the interface itself. Window 1 lists instances of the appeal decision schematic. Window 2 displays the schematic instance selected in Window 1. Window 3 displays details of an entity instance. 1 2 3 Figure 9: Schematics Browser. The “1997 Ranch House Timber Sale” appeal decision schematic instance is shown in Window 2. Although a single schematic instance is displayed in this window (and we are able to browse information as if it comes from a single source), we note that the information currently displayed in Window 3 comes from three distinct base documents. The Issue entity is currently selected in Window 2 and the details of the first Issue entity is displayed in Window 3. The attribute desc in Window 3 is the description of the issue. It has a mark associated with it (indicated by a line under the attribute name). When an entity or an attribute has a mark associated, a user can either visit the base layer or choose to see the context from within the Browser. 5.3 Results We have both qualitative and quantitative results of evaluation of SPARCE. We first present the qualitative results which show that SPARCE has the desirable qualities we listed in Section 4. We then present some quantitative results that provide a measure of time to develop superimposed applications and support new base-layer types. Functionality and reusability: The SPARCE implementation allows superimposed applications to access base-layer elements and their contexts across a variety of base-layer types. We have implemented two superimposed applications using the SPARCE implementation. We did not make any changes in the SPARCE implementation specifically for either of these applications. We have been able to run both superimposed applications simultaneously on the same computer. Modifiability and extensibility: We have been able to modify the SPARCE implementation, the two applications, and the various context agents independently of each other. Upon modification, we have only recompiled the source code for the package modified. That is, we have been able to evolve components of SPARCE and its applications independently. We have added support for new base-layer types and context elements without affecting existing applications or the SPARCE implementation. We have successfully followed the process outlined in Section 4.4 to support new base-layer types. Usability: The mark creation process requires users to perform only operations that are familiar and natural to them. Also, the mark creation process is similar across base-layer types. The process is nonmodal, and it allows users to create many marks in base layers before using them. The Context Browser and Viewers use simple and familiar user-interface elements. The same is true for the Clipboard Viewer. We predict that a user familiar with one superimposed application will not need much learning to use other superimposed applications (if the display services are reused). Ease of development of superimposed applications is also a measure of usability of the architecture implementation. We have Page 11 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts observed that operations with marks and contexts are usually done with a few lines of code (between three and eight lines of code in MS Visual Basic). Package flexibility: Any of the ActiveX servers listed in Table 2 (Section 4.3) can be implemented as an in-process or an out-of-process server. We have verified that reconfiguring packages in this manner does not affect the functionality of SPARCE or its applications. However, we caution that it is important to clearly understand the effect of making such changes. For example, speed of performance is better when operations are performed within a process. However, operations performed outside a process provide resilience to both processes involved (one process aborting does not affect the other). Instantiating a class on a remote computer implicitly requires an out-of-process server. Lastly, the patterns of usage and the technology used to implement the services should be considered in packaging. Testability: The reusable view facilities (listed in Section 4.3) serve as testing aids for superimposed application developers and users. The do-nothing context agent (explained in Section 4.4) and RIDPad are both testing aids when adding support for new base-layer types. We have not described the testability of the SPARCE implementation itself in this paper due to space constraints. We have some statistics that give a feeling for the time to develop superimposed applications and to add support for new base-layer types. We developed the RIDPad application in approximately 30 hours. We altered the existing Schematics Browser in approximately six hours to use SPARCE. We added support for MS Word and Excel in about 7 hours combined. We added support for Adobe Acrobat in about 12 hours (of which six hours were spent in exploring alternatives and understanding the Acrobat SDK). We realize our evaluation process is mostly qualitative. As a result, our results and conclusions have an air of subjectivity. Nevertheless, our observations of the behavior of the SPARCE implementation and its applications show that we have successfully met our goals. With SPARCE, we feel that superimposed application developers will be able to focus mostly on their application needs. We feel similarly about adding support for new base-layer types. 6. Related work Visions of imposing new interpretations onto existing information existed decades ago. Vannevar Bush hypothesized that due to advancements (after 1945) in the field of photocells, microfilms, and photography, researchers would be able to easily record their observations [5]. He also hypothesized that being able to record easily would increase the volume of scientific records and that there must be an efficient means of consulting them. He posited Memex, an imaginary device to store and consult information efficiently. Memex offers the following features to organize information: browsing, navigation, annotations, indexing, associations, and sharing. T.H. Nelson proposed a file structure called the Evolutionary List File (ELF) to meet requirements of a system he was developing for personal filing and manuscript assembly [14]. He termed the file structure evolutionary because it needed to support evolving contents and arrangements. He also determined that list structures were needed to support categories of information. Hypertext is a system of laying out information for non-linear media [14]. In hypertext, a user creates networks of information by establishing relationships (links) among different pieces of information (nodes). The World Wide Web is an example hypertext system. NoteCards, Intermedia, and Dexter [9, 16, 10] are hypertext systems closely related to SPARCE. A compound document is a document composed using information from disparate sources. It allows a user to create one document in which data from many applications can be composed and manipulated, often without leaving the document. OpenDoc and OLE 2 [2, 12] are compound document systems closely related to SPARCE. Weck has built a user interface around a compound document editor (based on a system called Oberon), but that work is applicable only in the domain of symbolic computation [15]. Page 12 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts Hypertext systems, compound document systems, and superimposed information management systems are all attempts to (intentionally or not) realize the visions of Bush and Nelson. However, these systems differ significantly in how they realize those visions. Hypertext systems and compound document systems are useful to layout information for non-linear media and linear media respectively. However, SPARCE can be used to build applications to layout information for either kind of media. Hypertext systems have historically been application-oriented, while compound document systems have historically been architecture-oriented. SPARCE is similar to compound document systems in that it too provides architectural support to build superimposed applications. Domain-specific applications are easier to build with architectural support. Hypertext systems do not manage contexts, and compound document systems manage only content context. SPARCE can manage many kinds of context. NoteCards Intermedia Dexter OpenDoc OLE 2 SPARCE Base type 2 (text, graphics) 3 (text, graphics, timelines) Any Any Any Any Base location Proprietary File system Proprietary Any Any Any Base granularity Document Sub-document Both Both Both Both Link cardinality 2 2 Any 2 2 2 Link traversal Directed Undirected Configurable Directed Directed Directed Dependency Lisp MacDraw None CORBA COM None Operating system Any Macintosh Any Any MS Windows Any Context kinds None None None Content Content Many Table 4: Comparison of SPARCE with related systems. Table 4 compares SPARCE with the related systems mentioned earlier in this section. From the table it is clear that SPARCE combines the strengths of hypertext systems and compound document systems. Like compound document systems, SPARCE can address either whole base layers or just parts. Like Dexter, SPARCE is independent of any programming language or operating system. 7. Future work Our work so far on SPARCE has demonstrated the feasibility of using architectural support to manage marks and contexts for use by superimposed applications. There is a lot of future work possible in improving, validating and applying SPARCE. The following is a list of possible future directions: Validate implementation with assistance from representative users. Benchmark SPARCE against alternative approaches. Evaluate alternatives for context caching. Explore approaches to sharing and distributing superimposed information. Explore modeling constructs for superimposed information. For example, sets and lists. Develop support for base-layer types such as databases, emails, and multimedia. Consider application to other domains. For example, medical information management. Explore additional operations on marks and contexts. For example, mark containment comparisons and queries over contexts. Explore means to manage changing base layers (contents and location). 8. Summary We have demonstrated the need for contexts and excerpts. We have presented SPARCE, architecture to support contexts and excerpts in superimposed applications. We have demonstrated, using example applications, how different superimposed applications reuse the same architecture implementation. We Page 13 of 14 SPARCE: Superimposed Pluggable Architecture for Contexts and Excerpts have also shown that new types of base layers and contexts can be supported without affecting existing superimposed applications. We have highlighted the differences between SPARCE and other related systems. We have also mentioned possible future directions for our work. 9. Acknowledgements We thank Professor Lois Delcambre, the SLIM team in the Database and Object Technology group, and all reviewers. We thank John Davis for helping us understand the USFS appeal process and for providing sample appeal packets. We thank Shawn Bowers for sharing the Schematics Browser’s source code. This work was supported in part by the National Science Foundation Grant IIS-0086002. 10. References [1] Adobe Systems Incorporated. Acrobat Software Development Kit. Available online <http://partners.adobe.com/asn/developer/> Accessed 2003; Apr. 27. [2] The OpenDoc Technical Summary. 1994. In: Apple World Wide Developers Conference Technologies CD; April; 1994; San Jose; CA. [3] Bass L, Clements P, Kazman R. 1998. Software Architecture in Practice; Addison-Wesley; 1998. [4] Bowers S, Delcambre L, Maier D. 2002. Superimposed Schematics: Introducing E-R Structure for In-Situ Information Selections. In: Proceedings of ER 2002; Pages 90–104; Springer LNCS 2503; 2002. [5] Bush V. 1945. As We May Think. In: The Atlantic Monthly; 1945; July. [6] Appeal of Regional Guides and National Forest Land and Resource Management Plans. 2000. Code of Federal Regulations, Title 36, Chapter 2, Part 217; U.S. National Archives and Records Administration; 2000; July 01. [7] Delcambre L, Maier D. 1999. Models for Superimposed Information. In: Proceedings of ER Workshops 1999; Pages 264-280; 1999; November 15-18; Paris, France; Springer; 1999; ISBN 3-540-66653-2. [8] Delcambre L, Maier D, Bowers S, Weaver M, Deng L, Gorman P, Ash J, Lavelle M, Lyman J. 2001. Bundles in Captivity: An Application of Superimposed Information. In: Proceedings of ICDE 2001; Pages 111-120; 2001; April 2-6; Heidelberg, Germany; IEEE Computer Society; 2001; ISBN 0-7695-1001-9. [9] Halasz F.G, Moran T.P, Trigg R.H. 1987. NoteCards in a Nutshell. In: Proceedings of the ACM CHI+GI Conference; 1987; Pages 45-52; ACM Press; New York, NY. [10] Halasz F.G, Schwartz F. 1994. The Dexter Hypertext Reference Model. In: Communications of the ACM; Volume 37; Issue 2; 1994; February; Pages 30-39; ACM Press; New York, NY. [11] Maier D, Delcambre L. 1999. Superimposed Information for the Internet. In: Informal Proceedings of WebDB ’99; Pages 1-9; 1999; June 3-4; Philadelphia, Pennsylvania. [12] Microsoft Corporation. 1995. The Component Object Model Specification; 1995; Oct. 24. [13] Microsoft Corporation. Microsoft Office Development Resources. Available online <http://msdn.microsoft.com/> Accessed 2003; Apr. 27. [14] Nelson T.H. 1965. A File Structure for The Complex, The Changing and the Indeterminate. In: Proceedings of ACM 20th National Conference; 1965; Aug. 24-26; Cleveland, OH; Pages 84-100. [15] Weck W. 1996. Document-Centered Presentation of Computing Software: Compound Documents are Better Workspaces. In: Proceedings of the Fourth International Symposium on Design and Implementation of Symbolic Computation Systems; Karlshue, Germany; Springer LNCS 1128. [16] Yankelovich N, Haan B.J, Meyrowitz N.K, Drucker S.M. 1988. Intermedia: The Concept and the Construction of a Seamless Information Environment. In: IEEE Computer 21; 1; 1988; January; Pages 81-83, 90-96; IEEE. Page 14 of 14