Florian Campora (00208108)
Hugo Questroy (00077267)
Andrea Dini (00245003)
Adel Barkallah (00399422)
Laurent Garambois (00208019)
All in BEng4 (Hons) Computer Networks & Distributed Systems

DATA VISUALISATION & VISUAL QUERYING
ODB Coursework
10/05/2002

Contents

Contents
Table of Figures
I. Data Visualisation
1.1 Need
1.2 Concept
1.3 Type of Information
1.3.1 1-Dimensional
1.3.2 2-Dimensional
1.3.3 3-Dimensional
1.3.4 Temporal
1.3.5 Multi-Dimensional
1.3.6 Tree
1.3.7 Networks
1.4 Visual Information seeking mantra
1.4.1 Focus & Context
1.4.2 Zooming and Filtering
1.4.2 Details-on-demand
II. Visual Querying
2.1 Need
2.2 Concept
2.3 Form-based representation
2.4 Diagram based data querying
2.5 Icon based data querying
2.6 Hybrid based data querying
Conclusion
References

Table of Figures

Figure 1 : Stock Price Graph
Figure 2 : Information Visualizer ©Copyright Rolf Daessler, 1995 [4].
Figure 3 : Inxight Table Lens
Figure 4 : Document Lens
Figure 5 : ArcExplorer 2 screenshot
Figure 6 : AVS/Express – Medical solutions [9]
Figure 7 : WebBook screenshot
Figure 8 : WebForager screenshot
Figure 9 : Microsoft Project 2000
Figure 10 : LifeLines
Figure 11 : alternate interfaces
Figure 12 : FilmFinder
Figure 13 : VisDB screenshot
Figure 14 : Inxight Star Tree Viewer screenshot
Figure 15 : NSFNET11 traffic
Figure 16 : SeeSoft screenshot
Figure 17 : GraphVisualizer3D screenshot
Figure 18 : General concept of Focus & Context
Figure 19 : Fisheye view
Figure 20 : Perspective wall view
Figure 21 : Cone tree visualisation
Figure 22 : Simple view of zooming in Pad++
Figure 23 : Dynamic queries approach
Figure 24 : SpotFire Pro screenshot
Figure 25 : Forms representation compared with others query languages
Figure 26 : Basic representation of a form-based query
Figure 27 : form based query system
Figure 28 : Query with VOODOO
Figure 29 : Another query with VOODOO
Figure 30 : Example of diagram
Figure 31 : SUPER (QBD software example)
Figure 32 : Example of a QBI software
Figure 33 : An application of a QBI software
Figure 34 : Marmotta screenshot

I. Data Visualisation

1.1 Need

The increasing information glut and the advances in data access on digital networks have created a demand for new ways of retrieving information. Raw data is not a resource from which useful information can easily be extracted. It first has to be transformed so that it directly gives the user the information he/she is looking for, especially if the user is a novice in computer science. Text-based data requires a lot of work from the user before any usable information is retrieved.

Because humans have remarkable perceptual abilities to visually analyse, recognise, and detect features of images (colour, shape, size, texture) [1], investigations into presenting data visually to the user have been conducted. The need, therefore, was to transform data into information before displaying it to the user. A new research area, Information Visualisation (InfoVis) [2], was created for this purpose.

Information Visualisation enables people to deal with all this information by taking advantage of their innate visual perception capabilities. For instance, scanning down a long list of numerical records is harder to understand, and makes useful information harder to retrieve, than analysing a graphical representation of the same list, as shown in figure 1.
Date    Price         Date    Price
8/1     104 ¾         8/18    104
8/4     106 ¼         8/19    107 15/16
8/5     106 ½         8/20    108
8/6     107 7/8       8/21    105 ¾
8/8     105 ¼         8/22    106 3/8
8/11    103           8/25    105
8/12    103 7/16      8/26    103 5/16
8/13    104 3/8       8/27    103 5/16
8/14    103 5/8       8/28    101 1/8
8/15    99 15/16      8/29    101 3/8

Figure 1 : Stock Price Graph

Because humans analyse visual features so readily, good information visualisation systems can offer users a way to perceive information more easily, and can also present more information to him/her at one time.

1.2 Concept

A lot of work has been carried out in this area to try to propose standard methods for creating information visualisation systems. Many bad designs have been implemented, because no rules impose a standard structural scheme to follow during the development of such a system. Nevertheless, there are many interesting questions that can, and should, be asked when designing or selecting an appropriate information visualisation system. For instance [3]:

- What are users currently doing to understand information that is not presented as a visualisation?
- How does the new visualisation help?
- When comparing visualisations, do users' perceptions of improved speed and/or better accuracy in understanding the information hold up to testing? Or do users like the appearance of the information visualisation although, in fact, it does not help them understand the information being presented, or even reduces their performance?

The following scheme is a representation of the basic concept of an information visualisation system.

[Diagram labels: Document Database, Indexing, Index, Visualization, Information Space, Browse, Navigation, Information Visualizer]

Figure 2 : Information Visualizer ©Copyright Rolf Daessler, 1995 [4].

To visualize information, it is necessary to define an information space, which is abstract and differs from physical data spaces, which typically have a spatial mapping.
The document database is where the data is stored, and the index contains terms that describe the document contents. These terms are selected by manual or automatic indexing.

An information visualisation system has to be relevant to the type of information treated, in order to create an efficient information space. It also has to offer the user several tasks for interacting with it and navigating through it. These two points lead to two different classifications of existing information visualisation systems. The following two parts focus on the available InfoVis systems, classified by the type of information treated and by the kind of interaction provided.

1.3 Type of Information

In order to be efficient, a data visualisation system has to be relevant to the type of data it represents. For instance, maps will not be represented visually in the same way as hierarchical trees. Shneiderman introduced a taxonomy of data [5] in which he describes seven different types of data:

- 1-Dimensional
- 2-Dimensional
- 3-Dimensional
- Temporal
- Multi-Dimensional
- Tree
- Networks

According to him, when designing an information visualisation system, the developer has to understand what kind of data he/she has to represent. The following subsections therefore describe these data types and give examples of systems for each one.

1.3.1 1-Dimensional

These are, generally, linear data types, including textual documents, program source code, and lists of names in sequential order. Typically, large data sets are formatted in tables or spreadsheets. This makes it impossible to view the data without scrolling up and down and left to right. The user has no way of getting his/her "hands" around the data. And, in order to do any kind of analysis, the user must know what the data contains. To improve 1-Dimensional data representation, visual aspects such as colour and/or shape help to display the information within a single screen and to draw the user's attention to specific records.
One such system is Inxight Table Lens [6]. Inxight Table Lens™ technology provides graphical displays of tabular data that represent a new way to explore datasets, even those that are too large to view and comprehend in table or XY plot form. By creating Table Lens objects, users can interact with their data to find patterns and trends for further exploration. Viewing a graphical representation of over 100 columns and 65,000 rows of data, all on the same screen, makes data trends and correlations jump out. These patterns often represent significant insights and conclusions that are not normally discovered without extensive, time-consuming data analysis.

Figure 3 : Inxight Table Lens

In the above screenshot, the whole data set is displayed, but some specific records are highlighted on the same screen, which is what the user is looking for with this type of information.

Another example of a system which displays one-dimensional data visually is Document Lens [7]. This kind of system maps multiple, reduced pages of text onto a three-dimensional shape, so that the user can find the information more quickly and navigate to a particular page.

Figure 4 : Document Lens

1.3.2 2-Dimensional

Two-dimensional data is data that consists of two primary attributes that are represented in a space. Width and height represent the size of an item, for example, and the placement of an item on an x-axis and a y-axis represents a location in space. A geographic map with city locations, a floor plan of a building, and clusters of related documents in a document collection are all two-dimensional visualizations. However, two-dimensional data does not imply that the data items are defined by only two attributes. The representation focuses on two primary attributes, but the other properties of each item remain reachable.

The most common type of application that uses this kind of data visualisation is the so-called geographic information system (GIS).
The ArcExplorer 2 software developed by the ESRI company [8] provides an easy way to perform GIS functions. It is used for a variety of display, query, and data retrieval applications and supports a wide variety of standard data sources. It can be used on its own with local data sets or as a client to Internet data and map servers.

Figure 5 : ArcExplorer 2 screenshot

1.3.3 3-Dimensional

Basically, this kind of data is used to represent real-world objects, such as molecules, the human body, buildings, and so on. With the recent development of virtual reality programming languages, such as VRML (Virtual Reality Modelling Language) or Java3D, new facilities are available to create such representations of data.

On the other hand, users of these 3D systems have to cope with navigating through them. New concepts, such as position or orientation, are introduced, and some users may not feel comfortable using them. It could then be more complicated to retrieve useful information with this representation than with the initial text-based records. Nevertheless, such representations are very useful, especially in the medical imaging and space domains.

Figure 6 : AVS/Express – Medical solutions [9]

Representation of real-world objects is not the only domain in which 3D representation is useful. Other applications use three-dimensional data to provide a means of exploring and navigating information in a “natural” way, and in doing so enable the user to reach the information he/she is looking for more easily.

The WebBook and WebForager [7] applications use three-dimensional representations of documents and Web pages to improve the process of accessing information on the Web. WebBook collects related Web pages and displays them in a connected, three-dimensional view in which the user can see more than one page, flip through the pages, and directly access a desired page.
WebForager uses the same principle as WebBook but provides several books, in a three-dimensional space, that users can choose and navigate through using the WebBook system. Both applications could lead to an entirely three-dimensional representation of Internet content.

Figure 7 : WebBook screenshot

Figure 8 : WebForager screenshot

1.3.4 Temporal

For temporal data types, timelines are widely used and accepted. It is quite difficult to differentiate temporal data from 1-D data, as both are based on long lists of records. Nevertheless, some properties of temporal data allow a quite different representation to be implemented. Temporal events can be simultaneous or overlapping, and temporal data has multiple underlying scales that require both precise and gross measurements; these two properties differentiate temporal data from simple lists. Common user tasks for this type of data are, for example, finding all events before, after, or during some time period.

A widely used system based on this type of representation is Microsoft Project 2000 [10], which allows the user to manage a full range of projects, to schedule and closely track all tasks, and to exchange project information with other members of the project team. This tool uses a timeline to enable the user to see at a glance the duration of events, when events occur in relation to each other, and which events depend on other events.

Figure 9 : Microsoft Project 2000

Another example which shows the power of visual timelines is the LifeLines [11] system developed at the University of Maryland. LifeLines provides an interface for visualizing biographical or personal history information. The following figures are examples from a medical application of this tool. In this application, the complete medical history of a patient is entered into a database.
The LifeLines interface provides an overview of a patient's history on a timeline and provides tools for changing scale and focusing on specific details. Events, attributes, and relationships from the patient's entire available medical history are indicated by icons, horizontal lines, colour, and line thickness.

Figure 10 : LifeLines

Figure 11 : alternate interfaces

The interface used by LifeLines is only one application of a framework that is designed to be applied to many types of personal histories, such as court records and professional histories, where the relationships between events are more complicated than can be represented by the simpler timelines used in project management.

1.3.5 Multi-Dimensional

Basically, all the previous data representations could be considered as subsets of the Multi-Dimensional representation. This kind of representation is aimed at data items that have more than three attributes, as three or fewer attributes can be represented in 1D, 2D or 3D. Contrary to two-dimensional data representation, where two attributes are more important than the others, in this kind of representation all the attributes are more or less equal. Multi-dimensional representation can therefore be used, for instance, in an application which allows the user to classify data items according to the values of any of their attributes.

An early example of an application for visualizing multi-dimensional data is FilmFinder, introduced by Christopher Ahlberg and Ben Shneiderman [12].

Figure 12 : FilmFinder

In this system, the user is able to look for films according to several attributes. Thanks to the scroll bars on the right-hand side of FilmFinder, users can modify their query interactively, and the results appear immediately in the main window.

An extension of this approach is VisDB, a system for multidimensional data visualisation proposed by Keim and Kriegel in 1994 [13]. In this system, the number of data items that can be visualized is much higher than in other approaches.
This technique uses each pixel of the display to represent one data value. This means that the number of data values which can be visualized at one time is limited only by the number of pixels of the display. The generated visualisations are query-dependent. Query-dependency means that not only the data items fulfilling the query are visualized, but also a number of data items that only approximately fulfil the query.

Figure 13 : VisDB screenshot

1.3.6 Tree

Tree, or hierarchical, data is data that has an inherent structure in which each item, or node, has a single parent node (except for the root node). Hierarchical structures are quite common and are perhaps those that users use without being aware of it. Business organizations, computer data storage systems, and genealogical trees are all examples of hierarchical data organized in a tree structure.

The most common form, used by everyone, is the Windows Explorer present in all Windows versions after 3.1 (95, 98, NT, 2000, …). It provides the user with a visual structure of the file system present on the computer, allowing him/her to quickly understand the location of his/her files. It also makes dealing with these files (copy, delete, move, …) easier than with a text-based interface such as DOS, the earlier Microsoft operating system.

On the other hand, systems like Windows Explorer become quite unsuitable when the amount of data to represent increases: the user has to collapse nodes or use scroll bars to view the entire data set. A solution to this problem is the Inxight Star Tree Viewer [6], which represents the entire hierarchy at a very small scale, ensuring that the user can see the entire structure, while magnifying the 10 to 30 nodes at the centre of the screen. It thus also enables the user to see details.
Figure 14 : Inxight Star Tree Viewer screenshot

1.3.7 Networks

Network data refers to items (in some instances called nodes) that have relationships (links) to an arbitrary number of other items. Because nodes in network data sets are not restricted in the number of other nodes to which they link (unlike hierarchical nodes, which have a unique parent node), there is no inherent hierarchical structure to network data, and there can be multiple paths between two nodes. Both the items and the relationships between them can have a variable number of attributes.

Network visualisation is an area in which this type of representation can be very useful, because it is very difficult to display all the relationships between several nodes in the same space. Some organisations, such as NCSA (National Center for Supercomputing Applications), have tried to apply this kind of representation to describe, for instance, the network traffic in the USA [14]. The following figure is a visualisation study of inbound traffic, measured in billions of bytes, on the NSFNET11 backbone for September 1991.

Figure 15 : NSFNET11 traffic

Other applications use the network representation. Such data visualisation can, for example, be used to represent relationships between software components. This was the subject of a research project at the University of New Brunswick called GraphVisualizer3D. In contrast to the one-dimensional representation of SeeSoft [15], GraphVisualizer3D uses network diagrams to illustrate how files, classes, variables, and functions interrelate.

Figure 16 : SeeSoft screenshot

Figure 17 : GraphVisualizer3D screenshot

The seven data types described previously are an abstraction of reality. This classification is useful only if it eases discussion and leads to useful discoveries. Some idea of missed opportunities emerges when looking at the tasks users perform. These tasks are described in the following section.
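The difference between tree data (1.3.6) and network data (1.3.7) can be made concrete with a small check. The sketch below is only an illustration under stated assumptions: the function name `is_tree`, the parent-to-child link representation and the sample data are all hypothetical and do not come from any of the systems cited above. It tests the two properties named in the text: every node has a single parent except one root, and the whole structure hangs from that root.

```python
from collections import defaultdict


def is_tree(nodes, links):
    """Return True if the (parent, child) links form one rooted tree over nodes."""
    parents = defaultdict(list)
    children = defaultdict(list)
    for parent, child in links:
        parents[child].append(parent)
        children[parent].append(child)

    # Tree data: exactly one root, and no node with more than one parent.
    roots = [n for n in nodes if not parents[n]]
    if len(roots) != 1 or any(len(parents[n]) > 1 for n in nodes):
        return False

    # Every node must be reachable from the root (rules out detached cycles).
    seen, stack = set(), [roots[0]]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children[node])
    return seen == set(nodes)


# Hypothetical file system hierarchy: each item has a single parent.
FILE_NODES = ["C:", "Docs", "Progs", "report.doc"]
FILE_LINKS = [("C:", "Docs"), ("C:", "Progs"), ("Docs", "report.doc")]

# Hypothetical network: D is linked from both B and C, so two paths lead from A to D.
NET_NODES = ["A", "B", "C", "D"]
NET_LINKS = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]

print(is_tree(FILE_NODES, FILE_LINKS))
print(is_tree(NET_NODES, NET_LINKS))
```

The first call reports a tree and the second does not, which is why a viewer designed for hierarchies cannot be applied unchanged to network data.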
1.4 Visual Information seeking mantra

It is well known that a picture is worth many written words of definition. Moreover, as computer speed and display resolution increase, information visualisation and graphical interfaces are likely to have an expanding role. However, never before have users been confronted with such an information glut. In the near future, nearly all the information contained in databases, digital libraries and other massive data collections will be available on the Internet. The question of how to benefit from these new technologies leads to the question of how to find specific information. Interaction techniques can provide an answer.

Interaction facilities are essential in information visualisation. They provide techniques that allow users to focus on interesting parts of the information stored in the database. A rich and varied set of information visualisations has been proposed in recent years, and it would be tedious to detail most of the tasks they support. Instead, a useful starting point for designing advanced graphical user interfaces is the Visual Information Seeking Mantra presented by Ben Shneiderman: “Overview first (focus and context), zoom and filter, and then details-on-demand”. [5]

1.4.1 Focus & Context

The basic idea of focus and context visualisations is to enable users to have the object of primary interest presented in detail while having an overview, or context, available at the same time. In order to be able to use different situations of usage to discuss and redefine focus and context visualisation, we first need to clarify what focus and context visualisations are.

As an introduction, it might also be useful to see how the terms “focus” and “context” are defined in general. “Focus” has been defined as a centre of activity, attraction, or attention, and as a position, or condition, of sharp definition of an image.
“Context” has been defined as the interrelated conditions in which something exists or occurs, and as the parts of a discourse or treatise that precede and follow a special passage and may fix its true meaning.

There is no explicit definition of focus and context visualisation that researchers agree upon; it is only implicit in the literature. However, the following description of focus and context techniques comes rather close to being such a definition:

“"Focus and context" starts from three premises: First, the user needs both overview (context) and detail information (focus) simultaneously. Second, information needed in the overview may be different from that needed in detail. Third, these two types of information can be combined within a single (dynamic) display, much as in human vision.” [16]

Figure 18 : General concept of Focus & Context

Focus plus context screens offer regions of high resolution and regions of low resolution. Image contents preserve their scaling, even when their resolution varies. The geometry of the displayed content, i.e. the ratio between lengths in the image, is thereby preserved.

Fisheye approach

The fisheye view, an approach proposed by Furnas in 1986 [17], provides context and detail in one view. This display mode is based on the fisheye-lens metaphor, where objects in the centre of the view are magnified and objects farther from the centre are reduced in size. [18]

A fisheye view uses a "degree of interest" function to determine what should be displayed in a complex graph or space. The function states that the degree of interest the user has in viewing a certain object increases with the object's importance and decreases with its distance from the user. The importance and distance factors might be described as size, level in the hierarchy, or some other characteristic of objects in the space or graph. The end effect is that detail is only present at the position where the user is presently browsing. Only the most important objects are shown at a distance.
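The degree of interest function described above can be sketched for a small hierarchy. This is a minimal illustration, not an implementation of any cited system. It assumes one common formulation in which the degree of interest is the a priori importance of a node minus its distance from the focus, with importance taken as the negative depth in the tree and distance as the path length between two nodes. The text above only requires that interest rises with importance and falls with distance, so the subtraction, the function names and the sample hierarchy are all assumptions.

```python
def depth(node, parent):
    """Number of steps from node up to the root."""
    steps = 0
    while parent[node] is not None:
        node = parent[node]
        steps += 1
    return steps


def tree_distance(a, b, parent):
    """Path length between nodes a and b through their closest common ancestor."""
    ancestors = {}
    steps, node = 0, a
    while node is not None:
        ancestors[node] = steps
        node = parent[node]
        steps += 1
    steps, node = 0, b
    while node not in ancestors:
        node = parent[node]
        steps += 1
    return steps + ancestors[node]


def degree_of_interest(node, focus, parent):
    """Assumed form: importance (negative depth) minus distance from the focus."""
    importance = -depth(node, parent)
    return importance - tree_distance(node, focus, parent)


def fisheye_view(focus, parent, threshold):
    """Nodes whose degree of interest reaches the threshold, in sorted order."""
    return sorted(
        n for n in parent if degree_of_interest(n, focus, parent) >= threshold
    )


# Hypothetical hierarchy given as a child -> parent map (the root has no parent).
PARENT = {
    "root": None,
    "A": "root", "B": "root",
    "A1": "A", "A2": "A",
    "B1": "B", "B2": "B",
    "A1x": "A1",
}

print(fisheye_view("A1", PARENT, -2))
print(fisheye_view("A1", PARENT, -4))
```

With the focus on `A1`, a strict threshold keeps only the focus and its ancestors, and relaxing the threshold adds the nearest siblings and children. Changing the importance or distance measure changes which nodes survive, which is why the result depends on how the function is chosen.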
This method [17] assumes that nodes closely related to each other will be located close together. This is not necessarily the case. The relation between any two nodes is at the discretion of the user. There may be many ways to rate importance among nodes, depending on what the user is interested in. In essence, the degree of interest function is likely to change for each user, and that is the problem: multiple users must each see the same graph from their own perspective, which changes the graph entirely. Also, depending on which degree of interest function was chosen, the view might still be far too cluttered with information.

Figure 19 : Fisheye view

Advantages:
- The user can control the distortion factors.

Disadvantages:
- All the information is always kept on the screen, which gets cluttered.
- The view of a graph changes according to the user's perspective.

Perspective Wall and Cone Tree approaches

Perspective Wall and Cone Tree are metaphors for 3-D visualization of linear and hierarchical abstract data respectively. Both visualization techniques use interactive animation to explore dynamically changing views of information structures. They are included in the commercial software product Visual Recall© (XSoft, 1994), which supports the management of large file collections.

The main problem in the visualisation of linear information structures is accommodating the extreme aspect ratio of the information on the computer monitor. A common technique for this particular problem is to integrate a detailed view with contextual, scale-reduced views.

Figure 20 : Perspective wall view

The perspective wall has additional features for user interaction. The wall moves a selected item into the centre panel with a smooth animation. The user can adjust the ratio of detail to context, and a document browser allows each item to be inspected while keeping the context view of the selected item.

Hierarchical data models represent appropriate structures for visualization and navigation.
Cone trees are hierarchies laid out uniformly in three dimensions to minimize the size of the visualized structure and to enable a view of the whole data structure.

Figure 21 : Cone tree visualisation

The cone tree visualisation shows the classification of the documents according to the structure of the file system. A typical search will find all items related to a selected item. The user can rotate the tree to bring a particular item to the front. A document browser allows the selected items to be inspected. Cone trees are a common technique for visualising hierarchical structures of abstract information in 3-D. [19]

Advantages:
- Good for evenly balanced trees.

Disadvantages:
- Large 2D trees become too cluttered.
- Currently limited to roughly 1000 nodes, 10 layers, and a branching factor of 30.
- Large computational requirements.
- Incapable of displaying multiple hierarchical data structures. [20]

1.4.2 Zooming and Filtering

Sometimes the quantity of information available makes it undesirable to display all of it. This might occur for any of the following reasons:

- the quantity of information is such that the system cannot display the data in a reasonable amount of time,
- the data has so many dimensions that it is impractical to display all of it at once in a 2- or 3-dimensional display,
- the user knows that he or she is interested in only a particular subset of the data.

In these cases, the information needs to be filtered in some way. If this filtering takes the form of selecting a subset of the data along a range of numerical values of one or more dimensions, it is a zooming technique. Filtering and zooming work by reducing the amount of context in the display; this distinguishes them from the focus and context techniques, which attempt to retain all the contextual information even if it must be drawn so small as to make it virtually invisible.
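Selecting a subset along numerical ranges of one or more dimensions, as described above, can be sketched in a few lines. This is an illustrative sketch only: the function name `zoom_filter`, the record layout and the film records are hypothetical and are not taken from FilmFinder or any other cited system.

```python
def zoom_filter(records, ranges):
    """Keep the records whose value lies in [low, high] for every listed dimension."""
    return [
        record
        for record in records
        if all(low <= record[dim] <= high for dim, (low, high) in ranges.items())
    ]


# Hypothetical data set with two numerical dimensions.
FILMS = [
    {"title": "Film A", "year": 1968, "length": 141},
    {"title": "Film B", "year": 1982, "length": 117},
    {"title": "Film C", "year": 1991, "length": 95},
    {"title": "Film D", "year": 1994, "length": 154},
]

# No range at all: the whole data set stays in view.
overview = zoom_filter(FILMS, {})

# One range: zoom along the "year" dimension.
zoomed = zoom_filter(FILMS, {"year": (1980, 1995)})

# Two ranges: zoom along "year" and "length" at the same time.
narrowed = zoom_filter(FILMS, {"year": (1980, 1995), "length": (90, 120)})

print([film["title"] for film in narrowed])
```

Each additional range removes records from the display, which is the loss of context that separates this family of techniques from focus and context views.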
Pad++ and Jazz : Zoomable User Interfaces (ZUIs)

ZUIs are an interface technique that presents a huge canvas of information on a traditional computer display by letting the user smoothly zoom in to get more detailed information, and out for an overview. Information can be clustered to show what goes together, and users can intuitively grasp what information is accessible.

Pad++ is a multiscale interface that changes the view of what is displayed in the information space by zooming in or out. Users can change what is in view by panning or zooming. Panning in Pad++ works similarly to the panning modes of many drawing applications such as Adobe Photoshop, where a small hand appears when the user wants to drag the picture across the screen. Whereas many applications implement zooming in discrete jumps triggered by clicking on a zoom tool, zooming in Pad++ is smooth and animated, and can operate over orders of magnitude. Space-scale diagrams are a method used to represent multiscale interfaces, and the Pad++ zoom interface is easily modelled using one.

Figure 22 : Simple view of zooming in Pad++

Jazz is a Java 2 toolkit that supports the development of 2D structured graphics programs in general, and Zoomable User Interfaces (ZUIs) in particular. It is built entirely in Java and runs on all platforms that support Java 2. It uses the Java2D renderer because of its clean design and focus on high-quality 2D graphics, and is organized to support efficient animation, rapid screen updates, and high-quality stills. Jazz makes it easy for Java programmers to build their own animated graphical applications with zooming, multiple cameras, layers, images, etc.

Dynamic queries approach

Dynamic queries applied to the items in the collection are one of the key ideas in information visualisation.
[21] [22] Dynamic queries are a novel approach to information seeking that may enable users to cope with information overload. They allow users to see an overview of the database, explore it rapidly (with updates within 100 msec) and conveniently filter out unwanted information. Users fly through information spaces by incrementally adjusting a query (with sliders, buttons, and other filters) while continuously viewing the changing results. One reason users appear to prefer dynamic queries is the control they have over the database. They quickly perceive patterns in the data, fly through the data by adjusting sliders, and generate new queries within 100 msec based on what they discover through incidental learning.

By contrast, most database queries are specified by typing a command in a keyword-oriented language such as SQL, DIALOG, or FOCUS, and the result is a tabular list of tuples containing alphanumeric fields. This traditional approach is appropriate in many problem-solving tasks, but formulating queries by direct manipulation and displaying the results graphically has advantages in many situations. For novices, learning to formulate queries in a command language may take several hours, and they must then deal with a high rate of syntactic and semantic errors. Many projects have demonstrated that visual information seeking methods can help in formulating queries, and that graphical results shown in context, such as on a map or a wall, aid comprehension. For experts, the benefits of visual interfaces may be greater still, since they will be able to formulate more complex queries and interpret intricate results.

Example:

Geographic applications emerge naturally as candidates for dynamic queries. The test is based on a system for real estate brokers and their clients that allowed them to locate homes by adjusting sliders for the price, number of bedrooms, distance from work, etc.
Each of the 1100 homes satisfying the query appeared as a point of light on a map of Washington, DC. Users could explore the database to find neighbourhoods with high or low prices by moving a slider and watching where the points of light appeared.

Figure 23 : Dynamic queries approach

Geographic queries were supported by allowing users to mark the locations where they and their spouse work. Users could then adjust the sliders for distance to the workplaces to obtain intersecting circles of acceptable homes.

1.4.3 Details-on-demand

The “details-on-demand” task is the last one of the Visual Information Seeking Mantra. It allows the user, after filtering and zooming, to select an item or group and get details when needed. Once a collection has been reduced to a few dozen items, it should be easy to browse the details of the group or of individual items. The usual approach is simply to click on an item to get a pop-up window with the values of each of its attributes. In Spotfire, the details-on-demand window can contain HTML text with links to further information.

Figure 24 : SpotFire Pro screenshot

Once a set of items has been extracted, details-on-demand is available for further manipulation.

Spotfire (http://www.spotfire.com/) is a commercial visualization package that originated in Ben Shneiderman's Human Computer Interaction Laboratory and is an example of the model of overview, then zoom and filter, and finally details on demand. With its ability to overview a large collection of data quickly and then to select and filter on data values with easy-to-use sliders, Spotfire seemed like it would offer a powerful tool for discovering previously unknown patterns and gaps in a collection. The ability to zoom in from abstract representations of an entire collection to specific items also makes Spotfire a useful exhibition and acquisition planning tool. A drawback of Spotfire is the complexity of the interface.
This is particularly true with the current commercial version of the product, which has been enhanced to "deliver a variety of web-enabled applications through its decision analytics workspace called Spotfire.net".

The way of representing data and navigating through this information is only one step in the data visualisation process. As this kind of representation relies on the fact that humans have remarkable perceptual abilities, it seems natural that the queries expressed on the database should be visual as well. The following section presents the visual querying process.

II. Visual Querying

2.1 Need

Databases today are designed and modelled by professionals, but many different people access them to retrieve information, using a query language composed of formal operators. [24] So far, most systems use linear query representations such as SQL (Structured Query Language) [23] or programming languages like C++ and Java, which require specialised knowledge of the language syntax. As these languages do not explicitly represent the meaning of data, many systems have had to adapt their interfaces to make them usable by the majority. As a result, the use of direct manipulation languages has spread; these are characterised by the visibility of the objects of interest and the replacement of command language syntax by the direct manipulation of objects.

Visual querying systems are an example of these direct manipulation languages: they provide a language for expressing queries to the database in a visual way. These systems are very useful for novice users, who can learn and run queries in a more intuitive way, without needing to memorise the database schema or the syntax of the query language. Thus, different types of users with different technical backgrounds can benefit from the assets of databases. These visual query systems are now part of databases, and their concept is described next.
2.2 Concept

The major advantages of visual querying systems come from their graphical query representations. Many of them are implemented in a WIMP environment (Windows, Icons, Menus and Pointers) [24], as it reflects the close relation between users and computer applications; in fact, visual formalisms include familiar objects such as tables, diagrams or icons. Icons can for example refer to the objects present in the database, while relationships among them can be represented by the links in a diagram and collections of instances can be shown in a form.

These new querying methods are very useful for users, who do not need to know how the database is structured or which query language is used behind the scenes. They may also help users to formulate queries by answering incomplete ones during the process. Another advantage of these systems is the robustness of the query process, as users cannot formulate syntactically erroneous queries.

There are three basic visual querying representations and a hybrid one: first, the form-based representation, which is composed of cells and sometimes buttons; then the diagram-based representation, a graphic that encodes information using the position and size of geometrical objects; then the icon-based representation, which uses visually segmented objects to convey a message or a piece of information such as a concept, a state or a function; and finally the hybrid representation, which combines the preceding visual formalisms, although one of them is often used more than the others.

2.3 Form-based representation

An important requirement for a database management system is to allow users to express ad-hoc queries, which is usually done using a textual language such as SQL. Because this language is difficult for novice users, some designers decided to let users express the query directly by filling in forms with an example of the requested data.
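The idea of filling in a form with an example of the requested data can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the way QBE or any system described in this report is implemented: the `employee` table, its columns, the sample rows and the `form_to_sql` helper are all hypothetical. Filled-in cells become equality conditions, and empty cells place no constraint on the result.

```python
import sqlite3
from typing import Dict, List, Optional, Tuple

# Hypothetical schema. Only these known column names are ever placed in the
# SQL text; the user's values are passed separately as parameters.
COLUMNS = ("name", "department", "city")


def form_to_sql(form: Dict[str, Optional[str]]) -> Tuple[str, List[str]]:
    """Translate a filled-in form into a parameterised SELECT statement."""
    conditions: List[str] = []
    params: List[str] = []
    for column in COLUMNS:
        value = form.get(column)
        if value:  # an empty cell does not restrict the query
            conditions.append(f"{column} = ?")
            params.append(value)
    sql = "SELECT name, department, city FROM employee"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


# Small in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, department TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Smith", "CSE", "Edinburgh"),
        ("Jones", "Maths", "Edinburgh"),
        ("Brown", "CSE", "Glasgow"),
    ],
)

# The user fills in two cells of the form and leaves the name cell empty.
sql, params = form_to_sql({"name": "", "department": "CSE", "city": "Edinburgh"})
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

The user never sees the generated SQL text, which is why a form of this kind cannot produce a syntactically invalid query.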
Figure 25 : Forms representation compared with other query languages

A form is by definition a frame composed of cells and/or buttons. The cells may contain data specified by users; filling in a cell or clicking on a button triggers a data manipulation.

Figure 26 : Basic representation of a form-based query

An example of this querying method is QBE (Query-By-Example), which was first introduced by Zloof in 1975; QBE improves on the “form fill-in query system” by including expressions describing the separate attributes. An example of a form-based querying system is “Enviro Data”, a program for managing data; this software uses Microsoft Access as a front-end user interface and any ODBC-compliant database as a back-end database server. [26]

Figure 27 : Form-based query system

This system allows users to select data from drop-down lists of station names (wells, borings, etc.), parameter names, etc., enter the dates and/or depths they are interested in, and then retrieve the requested data.

Visual queries are not well developed in object-oriented databases, as some researchers think that, because OODB queries are complex [25], the query formulation process has to be complex too. VOODOO (Visual Object Oriented Database language for ODMG OQL) is a tool for querying an object-oriented database visually. This method extrapolates simple queries from complex ones. A VOODOO query uses a tree form that represents the structure of the database schema, in which every class or type reference is expanded. The root of the tree is the persistent root, and each tree node represents a class or a structure. The tree nodes are expanded on demand so that the structure stays clear. The query is formulated from the tree leaves to the root by mapping each tree node to some value, which is then passed to the parent node to form the value of the parent tree node.
The system displays enough type information to help users construct the query. VOODOO also uses type information to prevent users from constructing invalid queries. The figures below show some examples of queries with VOODOO:

Figure 28 : Query with VOODOO “find the name of the department whose head is Smith”

Figure 29 : Another query with VOODOO “find all the names and addresses of all instructors in the CSE department who earn more than 100K”

2.4 Diagram based data querying

Compared with form based data querying, query by diagram is a more “user friendly” system thanks to which non-expert users can understand a database and easily extract information from it [24, 27]. In the specific area of visual query systems, diagrams are composed of simple shapes that represent the data and the possible actions (display, print, delete…); relations between data are represented by linking those shapes together. This entirely graphical environment makes such systems noticeably easy to use. Navigation in these environments is mostly done with a mouse or a touch screen, so the use of the keyboard is reduced to a minimum.

Figure 30 : Example of diagram

We can distinguish two main types of QBD (Query By Diagram) “user environment”: top-down browsing and schema transformation. In top-down browsing, it is relatively easy to locate fundamental concepts, links and subschemas of interest [27]. With the schema transformation method, diagrams can be modified in order to fit the query better. Owing to its simplicity, diagram based data querying is one of the most widely used techniques in Visual Query Languages. Here is an example of QBD software: SUPER [28]

Figure 31 : SUPER (QBD software example)

2.5 Icon based data querying

In this method, users also interact with a database through a graphical environment.
They build their query by manipulating icons, which represent not only real objects but also abstract concepts, actions or processes. A good definition of an icon is: “the icon is a visually segmented object which tells the viewer about an inside message or information (concept, function, state, mode, etc.) assigned by the designer” [29]. An important concern for Query By Icons (QBI) designers is making sure that the meaning of their icons is clear: different people can understand the same picture differently. Some developers are trying to write a standard for icon representation, but at this time nothing is really defined.

Relationships between objects are more implicit in icon based data querying than in QBD, where objects are linked together. This leads the user to rely on a kind of metaphor when interacting with the database. Here is an example given in [31] using QBI. The query is: “Show the weight of Hi-Fi VCRs with price equal to or cheaper than $600; select the lightest one among them”. To create this query, the user only needs to move the VCR icon onto the appropriate ones (i.e. price and then weight), and the query is automatically generated.

Figure 32 : Example of a QBI software

QBI systems provide a good alternative for rapid database access, whether the user is a non-expert or a database specialist. Another domain that takes advantage of QBI is mobile computing. Indeed, most mobile computers do not have a keyboard, so accessing a database just by moving icons is an important advantage. The following example is a QBI application for accessing a radiological database [30].

Figure 33 : An application of a QBI software

The last example of QBI is the software Marmotta, which “allows its users to formulate a query by means of icons; it then translates the query into a format which can be handled by a HTTP server as it came from a form within a web client and then sends the query.
When Marmotta gets the results, it translates them into an inner format allowing them to be easy managed, inspected and manipulated by users” [32].

Figure 34 : Marmotta screenshot

2.6 Hybrid based data querying

A hybrid based data querying system is a combination of different VQSs (Visual Querying Systems). The different systems can be used in the same interface or at different stages of the application. For example, form based data querying can be used to display detailed information about an object in the database, such as its properties or instances, or to present results. Diagrams are commonly used to show the relations between the objects of a database and how they are linked together. Icon based data querying systems are good for representing actions, classes of objects, or the objects themselves when this is possible (icons cannot represent all the content of a large database).

Figure 35 : Semantic Knowledgeable Interface (SKI)

Conclusion

In order to extract useful information from a database, the data often has to be processed before it can be read. This process can be very difficult for a person who is not used to computers. By exploiting the fact that humans have remarkable perceptual abilities, data visualisation and visual querying systems meet these users’ requirements. Nevertheless, no standard system or solution exists, as each problem has to be treated separately. As a result, many rich and varied visual systems have appeared, each one answering a specific problem. It can be very disconcerting for users to choose the solution suited to their situation.

References

[1] Dr Keith Andrews, March 6th 2000. Information Visualisation, Lecture Notes. http://www.iicm.edu/ivis/ivis.pdf
[2] Official web site of Information Visualisation Group. http://www.infovis.org
[3] Jefferey S Saltz, Jonathan M Steinbach, June 25th 1997. CODATA Euro-American Workshop. Way Finding In Layered Information Worlds.
Visualization of Information and Data : Where We Are and Where Do We Go From Here
[4] Professor Rolf Daessler, 1995. University of Applied Sciences in Potsdam. Visualization of Abstract Information. http://fabdp.fh-potsdam.de/daessler/paper/uom0595/tut.html
[5] Ben Shneiderman, July 1996. The Eyes Have It : A Task by Data Type Taxonomy for Information Visualizations
[6] Inxight Products http://www.inxight.com
[7] Stuart K. Card, George G. Robertson, and William York, 1993. The WebBook and the WebForager: An Information Workspace for the World-Wide Web.
[8] ArcExplorer. ESRI. http://www.esri.com
[9] AVS – Advanced Visual Systems. Medical Solution http://www.avs.com
[10] Microsoft Project 2000. http://www.microsoft.com
[11] LifeLines. http://www.cs.umd.edu/hcil/lifelines/
[12] Christopher Ahlberg and Ben Shneiderman. Visual information seeking: Tight coupling of dynamic query filters with starfield displays. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI'94), pp. 313-317. Addison-Wesley, April 1994.
[13] Keim D.A., Kriegel H.-P.: “VisDB: Database Exploration using Multidimensional Visualization”, Computer Graphics & Applications, Sept. 1994, pp. 40-49.
[14] Visualisation Study of the NSFNET backbone. NCSA Company. http://archive.ncsa.uiuc.edu/SCMS/DigLib/text/technology/Visualization-StudyNSFNET-Cox.html
[15] Stephen G. Eick, Member, IEEE, Joseph L. Steffen, and Eric E. Sumner, Jr. IEEE Transactions on Software Engineering, Vol. 18, No 11, November 1992. SeeSoft – A Tool For Visualizing Line Oriented Software Statistics.
[16] Card, S.K., Mackinlay, J.D., and Shneiderman, B. (Eds.) Readings in Information Visualization: Using Vision to Think, pp. 1-34, Morgan Kaufmann Publishers, San Francisco, California, 1999.
[17] Furnas, George W. "Generalized Fisheye Views," Proc. ACM SIGCHI 1986, April 1986, 16-23.
[18] Manojit Sarkar and Marc H. Brown (1992) Graphical fisheye views of graphs. In Proceedings of ACM CHI'92 Conference on Human Factors in Computing Systems, pp. 83-91, Monterey, California, May 3-7, ACM Press.
[19] Robertson, G. G., Mackinlay, J. D. and Card, S. K. (1991) Cone trees: animated 3d visualizations of hierarchical information. In Proceedings of ACM SIGCHI'91 Conference on Human Factors in Computing Systems, pp. 189-194, New Orleans, Louisiana, April 28-May 2, ACM Press.
[20] Pearson & Steinmetz. The JOVE project: 3D tree-mapped universes and data sphere navigation, a method to increase the density of hierarchical information visualization in a finite display space. 1993.
[21] Ahlberg Christopher, Williamson Christopher, and Shneiderman Ben. Dynamic queries for information exploration: An implementation and evaluation, Proc. ACM CHI’92: Human Factors in Computing Systems, ACM, New York, NY (1992), 619-626.
[22] Williamson Christopher, and Shneiderman Ben. The Dynamic HomeFinder: Evaluating dynamic queries in a real-estate information exploration system, Proc. ACM SIGIR’92 Conference, ACM, New York, NY (1992), 338-346. Reprinted in Shneiderman, B. (Editor), Sparks of Innovation in Human-Computer Interaction, Ablex Publishers, Norwood, NJ, (1993), 295-307.
[23] Hui Lui. Visual Interface for Querying a CASE repository.
[24] Tiziana Catarci, Maria F Costabile, Stefano Levialdi, and Carlo Batini. Visual Query Systems for Databases : A Survey.
[25] Leonidas Fegaras. VOODOO : A Visual Object-Oriented Database Language for ODMG OQL.
[26] The Science Software Group. http://www.scisoftware.com
[27] Query By Diagram: A Graphical Environment For Querying Databases (1994) Tiziana Catarci, Giuseppe Santucci
[28] SUPER Visual Interaction with an Object-based ER Model (1992) Lausanne, Annamaria Auddino, Yves Dennebouy, Yann Dupont, Edi Fontana, Stefano Spaccapietra and Zahir Tari
[29] Fujii, H. and R. R. Korfhage, "Features and a Model for Icon Morphological Transformation," Proceedings 1991 IEEE Workshop on Visual Languages, Kobe, Japan, 1991
[30] Supporting Mobile Database Access through Query by Icons (1996) Antonio Massari, et al.
[31] Visual Strategies for Querying Databases C. Batini, T. Catarci, M. F. Costabile, S. Levialdi
[32] Progressive HTTP-based Querying of Remote Databases within the Marmotta Iconic VQS, Fabrizio Capobianco, Mauro Mosconi, Lorenzo Pagnin