Automatic Layout of Structured Hierarchical Reports The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation Bakke, Eirik, David R. Karger, and Robert C. Miller. “Automatic Layout of Structured Hierarchical Reports.” IEEE Trans. Visual. Comput. Graphics 19, no. 12 (2013): 2586–2595. As Published http://dx.doi.org/10.1109/tvcg.2013.137 Publisher Institute of Electrical and Electronics Engineers (IEEE) Version Author's final manuscript Accessed Thu May 26 20:57:03 EDT 2016 Citable Link http://hdl.handle.net/1721.1/87067 Terms of Use Creative Commons Attribution-Noncommercial-Share Alike Detailed Terms http://creativecommons.org/licenses/by-nc-sa/4.0/ Automatic Layout of Structured Hierarchical Reports Eirik Bakke, David R. Karger, and Robert C. Miller Fig. 1. Interactive adaptation of the layout of the data to be displayed, based on the available horizontal width of an on-screen window. Abstract—Domain-specific database applications tend to contain a sizable number of table-, form-, and report-style views that must each be designed and maintained by a software developer. A significant part of this job is the necessary tweaking of low-level presentation details such as label placements, text field dimensions, list or table styles, and so on. In this paper, we present a horizontally constrained layout management algorithm that automates the display of structured hierarchical data using the traditional visual idioms of hand-designed database UIs: tables, multi-column forms, and outline-style indented lists. We compare our system with pure outline and nested table layouts with respect to space efficiency and readability, the latter with an online user study on 27 subjects. Our layouts are 3.9 and 1.6 times more compact on average than outline layouts and horizontally unconstrained table layouts, respectively, and are as readable as table layouts even for large datasets. Index Terms—Hierarchy data, tabular data, nested relations, layout management 1 I NTRODUCTION An important class of visualizations in everyday business use consists of the table-, form-, and report-style views found in most graphical relational database applications [2]. The data being displayed is typically structured, meaning that each value has an associated type and label in a schema. Furthermore, the data frequently needs to be presented in a hierarchical manner, because of the need to visualize oneto-many relationships between entities in the database. For instance, when users request “a list of employees, grouped by department” or “all information about a customer, including associated support tickets and a list of open orders,” what is being displayed is a structured hierarchical view of the underlying relational data. Unlike other kinds of office productivity software, database applications tend to be tailored for specific target domains such as payroll, customer relationship management (CRM), or inventory tracking, or even more specific subdomains such as hotel reservation systems, inventory management for the offshore industry, or bug tracking systems for software developers [1]. The domain-specific nature of these systems makes them costly to develop and maintain, and there is seldom an opportunity to hire a graphic designer every time a new form needs to be created or an existing one updated. Yet, the software developer must invariably define the details of the view’s visual layout: the location of labels, the dimensions of text fields, the width of table • Eirik Bakke is at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). E-mail: ebakke@mit.edu. • David R. Karger is at MIT CSAIL. E-mail: karger@mit.edu. • Robert C. Miller is at MIT CSAIL. E-mail: rcm@mit.edu. Manuscript received 31 March 2013; accepted 1 August 2013; posted online 13 October 2013; mailed on 4 October 2013. For information on obtaining reprints of this article, please send e-mail to: tvcg@computer.org. 1077-2626/13/$31.00 © 2013 IEEE columns, the organization of form fields into columns on a page, and so on. Since each database application can contain a sizable number of views—say, one table view and one form view per major (non-weak) entity type in the database schema, plus any special-purpose report views—maintaining the detailed visual layouts of each view is a significant development burden, and can lead to poor user interfaces. The problem we approach in this paper is as follows: How can the visualizations required of typical domain-specific database applications be generalized in such a way as to eliminate the need for manual tweaking of low-level visual details? A general-purpose visualization system for nested relations data would be of direct utility to any one of many [15, 29, 27, 28, 16, 2] previously proposed database user interfaces based on structured hierarchical views. To scope our problem, we limit ourselves to the task of displaying simple textual data that has been pre-organized to conform to some hierarchical but non-recursive schema. This includes the documents of most data interchange formats based on XML and JSON, as well as the output of any object-relational or relational-to-XML mapping system that uses a finite number of standard SQL queries to produce its output, e.g. SilkRoute [8]. An equivalent data model is the nested relational model; see Levene [18] for an extensive bibliography. We present a layout management algorithm that automates the display of structured hierarchical data using visual idioms seen in traditional hand-designed database UIs: tables, multi-column forms, and outline-style indented lists. The system gathers simple statistics about fields in the input schema, and uses these to make layout decisions which are then applied uniformly across tuples in each input subrelation. We compare our hybrid layout system with pure outline and nested table layouts with respect to space efficiency and readability, the latter with an online user study on 27 subjects. Our layouts are 3.9 and 1.6 times more compact on average than outline layouts and horizontally unconstrained table layouts, respectively, and are as readable as table layouts even for large datasets. Published by the IEEE Computer Society Accepted for publication by IEEE. ©2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. 2 R ELATED W ORK Tree Visualization. While our layout algorithm assumes the presence of a schema along with its hierarchical input data, it is always possible to derive a valid schema from the input data itself, and our system supports the visualization of any tree-structured input data in XML or JSON form, without further annotation. Our visual layouts make the most sense in cases where the schema can correctly be assumed to be non-recursive, which was the case for 9 out of 10 example datasets in the XML corpus we used for our user study. Thus, our system falls into the tree visualization category of information visualization research. Most differences arise from our assumption of a different variation of the tree-structured input data model than is typically seen in other tree visualization systems. Analogous data model specialization is seen for instance in Robertson’s polyarchies [23]. While our system does not perform any kind of aggregation on input data, but merely displays every piece of data in full, the same is true of many standard kinds of tree visualizations. Academic research on visualizations for tree-structured data has been surveyed by Shneiderman [24] and, recently, Graham and Kennedy [11]. In our case, we are dealing with the subproblem of single tree visualization, with the additional constraint that we are working with structured data only, i.e. data that conforms naturally to some schema. In Graham’s taxonomy, the visualizations produced by our system form a hybrid between the nested and indented list representations. While nested tables, an important base case in our layout system, organize data in a grid of rows and columns, they do not fall into the matrix representation category, but rather among the nested representations, since data leaf nodes are always contained within the visual boundaries of their parent nodes. The study by Chimera and Shneiderman [6] compared three variations of the outline-style indented list view; two of these were interactive. Their results suggest that future versions of our system should include collapsible nodes in indented lists. Ziemkiewicz et al.[31] compare four different classes of tree visualizations, including three nested variants. None of the layouts tested include tables or hybrids between tables and indented list representations. Ghoniem et al. [10] provide a taxonomy of readability tasks on graphs and present a study comparing node-link and matrix representations. Treemap [14] is maybe the most well-known nested or Venn Diagram-style tree representation. Treemaps fit an arbitrarily sized tree into a viewport of pre-defined dimensions, by making subsequent levels smaller and smaller. The relative size of siblings is determined by some semantically significant weight associated with each node in the tree, such as disk space consumed in a map of a file system hierarchy. In contrast, our own system allocates to leaf nodes whatever visual space is necessary to display the contained primitive text values at a constant font size. This is a requirement from the perspective of our target applications. Our system does, however, constrain layout dimensions in the horizontal direction, so that only vertical scrolling is needed if the layout can not fit in the desired viewport. In treemaps, siblings are arbitrarily stacked vertically or horizontally at every other step in the recursion. In our own system, tuples are always stacked vertically. Fields are stacked either horizontally, when contained in a nested table, or vertically, when part of an outline view. This ensures one kind of visual consistency while still allowing the widths of layouts to be constrained. FISH [20] is a variation of the Treemap concept. In this system, styling attributes can be used to configure various node presentation details, including the choice to stack siblings either vertically or horizontally. Such styling must be applied to each individual data node, unlike in our own system, where styling is applied to schema nodes only. Strip Treemap [3] optimizes the presentation of a treemap by sizing rectangles such that they can be stacked in contiguous “strips” without broken horizontal lines. Like Treemap, and unlike our own system, both FISH and Strip Treemap assume that all data must fit into a viewport of predetermined size both in the horizontal and vertical direction. Elastic Hierarchies [30] and EncCon [21] both combine node-link representations with some other representation in order to visual- ize tree-structured data. Node-link representations are typically less space-efficient than nested or indented representations, and seldom appear in traditional database GUIs, our target application. Thus, we did not use them ourselves. EncCon uses node-link representations in every level of the tree, but maintains an invisible Treemap-like layout to determine a favorable positioning for each node. In Elastic Hierarchies, subsequent levels may use either Treemaps or node-link representations, determined interactively. The Elastic Hierarchies paper also has a relevant discussion of the design space of hybrid tree layouts, but does not consider nested tabular layouts. Neither does Graham and Kennedy’s survey. This is likely because tabular layouts apply only to structured data, where series of children in a tree all can be expected to have the same substructure. Systems that solve the more general problem of dealing with semi-structured (schema-free) data will not naturally be able to take advantage of tabular layouts. XML Visualization. VisualXML and XMLAD [7], like Elastic Hierarchies and EncCon, use node-link representations as part of their interface, but are geared towards XML visualization. The XMLAD system does indeed use tabular representations as part of the output layout, but does not support nested tables. Thus, only the bottom logical relations in the data may be displayed as tables. Similarly, the VisualXML system uses outline or list-style representations for the bottom relation level of each subtree. Again, both XMLAD and VisualXML use a node-link representation for all higher levels. This is in contrast with our own system, which uses a column-enabled, indented list for a variable number of top levels in the tree, followed by nested tables for a variable number of bottom levels. Furthermore, our algorithm makes these layout decisions automatically, except where manually overridden. Tree Rewriting provides a visual language “very similar to the lambda calculus in simplicity and expressive power” [13]. While such systems are general enough to produce just about any layout from a set of input data, a user must manually specify how these layouts are to be constructed. Unlike our system, neither is able to produce a default layout subject to a constraint such as available page width. Furthermore, no mechanism is available to help ensure the careful alignment of column fields that is required in order to produce tabular or nested tabular sublayouts. A relevant class of commercial systems consists of report generator tools such as Crystal Reports1 and Altova XML-Spy/StyleVision2 . An extensive survey is provided by Król [17]. These systems let the user build output layouts analogous to those produced by our own system, using a variety of input data sources. The layout building process is manual. Altova’s “Grid View” is similar to a nested table view, but with single-level column headings repeated for every subrelation. Navigation requires heavy use of both vertical and horizontal scrollbars as well as manual collapsing and expanding of data nodes. Altova does combine vertical arrangements of tuple fields in an outline-style sub-view with the use of nested tables, but is unable to render subrelations as indented lists. UI Generation. Supple [9] generates widget-based user interfaces for devices of various sizes and contexts using a cost optimization algorithm, but does not deal with table layouts or other layouts that take advantage of repeated structure in data that adheres to a schema. The Right/Bottom strategy [4] defines heuristics for widget placement in dialog boxes, but similarly does not deal with structured relational data. Database application builder tools such as FileMaker3 and Microsoft Access4 include wizards to help with the creation of new forms, but are only able to layout out a single level of fields automatically, and can not use statistics about the data in each field to make structural layout decisions. Tabular Systems. Show Me [19] is an autostyling system for Tableau, an interactive visualization system. Since Tableau operates on tabular input data and produces Pivot-table or cross-tab-based out1 http://crystalreports.com 2 http://www.altova.com/xml-editor 3 http://www.filemaker.com 4 http://office.microsoft.com/access CEE 471 GEO 471 Sections Place History of the Earth Days P 01 206 13:30:00 GEO 14:20:00 GUYOT 155 15:30:00 16:20:00 GUYOT 154 M W F T W P 04 19:00:00 19:50:00 GUYOT 154 W P 05 11:00:00 11:50:00 GUYOT 154 Th L 01 425 09:30:00 MAE 10:50:00 FRIEN 008 GEO 425 Instructors First Primitive Values T Th 58 Y Other Middle Last Thomas S. Duffy Thomas Nicole Mark Thomas Nicole Mark Nicole Mark Nicole Mark Peter Jeffery 40 N S. K A. S. K 40 Y A. K A. K A. 30 N R. Scott This course examines about 15 seminal events in natural history, and how those events shaped the modern environment. The first section covers the origin of the Universe, chemical elements, and solar system. After a discussion of the structure and dynamics of the Earth, we next examine the origin of life, evolution of eukaryotic cells and multicellularity, the evolution of vertebrates (amphibians, dinosaurs, and mammals), and mass extinctions. The final section discusses the origin of man, the ice ages, and global change. 206a: 3 lectures; 206b: three lectures, 1 three-hour laboratory. History of the Earth This course examines about 15 seminal events in natural history, and how those events shaped the modern environment. The first section covers the origin of the Universe, chemical elements, and solar system. After a discussion of the structure and dynamics of the Earth, we next examine the origin of life, evolution of eukaryotic cells and multicellularity, the evolution of vertebrates (amphibians, dinosaurs, and mammals), and mass extinctions. The final section discusses the origin of man, the ice ages, and global change. 206a: 3 lectures; 206b: three lectures, 1 three-hour laboratory. The Physical Earth An introduction to geology and geological processes. Topics include the physical processes occurring within the earth, including plate tectonics, formation of minerals and rocks, earth structure, earthquakes, faults, and mountain building, as well as the physical processes that transform the earth's surface, including weathering and erosion, flooding, landslides and the development of landscape. Recommended for those interested in majoring in the Geosciences. Introduction to Physical Oceanography The study of the oceans as a major influence on the atmosphere and the world environment. The contrasts between the properties of the upper and deep oceans; the effects of stratification; the effect of rotation; the wind-driven gyres; the thermohaline circulation. 11:00:00 11:50:00 GUYOT 10 GEO 235 CEE 235 An introduction to the science of water quality management and pollution control in natural systems; fundamentals of biological and chemical transformations in natural waters; indentification of sources of pollution; water and wastewater treatment methods; fundamentals of water quality modeling. Day Number Format End Time GEO 206 40 N courseinfo.princeton.edu/courses/geo206ageo206b_F2000/index.html Duffy Gotberg Miller Duffy Gotberg Miller Gotberg Miller Gotberg Miller Jaffe Paull courseinfo.princeton.edu/courses/GEO206aGEO206b_F2000/index.html Final Design Project(s) Quizzes Other (See Instructor) 20 10 20 10 Morrow and Owen Bennett et al. Consolmagno and Schaefer Beatty et al. Sections Title The Planetary System The Cosmic Perspective Worlds Apart: A Textbook in Planetary Science The New Solar System 33 Tchobanoglous & Schroeder Eckenfelder Metcalf & Eddy Sawyer & McCarty Final Tuple Final Final 50 20 Kump, Kasting and Crane Middle Last S. Duffy Thomas Nicole Mark Thomas Nicole Mark Nicole Mark Nicole Mark Peter Jeffery S. K A. S. K A. K A. K A. R. Scott Duffy Gotberg Miller Duffy Gotberg Miller Gotberg Miller Gotberg Miller Jaffe Paull M W F Michael Leslie Bender 13:30:00 16:20:00 GUYOT 178 F M W F Th Michael Sarah Brian Michael Leslie A 10:00:00 10:50:00 GUYOT 220 Bender Johnston Gertsch Bender Michael Sarah Brian Leslie A P 01 13:30:00 14:20:00 GUYOT 155 M W F T P 03 15:30:00 16:20:00 GUYOT 154 W P 04 19:00:00 19:50:00 GUYOT 154 W P 05 11:00:00 11:50:00 GUYOT 154 Th Water Quality L 01 09:30:00 10:50:00 FRIEN 008 Principles of Water Quality Management Wastewater Engineering Chemistry for Environmental Engineers Earth System Science T Th L 01 10:00:00 10:50:00 GUYOT 220 Earth System Science B 01 L 01 B 02 13:30:00 16:20:00 GUYOT 178 L 01 11:00:00 12:20:00 GUYOT 220 B 01 13:30:00 16:20:00 GUYOT 178 L 01 10:00:00 10:50:00 GUYOT 154 10 20 Lab Reports Other Exam Papers Quizzes 20 50 20 10 Kump, Kasting and Crane Lab Reports MidTerm Exam Final Exam 40 30 Press, Siever, Grotzinger, Jordan MidTerm Exam Final Exam Problem Set (s) 20 Pond & Pickard 40 40 Pickard & Emery Days 11:00:00 11:50:00 GUYOT 10 Label for Primitive Field Other Exam Term Paper (s) Quizzes Papers Place Understanding Earth 30 Introductory Dynamical Oceanography, 2nd ed. Descriptive Physical Oceanography: An Introduction Open University Course Ocean Circulation (2001) Team Open University Course Waves, tides, and Shallow-Water Team Processes Instructors First Thomas End Time L 01 20 20 66 1 Meetings Beg. Time T Th M Nadine M W F Anand Laurel Nadine Yue Leslie Bender Johnston Gertsch McQuarrie Pringle Goodell McQuarrie Tian Gnanadesikan Fig. 2. The nested table layout is the most common way to visualize a nested relation, and is supported as a special case in our algorithm. Here, we show nested relational data generated from an academic course catalog in nested table style, with one of the nested columns enlarged to show terminology. Nested table layouts stack tuples in the vertical direction and the fields of each tuple in the horizontal direction, with all field labels L 01 in 10:00:00 10:50:00 GUYOT 220 M Michael Leslie Bender collected a header on top. GEO 505 Fundamentals of the Geosciences I GEO 540 Topics in Deformation of the Upper Crust A year-long survey, in sequence, of fundamental papers in the geosciences. Topics in 505 (Fall) include the origin and interior of the Earth, plate tectonics, geodynamics, the history of life on Earth, the composition of the Earth, its oceans and atmospheres, past climate. Topics in 506 (Spring) include present and future climate, biogeochemical processes in the ocean, geochemical cycles, orogenies, thermochronology, rock fracture and seismicity. A core course for all beginning graduate students in the geosciences. The major deformation processes of the upper crust, above and near the brittle-plastic transition: fault related folding, critical-taper wedge mechanics, detachment faulting, state of stress in crust, faulting. Topics change from year to year. Overview lectures by the instructor, student presentations based on readings from scientific literature and case studies. Current problems in environmental sciences. Element cycles; geochemistry-biotic interactions, human impacts on the environment. A new topic is chosen every semester. Recent topics have included: the global carbon cycle, alternative energies, biodiversity, genetically modified organisms, population and the environment, media and the environment, restoration ecology, the humanities and the environment and architecture, urban planning and the environment, and world oil and the Middle East. Covers topics including origin of elements; formation of the Earth; evolution of the atmosphere and oceans; atomic theory and chemical bonding; crystal chemistry and ionic substitution in crystals; reaction equilibria and kinetics in aqueous and biological systems; chemistry of high temperature melts and crystallization process; and chemistry of the atmosphere, soil, marine, and riverine environments. The biogeochemistry of contaminants and their influence on the environment will also be discussed Use of scientific arguments in concert with engineering, economic, political, and social considerations to develop environmental policies. Class format consists of exercises based on actual case studies related to international agreements, environmental regulations, climate change, natural hazards, arms control negotiations, and national security. 30 N Final S 01 09:00:00 09:50:00 GUYOT 154 14:00:00 16:50:00 GUYOT 154 M F Jorge Gerta Lincoln Thomas Nadine Samuel 30 Y Other L 01 01:30:00 03:20:00 GUYOT 154 12:30:00 13:20:00 GUYOT 154 M Th John This course provides a broad introduction to atmospheric science. Topics covered include atmospheric thermodyanmics, aerosols and clouds, radiative transfer, surface and top-of-atmosphere energy balances, and the hydrological cycle. Elementary principles of atmospheric dynamics and global circulation will be presented. 20 W F puts, it does not fall into the tree-structured data visualization category. Nested table layouts are used in Related Worksheets [2] and in applications produced with AppForge [28] or App2You [16]. These systems do not allow the nested table layouts to be re-styled as indented stem Science B 01 13:30:00 16:20:00 GUYOT 178 Michael Leslie Bender lists or deferred to lower layers ofF the tree-structure. They can not Sarah A Johnston Brian Gertsch page automatically constrain the width of the layout to the available Michael Leslie Bender width.L 01 10:00:00 10:50:00 GUYOT 220 MW FOCUS [25], TableLens [22],F and the system by Tajima and B 02 13:30:00 16:20:00 GUYOT 178 Th Michael Leslie Bender Ohnishi [26] deal with the problem of displaying large, mostly flat Sarah A Johnston tables. The latter system includes a “Record” that reBrian viewing mode Gertsch sembles a single-level form view. While all of these systems support nding Earth L 01 11:00:00 12:20:00 GUYOT 220 T Nadine certain cases involving values spanning multiple cells, they McQuarrie do not opTh 01 structured 13:30:00 16:20:00 GUYOT 178 data M Laurel Pringlevisualization Goodell erate Bon hierarchical in general. Chi’s spreadsheets [5] combine a table layout atNadine the outer level McQuarrie with cells Yue Tian containing plotted 3D graphics, but require the user to define each visualization using commands, and are primarily focused on numerical data. L 01 10:00:00 10:50:00 GUYOT 154 M tory Dynamical Anand Gnanadesikan raphy, 2nd ed. W Text Layout Systems. Text document layout systems, like the one ve Physical F described by Jacobs et al. [12], deal with the problem of rendering a raphy: An Introduction rculation (2001) given amount of text with a given font size on a set of dimensionally constrained pages. They are otherwise different from our own system, ides, and Shallow-Water s since they operate on a very different class of input data. build a habitable planet nvironment s and Applications of c Chemistry MidTerm Exam Quizzes Final Exam Precept Participation Other Exam Problem Set (s) Label for Relation Field Introduction to Water Pollution Technology L 01 P 03 uality Meetings Beg. Time Final Sample Reading List Author Name Day Relation Value This course examines the major bodies of our solar 140 Y system, emphasizing their surface features, internal structures, and atmospheres. Topics include the origin of the solar system, habitability of planets, and role of impacts in planetary evolution. Terrestrial and giant planets will be studied as well as satellites, comets, and asteroids. Recent discoveries from planetary missions are emphasized. The course is aimed primarily at nonscience majors. Final Exam Grading Type Title Number Format A Guided Tour of the Solar System Website Perc. Number Dept. GEO 207 AST 207 etary System mic Perspective part: A Textbook in y Science Solar System s of Water Quality ment ater Engineering y for Environmental s stem Science Description May Audit? Max. Enrollment Course Listings Title ENV 524 GEO 524 Environmental Issues Seminar: The Food Industry and The Environment GEO 331 CHM 331 Introduction to Environmental Geochemistry: Chemistry of the Natural Systems GEO 399 ENV 399 GEO 427 GEO 418 CHM 418 GEO 420 GEO 558 GEO CHM EEB AST S 01 3 255 255 255 255 Environmental Decision Making Introduction to Terrestrial and Planetary Atmospheres Y 20 Y 20 N Y Environmental Aqueous Geochemistry Application of quantitative chemical principles to the 20 study of natural waters. Includes equilibrium computations, carbonate system, gas exchange, precipitation/dissolution of minerals, coordination of trace metals, redox reactions in water and sediments. Topics in Earth Science Although the Milankovitch forcing (modest variations in 20 the distribution of sunlight) has been relatively constant over the past several million years, Earth's response started amplifying around 3 million years ago (Ma), culminating in dramatic oscillations between prolonged Ice Ages and brief interglacial periods such as the present one. Recent observational and theoretical results, especially concerning the tropics, provide valuable clues to this puzzle whose solution is of direct relevance to global warming issues. This seminar will review recent developments. Y Seismology Seminar: Finite Frequency A full treatment of seismic tomography with an 20 Tomography emphasis on finite-frequency interpretations Life in the Universe This course introduces students to a new field, 18 Astrobiology, where scientists trained in biology, chemistry, astronomy and geology combine their skills to discover life's origins and to seek extraterrestrial life. Topics include: the origin of life on Earth, the prospects of life beneath the surfaces of Mars and Europa, a moon of Jupiter; extra-solar planets nearby that offer targets for NASA space telescopes searching for life. Students will experience first hand the diverse techniques used to study extremophiles in hot springs and on future Mars missions during a mandatory fallbreak trip to Yellowstone Park. Igneous and Metamorphic Geology Interpretation of earth history based on the rock 14 record. Two-one and a half hour lectures, one threehour laboratory, several one-day field trips, and an eight-day trip during fall break N 09:00:00 09:50:00 GUYOT 154 314 14:00:00 GEO 16:50:00 GUYOT 154 L AYOUT A LGORITHM GEO 499 30 Investigating Natural Hazards M F Jorge Gerta Lincoln Thomas We evaluate our vulnerability to natural hazards and the impact of disasters to the United States and the Nadine developing world. In particular, we explore what constitutes a natural disaster & predict the future humanitarian, economic,Samuel & political impact of such Y Y www.deepbio.princeton.edu or http://blackboard. princeton.edu Louis Sarmiento Keller Steffens Hollister S. Duffy 12 N McQuarrie George Philander H. N We now describe our layout algorithm. For the purposes of exhibition, we start by discussing basic nested table and outline layouts, then discuss hybrid table/outline layouts, and finally show the complete steps to produce the layouts automatically. L 01 01:30:00of 03:20:00 GUYOT 154 M to produce John The purpose the algorithm is a compactSuppe but read12:30:00 13:20:00 GUYOT 154 Th able view of tree-structured input data that conforms to a non-recursive schema, or, more precisely, nested relational data. Whenever possible, the regularity of the input data’s schema should be used to maximize the readability of the data. For instance, tabular dataM. should be renS 01 18:00:00 20:50:00 GUYOT 154 Th Francois M. Morel dered as a table, with proper column headings describing the name of fields in the schema. We may not require user input informing layout decisions beyond what can be derived from the data itself. 3.1 AOS 527 GEO 527 Atmospheric Radiative Transfer GEO 539 Topics in Paleoecology, Paleoclimatology, and Paleoceanography events given changes in sea level, climate variations, & demographic trends. Students will work in teams to assess risk and to develop economically realistic & scientifically sound policy recommendations. The results of the class will be submitted for publication, and the class will present their analysis to the appropriate policy-makers and/or business leaders. Structure and composition of terrestrial atmospheres. 10 Fundamental aspects of electromagnetic radiation. Absorption and emission by atmospheric gases. Optical extinction of particles. Roles of atmospheric species in Earth's radiative energy balance. Perturbation of climate due to natural and antropogenic causes. Satellite observations of climate system. The application of the fossil record to specific 5 geological problems in depositional environments and the paleoclimatic and paleoceanographic history of the oceans are studied. Y Y Nested Relations and Nested Table Layouts Our algorithm operates tree-structured data conforming to a nonL 01 11:00:00 12:20:00 on GUYOT 220 M Satish Chandra Myneni Babu model as the recursive schema. We have chosenWthe nested relational concrete data model for our implementation, since it tends to lead to S 01 13:30:00 16:20:00 GUYOT 155 M Gregory E. van der Vink Louis Sarmiento Keller Steffens Hollister S. Duffy McQuarrie George Philander H. Suppe simple tree traversal code. It is equally feasible to use a different treestructured data model such as that of XML, and in connection with the user study, we did write a routine for importing arbitrary XML documents. The terminology of the nested relational model is as follows: A value is either a primitive or a relation, where a relation is defined as a set of tuples, each containing a set of fields identified by labels, each containing a value, recursively. The schema of a value either defines the value to be a primitive, or defines the value to be a relation, with schemas further specified for each of the latter’s fields, recursively. Nested relational data is most commonly illustrated in a nested table layout. An annotated example of such a layout is shown in Figure 2; it renders data about the first few courses in an academic course catalog. A nested table layout consists of a header area (shown in blue) and a content area (shown in beige). The header area reflects the schema structure of the displayed nested relation, with simple labels for columns containing primitive fields and recursive labels for columns containing relation fields. Columns containing relation fields may recursively contain columns for either primitive or relation subfields. In the content area, tuples in the rendered relation are stacked vertically with row separators between them. Each tuple renders its primitive values as simple strings and its relation values recursively as the content area of another nested table. Other Final Final MidTerm Exam Problem Set (s) Final Exam Quizzes 30 Oral Presentation (s) Final Exam Precept Participation Papers 10 30 S 01 18:00:00 20:50:00 GUYOT 154 Th Francois M. M. L 01 11:00:00 12:20:00 GUYOT 220 M W Satish Chandra Myneni Babu S 01 13:30:00 16:20:00 GUYOT 155 M Gregory E. van der Vink Atmospheric Science: An Introductory Survey An Introduction to Dynamic Meteorology Houghton The Physics of Atmospheres Hartmann Global Physical Climatology Rogers & Yau A Short Course in Cloud Physics Morel, F.M.M. & Hering, Principles and Applications of J.G. Aquatic Chemistry, 2nd Edition (1993) John Wiley & Sons, NY L 01 08:30:00 09:50:00 GUYOT 154 T Th Samuel George H. Philander C 01 08:30:00 09:50:00 GUYOT 220 T Th Francois Eric M. M. Morel Stephan Egleston Trends Rhythms&Aberrations in Global Climate, J. Zachos The Pliocene Paradox, A. Federov, P. Dekens etc. Obliquity Pacing of Glacial Cycles,P. Huybers/C. Wunsch High Latitude Influence on Eastern Equatorial... Z. Liu Role of Tropics in Changing the Response, S.G.Philander Guust Nolet (preprint to be distributed in class) J. Lunine S 01 13:30:00 16:20:00 GUYOT 154 T Samuel George H. S 01 09:00:00 10:30:00 GUYOT 219 15:45:00 17:15:00 GUYOT 155 13:30:00 14:50:00 GUYOT 220 W W T Th Guust L 01 Gerard Laura Danielle Tullis Edwin Brian Charles Faye M. C. Lewis B 01 B 02 L 01 13:30:00 16:20:00 GUYOT 16 13:30:00 16:20:00 GUYOT 16 19:30:00 20:50:00 GUYOT 16 W T T W Lincoln Lincoln Lincoln Steffens Hollister Steffens Hollister Steffens Hollister S 01 19:30:00 22:20:00 GUYOT 155 M Peter Gregory John E. Broecker Berner & Berner Gunter Faure 30 10 How to build a habitable planet Global Environment Principles and Applications of Inorganic Chemistry 30 20 40 Other Wallace & Hobbs Holton Final Other Other Take-Home Other 3.2 Other Final Exam Problem Set (s) MidTerm Exam Papers Oral Presentation (s) Morel 30 30 40 50 50 Other (See Instructor) Problem Set (s) Take Home Final Exam Other (See Instructor) Oral Presentation (s) 100 Lab Reports Other (See Instructor) Paper In Lieu of Final Papers Precept Participation 20 40 25 35 25 R.B. Smith and L.J. Siegel Kathy B. Sheehan Science, 292; 686-693, (2001) Science, 312; 1485-1489 (2006) Nature, 434; 491-494 (2005) Nature, 427; 720-723 (2004) Paleoceanography, 18(2), 1045 (2003) A Breviary for Seismic Tomography Astrobiology: A Multidisciplinary Approach Windows into the Earth Seen and Unseen Discovering the microbes of Yellowstone 15 Best Smith and Pun Igneous and Metamorphic Petrology How Does Earth Work 40 Outline Layouts 50 50 Philander Nolet Dismukes Landweber Schmitt Onstott Turner Gertsch DiFiore van der Vink Nested table layouts, like the one previously seen, can quickly become extremely wide if many fields are to be displayed. For on-screen interfaces, the need for horizontal scrolling is undesirable, and for printing on paper, pagination in the horizontal direction is awkward. As another basic alternative to the nested table layout, we consider the outline layout, which is common among XML editing tools. Figure 3(a) shows the same data as before rendered using such a layout. In an outline layout, tuples are stacked vertically in an indented bullet list fashion; we show one bullet per tuple. Unlike the nested table layout, which puts the fields of each tuple next to each other on a row, the outline layout stacks the tuple fields vertically as well. Thus, having more fields in a schema makes the layout taller, not wider. In a tuple in an outline layout, primitive fields are rendered as strings with their respective schema labels to the left, whereas relation fields are rendered recursively with their schema labels above, extended to the full width of the layout area. The color scheme is the same (blue and beige) for labels and values in an outline layout as for the header and Other L 01 09:00:00 10:20:00 GUYOT 155 Th T Venkatachalam Ramaswamy Other S 01 10:00:00 13:00:00 GUYOT 16 T Gerta Keller (a) Outline layout Title Description Title Final Exam Perc. 20 • Title Precept Participation Perc. 10 • Title Other Exam Perc. 20 • Title Problem Set(s) Perc. 20 Sample Reading List • Author Name Morrow and Owen Title The Planetary System • Author Name Beatty et al. Title The New Solar System Day W • Day F • Author Name Consolmagno and Schaefer Title Worlds Apart: A Textbook in Planetary Science P 01 13:30:00 14:20:00 GUYOT 155 M W F T • Author Name Beatty et al. Title The New Solar System P 03 15:30:00 16:20:00 GUYOT 154 W P 04 19:00:00 19:50:00 GUYOT 154 W P 05 11:00:00 11:50:00 GUYOT 154 Th L 01 Beg. Time End Time Place Days 11:00:00 11:50:00 GUYOT 10 Instructors First Thomas • Format Number Meetings Middle S. M W F • • Format Number Meetings Last Duffy End Time Place P 03 Beg. Time • Place Course Listings Max. Enrollment May Audit? Website Title Exam Type Final Description Grading Days Day End Time Sample Reading List Sample Reading List (d) Hybrid layout with justified tables Sections Sample Reading List Sample Reading List Sections Sections Course Listings Course Title Listings Description Day • • (f) Schema-only hybrid layout used to partition columns in (e) • Title Description Course Listings Sample Reading List Sections • Sample Reading List Course Listings Sections Title Description Perc. Title Description Max. Enrollment May Audit? Website Sample Reading Final Exam Type List Grading Sections Day Number Format • Duffy Gotberg Miller Duffy Gotberg Miller Gotberg Miller Gotberg Miller Number Day • Author Name Eckenfelder Website Title Principles of Water Quality Management FinalInstructors Exam Type Other • First Nicole Grading Title Perc. • Author Name Metcalf & Eddy Middle KDesign Project(s) 33 Title Wastewater Engineering Last Gotberg Quizzes 66 Other (See Instructor) 1 • Author Name Sawyer & McCarty • First Mark Sample Reading Author Name Title Title Chemistry for Environmental Engineers Middle A. List Tchobanoglous & Schroeder Water Quality Last Miller Sections Eckenfelder Principles of Water Quality Management • Format L Metcalf & Eddy Wastewater Engineering Number Engineers 01 • Format PSawyer & McCarty Chemistry for Environmental Meetings Beg. End Place Days Number 05 Sections Format Number Meetings Instructors TimeMiddleTime Meetings Place Days Last Beg. End First • Beg. Time 11:00:00 Time Time Day 09:30:00 10:50:00 FRIEN 008 T End TimeL 11:50:00 01 09:30:00 10:50:00 FRIEN 008 T Peter R. Jaffe Th Place GUYOT 154 Jeffery Scott Paull Th Instructors First Middle Last Days Peter R. Jaffe Course Listings Number • Day Dept. Th Jeffery Scott Paull GEO 206 Title History of the Earth Instructors Description This course examines about 15 seminal events in natural history, and those events shaped the modern environment. The • how Course Listings • First Nicole first section covers the origin of the Universe, chemical elements, and solar system. After a discussion of the structure and Middle Kdynamics of the Earth, we next examine the origin of life, evolution of eukaryotic cells and multicellularity, the evolution of Last Gotberg vertebrates (amphibians, dinosaurs, and mammals), and mass extinctions. The final section discusses the origin of man, the ice GEO 206 ages, and global change. 206a: 3 lectures; 206b: three lectures, 1 three-hour laboratory. • First Mark Title History of the Earth Max. Enrollment 40 Middle A. Description This course examines about 15 seminal events May Audit? N in natural history, and how those events Last Miller Website courseinfo.princeton.edu/courses/geo206a-geo206b_F2000/index.html shaped the modern environment. The first Final Exam Type Final section covers the origin of the Universe, Grading Title Perc. chemical elements, and solar system. After a Course Listings Other Exam 50 of the discussion of the structure and dynamics • Dept. CEE Term Paper(s) Earth, we next examine the origin of20 life, Number 471 Quizzes 10 evolution of eukaryotic cells and multicellularity, the evolution of vertebrates (amphibians, 20 • Dept. GEOPapers dinosaurs, and mammals), and mass Sample Reading Title Number 471 Author Name extinctions. The final section discusses the List Kump, Kasting and Crane Earth System Science origin of man, the ice ages, and global change. Title Introduction to Water Pollution Technology Sections Format Number Meetings Instructors 206a: 3 lectures; Last 206b: three lectures, 1 threeDescription An introduction to the science End of water quality Beg. Place Days First Middle hour laboratory. management and pollution Time control Timein natural Day Max. Enrollment 40 Leslie of biological and GUYOT 220 Lsystems;01fundamentals 10:00:00 10:50:00 M Michael Bender May Audit? N chemical transformations in natural waters; W indentification of sources of pollution; water Website courseinfo.princeton.edu/courses/geo206aF and wastewater treatment methods; geo206b_F2000/index.html Course Listings Max. Enrollment May Audit? Website Title Final Exam Type Description Grading Dept. Course Listings Dept. Number Design 33 • Format P CEE 471 Project(s) Number 04 GEO 471 Quizzes 66 Meetings Title Introduction to Water Pollution Technology Other (See 1 • Beg. Time 19:00:00 Instructor) Description An introduction to the science of water quality management and pollution control in natural systems; fundamentals of biological End Time 19:50:00 and chemical transformations in natural waters; indentification of sources of pollution; Sample Readingwater List and wastewater treatment Place GUYOT 154 methods; fundamentals of water quality modeling. • Author Name Tchobanoglous & Schroeder Days Max. Enrollment 58 Title Water Quality • Day W May Audit? Y S. K A. S. K A. K A. K A. Perc. 11:00:00 11:50:00 GUYOT 154 • Number Format 19:00:00 19:50:00 GUYOT 154 Perc. • 15:30:00 16:20:00 GUYOT 154 Thomas Nicole Mark Thomas Nicole Mark Nicole Mark Nicole Mark Day Dept. • P 03 First Nicole Middle K Last Gotberg P 04 First Mark Middle A.P 05 Last Miller Duffy Number Number Format Course Listings Title Description Day • 13:30:00 14:20:00 GUYOT 155 Last S. CEE 471 GEO 471 Introduction to Water Pollution Technology An introduction to the science of water quality management and pollution control in natural systems; fundamentals of biological and chemical transformations in natural waters; indentification of sources of pollution; water and wastewater treatment methods; fundamentals of water quality modeling. Dept. Number Max. Enrollment 140 58 GEO 207 May Audit? Y Y AST 207 Website A Guided Tour of the Solar System Final Exam Type Final Other This course examines the major bodies of our solar Grading Title Perc. Title system, emphasizing their surface features, internal MidTerm Exam 20 structures, and 33 atmospheres. Topics include the origin Quizzes 10 Design of the solar system, habitability of planets, and role of Final Exam 20 Project(s) impacts in planetary evolution. Terrestrial and giant Precept Participation 10 Quizzes 66 planets will be studied as well as satellites, comets, Other Exam 20 Other (See 1 and asteroids. Recent discoveries from planetary Problem Set(s) 20 Instructor) missions are emphasized. The course is aimed Author Name Title primarily at non-science majors. Tchobanoglous & Water Quality Author Name Title Schroeder Morrow and Owen The Planetary System Eckenfelder Principles of Water Quality Bennett et al. The Cosmic Perspective Management Consolmagno and Schaefer Worlds Apart: A Textbook in Planetary Science Metcalf & Eddy Wastewater Engineering Beatty et al. The New Solar System Sawyer & McCarty Chemistry for Environmental Format Number Meetings Instructors Engineers Place Days Last Beg. End First Middle Meetings Instructors Time Time Day Place Days Beg. End First Middle Last L 01 11:00:00 11:50:00 GUYOT 10 M Thomas S. Duffy Time Time W F Peter L 01 09:30:00 10:50:00 FRIEN 008 T R. Jaffe P 01 13:30:00 14:20:00 GUYOT T Jeffery Thomas S. Duffy Scott Paull Th 155 Nicole K Gotberg Mark A. Miller P 03 15:30:00 16:20:00 GUYOT 154 W Thomas S. Duffy Nicole K Gotberg GEO 206 Mark A. Miller History of the Earth P 04 19:00:00 19:50:00 GUYOT 154 W Nicole K Gotberg This course examines about 15 seminal events in natural history, and Mark how those events shaped environment. The A. the modern Miller first origin of the Universe, chemical elements, andNicole solar system. After a discussion of the structure and P section 05 covers the 11:00:00 11:50:00 GUYOT 154 Th K Gotberg dynamics of the Earth, we next examine the origin of life, evolution ofMark eukaryotic cells and multicellularity, the evolution of A. Miller vertebrates (amphibians, dinosaurs, and mammals), and mass extinctions. The final section discusses the origin of man, the ice ages, global change. 206a: 3 lectures; 206b: three lectures, 1 three-hour Dept. and Number Max. Enrollment 58 laboratory. 40 CEE 471 May Audit? Y N GEO 471 Website courseinfo.princeton.edu/courses/geo206a-geo206b_F2000/index.html Introduction to Water Pollution Technology Final Exam Type Other Final An introduction to the science of water quality Grading Title Perc. Title management and pollution control in natural systems; Design Project(s) 33 fundamentals of biological and chemical Quizzes 66 transformations in natural waters; indentification of Other Exam 50 Other (See Instructor) 1 sources of pollution; water and wastewater treatment Term Paper 20 methods; fundamentals of water quality modeling. (s) Author Name 10 Title Quizzes Tchobanoglous20 & Schroeder Water Quality Papers Eckenfelder Principles of Water Quality Management Author Name Title MetcalfKasting & Eddy and Wastewater Engineering Kump, Earth System Science Sawyer & McCarty Chemistry for Environmental Engineers Crane FormatMeetings Number Meetings Instructors Instructors Place DaysFirst Last Beg. End FirstMiddle Last Middle Beg. End Place Days Time Time Day Time Time L 01 09:30:00 10:50:00 FRIEN 008 T Peter R. Jaffe L 01 10:00:00 10:50:00 GUYOT 220 M Leslie Bender Scott Jeffery Paull ThMichael W Dept. Number Max. Enrollment 40 F GEO 206 May Audit? N History of the Earth Website courseinfo.princeton.edu/courses/geo206ageo206b_F2000/index.html This course examines about 15 seminal events in natural history, and how those events shaped the Final Exam Type Final modern Grading Title Perc. GEO 206environment. The first section covers the origin ofofthe Other Exam 50 History theUniverse, Earth chemical elements, and solar system. After a discussion of15 theseminal structure and in natural history, and howTerm 20 This examines about events events shaped the modern environment. The Dept.course Number Max. Enrollment thosePaper(s) dynamics of covers the Earth, next the origin of elements, first section the we origin of examine the Universe, chemical and solar system. After a discussion of the structure and Quizzes 10 May Audit? life, evolution of Earth, eukaryotic cellsexamine and multicellularity, dynamics of the we next the origin of life, evolution of eukaryotic cells and multicellularity, the evolution of 20 Papers the evolution(amphibians, of vertebrates (amphibians, dinosaurs, and Website vertebrates dinosaurs, and mammals), mass extinctions. The final section discusses the origin of man, the ice and mammals), mass206a: extinctions. The final section Final Exam Type ages, and globaland change. 3 lectures; 206b: three lectures, 1 three-hour laboratory. discusses the origin of man, the ice ages, and global Grading Title Perc. 40 change. 206a: 3 lectures; 206b: three lectures, 1 N three-hour laboratory. courseinfo.princeton.edu/courses/GEO206a-GEO206b_F2000/index.html Author Name Title Final Kump, Kasting and Crane Earth System Science Title Format Number Meetings Instructors End Place Days First Middle Last Author Name 20 Beg. Title Lab Reports Time Time Day Other Exam 50 L 01 10:00:00 10:50:00 GUYOT 220 M Michael Leslie Bender Papers 20 W Format Number Meetings Instructors Quizzes 10 F Place Days Last Beg. End First Middle Author Name Title Time Time Day Kump, Kasting and Earth System Science Dept. Number Max. Enrollment 40 Crane GEO 206 May Audit? N Instructors HistoryMeetings of the Earth Website courseinfo.princeton.edu/courses/GEO206aBeg. End Place Daysin First Middle Last GEO206b_F2000/index.html This course examines about 15 seminal events Time andTime natural history, how those events shaped the Final Exam Type Final modern environment. The first section covers the Grading Title Perc. B 01 13:30:00 16:20:00 GUYOT 178 F Michael Leslie Bender origin of the Universe, chemical elements, and solar Lab Reports 20 Sarah A Johnston system. After a discussion of the structure and Other Exam 50 Brian Gertsch dynamics of the Earth, we next examine the origin of Papers 20 Llife,01evolution 10:00:00 10:50:00cells GUYOT 220 M Michael Leslie Bender of eukaryotic and multicellularity, Quizzes 10 the evolution of vertebrates (amphibians, dinosaurs, W and mammals), and mass extinctions. The final F section discusses the origin16:20:00 of man, the ice ages, global B 02 13:30:00 GUYOT 178 andTh Michael Leslie Bender change. 206a: 3 lectures; 206b: three lectures, 1 Sarah A Johnston three-hour laboratory. Brian Gertsch Author Name Title Kump, Kasting and Crane Earth System Science Format Number Meetings Instructors Beg. End Place Days First Middle Last Time Time Day GEO 235 B 01 13:30:00 16:20:00 GUYOT 178 F Michael Leslie Bender CEE 235 Sarah A Johnston The Physical Earth Brian Gertsch An introduction to geology and geological processes. Topics include the physical processes occurring within the earth, including L 01 10:00:00 10:50:00 GUYOT M Michael faults, and mountain Leslie building, Bender plate tectonics, formation of minerals and rocks, 220 earth structure, earthquakes, as well as the physical processes that transform the earth's surface, including weathering and erosion, flooding, landslides and the development W of landscape. Recommended for those interested in majoring in the Geosciences. F 40 B 02 13:30:00 16:20:00 GUYOT 178 Th Michael Leslie Bender Y Sarah A Johnston Brian Gertsch Final Sections Day • Number Instructors • First Thomas P 01 Middle S. Last Duffy The Cosmic Perspective Worlds Apart: Textbook in Planetary Science • ACourse Listings The New Solar System Instructors Days Last First CEEMiddle 471 Day GEO 471 M Thomas S. DuffyPollution Technology Title Introduction to Water W Description An introduction to the science of water quality F management and pollution control in natural T Thomas S. Duffy of biological and systems; fundamentals chemical transformations Nicole K Gotberg in natural waters; indentification of Miller sources of pollution; water Mark A. andS.wastewater treatment methods; W Thomas Duffy fundamentals of water quality modeling. Nicole K Gotberg Max. Enrollment 58 A. Mark Miller May Audit? Y K W Nicole Gotberg Website Mark A. Miller Final Exam Type Other Th Nicole K Gotberg Grading TitleA. Mark Miller Dept. Bennett et al. Meetings Consolmagno • Beg. Time 15:30:00 and Schaefer et al. End TimeBeatty 16:20:00 Sections Place Format Number GUYOT 154 Meetings Place Beg. End Days Time Time • Day W L 01 11:00:00 11:50:00 GUYOT 10 Middle Thomas (e) Final hybrid layout with outline columns and justified tables Days 15:30:00 16:20:00 GUYOT 154 W Format P Instructors First Middle Last Number 01 Thomas S. Duffy Meetings Nicole K Gotberg • Listings Beg. Time 13:30:00 Course Dept. Number Mark A. Miller End TimeGEO 14:20:00 207 Place GUYOT AST 207155 • Format P Days Title A Guided Tour of the Solar System Number 04 • Day T Description This course examines the major bodies of our solar system, emphasizing their surface features, internal structures, and Meetings Beg. End Place Days atmospheres. Topics include the origin of the solar system, habitability of planets, and role of impacts in planetary evolution. Time Time Instructors Terrestrial and giant planets will be studied as well as satellites, comets, and asteroids. Recent discoveries from planetary missions are emphasized. The course is aimed primarily at non-science majors. • First Thomas 19:00:00 19:50:00 GUYOT 154 W Max. Enrollment Middle S.140 Instructors First Middle Last May Audit? Y Last Duffy Nicole K Gotberg Website Mark A. Miller Final• Exam TypeNicole Final First Grading Middle KTitle Perc. • Format P MidTerm Exam 20 Last Gotberg Number 05 Quizzes Meetings Beg. End Place Days10 • First Mark Time Time Final Exam 20 Middle A.Precept Participation 10 Last Miller 11:00:00 11:50:00 GUYOT 154 Th20 Other Exam Instructors First Middle Last Problem Set(s) 20 Nicole K Gotberg Sample Reading PAuthor Name Title • Format Mark A. Miller List Number Morrow and Owen The Planetary System 03 • Course Listings Title Description P 01 Beg. Time 13:30:00 14:20:00 GUYOT 155 T Instructors First Middle Last Thomas S. Duffy Nicole K Gotberg Mark A. Miller Instructors • First Thomas Middle S. Last Duffy • 11:00:00 11:50:00 GUYOT 10 Sections • Format Number Meetings Instructors First Perc. • L 01 Day Sections • Format L Number 01 Meetings • Beg. Time 11:00:00 End Time 11:50:00 Place GUYOT 10 Days • Day M Days Number Author Name Consolmagno and Schaefer Title Worlds Apart: A Textbook in Planetary Science Place Author Name Bennett et al. Title The Cosmic Perspective Dept. Author Name Bennett et al. Title The Cosmic Perspective • Title The Planetary System The Cosmic Perspective Worlds Apart: A Textbook in Planetary Science The New Solar System • Day • MidTerm 20 Exam Quizzes 10 Final Exam 20 Precept 10 Participation Other Exam 20 Problem Set 20 (s) Sample Reading List • Author Name Morrow and Owen Title The Planetary System MidTerm 20 Exam Quizzes 10 Final Exam 20 Precept 10 Participation Other Exam 20 Problem Set 20 (s) Sample Reading Author Name List Morrow and Owen Bennett et al. Consolmagno and Schaefer Beatty et al. Sections Meetings Beg. End Time Time Day • Title Description Perc. Title Quizzes Perc. 10 GEO 207 AST 207 A Guided Tour of the Solar System This course examines the major bodies of our solar system, emphasizing their surface features, internal structures, and atmospheres. Topics include the origin of the solar system, habitability of planets, and role of impacts in planetary evolution. Terrestrial and giant planets will be studied as well as satellites, comets, and asteroids. Recent discoveries from planetary missions are emphasized. The course is aimed primarily at non-science majors. Max. Enrollment 140 May Audit? Y Website Final Exam Type Final Grading Title GEO 207 AST 207 A Guided Tour of the Solar System This course examines the major bodies of our solar system, emphasizing their surface features, internal structures, and atmospheres. Topics include the origin of the solar system, habitability of planets, and role of impacts in planetary evolution. Terrestrial and giant planets will be studied as well as satellites, comets, and asteroids. Recent discoveries from planetary missions are emphasized. The course is aimed primarily at non-science majors. Max. Enrollment 140 May Audit? Y Website Final Exam Type Final Grading Title Number Format • Course Listings Title Description Perc. A Guided Tour of the Solar System This course examines the major bodies of our solar system, emphasizing their surface features, internal structures, and atmospheres. Topics include the origin of the solar system, habitability of planets, and role of impacts in planetary evolution. Terrestrial and giant planets will be studied as well as satellites, comets, and asteroids. Recent discoveries from planetary missions are emphasized. The course is aimed primarily at non-science majors. Max. Enrollment 140 May Audit? Y Website Final Exam Type Final Grading • Title MidTerm Exam Perc. 20 (c) Basic hybrid layout (wider) • Number Dept. AST Number 207 Course Listings Dept. • (b) Basic hybrid layout • Number Course Listings • Dept. GEO Number 207 Dept. • Number Dept. Dept. Title Description Max. Enrollment May Audit? Website Final Exam Type Number Fig. 3. We illustrate our algorithm by enabling its features one by one and producing successive layouts of the data from Figure 2. All layouts are at This is a scale. test to(a) see much space the figure caption will take. write some here to bullets, avoid repeating myself. Yes, yes. Lorem ipsum the same is how a basic outline layout; this layout renders tuplesI will in relation valuesmore as indented stacks the fields of each tuple vertically, dolsomething. test to see howand much space the figure caption will(c) take. This is ahybrid test tolayouts, see howatmuch space the figurethat caption will and puts labels Yes. to theThis left is ofaprimitive values above relation values. (b) and show basic two different widths, use the take. (a) Thisat is the a test see how much to space the table figuresublayouts caption willwherever take. (b) aThis is can a test to see how much space the figure caption will take.the (c) outline layout firsttolevel but switch nested table fit within the available horizontal space. (d) justifies Sample Reading This is aoftest see how much space theremaining figure caption will take. (d) This is a test to see how much the figure caption will horizontal take. (e) This is a List • Course Listingscolumns columns thetotable sublayouts to fill the available horizontal space. (e) adds to space the outline sublayout to use space • Sections test to see how (f) much figure caption will generated take. (f) This test to seetohow much ideal spacebreak the figure take. in (e). more efficiently. is a space specialthe schema-only layout by is thea algorithm calculate pointscaption for the will columns content area of nested tables, respectively. The outline layout, unlike the nested table layout, supports the concept of a horizontally constrained width. Like the layouts produced by our final algorithm, an outline layout can be produced for any available width (with some minimum constraints), breaking text in primitive value areas as necessary. Our system distinguishes between fixedand variable-length primitive fields; value areas for fixed-length primitive field are always rendered at their predicted width, whereas areas for variable-length primitive fields are rendered to the full available width. This ensures that the visual width of a particular primitive field remains the same between successive tuples, even when the actual value differs. The rationale is two-fold. First, if layouts are to be used for data entry or data editing in a database application, the width of an input field should be commensurate with the expected size of its values. Second, keeping the perceived overall shape (gestalt) of successive tuples of the same relation schema similar should make it easier for a user to visually scan for specific fields in those tuples. 3.3 Hybrid Layout Whereas nested table layouts quickly grow wide, outline layouts tend to be tall and narrow. Outline layouts use space inefficiently by only starting values on the left-hand side of the page, and by repeating schema labels once for every value. They are also harder to read than table layouts, since they tend to put values from the same schema field but from different tuples far apart. In Figure 3(b), we introduce a hybrid layout that embeds nested tables inside an outline layout. This saves vertical space compared to the pure outline layout in Figure 3(a). Like for the outline layout, we assume that we are given a constrained horizontal width to work with, such as screen size or page width for printing. We then create a layout that is guaranteed to fit within this horizontal width. This avoids horizontal scrolling or pagination, and ensures that the layout will only grow in the vertical direction as more tuples or fields are added. When building a hybrid outline/table layout, the algorithm must start at the root level of the relation schema and decide, for each relation field, whether to render that relation field using a nested table or another level of an outline layout. If the decision is made to render a relation field using an outline layout, the decision process is repeated recursively for each of its fields. If the decision is made to render a relation field using a nested table layout, all child fields are rendered as a nested table layout as well. While our recursive layout generation algorithm technically supports embedding outline sublayouts into nested table layouts, this makes little typographical sense, and we do not make use of this case. The decision to use an outline layout vs. a nested table layout for a given relation schema could reasonably be made using a cost optimization strategy, for instance based on the total area consumed by the layout in each case. However, because nested table layouts almost invariably consume less space than corresponding outline layouts, a simpler heuristic is possible: always use a nested table layout if there is enough horizontal space available for it. This is the rule used in our algorithm; it is illustrated by contrasting the narrow layout in Figure 3(b) with the wider layout of Figure 3(c). In Figure 3(b), the layout is constrained to a small width, and small nested tables have been chosen by the heuristic for relation fields Course Listings, Grading, Sections/Meetings, and Sections/Instructors. All higher-level relations are rendered using outline sublayouts. In Figure 3(c), the same data is rendered at a larger constrained width, allowing both the Sample Reading List relation field and the entire Sections relation field to be rendered as a nested table. Note that a hybrid layout that is given enough horizontal space to work with will always degenerate to a pure nested table layout. Since the outline vs. nested table decision heuristic depends on whether or not there is enough horizontal space for a table, the determination of table column widths is important at this stage. For relation fields, table columns are as wide as the sum of their child fields’ columns, plus separator lines and side margins, as well as any extra space needed to accommodate the relation column’s own header label. For fixed-width primitive fields, the width of the table column is simply the width of the field. For variable-width primitive fields, we experimented with various heuristics, and found the average width of values in the field to be a sensible minimum, limited upwards to a constant value (on the order of 50 characters, a standard book column size). Primitive columns also need to be wide enough to accommodate their schema labels, which may often be wider than the actual values in the fields. For the latter case, we automatically use vertical column labels in tables if this makes the column narrower for the purposes of the outline vs. nested table decision. Examples are the Format and Number column header labels in Figure 3(c). After the minimum widths of table columns have been determined and the decisions to use outline vs. nested table sublayouts at each relation level have been made, additional horizontal space may be available to the right of nested table sublayouts. A separate table justification step uses the remaining space to first, for readability, make any previously vertical column headers’ labels horizontal. This is done in a greedy order to minimize the number of remaining vertical column labels. Then, any remaining horizontal space is distributed among columns holding variable-length primitive fields, in proportion to their fields’ average lengths. The table justification step is illustrated in the transition from Figure 3(c) to Figure 3(d). Note again that layout styling decisions, such as table column widths or whether to use an outline or a table sublayout for a given subrelation, are made once for each field in the schema of the input data rather than once for each value in the input data. This means, for instance, that in a given layout like Figure 3(c), every instance of the Sections relation will be rendered in the same way (either as a table or an outline), regardless of its actual content in each instance. The rationale is similar to that for making primitive fields in outline mode always the same width. 3.4 Columns in Outline Layouts While the hybrid layouts layout shown in Figures 3(b), (c), and (d) save a substantial amount of space compared to corresponding outline layouts, they still use horizontal space inefficiently in cases where small primitive fields can not be made part of a table and where only tables with narrow content can be used. In form-style database user interfaces, the traditional solution is to make use of multiple columns of fields. Note that these are a different kind of columns than the columns in nested tables; they allow different fields in an outline layout to be organized in multiple adjacent stacks. We now show how our system can automatically incorporate columns in outline sublayouts with no manual styling required. We considered various approaches to the problem of introducing columns into outline layouts. Design questions include how to pick the right number of columns to use, how to pick the width of each column, whether to allow certain fields to span multiple columns, and after what fields a new column should be started. We decided to make two simplifying assumptions which seem to work reasonably in practice: (1) the number of columns to use is based solely on the available horizontal width, and (2) every adjacent column has the same width. So for a layout of a width corresponding to a typical letter-size page, for instance, the algorithm would use a two-column layout in the root level outline. However, one complicating issue must still be dealt with: relation fields that could be rendered in a table layout need to be allowed to span multiple columns if necessary. We settled on the following rule: any relation field in an outline is excluded from participating in a set of multiple columns if that would cause it to be rendered as an outline. There is no requirement that the field would actually have to be rendered as a nested table if excluded, but if the excluded field is still rendered as an outline, that outline layout is still subject to the usual heuristics about whether to use columns or not at that next level. For implementation purposes, the algorithm divides the fields of a relation displayed in an outline layout into multiple column sets, each which contains again a list of columns, each which contains the fields in the column. To populate column sets, the algorithm iterates over outline fields in the order they appear in the schema, assigning each to the current column set. If an excluded field is encountered, it is assigned to a new column set of its own, and a new current column • Author Name Markus K. Brunnermeier and Lasse H. Pedersen Title Predatory Trading, Journal of Finance 2005 Title Topic Description L 01 O 30 Max. Enrollment May PDF? May Audit? Assignments 22824 Meetings Beg. Time End Time Place 11:00:00 12:20:00 FISHH B01 Days T Th N Instructor Assignments First Middle Last Markus Konrad Brunnermeier 30 Y Y The course involves student presentations (10-15 minutes) and most likely trading simulations that are based on recently developed trading software. Approximately 75 pages of reading per week. Financial institutions play an increasingly dominant role in modern finance. This course studies financial institutions and focuses on the stability of the financial system. It covers Website important theoretical concepts and recent Final Exam Type Final developments in financial intermediation, asset Grading Title pricing under asymmetric information, behavioral MidTerm Exam finance, and market microstructure. Topics Final Exam include market efficiency, asset price bubbles, Oral Presentation(s) herding, liquidity crises, risk management, market design, and financial regulation. Sample Reading Author Name Title List Markus K. Brunnermeier Asset Pricing under Asymmetric Information Maureen O'Hara Market Microstructure Theory Glyn A. Holton Value-at-Risk: Theory and Practice Markus K. Brunnermeier and Lasse H. Pedersen Predatory Trading, Journal of Finance 2005 Sections Status SCORE Meetings Instructor Assignments Number Place Days TBA? Beg. End First Middle Last Time Time Day L 01 O 30 22824 11:00:00 12:20:00 FISHH B01 T Th N Perc. 35 35 30 Markus Konrad Brunnermeier Hybrid Layout Day Th TBA? Code Number Subletter ECO 467 Institutional Finance Max. Enrollment • Course Listings Number Format Sections • Format L Number 01 Status O Max. Enrollment 30 SCORE Number 22824 Meetings • Beg. Time 11:00:00 End Time 12:20:00 Place FISHH B01 Days • Day T Maureen O'Hara Glyn A. Holton Markus K. Brunnermeier and Lasse H. Pedersen Asset Pricing under Asymmetric Information Market Microstructure Theory Value-at-Risk: Theory and Practice Predatory Trading, Journal of Finance 2005 Day Author Name Glyn A. Holton Title Value-at-Risk: Theory and Practice Markus K. Brunnermeier TBA? Author Name Maureen O'Hara Title Market Microstructure Theory • MidTerm Exam 35 Final Exam 35 Oral 30 Presentation(s) SCORE Number Max. Enrollment Status • Final Sections Title Table Layout • Outline Layout Sample Reading List • Author Name Markus K. Brunnermeier Title Asset Pricing under Asymmetric Information The course involves student presentations (10-15 minutes) and most likely trading simulations that are based on recently developed trading software. Approximately 75 pages of reading per week. Sample Reading List Author Name Number Format Title Oral Presentation(s) Perc. 30 Y Grading Title Perc. Title Final Exam Perc. 35 • Y Assignments Final Exam Type Website Financial institutions play an increasingly dominant role 30 in modern finance. This course studies financial institutions and focuses on the stability of the financial system. It covers important theoretical concepts and recent developments in financial intermediation, asset Title Institutional Finance pricing under asymmetric information, behavioral Topic finance, and market microstructure. Topics include Description Financial institutions play an increasingly dominant role in modern finance. This course studies financial institutions and market efficiency, asset price bubbles, herding, liquidity focuses on the stability of the financial system. It covers important theoretical concepts and recent developments in crises, risk management, market design, and financial financial intermediation, asset pricing under asymmetric information, behavioral finance, and market microstructure. regulation. Topics include market efficiency, asset price bubbles, herding, liquidity crises, risk management, market design, and financial regulation. Max. Enrollment 30 May PDF? Y May Audit? Y Assignments The course involves student presentations (10-15 minutes) and most likely trading simulations that are based on recently developed trading software. Approximately 75 pages of reading per week. Website Final Exam Type Final Grading • Title MidTerm Exam Perc. 35 • Institutional Finance May Audit? Description May PDF? Max. Enrollment ECO 467 Title Subletter Number Code Course Listings • Code ECO Number 467 Subletter Topic Course Listings • N Instructor Assignments • First Markus Middle Konrad Last Brunnermeier Fig. 4. A comparison between our hybrid outline/table layout and a pure nested table layout and a pure outline layout, for the case of displaying a single tuple with many fields, including relational fields containing other nested tuples. Each layout is showing the same data in its entirety, at the same scale and font size. Outline layouts waste space by concentrating data to the left of the screen and by repeating labels for each value. Table layouts waste space when different fields in the same subrelation require different amounts of vertical space. Table layouts also tend to become very wide, requiring horizontal scrolling if viewed on a screen. set started. No reordering of fields is done at any point, as the order of fields in the schema is considered significant for presentation purposes. After all fields have been assigned to column sets, each column set partitions its assigned fields into columns. Figure 3(e) shows the final hybrid layout with column support. The example layout has a single outline sublayout with 3 column sets; the first column set has two columns and contains the fields Course Listings through Description in the first column and the fields Max. Enrollment through Grading in the second column. The second and third column sets each have only one column with a single field in each, containing the fields Sample Reading List and Sections, respectively. The partitioning of columns in a column set, that is, after what fields to start each column, requires a heuristic. Simply giving each column the same number of fields does not work well, since some fields frequently take up more vertical space than others. A better approach is to split the columns in such a way as to minimize the total vertical space consumed; this can be done easily with a dynamic programming routine. However, if this is done independently for each tuple being rendered, two tuples might end up having differently partitioned columns, with different fields starting the columns in each case. This is not ideal for readability. Instead, as before, we make the decision of where to begin new columns only once for the entire layout. To allow calculation of optimal column partitioning positions on the basis of only aggregate information about the input, we allow our layout generation algorithm to generate schema-only versions of sublayouts. For instance, the schema-only layout for the layout in Figure 3(e) is shown in Figure 3(f). In a schema-only layout, variable-length primitive fields are sized according to the average width of the field in the entire dataset. Relation fields are rendered, in outline or table form, with a single placeholder tuple only, but subsequently padded such that the size of the subrelation layout is proportional to the average cardinality of that relation throughout the entire dataset. Thus in Figure 3(f), the Description field is taller than the Title field, and there are about two rows’ worth of space allocated in the Sample Reading List table. Optimal column partitioning decisions are then made using these estimated schema-only layouts. The final class of hybrid layouts produced is considerably more compact than both the outline and the nested table layouts, and can be produced automatically with no manual input. Figure 4 shows a scale comparison between the three layout styles, each showing a single nested tuple from the course catalog example. In this case, the hybrid layout would permit significantly more data to be fit on a single screen without scrolling in either the vertical or the horizontal direction. We can also see that large nested table layouts often waste space whenever two values in the same tuple take up different amounts of vertical space. 3.5 Implementation To produce the hybrid layouts described, our system makes two passes over the input data while maintaining a stylesheet as the only other common data structure. The stylesheet maps schema paths and property types to property values, that is, each node in the schema has one value for each property. See Table 1 for the list of stylesheet properties. Before the algorithm starts, a subset of properties will already have been set as constants, such as the choice of fonts and separator styles. Our algorithm’s first pass over the input data is during the Measure phase, which finds the average rendered width of each primitive value and the average cardinality of each relation value. The application of heuristics to set remaining stylesheet properties is then done in a subsequent pass over the schema only, the Auto-Style phase. Finally, the output layout is constructed in the Layout phase, which is the second pass over the input value. The Auto-Style and the Layout phases execute the same code, but with the Auto-Style phase traversing a schema-only version of the data structure used to maintain context, and using the aforementioned heuristics to set undefined stylesheet properties whenever they are encountered. The heuristics for setting stylesheet properties during the Layout phase are described in Table 1. 4 I NTERACTIVE F EATURES While our evaluation focuses on the static aspects of our layout management algorithm, our implemented system includes multiple features oriented towards interactive use.5 A requirement for many kinds of interactions is to be able to make selections among displayed elements in the layout. Our system supports a spreadsheet-like cursor which can be used to select any cell in the layout, where a cell is defined as either a label for a primitive field, a label for a relation field, or a primitive value. Selections can be made either by clicking the mouse or by moving the cursor with keyboard navigation keys (arrow keys, Home/End, and Page Up/Page Down). 5 To see the interactive features described in this section in action, we recommend watching the demo video accompanying this article. Because the aforementioned definition of a cell serves to ensure that no two cells can ever overlap, determining the cell to be selected in the case of a mouse click is a simple matter of determining what cell occupies, or is closest to, the point at which the mouse was clicked. For keyboard-based cursor movement, we found the cursor behavior to feel the most natural when the relative motion of the cursor followed the physical location of the cells in the visual layout rather than the logical location of the cells in the schema. To make keyboard cursor movement work well when traversing cells stacked in various nontrivial configurations, we store the cursor state as an actual point on the layout rather than simply as a pointer to the selected cell. This generalizes the behavior seen in existing spreadsheets for instance when moving the cursor across merged cells. An additional feature is the ability to interactively override the stylesheet settings made by the automatic layout manager. This has the potential to significantly improve readability of output layouts, since the user can use their domain knowledge to decide where labels are superfluous and can be omitted, what fields should serve as titles and thus be emphasized with larger fonts, and such. Finally, our system supports “frozen” table headers which stay put at the top of the screen for as long as a table is partially visible in the scrolling viewport. This works both for pure table layouts and where table layouts are contained within outline layouts. Potential future features include multiple selection, collapsible relation fields, editing of data values, spreadsheet-style filtering, and more general-purpose query-building. 5 E VALUATION We evaluated three aspects of our system: runtime performance, the amount of space consumed by generated layouts, and the readability Mean Depth Max. Depth Mean Plural Depth Max. Plural Depth 59 887 1 359 250 953 1 566 1 493 3 032 854 901 953 800 18 751 419 61 560 809 310 1 020 731 101 494 200 2.25 2.20 2.96 5.01 3.15 2.10 3.63 6.16 2.00 3.85 2.33 4 4 4 6 5 3 5 8 2 6 3 2.25 2.20 1.00 2.51 1.94 1.09 2.48 3.43 1.00 1.94 1.00 4 4 1 3 3 2 4 5 1 3 1 Table 3. Runtime measurements for each phase of the layout algorithm. Standard error is within 3% in each case. RelationalPrinceton RelationalPrincetonHuge Pages 12 455 Algorithm Phase Runtime (s) Paginate Dataset Layout Properties Set During Auto-Style Phase P R StartNewOutlineColumn. True for the first field in each column of an outline column set. The heuristic partitions columns based on schema-only layouts that use ValueDefaultWidth and AverageCardinality to estimate field sizes. P R StartNewOutlineColumnSet. True for the first field in each outline column set. Column sets allow some sibling fields to be organized in multiple columns and others not. The heuristic puts a field in its own single-column column set if it would otherwise contain an outline sublayout. P R UseVerticalTableHeader. Whether to display a label in a table header vertically. The heuristic initially assumes vertical labels for primitive fields if this makes the column narrower, but restores as many horizontal labels as possible when the table is justified. P R TableColumnWidth. The width of each table column. Primitive columns are ValueDefaultWidth wide before justification; relation columns are the sum of their children plus twice TableNestSpaceSideWidth. Columns are also extended to accomodate their labels, as necessary. R UseTable. Whether to use an outline or a table sublayout for this relation. A table sublayout is used iff it its width before justification is less than or equal to the available horizontal space. 828 463 21 811 13 679 12 360 13 559 22 012 15 046 18 218 10 607 11 236 4 858 Auto-Style Properties Set During Measure Phase R OutlineLabelWidth. Width of labels in outline sublayouts. Sibling fields all use the same width, which is defined at the parent relation level. R AverageCardinality. The average number of tuples in each subrelation. P IsVariableLength. Whether a primitive field holds long strings of variable length. P ValueDefaultWidth. Average width of primitive values in this field when rendered with ValueTextStyle, or maximum width for non-variable length fields. Dataset RelationalPrincetonHuge RelationalPrinceton Auction321Gone SigmodRecord SwissProt DBLP Mondial NASA TPCHPart ProteinSequence CoursesReed # Tuples Constant Heuristic Parameters R OutlineMaxLabelWidth. Maximum width of labels in outline sublayouts. R OutlineColumnMinWidth. Minimum width of each outline column. P OutlineMinValueWidth. Minimum width of a primitive value in an outline. P OutlineSnapValueWidth. Multiple to round up to when setting the width of a nonvariable primitive value in an outline sublayout. P TableMaxPrimitiveWidth. Maximum width that can be allocated to a table sublayout column with variable-length primitive values, before table justification. # Primitives Basic Styling Constants OutlineBulletStyle. Bullet type for bulleted tuples in outline sublayouts. OutlineIndentWidth. Indentation amount for bulleted tuples in outline sublayouts. OutlineTupleSpaceHeight. Vertical space between tuples in outline sublayouts. TableNestSpaceSideWidth. Horizontal margin amount for nested tables. LabelTextStyle. Text style for outline labels or table headers. ValueTextStyle. Text style for values in outline or table sublayouts. # Characters in Primitives R R R R PR P Table 2. Quantitative statistics related to the size and complexity of datasets referred to in this paper. The depth of a primitive value is the number of enclosing relation values that must be traversed to reach the primitive value from the root. The plural depth only counts non-singleton enclosing relations. Measure Table 1. Properties defined, for each schema node, by a stylesheet. P, R, and P R indicate properties applicable to primitive nodes, relation nodes, or both, respectively. We have omitted color- and border-related properties. 1.03 40.16 0.29 0.31 0.83 42.43 0.0014 0.0451 of large layouts as measured by the time taken for human subjects to solve question tasks about the rendered data. As sample datasets, we picked one XML file from each of the 10 categories in the XML Data Repository at the University of Washington6 , except for the Treebank dataset, which is the only one with a recursive schema. We used the “preview” version of each dataset to make sure visualizations would be of a realistic size for human perusal. We also included one dataset from a relational database containing the complete course catalog for a semester at Princeton University. See Table 2. 5.1 Runtime Performance For runtime measurements, we ran all phases of the layout algorithm in sequence, and repeated the entire sequence multiple times. The runtimes for individual phases were averaged, less initial dry runs. Resulting runtime statistics for two datasets are shown in Table 3, for 30 runs plus 3 dry runs. The machine used had an Intel Core 2 Duo CPU and 4GB of RAM. Our two sets of runtime measurements suggest, as expected, that both the Measure and Layout phases run in time roughly proportional to the size of the input data, as measured by the size of the output layouts. Also as expected, the time consumed by the Auto-Style phase does not depend on the size of the input data, as it depends only on the schema and input stylesheet. For the larger RelationalPrincetonHuge dataset, the fact that the Layout phase does not take significantly more time to run than than the Measure phase suggests that the main bottleneck of the Layout phrase is the line breaking code that determines the size of rectangles assigned to display primitive values. Profiling has confirmed this to be the case. For the smaller RelationalPrinceton dataset, the time to perform Auto-Style and Layout for a new width is interactive. 6 G. Miklau, http://www.cs.washington.edu/research/ xmldatasets/www/repository.html Total Layout Area (US Letter-size pages or equivalent) 70 Unpaginated Hybrid Layout Paginated Hybrid Layout Unpaginated Outline Layout Paginated Outline Layout Unpaginated Tabular Layout 60 50 40 30 20 10 0 RelationalPrinceton Auction321Gone SigmodRecord SwissProt DBLP Mondial NASA TPCHPart ProteinSequence CoursesReed Fig. 5. Total area consumed by layouts of each of the three types. Outline and Hybrid layout widths are constrained to 8 inches, and are shown both with and without pagination. 5.2 Layout Space Efficiency To evaluate the space efficiency of our layouts, we compared, for each dataset, the area consumed by our own hybrid layout vs. the area consumed by a pure outline layout and a pure nested tabular layout. Each layout was produced by our layout manager, with the latter two using hard-coded values for the stylesheet property that selects whether a table or an outline style is used at any given level in the schema. Figure 5 compares the total area of each kind of layout for every dataset. The unpaginated area of a layout is that of the smallest rectangle enclosing it. The paginated area, for hybrid and outline layouts, is the number of pages consumed by the layout times the imageable (non-margin) area available on each page. Thus, the latter includes space wasted at the end of each page whenever the pagination algorithm has opted to break the page at an earlier but less awkward place. Since pure tabular views of an entire dataset are generally too wide to fit on a regular letter-size page, the tabular layout is rendered as a single, very large page. In cases where the tabular layout is actually narrow enough to fit on a page, notably the the SigmodRecord, TPCHPart, and CoursesReed datasets, the hybrid layout is nearly identical to the tabular layout, except that the pagination algorithm may be used to break up the hybrid layouts in the vertical direction. The ratio of the area consumed by an unpaginated outline layout to that consumed by an unpaginated hybrid layout is 3.9:1 on average. Similarly for tabular to hybrid layouts, it is 1.6:1 on average. Looking at the data from Figure 5, we see that the hybrid layout consumes less space than the corresponding outline layout for every dataset, with or without pagination enabled. The difference is greatest, between 4 and 13 times, in the cases where the hybrid layout corresponds to a pure nested tabular layout, namely SigmodRecord, TPCHPart, and CoursesReed. In these cases the schema of the data was flat or almost flat, and so a standard table layout would make very efficient use of the space. In the other cases, the hybrid layouts are about half the size of outline layouts on average. The smallest difference was for the Auction321Gone dataset, where the outline layout was 1.3 times the size of the hybrid layout. In this case the schema was nested in several levels, but contained only singular relations (relations only ever holding a single tuple) beyond the top level, so there were no opportunities for the hybrid layout algorithm to introduce tables into the lower levels of the layout. The modest saving over the outline layout came from the hybrid layout’s ability to display data in an outline tuple over two columns. A more significant difference was for the Mondial dataset, where the outline layout was 3.3 times the size of the hybrid layout. Here the hybrid layout made good use of both tuple columns and the ability to render subrelations as tables, and only used one level of outline bullets. While pure nested tabular layouts cannot be constrained to a page width like the outline and hybrid layouts, they tend to consume less total area than the outline layouts. See again Figure 5. For Auction321Gone, NASA, ProteinSequence, SwissProt, the hybrid layout still consumes between 30% and 80% less space than the tabular layout. This is because nested tables waste large amounts of space whenever a row contains cells of variable heights, such as when one empty and one well-populated subtable are placed horizontally adjacent to each other on the same row of a parent table. 5.3 Readability To evaluate the readability of our layouts, we conducted a betweensubjects online user study using Amazon Mechanical Turk7 and StudyCaster, a Java-based tool we developed to allow test subjects to stream timestamped recordings of their computer screens to our server with a minimum of effort. In an initial public recruiting stage of the study, workers were offered $0.25 to launch the StudyCaster and solve a chart-making task that required the workers to have Microsoft Excel installed on their machines. In the second and main stage of our experiment, we gradually invited qualified Mechanical Turk workers from the first stage directly to do a second task, worth $3.00. This task contained, for each subject, 9 different two-part questions, each two-part question being based on a separate PDF file with a layout generated from one of our 9 XML sources from the UW XML repository. The questions were a mix requiring the subjects to do both scanning across multiple similar entities (e.g. “What is the Brand number of the product sold in a Jumbo Bag container at a Retail price of less than $950?” or “How many articles were published in Volume 12, Number 3?”) and lookup between the attributes and related entities of a single entity (e.g. “What is the name of the person who was responsible for digitizing the earlier work by authors X?”). To reduce potential noise from subjects’ varying familiarity with their PDF readers’ search feature, the PDF files were rasterized, effectively disabling the feature for everyone. All subjects were given the same questions and datasets in the same order. However, the type of layout provided for each dataset was randomized, with the constraint that each subject would see 3 datasets rendered with each of the 3 kinds of layout types. The order of the layout types was round-robined such that datasets number 1, 4, and 7 would use the same layout types, as would 2, 5, and 8, and as would 3, 6, and 9. Each of the 18 total questions (from 9 two-part questions) would be shown in the StudyCaster pop-up window, which allowed us to measure the exact amount of time the subjects spent viewing, and hence presumably spent working on, each question. The StudyCaster software also allowed us to further limit timings to when workers had the correct PDF file in focus in their PDF reader to answer the currently shown question (sampled from the Win32 API at 5Hz), and to exclude time idle more than 5 seconds from keyboard or mouse activity (same). The idle time rule was used to decrease noise from workers taking a break from the computer while having a question open on the screen. Our user study yielded data from 27 subjects. An additional 6 subjects completed the study, but were not included in the dataset due to 7 http://www.mturk.com 270 Hybrid Layout, Subtask A Outline Layout, Subtask A Tabular Layout, Subtask A Hybrid Layout, Subtask B Outline Layout, Subtask B Tabular Layout, Subtask B 240 Mean Time to Solve Correctly (s) Mean Time to Solve Correctly (s) 510 480 450 420 390 360 330 300 270 240 210 180 150 120 90 60 30 0 1 -A 210 180 150 120 90 60 30 01 2-Si 3-Sw 4-D 5 6 7 8 9 BLP -Mond -NASA -TPCH -Protei -Cours iss on3 gmodR ial nSe e Part 21G ecor Prot quen sReed one d ce Task Number-Data Set Name (Subtasks A) -Au ucti 2-Si 3-Sw 4-D 5-M 6 7 8 9 B g ond -NASA -TPCH -Protei -Cours ial nSe e Part n32 modRe issProt LP quen sReed 1Go cord ce ne Task Number-Data Set Name (Subtasks B) ctio Fig. 6. The mean time to solve each task in the user study, grouped by the kind of layouts that were used to generate the PDFs subjects used to solve the task. The error bars show the Standard Error of the Mean. Subtasks B were given to subjects directly after Subtasks A, and were in each case identical to Subtask A except for a small emphasized change in the question text. technical problems uploading their screen recordings and timing data. The 18 task questions were answered correctly by 88% of subjects on average, with too limited variation to draw conclusions about possible impact of layout type on correctness. Figure 6 shows the average time taken to complete each subtask for each layout type. Each subject’s timing is included in the average for each subtask only if the subject answered that question correctly. To test our user study for statistical significance, we considered the timing data from each of the 18 subtasks separately. In each case, we thus had 3 sets of measurements of the time taken to solve the task correctly, one for each kind of layout presented to the user. We first ran Levene’s test to confirm that our experiment design conformed to ANOVA’s assumption of homogeneous variances between the 3 measurement populations in each case. In only 1 of the 18 cases (Task 5B) was Levene’s test significant (indicating non-homogeneous variances) with p < 0.05, suggesting that this is a reasonable assumption. We thus proceeded to use ANOVA to analyze the results, with a Tukey Honest Significance Test (HSD) as a follow-up test in cases where the ANOVA was significant. Since we are doing 18 tests, we should require p < 0.05/18 for strictly significant ANOVAs, as per the Bonferroni correction. There are two significant results to this confidence level, for tasks 2A and 8B. For the purposes of discussing results, however, we have done the Tukey HSD follow-up test for all tasks with ANOVAs up to p < 0.05. This allows us to list all the most significantly different pairs of timings between different layout types, as shown together with the relevant p-values in Table 4. Note that we can expect about one (18 ∗ 0.05) of the borderline-significant ANOVAs in this table to be due to chance. Looking at the follow-up tests from Table 4, we can see no significant differences in the task completion times between the tabular vs. the hybrid layouts. We do however see consistent differences both between the outline and the hybrid layouts as well as between the outline and the tabular layouts. These differences are present in both subtasks of several questions, suggesting that they are relevant both when users are first learning to do a task and when they immediately after do a second similarly structured task. In terms of relative task completion times, it is clear that both the hybrid and the tabular layouts outperform the outline layout, in both cases being completed 2.9 times faster on average for the tasks listed in Table 4. This is consistent with the hypothesis that larger layouts are harder to use because more scrolling is needed to look through the same data. The relatively high performance of the nested tabular layouts suggests that for the kinds of large datasets we had our users work with, the very regular structure of the tabular layout can outweigh its disadvantages of taking up more space and requiring both horizontal and vertical scrolling. However, the tabular layout would be a poor choice for smaller datasets, such as the common database application requirement of showing the details of a single entity with all its attributes and Table 4. Summary of statistical tests run on the dataset from Figure 6. Only tasks for which the ANOVA yielded p<0.05 are shown. For Tukey HSD pairs with p<0.05, we also show the relative differences in average task completion times. Task Levene ANOVA (heteroscedasticity) # 2A 2B 3B 5A 5B 8A 8B 9A 9B p 0.5312 0.0627 0.2889 0.7976 0.0389 0.1099 0.3409 0.4104 0.1812 between pairs) p 0.0012 0.0034 0.0032 0.0072 0.0174 0.0101 0.0004 0.0034 0.0107 Mean-Time-toSolve Ratio Tabular v. Outline: Outline: Hybrid Hybrid Tabular p 0.8006 2.05 2.80 0.9279 3.40 2.62 0.0962 2.81 0.8521 2.11 2.74 0.9999 4.97 0.8201 1.88 0.3129 2.00 4.16 0.9553 2.46 2.20 0.7919 4.69 Tukey HSD (follow-up to find differences Outline v. Hybrid p 0.0019 0.0036 0.3047 0.0200 0.0251 0.0092 0.0048 0.0050 0.0108 Outline v. Tabular p 0.0169 0.0448 0.0023 0.0142 0.0532 0.0996 0.0008 0.0119 0.0530 related subentities. Future evaluation could focus on layout performance on smaller, form-size datasets. 6 C ONCLUSION We have presented a layout management algorithm that automates the display of structured hierarchical data using the traditional visual idioms of hand-designed database UIs: tables, multi-column forms, and outline-style indented lists. By default, the widths of generated layouts are constrained so that only vertical scrolling may be necessary to view the data in its entirety. Our stylesheet system is further designed such that two input values mapping to the same schema node will always be styled in a similar way. Our hybrid layouts are 3.9 and 1.6 times more compact on average than outline layouts and horizontally unconstrained table layouts, respectively, and are as readable as table layouts even for large datasets. We believe that our system can function as a single output system for most of the data views commonly required in domain-specific database applications, whether they be large tables required to display information about many entities at once or form views that must display many details of a single entity compactly without scrolling. ACKNOWLEDGMENTS We would like to thank Michael S. Bernstein, Christian Bernt Haakonsen, Jeffrey Heer, and anonymous reviewers past and present for their helpful input during various stages of this project. R EFERENCES [1] E. Bakke and E. Benson. The schema-independent database UI: A proposed holy grail and some suggestions. In Proceedings of the 5th Biennial Conference on Innovative Data Systems Research (CIDR ’11), 2011. [2] E. Bakke, D. R. Karger, and R. C. Miller. A spreadsheet-based user interface for managing plural relationships in structured data. In Proceedings of the 29th International Conference on Human Factors in Computing Systems (CHI ’11), New York, NY, USA, 2011. ACM. [3] B. B. Bederson, B. Shneiderman, and M. Wattenberg. Ordered and quantum treemaps: Making effective use of 2D space to display hierarchies. ACM Transactions on Graphics (TOG), 21(4):833–854, October 2002. [4] F. Bodart, A.-M. Hennebert, J.-M. Leheureux, and J. Vanderdonckt. Towards a dynamic strategy for computer-aided visual placement. In Proceedings of the workshop on Advanced Visual Interfaces (AVI ’94), pages 78–87, New York, NY, USA, 1994. ACM. [5] E. H.-H. Chi, P. Barry, J. Riedl, and J. Konstan. A spreadsheet approach to information visualization. In Proceedings of the IEEE Symposium on Information Visualization (InfoVis ’97), pages 17–24, 1997. [6] R. Chimera and B. Shneiderman. An exploratory evaluation of three interfaces for browsing large hierarchical tables of contents. ACM Transactions on Information Systems (TOIS), 12:383–406, October 1994. [7] P. Chmelar, R. Hernych, and D. Kubicek. Interactive visualization of data-oriented XML documents. In T. Sobh, editor, Advances in Computer and Information Sciences and Engineering, pages 390–393. Springer Netherlands, 2008. [8] M. Fernández, Y. Kadiyska, D. Suciu, A. Morishima, and W.-C. Tan. Silkroute: A framework for publishing relational data in XML. ACM Transactions on Database Systems (TODS), 27(4):438–493, 2002. [9] K. Gajos and D. S. Weld. SUPPLE: Automatically generating user interfaces. In Proceedings of the 9th International Conference on Intelligent User Interfaces (IUI ’04), pages 93–100, New York, NY, USA, 2004. ACM. [10] M. Ghoniem, J. Fekete, and P. Castagliola. A comparison of the readability of graphs using node-link and matrix-based representations. In Proceedings of the IEEE Symposium on Information Visualization (InfoVis ’04), pages 17–24, 2004. [11] M. Graham and J. Kennedy. A survey of multiple tree visualisation. Information Visualization, 9(4):235–252, 2010. [12] C. Jacobs, W. Li, E. Schrier, D. Bargeron, and D. Salesin. Adaptive grid-based document layout. ACM Transactions on Graphics (TOG), 22(3):838–847, July 2003. [13] J. Jelinek and P. Slavik. XML visualization using tree rewriting. In Proceedings of the 20th Spring Conference on Computer Graphics (SCCG ’04), pages 65–72, New York, NY, USA, 2004. ACM. [14] B. Johnson and B. Shneiderman. Tree-maps: A space-filling approach to the visualization of hierarchical information structures. In Proceedings of the 2nd Conference on Visualization (VIS ’91), pages 284–291, Los Alamitos, CA, USA, 1991. IEEE Computer Society Press. [15] H. Kitagawa, M. Gotoh, S. Misaki, and M. Azuma. Form document management system SPECDOQ—its architecture and implementation. In Proceedings of the second ACM-SIGOA conference on Office Information Systems (COCS ’84), pages 132–142, New York, NY, USA, 1984. ACM. [16] K. Kowalzcykowski, A. Deutsch, K. W. Ong, Y. Papakonstantinou, K. K. Zhao, and M. Petropoulos. Do-It-Yourself database-driven web applications. In Proceedings of the 4th Biennial Conference on Innovative Data Systems Research (CIDR ’09), 2009. A newer version of this paper may be found at http://db.ucsd.edu/pubsFileFolder/ 319.pdf. [17] D. Król, J. Oleksy, M. Podyma, and B. Trawiński. The analysis of reporting tools for a cadastre information system. In Proceedings of the 9th International Conference on Business Information Systems (BIS ’06), 2006. [18] M. Levene. The Nested Universal Relation Database Model, volume 595 of Lecture Notes in Computer Science. Springer Berlin/Heidelberg, 1992. [19] J. D. Mackinlay, P. Hanrahan, and C. Stolte. Show Me: Automatic presentation for visual analysis. IEEE Transactions on Visualization and Computer Graphics (TVCG), 13(6):1137–1144, November/December 2007. [20] R. Mitchell, D. Day, and L. Hirschman. Case study: Fishing for information on the Internet. In Proceedings of the First Information Visualization Symposium (InfoVis ’95), pages 105–111, Los Alamitos, CA, USA, October 1995. IEEE Computer Press. [21] Q. V. Nguyen and M. L. Huang. EncCon: An approach to constructing interactive visualization of large hierarchical data. Information Visualization, 4(1):1–21, 2005. [22] R. Rao and S. K. Card. The table lens: merging graphical and symbolic representations in an interactive focus + context visualization for tabular information. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’94), pages 318–322, New York, NY, USA, 1994. ACM. [23] G. Robertson, K. Cameron, M. Czerwinski, and D. Robbins. Polyarchy visualization: visualizing multiple intersecting hierarchies. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’02), pages 423–430, New York, NY, USA, 2002. ACM. [24] B. Shneiderman. The eyes have it: A task by data type taxonomy for information visualizations. In Proceedings of the 1996 IEEE Symposium on Visual Languages, page 336, Los Alamitos, CA, USA, 1996. IEEE Computer Society. [25] M. Spenke, C. Beilken, and T. Berlage. Focus: The interactive table for product comparison and selection. In Proceedings of the 9th Annual ACM Symposium on User Interface Software and Technology (UIST ’96), pages 41–50, New York, NY, USA, 1996. ACM. [26] K. Tajima and K. Ohnishi. Browsing large HTML tables on small screens. In Proceedings of the 21st Annual ACM Symposium on User Interface Software and Technology (UIST ’08), pages 259–268, New York, NY, USA, 2008. ACM. [27] L. Wegner, S. Thelemann, J. Thamm, D. Wilke, and S. Wilke. Navigational exploration and declarative queries in a prototype for visual information systems. In C. Leung, editor, Visual Information Systems, volume 1306 of Lecture Notes in Computer Science, pages 199–218. Springer Berlin/Heidelberg, 1997. [28] F. Yang, N. Gupta, C. Botev, E. F. Churchill, G. Levchenko, and J. Shanmugasundaram. WYSIWYG development of data driven web applications. Proceedings of the VLDB Endowment, 1(1):163–175, 2008. [29] S. B. Yao, A. R. Hevner, Z. Shi, and D. Luo. FORMANAGER: An office forms management system. ACM Transactions on Information Systems (TOIS), 2(3):235–262, August 1984. [30] S. Zhao, M. J. McGuffin, and M. H. Chignell. Elastic hierarchies: Combining treemaps and node-link diagrams. In IEEE Symposium on Information Visualization (InfoVis ’05), pages 57–64, October 2005. [31] C. Ziemkiewicz, R. J. Crouser, A. R. Yauilla, S. L. Su, W. Ribarsky, and R. Chang. How locus of control influences compatibility with visualization style. In 2011 IEEE Conference on Visual Analytics Science and Technology (VAST ’11), pages 81–90, October 2011.