GeoSense
An open publishing platform for visualization, social sharing, and analysis of geospatial data.

Anthony DeVincenzi
B.F.A. Visual Communication, Seattle Art Institute 2007

Submitted to the Program in Media Arts and Sciences, School of Architecture and Planning, in partial fulfillment of the requirements for the degree of Master of Science in Media Arts and Sciences at the Massachusetts Institute of Technology

June 2012

© 2012 Massachusetts Institute of Technology. All rights reserved.

Author: Anthony DeVincenzi, Program in Media Arts and Sciences, May 11, 2012

Certified by: Dr. Hiroshi Ishii, Jerome B. Wiesner Professor of Media Arts and Sciences, Associate Director, MIT Media Lab, Program in Media Arts and Sciences

Accepted by: Dr. Mitchel Resnick, Chairperson, Departmental Committee on Graduate Students, Program in Media Arts and Sciences

Thesis Supervisor: Dr. Hiroshi Ishii, Jerome B. Wiesner Professor of Media Arts and Sciences, Associate Director, MIT Media Lab, Program in Media Arts and Sciences

Thesis Reader: Cesar A. Hidalgo, Assistant Professor, MIT Media Lab

Thesis Reader: Joi Ito, Director, MIT Media Lab

Acknowledgments

THANK YOU,

Hiroshi, my advisor, for allowing me to diverge greatly from our group's primary area of research to investigate an area I believe to be strikingly meaningful; for holding nothing back in critique, and for providing endless insight.

The Tangible Media Group, my second family, who adopted me as a designer and allowed me to play pretend engineer.

Samuel Luescher, for co-authoring GeoSense alongside me.

My thesis readers Joichi Ito and Cesar Hidalgo, for providing feedback, inspiration, and guidance over the course of this work.

The people of Safecast, who support an idea larger than what any one man could accomplish. You are truly inspiring.
Dávid Lakatos and Matthew Blackshaw, for our many adventurous projects to date, and for those to come in the near future.

Mom and Dad, for allowing me to explore my passions despite how inapplicable they may have seemed at times.

My family, and Jessica, for loving me. I learn from your patience.

My friends in Seattle, and around the world.

TABLE OF CONTENTS

Introduction
Related Work
  Contemporaries
Safecast
  A call for help
  Keeping quarters
Application Design
  Balancing simplicity and complexity
  Data mobility
  Summary of system
  Second order observation
  Data features
  Development timeline
Design Theory
  Geovisualization
  Aesthetics
  Spatial-temporal narratives
Process
  Concept
  Safecast worldmap (V1)
  Generalizing the platform (V2)
  User interface for data management
  GeoSense (V3)
  Spatial comments and chat
  Continued: Beyond the screen
Technical Design
  Server structure
  Amazon EC2
  Ubuntu
  Satellite & satellite API
  Architecture
  GeoSense Database
  Data import
  Aggregation and reduction through MapReduce
  Spatial indexing and grid queries
  Teamdata Database
  Application Structure
  Views
  Models
  Collections
  External Libraries
Challenges
  Data purity
  Performance
  Scale
  Custom instances
Use Cases
  Safecast
  Sourcemap
  The Lace Race
Results
Future Work
  Tile servers
  Expanded visualization types
  Models & mechanistic explanations
  Boolean conditions and spatially bound alerts
Conclusion
References
Appendix
  Tablet AR installation

Introduction

Throughout this document we refer to two projects: GeoSense, a visualization platform, and Safecast [1], a sensing and data collection organization. Their differences are described at length, as are their commonalities and shared resources.
ONWARD

Geovisualization is a common form of information visualization, or scientific data visualization, that when combined with visual pattern recognition increases human understanding and enhances the decision-making process around a given view of data [2]. Geospatial data has become abundant, and so have the many sensors that we use to collect it. With over 1.2 billion web and GPS enabled devices in our pockets [3], the amount of geotagged metadata, ranging from tweets to photos, has skyrocketed. As more data becomes coupled with geospatial coordinates, the intrinsic relationship between the meaning of the data and the place-in-space from where it came can be visualized, observed, and analyzed to inform decision-making processes. However, this poses a problem, as growing amounts of data become more and more difficult to parse and understand.

Today, the tools available for geospatial mapping remain highly specialized, with significant technical overhead that often outweighs the capabilities of the user. We use maps to codify the physical existence of immaterial media, and without accessible tools for visualization the meaning of data is lost in the columns and rows of spreadsheets. Further, the inability to quickly and simply create and share geovisualizations in a lightweight manner has slowed the evolution of sharing and collaboration in GIS [4].

How could a community, a university, or an entire industry benefit from having the complexity of geospatial data visualization reduced to that of email, or a single tweet? To be more specific, what if we could seamlessly share and engage with social features such as comments and live interaction around geospatial data? We believe that empowering users with the tools necessary to construct visual and social narratives around contextual data will enhance their collective ability to respond to current events while simultaneously planning for the future.
To achieve this we must first build a platform that can interpret the many disparate forms of data and enable them to co-exist in a single unified visualization. Without this tool, our data and voices are left in singular silos, never able to engage and interact with the voices of many. The visualization may take a number of forms, two- or three-dimensional, varied in aesthetics per the author's discretion yet constrained within a sandbox so as to guide the user - in short, not too much control, but not too little. A simple interface for sharing and socializing the new geovisualization invites multiuser collaboration, where each user may contribute to and discuss the current datasets, supporting the claim that the shared knowledge of many users is often more valuable than that of one [5]. Finally, data pertaining to user interaction involving comments, tweets, and physical location may be aggregated to create a second order dataset, which in turn may be incorporated into the visualization for communal behavior analysis.

GeoSense aims to provide such a tool, where the user can perform the tasks of both the visual artist and the data analyst, all while contributing to the shared cognition and collective intelligence of a broader community. GeoSense is an easy-to-use web-based platform for the organization and upload of multiple datasets, a framework for 2D and 3D visualization, and a suite of social and analysis tools. GeoSense explores generating visual correlation models based on data layering and the aggregate of community analysis in lieu of unified theories or known mechanistic explanations.

After the 311 disasters in Japan involving the Tōhoku earthquake, tsunami, and Fukushima Daiichi reactor meltdown, the community was left with little information about the outcome of the crisis. The public struggled to obtain answers to even the most basic questions: "Is it safe for me to stay in my home?" and "Is my food safe to eat?"
Thousands rose to aid, and amongst the responders was Safecast, an independently organized crowd-sourced mapping network. Despite the great amount of information and data that was collected, there was no clear path towards displaying, juxtaposing, and discussing the multivariate sources of critical information. GeoSense was founded to support the efforts of Safecast and the many communities of Japan.

Related Work

We have, for hundreds of years, refined our use of visual language in the art of data visualization. As early as the 18th century, men such as Joseph Priestley, an English theologian and academic, had begun exploring the graphical representation of statistical methods through what is believed to be one of the earliest implementations of a timeline, designed to illustrate the contemporaneity of ancient philosophers and statesmen [44]. Around the same time William Playfair, a contemporary of Priestley, debuted what are believed to be the first known instances of bar and pie charts in his two books The Commercial and Political Atlas and Statistical Breviary, respectively. These early explorations laid a foundation upon which nearly three hundred years of related work has been conducted.

In more contemporary times, an innumerable amount of work has been done in the field of data visualization, much of which stems from the foundational work of Edward Tufte and his many visual definitions described in "Visual Explanations" [6]. Tufte's seminal work in visual explanation and analysis has provided the foundation for an even wider field of informational graphic design: a notable trend covering a massive spectrum of content ranging from visualizations for geospatial data [7] to social and emotional observations through data analysis [8].
Figure: "Exports and Imports to and from Denmark & Norway from 1700 to 1780." One of the first time series graphs: William Playfair's trade-balance time-series chart, published in Political Atlas, 1786.

In the area of visualization for geospatial applications, much work has been done by the GIS community to provide tools which allow for the exploration and visualization of location-based datasets. Of these, Google Earth [9] and NASA World Wind [10] have been widely adopted as platforms for plotting sets of data ranging from tracking glacier footprints [11] to the displacement and distribution of refugees located in remote areas of the world [12]. This wide application space is evidence of the versatility of a three-dimensional globe for displaying both the context and the meaning of data.

Among tools built specifically for spatial data analysis, both ArcGIS [13] and ESRI MapIt [14] claim to provide "easy online discovery, access, visualization, and dissemination of geospatial information." Both services offer an extensive suite of data visualization and analysis tools, though neither provides a suitable framework for control over dataset comparison beyond basic layering, and both are constrained to two-dimensional viewports. Similarly, community crisis mapping tools such as Pachube [15] and Ushahidi [16] allow users to take much of the foundational mapping work done by the aforementioned sources and add specific features related to disaster relief.

In the space of tools and research for geospatial data comparison, analysis, and theoretical model generation, significant work has been done by Floraine Grabli et al. with Automatic Generation of Tourist Maps, where the salience of map elements is determined using bottom-up vision-based image analysis and top-down web-based information extraction methods [17].
The technique of selective visualization with respect to geography and locational data is an important step towards identifying how to present visual data based on the user-specified variables of interest within the data. Further, work by Jeffrey Heer and Michael Bostock of Stanford University has explored how to leverage crowd sourcing to generate a visual analysis of raw data in "Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design" [18].

Contemporaries

Web-based authoring tools for generating geovisualizations have become more prominent in recent years, offering an assortment of services that help online visitors create custom visualizations. Of them, the following are most related to GeoSense:

GEOCOMMONS

Most notable is GeoCommons, a public community of GeoIQ users who are building an open repository of data and maps [19]. GeoCommons has a number of similarities to GeoSense, namely that users are given an interface to assist in the upload and treatment of geospatial data, as well as a data repository shared amongst users. While many of GeoCommons' features are thoroughly implemented, including an impressive level of control over data layering through boolean operations, there remains little to no social infrastructure beyond the ability to share on Facebook or Twitter.

HARVARD UNIVERSITY'S WORLDMAP

Similar to GeoCommons, yet slightly smaller in scale, is Harvard University's WorldMap project [20], which invites its users to "build [their] own mapping portal and publish it to the world or to just a few collaborators." WorldMap offers a complex and configurable user experience, giving users the ability to handle multiple sets of layered data atop an assortment of base map tiles. As with GeoCommons, Harvard's WorldMap has no true infrastructure for communal dialog and analysis.

MAPBOX

MapBox [21] is a simplified toolkit for publishing static geovisualizations.
Their clean aesthetic and well-designed native authoring platform, TileMill [22], stand out as best in class regarding user interface and experience. The MapBox tools are less suited for community map building and better fitted to creating attractive visualizations as an embed or stand-alone site.

MANY EYES

Finally, the democratization and socialization of data visualization has been explored by Fernanda Viegas et al. in "Your Place or Mine? Visualization as a Community Component" [23], where a number of studies were conducted in order to "enable the use of visualization technology by lay users and to facilitate communication around the visualizations via tools for annotation, sharing and discussion." Many Eyes does not focus on geovisualization and instead explores community dialog around common data graphs such as bar and pie charts.

Safecast

GeoSense serves as the visualization engine for Safecast.org: a non-profit collective of hackers and humanitarians who are actively crowd sourcing radiation mapping from the 3-11-11 Daiichi reactor meltdown in the Fukushima prefecture, Japan.

A call for help

On Friday, the 11th of March 2011, Japan suffered a national catastrophe now known as the 311 Earthquake. At a staggering magnitude of 9.0 (Mw) [24], the off-coast earthquake was the most powerful ever to affect Japan and amongst the most powerful ever recorded [25]. As a result of the undersea epicenter, a series of tsunamis was triggered, generating waves which were seen to reach as high as 130 feet. Amongst the tragic and catastrophic loss of life (~15,000), injury (~26,000), and property destruction (~129,000 buildings) [26], the damage caused by the tsunamis put into motion a chain of events which would lead to the eventual equipment failures, nuclear meltdown, and subsequent leakage of radioactive material from the Fukushima I Nuclear Power Plant (referred to as Daiichi).
Rated as a level 7 catastrophe on the International Nuclear Event Scale (INES) [27], the Fukushima I meltdown was the largest nuclear incident since the 1986 Chernobyl disaster [28]. Estimated economic losses skyrocketed into the tens of billions [29]. While no factor could outweigh the tragic loss of life, a full recovery and an ensured healthy future for the country and its inhabitants quickly became Japan's main focus. It was during this time, seemingly moments after the beginning of this tragedy, that Safecast was formed.

Safecast is a global organization working to empower people with data, primarily through building sensor networks that enable both contribution to and free use of the data collected. After the 311 earthquake and the resulting nuclear situation at Fukushima Daiichi, it became clear that people needed more data than what was available. Since the post-311 formation of Safecast, the team has grown to a dedicated core team and over 150 supporting volunteers. It has recently received grants from the John S. and James L. Knight Foundation and has, to date, deployed over 150 handmade radiation sensors with a measurement aggregation of over 2 million individual readings [30]. Safecast is almost certainly the single largest source of radiation data in Japan, if not the world, all of which is open and available under CC0 dedication [31].

GeoSense, as a project and platform, was born out of the necessity for Safecast to make visible its growing collection of data, and quickly evolved into a larger study which aims to redefine the relationship between community-driven datasets and the democratization of geovisualization and analysis.

Keeping quarters

On March 22nd, 2012, we held a meeting at the Tokyo Hacker Space to discuss the current state and future needs of GeoSense as it pertained to Safecast. The following day, a demonstration of the data and its visualization was given at the Roppongi Hills art night, part of the Mori Art Museum, in Roppongi, Tokyo, Japan.
During this event, numerous members of the audience shouted out, uncharacteristically for Japanese culture, and declared their need for unfettered access to this critical data. "They tell lies," one woman exclaimed from the audience, "they don't want us to know what's really happening and you're the only ones who know the truth!" We can only assume "they" refers to the local government or TEPCO, the power company responsible for the Fukushima reactors [32]. Regardless of political or conspiracy beliefs, one year past the 311 incident the cry for help was as clear as ever. During the event we presented a recap of the previous 12 months, announced that at least 2 million data points had been collected, demonstrated the GeoSense visualization platform, and presented a musical synthesizer which generated interpretive music related to the ambient radiation around it.

The following day a press conference was held at The Fab Cafe in Shibuya, Tokyo. Members of the press were invited to attend and learn about the achievements of Safecast to date. We again announced the 2 million data points collected, the GeoSense platform, and an exciting new Safecast Geiger counter built entirely by Safecast team members. The press, many of whom represented major Japanese outlets like NHK and TBS, had inquiries about the mapping platform. Questions such as "What do the colors mean? Is red dangerous? Is green safe? How can I tell who collected the data? What about data that is incorrect or malicious?" were the most common. The answer, of course, was that much like our data, our visualization engine would be as agnostic as possible - meaning that all variables, from data type to data display, would be fully customizable. Our answer in short: "We are not presenting conclusions, only an observational platform from which you may draw your own."
Application Design

Balancing simplicity and complexity

The most fundamental design principle behind GeoSense is to provide simplicity and legibility where complexity and confusion exist. In order to produce a usable platform with the greatest amount of user coverage and rich feature depth, it has been carefully designed to promote ease of use from the API to the UI. However, this does not diminish the need for a tool which provides even the most seasoned data analysts with new and actionable insights. To address this, GeoSense scales gracefully depending upon its user's specific needs; a simple geovisualization can quickly grow into a deeply insightful tool for analysis through a series of complex spatial-temporal queries across any number of datasets.

We believe that there is value in large data analysis in place of known data models, as was philosophically described by Nobel prize winner Philip Anderson in "More Is Different" [33] and further explored by the entirety of the contemporary big data movement. Rather than incorporate complex, computationally expensive algorithms to understand, interpolate, or predict model behaviors, GeoSense instead invites the community to act as a source of analysis, utilizing human intuition and natural pattern recognition to detect occurring phenomena. This is not to say there isn't inherent value in known models; it is, however, a different approach which lends itself to a level of accessibility and friendliness which may in turn better serve a large community.

Finally, GeoSense makes use of multiple open technologies, all of which contribute greatly to the usability of the platform. Only 5 years ago, offering a service at this scale would have come with astronomical cost, requiring dedicated physical servers, a team of engineers, and client-side computing power that just did not exist. Open source software efforts and blossoming internet communities cannot be thanked enough.
Data mobility

All data brought into or authored within GeoSense is stored, managed, and served by the GeoSense Satellite RESTful API. The GeoSense application invites users to explore different dimensions and parameters of their datasets, both spatial and temporal, providing a suite of tools which acquire their parameters via the API. In fact, any map or source of data may be used outside the GeoSense application ecosystem. For example, should a user wish to develop their own front-end application or integrate datasets into another service, the Satellite API provides sufficient scaffolding and endpoints to do so.

Summary of system

GeoSense is an open platform for the comparative and cooperative visualization of geospatial data. It is fundamentally different from similar platforms that aim to provide complex GIS mapping tools and as a result are often weighed down by a cumbersome feature set. GeoSense aims to provide the highest level of simplicity by carefully considering the average ability and limited prior knowledge of users with regard to GIS systems. In order to build such a system, special consideration has been given to the UX. A vast majority of first-world internet users are equipped with geospatially aware devices and platforms such as Google Maps and Bing, which have bolstered awareness of cartographic interaction. GeoSense therefore comes at a time when the user has already acquired familiarity with mapping concepts and is in prime condition to begin authoring.

The system manifests as a web application available publicly at http://geo.media.mit.edu, where any user can, within seconds, acquire a boilerplate visualization template to which they can upload or link geospatial datasets. We believe that geospatial data is best understood collaboratively, as was explored by Viegas et al. in 2007 with Many Eyes [5]. To promote social behavior, a single user's map is made easy to share, as it belongs to a unique URL address.
Maps can be shared through integrated social outlets such as Twitter and Facebook, or more traditionally through an email or text link. To promote multi-user collaboration, all maps are generated with a public and a private short URL (for public viewing and for administration, respectively) which can be used to access the visualization platform. A map accessed through a specific URL allows for user annotation and commenting, both on specific data points and on general location coordinates. Users are also made aware of other current collaborators and their general whereabouts in the context of the map. To elaborate, the entire map is a chat room and message board in which invited users may co-author and analyze data. These features are explained in greater detail throughout this document.

GeoSense provides a remarkably simple platform for visualizing multiple disparate sources of geospatial data. In parallel, it also provides a suite of tools for collaboration and data insight which have, to date, not existed in well-executed form. GeoSense is built specifically to serve users whose main skills are not computer science or design, but who are curious about geospatial analysis and appreciate beautiful presentation design.

Second order observation

By exposing user behavior in the context of the geography from where users originate, alongside their areas of interaction, a second order observation can be described. Specifically, for geospatial data and geovisualization, the place in space where the viewer or author exists may have special relevance to the data they are investigating - both at the individual and the community level. To explore this concept, each instance of GeoSense keeps track of where its users originate from, where (and if) they leave geospatial comments, and how they interact within the integrated chat room.

Data features

Data representation is highly variable within GeoSense.
It is left up to the map's author to select the visual style, though GeoSense maintains predefined data point aggregates for large or extremely dense datasets. Data may be explored interactively by clicking on either a cluster of aggregated data or an individual datum. Meta information associated with the specific data is then revealed in geospatial context, assisting the user in better understanding the information with which they are interacting. We discuss in depth the visual and computational considerations of visualizing data features in the Design Theory chapter.

Development timeline

GeoSense was developed over a six-month period, all of which was spent in close collaboration with Safecast. To serve the active Safecast community and to prepare GeoSense for growth into additional communities, milestones vary from summit meetings in Japan to periods of presentation at the MIT Media Lab. This timeline reflects GeoSense's development, as well as its future plans and iterations:

Figure: Development timeline, October 2011 to May 2012. Milestones: Conception; V.1 Safecast Worldmap; Research & Meetings; V.2 Development; Tokyo visit; V.3 GeoSense; Deployment.

Design Theory

"The world is complex, dynamic, multidimensional; the paper is static, flat. How are we to represent the rich visual world of experience and measurement on mere flatland?" Edward Tufte [34]

Geovisualization

Producing an effective visual representation of multi-layered information atop a map or any cartographic medium poses a torrent of potential complications. For every condition that produces a desirable result, one hundred new complications may reveal themselves, generating information-less patterns as a byproduct of their presence. As explained first by Josef Albers and underscored later by Tufte, the conundrum is that 1 + 1 may often equal 3 [35], where the byproduct of the initial variables produces an additional, distinct condition - adding to the visual complexity.
Figure: As described by Albers, the combination of one or more shapes may produce a third shape (shown in red) as their byproduct.

To address this, we employ a number of techniques, both aesthetic and computational, that address the needs of user-generated geovisualization. The key features we consider are:

1. Mindful representation of multivariate information layers drawn across both two- and three-dimensional planes.
2. Dynamic data densities where the application state (or UI) informs the visual output.

GeoSense is faced with a number of challenges when representing geographic data within the user interface. Aside from standard complexities that arise from visualizing large data, such as information density, other conditions must be considered when we investigate the user's interaction with the data. It is blunt and inefficient to show all data, as visual comprehension begins to suffer as the amount, or more specifically the density, of visualized data increases. Overabundant or incomprehensible arrangements stem from failures in design rather than from the information itself, regardless of magnitude. To address this complication, we employ the well-known tactic of fitting a grid of bounding boxes against the map, into which data is aggregated in relation to the user's visible viewport. The grid is dynamically generated and sized.

Figure: Left: A computationally generated interpolation of radiation levels. Right: A choropleth map showing population density by prefecture near Tokyo, Japan. Neither image was produced with GeoSense.

Many geospatial visualizations have addressed this, either for visual or computational simplicity, by averaging the number of occurrences into a known cultural boundary. For instance, population density is often visualized as a choropleth map [figure above] where a polygon shape defines the state boundaries and all data within the given bounds is displayed as a single hue
This technique often misleads the viewer, as the data within the bounded areas is not nearly as uniform as the visualization suggests. A similarly misleading tactic is to attempt averaging information over a given space. Computational interpolations [figure above], while often making the visualization seem denser and perhaps more visually compelling, do little more than generate an unqualified visual representation and, in the case of Safecast's radiation dataset, produce extremely misleading conceptions regarding the data's meaning. Interpolations are effective when attempting to predict or model the behavior or future state of a dataset, especially in the case of trajectory over time and space.

Aesthetics

The shape, color, and size of visualized objects are carefully considered, as the shape of an object is optically tethered to the geography on which it rests. For example, a single data point may represent one particular point in space, but to show it as a single pixel on a map is sometimes misleading. Showing the data point as a 10x10 pixel box instead suggests that it corresponds with an area of space on the map rather than a single point. Likewise, the visual change of data must coincide with adjustments in the map zoom level; if a datum does not change its size parameter as zoom is adjusted, the user will perceive the shape size to have no geographic binding in relation to the geo coordinates of the map. This is well illustrated in modern mapping tools such as Google Maps or OpenStreetMap, where the map tiles change resolution with respect to the user's perceived distance from earth (zoom).

Additionally, the color of data plays a critical role in both the visual legibility of each data point and the intent expressed by the visualization. For example, the question continually arises whether or not certain types of data, radiation in our case, should be colored or have a fixed color scale.
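To make the zoom-bound sizing and the viewport grid described under Geovisualization more concrete, here is a minimal sketch in Python. It is not GeoSense's implementation: the function names, the `cells_at_zoom_0` parameter, the rule that cell size halves with each zoom step, and the sample readings are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cell_size_degrees(zoom, cells_at_zoom_0=32):
    """Edge length of one grid cell in degrees.

    Assumption: the grid halves its cell size with every zoom step, the way
    web map tiles do. cells_at_zoom_0 is an arbitrary illustrative choice.
    """
    return 360.0 / (cells_at_zoom_0 * 2 ** zoom)


def aggregate_to_grid(points, viewport, zoom):
    """Bin (lon, lat, value) points that fall inside the viewport.

    viewport is (west, south, east, north) in degrees. Returns a dict that
    maps a (column, row) cell index to the count and mean value of its points.
    """
    west, south, east, north = viewport
    size = cell_size_degrees(zoom)
    sums = defaultdict(lambda: [0, 0.0])  # cell -> [count, running total]
    for lon, lat, value in points:
        if not (west <= lon <= east and south <= lat <= north):
            continue  # outside the visible map, so never drawn
        cell = (math.floor(lon / size), math.floor(lat / size))
        sums[cell][0] += 1
        sums[cell][1] += value
    return {
        cell: {"count": count, "mean": total / count}
        for cell, (count, total) in sums.items()
    }


# Invented sample readings (lon, lat, value); not real measurements.
readings = [
    (139.70, 35.66, 0.10),
    (139.71, 35.67, 0.30),
    (141.03, 37.42, 5.00),
    (-122.33, 47.61, 0.05),  # far outside the viewport below
]
viewport = (138.0, 34.0, 142.0, 38.0)

coarse = aggregate_to_grid(readings, viewport, zoom=3)   # zoomed out
fine = aggregate_to_grid(readings, viewport, zoom=10)    # zoomed in
```

In this sketch the two nearby readings share one cell when zoomed out and separate into their own cells when zoomed in, which is the behavior the text asks of a grid that is "dynamically generated and sized". Each cell keeps a count alongside its mean so that a renderer could still signal how many readings stand behind an aggregate.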
The most common example is a linear hue shift from green to red. In western culture green is widely accepted as safe, whereas red is understood as dangerous. In Japan, by contrast, the color red represents heroism and love, and is a positive visual indicator on the Tokyo Stock Exchange. Further, how does one normalize scale to color where the range value is either user generated or chosen arbitrarily? Non-linear value distributions add further complexity to representing data with a hue shift, and often need to be represented on a logarithmic scale.

It was decided early on that the potential harm in suggestive coloring, especially within critical datasets like radiation, outweighs any aesthetic benefit. To address these concerns GeoSense gives the user complete control over data representation; the choice of whether data is represented as a single pixel, relatively sized circle, or bounding box, as well as single or hue-shifted color, is completely customizable.

Figure: A view of bold, brightly colored shapes atop a dark tonal map. Blue dots represent earthquakes sized by magnitude. Red dots represent nuclear reactors sized by power.

By default, the application promotes bold color set against dark, tonal map tiles, which best suit the type of data uploaded. To do this, we borrow a page from Swiss cartographer Eduard Imhof's first rule of color composition:

Pure, bright or very strong colors have loud, unbearable effects when they stand unrelieved over large areas adjacent to each other, but extraordinary effects can be achieved when they are used sparingly on or between dull background tones. "Noise is not music ... only on a quiet background can a colorful theme be constructed." [36]

GeoSense addresses the multivariate nature of geospatial visualization by combining the proper amount of end user control with system constraints, in turn addressing the technical, artistic, and cultural complications that arise.
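The difference between linear and logarithmic normalization for a hue shift can be sketched briefly. The Python below is an illustration only, not the GeoSense color code: `normalize`, `hue_shift_color`, and the default hues are hypothetical names and values, and the start and end hues are left as parameters because the text leaves that choice to the map's author.

```python
import colorsys
import math


def normalize(value, vmin, vmax, scale="linear"):
    """Map value into [0, 1] over the range [vmin, vmax]."""
    if scale == "log" and vmin <= 0:
        raise ValueError("a logarithmic scale needs a positive vmin")
    value = min(max(value, vmin), vmax)  # clamp out-of-range values
    if scale == "log":
        low, high = math.log10(vmin), math.log10(vmax)
        return (math.log10(value) - low) / (high - low)
    return (value - vmin) / (vmax - vmin)


def hue_shift_color(value, vmin, vmax, scale="linear",
                    start_hue=120.0, end_hue=0.0):
    """Return a hex color whose hue slides from start_hue to end_hue.

    Hues are in degrees. The defaults (green to red) only mirror the common
    example discussed in the text; an author would pick their own.
    """
    t = normalize(value, vmin, vmax, scale)
    hue = start_hue + t * (end_hue - start_hue)
    r, g, b = colorsys.hsv_to_rgb(hue / 360.0, 1.0, 1.0)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )
```

With a range spanning four orders of magnitude, say 0.01 to 100, a value of 1.0 sits at the midpoint of the logarithmic scale but within the first one percent of the linear scale, so on a linear ramp nearly every low reading would collapse into the same hue. That is the practical reason skewed distributions tend to call for the logarithmic option.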
The figure below describes the three primary methods of data representation and their literal to representational qualities:

Figure: Pixel, Box, and Circle, arranged from LITERAL to REPRESENTATIVE. Different rendering techniques used by GeoSense, from left (most literal) to right (most representative), with their corresponding visualizations below.

Spatial-temporal narratives

In addition to the two and three dimensional canvases on which GeoSense displays information, a fourth dimension for time has been implemented through a time series graphing system. Data sources containing temporal attributes may be explored alongside their geophysical attributes in shared context. In order to expose the value of a dataset's time quality, each datum is sequenced in time, spaced by uniform time intervals. Coupling the spatial dimensions of the map viewport with the temporal sequence of the series graph deepens an onlooker's understanding of the dataset's depth. By reducing the complexity of the data into two understandable and intrinsically related parameters - time and space - an equally interactive and elegant view in all four dimensions is made tangible.

Figure: Top: Earthquakes shown with both geospatial and temporal analysis. Bottom: Arrangements of series graph display types - bar chart, scatterplot, area, sparkline.

Because data properties such as color and shape are selected at the data management level, parameters are synchronized across all visualization mediums: a series of red dots for earthquakes on the map will display as a red time series line on the graph. Additionally, temporal data may be explored through a number of time based graph techniques that currently include scatterplot, line, area fill, and bar chart.

Figure: SPACE and TIME planes. The space (xy) plane represents the user's current viewport. It is defined by a constraining latitude/longitude and zoom level.
The time (z) plane displays selections of a data set based on occurrences within a given time constraint. In this case, we show a selection between t1 and t2.

Users may also find interest in further exploring subsets of data through the time graph and can easily do so by interacting with a number of UI features allowing for time-range adjustment and on-graph annotation. The above figure describes the spatial-temporal relationship between the user's view of the time series graph and the visible geospatial viewport. As a user interacts with the time constraint controls, in this case t1 and t2, the data shown on both the map and the graph is constrained to the new parameters.

Process

GeoSense began with a simple concept: making the simplest, most friction-free experience for mapping geospatial data, with special attention to social collaboration and data analysis. Moreover, this tool should allow for the effortless creation of data and model mashups that expose insight into the meaning of the data. The initial goal was largely unconstrained in its definition and by design was allowed to grow and evolve as certain points of development were reached.

At the time of conceiving the idea, a number of related projects had recently been completed by members of the core team. For example, at least three large scale geospatial projects had taken form, all of which required us to build custom geospatial visualizations. These projects, Peddl [37], Place Pulse [38], and Sourcemap [39], provide deep insights into the complexities of design and implementation for custom made geospatial visualization where the datasets were both community driven and dynamically updating.
Figure: Left: A view from the Peddl marketplace. Middle: A view from a Place Pulse visualization. Right: A view from Sourcemap.com.

To begin, GeoSense was prototyped as a wireframe concept to assist with identifying the UI/UX foundation from which to build the service. These early prototypes explored different arrangements of user interfaces that, if implemented, would serve as the app's foundation. Early wireframes borrowed a common design pattern found in applications such as Google Maps, where the leftmost column of the screen is delegated to content related to the right column, which takes up nearly two-thirds of the total real estate with a geovisualization.

Concept

The wireframe prototypes proposed three key features for the GeoSense platform: 1) a GUI with the map as the locus of interaction, 2) a simple management interface for adding and subtracting data, and 3) layers of interactivity atop the map object that expose features to the users. Some of these features were defined as the ability to comment on geospatial coordinates as well as building 'if-this-then-that' [40] style queries around the active datasets.

Figure: An early illustration demonstrating the split, two column real estate, the ability to add data, as well as a three-dimensional global view.

Figure: Demonstration of an early "if this, then that" geo-bound condition. This feature was later removed for the release version of GeoSense and is further discussed in the Future Work section.

As is common in prototype design, a significant amount of time was spent on iterating the UI and UX in the form of a visual storyboard, with any development or system engineering postponed until the first "functional prototype".

Safecast worldmap (V1)

Upon completion of the GeoSense wireframe prototype, a production version implementing the Safecast dataset underwent development.
Understanding that the application was going to be deployed periodically to a large user base, the development of experimental features was put into a sandbox, forked from the original repository, so that two instances of GeoSense could simultaneously exist: one for public viewing at http://blog.safecast.org/worldmap and another which would eventually become GeoSense.

With the Safecast worldmap, referred to as version one, only the most fundamental features were developed, while a small amount of visual design and aesthetic polish was applied over the entire application. Initial features included the ability to show or hide the Safecast mobile dataset as well as a choropleth map of Japanese population averages per prefecture. Core features such as geospatial search, basic map controls, multiple map themes, and social sharing were also implemented.

Figure: A view of the Safecast Worldmap showing data aggregation across the island of Japan.

During this stage, the data being shown was populated from ten different Google Fusion Tables, each of which held an aggregated granularity of data dependent on zoom level. The tables were mapped to the user's zoom level within the application such that as the user clicked "zoom in" or "zoom out" the tables would be queried to render the respective height (100m, 1000m, and so on). Each table contained specific KML data which defined a 4 point geographic box. The benefit of rendering tiles from a dedicated map server became immediately obvious, as the amount of client-side computation involved in displaying 10,000+ data points in a single view outmatched the capabilities of a JavaScript-based approach. Justification for and against the use of tile servers is explored further in the Technical Design section.
Figure: A closer view of the Fukushima area, showing the 20km evacuation radius and a finer data resolution.

Zooming in on the Fukushima prefecture revealed the 20km evacuation zone, as well as a higher granularity of data points. Clicking on any individual data point, or cluster, would reveal information such as CPM (counts per minute) and uSv/h (micro-Sievert per hour) pertaining to that specific set of data.

Version one of the Safecast Worldmap was live between February 15th, 2012, and May 11th, 2012, during which time it received more than 10,000 unique visitors. Significant feedback was received both from the Safecast and public community. The general sentiment was that the Worldmap was impressively simple, easy to comprehend, and a step in the right direction.

Generalizing the platform (V2)

The second iteration of the platform, referred to as version two, began with a complete rewrite of the application structure, as will be outlined in the Technical Design chapter. Version one had been built as a standalone application, more akin to an advanced prototype: functional enough to garner interest and insight, but without the fundamental framework required for additional features. With a number of new members joining the development, version two quickly took on a much more structured framework with a specific focus on speed of development.

V2 takes a step back from Safecast, and a step towards generality. Rather than build features specifically pertaining to the radiation dataset, efforts were spent building a platform that would extend the simple power of the Safecast Worldmap to any and all users who had their own types of geospatial data.

Figure: The current user interface for reviewing recently added data.
Users are given the ability to select which columns represent the necessary attributes of location and intensity.

USER INTERFACE FOR DATA MANAGEMENT

Many of the design and engineering cycles during version two were put into creating a seamless experience for users to add their own data. This required a workflow that would assist users in preparing their data such that it could be understood and interpreted by our system. To that end, an "add data" wizard was developed, where users were instructed to attach a data file either by uploading from their file system or by URL link. Once the data had been received by GeoSense, it was parsed and displayed back to the user as a table of columns and rows. To identify the data headers, the user is instructed to drag and drop labels onto the columns. Properly imported datasets are represented in the system in the left column, where they are shown inline with additional data sources. This visual group, or "accordion" component, allows users to easily manage, edit, or remove the current data sources.

Figure: The data management panel. Showing from left to right - initial view, data library browser, and expanded controls for added data sources.

An important advancement during V2 was the introduction of data models that existed separately from the visual representation. All data added into the system is pushed into a remote database where it is stored and made accessible through a public API.
While this ultimately means that all data in GeoSense could be re-visualized elsewhere by a third party, it also means that the application can easily iterate through different types or methods of visualization. V2 began exploring this by introducing a modal switch which toggles between "Flat Map" and "3D Globe". Clicking the toggle changes the display type and automatically rebinds the active data models as appropriate for each visualization.

GeoSense (V3)

The third version of the product brings the first actual instance of a "GeoSense" in its entirety. Wrapped with an additional layer of instruction and messaging, GeoSense becomes less an experimental application and more a widely available consumer service website.

Figure: The landing page for geo.media.mit.edu, inviting users to create a map by entering a name and clicking the prominent create button.

Figure: Left: Coastal Japan showing earthquakes, nuclear reactors, and coastal flooding. Right: A view of Asia showing earthquakes alongside a time series graph. Bottom: A view of the WebGL 3D globe.

V3 features a homepage instructing users how to create their own mapping sandbox. The homepage also features a number of community insight tools such as "recently created maps" and "recently added datasets". Currently, all data stored on GeoSense is made publicly available. Likewise, maps created on GeoSense are publicly viewable, though only users with a special admin URL are able to manipulate or add associated data.

SPATIAL COMMENTS AND CHAT

The release version of GeoSense also incorporates a number of critical features which add to the value of community input and collaboration around specific maps. A simple UI feature for leaving geotagged comments or commenting directly on a data point is provided. This familiar interface, akin to leaving a comment on YouTube or Facebook, invites users to leave annotations in direct spatial context.
Similarly, a set of "on/off" toggles allows users to see the physical location of users currently viewing their map as well as the geo locations of all past contributors and editors.

Figure: Two users converse about the safety levels of the Safecast radiation dataset in reference to their residences. Comment bubbles on the map create links to and from the chat window, which users may use to specify a specific geo coordinate.

During this phase, GeoSense underwent a complete API overhaul, from basic restructuring of naming conventions to complete refactoring of routes. The API was generalized and cleaned to improve workflow for the development team as well as to prepare for community wide usage. Methods for data uploading, parsing, and aggregation were greatly enhanced during version three, which will be more fully detailed in the Technical Design chapter. GeoSense version three, undergoing active development at the time of writing, will serve as the platform from which the project will continue to evolve and also mirrors the state of recent releases, posted at GitHub (http://github.com/tonydevincenzi/geo).

CONTINUED: BEYOND THE SCREEN

An obvious benefit of developing for web accessibility is the vast number of devices that can access the full range of the application. To test extensibility, we developed an iPad application that, with a simple wrapper around WebKit, allows for full functionality on an iPad tablet device. To complement the form factor and push the boundaries of how to present the project in situ at the MIT Media Lab, the GeoSense team developed a suite of technologies to transform the entire platform into an experimental augmented reality installation. Featuring a full sized physical globe, the installation gives users tablet devices as instruments to explore data on and around the tangible earth. Moving the globe rotates the data accordingly, as does moving the tablet device around the space.
This exciting exploration creates questions about how to best represent virtual geospatial data tethered to a physical object, and what other user interaction scenarios may emerge in the future.

Technical Design

The GeoSense technical implementation is best described by outlining the underlying frameworks for the server, app, and web service respectively. A large number of framework services have been employed, iterated on, removed, and revised during the development of GeoSense. The current technology stack is by no means the most practical or scalable implementation, but is perhaps most fitting as it is built entirely atop open source platforms whose ethos aligns with the goals of GeoSense and Safecast.

Server structure

AMAZON EC2

The GeoSense web service, code named "Satellite", is hosted on an Amazon EC2 instance. Amazon EC2 was chosen for its ability to scale to meet increased load demands as the service grows in size. It is also heavily adopted and well documented by the contemporary web development community.

UBUNTU

The server runs an instance of Ubuntu Linux, a Unix-like operating system that, much like Amazon EC2, has a thriving community of developers who have documented the many ways of "rolling a server" to your own specifications. Satellite can run on any Unix-based operating system and is completely managed and deployed through terminal configuration.

Satellite & satellite API

ARCHITECTURE

NODE

The Satellite web server is a Node.js based application. Node.js is a JavaScript runtime for writing scalable internet applications, most commonly web servers [41]. Node uses event-driven, asynchronous I/O for improved scalability and reduced infrastructure overhead. Unlike the majority of JavaScript-based programs, it is executed 'server side', the benefit of which is a close coupling of language and method between server-side and client-side rendering.
In the case of GeoSense, this was an obvious benefit, as a number of the application's features mix server-side and client-side rendering techniques. Node comes coupled with npm, the Node package manager, a standalone manager for installing a community curated collection of "modules" that extend the basic functionality of Node. GeoSense uses the following major NPM packages:

EXPRESS / CONNECT

A fast and small server-side JavaScript web development framework with features including routing, session support, cookie handling, and logging.

MONGOOSE

An object modeling tool designed to work in an asynchronous environment, making integration with MongoDB pleasant and straightforward.

NOWJS

An implementation of web sockets (via socket.io) and node-proxy libraries for real-time communication and live updates between users.

GEOSENSE DATABASE

MONGODB, GIS

For data storage and management, MongoDB [42] (from humongous) is used as the central data repository. Mongo is a NoSQL database, meaning that it stores data as JSON-like documents with dynamic schemas. Table-free database architectures are known to be faster for certain types of applications. Mongo includes a number of crucial libraries referred to as "MongoGIS" that are optimized for geospatial data operations. These operations are central to data storage and retrieval within GeoSense. For example, Mongo makes it easy to index and quickly return search results for complex queries such as "average the 500,000 points closest to my location where value is never higher than 5".

Data import

Importing data is handled by the server once the client has specified and uploaded a suitable datatype. GeoSense currently supports XML, JSON, and CSV datatypes. Once a file has been posted to the server, it is put through a process which cleans and standardizes the import.
Each line of the data source is read in linear order, and each column or property is then transformed into a field within our associative MongoDB collection. Original conversions of the uploaded data are kept as a collection prefaced with o_ in the active database. As the document is being parsed, transformed fields are asynchronously dumped into a master collection that houses all uploaded points within GeoSense. Their unique _id is retained and used to associate the individual field with its parent collection. Attributes unique to the dataset, such as title, default color, created by, and modified date, are stored in an associative collection where the _id attribute is used as a linkage identifier.

Data import and parsing happens asynchronously once the user uploads their first dataset. The GUI shows the user the estimated time remaining on the data conversion. Once the data is properly converted and stored, it is drawn into the user's current viewport.

Aggregation and reduction through MapReduce

For datasets exceeding a certain number of fields (arbitrarily ~1,000), an aggregation process is executed to greatly increase the performance of the data for both the client and server. We create sub collections of the dataset, each containing a reduced aggregate as a function of zoom level. We currently support reductions for 15 discrete zoom levels as well as temporal reductions that host only the time series for each dataset reduced into days, weeks, months, and years in accordance with the zoom level aggregate.

To accomplish this, we employ a technique referred to as "MapReduce". Traditionally, MapReduce is a framework for distributing the processing of huge datasets across a large number of nodes. In the case of GeoSense and the GIS libraries for MongoDB, it is a tool for batch processing data and aggregation operations.
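The per-zoom-level reduction described above can be sketched in plain Python as a map step that assigns each point to a grid cell and a reduce step that merges partial aggregates. This is an illustrative sketch only, not the GeoSense code or the MongoDB MapReduce API: the field names (lon, lat, val), the function names, and the rule that cell size halves with each zoom level are all assumptions.

```python
import math


def cell_size_degrees(zoom, base=360.0):
    """Assumed rule: one cell spans the world at zoom 0 and halves per zoom level."""
    return base / (2 ** zoom)


def map_point(point, zoom):
    """Map step: emit (cell key, partial aggregate) for a single point."""
    size = cell_size_degrees(zoom)
    col = math.floor((point["lon"] + 180.0) / size)
    row = math.floor((point["lat"] + 90.0) / size)
    v = point["val"]
    return (zoom, col, row), {"count": 1, "sum": v, "min": v, "max": v}


def reduce_cell(a, b):
    """Reduce step: merge two partial aggregates (associative and commutative)."""
    return {
        "count": a["count"] + b["count"],
        "sum": a["sum"] + b["sum"],
        "min": min(a["min"], b["min"]),
        "max": max(a["max"], b["max"]),
    }


def aggregate(points, zoom):
    """Build the reduced collection for one zoom level, keyed by grid cell."""
    cells = {}
    for p in points:
        key, partial = map_point(p, zoom)
        cells[key] = reduce_cell(cells[key], partial) if key in cells else partial
    for agg in cells.values():
        agg["avg"] = agg["sum"] / agg["count"]  # finalize once all points are merged
    return cells


# Hypothetical sample points; two fall into the same cell at zoom level 2.
points = [
    {"lon": 139.7, "lat": 35.7, "val": 40},
    {"lon": 139.8, "lat": 35.6, "val": 60},
    {"lon": -71.1, "lat": 42.4, "val": 10},
]
zoom2 = aggregate(points, 2)
```

Under these assumptions, one reduced collection per zoom level could be produced by calling aggregate once for each of the 15 zoom levels. Keeping sum and count in the partial aggregate, rather than a running average, is what allows the reduce step to be applied in any order.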
Spatial indexing and grid queries

As described in the previous Design Theory chapter, all data stored and displayed within GeoSense is subject to a mesh grid. This grid, mesh, or lattice serves two functions: first, reducing the visual complexity for the user and second, standardizing and reducing the amount of computational processing for the client and server. For example, at a global zoom level, showing all 180,000 earthquakes over a magnitude of 4.4 since 1973 would be both visually and computationally inefficient. Instead, occurrences are organized into micro clusters, fitted to the known geospatial grid, and displayed dynamically with regard to zoom level and the bounding extremities of the user's viewport. This approach produces an optimized number of queries against a geospatial index.

To create and manage these queries, the GeoSense application constructs the viewport grid in accordance with the aggregate collections generated as explained in the previous section, Aggregation and reduction through MapReduce. The following structuring logic was developed with and paraphrased from Walter Mendez (MIT EE/CS 2015), who contributed to the GeoSense project during the spring of 2012:

On constructing the mesh grid

The grid is managed by a set of ordered pairs, which are not created at random. They follow a geometric pattern that is based entirely on the physical dimensions of the zoom level and the parameters of the viewport grid being generated.
The origin of this coordinate system, \((x_0, y_0)\), is placed at the lower left hand corner of the bounding area. As a result, a change in the horizontal direction and the vertical direction, \(\Delta x\) and \(\Delta y\) respectively, can be defined as the following:

\[ \Delta x = \frac{\mathrm{length}_{zoom}}{\mathrm{length}_{grid}}, \qquad \Delta y = \frac{\mathrm{width}_{zoom}}{\mathrm{width}_{grid}} \]

It hence follows that, given the zoom level's bounding corners, the lower left being \((x_0, y_0)\) and the upper right being \((x_f, y_f)\), any point in the grid can be reached by the following general formula:

\[ \left( x_0 + m\,\frac{\mathrm{length}_{zoom}}{\mathrm{length}_{grid}},\; y_0 + n\,\frac{\mathrm{width}_{zoom}}{\mathrm{width}_{grid}} \right) \]

where \(m\) is in the range \(\{0, \ldots, \mathrm{length}_{grid}\}\) and \(n\) is in the range \(\{0, \ldots, \mathrm{width}_{grid}\}\). The geometric constraint on the bounds of the grid is then defined. When \(m\) and \(n\) are equal to their respective maxima:

\[ \left( x_0 + \mathrm{length}_{grid}\,\frac{\mathrm{length}_{zoom}}{\mathrm{length}_{grid}},\; y_0 + \mathrm{width}_{grid}\,\frac{\mathrm{width}_{zoom}}{\mathrm{width}_{grid}} \right) = \left( x_0 + \mathrm{length}_{zoom},\; y_0 + \mathrm{width}_{zoom} \right) = (x_f, y_f) \]

Given MongoDB's geospatial indexing specifications, the database indexes the data using spatial coordinates (longitude, latitude). To create the boundaries of a grid, we specify a box by passing in a lower left hand corner and an upper right hand corner. Thus, for any given \(m\) and \(n\) in our grid, a bounded box would have as lower left and upper right corner respectively:

\[ \left( x_0 + m\,\frac{\mathrm{length}_{zoom}}{\mathrm{length}_{grid}},\; y_0 + n\,\frac{\mathrm{width}_{zoom}}{\mathrm{width}_{grid}} \right), \qquad \left( x_0 + (m+1)\,\frac{\mathrm{length}_{zoom}}{\mathrm{length}_{grid}},\; y_0 + (n+1)\,\frac{\mathrm{width}_{zoom}}{\mathrm{width}_{grid}} \right) \]

This makes geometric sense. In order to get to the upper right hand corner of a box given the lower left hand corner, we need only add \(\Delta x\) and \(\Delta y\), that is, a single box side length and width, in each direction. Finally, each cell within the grid contains an array storing all the data points retrieved from the server, the number of points in said array, the minimum, the maximum, the average, and the center point of the respective container.

TEAMDATA DATABASE

POSTGIS

Data specific to Safecast is stored in a separate database, which operates outside the server bounds of GeoSense.
Safecast's dataset, which is referred to as teamdata, is stored within a PostGIS (PostgreSQL GIS) database and is subject to a different upload and management process than data added directly through GeoSense. Though the Safecast dataset is community driven, it is handled and monitored by a number of Safecast volunteers due to the critical nature of the data.

APPLICATION STRUCTURE

The map platform, which is the publicly visible portion of GeoSense, is built fully in HTML5 and JavaScript. The application is organized in an MVC (Model, View, Controller) framework using Backbone.js [43], which provides logical structuring of the application into a manageable development flow. The application is organized into the following structure:

VIEWS

The visual build is constructed through a simple templating engine that serves views based on the application state. These views vary from '2D map view' to '3D map view' and 'About GeoSense' view. Each view is an individual module that contains a linked HTML and CSS file for format and styling.

MODELS

Models are used to define the parameters around how individual pieces of data are handled within the GeoSense application. For example, the most common model is 'point', which refers to a singular point of data containing a latitude and longitude coordinate. Each point may differ from the last, both in lat/lon and in additional values (intensity, date added, etc).

COLLECTIONS

Collections are bundles of models that exist together under the umbrella of parent properties. For example, a million points (taken from the point model) may make up the collection 'air pollution', which then has its own properties independent from the individual models themselves. Collections, as containers of models, are bound to views within the application.

EXTERNAL LIBRARIES

A number of widely adopted external libraries are used as part of the GeoSense application.
Listed below are their titles and basic operation:

TWITTER BOOTSTRAP

Twitter's Bootstrap framework is used underneath the application to provide easy access to commonly used design patterns such as headers, footers, button types, forms, modal windows, and more. Bootstrap is a welcome addition to the technology stack as it reduces the time-consuming work of replicating the expected behaviors of a web app. It is, in general, a fantastic boilerplate for starting a new application. However, precautions have to be taken to ensure that the ubiquitous "look and feel" of Bootstrap does not overtake the application. To do so, nearly all the default styles provided are restyled or adjusted.

JQUERY / JQUERY UI

jQuery, a JavaScript library for accessing and manipulating the DOM (Document Object Model) of the application, is fundamental to many JavaScript-based applications. jQuery UI is a simple extension of jQuery that provides certain features such as "drag and drop", which may only be necessary in certain applications.

THREE.JS

Three.js is a JavaScript library that wraps a basic render model around the OpenGL based WebGL. Three.js simplifies access to WebGL and is instrumental in GeoSense's ability to display data in the third dimension.

OPENLAYERS

OpenLayers is an open source library for displaying and manipulating map data. It is built entirely in JavaScript, and provides an API for constructing interactive map applications. GeoSense uses OpenLayers as the rendering engine for two-dimensional maps and has heavily extended the canvas rendering class to support features unique to GeoSense.

This list covers the most fundamental libraries but is not exhaustive.
For more information regarding the current state of the GeoSense library arrangement, visit the project on GitHub (http://www.github.com/tonydevincenzi/geo).

Challenges

Data purity

Because GeoSense does not offer itself as a source of data but rather a source for data observation, certain precautions are taken in allowing the community to generate and share data sources. For example, erroneous data may be inserted into the system by any user and then replicated by future users. Rather than try to detect bad data, or even offer tools to report such incidents, GeoSense takes the position that it offers nothing but the platform and that all data within the platform is community generated. In the case of Safecast, the data is stored in the teamdata database, which is part of the Safecast repository. GeoSense has integrated bespoke hooks for the teamdata dataset, but only in a manner that is available at safecast.org. Therefore, for all intents and purposes, the data available at http://geo.media.mit.edu is community generated and not explicitly endorsed by the platform. This is made clear in the GeoSense terms and conditions, which are available online.

Data comes in many shapes and sizes. An ongoing challenge is continuing the development of upload compatibility from within the add data wizard. To date, GeoSense requires that the user specify at least three crucial columns for every uploaded dataset: latitude, longitude, and intensity. Ideally, a lightweight algorithm could handle the majority of the guesswork involved in specifying these columns, as the names held within header rows of geospatial data are often similar (e.g., lat or latitude).

Finally, certain considerations are taken when choosing how to handle a maximum file size for user upload. For instance, it is computationally expensive to upload and parse through a file the size of the Safecast dataset, which at time of writing is over 3 million entry points housed in a 50 MB CSV file.
GeoSense currently limits the file size upload to 20 MB, which can still easily cover one to two million entries in a well managed document. Increasing this capacity would require significant server enhancements and storage capacity, coming at significant cost.

Performance

When attempting to process and visualize large amounts of data, performance issues are one of the first hurdles to overcome. Rendering millions of live data points requires a dynamic relationship between the rendering engine (front end) and data server (back end). In its current build, Satellite, the GeoSense web service, aggregates and returns data from the back end based on the specifications requested by the front end. Because the data within the GeoSense application is handled separately from the visualizations, it is easy to adjust the requests based on the current application state. This is most evident in the scenario of rendering to the flat map, where we begin to experience extreme performance loss when more than ~20,000 individual objects are being rendered. Conversely, it is much easier to render large amounts of data through the WebGL pipeline, which is utilized by the 3D globe display type. Because WebGL has access to the video card's GPU, the majority of display logic can be pushed off the CPU, which is the general bottleneck for JavaScript-heavy applications.

Future versions of GeoSense may implement a custom tile server, similar to how Google Fusion Maps are rendered, which in turn would alleviate the constraints of rendering data points into the map tiles. Tile servers are, at this time, complex and expensive to manage. New services such as MapBox have begun to innovate with products like TileMill, though the infancy of the software comes with too many limitations for it to be used by GeoSense.

Scale

As GeoSense begins to grow in users and scope, scale becomes a prevalent issue.
In its current state, scale is handled by basic load balancing and an elastic instance through Amazon EC2 [44]. GeoSense has been carefully designed to handle large increases in scale, though the costs of operation would scale in parallel. Future funding will be required to keep the service running if extreme growth is experienced.

Custom instances

As GeoSense continues to grow, the community may want to create their own instance of the platform on a different server. Because it is open source, the entirety of the project can be downloaded and installed via the public GitHub repository. This creates complexity when trying to develop GeoSense for both Safecast and community usage. Because of this, there may be ongoing branches of the GeoSense project that are specific to a certain instance of the project, Safecast in this example, and would differ in certain features from the instance hosted at http://geo.media.mit.edu. This fragmentation can cause complications when developing new features, as it requires that all custom or branched features are forward compatible with changes to the master repository. To avoid further complication, GeoSense will only "officially support" development of the master repository and specific derivatives that are generated by the core team.

Use Cases

GeoSense has been evaluated against a number of different usage scenarios whose interests and datasets differ greatly. In order to prove the versatility of the system, it was crucial to select example maps and users whose feedback would differ based on their individual needs. Our tool's true power is demonstrated through how we observe the community using it to tell stories; the narratives developed within GeoSense exceeded our original intent and expectations. The following case studies were conducted with the GeoSense platform:

SAFECAST

The first and most obvious usage scenario is Safecast, whose dataset was the spark behind the development of GeoSense.
With over 10,000 active viewers through the development of GeoSense V3, Safecast has been the primary driver behind feature-set development. For the first time, the Safecast dataset was fully visible as a perfect mirror of its current state in the teamdata database: there were no intermediary hand-built aggregates or reductions as was previously the case.

An image of GeoSense for Safecast showing a coastal area of Japan featuring: radiation levels (green to pink), coastal flood zones (red coast), nuclear reactors (red dot), and earthquakes (blue)

For our usage scenario, the Safecast dataset was combined with historical earthquake data, nuclear power reactors, and nuclear power plants with reported INES (International Nuclear and Radiological Event Scale) incidents, as well as coastal flooding models generated for the 3/11 earthquake and ensuing tsunami. The data layers were chosen not only to tell an important story, but also to open the stage for discussion: common questions such as "Where should I consider building a house?", "Is my child's school playground safe from radiation?", and "What areas are at high risk for similar catastrophe?" have been asked and addressed. By allowing the community to discuss data placed in context, the back-and-forth of email news groups and repetitive question & answer has been reduced. Much like the ancient stone markers found in coastal Japan warning the inhabitants of tsunamis, GeoSense offers not only a view into the past but also a glimpse into the future, where individuals and communities alike can make concise, informed decisions.

SOURCEMAP

Sourcemap.com is the open directory of supply chains and environmental footprints. Consumers use the site to learn about where products come from, what they're made of, and how they impact people and the environment. Companies use Sourcemap to communicate transparently with consumers and to tell the story of how products are made. [39]

The GeoSense team is working closely with CEO Leonardo Bonanni of Sourcemap on finding ways to explore the causal relationships between climate, cultural, and ecological data in conjunction with product supply chains. We have begun by exploring the relationship between North American farm location, food distribution patterns, global warming, and population density. Properly visualized, these datasets have yielded new insights into operational risk factors and supply chain optimization.

THE LACE RACE

The Lace Race is an ongoing global game developed by a team of artists and researchers from the MIT ACT, Media Lab, CSAIL and Department of Architecture. It debuted at the Reykjavik Arts Festival in Reykjavik, Iceland. The Lace Race game is simple: participants are given a single shoelace with a unique identifier number. Each participant is then encouraged to continually trade his or her shoelace(s) with strangers or other participants. At each exchange, the participant is encouraged to tweet in the format "#LaceRace 123 location", where "#LaceRace" is the game's hashtag, "123" is the unique identifier, and "location" is the physical location of the exchange. GeoSense was then used to watch the Twitter hashtag #LaceRace and produce a real-time map of all ongoing Lace Race activity. Users are also encouraged to use the geo-tagged comment system to annotate their exchanges, to note where they saw specific laces, or even to hunt down specific numbers.

Results

As of writing, GeoSense has encountered more than 10,000 users through Safecast alone. It was demonstrated to over 400 visitors and broadcast to thousands during the 2012 spring MIT Media Lab Member's week. Many parties were interested in using GeoSense as a new way to decode their own cryptic data.
Specific interest was shown by members of the National Wildlife Federation with regard to better understanding the social, economic, and environmental impact of seasonal fires; we anticipate many future partnerships. Thanks to Safecast, a constant stream of users encounters GeoSense for mission-critical usage regarding the radiation dataset. Results so far are positive and encouraging, but we realize only the surface has been scratched, and we will continue to feverishly develop GeoSense until it reaches its full potential.

Future Work

GeoSense is an ongoing, ever-evolving project. Because it is open source and serves as the visualization platform for Safecast's future work, it will always be defined not only by the experimental directions we hope to take but also by features that best suit the needs of the active user base. Hundreds of potential directions have been discussed; the following are some of the most pressing.

Tile servers

As previously described in Technical Design and Challenges, technical limitations are quickly met when attempting to handle and visualize large and dense sets of data. The most efficient method remains one of the oldest: rendering all of the data as part of the map tile on the server itself. GeoSense currently renders visual information into the canvas layer client-side and displays it as an overlay atop a pre-generated map tile. To date, we have reached an efficiency that challenges the performance of even a dedicated tile server; however, older machines and mobile users may find the experience slower and, in some cases, completely broken.

Expanded visualization types

With a robust method for handling large data sets and a community of active users, GeoSense is in a prime position to iterate and experiment with new types of visualizations. We imagine there to be a well of opportunity in exploring information visualization beyond geovisualization.
We hope to work towards finding new and expressive visual explanations of a dataset's potential meaning.

Models & mechanistic explanations

As the introduction of time series graphs and pre-generated model overlays has begun to explore, the idea of allowing users to specify models and cast them against their dataset is compelling. We imagine that once a set of data is represented in GeoSense, a number of conditions can be applied against it. The possible conditions are unlimited, but we are currently exploring falloff decay, parameters for attraction and deflection, as well as movement and inertia. Ultimately, a suite of tools could be developed to allow users, or communities of users, to develop models towards understanding the meaning or future impact of their geovisualization.

Boolean conditions and spatially bound alerts

Part and parcel of the original GeoSense proposal was to invite individual users to create geo-fenced conditional alerts atop their geovisualization. This interface will allow users to specify an "if-this-then-that" statement where, if certain criteria are met, a series of specified outcomes will execute. A situational example of this would be: "If radiation over 500 CPM is reported within 5 km of my home, then email me a notice".

This feature was left out of the current build of GeoSense because, during development, it was found to be less crucial than a stable infrastructure of geospatial commenting and live chat amongst current users. We are looking to reevaluate the importance of boolean conditions and spatial alerts in the coming months.

Conclusion

GeoSense liberates the author, viewer, and data. It proposes that design may be used as a lens to enhance human understanding and promote imagination, and that provocative discoveries can be uncovered through intent and serendipity alike.
We have demonstrated how, through the juxtaposition of visual language and observational analysis, insightful narratives can be discovered, leading a community of individuals to generate hypotheses around the causality of data and worldly events.

With geovisualization come many complexities. Daunting though they may be, their very presence also provides inherent value; to be massively complex is both boon and bane. To explore, to probe at, and to liberate lifeless tabulated data into instructive, insightful, and human-readable information is a prelude to an even larger effort.

We have explored the visual marriage of time and space, where both parameters are tuned and tweaked to provide the viewer with insights that were once locked away within spreadsheets. We have also begun expanding the known vocabulary of geovisualization for the digital age, where each pixel can have tremendous meaning and consequence, devising a representational taxonomy that serves both form and function. Finally, we have seen the need for, and positive response to, community tools for building dialog and sharing intelligence. GeoSense has opened the doors for both thought and voice, where the user plays the role of designer, scientist, analyst, and philosopher.

Our accomplishment is an important first step, but it is only that: the first step. To answer the harder questions, to gaze into the future, we must first have a tool to see into the past and into the now; with GeoSense we may begin this process with massive data as our vessel, assembled by and for a community of open minds and thinkers.

References

[1] Safecast, "Safecast," blog.safecast.org. [Online]. Available: http://blog.safecast.org/. [Accessed: 27-Apr.-2012].
[2] J. Mackinlay, S. K. Card, and B. Shneiderman, Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann Publishers, 1999.
[3] MobiThinking, "Global mobile statistics 2012," mobithinking.com. [Online].
Available: http://mobithinking.com/mobile-marketing-tools/latest-mobile-stats. [Accessed: 27-Apr.-2012].
[4] Geospatial Today. [Online]. Available: http://geospatialtoday.com. [Accessed: 27-Apr.-2012].
[5] F. B. Viegas, M. Wattenberg, F. van Ham, J. Kriss, and M. McKeon, "Many Eyes: A Site for Visualization at Internet Scale," pp. 1-8, Aug. 2007.
[6] E. R. Tufte, Visual Explanations: Images and Quantities, Evidence and Narrative. Graphics Press, 1997, p. 156.
[7] stamen.com. [Online]. Available: http://stamen.com. [Accessed: 27-Apr.-2012].
[8] "We feel fine and searching the emotional web," presented at the Proceedings of the fourth ACM international conference on Web search and data mining, New York, NY, USA, 2011, pp. 117-126.
[9] Google, "Google Earth," google.com. [Online]. Available: http://www.google.com/earth/index.html. [Accessed: 27-Apr.-2012].
[10] NASA, "World Wind JAVA SDK," worldwind.arc.nasa.gov, 18-Jul.-2011. [Online]. Available: http://worldwind.arc.nasa.gov/java/. [Accessed: 27-Apr.-2012].
[11] NSIDC, "View NSIDC Data on Virtual Globes: Google Earth," nsidc.org. [Online]. Available: http://nsidc.org/data/virtual-globes/. [Accessed: 27-Apr.-2012].
[12] unhcr.org. [Online]. Available: http://www.unhcr.org. [Accessed: 27-Apr.-2012].
[13] ArcGIS, "ArcGIS Online," arcgis.com. [Online]. Available: http://www.arcgis.com/home/. [Accessed: 27-Apr.-2012].
[14] ESRI, "MapIt - Create Interactive Business Maps | Map SQL Server & Excel Data," esri.com. [Online]. Available: http://www.esri.com/software/mapit/index.html. [Accessed: 27-Apr.-2012].
[15] Pachube, "The Internet of Things Real-Time Web Service and Applications - Pachube," pachube.com. [Online]. Available: https://pachube.com/. [Accessed: 27-Apr.-2012].
[16] Ushahidi, "Ushahidi :: Home," ushahidi.com. [Online]. Available: http://www.ushahidi.com/. [Accessed: 27-Apr.-2012].
[17] "Automatic generation of tourist maps," ACM Trans. Graph., vol. 27, no. 3, pp. 100:1-100:11, 2008.
[18] "Crowdsourcing graphical perception: using mechanical turk to assess visualization design," presented at the Proceedings of the 28th international conference on Human factors in computing systems, New York, NY, USA, 2010, pp. 203-212.
[19] geocommons.com. [Online]. Available: http://geocommons.com. [Accessed: 28-Apr.-2012].
[20] worldmap.harvard.edu. [Online]. Available: http://worldmap.harvard.edu. [Accessed: 28-Apr.-2012].
[21] MapBox, "MapBox | MapBox," mapbox.com. [Online]. Available: http://mapbox.com/. [Accessed: 27-Apr.-2012].
[22] TileMill, "TileMill | MapBox," mapbox.com. [Online]. Available: http://mapbox.com/tilemill/. [Accessed: 27-Apr.-2012].
[23] "Your place or mine?: visualization as a community component," presented at the Proceedings of the twenty-sixth annual SIGCHI conference on Human factors in computing systems, New York, NY, USA, 2008, pp. 275-284.
[24] CAIS, "thquake Prediction, Japan," cais.gsi.go.jp. [Online]. Available: http://cais.gsi.go.jp/YOCHIREN/activity/191/191.e.html. [Accessed: 27-Apr.-2012].
[25] CBS, "New USGS number puts Japan quake at 4th largest - CBS News," cbsnews.com, 14-Mar.-2011. [Online]. Available: http://www.cbsnews.com/stories/2011/03/14/501364/main20043126.shtml. [Accessed: 27-Apr.-2012].
[26] N. G. JP, "Damage Situation and Police Countermeasures associated with 2011 Tohoku district - off the Pacific Ocean Earthquake," npa.go.jp, 25-Apr.-2012. [Online]. Available: http://www.npa.go.jp/archive/keibi/biki/higaijokyo-e.pdf. [Accessed: 27-Apr.-2012].
[27] NISA, "INES (the International Nuclear and Radiological Event Scale) Rating on the Events in Fukushima Dai-ichi Nuclear Power Station by the Tohoku District," nisa.meti.go.jp. [Online]. Available: http://www.nisa.meti.go.jp/english/files/en20110412-4.pdf. [Accessed: 27-Apr.-2012].
[28] I. B. Times, "Analysis: A month on, Japan nuclear crisis still scarring International Business Times," ibtimes.co.in, 09-Apr.-2011. [Online]. Available: http://www.ibtimes.co.in/articles/132391/20110409/japan-nuclear-crisis-radiation.htm. [Accessed: 27-Apr.-2012].
[29] L. Times, "Japan earthquake: Insurance cost for quake alone pegged at $35 billion, AIR says - Los Angeles Times," articles.latimes.com, 13-Mar.-2011. [Online]. Available: http://articles.latimes.com/2011/mar/13/world/la-fgw-japan-quake-insurance-20110314. [Accessed: 27-Apr.-2012].
[30] Safecast, "Safecast Data Downloads," maps.safecast.org. [Online]. Available: http://maps.safecast.org/downloads/. [Accessed: 27-Apr.-2012].
[31] CC, "Creative Commons - CC0 1.0 Universal," creativecommons.org. [Online]. Available: http://creativecommons.org/publicdomain/zero/1.0/. [Accessed: 27-Apr.-2012].
[32] TEPCO, "TEPCO: Status of Fukushima Daiichi and Fukushima Daini Nuclear Power Stations after great east japan earthquake," tepco.co.jp. [Online]. Available: http://www.tepco.co.jp/en/nu/fukushima-np/index-e.html. [Accessed: 27-Apr.-2012].
[33] P. W. Anderson, More and Different: Notes from a Thoughtful Curmudgeon, 1st ed. World Scientific Publishing Company, 2011, p. 424.
[34] E. R. Tufte, Envisioning Information. Graphics Pr, 1990, p. 126.
[35] J. Albers, Search versus re-search. Trinity College Press, 1969, p. 85.
[36] E. Imhof, Cartographic Relief Presentation. ESRI Press, 2007, p. 388.
[37] peddl.com. [Online]. Available: https://peddl.com. [Accessed: 27-Apr.-2012].
[38] P. Pulse, "Place Pulse | The Collaborative Image of the City," pulse.media.mit.edu. [Online]. Available: http://pulse.media.mit.edu/. [Accessed: 27-Apr.-2012].
[39] sourcemap.com. [Online]. Available: http://sourcemap.com. [Accessed: 27-Apr.-2012].
[40] ifttt.com. [Online]. Available: http://ifttt.com. [Accessed: 27-Apr.-2012].
[41] ReadWriteHack, "Wait, What's Node.js Good for Again?," readwriteweb.com. [Online]. Available: http://www.readwriteweb.com/hack/2011/01/wait-whats-nodejs-goodfor-aga.php. [Accessed: 27-Apr.-2012].
[42] mongodb.org. [Online].
Available: http://www.mongodb.org. [Accessed: 27-Apr.-2012].
[43] backbonejs.org. [Online]. Available: http://backbonejs.org. [Accessed: 27-Apr.-2012].
[44] C. Hidalgo, "Graphical Statistical Methods for the Representation of the Human Development Index and its Components."

Appendix

Tablet AR installation

GSPEAK BRIDGE

In order to translate the coordinate positions of both the iPad and the physical globe, a translation bridge was developed and deployed as part of the GeoSense application. This bridge, written in Ruby, acts as an interpreter between Oblong's g-speak system and the GeoSense platform.

THE INTENT OF AN AUGMENTED REALITY APPLICATION

Paraphrased from Samuel Luescher's 2012 project proposal

"As a tangible interface to this data, we propose a physical globe whose position and orientation in space the application is monitoring. When holding up a tablet to the globe, digital layers are superimposed on the camera image of the globe that is displayed on the tablet screen. By coupling the physical affordances of the object with an AR application for tablet computers, we expect to tackle a number of usability problems that commonly occur with mapping applications. We explore possible interaction techniques when coupling tablets with the globe and using them for individual navigation around the geospatial data, subsequent decoupling of specific map views from the globe and the tablet, as well as using the globe as a master control for larger views."

Left: Samuel Luescher (front) and Anthony DeVincenzi (back) created a new map with GeoSense. Right: A view of the tablet AR installation