GeoVISTA Center, Department of Geography, Pennsylvania State University
Mark Gahegan, Bill Pike, Sachin Oswal, Gary Sheppard, Gary Liu, Brandi Nagle, Junyan Luo

Sharing Our Understanding Of Earth Science Resources
A knowledge management portal to support collaborative geoscience

CYBERINFRASTRUCTURE FOR THE GEOSCIENCES www.geongrid.org

Introduction, motivation & year 2 goal
• Making electronic geoscience resources more available is not enough…
• We need to be able to describe these resources more effectively…
• To be successful, contributing and finding resources must become an integral part of the way scientists/educators work
• Major goal for year 2: develop visually-based tools to help geoscientists organize, describe, and gain access to the GEON resources

Knowledge management for collaborative geoscience
Representation
• top-down ontology languages
• bottom-up context, situations (provenance)
• visual appearance, signification
• history & evolution
• alternative descriptions
Capture
• collaborative web interface
• diagramming tools
• text mining tools
• importing existing ontologies
• workflow discovery
Usage
• ontology mediation services
• ontology similarity measures
• browsing conceptual structures
• shared virtual workspace
[Diagram labels: instantiation, conceptualization]

Representation
• Ontology languages (OWL, RDF, DAML+OIL)
• Association histories of how resources are used
• Visual appearance / signification serialization
• Additional descriptive information / resources
Fragment of OWL ontology from NASA’s EarthRealm project:
    </owl:Class>
    <owl:Class rdf:ID="Marsh">
      <rdfs:subClassOf rdf:resource="#CoastalRegion"/>
      <rdfs:subClassOf rdf:resource="#WetlandRegion"/>
    </owl:Class>
    …

Contextualizing science
“In science, numerous lines of investigation interweave to delineate a type of rationality that is historically situated and
practical, and involves choice, deliberation, and judgment.”
- Richard Bernstein, Beyond Objectivism and Relativism: Science, Hermeneutics, and Praxis
• Our aim is to contextualize resources through experiences; this is crucial for understanding in domains that are highly interpretive
• Put another way, what do feeding ducks have in common with 50% of our understanding?

Three problems with a solely ontological approach
• Top-down knowledge (ontologies) only gets you so far… other kinds of (bottom-up) knowledge are also very important & useful:
    • Use-cases (situations surrounding the use of resources)
    • Social networks
• Most current ontologies are static resources… our understanding is dynamic & continually evolving
    • Unless ontologies are community-owned, dynamic resources, they will soon become part of the problem, not part of the solution
• What happens to all the thousands of resources that predate ontologies? The cost of retro-fitting ontologies is prohibitive.

Associations

Amazon Web Services, degrees of separation using the Amazing (Kevin) Baconizer (www.baconizer.com)
From "How Maps Work" by MacEachren, Alan to "Oops I Did It Again" by Spears Britney: 12 hops
People who bought: How Maps Work: Representation, Visualization, and Design - By Alan M.
MacEachren also bought: Web Cartography - By M-J Kraak and Allan Brown
People who bought this also bought:
• Seeing Through Maps: The Power of Images to Shape Our World View - By Ward Kaiser and Denis Wood
• Mapping: An Illustrated Guide to Graphic Navigational Systems - By Robert Fawcett-Tan
• What is a Designer: Things, Places, Messages - By N Potter and R Kinross
• Reinventing the Wheel - By Jessica Helfand
• Photobooth - By Babbette Hines
• MTV Photobooth - By MTV and Rizzoli International Publications
• Stages - By Britney Spears and Sheryl Berk
• Britney Spears - By Britney Spears
• Baby One More Time (+5 Bonus Tracks) - By Britney Spears
• Oops I Did It Again - By Britney Spears

Capturing use-cases
• Who created that concept / resource?
• When was it created?
• Has it been modified recently?
• Who has used it? … What did they do with it?
Such questions add a rich context by capturing situations surrounding resource usage

Resource usage data
logged usage data (Oracle, MySQL)

Mining association rules from use-case logs
• Association rules are mined from user action logs, using the WEKA (Waikato Environment for Knowledge Analysis) API implementation of the Apriori algorithm (Agrawal, R. and Srikant, R., 1994).
• Tools added for data preprocessing and classifying:
    • attribute selector: allows the user to select a subset of data attributes.
    • data filters: allow the user to define filters that convert String, Time, or Numeric data in any attribute column to nominal data for association mining.
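The pipeline above (select attributes, filter values to nominal labels, mine rules) can be sketched in a few lines. This is a plain-Python illustration of the Apriori idea, not the WEKA Java API the slides refer to. The log rows, the field names (user, action, resource, hour), the morning/afternoon bins, and the thresholds are all hypothetical.

```python
from itertools import combinations

# Hypothetical usage-log rows; field names and values are illustrative only.
LOG = [
    {"user": "geologist", "action": "view", "resource": "gravity_map", "hour": 9},
    {"user": "geologist", "action": "view", "resource": "gravity_map", "hour": 10},
    {"user": "geologist", "action": "annotate", "resource": "gravity_map", "hour": 15},
    {"user": "educator", "action": "view", "resource": "lesson_plan", "hour": 11},
    {"user": "educator", "action": "view", "resource": "lesson_plan", "hour": 14},
    {"user": "educator", "action": "view", "resource": "gravity_map", "hour": 16},
]


def hour_to_nominal(hour):
    """Data filter: map a numeric hour to a nominal label (assumed bins)."""
    return "morning" if hour < 12 else "afternoon"


def to_transaction(row, attributes):
    """Attribute selector plus filter: keep chosen columns as 'attr=value' items."""
    items = set()
    for attr in attributes:
        value = hour_to_nominal(row[attr]) if attr == "hour" else row[attr]
        items.add(f"{attr}={value}")
    return frozenset(items)


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Level-wise search: frequent k-itemsets are joined from frequent (k-1)-itemsets."""
    level = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while level:
        level_freq = {}
        for candidate in level:
            s = support(candidate, transactions)
            if s >= min_support:
                level_freq[candidate] = s
        frequent.update(level_freq)
        k += 1
        # Join step: unions of two frequent itemsets that have exactly k items.
        joined = {a | b for a in level_freq for b in level_freq if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = {c for c in joined
                 if all(frozenset(s) in level_freq for s in combinations(c, k - 1))}
    return frequent


def rules(frequent, min_confidence):
    """Derive rules lhs -> rhs with confidence = support(lhs and rhs) / support(lhs)."""
    found = []
    for itemset, supp in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                lhs = frozenset(lhs)
                confidence = supp / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((sorted(lhs), sorted(itemset - lhs), supp, confidence))
    return found


if __name__ == "__main__":
    transactions = [to_transaction(r, ["user", "action", "resource"]) for r in LOG]
    for lhs, rhs, supp, conf in rules(apriori(transactions, 0.3), 0.9):
        print(lhs, "->", rhs, f"support={supp:.2f} confidence={conf:.2f}")
```

Lowering the support or confidence threshold yields more (and weaker) rules, which is presumably what the sensitivity settings in the tool control.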

Data mining tools (association rules)
Panel labels: Results & sensitivity settings; Data Filter - String; Attribute Selector Design; Data Filter - Time; Data Filter - Numeric

Capture: concept creation & harvesting (Codex, e-Delphi)

Capture example (Randy Keller’s gravity map from previous GEON meeting)

Supplemental material: e.g. educational resources

Supplemental material: e.g. Google search results

Google search (Google search API is built into Codex)

Usage: Codex demonstration

Managing groups & user workspaces

Reusable knowledge structures afford…
• Private and shared knowledge spaces for describing resources
• Provenance information produces a web of relationships between resources
• Evolution and emergence of ideas within a community
• Discovery of points of agreement and divergence in concept construction or problem-solving approaches
http://flatbox.geog.psu.edu/codex

Example: questions you can ask
Gravitational anomaly dataset A
• Is described by these concept map(s) / ontologies:
• Was created in this way:
• Plays a role in these workflow(s):
• Has been used to fulfill these task(s):
• Has been used by these people:
• Is most often used with these method(s)
• Has received the following review(s) / feedback:
• Is similar to, or differs from, anomaly dataset B in the following way(s):

Future plans
• Add more perspectives on resources to Codex (e.g. working with Digital Library for Earth Science Education (DLESE))
• Improve transition from one perspective to another
• Peer-to-peer implementation
• Improve transition between semi-formal concept maps (provided by domain scientists) and formal (computable) ontologies that are defined more rigorously.
• Experiment with Codex used live to capture conceptual understanding (face to face and over the Web)

Summary: projects we are pursuing for GEON
1. Concept map / ontology visualization & management tools (ConceptVista & Codex): searching & browsing of knowledge domains, and other resources.
2. Concept capture software (e-Delphi, Codex): developing vocabularies by which resources and learning activities are described
3. Concept map / ontology versioning and comparison (differencing)
4. Concept uncertainty (fuzzy-rough set approach)
5. Use-Case Tools: logging and data mining (association rules)
6. Visualization and analysis tools: e.g. animated maps, scatterplots, 3D scenes, cluster analysis, machine learning methods
7. Component assembly and deployment (GeoVISTA Studio): could help in selecting and packaging activities into self-contained, deployable units.
8. Managing learning activities: Learning Activity Toolkit (Southampton, UK & PSU)
9. Integration of concept management with DLESE API & strand maps

Publications
1. Pike W., Gahegan M, 2003, “Constructing semantically scalable cognitive spaces”, in: Spatial Information Theory: Foundations of Geographic Information Science. Lecture Notes in Computer Science 2825, Kuhn W, Worboys M, and Timpf S (Eds.). Springer-Verlag, Berlin: 332-348.
2.
MacEachren A M, Gahegan M, Pike W, 2004, “Geovisualization for constructing and sharing concepts”, Proceedings of the National Academy of Sciences, Vol. 101.
3. Gahegan M, Pike W, Ahlqvist O, Neff R, Yu C, “How much do we agree? A knowledge management system to help represent and mediate concepts developed by collaborating human-environment researchers”, submitted to Annals of the Association of American Geographers.
4. Gahegan, M. (2004). “Beyond tools: visual support for the entire process of GIScience.” In: Exploring Visualization (Eds. Dykes, J., MacEachren, A. M. and Kraak, M.-J.)
5. Brodaric, B. and Gahegan, M. (in press). “Representing Geoscientific Knowledge in Cyberinfrastructure: challenges, approaches and implementations”. GSA Special Papers volume.
6. O’Brien, J. and Gahegan, M. (2004). “A knowledge framework for representing, manipulating and reasoning with geographic semantics.” International Conference on Spatial Data Handling, Leicester.
7. Gahegan, M. (2004). “The Future of GIScience? GRID Computing and the Semantic Web”. Keynote address, GISRUK Conference, www.gisruk.org
8. Pike W, Yarnal B, MacEachren A, Gahegan M, Yu C, (in press) “Infrastructure for collaboration: Building the future for local environmental change”, to appear in Environment.
9. Pike W. A., Ahlqvist O., Gahegan M., Oswal S., “Capturing context in collaborative science: Supporting collaborative science through a knowledge and data management portal,” Workshop on Semantic Web Technologies for Searching and Retrieving Scientific Data, at Second International Semantic Web Conference, Sanibel Island, FL, October 2003.

End
Questions?

Supplemental slides

Managing and sharing visual appearance
Concept Hierarchies
• A hierarchical view of the concepts
• Concepts are listed alphabetically
• Currently we support RDF, OWL, and XML.
Concept Graph
• Concepts are represented as nodes, and their relations are represented as edges.
Style Editor
• Styles describe how concepts should be rendered.
• Different concepts can have different styles using property filters
• Styles can be serialized using the XML-based Styled Layer Descriptor (SLD) language

Cyber-Infrastructure: underlying technologies
• Peer-to-Peer (P2P) Computing: software technology that enables networked computers to communicate (exchange information) without a common operating environment. The Information Power Grid (IPG) and Globus provide protocols
• Web Services: provide standards to describe, find & access remote resources. Web services mechanisms are integrated into the Grid model through the Open Grid Services Architecture (OGSA).
• Semantic Web: describing and searching for web content using formalized semantics (controlled vocabularies, taxonomies, ontologies) … as opposed to the current ‘chaos’, largely based on literals, popularity & corporate sponsorship!
• Collaborative Knowledge Environments: data & knowledge portals, asynchronous discussions, video conferencing

Towards a knowledge collaboratory

An Integrated Approach to Distributed GeoCollaboration
[Diagram; labels recovered from the slide. Research advances (in gray) are leveraged to meet challenges (in blue).]
Geospatial infrastructure
• infrastructure for geoscience; infrastructure for e-government/e-society; infrastructure for homeland security
• named examples: GEON, National Map, HERO, Digital Earth, Geospatial One-Stop, NGA: NSGI
Knowledge infrastructure
• acquiring knowledge; integrating knowledge; constructing & accessing knowledge; applying knowledge
• Ontology and concept browsers; Existing metadata standards; Semantic web; Ontology creation; Browsing & querying knowledge; Automated indexing tools; Supporting knowledge evolution; Ontology mining / harvesting; Representing and sharing perspectives; Semantic indexing; Semantic search; Geospatial data repositories; Meta-search (ensemble techniques); Supporting knowledge communities
Group work with geospatial information & technologies
• Collaboratories; Collaborative visualization; Visually mediating understanding; Off-loading ideas; Enabling negotiation; Making decisions; Supporting work practices; Distributing access to knowledge
• Dialogue-enabled interface; Content-object replication kit (CORK); e-Delphi, ConceptVISTA, & argument visualization
Application domains
• domains: Geo / Environmental science; K-12 science & professional development; Public/civic planning/resource management; Emergency response & recovery; Strategic threat assessment
• goals: advancing science; enhancing prosperity & civil society; supporting homeland security

Contexts
Creation | Application | Represented by
Who did it? | Who should use it? | Collections of people
Where was it made? | Where does it apply? | Collections of sites / scales
When was it made? | When does it apply? | Collections of temporal intervals
How was it made? | How should it be used? | Collections of methods and data
Why was it made? | Why should it be used? | Collections of research questions, motivations, theories
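
The creation and application contexts listed above could be held in a small record type, one facet per question (who, where, when, how, why). The sketch below is only an assumption about one possible shape and is not the Codex data model; the class names Facet and ResourceContext, the shared_facets helper, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Facet:
    """One question (e.g. 'who'): a creation-side and an application-side collection."""
    creation: set = field(default_factory=set)
    application: set = field(default_factory=set)


@dataclass
class ResourceContext:
    """Who / where / when / how / why contexts for one resource (hypothetical model)."""
    who: Facet = field(default_factory=Facet)    # collections of people
    where: Facet = field(default_factory=Facet)  # collections of sites / scales
    when: Facet = field(default_factory=Facet)   # collections of temporal intervals
    how: Facet = field(default_factory=Facet)    # collections of methods and data
    why: Facet = field(default_factory=Facet)    # research questions, motivations, theories

    def shared_facets(self, other):
        """Names of facets where two resources overlap on creation or application."""
        names = []
        for name in ("who", "where", "when", "how", "why"):
            mine, theirs = getattr(self, name), getattr(other, name)
            if (mine.creation & theirs.creation) or (mine.application & theirs.application):
                names.append(name)
        return names


# Illustrative values only.
dataset_a = ResourceContext(
    who=Facet(creation={"survey team"}, application={"geophysicists"}),
    how=Facet(creation={"gravimeter survey"}),
)
dataset_b = ResourceContext(
    who=Facet(creation={"survey team"}),
    how=Facet(creation={"gravimeter survey", "satellite altimetry"}),
    why=Facet(creation={"basin structure"}),
)
print(dataset_a.shared_facets(dataset_b))
```

Comparing facets this way is one simple route to the "is similar to, or differs from" kind of question posed earlier in the deck.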