The role of geoscience ontologies in a knowledge infrastructure for e-science
Boyan Brodaric, Geological Survey of Canada
Geoscience Ontologies in Knowledge Infrastructure, Boyan Brodaric, Edinburgh, July 17

E-geoscience
  • What are the types of geoscience concepts?
  • Do they differ from other sciences?
  • How do they relate to models and theories in CI?

Acknowledgements
  • eSci Institute Visitors Program
  • Theme 4: Spatial Semantics for Automating Geographic Information Processes (F. Reitsma)

Outline
  • Cyberinfrastructure (CI)
  • Knowledge infrastructure (KI)
  • Geoscience ontologies in KI
      - types of concepts in geoscience ontologies

Cyberinfrastructure
  • Cyberinfrastructure (CI): cyber-networked resources (data, software, instruments, people, …)
  • E-science: scientific activity in CI; a new paradigm for dramatic discoveries
  • NSF vision for CI
      - systems: HPC, connectivity
      - data: capture (sensors), manipulate (structure, analyse, integrate, visualize)
      - people: virtual collaboration, education

CI typical components
  • Systems: HPC, connectivity
  • Knowledge representation: theories, concepts
  • Information: stores, analyses, integrations, visualizations
  • People: collaboration, education
  • Instruments (data): observations, measurements, experiments
  • Models: models, simulations

geo-CI example: real-time severe storm modeling (LEAD)
  • On-demand grid computing
  • Data mining
  • Weather ontology and glossary
  • Resource discovery
  • Is there a severe storm forming?
  • Streaming observations
  • Forecast model

geo-CI example: seismic hazard modeling (SCEC)
  • Information
  • On-demand grid computing
  • Workflows, ontologies
  • Info discovery, info integration
  • InSAR image of the Hector Mine earthquake: a satellite-generated Interferometric Synthetic Aperture Radar (InSAR) image of the 1999 Hector Mine earthquake. It shows the displacement field in the direction of radar imaging; each fringe (e.g., from red to red) corresponds to a few centimeters of displacement.
  • What is the hazard threat?
  • Monitoring, mapping
  • Seismic hazard model

Two perspectives on CI
  The ENGINEERING perspective: efficiency
    • replicating existing geoscience practice in CI, more efficiently
    • doing more, faster: more computation, more resources, existing methods
    • e.g. resource discovery, integration, use (finding, linking and using info., tools, …)
    • is evolutionary: an incremental shift
  The KNOWLEDGE perspective: creativity
    • leveraging CI for new approaches to doing science
    • new doings: new questions and results with new CI-driven methods
    • e.g. knowledge discovery, integration, use (explicitly derive and test ideas in CI)
    • is revolutionary: a paradigmatic shift

Efficiency perspective: ontology-driven map integration (GEON)
  • e.g. find all regions with sedimentary rocks
  • [Figure labels: workflows and queries draw on an ontology that supplies common concepts at each layer]
      - metadata (geospatial projection, geol. concept): common concepts in metadata
      - schema (rock type, geol. unit): common concepts in schema
      - classifications (sedimentary): common concepts in content
      - vocabulary (English: sandstone, arenite, slate; French: grès, arénite, ardoise): common vocabulary

Creativity perspective: theory-driven geoscience discovery
  • "textbook knowledge": defining concepts in
      - theories, laws
      - model types
      - classification systems
      - taxonomies
      - etc.
  • [Figure labels: objects; processes, events; theories; ontology; models; hypothesis formation; hypothesis testing] (after A.K. Sinha, 2004)

Aspects of a knowledge infrastructure (KI) for CI
  • KI should support full discovery: theory discovery, model discovery, information discovery, data discovery
  • [Figure labels: Theory (ontology, model type); Model; Information (regularities: e.g. schemas, objects, processes); Data (observation, sensors); Abduction, Deduction, Induction; Gap] (adapted from Sowa, 2000)

Current geo-CI focus
  • focus on efficiency as path to creativity
Gaps
  • missing explicit theory-discovery support
      - implicitly assumed to be done in heads of scientists
      - little infrastructure for scientific theories, e.g. implications of alternate theories on models
      - limited support for pragmatics (agent actions)
  • limited role of ontologies
      - background tools for info. manipulation, not scientific discovery
      - e.g. mainly used in resource discovery and integration
      - must understand ontologies as geoscience artefacts, to aid discovery

Ontologies as geoscience artefacts
Toward formalizing concepts in geoscience models and theories
  • Theory (universal): utilizes universal concepts and their relations, e.g.
      - theories (plate tectonics, …)
      - normative guides (stratigraphic codes, textbooks, …)
      - classification systems (rock types, …)
      - conceptual data models (e.g. NADM)
  • Conceptual Model (situational): utilizes generalized situations and relations, e.g. legends, stratigraphic lexicons, regional models
  • Geospatial Model (particular): utilizes states and their relations, e.g. maps, 3D models, simulations, …
  (from Dekemo, 2004)

Ontologies as geoscience artefacts
  • Do concept types imply theory and model types?
      - what concept types are inherent to geosciences?
      - what gains are accrued from higher representation precision?
  • Likely problem in KI re: concept types: using a concept too general or too specific for a task
      - e.g. using regional concepts to calculate the local earthquake or landslide risk for your home (too general)
      - e.g. using the current local conditions for regional groundwater vulnerability estimates (too specific)

Need for abstraction distinctions
  • risk models use physical qualities of rocks, e.g. density, porosity, permeability, …
  • different rock quality values affect risk estimates
      - measured values at a site (point)
      - prototypical values for one polygon (on a map)
      - prototypical values for a class of polygons (in a map legend)
      - normative values (in a classification scheme)
  • [Figure labels: Seismic Hazard Model; Geologic Map; Groundwater Vulnerability Model]

Geoscience Ontology: abstraction levels for geoscience concepts
  • Upper-level (universal): spans geospace-time; definitional; identified by 'logic'; e.g. endurant
  • Domain-level (domain specific) (Millikan, 2000)
      - spans geospace-time; definitional; defined by 'essences'; e.g. geologic formation
      - situated in geospace-time; situational; defined by 'histories'; e.g. formation X
  • Individual: single entity; e.g. rock body #1
  • State: single description; e.g. rock body #1 @ time1

Situated concepts: historical, geo-temporal, situational
  • Defined by: common process history (process S over t0, t1, …, tn); process history → situation
      - generic concept: process = S, quality1 = x-y
      - example concept 'Keli Mutu Volcanic lake': Process = 'Keli Mutu volcanism', pH = 1.8 - 3.1 (Keli Mutu volcanism in Flores, Indonesia; Pasternack & Varekamp, 1994; 1997)
  • Instantiated by: individuals sharing a process; extension (class)
      - individual1, individual2; e.g. 'Tiwu Ata Polo', 'Tiwu Ata Mbupu'
  • Described by: none/some common qualities; potential prototype effects; potential quality change in time
      - generic states: (process = S, quality1 = x, quality2 = a); (process = S, quality1 = y, quality2 = b)
      - example states: (Process = 'Keli …', pH = 1.8, Location = Flores); (Process = 'Keli …', pH = 3.1, Location = Flores)

Essential concepts: ahistorical, spatial, definitional
  • Defined by: space of essential qualities (after Gardenfors, 2000); process type
      - generic concept: quality1 = x-z, quality2 = y-z, …
      - example concept 'Granite': Process = 'Igneous', Quartz = 20-60%, Alkali Feldspar = …, Plagioclase = …
        (Jersey & Tarman)
  • Instantiated by: individuals with a point in quality space; extension (class)
      - individual1, individual2; e.g. 'Rock Sample 1234' (Process = 'Igneous')
  • States
      - generic states: (quality1 = x, quality2 = y, …); (quality1 = z, quality3 = z, …)
      - example state: Quartz = 25%, Alkali Feldspar = 40%, Plagioclase = 35%, Location = …

Geoscience Ontology: abstraction levels for geoscience concepts
  • Upper-level (universal): spans geospace-time; definitional; identified by 'logic'; e.g. endurant
  • Domain-level (domain specific) (Millikan, 2000; Gardenfors, 2000)
      - spans geospace-time; schematic; template, frame, dimensions; e.g. geologic unit
      - spans geospace-time; definitional; defined by 'essences'; e.g. geologic formation
      - situated in geospace-time; situational; defined by 'histories'; e.g. formation X
  • Individual: single entity; e.g. rock body #1
  • State: single description; e.g. rock body #1 @ time1

Schematic concepts: ahistorical, aspatial, definitional
  • Defined by: schematic space
      - generic schema: quality1, quality2, …
      - example schema 'Rock material': Process, Minerals
  • Concept within the schema: quality1 = x-z, quality2 = y-z, …
      - example 'Granite': Process = 'Igneous', Quartz = 20-60%, Alkali Feldspar = …, Plagioclase = … (Jersey & Tarman)
  • Individuals (individual1, individual2) and their states: (quality1 = x, quality2 = y, …); (quality1 = z, quality3 = z, …)
      - example 'Rock Sample 1234': Process = 'Igneous', Quartz = 25%, Alkali Feldspar = 40%, Plagioclase = 35%, Location = …

Example [LEKXIS]
  Legend: [L] Logic-driven, [E] schEma-driven, [K] Content-driven, [X] conteXt-driven, [I] Individual, [S] State
  Upper-level categories shown: Property, Process, Material
  • upper [L]: Physical Object; origin; lifetime
  • schematic [E]: Lithostrat Unit; origin = Geologic History; lifetime = Geologic Time; lithology = Earth Material; mappability = {mappable, …}
  • essential [K]: Formation; origin = Geologic History; lifetime = Geologic Time; lithology = Earth Material; mappable
  • situational [X]: Formation X; origin = history of X; lifetime = Devonian; lithology = sandstone; mappable
  • individual [I]: Rock body Y; origin = history of Y; lifetime = Late Devonian; lithology = sandstone; mappable
  • state [S]: Rock body Y @ t1; origin = history of Y @ t1; lifetime = Frasnian; lithology = shale; mappable

Example [LEKXIS]
  Legend: [L] Logic-driven, [E] schEma-driven, [K] Content-driven, [X] conteXt-driven, [I] Individual, [S] State
  Upper-level categories shown: Material, Property, Process
  • upper [L]: Physical Object; origin; lifetime
  • schematic [E]: Earth Material; origin = Geologic History; lifetime = Geologic Time; minerals = {Q, A, P, …}; composition = {felsic, mafic, …}; texture = {coarse, …}
  • essential [K]: Granite; origin = Geologic History; lifetime = Geologic Time; minerals: Q = 20-60, A = …, P = …, M <= 90%; composition = felsic; texture = medium crystal
  • situational [X]: Granites of Fm X; origin = geologic history of X; lifetime = Devonian; minerals: Q = 25-35, A = 31-40, P = 25-43, M = 15-25%; composition = felsic; texture = medium crystal
  • individual [I]: Granite of X1; origin = geologic history of granite of X1; lifetime = Late Devonian; minerals: Q = 30, A = 40, P = 30, M = 20%; composition = felsic; texture = medium crystal
  • state [S]: Granite of X1 @ t1; origin = history of gr. of X1 @ t1; lifetime = Frasnian; minerals, composition, texture: …

Example
  upper        Geographical Region   Geographical Feature   Geographical Region   Biological Object
  schematic    Ecologic Rank         Mountain               Political Unit        Biological Rank
  essential    Domain                Fault-origin Mtn.      Country               Species
  situational  Polar                 Canadian Rocky Mtn.    Western country       Human
  individual   this Polar region     Mt. Whistler           Canada                Boyan
  state        this Polar @ now      Mt. Whistler @ now     Canada @ now          Boyan @ now

Ontologies as geoscience artefacts
  • Theory (levels: upper, schematic, essential): utilizes universal concepts and their relations, e.g.
      - theories (plate tectonics, …)
      - normative guides (stratigraphic codes, textbooks, …)
      - classification systems (rock types, …)
      - conceptual data models (e.g. NADM)
  • Conceptual Model (level: situational): utilizes generalized situations and relations, e.g. legends, stratigraphic lexicons, regional models
  • Geospatial Model (levels: individual, state): utilizes states and their relations, e.g. maps, 3D models, simulations, …
  (from Dekemo, 2004)

Application: semantic cube
  • Axes: individuals (one, many, all); places (one, many, all); times (one, many, all)
  • Levels placed in the cube: [S] State, [I] Individual, [X] Situational, [K] Ahistorical, [E] Schematic, [L] Upper
  • Arrow label: increasing semantic granularity

Application: semantic confusion matrix
  • Semantic confusion =def inappropriate substitution of a concept from one level of granularity with a concept from another level, for some task.
      - over-granular = too specific, e.g. using local conditions for regional risk
      - under-granular = too general, e.g. using regional conditions for local risk
  • Matrix rows and columns: State [S], Individual [I], Situational [X], Essential [K]
  • Under-granular cell entries: in time and place; in place, individuals; in time, place, individuals (four cells)
  • Over-granular cell entries: in time; in time and place; in place, individuals; in time, place, and individuals (three cells)

Final thoughts
  • CI needs KI for full scientific discovery
  • proposal for increased semantic granularity in geoscience ontologies
      - to aid discovery
      - increased prominence of historical-geographical
  • but does it work? … needs:
      - historical case studies
      - knowledge infrastructure implementation and testing
      - formal knowledge representation

Questions, comments?
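The semantic confusion definition above can be sketched in a few lines of Python. This is only an illustration and not part of the original slides: the names `Level` and `classify_substitution` are hypothetical, the ordering of levels from most specific (state) to most general (upper) is an assumption read from the abstraction-level slides, and so is the reading of "local conditions" as the state level and "regional" concepts as the situational level.

```python
from enum import IntEnum


class Level(IntEnum):
    """Abstraction levels, ordered from most specific to most general (assumed order)."""
    STATE = 0        # [S] single description, e.g. rock body #1 @ time1
    INDIVIDUAL = 1   # [I] single entity, e.g. rock body #1
    SITUATIONAL = 2  # [X] defined by 'histories', e.g. formation X
    ESSENTIAL = 3    # [K] defined by 'essences', e.g. geologic formation
    SCHEMATIC = 4    # [E] template, frame, dimensions, e.g. geologic unit
    UPPER = 5        # [L] identified by 'logic', e.g. endurant


def classify_substitution(used: Level, needed: Level) -> str:
    """Label the use of a concept at level `used` for a task that needs level `needed`."""
    if used < needed:
        return "over-granular"   # too specific for the task
    if used > needed:
        return "under-granular"  # too general for the task
    return "appropriate"


# Current local conditions (state) used for a regional estimate (situational).
print(classify_substitution(Level.STATE, Level.SITUATIONAL))   # prints over-granular
# Regional concepts (situational) used for a local, single-site risk (state).
print(classify_substitution(Level.SITUATIONAL, Level.STATE))   # prints under-granular
```

A single ordering is the simplest possible reading; the matrix on the slide also records in which dimensions (time, place, individuals) the mismatch occurs, which this sketch leaves out.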
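The contrast drawn earlier between essential and situated concepts can likewise be sketched as two membership tests. This is a hypothetical Python illustration: the class and method names are invented here, only the quality range that the slides actually state (Quartz = 20-60%) is encoded while the elided granite ranges stay unconstrained, and treating "sharing a process" as equality of process names is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class EssentialConcept:
    """Ahistorical concept: a region in a space of essential qualities."""
    name: str
    process_type: str
    quality_ranges: Dict[str, Tuple[float, float]] = field(default_factory=dict)

    def admits(self, process_type: str, qualities: Dict[str, float]) -> bool:
        # An individual qualifies if its point in quality space lies
        # inside every stated range (unstated qualities are not tested).
        if process_type != self.process_type:
            return False
        return all(
            name in qualities and low <= qualities[name] <= high
            for name, (low, high) in self.quality_ranges.items()
        )


@dataclass
class SituatedConcept:
    """Historical concept: individuals qualify by sharing a process history."""
    name: str
    process: str

    def admits(self, process: str) -> bool:
        # Qualities may differ between individuals and change in time,
        # so membership is decided by the shared process alone.
        return process == self.process


# Values taken from the slides; only the Quartz range is given there.
granite = EssentialConcept("Granite", "Igneous", {"Quartz": (20.0, 60.0)})
sample_1234 = {"Quartz": 25.0, "Alkali Feldspar": 40.0, "Plagioclase": 35.0}
print(granite.admits("Igneous", sample_1234))

lake = SituatedConcept("Keli Mutu Volcanic lake", "Keli Mutu volcanism")
print(lake.admits("Keli Mutu volcanism"))
```

The design choice worth noting is that the situated test ignores qualities entirely, which mirrors the slide's "described by: none/some common qualities", whereas the essential test depends on nothing but qualities and process type.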