Semantically-Enabled Science Informatics: With Supporting Knowledge Provenance and Evolution Infrastructure Highlights
Deborah L. McGuinness
Tetherless World Senior Constellation Chair and Professor of Computer Science and Cognitive Science (previously Acting Director of the Knowledge Systems Laboratory at Stanford University)
Joint work with Peter Fox and James Hendler
Tetherless World Constellation, Rensselaer Polytechnic Institute
McGuinness – Microsoft eScience – December 8, 2008

Selected Examples and Foundations
  Semantic Technologies used in eScience (currently funded)
  - Virtual Solar Terrestrial Observatory (vsto.org)
  - Semantic Provenance Capture in Data Ingest Systems (SPCDIS)
  - Semantically-Enabled Scientific Data Integration (SESDI)
  - A Community-Driven Scientific Observations Network to Achieve Interoperability of Environmental and Ecological Data
  Semantic Foundations
  - Inference Web – Environment for Explanation, Transparency, and Trust
  - PML – Knowledge Provenance Interlingua (Proof Markup Language)
  - Ontology Environments: Ontology Repositories, Ontology Editing, Semantic Wiki (Semantic History), …
  - Scalable Web Science – New Web Science Center – part of Web Science Research Initiative, …

Virtual Solar Terrestrial Observatory (vsto.org)
  - Interdisciplinary Virtual Observatory for searching, integrating, and analyzing observational, experimental, and model databases.
  - Subject matter: solar, solar-terrestrial and space physics
  - Provides virtual access to specific data, model, tool and material archives containing items from a variety of space- and ground-based instruments and experiments, as well as individual and community modeling and software efforts bridging research and educational use
  - 3 year NSF project; initial deployment in year 1, multiple deployments by year 2; year 3 outreach and broadening
  - While aimed at one interdisciplinary area, it also serves as a replicable prototype for interdisciplinary virtual observatories
  - Current NSF follow-on for provenance extension (Semantic Provenance Capture in Data Ingest Systems)

Semantic filtering by domain or instrument hierarchy
  [Screenshot: partial exposure of Instrument class hierarchy]

Quick look browse
  [Screenshot]

Inference Web Explanation Architecture
  [Diagram: Semantic Web based infrastructure. Labels recovered from the figure:]
  - WWW
  - SDS (OWL-S/BPEL): trace of web service discovery
  - Learners: learning conclusions
  - JTP/CWM (KIF/N3): theorem prover/rules
  - SPARK (SPARK-L): trace of task execution
  - Text Analytics (UIMA): trace of information extraction
  - Proof Markup Language (PML): Trust, Justification, Provenance
  - Toolkit: IWTrust (trust computation); IW Explainer/Abstractor (end-user friendly visualization); IWBrowser (expert friendly visualization); IWSearch (search engine based publishing); IWBase (provenance registration)
  PML is an explanation interlingua
  - Represent knowledge provenance (who, where, when…)
  - Represent justifications and workflow traces across system boundaries
  Inference Web provides a toolkit for data management and visualization

Global View and More
  [Diagram: Views of Explanation. Explanation (in PML) with views: global, focused, filtered, abstraction, discourse, provenance, trust]
  - Explanation as a graph
  - Customizable browser options: Proof style, Sentence format, Lens magnitude, Lens width
  - More information: Provenance metadata, Source PML, Proof statistics, Variable bindings, Link to tabulator, …

Provenance View
  [Diagram: Views of Explanation. Explanation (in PML) with views: global, focused, filtered, abstraction, discourse, provenance, trust]
  - Source metadata: name, description, …
  - Source-Usage metadata: which fragment of a source has been used when

Conclusion and Links
  - Knowledge Provenance is growing in criticality as applications become more distributed, hybrid, and collaborative
  - Inference Web and PML provide an open infrastructure and starting point that is being used in a widening set of applications: inference-web.org
  - Semantic eScience class link (with book to follow): http://tw.rpi.edu/wiki/Semantic_e-Science
  - Sample of implemented eScience applications using semantic technologies:
    - Interdisciplinary Virtual Observatory (VSTO): vsto.org
    - Semantic Provenance (SPCDIS): tw.rpi.edu/wiki/SPCDIS
    - Volcano/Atmosphere/Plate tectonics (SESDI): sesdi.hao.ucar.edu/
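
Illustrative sketch (not from the talk): the Provenance View slide distinguishes source metadata (name, description, …) from source-usage metadata (which fragment of a source has been used when). The Python below is one minimal way to picture that distinction. It does not use the actual PML vocabulary or any Inference Web API; the class names, field names, and sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Source:
    """Source metadata: a name and a description (hypothetical fields)."""
    name: str
    description: str


@dataclass(frozen=True)
class SourceUsage:
    """Source-usage metadata: which fragment of a source was used, and when."""
    source: Source
    fragment: str        # hypothetical locator for the part of the source used
    used_at: datetime    # time at which the fragment was used


def usages_of(usages, source_name):
    """Return the usage records for the named source, oldest first."""
    hits = [u for u in usages if u.source.name == source_name]
    return sorted(hits, key=lambda u: u.used_at)


# Made-up sample data, for illustration only.
catalog = Source(name="instrument-catalog",
                 description="Hypothetical instrument listing")
log = [
    SourceUsage(catalog, "rows 10-20", datetime(2008, 12, 8, 9, 30)),
    SourceUsage(catalog, "rows 1-5", datetime(2008, 12, 8, 9, 0)),
]

for usage in usages_of(log, "instrument-catalog"):
    print(usage.used_at.isoformat(), usage.fragment)
```

Keeping the usage record separate from the source record means one source can be described once and referenced by many usages, which is the separation the slide's two bullet points suggest.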