The Semantic PDS

J. Steven Hughes
presented by Elaine Dobinson
PV 2005, The Royal Society, Edinburgh
21-23 November 2005
steve.hughes@jpl.nasa.gov

Topics
• Introduction
• The Planetary Data System (PDS)
• Planetary Science Ontology
• Semantic Web Tools
  – Resource Description Framework (RDF)
  – Metadata Browsers
• Conclusion

Introduction
• The Planetary Data System (PDS) seeks more powerful, flexible, intuitive, and user-friendly interfaces to its resources
  – The advent of the Web has raised users' expectations regarding access to information
  – Users now expect “Google-like” searches
    • Text-based search (e.g. “Titan”)
  – But they also want “smarter” searches
    • Facet-based search (e.g. “Target_Name = Titan”)
    • The precision of forms-based search (e.g. Latitude between -10.0 and 10.0)
• The Semantic Web and its tools and languages are making this possible

Introduction
• "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."
  - Tim Berners-Lee, James Hendler, Ora Lassila, "The Semantic Web", Scientific American, May 2001
  – Web technologies such as HTML, hyperlinks, and the HTTP protocol have focused on providing information for human consumption
  – Newer technologies such as Extensible Markup Language (XML), Resource Description Framework (RDF), and software agents will provide information sufficient for computer processing and reasoning
  – The Semantic Web depends on domain models/ontologies that provide the necessary semantics

Introduction
• The following slides present a proof-of-concept process for building a Semantic Web application
  – Describe an existing knowledge base (PDS Catalog)
    • Data model (schema)
    • Data (instances)
  – Build an ontology from the data model
  – Export the knowledge base to a Semantic Web language
  – Load a shareware metadata browser
  – Configure a user interface that allows facet- and text-based searches
• Describe the benefits of using Semantic Web technologies

PDS Overview
NASA’s Planetary Data System
The PDS acquires, preserves, and distributes the large volume of unique and valuable data returned by Solar System Exploration missions.

Salient Features (Products & Services)
• High quality peer-reviewed science data archives
• Online data search/retrieval to planetary community
• Archiving expertise/standards to planetary missions
• Scientific expertise and support for users
• Supports mission planning, design, and research
• Value-added aggregated data products
• Education and outreach data products and services

Science (via Discipline Specialists)
• Engineering Node provides infrastructure system engineering and development, data engineering and standards, etc. across PDS
• Discipline Nodes curate discipline-specific science datasets for use by the science community

The PDS Data Model
• The PDS developed the Planetary Science Data Model in the late 1980s
• The PDS data model was implemented in a relational DBMS and made operational in 1990
  – Catalog of Data Sets
  – Currently ~1000 data sets in catalog
• An evolved version of the original data model remains the key standard for collecting and organizing archive data and metadata
  – ~50 defined objects (modeled entities)
  – ~1,200 data elements (attributes)
  – Maintained in the Planetary Science Data Dictionary
• Data model evolves to reflect changing domain

The Planetary Science Ontology
• The PDS Data Model was ingested into Stanford’s Protégé ontology tool
  – PDS entities became ontology classes
    • E.g. Data Set, Target, Mission
  – PDS data elements became ontology slots
    • E.g. Data_Set_Id, Data_Set_Desc
    • Constraint attributes were maintained (e.g. cardinality)
  – Included interoperability concepts from Object-Oriented Data Technology research
  – PDS data records became ontology instances
    • Data records extracted and ingested
    • Produces a knowledge base
  – Export to Resource Description Framework (RDF)

Semantic Web Tools
Resource Description Framework (RDF) (1 of 2)
• The Resource Description Framework (RDF) describes resources using statements in the form of a subject-predicate-object expression, called a triple.
  1. Subject: the resource or "thing" being described
  2. Predicate: the trait or aspect of that resource (often a relationship)
  3. Object: the object of the relationship or the value of that trait
• RDF can be represented in an XML syntax (RDF/XML), which is the standard representation for access and exchange of RDF across the Web

<rdf_:Data_set rdf:about="&rdf_;co-s-issna/isswa…"
    dc:title="CASSINI ORBITER SATURN ISSNA…"
    rdf_:reslocation="http://pdsquery/query?..."
    rdf_:resclass="data.metadata.dataset"
    dc:publisher="NASA.PDS"
    rdf_:data_set_id="CO-S-ISSNA/ISSWA-2-EDR…">
  <rdf_:target_name>
    <rdf:Description rdf:about="&terms;titan">
      <rdfs:label>TITAN</rdfs:label>
    </rdf:Description>
  </rdf_:target_name>
  <rdf_:mission_name>
    <rdf:Description rdf:about="&terms;cassini-huygens">
      <rdfs:label>CASSINI-HUYGENS</rdfs:label>
    </rdf:Description>
  </rdf_:mission_name>
</rdf_:Data_set>

RDF/XML for PDS Cassini Imaging Data Set

Resource Description Framework (RDF) (2 of 2)
• RDF properties may be thought of as attributes of resources, and in this sense correspond to traditional attribute-value pairs.
• RDF properties also represent relationships between resources, so an RDF model can resemble an entity-relationship diagram.

RDF Schema (RDFS)
• RDF Schema (RDFS) allows the definition of sets of terms for a subject area.
  – Resource classes
    • Hierarchy of classes
  – Resource properties
    • Class properties
  – Relationships between classes and properties
  – Characteristics of properties and classes
• RDFS provides the semantics about resources, their relationships with other resources, and their properties and values.

<rdfs:Class rdf:about="&rdf_;Data_set" rdfs:label="Data_set">
  <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdf:Property rdf:about="&rdf_;data_set_id" rdfs:label="data_set_id">
  <rdfs:domain rdf:resource="&rdf_;Data_set"/>
  <rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>
<rdf:Property rdf:about="&rdf_;target_name" rdfs:label="target_name">
  <rdfs:domain rdf:resource="&rdf_;Data_set"/>
  <rdfs:domain rdf:resource="&rdf_;Target"/>
  <rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>

RDFS/XML for PDS Data Set

Metadata Browser
• Longwell
  – A suite of web-based RDF browsers developed by the SIMILE Project
  – A web application designed for browsing and searching an RDF model
  – Allows browsing and searching of arbitrarily complex RDF datasets
    • An end-user friendly view (where all the complexity of RDF is hidden)
    • An RDF-aware view (where all the details are shown)
  – Written in Java; depends on the following libraries for operation:
    • Apache Velocity as the template engine
    • Apache Lucene as the search engine
    • Jena as the RDF support library (parsing, storing, querying)
• SIMILE (Semantic Interoperability of Metadata and Information in unLike Environments)
  – A joint project conducted by the W3C, MIT Libraries, and MIT CSAIL.
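The facet- and text-based browsing described above can be sketched in a few lines of plain Python. This is only an illustration of the idea over subject-predicate-object triples; it is not how Longwell is implemented (Longwell relies on Jena and Lucene). The triples, the `ds:example-*` identifiers, the titles, the `EXAMPLE-MISSION` value, and all function names are hypothetical placeholders, not PDS catalog records or Longwell APIs.

```python
from collections import Counter, defaultdict

# Hypothetical triples (subject, predicate, object), loosely modeled on
# the Data_set / target_name / mission_name structure shown in the slides.
# None of these values are real PDS catalog entries.
TRIPLES = [
    ("ds:example-1", "type", "Data_set"),
    ("ds:example-1", "title", "Example Titan imaging data set"),
    ("ds:example-1", "target_name", "TITAN"),
    ("ds:example-1", "mission_name", "CASSINI-HUYGENS"),
    ("ds:example-2", "type", "Data_set"),
    ("ds:example-2", "title", "Example Saturn imaging data set"),
    ("ds:example-2", "target_name", "SATURN"),
    ("ds:example-2", "mission_name", "CASSINI-HUYGENS"),
    ("ds:example-3", "type", "Data_set"),
    ("ds:example-3", "title", "Example radar data set"),
    ("ds:example-3", "target_name", "TITAN"),
    ("ds:example-3", "mission_name", "EXAMPLE-MISSION"),
]


def facet_counts(triples, predicate):
    """Count distinct subjects per value of one predicate (one facet)."""
    seen = set()
    counts = Counter()
    for subject, pred, obj in triples:
        if pred == predicate and (subject, obj) not in seen:
            seen.add((subject, obj))
            counts[obj] += 1
    return counts


def facet_search(triples, **constraints):
    """Return subjects that satisfy every predicate=value constraint."""
    pairs_by_subject = defaultdict(set)
    for subject, pred, obj in triples:
        pairs_by_subject[subject].add((pred, obj))
    wanted = set(constraints.items())
    return sorted(s for s, pairs in pairs_by_subject.items() if wanted <= pairs)


def text_search(triples, term):
    """Return subjects with any object value containing the term."""
    term = term.lower()
    return sorted({s for s, _, obj in triples if term in obj.lower()})


if __name__ == "__main__":
    print(dict(facet_counts(TRIPLES, "target_name")))
    print(facet_search(TRIPLES, target_name="TITAN",
                       mission_name="CASSINI-HUYGENS"))
    print(text_search(TRIPLES, "titan"))
```

In this sketch a facet is simply a predicate whose values are counted and offered as filters, which is why `facet_search` can combine several constraints while `text_search` matches any value regardless of which predicate carries it.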
Figure: RDF-aware View

Metadata Browser - End-User Friendly View

Conclusion
Semantic Web Technology is Now Maturing
• Powerful Ontology (Data Model) Management Tools
  – Ontology Capture
    • Object classes, attributes, and relationships
    • Rich semantics (cardinality, inverse relationships, …)
  – Ontology Representation
    • Standard and novel notations
      – UML Class Diagrams
      – HTML Documents
      – Dynamic drill-down GUIs
    • New plugins being developed all the time
  – Ontology Export
    • RDFS/RDF, OWL, Frames, XML Schema/XML, XMI (UML), Relational Schema
  – Implementation Independent
  – Promotes Interoperability
    • Based on data model semantics
    • Commonalities between data models are easily exploited
    • Inference is supported

Semantic Web Technology is Now Maturing
• Ontology + Instances = Knowledge Base
  – Ingest and validate domain instances (the data)
  – Equally accessible metadata and data
• Browsers for Metadata/Data Navigation
  – Generated semi-automatically
  – Support text-, facet-, and forms-based search
  – Graphical notations for GUIs being developed
  – Allow co-location of differing domain ontologies/knowledge bases

Schedule for PDS “Google-Like” Search
• Prototypes: built and currently being evaluated
• White Paper on General Search/Technology Comparison: 12/05
• MC Decision to Proceed: 03/06
• General Data Set Search Deployed: 08/06

Contacts and URLs
J. Steven Hughes: steve.hughes@jpl.nasa.gov
Daniel Crichton: dan.crichton@jpl.nasa.gov
Semantic PDS Prototype: http://sempds.jpl.nasa.gov/sempds/longwell
SIMILE/Longwell project: http://simile.mit.edu/
PDS: http://pds.jpl.nasa.gov/

Backup

Acronyms and Definitions
DBMS – Data Base Management System
E-R – Entity-Relationship
Facet – A constraint on an attribute
PDS – Planetary Data System
RDF – Resource Description Framework
RDFS – RDF Schema
UML – Unified Modeling Language
XMI – XML Metadata Interchange
XML – Extensible Markup Language

Ontology
• Philosophy – Ontology is the study of the existence of things.
• Information Science – An ontology is the product of an attempt to formulate an exhaustive and rigorous conceptual model about a specific domain

Protégé Class Browser

Protégé HTML Export

Benefits
Diagram labels: Planetary Science Ontology; Define; Describe; Classify; Validate; Simple Search; Correlative Search; Data Mining