(Ontology-based) Metadata: What is it, Where and How can we use it, and How can we share it?
• Free text and tags also allowed
• And many other Wh-questions
• Controlled and systematic management

Oscar Corcho
University of Manchester
Oscar.Corcho@manchester.ac.uk
www.ontogrid.eu
National e-Science Centre, Edinburgh 27/11/06

Outline
• Metadata, annotations... What are they and where are they used?
• Semantic Annotation Web
• Semantic Data (Integration) Web
• Semantic Knowledge (Reasoning) Web
• Our approach to systematic metadata management
• OntoGrid and S-OGSA
• The S-OGSA model: Semantic Bindings
• S-OGSA capabilities and mechanisms
• One S-OGSA scenario of use
• Ongoing work
• Conclusions
Edinburgh, 27 November 2006

Annotation
• Assert facts using terms (metadata in RDF)
• Represent terms and their relationships (ontology in RDFS/OWL)
[Figure labels: News, Videocast, Grant Application, Research Events, Organisation, Gene Database]

Types of vocabularies. Formality
[Figure: spectrum of vocabulary types, from less to more formal: Controlled vocabularies; Terms/glossary; Thesauri (“narrower term” relation); Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restrs.; Disjointness, Inverse, Part-Of ...; General Logical constraints]
[Example vocabularies placed on the spectrum: Dublin Core, MeSH, ISWC, FOAF, TGN, KWeb, OntoWeb, BIRNLex, GO, CulturalTour, FundFinder, GALEN. “Add your vocabularies here”]
Lassila O, McGuinness D. The Role of Frame-Based Representation on the Semantic Web. Technical Report KSL-01-02. Knowledge Systems Laboratory, Stanford University. 2001.

Metadata annotation
Different types of annotation depending on the type of vocabulary used

Based on Dublin Core
• The contributor and creator is the flight booking service “www.flightbookings.com”.
• The date would be January 1st, 2003, if the HTML page was generated on that specific date.
• The description would be something like “flight details for a travel between Madrid and Seattle via Chicago on February 8th, 2004”.
• The document format is “HTML”.
• The document language is “en”, which stands for English.

Based on thesauri
• Madrid is a reference to the term with ID 7010413 in the thesaurus, which refers to the city of Madrid in Spain.
• Spain is a reference to the term with ID 1000095, which refers to the kingdom of Spain in Europe.
• Chicago is a reference to the term with ID 7013596, which refers to the city of Chicago in Illinois, US.
• United States of America is a reference to the term “United States” with ID 7012149, which refers to the US nation.
• Seattle is a reference to the term with ID 7014494, which refers to the city of Seattle in Washington, US.

Based on ontologies
• Concept instances relate a part of the document to one or several concepts in an ontology. For example, “Flight details” may represent an instance of the concept Flight, and can be named AA7615_Feb08_2003, although concept instances do not necessarily have a name.
• Attribute values relate a concept instance to a part of the document that is the value of one of its attributes. For example, “American Airlines” can be the value of the attribute companyName.
• Relation instances relate two concept instances through a domain-specific relation. For example, the flight AA7615_Feb08_2003 and the location Madrid can be connected by the relation departurePlace.

Corcho O. Ontology-based document annotation: trends and open research problems. International Journal of Metadata, Semantics and Ontologies 1(1):47-57. 2006.

Outline
• Metadata, annotations... What are they and where are they used?
• Semantic Annotation Web
• Semantic Data (Integration) Web
• Semantic Knowledge Web
• Our approach to systematic metadata management
• OntoGrid and S-OGSA
• The S-OGSA model: Semantic Bindings
• S-OGSA capabilities and mechanisms
• One S-OGSA scenario of use
• Ongoing work
• Conclusions

Integration
• Use a uniform common model in RDF
• Connecting through shared terms and shared instances
• Preserving context and provenance
[Figure labels: D2R, R2O, BIRN Mediator, Agents, Smart portals, Data mining, Social networking, Smart search, Knowledge Discovery, Information Integration and aggregation]

Resource Description Framework
[Figure: RDF graph of data generated by services/workflows. Resources: urn:data1, urn:data2, urn:data12, urn:data:3, urn:data:f1, urn:data:f2, urn:hit1…, urn:hit2…, urn:hit5…, urn:hit8…, urn:hit10…, urn:hit50…, urn:BlastNInvocation3, urn:compareinvocation3, urn:invocation5. Properties: [instanceOf], [type], [input], [output], [performsTask], [contains], [hasHits], [hasName], [similar_sequence_to], [directlyDerivedFrom], [distantlyDerivedFrom]. Other labels: SwissProt_seq, Blast_report, Sequence_hit, DatumCollection, LSDatum, Find similar sequence, Missed sequence, New sequence. Legend: [ ] Properties, Concepts, Services, literals.]

Metadata Matters
• Flexible and extensible self-describing schemas that don’t have to be nailed down
  • “Let’s describe my data set, or the output format of my tool, that changes all the time”
• Open world
  • “I need to comment on that experiment”
  • “That fact is now incorrect because …”
• Data fusion across different data models, cross-linked by shared instances and shared concepts
• Global naming scheme
  • E.g. LSID: Life Science Identifiers

Don’t Prescribe, Describe!!
• The tyranny of the table
• The tyranny of the tree
“Not everything fits in one taxonomy” -- Maryanne Martone (US BIRN)

Seamark Demo: ID new drug candidates for BRKCB-1
[Figure: linked data sources: GO2Keyword.rdf, Keywords.rdf, ProbeSet.rdf, GO2UniProt.rdf, GO2OMIM.rdf, IntAct.rdf, OMIM.rdf, GO.rdf, UniProt.rdf, GO2Enzyme.rdf, Taxonomy.rdf, PubMed.xml, Enzymes.rdf, KEGG.rdf. Entity labels: Keyword, Probe, Protein, Gene, MIM Id, Organism, Enzyme, Citation, Compound, Pathway.]
Courtesy Joanne Luciano

RDF for Proteomic Standards
http://www.naturebiotechnology.org

Outline
• Metadata, annotations... What are they and where are they used?
• Semantic Annotation Web
• Semantic Data Web
• Semantic Knowledge (Reasoning) Web
• Our approach to systematic metadata management
• OntoGrid and S-OGSA
• The S-OGSA model: Semantic Bindings
• S-OGSA capabilities and mechanisms
• One S-OGSA scenario of use
• Ongoing work
• Conclusions

Inference
• Logic-based classification and validity checking using OWL
• Rules using SWRL (Semantic Web Rule Language)
• RDF queries
• Just making connections because so much stuff is connected!
[Figure labels: 8q24; PVT1; “Rearrangement of a DNA sequence homologous to a cell-virus junction fragment in several Moloney murine leukemia virus-induced rat thymomas”]
James Hendler. Science and the Semantic Web. Science 299: 520-521, 2003.

In summary
[Figure: summary diagram. Technologies: XML, RDF, RDF(S), OWL, SWRL. Labels: Annotation; Integration; Inference; Controlled vocabularies; Extensible metadata schemas that you don’t have to nail down; Data fusion; Model fusion; Expressive models.]

Outline
• Metadata, annotations... What are they and where are they used?
• Semantic Annotation Web
• Semantic Data (Integration) Web
• Semantic Knowledge (Reasoning) Web
• Our approach to systematic metadata management
• OntoGrid and S-OGSA
• The S-OGSA model: Semantic Bindings
• S-OGSA capabilities and mechanisms
• One S-OGSA scenario of use
• Ongoing work
• Conclusions

OntoGrid (EU-STREP Project)
Middleware for the Semantic Grid
• SEMANTIC OGSA: Capabilities & Behaviors
• P2P Metadata Storage & Querying for Semantic Grids (Atlas)
• Principled way of Ontology Access: WS-DAIOnt realization RDF(S)
• Annotation:
  • Data and provenance: Knowledge Parser
  • Services: ODE-SGS
• Applications: Insurance Settlement; Satellite Image Quality Analysis
• Business process monitoring, Negotiation, Coordination
Disclaimer: Talking about Grid does not necessarily mean High Performance Computing and Parallelisation, but mainly management of distributed systems.

S-OGSA
Semantic-OGSA (S-OGSA) is...
• Our proposed Semantic Grid reference architecture
• A low-impact extension of OGSA
  • Mixed ecosystem of Grid and Semantic Grid services
    • Services ignorant of semantics
    • Services aware of semantics but unable to process them
    • Services aware of semantics and able to process (part of) them
  • Everything is OGSA compliant
• Defined by
  • Information model: New entities
  • Capabilities: New functionalities
  • Mechanisms: How it is delivered
[Figure: Model, Capabilities and Mechanisms, linked by the relations provide/consume, expose and use]

S-OGSA Model
METADATA as Semantic Annotations

S-OGSA Model: Metadata is a first-class resource
Benefits of treating Metadata as a first-class resource:
-- Clear AuthZ mechanisms
-- Clear lifetime
-- Metadata can also be distributed
-- ...
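
The benefits listed above can be made concrete with a small sketch. The Python below is illustrative only and is not the S-OGSA implementation: the class name MetadataResource, its fields (readers, termination_time) and its methods are all hypothetical, chosen to show what a clear authorisation rule and a clear lifetime could look like once a piece of metadata is managed as a resource in its own right.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional, Set


@dataclass
class MetadataResource:
    """Hypothetical model of a metadata item handled as its own resource."""
    resource_id: str                      # identifier of the metadata item itself
    annotated_entity: str                 # identifier of the thing being described
    rdf_content: str                      # the annotation, e.g. serialized RDF
    readers: Set[str] = field(default_factory=set)  # assumed AuthZ rule: who may read
    termination_time: Optional[datetime] = None     # assumed explicit lifetime

    def can_read(self, user: str) -> bool:
        # Authorisation is decided on the metadata, not on the annotated entity.
        return user in self.readers

    def is_expired(self, now: datetime) -> bool:
        # A resource without a termination time never expires.
        return self.termination_time is not None and now >= self.termination_time


created = datetime(2006, 11, 27, 12, 0)
sb = MetadataResource(
    resource_id="urn:example:sb1",
    annotated_entity="urn:example:satellite-file-42",
    rdf_content="<urn:example:satellite-file-42> <urn:example:hasQuality> 'good' .",
    readers={"alice"},
    termination_time=created + timedelta(days=30),
)
```

Because the metadata carries its own identifier, access rule and termination time, it can be stored, secured and expired independently of the entity it describes, which is also what allows it to live on a different node.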

S-OGSA Capabilities: From OGSA to S-OGSA
[Figure: OGSA architecture diagram. Labels: Application 1 ... Application N; Semantic-OGSA; OGSA; Security; Optimization; Data; Execution Management; Resource management; Information Management; Semantic Services; Infrastructure Services.]

S-OGSA Capabilities: From OGSA to S-OGSA
[Figure: S-OGSA architecture diagram. Labels: Application 1 ... Application N; Semantic-OGSA; OGSA; Security; Optimization; Data; Execution Management; Resource management; Information Management; Infrastructure Services; Semantic Provisioning Services; Semantic Services; Ontology; Reasoning; Knowledge; Metadata; Annotation; Semantic binding.]

S-OGSA Patterns. Semantic Aware and Capable Service
Deployed in Globus Toolkit 4
[Figure labels: Metadata Seeking Client; Service; Semantic aware interface; Resource; Properties; Lifetime; Semantics; Others…; Ontology Service; Metadata Service; 1 Access/Query Semantic Bindings; 1.1 Farm out request]

S-OGSA Scenario. Satellite Image Quality Analysis
Scenes: Satellite Routine Operations
• Routine operations
• Metadata generation
• Report retrieving
Satellite LifeCycle:
• Launch and Early Orbit Phase (~ 3 days)
• Calibration and Validation campaign (~ 6-9 months)
• Routine operations (~ 5-9 years)
• Satellite de-orbiting. Product processing continues

Outline
• Metadata, annotations... What are they and where are they used?
• Semantic Annotation Web
• Semantic Data (Integration) Web
• Semantic Knowledge (Reasoning) Web
• Our approach to systematic metadata management
• OntoGrid and S-OGSA
• The S-OGSA model: Semantic Bindings
• S-OGSA capabilities and mechanisms
• One S-OGSA scenario of use
• Ongoing work
• Conclusions

S-OGSA Metadata Access/Management Protocols
[Figure: Semantic Binding Service Suite. Components: Client, SB Factory, Semantic Binding (SB) resources. Operations and protocols: create; query; Inspect props; WS-Addressing: epr; WS-RP: Get/Set/Query Properties; WS-Notif: Subscribe / Notify; WS-RL: Destroy, SetTerminationTime]
[Figure labels, continued: SB; RDF; WS-RL ++: archive; Query w/o Inference, UpdateContent; Metadata Query: Query (over unified view)]

S-OGSA Metadata Lifecycle
• Metadata is normally in a stable situation
• If the entity it refers to or the knowledge entity it uses changes, then it may move to a stale situation
• Metadata can be archived or deleted from the system
[Figure: lifecycle states Stable, Stale, Archived, Deleted. Labels: GE changed; KE changed; Checks needed; Possibly reannotation]
“Periodically, we will have to reannotate everything” -- Maryanne Martone (US BIRN)

Data Integration
• Information integration from gLite and GT4 information services: BDII, RGMA, MDS
• Trade-off between...
  • Continuous update or on-demand access, fresh information
  • Consolidated data but possibly non-fresh information

Outline
• Metadata, annotations... What are they and where are they used?
• Semantic Annotation Web
• Semantic Data (Integration) Web
• Semantic Knowledge (Reasoning) Web
• Our approach to systematic metadata management
• OntoGrid and S-OGSA
• The S-OGSA model: Semantic Bindings
• S-OGSA capabilities and mechanisms
• One S-OGSA scenario of use
• Ongoing work
• Conclusions

Conclusions
• Metadata can be used for many purposes
  • Simply for the sake of annotation
    • Reuse and sharing
    • Look at the Web 2.0 success
  • For integration
    • Open and flexible schemas. Describe, not prescribe
  • For reasoning
    • Complex applications
• S-OGSA
  • Metadata as a first-class citizen: Semantic Binding
  • Semantic Binding Service already available for use
    • Robust metadata management
    • Distributed
    • Metadata lifecycle

Access to S-OGSA
Publications
• An overview of S-OGSA: a Reference Semantic Grid Architecture. Corcho O, Alper P, Kotsiopoulos I, Missier P, Bechhofer S, Goble C. Journal of Web Semantics 4(2):102-115.
June 2006
Source code
• http://www.ontogrid.net/ (for downloading distributions)
• Access to CVS
  Connection type: pserver
  user: ontogrid
  password: not needed
  Host: rpc262.cs.man.ac.uk
  Port: 2401
  Repository path: /local/ontogrid/cvsroot
  module: prototype

Questions
Thank you for your attention! Questions?
Acknowledgements
• Carole Goble
• OntoGrid team members at Manchester: Pinar Alper, Ioannis Kotsiopoulos, Sean Bechhofer, Ian Dunlop, Wei Xing
• OntoGrid Consortium
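
As a closing illustration, the S-OGSA Metadata Lifecycle slide (Stable, Stale, Archived, Deleted) can be read as a small state machine. The sketch below is only one possible reading and is not taken from the S-OGSA code base. The slide states that a change in the annotated Grid entity (GE) or in the knowledge entity (KE) moves metadata from Stable to Stale; every other transition, and all event names (ge_changed, ke_changed, reannotated, archive, delete), are assumptions made for the example.

```python
from enum import Enum


class State(Enum):
    STABLE = "stable"
    STALE = "stale"
    ARCHIVED = "archived"
    DELETED = "deleted"


# Transition table. Only STABLE -> STALE on a GE or KE change comes from the
# slide; the remaining entries are assumptions for illustration.
TRANSITIONS = {
    (State.STABLE, "ge_changed"): State.STALE,
    (State.STABLE, "ke_changed"): State.STALE,
    (State.STALE, "reannotated"): State.STABLE,   # assumed: checks passed or reannotation done
    (State.STABLE, "archive"): State.ARCHIVED,    # assumed
    (State.STALE, "archive"): State.ARCHIVED,     # assumed
    (State.STABLE, "delete"): State.DELETED,      # assumed
    (State.STALE, "delete"): State.DELETED,       # assumed
    (State.ARCHIVED, "delete"): State.DELETED,    # assumed
}


def next_state(state: State, event: str) -> State:
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state.name}") from None


# Walk one metadata item through a plausible history.
state = State.STABLE
for event in ["ge_changed", "reannotated", "ke_changed", "archive"]:
    state = next_state(state, event)
print(state.name)  # ARCHIVED
```

Keeping the transitions in a single table makes the assumed policy easy to inspect and to change, for instance if archived metadata should be restorable.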