(Ontology-based) Metadata: How What Where

Free text and tags also allowed
(Ontology-based) Metadata:
What is it,
Where and How can we use it,
and How can we share it?
www.ontogrid.eu
And many other Wh-questions
Controlled and systematic
management
Oscar Corcho
University of Manchester
Oscar.Corcho@manchester.ac.uk
National e-Science Centre, Edinburgh
27/11/06
Outline
▪ Metadata, annotations... What are they and where are they used?
  ▪ Semantic Annotation Web
  ▪ Semantic Data (Integration) Web
  ▪ Semantic Knowledge (Reasoning) Web
▪ Our approach to systematic metadata management
  ▪ OntoGrid and S-OGSA
  ▪ The S-OGSA model: Semantic Bindings
  ▪ S-OGSA capabilities and mechanisms
  ▪ One S-OGSA scenario of use
▪ Ongoing work
▪ Conclusions
Edinburgh, 27 November 2006
2
Annotations assert facts using terms (metadata in RDF)
Ontologies represent terms and their relationships (ontology in RDFS/OWL)
[Figure: annotated resources: News, Videocast, Grant Application, Research, Events, Organisation, Gene Database]
Types of vocabularies. Formality
[Figure: spectrum of vocabularies by increasing formality: controlled vocabularies; terms/glossary; thesauri (“narrower term” relation); informal is-a; formal is-a; formal instance; frames (properties); value restrictions; disjointness, inverse, part-of ...; general logical constraints. Example vocabularies placed along the spectrum: Dublin Core, MeSH, ISWC, FOAF, TGN, KWeb, OntoWeb, BIRNLex, GO, CulturalTour, FundFinder, GALEN. Caption: “Add your vocabularies here”.]
Lassila O, McGuinness D. The Role of Frame-Based Representation on the Semantic Web.
Technical Report KSL-01-02. Knowledge Systems Laboratory, Stanford University. 2001.
Metadata annotation
▪ Different types of annotation depending on the type of vocabulary used
Based on Dublin Core
The contributor and creator is the flight booking service “www.flightbookings.com”.
The date would be January 1st, 2003, if the HTML page was generated on that specific date.
The description would be something like “flight details for a travel between Madrid and Seattle via Chicago on February 8th, 2004”.
The document format is “HTML”.
The document language is “en”, which stands for English.
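The Dublin Core description above amounts to a flat record of element/value pairs. A minimal sketch, assuming a plain Python dict keyed by Dublin Core element names and a hypothetical helper `to_meta_tags` that renders the record as HTML `<meta>` elements (the helper and the ISO date rendering are illustrative choices, not part of the talk):

```python
from html import escape

# Dublin Core description of the flight-details page (values from the slide).
dc_record = {
    "contributor": "www.flightbookings.com",
    "creator": "www.flightbookings.com",
    "date": "2003-01-01",  # January 1st, 2003, written in ISO form
    "description": ("flight details for a travel between Madrid and Seattle "
                    "via Chicago on February 8th, 2004"),
    "format": "HTML",
    "language": "en",
}


def to_meta_tags(record):
    """Render each Dublin Core element as an HTML meta tag (hypothetical helper)."""
    return [
        f'<meta name="DC.{name}" content="{escape(value, quote=True)}">'
        for name, value in record.items()
    ]


for tag in to_meta_tags(dc_record):
    print(tag)
```

Every value here is free text: nothing ties “Madrid” to a shared identifier, which is what the thesaurus-based and ontology-based annotations add.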
Based on thesauri
Madrid is a reference to the term with ID 7010413 in the thesaurus, which refers to the city of Madrid in Spain.
Spain is a reference to the term with ID 1000095, which refers to the kingdom of Spain in Europe.
Chicago is a reference to the term with ID 7013596, which refers to the city of Chicago in Illinois, US.
United States of America is a reference to the term “United States” with ID 7012149, which refers to the US nation.
Seattle is a reference to the term with ID 7014494, which refers to the city of Seattle in Washington, US.
Based on ontologies
Concept instances relate a part of the document to one or several concepts in an ontology. For example, “Flight details” may represent an instance of the concept Flight, and can be named AA7615_Feb08_2003, although concept instances do not necessarily have a name.
Attribute values relate a concept instance to a part of the document, which is the value of one of its attributes. For example, “American Airlines” can be the value of the attribute companyName.
Relation instances relate two concept instances by some domain-specific relation. For example, the flight AA7615_Feb08_2003 and the location Madrid can be connected by the relation departurePlace.
Corcho O. Ontology-based document annotation: trends and open research problems. International Journal of Metadata, Semantics and Ontologies 1(1):47-57. 2006.
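The three kinds of ontology-based annotation can be written as subject/predicate/object statements. A minimal sketch, assuming a simple `Triple` tuple rather than a real RDF library; `instanceOf` as a property name and `Location` as a concept name are assumptions, the other names come from the slide:

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate object")

annotations = [
    # Concept instance: the "Flight details" fragment is an instance of Flight.
    Triple("AA7615_Feb08_2003", "instanceOf", "Flight"),
    # Attribute value: a literal taken from the document text.
    Triple("AA7615_Feb08_2003", "companyName", "American Airlines"),
    # Relation instance: links two concept instances.
    Triple("AA7615_Feb08_2003", "departurePlace", "Madrid"),
    # Assumed concept name for "the location Madrid".
    Triple("Madrid", "instanceOf", "Location"),
]


def describe(subject, triples):
    """Collect everything asserted about one subject as a property -> value map."""
    return {t.predicate: t.object for t in triples if t.subject == subject}


print(describe("AA7615_Feb08_2003", annotations))
```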
Integration: use a uniform common model in RDF
Connecting through shared terms and shared instances
Preserving context and provenance
[Figure labels: D2R, R2O, BIRN Mediator; agents, smart portals, data mining, social networking, smart search, knowledge discovery, information integration and aggregation]
Resource Description Framework
[Figure: RDF graph of data generated by services/workflows. Node labels: urn:data1, urn:data2, urn:data12, urn:data:3, urn:data:f1, urn:data:f2, urn:BlastNInvocation3, urn:compareinvocation3, urn:invocation5, urn:hit1…, urn:hit2…., urn:hit5…, urn:hit8…., urn:hit10…., urn:hit50…., SwissProt_seq, Blast_report, Sequence_hit, DatumCollection, LSDatum, “Find similar sequence”, “Missed sequence”, “New sequence”. Edge labels: [instanceOf], [type], [input], [output], [performsTask], [contains], [hasHits], [hasName], [similar_sequence_to], [directlyDerivedFrom], [distantlyDerivedFrom]. Legend: [ ] properties, concepts, services, literals.]
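A graph like the one on this slide can be held as a list of triples and queried by following properties. The sketch below computes everything a datum derives from, directly or distantly. The node and property names appear on the slide, but which node links to which is an assumption made for illustration, since the original wiring is not recoverable here:

```python
# Edges are (subject, property, object). The wiring is assumed, not taken
# from the original figure.
graph = [
    ("urn:BlastNInvocation3", "input", "urn:data1"),
    ("urn:BlastNInvocation3", "output", "urn:data2"),
    ("urn:data2", "directlyDerivedFrom", "urn:data1"),
    ("urn:compareinvocation3", "input", "urn:data2"),
    ("urn:compareinvocation3", "output", "urn:data12"),
    ("urn:data12", "directlyDerivedFrom", "urn:data2"),
]


def objects(subject, prop):
    """All objects reachable from `subject` through one `prop` edge."""
    return [o for s, p, o in graph if s == subject and p == prop]


def derived_from(datum):
    """All data that `datum` derives from, following directlyDerivedFrom edges."""
    seen, stack = [], objects(datum, "directlyDerivedFrom")
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.append(node)
            stack.extend(objects(node, "directlyDerivedFrom"))
    return seen


print(derived_from("urn:data12"))
```

Under these assumed edges, a “distantly derived from” relation need not be stored: it can be computed from the direct edges when needed.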
Metadata Matters
▪ Flexible and extensible self-describing schemas that don’t have to be nailed down
  ✓ “Let’s describe my data set, or the output format of my tool, that changes all the time”
▪ Open world
  ✓ “I need to comment on that experiment”
  ✓ “That fact is now incorrect because …”
▪ Data fusion across different data models
  ✓ Cross-linked by shared instances and shared concepts
▪ Global naming scheme
  ✓ E.g. LSID: Life Science Identifiers
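Data fusion through shared instances can be illustrated with two independently produced sets of statements about the same globally named thing. A minimal sketch, assuming hypothetical data and identifiers written in LSID style (`urn:lsid:<authority>:<namespace>:<object>`); none of these identifiers or property names come from the talk:

```python
from collections import defaultdict

# Hypothetical shared identifier in LSID style.
PROTEIN = "urn:lsid:example.org:protein:P1"

# Two sources with different "schemas" that happen to name the same instance.
source_a = {
    (PROTEIN, "hasName", "Example protein"),
    (PROTEIN, "encodedBy", "urn:lsid:example.org:gene:G1"),
}
source_b = {
    (PROTEIN, "participatesIn", "urn:lsid:example.org:pathway:W1"),
}

# Fusion is set union: no common table layout has to be agreed beforehand.
merged = source_a | source_b

# Open world: a new kind of statement can be added later without a schema change.
merged.add((PROTEIN, "comment", "I need to comment on that experiment"))


def properties_of(subject, triples):
    """Group all statements about `subject` by property."""
    result = defaultdict(set)
    for s, p, o in triples:
        if s == subject:
            result[p].add(o)
    return dict(result)


print(sorted(properties_of(PROTEIN, merged)))
```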
Don’t Prescribe, Describe!!
▪ The tyranny of the table
▪ The tyranny of the tree
“Not everything fits in one taxonomy” -- Maryanne Martone (US BIRN)
Seamark Demo: ID new drug candidates for BRKCB-1
[Figure: data sources cross-linked through shared concepts. Sources: GO2Keyword.rdf, Keywords.rdf, ProbeSet.rdf, GO2UniProt.rdf, GO2OMIM.rdf, IntAct.rdf, OMIM.rdf, GO.rdf, UniProt.rdf, GO2Enzyme.rdf, Taxonomy.rdf, PubMed.xml, Enzymes.rdf, KEGG.rdf. Concepts: Keyword, Probe, Protein, Gene, MIM Id, Organism, Enzyme, Citation, Compound, Pathway.]
Courtesy Joanne Luciano
RDF for Proteomic Standards
http://www.naturebiotechnology.org
Inference
Logic-based classification and validity checking using OWL
Rules using SWRL (Semantic Web Rule Language)
RDF queries: just making connections because so much stuff is connected!
[Figure: example connecting 8q24, PVT1 and “Rearrangement of a DNA sequence homologous to a cell-virus junction fragment in several Moloney murine leukemia virus-induced rat thymomas”]
Hendler J. Science and the Semantic Web. Science 299: 520-521. 2003.
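To give a feel for what classification by inference means, the sketch below forward-chains two RDFS-style subclass rules over a set of triples until nothing new can be derived. It is a deliberately small stand-in, not an OWL reasoner or a SWRL engine, and the class and instance names are hypothetical:

```python
def infer_types(triples):
    """Apply two rules until no new triple appears:
       (C subClassOf D), (D subClassOf E) -> (C subClassOf E)
       (x type C),       (C subClassOf D) -> (x type D)
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in facts if p == "subClassOf"]
        new = set()
        for c, d in subclass_pairs:
            for s, p, o in facts:
                if p == "subClassOf" and s == d:
                    new.add((c, "subClassOf", o))
                if p == "type" and o == c:
                    new.add((s, "type", d))
        if not new <= facts:
            facts |= new
            changed = True
    return facts


# Hypothetical ontology and data.
ontology = [
    ("ProteinCodingGene", "subClassOf", "Gene"),
    ("Gene", "subClassOf", "GenomicFeature"),
]
data = [("urn:example:geneX", "type", "ProteinCodingGene")]

inferred = infer_types(ontology + data)
print(sorted(o for s, p, o in inferred
             if s == "urn:example:geneX" and p == "type"))
```

Only the most specific type was asserted; the more general ones are derived from the ontology, which is how an annotation made against a detailed concept can still answer a query phrased in general terms.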
In summary
[Figure: technology layers XML, RDF, RDF(S), OWL and SWRL, labelled with: annotation, controlled vocabularies, extensible metadata schemas that you don’t have to nail down, integration, data fusion, model fusion, inference, expressive models]
EU-STREP Project OntoGrid
▪ SEMANTIC OGSA
  ✓ Capabilities & Behaviors
  ✓ Principled way of realization
▪ Middleware for the Semantic Grid
  ✓ P2P Metadata Storage & Querying for Semantic Grids (Atlas)
  ✓ Ontology Access: WS-DAIOnt RDF(S)
  ✓ Annotation:
    • Data and provenance: Knowledge Parser
    • Services: ODE-SGS
  ✓ Business process monitoring
  ✓ Negotiation
  ✓ Coordination
▪ Applications
  ✓ Insurance Settlement
  ✓ Satellite Image Quality Analysis
Disclaimer: Talking about Grid does not necessarily mean High Performance Computing and Parallelisation, but mainly management of distributed systems
S-OGSA
▪ Semantic-OGSA (S-OGSA) is...
  ✓ Our proposed Semantic Grid reference architecture
  ✓ A low-impact extension of OGSA
    • Mixed ecosystem of Grid and Semantic Grid services
      → Services ignorant of semantics
      → Services aware of semantics but unable to process them
      → Services aware of semantics and able to process (part of) them
    • Everything is OGSA compliant
  ✓ Defined by
    • Information model
      → New entities
    • Capabilities
      → New functionalities
    • Mechanisms
      → How it is delivered
[Figure: Model, Capabilities and Mechanisms, linked by the labels “provide/consume”, “expose” and “use”]
S-OGSA Model
[Figure: metadata as Semantic Annotations]
S-OGSA Model: Metadata is a first-class resource
Benefits of treating Metadata as a first-class resource:
-- Clear AuthZ mechanisms
-- Clear lifetime
-- Metadata can also be distributed
-- ...
S-OGSA Capabilities: From OGSA to S-OGSA
[Figure: Application 1 ... Application N on top of OGSA, inside Semantic-OGSA. Labels: Security, Optimization, Data, Execution Management, Resource management, Information Management, Infrastructure Services, Semantic Services.]
S-OGSA Capabilities: From OGSA to S-OGSA
[Figure: the same layout with the semantic additions shown. OGSA labels: Security, Optimization, Data, Execution Management, Resource management, Information Management, Infrastructure Services. Added labels: Semantic Provisioning Services, Ontology, Reasoning, Knowledge, Metadata, Annotation, Semantic binding, Semantic Services.]
S-OGSA Patterns. Semantic Aware and Capable Service
▪ Deployed in Globus Toolkit 4
[Figure: Metadata Seeking Client; Resource Service with a semantic aware interface (Properties, Lifetime, Metadata, Semantics, Others…); Ontology Service; Metadata Service. Steps: 1 Access/Query Semantic Bindings; 1.1 Farm out request.]
S-OGSA Scenario. Satellite Image Quality Analysis
Scenes:
Satellite Routine Operations
Routine operations
Metadata generation
Report retrieving
Satellite LifeCycle:
▪ Launch and Early Orbit Phase (~ 3 days)
▪ Calibration and Validation campaign (~ 6-9 months)
▪ Routine operations (~ 5-9 years)
▪ Satellite de-orbiting. Product processing continues
S-OGSA Metadata Access/Management Protocols
Semantic Binding Service Suite
[Figure: a Client calls create on the SB Factory, which creates Semantic Binding (SB) resources holding RDF and hands back a WS-Addressing epr. Operations shown on an SB:
WS-RP: Get/Set/Query Properties (inspect props ...)
WS-Notif: Subscribe / Notify
WS-RL: Destroy, SetTerminationTime
WS-RL ++: archive
Query w/o Inference, UpdateContent
The Client also queries the Metadata Query service: Query (over unified view).]
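The shape of this service suite can be mimicked in memory to show how the operations fit together. This is only an analogy under stated assumptions: the class and method names below are hypothetical, a string key stands in for the WS-Addressing endpoint reference, and nothing here is the actual S-OGSA or Globus Toolkit API:

```python
import itertools
import time


class SemanticBinding:
    """In-memory stand-in for a Semantic Binding resource (hypothetical)."""

    def __init__(self, resource, triples, lifetime_s):
        self.properties = {"resource": resource}   # cf. WS-RP Get/Set/Query
        self.triples = set(triples)                # the RDF content
        self.termination_time = time.time() + lifetime_s

    def set_termination_time(self, when):          # cf. WS-RL SetTerminationTime
        self.termination_time = when

    def update_content(self, triples):             # cf. UpdateContent
        self.triples = set(triples)

    def query(self, predicate):                    # query without inference
        return sorted(o for _, p, o in self.triples if p == predicate)


class SemanticBindingFactory:
    """Creates bindings and offers a query over all of them (hypothetical)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.bindings = {}

    def create(self, resource, triples, lifetime_s=3600):
        key = f"sb-{next(self._ids)}"   # plays the role of the returned epr
        self.bindings[key] = SemanticBinding(resource, triples, lifetime_s)
        return key

    def destroy(self, key):                        # cf. WS-RL Destroy
        del self.bindings[key]

    def query_all(self, predicate):                # query over a unified view
        found = set()
        for binding in self.bindings.values():
            found.update(binding.query(predicate))
        return sorted(found)


factory = SemanticBindingFactory()
epr = factory.create(
    "urn:example:file1",
    [("urn:example:file1", "type", "SatelliteImage")],
)
print(epr, factory.bindings[epr].query("type"))
```

Giving each binding its own identity, properties and termination time is what lets metadata be managed like any other resource, with an explicit lifetime and access point.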
S-OGSA Metadata Lifecycle
▪ Metadata is normally in a stable situation
▪ If the entity it refers to or the knowledge entity it uses changes, then it may move to a stale situation
  ✓ Checks needed
  ✓ Possibly reannotation
▪ Metadata can be archived or deleted from the system
[Figure: states Stable, Stale, Archived and Deleted; transitions from Stable to Stale labelled “GE changed” and “KE changed”]
“Periodically, we will have to reannotate everything” -- Maryanne Martone (US BIRN)
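The lifecycle reads naturally as a small state machine. A minimal sketch, assuming event names of my own choosing: only the states and the “GE changed” / “KE changed” transitions come from the slide, while the return from Stale to Stable and the exact archive/delete arrows are assumptions about the diagram:

```python
from enum import Enum


class State(Enum):
    STABLE = "stable"
    STALE = "stale"
    ARCHIVED = "archived"
    DELETED = "deleted"


# (state, event) -> next state.
TRANSITIONS = {
    # From the slide: the annotated entity (GE) or the knowledge entity (KE) changed.
    (State.STABLE, "ge_changed"): State.STALE,
    (State.STABLE, "ke_changed"): State.STALE,
    # Assumed: a successful check or a reannotation makes the metadata stable again.
    (State.STALE, "checked_ok"): State.STABLE,
    (State.STALE, "reannotated"): State.STABLE,
    # Assumed: archiving and deletion are possible from either live state.
    (State.STABLE, "archive"): State.ARCHIVED,
    (State.STALE, "archive"): State.ARCHIVED,
    (State.STABLE, "delete"): State.DELETED,
    (State.STALE, "delete"): State.DELETED,
    (State.ARCHIVED, "delete"): State.DELETED,
}


def step(state, event):
    """Return the next state, or raise if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} not allowed in state {state.name}") from None


state = State.STABLE
for event in ["ge_changed", "reannotated", "ke_changed", "archive"]:
    state = step(state, event)
print(state)
```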
Data Integration
▪ Information integration from gLite and GT4 information services
  ✓ BDII
  ✓ R-GMA
  ✓ MDS
▪ Trade-off between...
  ✓ Continuous update or on-demand access, fresh information
  ✓ Consolidated data but possibly non-fresh information
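The trade-off can be made concrete with a consolidated copy that is refreshed only when it is older than a chosen bound. A minimal sketch, assuming a hypothetical `InformationCache` class and a caller-supplied `fetch` function standing in for a call to an information service; setting `max_age_s` to 0 gives on-demand access, while a large value gives consolidated but possibly non-fresh data:

```python
import time


class InformationCache:
    """Consolidated copy of information-service data with a freshness bound."""

    def __init__(self, fetch, max_age_s, clock=time.monotonic):
        self._fetch = fetch          # function that contacts the real service
        self._max_age = max_age_s    # 0 means: always go to the service
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        expired = (self._fetched_at is None
                   or now - self._fetched_at >= self._max_age)
        if expired:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


# Demonstration with a fake service and a controllable clock.
calls = []


def fake_service():
    calls.append("fetch")
    return f"snapshot {len(calls)}"


fake_now = [0.0]
cache = InformationCache(fake_service, max_age_s=10, clock=lambda: fake_now[0])

first = cache.get()       # first access always fetches
fake_now[0] = 5.0
second = cache.get()      # still within the bound: served from the copy
fake_now[0] = 10.0
third = cache.get()       # bound reached: fetched again
print(first, second, third, len(calls))
```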
Conclusions
▪ Metadata can be used for many purposes
  ✓ Simply for the sake of annotation
    • Reuse and sharing → Look at the Web 2.0 success
  ✓ For integration
    • Open and flexible schemas. Describe, not prescribe
  ✓ For reasoning
    • Complex applications
▪ S-OGSA
  ✓ Metadata as a first-class citizen → Semantic Binding
  ✓ Semantic Binding Service already available for use
    • Robust metadata management
    • Distributed
  ✓ Metadata lifecycle
Access to S-OGSA
▪ Publications
  ✓ Corcho O, Alper P, Kotsiopoulos I, Missier P, Bechhofer S, Goble C. An overview of S-OGSA: a Reference Semantic Grid Architecture. Journal of Web Semantics 4(2):102-115. June 2006
▪ Source code
  ✓ http://www.ontogrid.net/, for downloading distributions
  ✓ Access to CVS
    Connection type: pserver
    user: ontogrid
    password: not needed
    Host: rpc262.cs.man.ac.uk
    Port: 2401
    Repository path: /local/ontogrid/cvsroot
    module: prototype
Questions
▪ Thank you for your attention!
▪ Questions?
▪ Acknowledgements
  ✓ Carole Goble
  ✓ OntoGrid team members at Manchester
    • Pinar Alper, Ioannis Kotsiopoulos, Sean Bechhofer, Ian Dunlop, Wei Xing
  ✓ OntoGrid Consortium