Metadata Agents and Semantic Mediation
Mikhaila Burgess
Cardiff University
BiodiversityWorld GRID Workshop
NeSC, Edinburgh – 30 June and 1 July 2005

Overview
• BiodiversityWorld Metadata Repository
• BDW Ontology
• Chosen ontology tools
• Ontology structure
• Metadata Agents
• Metadata in the BDW environment
• Further developments

Metadata Repository (MDR)
• To store BDWorld metadata - “data about data”
• Composed of:
  • Resource Registry
  • BDWorld Ontology
  • Ontology tools – inc access, update, query
• Access methods:
  • Web front-end (eg Resource Registry)
  • Ontology GUI
  • Metadata Agents

Metadata Repository
• To allow resources to publish their metadata
  • Describe the type of resource
  • Supported operations
  • Supported data types
  • Access methods
• Supply information to workflow manager:
  • Locating suitable resources & operations
  • Ensure operation compatibility
  • Manage workflow provenance information

Ontology: a definition
• “An ontology defines a common vocabulary for researchers who need to share information on a domain. It includes machine-interpretable definitions of basic concepts in the domain and relations among them.”
• Some reasons:
  • Extracting domain knowledge (explicit, analysis)
  • Reuse of domain knowledge
  • Sharing common understanding of the structure of information

Ontology: What is it?
• Ontology is a formal explicit description of:
  • concepts in a domain of discourse (classes, concepts) [Entity/Object]
  • properties of each concept describing various features and attributes of the concept (slots, roles, properties) [Attribute]
  • restrictions on slots (facets, role restrictions) [Value]
• Knowledge base - an ontology, with a set of individual instances of classes/concepts

Why do we want an ontology?
• Storing and sharing of semantic information
• Answering questions:
  • Which resources can provide information on legumes?
  • Find operations that are similar to a given operation.
  • Which operations accept map coordinates?
  • Given a specific operation, which other operations will accept its output?

Protégé & Jena
• Protégé
  • Open source ontology editor
  • Knowledge base framework
  • Number of plug-ins available
  • Customisable
• Jena
  • Java framework for building Semantic Web apps
  • Programmatic environment – RDF, RDFS, OWL
  • Rule-based inference engine

Ontology Contents
• Resources
  • Datatype: Name, Description, Type, Owner, …
  • Object: Operation(s), Endpoint(s), …
• Operations
  • Datatype: Name, Description, Usage, Num ports, …
  • Object: Ports, Resource(s), Author/Owner, Similar Operation(s), …
• Ports
  • Datatype: Type, Optional, Default
  • Object: Data type, Operation(s)

Ontology Contents
• Data Types
  • Java, XML, etc
  • Link to operations
• Acronyms: expansion, definition(s)
• Keywords
  • Definition(s)
  • Synonyms & Antonyms
  • Similar terms
• Two levels of metadata …

Levels of Metadata
• Metadata - “data about data”
• Meta-metadata – data about metadata
• BDW metadata: Meta-metadata
  • Resource Name
  • Description
  [Callout: Ontology metadata]
• MDR data: Metadata
  • “AVH Database”
  • “Australia's Virtual Herbarium”
  [Callout: Instance data]

BDW Ontology in Protégé
[Figure; callouts: Datatype property, Object property]

Ontology Viewing
[Two figure slides]

Instances and Relationships
[Figure]

Terms/Keywords
[Figure; callouts: Definitions, Related terms, Opposite, Interchangeable]

The Ontology Online
[Figure]

Metadata Agents
• An interface to the MDR
• Multiple instances for a single MDR
• Primary role of Metadata agents:
  • Resource Locating
  • Resource Matching
  • Resource Discovery
• Provide metadata and semantic information to workflow manager

Initial Operation Set
• Vocabulary queries
  • Retrieve all resource names
  • Retrieve all operation names
• Selection queries
  • Retrieve all resources meeting specific criteria
  • Retrieve all operations with specific output
• Matching queries
  • Which operations accept data type x?
  • Which operations output specific data type?
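The three query groups in the initial operation set can be illustrated with a small in-memory sketch. This is not the BDW implementation, which the slides place on Jena and an OWL ontology; it is a plain-Python stand-in that only mirrors the Resources / Operations / Ports structure of the ontology. Every class, method, operation and data type name here is a hypothetical chosen for illustration; only "AVH Database" and "Australia's Virtual Herbarium" come from the slides.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


# NOTE: all names below are illustrative assumptions, not BDW vocabulary.
@dataclass(frozen=True)
class Port:
    direction: str          # "in" or "out"
    data_type: str          # hypothetical type label, e.g. "MapCoordinates"
    optional: bool = False


@dataclass
class Operation:
    name: str
    resource: str           # name of the resource offering this operation
    ports: List[Port] = field(default_factory=list)

    def inputs(self) -> Set[str]:
        return {p.data_type for p in self.ports if p.direction == "in"}

    def outputs(self) -> Set[str]:
        return {p.data_type for p in self.ports if p.direction == "out"}


@dataclass
class Resource:
    name: str
    description: str
    kind: str               # hypothetical "Type" value, e.g. "database"


class MetadataAgent:
    """Toy stand-in for a metadata agent answering queries over the MDR."""

    def __init__(self, resources: List[Resource], operations: List[Operation]):
        self.resources = resources
        self.operations = operations

    # Vocabulary queries
    def resource_names(self) -> List[str]:
        return sorted(r.name for r in self.resources)

    def operation_names(self) -> List[str]:
        return sorted(o.name for o in self.operations)

    # Selection queries
    def select_resources(self, criterion: Callable[[Resource], bool]) -> List[Resource]:
        return [r for r in self.resources if criterion(r)]

    def operations_with_output(self, data_type: str) -> List[str]:
        return sorted(o.name for o in self.operations if data_type in o.outputs())

    # Matching queries
    def operations_accepting(self, data_type: str) -> List[str]:
        return sorted(o.name for o in self.operations if data_type in o.inputs())

    def successors_of(self, operation_name: str) -> List[str]:
        """Other operations that accept at least one output of the given one."""
        source = next(o for o in self.operations if o.name == operation_name)
        produced = source.outputs()
        return sorted(
            o.name for o in self.operations
            if o.name != operation_name and produced & o.inputs()
        )


# Sample metadata: the operations and the second resource are invented.
agent = MetadataAgent(
    resources=[
        Resource("AVH Database", "Australia's Virtual Herbarium", "database"),
        Resource("Climate Modeller", "hypothetical modelling tool", "analysis tool"),
    ],
    operations=[
        Operation("getSpecimenLocations", "AVH Database",
                  [Port("in", "SpeciesName"), Port("out", "MapCoordinates")]),
        Operation("predictDistribution", "Climate Modeller",
                  [Port("in", "MapCoordinates"),
                   Port("in", "ClimateLayer", optional=True),
                   Port("out", "DistributionMap")]),
    ],
)

print(agent.resource_names())
print(agent.operations_accepting("MapCoordinates"))
print(agent.successors_of("getSpecimenLocations"))
```

Matching here is exact string equality on data type labels. Semantic mediation would presumably relax this using the ontology's synonym and similar-term links, which this sketch does not attempt.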

How the MDR will fit into BDW
[Architecture diagram; labels: Workflow Manager (Triana); Wrapper(s); User; Ontology GUI; Protégé; Inference engine; Metadata Agent(s); Multiple wrappers & resources; JENA; Multiple instances; BDW Ontology (OWL); Currently: OWL; Future: Database; Ontology Instance Data (Database)]

Further Development
• More work on inference engine (Jena)
• Linking Metadata Agents to Jena
• Refining/Improving ontology
• Testing of resource registration methods
• Storage of more metadata
• Putting it all together – Semantic Mediation