Semantic Grid

Group Members: Phạm Đức Đệ, Võ Bảo Hùng, Hồ Phương

Outline
◦ Introduction
◦ Semantic Web
◦ S-OGSA
◦ Implementation (e-Science & myGrid)

What is Grid?
The "Grid":
◦ Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources (virtual organizations).

What is the Semantic Grid?
An extension of the current Grid in which information and services are given well-defined and explicitly represented meaning, so that they can be shared and used by humans and machines, better enabling them to work in cooperation.

Why do we need the Semantic Grid?
Example: Suppose a machine's operating system is described as "SunOS" or "Linux". To query for a machine that is "Unix" compatible, a user has to either:
1. Explicitly incorporate the Unix compatibility concept into the request requirements by requesting a disjunction of all Unix-variant operating systems, e.g., (OpSys="SunOS" || OpSys="Linux"), or
2. Wait for all interesting resources to advertise their operating system as Unix as well as either Linux or SunOS, e.g., (OpSys="SunOS", "Unix"), and then express a match as set membership of the desired Unix value in the OpSys value set, e.g., hasMember(OpSys, "Unix").
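The matching problem in this example can be sketched in a few lines of Python. This is a minimal illustration, not part of any Grid middleware: the resource list, the function names, and the `IS_A` table are all hypothetical. The first two functions mirror the two workarounds above; the third shows how a small "is a type of" knowledge base lets a request for "Unix" match without changing either the request or the advertisements.

```python
# Hypothetical resource advertisements; "OpSys" follows the attribute name in the example.
resources = [
    {"name": "hostA", "OpSys": {"SunOS"}},
    {"name": "hostB", "OpSys": {"Linux"}},
    {"name": "hostC", "OpSys": {"WinNT"}},
]


def match_by_disjunction(resource):
    """Workaround 1: the requester lists every Unix variant explicitly."""
    return "SunOS" in resource["OpSys"] or "Linux" in resource["OpSys"]


def has_member(values, wanted):
    """Workaround 2: set membership; only works if providers also advertise "Unix"."""
    return wanted in values


# Assumed knowledge base: each concept maps to its direct parent concept.
IS_A = {"SunOS": "Unix", "Linux": "Unix"}


def is_kind_of(concept, ancestor):
    """Follow is-a links upward until the ancestor is found or the chain ends."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def match_semantic(resource, wanted):
    """Semantic match: any advertised value that is a kind of the wanted concept."""
    return any(is_kind_of(value, wanted) for value in resource["OpSys"])


by_disjunction = [r["name"] for r in resources if match_by_disjunction(r)]
by_membership = [r["name"] for r in resources if has_member(r["OpSys"], "Unix")]
by_semantics = [r["name"] for r in resources if match_semantic(r, "Unix")]
```

With these advertisements, the set-membership query finds nothing because no provider has advertised "Unix" yet, while the semantic match selects the same hosts as the hand-written disjunction.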
Example (cont.)
Apply semantics:
◦ Knowledge base: "SunOS and Linux are types of Unix operating system"
◦ Request: "Need a Unix-compatible OS"

Semantic Web
Current Web (WWW)
◦ Is a huge library of interlinked documents that are transferred by computers and presented to people
◦ Anyone can contribute to it
◦ Quality of information, or even the persistence of documents, cannot generally be guaranteed
◦ Contains a lot of information and knowledge, but machines usually serve only to deliver and present the content of documents describing the knowledge
◦ People have to connect all the sources of relevant information and interpret them themselves
Machines can process the content, but machines can't understand the content.

Semantic Web Definition
The Semantic Web is an extension of the current web in which the semantics of information and services on the web is defined, making it possible for the web to understand and satisfy the requests of people and machines to use the web content. --- Tim Berners-Lee

Ontology
An ontology is a formal representation of knowledge as a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to describe the domain.
Implemented with XML, XML Namespaces, XML Schema, RDF, RDF Schema, and OWL.

Ontology example

Semantic Web Architecture (1)

Semantic Grid
[Figure: Classical Web, Classical Grid, Semantic Web, and Semantic Grid positioned by scale of interoperability against scale of data and computation. Based on an idea by Norman Paton.]

What is Semantic Grid
◦ An extension of the Grid
◦ Rich metadata is exposed and handled explicitly, shared, and managed via Grid protocols

What is Semantic Grid
The Semantic Grid uses metadata to describe information in the Grid. Turning information into something more than just a collection of data means understanding the context, format, and significance of the data.
Therefore:
◦ Understand information
◦ Discovery and reuse

S-OGSA
A Grid usually consists of several different services defined by OGSA:
◦ VO management service
◦ Resource discovery and management service
◦ Job management service
◦ Security service
◦ Data management service
S-OGSA should (will) provide metadata and semantic services to those services.

S-OGSA
The solution:
◦ Attach semantics to Grid entities.
◦ Bind them together with a semantic binding service.
◦ Normal Grid services can become "semantic" through the semantic binding service.

S-OGSA

S-OGSA
Defined by:
◦ Information model: new entities
◦ Capabilities: new functionalities
◦ Mechanisms: how it is delivered
[Diagram: Model, Capabilities, and Mechanisms, linked by the relations "provide/consume", "expose", and "use".]

S-OGSA Model

S-OGSA Model
◦ Grid Entities: Grid resources and services
◦ Knowledge Entities: represent or operate with some form of knowledge (e.g. ontologies, rules, knowledge bases, …)
◦ Semantic Bindings: entities that associate a Grid Entity with one or more Knowledge Entities

S-OGSA Model Example
Metadata as semantic annotations

From OGSA to S-OGSA
[Figure: layered architecture. Applications 1..N sit on the OGSA capabilities (Optimization, Data, Execution Management, Resource Management, Information Management, Security, Infrastructure Services). Semantic-OGSA adds Semantic Provisioning Services alongside them: Ontology, Reasoning, Knowledge, Metadata, Annotation, and Semantic Binding.]

S-OGSA Capabilities
[Figure: entity diagram relating Grid Entity, Grid Service, Grid Resource, Knowledge Entity, Knowledge Service, Knowledge Resource, Semantic Binding, Semantic Provisioning Service, Semantic Binding Provisioning Service, and Semantic aware Grid Service through is-a, uses, consume, and produce relations with multiplicities 0..m and 1..m. Examples shown: Annotation Service, Ontology Service, Reasoning Service, Metadata Service, WebMDS, CAS, OGSA-DAI, Ontology, Rule set, SAML file, DFDL file, JSDL file.]

S-OGSA Capabilities
Semantic Provisioning Services (SPS)
◦ Provisioning and management of explicit semantics
and its association with Grid entities
◦ Creation, storage, update, removal, and access of different forms of knowledge and metadata
◦ Knowledge provisioning services: ontology services, reasoning services
◦ Semantic binding provisioning services: metadata services, annotation services

S-OGSA Capabilities
Semantically Aware Grid Services
◦ Able to consume Semantic Bindings and to take actions based on knowledge and metadata
◦ Sample actions:
  ◦ Metadata-aware authorization of a given identity by a VO Manager service
  ◦ Execution of a search request over entries in a semantic resource catalogue
  ◦ Incorporation of a new concept into an ontology hosted by an ontology service

S-OGSA Mechanisms
Treating Knowledge Entities and Semantic Bindings as Grid Resources
◦ Common Information Model (CIM) Resource Model
◦ Grid Entities: class CIM-ManagedElement in the CIM model
◦ Knowledge Entities: class S-OGSA-KnowledgeEntity
◦ Semantic Bindings: class S-OGSA-SemanticBinding, the association between a Grid Entity (CIM-ManagedElement) and a Knowledge Entity (S-OGSA-KnowledgeEntity)

S-OGSA Mechanisms

Access Patterns to Grid Resource Metadata
[Diagram: a Metadata Seeking Client interacts with a Resource (resource-specific lifetime; state/properties/metadata access port), a Metadata Service, and an Ontology Service.
1. Semantic Binding Ids retrieval request (client to resource)
2. Semantic Binding Ids (resource to client)
3. Metadata retrieval/query request (client to Metadata Service)
4. Metadata query/retrieval result (Metadata Service to client)
5. Obtain schema for Semantic Bindings (client to Ontology Service)]
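One reading of the numbered steps in the access-pattern diagram can be sketched as follows. This is a minimal Python illustration under stated assumptions: every class, method, and identifier here (`GridEntity`, `SemanticBinding`, `MetadataService.retrieve`, `OntologyService.schema_for`, `seek_metadata`, the `sb-1` and `os-ontology` ids) is hypothetical and is not taken from any S-OGSA implementation. The only idea carried over from the slides is that a resource exposes Semantic Binding ids as a property, and the client resolves them through separate metadata and ontology services.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticBinding:
    """Associates one Grid Entity with one or more Knowledge Entities."""
    binding_id: str
    grid_entity_id: str
    knowledge_entity_ids: list
    metadata: dict


@dataclass
class GridEntity:
    """A Grid resource that exposes only pointers (ids) to its Semantic Bindings."""
    entity_id: str
    semantic_binding_ids: list = field(default_factory=list)


class MetadataService:
    """Hypothetical store of Semantic Bindings, looked up by id."""

    def __init__(self):
        self._bindings = {}

    def store(self, binding):
        self._bindings[binding.binding_id] = binding

    def retrieve(self, binding_id):
        return self._bindings[binding_id]


class OntologyService:
    """Hypothetical service returning the schema behind a Knowledge Entity."""

    def __init__(self, schemas):
        self._schemas = schemas

    def schema_for(self, knowledge_entity_id):
        return self._schemas[knowledge_entity_id]


def seek_metadata(resource, metadata_service, ontology_service):
    """Play the role of the metadata-seeking client."""
    results = []
    # Steps 1-2: read the Semantic Binding ids from the resource's properties.
    for binding_id in resource.semantic_binding_ids:
        # Steps 3-4: ask the metadata service for the binding itself.
        binding = metadata_service.retrieve(binding_id)
        # Step 5: fetch the schema needed to interpret the binding.
        schemas = [ontology_service.schema_for(k) for k in binding.knowledge_entity_ids]
        results.append((binding.metadata, schemas))
    return results


host = GridEntity("hostA", ["sb-1"])
metadata_service = MetadataService()
metadata_service.store(
    SemanticBinding("sb-1", "hostA", ["os-ontology"], {"OpSys": "SunOS"})
)
ontology_service = OntologyService({"os-ontology": "SunOS is-a Unix"})
found = seek_metadata(host, metadata_service, ontology_service)
```

In this sketch the resource itself stores nothing but a list of ids, so a client that ignores semantics never has to touch the metadata or ontology services.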
• Deliver metadata pointers through resource properties
• Zero impact on existing protocols

Outline
• e-Science
• myGrid project
  o Introduction
  o myGrid Services and Architecture
  o myGrid workbench

e-Science (1)
'e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.'
'e-Science will change the dynamic of the way science is undertaken.'
John Taylor, DG of UK OST
'[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.'
Tony Blair, 2002

e-Science (2)
• Requirements of e-Science Grid application projects determine the services required from Grid middleware
• UK projects focus more on Grid data services than on Teraflop/s HPC systems

UK e-Science Initiative
• $180M programme over 3 years
• $130M is for Grid applications in all areas of science and engineering
  o Particle Physics and Astronomy (PPARC)
  o Engineering and Physical Sciences (EPSRC)
  o Biology, Medical and Environmental Science
• $50M 'Core Program' to encourage development of generic 'industrial strength' Grid middleware

e-Science core program
• Network of e-Science Centres
• UK e-Science Grid
• Support for e-Science Applications
• Grid Network Issues
• Generic/Industrial Grid Middleware
• e-Health Grid 'Grand Challenges'
• Outreach/International Activities

UK e-Science Grid

UK e-Science Grid
• All e-Science Centres donate resources, plus four JCSR-funded dedicated compute/data clusters
  o Supercomputers, clusters, storage, facilities
• All Centres run the same Grid software
  o Starting point is Globus 2 and Condor; Storage Resource Broker (SRB) is being evaluated

Some UK e-Science Projects (1)
• GRIDPP (PPARC)
• ASTROGRID (PPARC)
• Comb-e-Chem (EPSRC)
• DAME (EPSRC)
• DiscoveryNet (EPSRC)
• GEODISE (EPSRC)
• myGrid (EPSRC)
• RealityGrid (EPSRC)
• Climateprediction.com (NERC)
• Oceanographic Grid (NERC)
• Molecular Environmental Grid (NERC)
• NERC DataGrid (NERC + OST-CP)
• Biomolecular Grid (BBSRC)
• Proteome Annotation Pipeline (BBSRC)
• High-Throughput Structural Biology (BBSRC)
• Global Biodiversity (BBSRC)

Some UK e-Science Projects (2)
• Biology of Ageing (BBSRC + MRC)
• Sequence and Structure Data (MRC)
• Molecular Genetics (MRC)
• Cancer Management (MRC + PPARC)
• Clinical e-Science Framework (MRC)
• Neuroinformatics Modeling Tools (MRC)
Interdisciplinary Research Collaborations 'Grand Challenge'
◦ Advanced Knowledge Technologies
◦ Medical Images and Signals
◦ Equator
◦ DIRC (Dependability

Support for e-Science Projects
Grid Support Centre in operation
◦ Supports Grid middleware and users
◦ See www.grid-support.ac.uk
National e-Science Institute
◦ Research seminars
◦ Training programme
◦ See www.nesc.ac.uk
National Certificate Authority
◦ Issues digital certificates for projects
◦ Goal is 'single sign-on'

myGrid project

myGrid (1)
• An e-Science research project
• The goal is to design, develop, and demonstrate higher-level functionality over an existing Grid infrastructure
• Develops open source, high-level, service-based middleware
• Uses databases and computational analysis
• The project is pioneering the use of semantic web technology to manage annotation, ontologies, and semantic discovery

myGrid (2)
The ultimate goal is to supply a collection of services as a toolkit for building end applications.

Outline
1. Introduction
2. myGrid Services and Architecture
• Tools
• Forming and executing experiments
• Semantic service
• Supporting the e-science scientific method
• Applications and application services
3. myGrid workbench

myGrid service and architecture (1)
• The myGrid middleware framework is service-based
• First prototyped with web services, with an anticipated migration to OGSA
• The primary services to support routine in silico experiments fall into four categories:
  o Services that are the tools that will constitute the experiments
  o Services for forming and executing experiments
  o Semantic services
  o Services for supporting the e-science scientific method

myGrid service and architecture (2)

Tools (1)
• Development of domain services that can deliver data and computational analysis
• Provide access to bioinformatics tools and data
• Bioinformatics services
  o Database retrieval and analysis tools: the EMBOSS application suite of over eight analysis tools, MEDLINE, SRS, OMIM, NCBI and WU BLAST sequence alignment tools, …
  o Soaplab, a connector for command-line based systems that provides universal glue to web services

Tools (2)
• Text extraction services
  o AMBIT is a system for Acquiring Medical and Biological Information
  o AMBIT provides an information extraction service based on natural language processing

myGrid service and architecture

Forming & executing experiments (1)
• FreeFluo workflow enactment engine
  o Can handle WSDL-based web service invocation
  o Supports two XML workflow languages: IBM's Web Services Flow Language and XScufl
• OGSA distributed query processor
  o Distributed query processing
  o The query language is initially OQL
  o The initial prototype is to be released in August 2003

Forming & executing experiments (2)
• myGrid information repository (mIR)
  o An information model tailored to e-science
  o Includes experiment data and provenance records of its origin
  o Stores workflow specifications and information about people and projects
• Metadata storage
  o Annotations are stored in an RDF triple store, such as the Jena Semantic Web Toolkit
  o Annotation is a key tool used
to link related objects

Forming & executing experiments (3)
• myGrid information repository
  o An organisation has a single mIR
  o OGSA-DAI services support local and remote access to the repository
  o The first version of the mIR has been built primarily over the relational database product DB2
  o The second adds a federated architecture, using mediators and making extensive use of annotation and shared identifiers

myGrid service and architecture (2)

Discovery & metadata management (1)
• Registries and registry views
  o Service descriptions are centrally published
  o The idea of a registry is extended in three ways:
    - Personalised views over distributed registries
    - Extensible metadata storage
    - Additional semantic descriptions
  o DAML+OIL semantic descriptions
  o Semantic descriptions of workflows have been used to discover relevant workflows

Discovery & metadata management (2)
• Discovery components
  o Enable more sophisticated semantic discovery
  o Indexing and searching over DAML+OIL
  o A service browser module within the workbench
• Annotation components
  o myGrid uses semantic web annotation tools

Discovery & metadata management (3)
• Ontology services
  o Provide a single point of reference for concepts and support description
  o DAML+OIL logic reasoning over concept expressions

myGrid service and architecture (2)

Service for supporting e-science (1)
• Notification services
  o Signal when new or updated data or analytical software becomes available
  o A notification service mediates asynchronous interaction between services
  o Servers may register types of notification events
  o Can be used to trigger workflows automatically
  o Can be defined with ontological descriptions in metadata

Service for supporting e-science (2)
• Provenance management
  o Provenance information is used to determine whether a notification service needs to be re-run
  o FreeFluo generates provenance logs in the form of an XML file, which is stored in the mIR
  o Provenance attributes: start time, end time, and service instance

Service for supporting e-science (3)
• Personalisation opportunities
  o Different users can be provided with appropriate views of the mIR
  o The registry view gives a user's perspective over the services

Applications and application services
• Applications can interact with services directly or via a Gateway
• The Gateway provides an optional, unified, single point of programmatic access to the whole system for creating client software

Outline
1. Introduction
2. myGrid Services and Architecture
• Tools
• Forming and executing experiments
• Semantic service
• Supporting the e-science scientific method
• Applications and application services
3. myGrid workbench

myGrid workbench
• NetBeans platform and Java
• Graves' Disease is caused by an autoimmune response against the thyroid, causing hyperthyroidism

Graves' Disease: autoimmune disease of the thyroid (1)

Graves' Disease: autoimmune disease of the thyroid (2)

As soon as the identity of the relevant genes is known, the myGrid workbench is used to run workflows that gather information about those genes, help design new molecular biology experiments to focus on the genes of interest, and predict the 3D structure of the protein products of the genes.

Graves' Disease: autoimmune disease of the thyroid (3)

Graves' Disease: autoimmune disease of the thyroid (4)

Graves' Disease: autoimmune disease of the thyroid (5)
(1) The notification service informs the user, via a notification client in the workbench, that new data has been added to the mIR, which can be browsed in the workbench.
(2) In this case it is the identity of a new gene with changed expression in Graves' Disease.

Graves' Disease: autoimmune disease of the thyroid (6)
(3) The user can then discover workflows via a wizard in the workbench. The wizard itself makes use of a semantic find service, which finds relevant services and
workflows in the myGrid registry using description logic reasoning over associated semantic descriptions.
(4) A registry browser is also available in the workbench to allow the user to browse more freely for a workflow or service, using a hierarchical categorisation based on each individual semantic description.

Graves' Disease: autoimmune disease of the thyroid (7)
If an appropriate workflow does not exist, a new one can be created in the Taverna editor.
(5) The workflow and associated data are submitted to the FreeFluo enactor.
(6) The enactor provides a detailed provenance record, stored in the mIR, describing what was done, with what services, and when. This can also be viewed within the workbench.

THANK YOU FOR YOUR ATTENTION