e-Science Workshop « A roadmap for data integration », Nov. 27-28, 2006

Ontologies in the context of the Neurobase project
Bernard Gibaud
VisAGeS, U746 Inserm/INRIA, IRISA, Rennes

Acknowledgements
• NeuroBase participants
• Ontology part
  – Lynda Temal (VisAGeS, Rennes)
  – Gilles Kassel (LaRIA, Amiens)
  – Michel Dojat (Inserm, Grenoble)

Context: resources produced by research in neuroimaging
• Data, denoting knowledge about the brain
  – Functional maps
  – Morphological and physiological abnormalities related to the various brain diseases
  – Behavioral data
• Know-how
  – Exploration methods: paradigms, imaging techniques, e.g. specific MR sequences
  – Image processing tools
    • Segmentation, registration, quantification, etc.
    • Statistical analysis
  – Image processing pipelines
    • Suitable for a specific problem

Specific motivations
• Integration of heterogeneous data
  – Image data
  – Processed data
    • Interpretation of image content
  – Associated clinical information
  At stake: feasibility of large scientific studies and clinical trials with thousands of cases
• Interoperability of processing tools
  – Input data: images and parameters
  – Output data: images, registration data, etc.
  – Semantics of data processing
  At stake: feasibility of open platforms (e.g. XIP) receiving portable « plug-ins », and deployment of wide-scale GRID implementation of image processing

Approach
• Mediator-based integration
[Diagram: Applications → mediator (common semantic reference) → wrappers → Site #1, Site #2, ..., Site #n, each with its data and processing tools]
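
The mediator-based approach (applications, mediator, wrappers, sites) can be illustrated with a minimal sketch. This is not the Neurobase implementation: the class names, the field names and the two-term vocabulary standing in for the common semantic reference are all hypothetical, chosen only to show how wrappers translate local schemas into shared terms so that one query can span several sites.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical shared vocabulary standing in for the common semantic reference.
COMMON_TERMS = {"subject_id", "modality"}


@dataclass
class Wrapper:
    """Translates one site's local records into the common vocabulary."""
    site: str
    records: List[Dict[str, str]]  # records in the site's local schema
    mapping: Dict[str, str]        # local field name -> common term

    def __post_init__(self) -> None:
        # A wrapper may only map onto terms defined by the common reference.
        unknown = set(self.mapping.values()) - COMMON_TERMS
        if unknown:
            raise ValueError(f"unknown common terms: {sorted(unknown)}")

    def query(self, predicate: Callable[[Dict[str, str]], bool]) -> List[Dict[str, str]]:
        results = []
        for record in self.records:
            # Rename local fields to common terms; unmapped fields are dropped.
            common = {self.mapping[k]: v for k, v in record.items() if k in self.mapping}
            common["site"] = self.site
            if predicate(common):
                results.append(common)
        return results


class Mediator:
    """Sends a query expressed in common terms to every wrapper and merges the answers."""

    def __init__(self, wrappers: List[Wrapper]) -> None:
        self.wrappers = wrappers

    def query(self, predicate: Callable[[Dict[str, str]], bool]) -> List[Dict[str, str]]:
        merged: List[Dict[str, str]] = []
        for wrapper in self.wrappers:
            merged.extend(wrapper.query(predicate))
        return merged


# Two sites with different local schemas (invented example data).
site1 = Wrapper(
    "site1",
    [{"patient": "P01", "mod": "MR"}, {"patient": "P02", "mod": "PET"}],
    {"patient": "subject_id", "mod": "modality"},
)
site2 = Wrapper(
    "site2",
    [{"subj": "S07", "imaging_type": "MR"}],
    {"subj": "subject_id", "imaging_type": "modality"},
)

mediator = Mediator([site1, site2])
mr_scans = mediator.query(lambda r: r.get("modality") == "MR")
```

The application only ever sees common terms (`subject_id`, `modality`); each site keeps its own schema, and only the wrapper's mapping has to change when a local schema changes.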

Common ontology
• « a formal, explicit specification of a shared conceptualization » (Gruber 1993)
  – Necessary to write applications and wrappers (entities, range of values)

Ontology: scope
• General concepts (upper level ontology) (non specific)
  – Ex: process, state, natural object, artefact, etc.
• General concepts of domain of interest (domain-specific)
  – Ex: patient, scan, study, pathology, image series, etc.
• All other concepts of domain of interest (i.e. to deploy in a real life application)
  – Ex: interictal state (in epilepsy), deep brain stimulation (in Parkinson), design matrices (fMRI), etc.

Neurobase ontology
• Scope
  – Studies (subjects, experimental context, clinical aspects, etc.)
  – Datasets and description of their content (images, ROI, registration data, etc.)
  – Image processing (processing tools, processing, etc.)
• Method
  – Integration of multiple sources (fMRIDC, DICOM, Neurobase partners’ experience)
  – Representation: UML, then Protégé
• Implementation in the demonstrator: relational schema

Results

Taxonomy
[Class diagram rooted at Thing. Classes shown: Event, Scan-Session, Assessment, Process, Study, Body-Process, Data-Processing, Object, Person, Anatomical-Structure, Artefact, Acquisition-Equipment, Processing-Tool, Group of people, Experimental-Group, Information, Specification, Report, Data]

Datasets
[Taxonomy diagram as on the Taxonomy slide, extended with the Dataset classes: Dataset, Reconstructed Dataset, NonReconstructed Dataset, MEEG Data, MRRaw Data, SPECT Projection, CT Image, Static CT Image, Dynamic CT Image, MR Image, MR Anat. Image, MR Funct. Image, Template Dataset, SPECT Image, PET Image, Static PET Image, Dynamic PET Image, MEG Current DipoleList, Registration Dataset, Graph, Segmentation Dataset, Mesh, Multi Dimensional Image]

Dataset: properties
[Figure not recovered]

Reports
[Taxonomy diagram as on the Taxonomy slide, extended with the Report classes: Report, InterpretationOfDatasetComponent, InterpretationOfMeshComponent, InterpretationOfVoxelValues, InterpretationOfBinaryVoxelInformation, InterpretationOfProbabilisticVoxelInformation, ScientificPublication, DataProcessingLogFile, EthicsCommissionAuthorization]

InterpretationOfBinaryVoxelInformation: properties
[Figure not recovered]

Data Processing
[Taxonomy diagram as on the Taxonomy slide, followed by a UML class diagram. Classes: Process, Artefact, Report, Data, Specification, Information InData Processing, Atomic, Processing Tool, Port, Data Processing, Dataset, Data Processing LogFile. Associations: isInvolvedIn, hasValue, hasPort, concernsPort, isValuedBy, involves, inTheContextOf, with cardinalities 0,1 / 0,* / 1]

Discussion - ontology
• Need to evolve toward a formal ontology
  – i.e. expressed in a logical language (e.g. OWL)
  – Necessary for:
    • Management of « intelligent » queries
    • Wrappers
• Articulated to consensual (!) « foundational ontologies »
  – e.g. DOLCE (WonderWeb) or BFO (Barry Smith et al.)
  – Interoperability with external terminology systems, e.g.:
    • Unified Medical Language System (UMLS, NLM)
    • Foundational Model of Anatomy (FMA, UW Seattle)
• Difficult trade-off
  – Complexity / practical usability

Ontology: work in progress
Towards a formal ontology for medical images and processing tools in neuroimaging

Methods
• Basic principles
  – Modularity
  – Re-use of existing components
    • Foundational ontology
    • (Core) Domain ontologies
  – Use formal ontologies
• Methodology
  – ONTOSPEC (Gilles Kassel et al.)
    • Semi-formal approach
    • Easy translation into OWL

Excerpt from an ONTOSPEC document
[Figure: annotated excerpt. Callouts: Guarino’s meta-properties; Essential Property / Subsumption Link with Differentia]

Overview
[Diagram of the ontology layers, distinguishing reused ontologies from new-built ontologies
  – Foundational ontology: DOLCE (Particulars)
  – Core ontologies: I&DA, Participant roles, Documents, Reasonings, OntoKADS, Programs & Software, COPS
  – Domain ontologies: Medical images, Image processing tools]
Temal et al., FOMI 2006, Trento (Italy)

Based on DOLCE (Descriptive Ontology for Linguistic and Cognitive Engineering)
IST WonderWeb project

Dolce
[Figure not recovered]

Information & Discourse Acts (I&DA)
• Conceptualizations: means by which Agents reason about the world
  – Proposition: to describe situations
  – Concept: to classify entities
• Expressions: non-physical forms of knowledge ordered by a communication language
• Inscriptions: forms of knowledge inscribed on some physical medium

Datasets as Propositions
• Datasets:
  – expressed according to a Dataset Expression (i.e. encoding format)
  – inscribed on some physical medium (i.e. File), or rendered as an Image

Participant roles
Based on Guarino’s meta-properties (rigidity, identity, dependence, etc.) and related classifications:
• Role: Anti-Rigid (~R) and Dependent (+D)
  – Material role: Anti-Rigid (~R), Dependent (+D), with Identity criteria (+I)
  – Formal role: Anti-Rigid (~R), Dependent (+D), without Identity criteria (-I)
Bruaux et al., K-CAP 2005, Banff (Canada)

Reasonings
• Image generation processes (i.e. processing) are considered as Reasonings because they operate in the non-physical world
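
The role definitions on the Participant roles slide amount to a small decision rule over three meta-properties. The sketch below encodes only what that slide states (a role is ~R and +D; +I makes it a material role, -I a formal role). The type and function names are hypothetical, and the "not a role" label for every other combination is an assumption added here so the function is total.

```python
from dataclasses import dataclass
from enum import Enum


class Rigidity(Enum):
    RIGID = "+R"
    NON_RIGID = "-R"
    ANTI_RIGID = "~R"


@dataclass(frozen=True)
class MetaProperties:
    """Meta-property tags attached to a concept (hypothetical container type)."""
    rigidity: Rigidity
    dependent: bool          # True stands for +D
    carries_identity: bool   # True stands for +I, False for -I


def classify(mp: MetaProperties) -> str:
    """Apply the slide's definitions: roles are anti-rigid and dependent."""
    if mp.rigidity is not Rigidity.ANTI_RIGID or not mp.dependent:
        # Assumption: anything outside (~R, +D) is simply not treated as a role here.
        return "not a role"
    return "material role" if mp.carries_identity else "formal role"


material = classify(MetaProperties(Rigidity.ANTI_RIGID, True, True))
formal = classify(MetaProperties(Rigidity.ANTI_RIGID, True, False))
other = classify(MetaProperties(Rigidity.RIGID, False, True))
```

Keeping the tags explicit on each concept makes this kind of check mechanical, which is one reason a semi-formal specification can later be translated into a logical language.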

Ontology of Programs and Software (COPS)
• Programs are involved in Reasonings (Actions) through the relation AllowsToCarryOut (representation of the programs’ functionality)
• Conceptualizations or Expressions participate in such Reasonings as Data (i.e. input) or Result (i.e. output)

Current work
• To model Datasets content as functions
  – (i.e. mathematical functions)
  – Range, Domain
    • Sampling characteristics
• To model Interpretations as Propositions relating, e.g., segmented regions to real-life objects
  – Anatomical structure associated with a 3D binary mask
  – Pathological process (e.g. tumor evolution) associated with a time series of 3D surfaces
• To model Processing Tools
  – Model input and output variables as Data and Results (formal roles)
  – Model actual processing as Actions, in which Datasets may participate (material roles)

Conclusion / perspectives
• Experience from exploratory phase
  – Very positive
  – Intuition that the potential impact is important in neuroimaging, but also in other fields, e.g. genomics or cancer research (caBIG)
• Work being pursued
  – Ontology (datasets, processing tools)
  – Applications
    • To get feedback from real-life applications