The BIRNLex: Principles and practices of community ontology development

Maryann Martone

The Ontology Task Force: Cross Test Beds

Carol Bean (co-chair), NIH-NCRR

Maryann Martone (co-chair), BIRN CC

Amarnath Gupta, BIRN CC

Bill Bug, Mouse BIRN

Christine Fennema-Notestine, Morph BIRN

Jessica Turner, FBIRN

Jeff Grethe, BIRN CC

Daniel Rubin, NCBO

David Kennedy, Morph BIRN

• Provide a dynamic knowledge infrastructure to support integration and analysis of BIRN federated data sets, one which is conducive to accepting novel data from researchers to include in this analysis

• Identify and assess existing ontologies and terminologies for summarizing, comparing, merging, and mining datasets. Relevant subject domains include clinical assessments, assays, demographics, cognitive task descriptions, neuroanatomy, imaging parameters/data provenance in general, and derived (fMRI) data

• Identify the resources needed to achieve the ontological objectives of individual test-beds and of the BIRN overall. May include finding other funding sources, making connections with industry and other consortia facing similar issues, and planning a strategy to acquire the necessary resources

Concept Based User Interface

• Has been developed based on feedback from community at Ontology boot-camp and test bed AHMs

• Provides access to BIRN ontological sources

• Allows for the construction of queries based on familiar concepts - architecture handles the generation of integrated views

• Currently, over 2000 tables registered from BIRN databases, internal and external knowledge sources

BONFIRE: BIRN Knowledge Sources

Bonfire Ontology Browser and Extension Tool

BIRNLex

• Grew out of BIRN Ontology Workshops

• UMLS difficult to work with

• Duplicate terms

• No definitions

• Inconsistent and sometimes incomprehensible relationships

– Meant to cover all domains of interest to BIRN: imaging, neuroanatomy, experimental techniques, behavior

– Presented at this year’s SFN meeting; version 1.0 to be released very soon

– Draft version posted on the web (see OTF Wiki)

– Current domain areas: neuroanatomy, behavioral paradigms, mouse strain nomenclature, experimental procedures

– Developed in Protégé using OWL

BIRNLex - General Principles

OTF has adopted and refined best practices for ontology development being promoted by NCBO/OBO Foundry

• re-use existing community ontologies covering BIRN required domains - e.g. OBI, CARO, BFO, GO Cellular Component, NCBI taxonomy

• novel domains (behavioral paradigms, imaging protocols, etc.) - submit to OBO Foundry or contribute to relevant community effort (e.g., imaging experiments and processing going into OBI)

• for all BIRNLex entities - must have Aristotelian definitions (genus & differentia)

• OTF and other BIRN members are holding regular curation sessions

• heavy use of curatorial metadata to support automated evaluation/analysis/maintenance of ontology

• Use OWL and other supporting technologies, enabling us to leverage a variety of mature and emerging tools to support ontology curation, ontology-centric annotation, and ontology-driven semantic querying

Core Ontologies

• Imported into Protégé

– BFO: Basic Formal Ontology

– SKOS (Simple Knowledge Organization System)

• Preferred labels

• Alternative labels

– OBI: Ontology for Biomedical Investigations

• Manually imported: NeuroNames brain anatomy, paradigm classes from Peter Fox

– Each term is identified by its source and its source unique identifier

• Included cross reference to UMLS identifiers

– Utilize synonyms

– Maps to other efforts using UMLS

• End user doesn’t have to worry about these categories
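A minimal sketch of how a term record of this kind might be held in memory, assuming one record per term. The class name, field names, synonym, and identifier strings are hypothetical placeholders and are not taken from BIRNLex, NeuroNames, or UMLS.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TermRecord:
    """One lexicon entry (hypothetical structure, for illustration only)."""
    pref_label: str                 # SKOS-style preferred label
    source: str                     # where the term was imported from
    source_id: str                  # unique identifier within that source
    alt_labels: List[str] = field(default_factory=list)  # synonyms
    umls_cui: Optional[str] = None  # cross reference to UMLS, if any

    def matches(self, text: str) -> bool:
        """Case-insensitive match against the preferred label or any synonym."""
        wanted = text.strip().lower()
        labels = [self.pref_label] + self.alt_labels
        return any(wanted == label.lower() for label in labels)


# Placeholder identifiers and synonym, not real NeuroNames or UMLS values.
hippocampus = TermRecord(
    pref_label="Hippocampus",
    source="NeuroNames",
    source_id="EXAMPLE-ID-1",
    alt_labels=["Example synonym"],
    umls_cui="EXAMPLE-CUI-1",
)
```

Keeping the source and its identifier on every record is what would let a search by synonym still lead back to the originating vocabulary, without the end user having to see either field.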

Use of Foundational Ontologies

UBO - Upper Bio Ontology

BFO - Basic Formal Ontology

•Facilitates alignment with other ontologies across scales and modalities

•Adopted framework proposed by Barry Smith and colleagues for biological ontologies (Rosse et al., 2005, AMIA proceedings)

•Based anatomical work on the FMA

•Don’t want to concern ourselves with the upper level ontologies; want to focus on our domain

•Using as a rough guide for now while these ontologies are being built

BIRNLex is a Lexicon, not a terminology

• A is a B which has C

– Defines class structure

– Defines properties

• Electron microscope is a type of microscope which uses electrons to form an image

– Microscope

• Electron microscope

– Has property

» Image formation
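The "A is a B which has C" pattern can be sketched as a small data structure. This is only an illustration of the genus and differentia idea; the class and field names are assumptions, not part of BIRNLex or its OWL representation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Definition:
    """'A is a B which has C' (hypothetical representation)."""
    term: str                 # A: the class being defined
    genus: Optional[str]      # B: the parent class (None for a root class)
    differentia: str = ""     # C: what distinguishes A from other kinds of B

    def as_sentence(self) -> str:
        """Render the definition in the Aristotelian form."""
        if self.genus is None:
            return f"{self.term} is a root class"
        return f"{self.term} is a type of {self.genus} which {self.differentia}"


electron_microscope = Definition(
    term="Electron microscope",
    genus="microscope",
    differentia="uses electrons to form an image",
)
print(electron_microscope.as_sentence())
```

The genus fixes the position in the class structure, while the differentia is what would become a property of the class.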

BIRNLex Curation

• Meet on a semi-regular basis (many interruptions)

• Identify domains and strategies

– Not mixing structure and function has been a big help in moving forward

• We slip up quite a bit

• Revise, revise, revise

• Tools for biologists are inadequate; better if you’re a computer scientist (I handed off BIRNLex, reluctantly, several months ago)

• Divide up the work

• Assign curation status

– We don’t argue too long

– Curated, graph position temporary, uncurated, raw import from source
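As a rough sketch, the four curation statuses could be treated as an enumeration so that outstanding work can be listed automatically. The enum name, helper functions, and the status assignments below are hypothetical and do not reflect the actual state of any BIRNLex term.

```python
from collections import Counter
from enum import Enum
from typing import Dict


class CurationStatus(Enum):
    CURATED = "curated"
    GRAPH_POSITION_TEMPORARY = "graph position temporary"
    UNCURATED = "uncurated"
    RAW_IMPORT = "raw import from source"


def needs_review(status: CurationStatus) -> bool:
    """Anything not fully curated is still on the to-do list."""
    return status is not CurationStatus.CURATED


def summarize(statuses: Dict[str, CurationStatus]) -> Counter:
    """Count terms per curation status."""
    return Counter(statuses.values())


# Hypothetical assignments, for illustration only.
terms = {
    "Electron microscope": CurationStatus.CURATED,
    "Oddball paradigm": CurationStatus.GRAPH_POSITION_TEMPORARY,
    "Radial maze": CurationStatus.RAW_IMPORT,
}
todo = sorted(name for name, status in terms.items() if needs_review(status))
```

Recording a status on every term is one way to avoid arguing too long: a term can enter the lexicon with a provisional position and be revisited later.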

Strict rules for developing taxonomies

• Behavioral Paradigm

– Oddball paradigm

• Auditory oddball paradigm

• Visual oddball paradigm

• Working memory paradigm

– Serial item recognition task

– Radial maze

• Forebrain:

– Has part: Amygdala

• Limbic system

– Has part: Amygdala
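One possible reading of the examples above is that each class has a single parent in the "is a" hierarchy, while part relations are kept separately, so the amygdala can be a part of both the forebrain and the limbic system. The sketch below assumes that reading; the `Taxonomy` class and its methods are hypothetical.

```python
from typing import Dict, List, Optional, Set


class Taxonomy:
    """Single-parent class hierarchy with separate has-part links (sketch)."""

    def __init__(self) -> None:
        self.parent: Dict[str, Optional[str]] = {}
        self.parts: Dict[str, Set[str]] = {}

    def add_class(self, name: str, parent: Optional[str] = None) -> None:
        # Strict rule assumed here: a class gets exactly one position.
        if name in self.parent:
            raise ValueError(f"{name} already has a place in the hierarchy")
        if parent is not None and parent not in self.parent:
            raise KeyError(f"unknown parent: {parent}")
        self.parent[name] = parent

    def add_part(self, whole: str, part: str) -> None:
        # Part relations are not restricted to a single whole.
        self.parts.setdefault(whole, set()).add(part)

    def ancestors(self, name: str) -> List[str]:
        chain: List[str] = []
        current = self.parent[name]
        while current is not None:
            chain.append(current)
            current = self.parent[current]
        return chain

    def wholes_of(self, part: str) -> List[str]:
        return sorted(w for w, ps in self.parts.items() if part in ps)


tax = Taxonomy()
tax.add_class("Behavioral paradigm")
tax.add_class("Oddball paradigm", parent="Behavioral paradigm")
tax.add_class("Auditory oddball paradigm", parent="Oddball paradigm")
tax.add_part("Forebrain", "Amygdala")
tax.add_part("Limbic system", "Amygdala")
```

Here `tax.ancestors("Auditory oddball paradigm")` walks a single chain of parents, whereas `tax.wholes_of("Amygdala")` can return more than one structure.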

The state of Neuroanatomy in BIRN

Hippocampus

Cerebral Cortex

Cerebral White Matter

Putamen

Globus Pallidus

Caudate Nucleus

Thalamus

Ventral Diencephalon

•Assessed the usage of anatomical terms in each atlas used by BIRN

•Inconsistency in application of terms

•Resolution of technique was not considered

•Create standard “atomic” definitions for core brain parts

•Create a volumetric hierarchy

•Provides a basis for accounting for resolution

•Goal: which structures give rise to signals measured by a technique

•Structure not function

•No arguments about whether the amygdala exists functionally

•No arguments about whether the fornix is functionally part of the hypothalamus

•Imported NeuroNames hierarchy for volumetric relations among brain parts

•e.g., hippocampal formation has part

•Mostly gray matter = dentate gyrus, hippocampus

•Mostly white matter = alveus

•Develop consistent application rules:

•“My hippocampus” = dentate gyrus + hippocampus

•Need descriptors for topological relationships and spatial overlap
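An application rule of the form “My hippocampus” = dentate gyrus + hippocampus can be sketched as a mapping from an atlas label to a set of atomic parts, which makes differences in usage between atlases explicit. The atlas names and the second rule below are hypothetical; only the first rule comes from the example above.

```python
from typing import Dict, FrozenSet, List

# Hypothetical atlas labels mapped to the atomic brain parts they cover.
APPLICATION_RULES: Dict[str, FrozenSet[str]] = {
    # Rule from the slide: "My hippocampus" = dentate gyrus + hippocampus
    "atlas_A:hippocampus": frozenset({"dentate gyrus", "hippocampus"}),
    # Invented contrast case: an atlas whose label also includes white matter
    "atlas_B:hippocampus": frozenset({"dentate gyrus", "hippocampus", "alveus"}),
}


def same_extent(label_a: str, label_b: str) -> bool:
    """True when two atlas labels cover exactly the same atomic parts."""
    return APPLICATION_RULES[label_a] == APPLICATION_RULES[label_b]


def difference(label_a: str, label_b: str) -> List[str]:
    """Atomic parts covered by one label but not the other."""
    return sorted(APPLICATION_RULES[label_a] ^ APPLICATION_RULES[label_b])
```

With rules like these, two volumes both labelled "hippocampus" would only be compared directly when `same_extent` holds; otherwise `difference` names the parts that account for the mismatch.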

Subcellular Anatomy Ontology: Extending anatomy to subcellular dimension; based on FMA

[Figure: diagram linking a cell type ontology, compartments, components, component macromolecules, and properties. Cell types shown: neuroepithelial cell, glia (microglia, macroglia), neuron. Relations shown: “has regional part”, “has constitutional part”. Compartments shown: dendrite, dendritic spine, shaft, axon, cell body. Components and macromolecules shown: SER, RER, ribosome, lysosome, PSD, actin filament, microtubule. Properties shown: post synaptic, morphometrics, shape, distribution, orientation. Also referenced: Gene Ontology.]

Next Steps

• Community extension and curation

– Import into Bonfire

– BIRNLex “Wikipedia”

• Integration with BIRN imaging, workflow and analysis tools

• Work to evaluate and extend PATO for imaging data

– Spatio-temporal relationships

• Better web interface

• Begin transition into fully structured ontology:

– MIND Ontology: Multiscale Investigation of Neurological Disease

Relationships in complex scenes

•Incorporation of ontologies into segmentation tools for electron tomography

•Describe each “scene” as an instance of the ontology

•Capture not only entities but relationships among entities

•Electron microscopic data are sparse

•Discover “rules” for subcellular anatomy

Vlad Mitsner, Masako Terada, Stephen Larson
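A minimal sketch of describing a scene as an instance of the ontology, recording relationships as well as entities. The instance names and the particular relations asserted are hypothetical; the relation names are borrowed from the subcellular anatomy slide, and this is not the format used by the segmentation tools.

```python
from typing import List, NamedTuple, Set


class Relation(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical annotation of one segmented scene.
scene: List[Relation] = [
    Relation("neuron_1", "instance_of", "Neuron"),
    Relation("dendrite_1", "instance_of", "Dendrite"),
    Relation("spine_1", "instance_of", "Dendritic spine"),
    Relation("psd_1", "instance_of", "PSD"),
    Relation("neuron_1", "has_regional_part", "dendrite_1"),
    Relation("spine_1", "has_constitutional_part", "psd_1"),
]


def instances(relations: List[Relation]) -> Set[str]:
    """All individuals that were assigned an ontology class."""
    return {r.subject for r in relations if r.predicate == "instance_of"}


def related(relations: List[Relation], subject: str, predicate: str) -> List[str]:
    """Objects linked to a subject by a given relation."""
    return sorted(r.obj for r in relations
                  if r.subject == subject and r.predicate == predicate)
```

Because every scene uses the same small set of relations, many sparse scenes could later be pooled and searched for recurring patterns, which is one way "rules" for subcellular anatomy might be discovered.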

Images as Instances

[Figure: diagram with the labels Data, Biological Entity, Technique, Annotator, and Analysis, and the ontologies FUGO, OBI, and PATO]

PATO: Phenotype and Trait Ontology

• A way of expressing complex phenotypes that is more scientifically “sound”

• BIRN provides valuable test cases for PATO

• BIRN data immediately become interoperable with data from the zebrafish and fly communities

2003 trial data: FB & ZFIN

Suzanna Lewis, Chris Mungall et al.

[Table: columns Genotype, Entity, Attribute, Value. Genotypes: npo, r210, tm84, Bsb[2], C-alpha[1D]. Entities include gut, retina, brain, blood islands, d/v pattern formation, elongation of arista literal, adult behaviour. Attributes include structure, relative size, pattern, qualitative, relative number, process, behavioral activity. Values: dysplastic, small, irregular, fused, abnormal, number increased, arrested, uncoordinated.]

Lessons learned

•BIRN has made a good faith effort to evaluate and employ existing ontologies; we are patient but we’ve got work to do

•Ontology building is not for people with thin skins

–We are not attempting to build formal ontologies for everything

–Provide a formal and consistent structure for describing data

•A man who consults one ontologist knows what to do; a man who consults two ontologists is never sure

•Don’t want to be victims in the ontology wars

•NCBO/MGI have been very helpful

•The principles suggested to us so far have been useful; they make the process easier, not harder

•Reference ontologies are useful, because they take care of the categories, e.g., dependent enduring entity, that tend to drive domain scientists a little nuts

–Challenge to develop tools on top of shifting infrastructure

–expect that we’ll have to redo annotation periodically
