doc - Bio-Ontologies 2016

The SOFG Anatomy Entry List (SAEL) as an annotation tool for functional genomics
experiments
Authors:
Stuart Aitken, University of Edinburgh, stuart@inf.ed.ac.uk
Richard Baldock, MRC-HGU, Richard.Baldock@hgu.mrc.ac.uk
Jonathan Bard, University of Edinburgh, jbard@staffmail.ed.ac.uk
Albert Burger, Heriot-Watt University, ab@macs.hw.ac.uk
Duncan Davidson*, MRC-HGU, Duncan.Davidson@hgu.mrc.ac.uk
Terry Hayamizu, The Jackson Laboratory, terryh@informatics.jax.org
Helen Parkinson**, EBI, parkinson@ebi.ac.uk
Alan Rector, University of Manchester, rector@cs.man.ac.uk
Martin Ringwald, The Jackson Laboratory, ringwald@informatics.jax.org
Jeremy Rogers, University of Manchester, jrogers@cs.man.ac.uk
Cornelius Rosse, University of Washington, rosse@u.washington.edu
Christian J. Stoeckert, University of Pennsylvania, stoeckrt@pcbi.upenn.edu
* Communicating Author
** Presenting Author
Introduction
The long study of anatomy and the need for common annotation for biology and medicine
have resulted in a proliferation of biomedical ontologies built for different purposes, using
different knowledge representation tools and often very rich in terms, structure and
relationship types. Anatomy components in biomedical ontologies serve varied purposes, for
example, descriptions of medical procedures in GALEN (Rector et al. 1999), or description of
traits or phenotype. Because there are multiple anatomy ontologies, they often contain
non-orthogonal (overlapping) concepts, though these are often defined and structured
differently within each ontology.
For example, the Foundational Model of Anatomy or FMA (Rosse and Mejino 2003), which
takes a structural view of anatomy, contains the concept “liver”, which is defined in free text
as “Lobular organ the parenchyma of which consists of lobules which communicate with the
biliary tree”. Liver is also described formally by various attributes, including: member-of,
bounded-by, component-of, adjacency etc. If we consider liver in the Mouse Anatomical
Dictionary (Bard et al. 1998), which has a developmental view, the “liver” is part-of the
“liver and biliary system” and developmental stage information is provided. The level of detail
provided by these ontologies is variable and the purposes of the ontologies are clearly
different, though both contain the concept liver. Merging ontologies is a complex process
(Rector 2004) because the different structures and relationship types must be reconciled, and
this process may not be necessary for functional genomics applications.
Selection of anatomical terms for annotation of functional genomics experiments therefore
requires some knowledge of where to look, some information on the purpose and scope of
the ontology queried, and an ability to critically assess whether the term returned is accurate.
These tasks may be intuitive for the average scientist who has some notion of the concept of
each term. However, in a high-throughput situation it is time-consuming to query large and
complex ontologies directly, and in our experience many scientists simply annotate with free
text. This causes data exchange and query problems for those who manage functional
genomics data.
With these points in mind, Standards and Ontologies for Functional Genomics (SOFG,
www.sofg.org) has set up an international effort to integrate human and mouse anatomy
ontologies. As part of this effort, a workshop group was formed comprising
representatives from GALEN (Rector et al. 1999), the FMA (Rosse and Mejino 2003), the
Mouse Anatomical Dictionary for Mouse Development (Bard et al. 1998), the Human
Developmental Anatomy ontology (Hunter et al. 2003), the Anatomical Dictionary for the Adult
Mouse (http://www.informatics.jax.org/searches/AMA_form.shtml), the RNA Abundance
Database (RAD, Manduchi et al. 2004) and ArrayExpress (Brazma et al. 2003). The group set
out to:
1. Explore whether an entry list to existing ontologies would be useful in a functional
genomics context, specifically microarrays
2. Determine the use cases, limitations and criteria for building such a list
3. Produce an anatomy entry list of terms
4. Test the draft set of terms against existing ontologies and functional genomics data
repositories
5. Consider implementation issues and a web services model for querying mapped
ontologies.
The result of the workshop is the SOFG Anatomy Entry List (SAEL), an unstructured list of
approximately 100 vertebrate/mammalian anatomy terms representing the major
body substances (e.g. blood) and dissectable parts (e.g. liver) likely to be used in
microarray or other functional genomics experiments.
SAEL was drawn up by looking at various source ontologies and was subsequently tested
against user-supplied vertebrate anatomy annotation in the following gene expression
databases: ArrayExpress (Brazma et al. 2003), RAD (Manduchi et al. 2004), GXD (Hill et al.
2004), and SMD (Sherlock et al. 2001).
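The kind of test described above can be sketched in a few lines of Python. This is only an illustration, not the procedure the workshop group actually followed: the function names and the sample annotations are invented here, and only "blood" and "liver" are taken from the text as entry-list terms.

```python
def normalise(text):
    """Lower-case and collapse whitespace so 'Liver ' and 'liver' compare equal."""
    return " ".join(text.lower().split())


def coverage(entry_list, annotations):
    """Split annotations into those that match an entry-list term and those that do not."""
    terms = {normalise(term) for term in entry_list}
    matched, unmatched = [], []
    for annotation in annotations:
        if normalise(annotation) in terms:
            matched.append(annotation)
        else:
            unmatched.append(annotation)
    return matched, unmatched


# "blood" and "liver" are the examples named in the text.
entry_list = ["blood", "liver"]
# Hypothetical free-text annotations, invented for illustration only.
annotations = ["Liver", "whole  blood", "BLOOD", "left hepatic lobe"]

matched, unmatched = coverage(entry_list, annotations)
print("matched:", matched)
print("unmatched:", unmatched)
```

Exact matching after normalisation is deliberately strict: annotations such as "whole blood" fall into the unmatched group, which is where a curator would decide whether a new entry term or a mapping to a richer ontology is needed.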
The list is intended for use both as an annotation resource and as an entry point to mapped
ontologies for both biologists and curators. The SAEL terms are uniquely identified and are
available as an OBO format file from www.sofg.org/sael. The terms are purposely not defined
and are presented as an unstructured alphabetical list as they are intended for simple
annotation and mapping, rather than as an independent anatomy ontology. The participating
databases and ontologies are preparing to map SAEL terms to their resources so that users
can move easily from SAEL to these more sophisticated resources.
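As a rough sketch of how such a flat list might be consumed, the Python below reads the id and name tags from [Term] stanzas of OBO-formatted text. The identifiers shown are hypothetical placeholders, not real SAEL accessions, and the parser handles only the minimal subset of OBO needed for an unstructured, undefined term list (no comments, escapes or relationships).

```python
def parse_obo_terms(text):
    """Return {id: name} for every [Term] stanza in OBO-formatted text."""
    stanzas = []
    current = None  # the stanza being filled, or None outside a [Term] stanza
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza begins; keep it only if it is a [Term] stanza.
            current = {} if line == "[Term]" else None
            if current is not None:
                stanzas.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            current.setdefault(tag, value)  # keep the first value of each tag
    return {s["id"]: s["name"] for s in stanzas if "id" in s and "name" in s}


# Hypothetical file content; the SAEL:EXAMPLE-n identifiers are placeholders.
SAMPLE_OBO = """format-version: 1.2

[Term]
id: SAEL:EXAMPLE-1
name: blood

[Term]
id: SAEL:EXAMPLE-2
name: liver

[Typedef]
id: part_of
name: part of
"""

terms = parse_obo_terms(SAMPLE_OBO)
for term_id, name in sorted(terms.items()):
    print(term_id, name)
```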
The SAEL will be made available at the SOFG web site (www.sofg.org), as will the mappings
from SAEL entries to participating source ontologies. The knowledge acquisition tool COBrA,
developed as part of the XSPAN project (www.xspan.org), is used to record mappings and
relevant provenance data. We invite interested communities to provide feedback on SAEL.
We will also use XSPAN's web service and user interfaces to provide program-level and
additional user access to SAEL and its mappings. In the first instance, this web service
merely returns the accession numbers and names of mapped anatomical structures in the
source ontologies. In the second phase, the central web service will interoperate with
additional web services, to be deployed at the source ontology sites, which will facilitate
access to a standardised set of properties of the anatomical structures in the source
ontologies. The corresponding WSDL descriptions will be posted on the SOFG web site.
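The first-phase behaviour described above (return the accession numbers and names of mapped structures for a given SAEL term) can be pictured with the following Python sketch. It is an in-memory stand-in rather than the XSPAN web service itself: the record type, the function name and every identifier are assumptions made for illustration, and the accessions are placeholders rather than real ontology identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedStructure:
    """One anatomical structure in a source ontology that a SAEL term maps to."""
    ontology: str
    accession: str
    name: str


# Hypothetical mapping table; all accessions are placeholders.
MAPPINGS = {
    "SAEL:EXAMPLE-2": [
        MappedStructure("FMA", "FMA:EXAMPLE-1", "Liver"),
        MappedStructure("Mouse Anatomical Dictionary", "MAD:EXAMPLE-1", "liver"),
    ],
}


def lookup(sael_id, mappings=MAPPINGS):
    """Return (ontology, accession, name) for each structure mapped to sael_id."""
    return [(m.ontology, m.accession, m.name) for m in mappings.get(sael_id, [])]


for row in lookup("SAEL:EXAMPLE-2"):
    print(row)
```

An unmapped identifier simply yields an empty list, which keeps the first-phase contract small; richer properties of each structure would come from the second-phase services at the source ontology sites.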
References:
Bard JBL, Kaufman MH, Dubreuil C, Brune RM, Burger A, Baldock RA, Davidson DR. 1998.
An internet-accessible database of mouse developmental anatomy based on a systematic
nomenclature. Mech. Dev. 74:111-120
Brazma, A. et al. (2003). “ArrayExpress--a public repository for microarray gene expression
data at the EBI.” Nucleic Acids Res 31(1): 68-71
Hill DP, Begley DA, Finger JH, Hayamizu TF, McCright IJ, Smith CM, Beal JS, Corbani LE,
Blake JA, Eppig JT, Kadin JA, Richardson JE, Ringwald M. 2004. The Mouse Gene
Expression Database (GXD): updates and enhancements. Nucleic Acids Res. 32 Database
issue:D568-571
Hunter A, Kaufman MH, McKay A, Baldock R, Simmen MW, Bard JB. 2003. An ontology of
human developmental anatomy. J. Anat. 203:347-355
Manduchi E., Grant G.R., He H., Liu J., Mailman M.D., Pizarro A.D., Whetzel P.L., Stoeckert
C.J. Jr. (2004) RAD and the RAD Study-Annotator: an approach to collection, organization
and exchange of all relevant information for high-throughput gene expression studies.
Bioinformatics, 20(4): 452-459
Sherlock, G., Hernandez-Boussard, T., Kasarskis, A., Binkley, G., Matese, J.C., Dwight, S.S.,
Kaloper, M., Weng, S., Jin, H., Ball, C. A., Eisen, M.B., Spellman, P.T., Brown, P.O., Botstein,
D., Cherry, J.M. (2001) The Stanford Microarray Database. Nucleic Acids Res, 29:152-5
Rector, A.L., Zanstra, P.E., Solomon, W.D., Rogers, J.E., Baud, R., Ceusters, W.,
Claassen, W., Kirby, J., Rodrigues, J.-M., Mori, A.R., Haring, E.v.d. and Wagner, J. Reconciling
Users' Needs and Formal Requirements: Issues in developing a Re-Usable Ontology for
Medicine. IEEE Transactions on Information Technology in BioMedicine, 2 (4). 229-242
Rector, A.L., Defaults, context and knowledge: Alternatives for OWL-Indexed Knowledge
bases. in Pacific Symposium on Biocomputing (PSB-2004), (Kona, Hawaii, 2004), World
Scientific, 226-238
Rosse C, Mejino JVL. 2003. A reference ontology for biomedical informatics: the Foundational
Model of Anatomy. J Biomed Inform. 36:478-500