APPLICATION OF ONTOLOGIES IN CANCER
NANOTECHNOLOGY RESEARCH
SEMANTIC WEB ARTICLE
Student: Andreea Buga
Group: 1241E – FILS
Coordinating Teacher: Maria Iuliana Dascalu
PART 1
INTRODUCTION
Problem statement and definition.
LOOKING DEEPER TO ONTOLOGIES
A brief presentation of ontologies and the way they can improve
different processes and activities in working with data.
ONTOLOGIES AND BIOMEDICAL RESEARCH
The importance of ontologies in biomedical research and the actual
area in which they work.
HOW NANOTECHNOLOGY, CANCER RESEARCH AND ONTOLOGIES WORK TOGETHER
EXISTING SOLUTIONS
SIMPLE USE CASE
APPLICATIONS
CONCLUSIONS
BIBLIOGRAPHY
INTRODUCTION
We live in the century of information and speed. Every domain stores its knowledge and presents it to
humans as data. Scientific progress and the development of information systems have led to a large amount
of data that has to be processed daily. The life sciences hold a prominent place in current research, and their
latest discoveries are important steps toward a better life in the future. Data mining and analysis are important
tools for understanding life processes and for establishing new theories and results.
One of the most pressing issues of our daily lives is finding cures for the diseases that affect more and more
of us. Cancer research is one of the directions of study that has gained importance due to the impact of the
solutions it proposes. Data mining and analysis offer a better understanding of the causes, effects, and
treatments of cancer. But the large amount of data to be processed calls for improved tools for classification,
taxonomy, and hierarchy building.
We will focus our attention on cancer nanotechnology research, an interdisciplinary area that uses
nanotechnology methods in the treatment, diagnosis, and detection of cancer. Ontologies, which contain the
domain-specific vocabulary organized in a hierarchy, provide the knowledge framework for annotation,
knowledge-based searching, data mining, interpretation, and diagnosis.
LOOKING DEEPER TO ONTOLOGIES
Several ontologies have been developed for cancer nanotechnology research. They try to structure and make
efficient use of the information obtained from patients, previous studies, and analyses. But what exactly is an
ontology, and how can it be used in such applications?
“An ontology may take a variety of forms, but necessarily it will include a vocabulary of terms, and
some specification of their meaning. This includes definitions and an indication of how concepts are inter-related
which collectively impose a structure on the domain and constrain the possible interpretations of terms."1
Ontologies have been used in collaborative research and in work with databases in several ways. For example,
ontologies can provide the specific terminology used in a certain area by both specialists and computers; they
can support semantic sharing and integration of the gathered data, create the logical connections within it, and
allow later search, retrieval, and diagnosis with the aid of the stored data. Inference plays an important role in
ensuring the quality of these logical connections.
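As a minimal sketch of these ideas, the following Python fragment models a vocabulary whose terms carry a definition and "is a" links, and infers indirect relations by following those links. The terms, definitions, and the names `Term`, `ancestors`, and `search` are invented for illustration and do not come from any existing ontology.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """One vocabulary entry: a name, its meaning, and links to parent terms."""
    name: str
    definition: str
    is_a: list[str] = field(default_factory=list)  # names of direct parents


# Hypothetical toy vocabulary, invented for illustration only.
VOCABULARY = {
    t.name: t
    for t in [
        Term("material", "A portion of matter."),
        Term("nanomaterial", "A material with nanoscale dimensions.", ["material"]),
        Term("nanoparticle", "A particle-shaped nanomaterial.", ["nanomaterial"]),
        Term("liposome", "A lipid-based nanoparticle.", ["nanoparticle"]),
    ]
}


def ancestors(name: str) -> set[str]:
    """Infer every term reachable through is_a links (transitive closure)."""
    found: set[str] = set()
    stack = list(VOCABULARY[name].is_a)
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(VOCABULARY[parent].is_a)
    return found


def search(kind: str) -> list[str]:
    """Retrieve all terms that are, directly or by inference, a kind of `kind`."""
    return sorted(n for n in VOCABULARY if kind in ancestors(n))
```

Here `search("nanomaterial")` also returns the liposome term, although that relation is never stated directly; it follows from the chain of "is a" links.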
ONTOLOGIES AND BIOMEDICAL RESEARCH
Biomedicine has a large vocabulary of specific terms related to diseases, symptoms, equipment, treatment, and
diagnostics. Organizing the available knowledge plays an important role in processing the data, finding
analogies, proposing treatments, and making inferences from the available data. The following definition
offers a better view:
“In biomedical research, ontologies are used to represent the knowledge of a specific domain of interest in
machine-processible form and to integrate experimental data that is annotated with terms from these
ontologies.”2
1 M. Uschold, M. King, S. Moralee, and Y. Zorgios. The Enterprise Ontology. The Knowledge Engineering Review,
13(1):31-89, 1998.
HOW NANOTECHNOLOGY, CANCER RESEARCH
AND ONTOLOGIES WORK TOGETHER
“Nanotechnology involves the application of scientific knowledge from a variety of disciplines in science and
engineering to understand, manipulate, and control the properties of matter at nanoscale (1-100 nm) size
dimensions.”3
Nanotechnology solutions have advantages that may overcome the problems faced by conventional
engineering in cancer treatment and research. One of the main problems of conventional-scale approaches is
that they cannot target cancer cells, which are very small, with sufficient accuracy. Cancer cells share
characteristics with healthy cells, so drugs that lack specificity may also damage healthy cells. The
nanomaterials used in cancer research are called NP-CDTs and will be referred to as such from here on. In
2005, the Food and Drug Administration approved an NP-CDT-based treatment for metastatic cancer, and
several other such treatments are being tested in clinical trials.
Informatics methods are considered useful tools for advancing cancer nanotechnology research. As studies
have revealed, NP-CDTs are very diverse and may have a wide range of applications. This diversity comes from
the large number of interactions that may change their chemical composition. Therefore, even a small change
in the chemical properties of such a material generates new data sets. The preclinical evaluation of an NP-CDT
requires extensive experimental characterization, which also generates information. Even though the volume
of NP-CDT data is much smaller than that of genomic data, the richness and the therapeutic and diagnostic
relevance of NP-CDT data lead to a combinatorial complexity that exceeds that of genomic data. Based on the
available data, inferences and searches can be made in order to identify the problems with a given NP-CDT
treatment and to see how it can be reformulated so that it has a beneficial effect.
“Informatics approaches are likely to be valuable for such reformulation efforts, especially if one has access to an
integrated resource that yields rich information regarding both the physicochemical and functional properties of
NP-CDTs, as well as tumor physiological properties.”4
Ontologies will play an important role in the development of databases containing information about cancer
and nanotechnologies, in building inferences, and in helping researchers better understand the relations
between diagnostic methods, cancerous cells, and the applied treatment.
2 Dennis G. Thomas, Rohit V. Pappu, Nathan A. Baker. NanoParticle Ontology for cancer nanotechnology
research. Journal of Biomedical Informatics 44 (2011) 59–74, February 2011.
3 Nanoscale Science, Engineering and Technology Subcommittee, Committee on Technology, National Science
and Technology Council. The National Nanotechnology Initiative Strategic Plan. 2004.
4 Dennis G. Thomas, Rohit V. Pappu, Nathan A. Baker. NanoParticle Ontology for cancer nanotechnology
research. Journal of Biomedical Informatics 44 (2011) 59–74, February 2011.
EXISTING SOLUTIONS
The caNanoLab project stores, searches, and shares data generated from characterization studies of
nanomaterials used in cancer research. But this database needs a specific vocabulary that allows connection
and data sharing with other cancer-related databases. The National Cancer Institute Thesaurus, together with
other organizations, has developed a few terminologies, but this is only the beginning of the large vocabulary
that has to be created. Some existing vocabularies (from bioinformatics, genomics, cancer medicine) can be
used to define terms needed in this area of expertise, but a specific vocabulary for cancer nanotechnology does
not exist.
Ontologies and their machine-interpretable structure will open the path for communication between
researchers from different fields, ensure semantic interoperability between applications and databases, and
provide new analytical tools. We need such ontologies to represent knowledge that can be used for data
integration, knowledge-based search, drawing inferences, and making classifications.
SIMPLE USE CASE
Consider the following scenario as an example: a chemist has synthesized a dextran-coated nanoparticle, but
wants to make a rough prediction about its in vitro and in vivo properties. So, s/he plans to compare it with
nanoparticles that have characterization data available in a database such as caNanoLab. To make the best
predictions, the researcher must identify that nanoparticle which most closely correlates with the dextran-coated
nanoparticle. For this, the researcher must know what descriptors to choose from for comparing the
nanoparticles, and also know the optimal descriptors needed to help determine the type of nanoparticle that
highly correlates with the dextran-coated nanoparticle. These descriptors can be provided by the ontology. At the
simplest level, if the descriptor is the type of coating material, then classifying nanoparticles by coating
material will help identify the highly correlated classes of nanoparticles that are either the sibling classes
or child classes of dextran-coated nanoparticles. In this way, the researcher only needs to look at nanoparticle
data annotated with the ontology classes, and to compare results of the different nanoparticles identified from
the classification in the ontology. Better predictive models such as Structure Activity Relationship (SAR) models
can be developed using data annotated and integrated by an optimized set of ontology-based descriptors for
every case in question.
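The lookup described in this scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class hierarchy, the record identifiers, and the helper names (`children`, `siblings`, `comparable_records`) are hypothetical and are not taken from the NPO or from caNanoLab.

```python
from __future__ import annotations

# Hypothetical asserted hierarchy: child class -> its single parent class.
# Class names are illustrative and are not taken from the NPO itself.
PARENT = {
    "polymer-coated nanoparticle": "coated nanoparticle",
    "lipid-coated nanoparticle": "coated nanoparticle",
    "dextran-coated nanoparticle": "polymer-coated nanoparticle",
    "PEG-coated nanoparticle": "polymer-coated nanoparticle",
    "chitosan-coated nanoparticle": "polymer-coated nanoparticle",
    "carboxymethyl-dextran-coated nanoparticle": "dextran-coated nanoparticle",
}

# Hypothetical database records, each annotated with one ontology class.
RECORDS = [
    {"id": "sample-1", "annotation": "PEG-coated nanoparticle"},
    {"id": "sample-2", "annotation": "lipid-coated nanoparticle"},
    {"id": "sample-3", "annotation": "carboxymethyl-dextran-coated nanoparticle"},
    {"id": "sample-4", "annotation": "chitosan-coated nanoparticle"},
]


def children(cls: str) -> set[str]:
    """Classes whose asserted parent is `cls`."""
    return {child for child, parent in PARENT.items() if parent == cls}


def siblings(cls: str) -> set[str]:
    """Other classes that share the parent of `cls`."""
    parent = PARENT.get(cls)
    return children(parent) - {cls} if parent else set()


def comparable_records(cls: str) -> list[str]:
    """Records annotated with a sibling or child class of `cls`."""
    related = siblings(cls) | children(cls)
    return [r["id"] for r in RECORDS if r["annotation"] in related]
```

For the dextran-coated class, `comparable_records` keeps the records annotated with PEG-coated, chitosan-coated, and carboxymethyl-dextran-coated classes and skips the lipid-coated one, which sits in a different branch of the hierarchy.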
PROPOSED SOLUTION
This paper analyzes the solution proposed by D. Thomas, R. Pappu and N. Baker in their article: the
development of an ontology for nanoparticles, the NanoParticle Ontology (NPO).
To begin with, a specific list of terms used in nanotechnology to describe NP-CDTs is created. From this list
of terms and their definitions one can see the complexity of the classes defined and related in the ontology.
After the terms were chosen and defined, the following conclusion regarding the structure of a nanoparticle
was drawn:
“In general, a nanoparticle formulation consists of chemical components that can be enumerated as 1)
nanoparticles, 2) active chemical constituents, which are part of the chemical makeup of the nanoparticle, and 3)
active chemical components which functionalize the nanoparticle. There can be one or more types of
nanoparticle in a nanoparticle formulation, depending upon the nanoparticle structure, function or chemical
composition. All of the chemical components can be described by their molecular structure, biochemical role,
or function. Besides enumerating and describing the chemical components, one needs to describe the types
of chemical linkages (e.g., amide linkage, disulphide linkage, encapsulation, etc.) that exist between them, and
thereby provide an overall qualitative description of the chemical composition in a nanoparticle formulation.
Other descriptions include: physical locations of chemical components in a nanoparticle (e.g., core, surface,
etc.);shape of the nanoparticle (e.g., spherical, cylindrical, etc.); physical state of the formulation (e.g., emulsion,
hydrogel, etc.); physical, chemical or functional properties of the active chemical components / constituents
(e.g., organic, hydrophilic, magnetic, etc.); intended functions and applications of chemical components for
cancer treatment, diagnosis and therapy; underlying mechanisms guiding the design of the nanoparticle
formulation (e.g., endocytosis, active targeting, etc.), and; type of stimulus that may be required for activating
nanoparticles (e.g., magnetic field, ultrasound, pH change, etc.) and the nanoparticle's response to the
stimulus (e.g., drug release from nanoparticle in response to magnetic field stimulus, heat generation from
nanoparticle in response to stimulus of infrared light, etc.).”2
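To make the quoted description more concrete, the elements it enumerates can be arranged as a small data structure. The sketch below is an assumption about how one might do this in Python; the class and field names and the example values are invented and do not reproduce the NPO classes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ChemicalComponent:
    name: str
    role: str       # "constituent" (part of the makeup) or "functionalizing"
    location: str   # physical location, e.g. "core" or "surface"
    properties: list[str] = field(default_factory=list)  # e.g. "magnetic"


@dataclass
class Linkage:
    kind: str                 # e.g. "amide linkage", "encapsulation"
    between: tuple[str, str]  # names of the two linked components


@dataclass
class Formulation:
    nanoparticle_types: list[str]   # one or more nanoparticle types
    shape: str                      # e.g. "spherical"
    physical_state: str             # e.g. "emulsion", "hydrogel"
    components: list[ChemicalComponent]
    linkages: list[Linkage]
    stimulus: str | None = None     # e.g. "magnetic field"
    response: str | None = None     # e.g. "drug release"

    def components_at(self, location: str) -> list[str]:
        """Names of the components found at a given physical location."""
        return [c.name for c in self.components if c.location == location]


# Invented example values, used only to exercise the structure.
EXAMPLE = Formulation(
    nanoparticle_types=["iron oxide nanoparticle"],
    shape="spherical",
    physical_state="hydrogel",
    components=[
        ChemicalComponent("iron oxide", "constituent", "core", ["magnetic"]),
        ChemicalComponent("dextran", "constituent", "surface", ["hydrophilic"]),
        ChemicalComponent("drug molecule", "functionalizing", "surface"),
    ],
    linkages=[Linkage("amide linkage", ("dextran", "drug molecule"))],
    stimulus="magnetic field",
    response="drug release",
)
```

An ontology expresses the same distinctions as classes and relations rather than as record fields, which is what allows a reasoner to classify formulations automatically.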
Nanoparticle representation in NPO
The ontology has been built on the foundations of the Basic Formal Ontology (BFO) framework and
implemented in OWL using well-defined ontology principles, of which we recall the following:
1. Principle of unbiased representation: Following BFO design principles, any term in the ontology should
represent an entity as known in reality and not represent it from the biased view of an individual.
2. Principle of asserted single “is a” inheritance: Again following BFO principles, each term should have no
more than one parent term in the asserted OWL hierarchy. This principle offers the advantages of
making the ontology easily extensible and interoperable with other ontologies that have a formal
structure.
3. Principle of inferred multiple “is a” inheritances: Multiple parent-child relationships for a term are not
present in the asserted hierarchy. However, a term can have more than one parent in the inferred
hierarchy that is constructed by invoking an appropriate “OWL reasoner” on the asserted hierarchy.
Rules for inferring these relationships are expressed using OWL description logics and specified as
OWL necessary and sufficient or necessary conditions, in the ontology. The OWL reasoner uses the OWL
expressions to create the inferred hierarchy.
4. Preferred name and textual definition: Every OWL class and OWL property (object, datatype) must have
a preferred name and a textual definition using the NPO's OWL annotation properties: “preferred name”
and “definition”.
5. Synonym: If a class or OWL property has multiple names, these names must be provided as synonyms
using the NPO's “synonym” OWL annotation property.
6. Code: Every class must have an identification code that starts with the prefix “NPO_” (e.g., NPO_100)
7. rdf:ID and rdfs:label: Every class specifically defined in the NPO must have its NPO code as its rdf:ID. The
rdf:ID of every class borrowed from an external ontology found in the OBO Foundry list must be
preserved in the NPO. Every class in the NPO must also have its preferred name as its rdfs:label.
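The difference between the asserted hierarchy (principle 2) and the inferred hierarchy (principle 3) can be illustrated with a toy classifier. This is a simplified sketch, not an OWL reasoner: conditions are plain (property, value) pairs, a class flagged `defined` treats its conditions as necessary and sufficient, and all codes, names, and definitions are hypothetical rather than real NPO entries.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyClass:
    code: str                       # identification code, prefixed "NPO_"
    preferred_name: str
    definition: str
    asserted_parent: str | None     # at most one asserted "is a" parent
    conditions: frozenset = frozenset()  # (property, value) pairs
    defined: bool = False           # True: conditions are necessary AND sufficient


# Hypothetical classes; the codes are placeholders, not real NPO codes.
ONTOLOGY = {
    c.code: c
    for c in [
        OntologyClass("NPO_T1", "nanoparticle", "A nanoscale particle.", None),
        OntologyClass("NPO_T2", "magnetic nanoparticle", "A nanoparticle that is magnetic.",
                      "NPO_T1", frozenset({("has_property", "magnetic")}), defined=True),
        OntologyClass("NPO_T3", "dextran-coated nanoparticle", "A nanoparticle coated with dextran.",
                      "NPO_T1", frozenset({("has_coating", "dextran")}), defined=True),
        OntologyClass("NPO_T4", "dextran-coated iron oxide nanoparticle",
                      "An iron oxide nanoparticle coated with dextran.", "NPO_T2",
                      frozenset({("has_coating", "dextran"), ("has_core", "iron oxide")})),
    ]
}


def all_conditions(code: str) -> frozenset:
    """Own conditions plus those inherited along the asserted parent chain."""
    cls = ONTOLOGY[code]
    inherited = all_conditions(cls.asserted_parent) if cls.asserted_parent else frozenset()
    return cls.conditions | inherited


def inferred_parents(code: str) -> set[str]:
    """Defined classes whose sufficient conditions are all met by `code`."""
    have = all_conditions(code)
    return {c.code for c in ONTOLOGY.values()
            if c.defined and c.code != code and c.conditions <= have}


def convention_violations() -> list[str]:
    """Codes of classes missing the "NPO_" prefix, a preferred name, or a definition."""
    return [c.code for c in ONTOLOGY.values()
            if not (c.code.startswith("NPO_") and c.preferred_name and c.definition)]
```

In this sketch the class NPO_T4 has a single asserted parent (NPO_T2), yet `inferred_parents("NPO_T4")` also yields NPO_T3, because its conditions satisfy the definition of a dextran-coated nanoparticle. The asserted tree stays simple while the multiple inheritance is computed.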
The result was an ontology with 1564 classes, 45 object properties specifying class-level
associations, and 5 OWL annotation properties (definition, synonym, code, preferred name, dBXreflId).
All the domain-specific entities are classified under the BFO classes, with Entity as the top-most class. There
are independent entities, such as Nanomaterial, MolecularEntity, Instrument, Material Site, and Material
Boundary, as well as entities related to biological processes: Molecular Function, Nanoparticle Response to
Stimulus, Tumor Targeting. The large number of entities involved in defining this ontology reflects the
extensive interaction between the various domains.
The following image shows an inferred parent-child relationship between different particles:
A wider view of the ontology is given by the following hierarchy related to the “nanoparticle” formulation in NPO:
APPLICATIONS
It is clear that the ontology serves as an important tool in cancer nanotechnology research: diagnosis,
treatment, analysis. But its application domain is wider. Researchers depend on numerous publications
and journals, and irrelevant search results waste their time. Such an ontology helps address these
search issues.
NPO provides the needed terminology and broadens the search possibilities (synonyms, search by topic,
associations and relations based on the ontology). Therefore, a search can draw on details from the cancer
nanotechnology area, which increases interoperability between domains.
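A minimal sketch of synonym-based search, assuming a small synonym table of the kind an ontology's "synonym" annotations could supply. The table, the document titles, and the function names are invented for illustration, and the matching is a naive substring test.

```python
from __future__ import annotations

# Hypothetical synonym table: preferred name -> list of synonyms.
SYNONYMS = {
    "liposome": ["lipid vesicle"],
    "quantum dot": ["semiconductor nanocrystal", "QD"],
}

# Hypothetical publication titles.
DOCUMENTS = {
    "doc-1": "Lipid vesicle carriers for drug delivery",
    "doc-2": "Imaging tumors with semiconductor nanocrystal probes",
    "doc-3": "Liposome stability in serum",
    "doc-4": "Gold nanorods for photothermal therapy",
}


def expand(query: str) -> set[str]:
    """Return the query plus every name listed as equivalent to it."""
    q = query.lower()
    for preferred, synonyms in SYNONYMS.items():
        names = {preferred.lower(), *(s.lower() for s in synonyms)}
        if q in names:
            return names
    return {q}


def search(query: str) -> list[str]:
    """Find documents mentioning the query or any of its synonyms."""
    terms = expand(query)
    return sorted(doc_id for doc_id, text in DOCUMENTS.items()
                  if any(term in text.lower() for term in terms))
```

A plain keyword search for "liposome" would find only doc-3; with the expansion step, `search("liposome")` also returns doc-1, which uses the synonym.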
Data indexing, retrieval, and integration can be done using NPO annotations. This is an important step in data
mining and knowledge discovery and will be essential to future research progress.
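Annotation-based indexing can be pictured as an index keyed by ontology class code. In the sketch below, records from two hypothetical sources become retrievable together because they share the same annotation; the source names, record identifiers, and codes are placeholders.

```python
from collections import defaultdict

# Hypothetical annotated records: (source, record id, ontology class code).
RECORDS = [
    ("lab_a", "A-17", "NPO_T3"),
    ("lab_a", "A-18", "NPO_T2"),
    ("lab_b", "B-03", "NPO_T3"),
]


def build_index(records):
    """Group record references by the class code they are annotated with."""
    index = defaultdict(list)
    for source, record_id, code in records:
        index[code].append((source, record_id))
    return index


INDEX = build_index(RECORDS)


def retrieve(code):
    """All records, from any source, annotated with the given class code."""
    return INDEX.get(code, [])
```

Because both sources use the same code, `retrieve("NPO_T3")` returns records from lab_a and lab_b in one step, without any mapping between their local vocabularies.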
CONCLUSIONS
The aim of this paper was to present the advantages of an ontology developed for cancer nanotechnology
research. The ontology is founded on BFO and implemented in OWL. The knowledge embedded in this
ontology relates to the chemical composition, properties, and preparation of nanomaterials, and it is also
linked to other cancer research databases.
We have seen the importance of the ontology in establishing connections with other domains, making
inferences, and helping scientists advance their current research. Knowledge-based search, logical
connections, semantic integration, and data mining are first steps toward future technology, and they may be
a key factor in the discovery of new treatments, predictions, and studies related to cancer.
BIBLIOGRAPHY

M. Uschold, M. King, S. Moralee, and Y. Zorgios. The Enterprise Ontology. The Knowledge Engineering
Review, 13(1):31-89, 1998.
Inference. http://www.w3.org/standards/semanticweb/inference, accessed 09.05.2013, 11:38 AM.
Dennis G. Thomas, Rohit V. Pappu, Nathan A. Baker. NanoParticle Ontology for cancer nanotechnology
research. Journal of Biomedical Informatics 44 (2011) 59–74, February 2011.
http://www.nano-ontology.org/, accessed 09.05.2013, 1:37 PM.
Paulo J.G. Lisboa, Liverpool John Moores University, UK. Data Mining in Cancer Research. IEEE
Computational Intelligence Magazine, February 2010.


