Performance Evaluation of Oracle Semantic Technologies: Data
Ashraf Yaseen, Kurt J. Maly, Steven J. Zeil and Mohammad Zubair
Department of Computer Science
Old Dominion University
Norfolk, VA 23529-0162 USA
{ayaseen, maly, zeil, zubair}@cs.odu.edu
Abstract—Ontology-based reasoning systems have a native rule base but also allow the addition of application domain-specific rules. Previous work comparing the performance of these systems mainly considered performance with respect to the system-supported rule bases. In this paper we present an evaluation of Oracle as an ontology reasoning system with respect to domain-specific rule bases, in the context of a question/answer system called ScienceWeb. We also present UnivGenerator, a tool to generate synthetic data samples in accordance with the ScienceWeb ontology, as a component of the evaluation.
Keywords—Oracle Semantic Technology, ontology, reasoning system, domain-specific rules

I. INTRODUCTION
The basic elements of the semantic web [1] are the Resource Description Framework (RDF), RDF Schema (RDFS) and the Web Ontology Language (OWL). With these elements,
domain concepts, properties and relationships can be
specified. Other elements may include a query language for
RDF (SPARQL) [2] and a language to express custom rules
(SWRL) [3].
ScienceWeb is a question/answer system that is being
developed on top of a reasoning and inference layer.
ScienceWeb focuses on scientific research and the researcher
who performs it. Information freely available on the internet
is being harvested, organized, stored, and queried in a
collaborative environment. The objective of the system is to
provide answers to qualitative queries that represent the
evolving agreement of the community of researchers. The system allows qualitative descriptors such as “groundbreaking researchers” or “tenurable record”. Having
rule-based definitions of custom descriptors makes it
possible to develop systems capable of answering questions
that contain qualitative descriptors even when those
descriptors involve transitive reasoning as in “What is the
PhD advisor genealogy of professor x?” or require algebraic
computation across populations such as in “Who are the groundbreaking researchers in Software Engineering?”.
In a collaborative environment, where users are able to
customize their qualitative queries, a fast response is
required from the reasoning system. This is a challenge
when dealing with custom rules that are required to answer
qualitative queries.
In order to evaluate the performance (scalability) of a
candidate reasoning system, we need large samples of
semantic data that are instances of a rich ontology. In this
paper, we present a tool to generate synthetic data according
to the ScienceWeb ontology called UnivGenerator. We then
present an evaluation of the suitability of the Oracle 11g
reasoning system based upon the data samples generated in
the context of a ScienceWeb-like system.
The rest of the paper is organized as follows: previous
related work is presented in section 2, the ontology and the
generator program are explained in section 3, our experiment
is explained in section 4, the results and discussions are in
section 5 and finally the conclusion is in section 6.
II. RELATED WORK
A. Reasoning Support Systems
A number of ontology systems have been developed for
reasoning and querying in the semantic web. Examples of
such systems include Jena [4], a Java framework for
semantic web applications with a rule-based inference engine
over RDFS and OWL. Pellet [5] is a Java-based reasoner
supporting OWL-DL. KAON2 [6] is a Java reasoner
supporting OWL-Lite and OWL-DL. SPARQL is used as a
query language and SWRL for custom rules. OWLIM [7] provides a semantic repository and reasoning capabilities over RDFS, OWL Horst, and OWL 2 RL. Oracle 11g Semantic Technologies [8] is another such system.
Ontology reasoning systems differ in the storage of
semantic data, the degree of reasoning support and the way
reasoning is performed. Storage of semantic data varies:
some systems store all data in main memory while others are
secondary storage-based. In regards to reasoning support,
some systems only provide support for RDF/RDFS
inference, while others provide more support by including
partial or full support of OWL inference. Some systems
perform the reasoning during the loading stage; some
provide a separate step right after loading; and others
perform the reasoning when querying the system.
Many current systems provide another level of reasoning
support beyond the subclass/superclass types of reasoning by
allowing users to define their own rules. These custom rules
enable the addition of specialized reasoning capabilities.
Figure 1. Inheritance hierarchy for chosen subset of the ScienceWeb ontology
Oracle Semantic Technologies provides persistent,
secondary storage of semantic data and different degrees of
reasoning support by allowing users to select the required
level. It provides for domain-specific rules, and reasoning is
performed in a separate step.
B. Performance Evaluation
A number of studies have been done on the performance
of reasoning systems [12-18]. These studies address the
scalability of the reasoning systems with regard to the size
and complexity of the ontology.
Benchmarks were created and used to facilitate systems
evaluation. For example LUBM [9, 10] is a widely used
benchmark in evaluating the performance of ontology
reasoning systems. It has a university domain ontology and
dataset generator that is able to provide samples that are
repeatable and scalable to a specific size. Fourteen test queries and several performance metrics come with this benchmark.
Another benchmark is the UOBM [11], an extended
version of LUBM, which provides a higher degree of
reasoning by covering OWL Lite and OWL DL. Moreover,
it adds relations to the ontology enriching the ontology’s
complexity, albeit not to the level needed for ScienceWeb.
Previous work on ontological systems’ evaluation
addresses the reasoning capabilities, the scalability and the
efficiency of each system [12-18]. All of these compared and
evaluated the performance of reasoning systems over
predefined rule bases, for example RDF/RDFS rules or
OWL. We are not aware of any study that addresses the systems' capabilities and reasoning performance over customized rule bases; even Oracle's own performance study uses the LUBM benchmark [19].
The authors have undertaken a study of the performance of a variety of systems, both open-source and commercial, when performing reasoning over customized rule sets, based upon both the LUBM and ScienceWeb ontologies [21][22].
Oracle was selected for the more detailed investigation
presented in this paper both because of its promising overall
performance and because of its prominent position in the
industry.
Our target is to evaluate Oracle as an ontology reasoning
system in terms of reasoning and querying using custom
rules in the context of the ScienceWeb system.
III. THE SCIENCEWEB ONTOLOGY AND BENCHMARK GENERATOR
At the core of ScienceWeb is an ontology that covers the
concepts and their relationships used in the science research
community. For this performance study we have used the
subset depicted in Figure 1. The ontology includes the
concepts of Department, Publication, and Researcher, and
numerous properties (not shown) such as advisorOf,
authorOf, worksFor, etc. This subset has 32 classes and 48
properties (18 object properties and 30 data type properties).
All the concepts of the ontology found in the LUBM [10] benchmark can be found in the ScienceWeb ontology, although the exact names of classes and properties may not be the same.
For ScienceWeb the scalability issues arise from both the
complexity of the class tree as well as the complexity of the
relationships. We found both LUBM and UOBM wanting in
that regard. Consider, for example, the complexity of the Publication subtree shown in Figure 1. This is already more
elaborate than the corresponding structures in LUBM and
UOBM and, as ScienceWeb matures, we anticipate that a
still richer ontological structure will be required.
This complexity has a direct effect on the generation of
benchmark data. The LUBM generator, for example,
generates a flat list of publication objects all of the base
Publication type. We felt that a more realistic distribution
would be required for even basic prototyping of
ScienceWeb, and developed our own tool for this purpose.
The tool is called UnivGenerator, and the instances generated are repeatable and scalable to different sizes. It generates semantic OWL data in RDF/XML format. To make sample generation repeatable, a seed value is set at the beginning of each run. Using different seed values while keeping the same settings for the rest of the parameters will generate a different sample of similar size.
To provide randomness in the output set and to control
the size of the output we have a property for each class that
gives the range of possible values. For instance, we will have
a range property for the number of universities, departments,
and number of full professors in a department. The minimum
and the maximum can be the same if only one value is
desired. For example, a Full Professor might have the range (1,1) for the property hasWebsite. On the other hand, the range of the number of authors of a paper might be set to (1,8) in the field of Computer Science.
Figure 2. Examples of aggregation in ScienceWeb
We are in the process of developing a statistical model of
these ranges for the field of Computer Science by obtaining
real data for the ranges of the top level classes in the tree
(e.g., universities, departments, researchers) and sampling
the various other relationships (e.g., number of projects per
faculty, number of conference papers, number of authors). In the meantime, we have estimated these ranges based on our own judgment.
To control the size of the output (number of OWL
triples), the user of the generator fixes the top level ranges
and provides appropriate ranges for the properties that should
be randomly selected. These ranges describe desired values
for the arity of these properties, as opposed to the more
relaxed ranges that might be encoded into the OWL ontology
specification to describe the legal ranges of property arities.
This distinction is important to the generation of a
representative internal structure for the generated knowledge
base.
We have developed a spreadsheet that allows the user to experiment with these parameters and provides an estimate of the number of triples that the tool will generate. Once a suitable set of parameters has been derived,
the parameters can be input to the generator from the
spreadsheet through a text file.
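As an illustration, a fragment of such a parameter file might look like the following; the parameter names and values here are hypothetical, chosen only to show the shape of the input, not the actual names used by UnivGenerator:

RandomSeed          = 12345
Universities        = 10, 10
DepartmentsPerUniv  = 3, 8
FullProfsPerDept    = 5, 15
AuthorsPerPaper     = 1, 8
PublicationYear     = 1965, 2010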
There are two types of properties that have ranges
associated with them: those that reflect relationships between
objects and those that reflect the scalar value of a property.
For object-valued properties, the input parameter ranges
specify the arity of the relationship. For scalar-valued
properties, the range controls the value of that scalar. For
example, if the Journal to JournalArticle property contains()
has a parameter range of (1,20), this would indicate that the
generator should attempt to provide 1 to 20 articles for each
issue of a journal. If the journal property publicationYear()
has a range of (1965,2010), then any value in that range
might be generated as a property value. In a future version of
the generator, we hope to provide a selection of common
distributions rather than simply the current selection from a
uniform distribution.
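To make the use of these ranges concrete, the following Java fragment is a minimal sketch, under our own assumptions, of how a configured (min, max) range can drive both an arity draw and a scalar-value draw; the class and method names are illustrative and are not taken from the UnivGenerator source:

import java.util.Random;

// Illustrative sketch only: a (min, max) range sampled uniformly.
class Range {
    final int min, max;
    Range(int min, int max) { this.min = min; this.max = max; }
    // Uniform draw from [min, max]; other distributions could be plugged in later.
    int sample(Random rng) { return min + rng.nextInt(max - min + 1); }
}

class RangeSamplingSketch {
    public static void main(String[] args) {
        Random rng = new Random(12345L);                 // fixed seed makes the sample repeatable
        Range articlesPerIssue = new Range(1, 20);       // arity of the contains() relationship
        Range publicationYear  = new Range(1965, 2010);  // legal values of the scalar property
        int nArticles = articlesPerIssue.sample(rng);    // how many JournalArticle instances to link
        int year      = publicationYear.sample(rng);     // value assigned to publicationYear
        System.out.println(nArticles + " articles, year " + year);
    }
}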
The generator program achieves this linking of objects by processing the ontology in a specific order. The classes are set and fixed according to the ScienceWeb ontology, so any change to the ontology requires changes in the program source code; in the future we may generalize this for other ontologies in this domain.
There are a small handful of aggregation hierarchies
rooted at the types ResearchField, PublicationType, Projects,
Universities, and Researchers. For example, universities
contain departments which contain faculty (Figure 2). In
general, the aggregation hierarchies are first used to guide
creation of instances. Other object-valued properties can be
used, but arity information on aggregation relationships is
often easily estimated. What is essential is that a minimal
collection of properties be selected that span the set of
classes in the ontology. These are visited in the order listed,
and, for each instance, a random number (according to the
range values supplied in the generator input parameters) of
contained objects are created. For example, for each
university object, a random number of departments are
created, then within each department a random number of
faculty are created.
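A minimal sketch of this aggregation-driven phase, again with hypothetical class names and ranges rather than the actual UnivGenerator code, is:

import java.util.Random;

// Illustrative sketch: universities contain departments, which contain faculty.
class AggregationSketch {
    static int draw(Random rng, int min, int max) { return min + rng.nextInt(max - min + 1); }

    public static void main(String[] args) {
        Random rng = new Random(12345L);            // seed taken from the generator input parameters
        int nUniv = draw(rng, 10, 10);              // top-level range fixed by the user
        for (int u = 0; u < nUniv; u++) {
            int nDept = draw(rng, 3, 8);            // departments per university
            for (int d = 0; d < nDept; d++) {
                int nFaculty = draw(rng, 5, 15);    // faculty per department
                for (int f = 0; f < nFaculty; f++) {
                    // here the real generator would emit the OWL/RDF triples for the new
                    // Faculty instance and pick a concrete subclass (e.g., Full Prof.)
                }
            }
        }
        System.out.println("generated " + nUniv + " universities");
    }
}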
When an instance is created of a class that is a non-leaf in
the inheritance hierarchy, parameters are used to control
which of its subclasses will be selected as the class for the
new instance. The code controlling this is currently specific
to each subclass and may be generalized in the future.
Properties that do not constitute one of the primary aggregation relationships (including properties whose members span different aggregation hierarchies) are populated after all aggregation hierarchies have been processed. This can then be done by selection from among the already generated instances of the appropriate type. Again, input parameters to the generator specify the minimum and maximum multiplicity of the assigned property instances (e.g., the minimum and maximum number of co-authors on a journal paper), and a random selection is made within the specified range.
For example, each Publication has a number of authors. This number can vary within the specified range. Initially, an empty list of authors for the current publication is created in the PublicationType phase. The actual assignment of authors will be done once researchers have been created in the Researcher phase. Some publication types can reference other publication types; the number of such references is again governed by a specified range. After all publication templates have been created, references can be linked (at this time we reference only within the generated universe); the program makes sure that no self-references are generated.
In the University phase, the program will create the
specified number of universities and for each university, the
program will create a number of departments; and for each
department, the program will create a number of researchers.
Each such object has random values selected for properties such as ID, name, contact info, and website. Completing this simple generation phase is the generation of researchers: a researcher can be a faculty member (full professor, associate professor, assistant professor, or lecturer) or a student (PhD student or master's student). For each of these objects we set property values such as the number of research interests and the number of publications, and details such as name, contact info, and graduation info.
In the Researcher phase, a list of publications of a specific size is created for each researcher. The program then goes through the list of publications, randomly selecting and assigning publications to the current researcher. While doing so, the program must satisfy several preconditions: the selected publication must belong to a research field that has been selected for the current researcher, it must not have been selected already for that researcher (no duplicates), and it must not already have its maximum number of authors (since there is a limit on the number of authors that can be assigned to one publication).
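The following Java fragment is a minimal sketch of this constrained assignment; the types, fields, and the retry limit are our own illustrative choices, not the actual UnivGenerator implementation:

import java.util.*;

// Illustrative sketch: assign publications to a researcher subject to the three preconditions.
class AssignmentSketch {
    static class Publication {
        final String researchField;
        final int maxAuthors;
        final List<String> authors = new ArrayList<>();
        Publication(String researchField, int maxAuthors) {
            this.researchField = researchField;
            this.maxAuthors = maxAuthors;
        }
    }

    static void assign(String researcher, Set<String> researcherFields,
                       List<Publication> pool, int nWanted, Random rng) {
        Set<Publication> chosen = new HashSet<>();
        int attempts = 0;
        while (chosen.size() < nWanted && attempts++ < 10 * nWanted) {
            Publication p = pool.get(rng.nextInt(pool.size()));
            if (!researcherFields.contains(p.researchField)) continue; // field must match the researcher
            if (chosen.contains(p)) continue;                          // no duplicate assignments
            if (p.authors.size() >= p.maxAuthors) continue;            // author limit not exceeded
            p.authors.add(researcher);
            chosen.add(p);
        }
    }

    public static void main(String[] args) {
        List<Publication> pool = new ArrayList<>();
        pool.add(new Publication("Software Engineering", 8));
        assign("researcher-1", Set.of("Software Engineering"), pool, 1, new Random(1));
        System.out.println(pool.get(0).authors);
    }
}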
IV. EXPERIMENTAL DESIGN
The basic concern addressed in this study is: How does
performance scale when answering queries in the context of
ScienceWeb, where domain-specific rules are added to
native logic?
A. Targeted system
The Oracle Database (11g r2) Semantic Technologies
was used in this study [8].
Although Oracle provides both incremental and batch loading mechanisms for semantic data, systems with domain-specific rules are limited to batch loading because inference
entailments must be computed before any queries employing
those rules can be answered. Oracle provides mechanisms
for submitting queries directly in SQL-related notation but
also provides APIs via which other systems, such as Jena or
Sesame, may submit queries.
Inference is done all at once, using forward chaining; triples are inferred and stored ahead of query time, i.e., there is no “on-the-fly” reasoning. This should result in fast query response times. Incremental inference is supported, but only over the various native rule bases provided, such as RDF, RDFS, and OWL. Incremental inference over domain-specific rule bases is not supported [8, 19]. When working with domain-specific rules, any change to the contents of the Oracle DB, whether to the rules, the ontology, or even the semantic data, requires us to redo the whole procedure of loading and computing inference entailments from the beginning.
B. Experiment Settings and Performance Metrics
We conducted a number of tests addressing the
scalability and the efficiency of the targeted system. The
experiment was done on a PC with a 2.40 GHz Intel Xeon processor and 16 GB of memory, running Windows Server 2008 R2 Enterprise.
The performance metrics used are:
• Load time: the time spent loading the sample data from the input files into the DB.
• Inference time (reasoning time): the time it takes to reason about the data stored in the database.
• Query response time: the time it takes to query the database.
C. The test procedure
For each test sample, we created an RDF_data table, a
semantic model and a staging table. Then, we loaded the
sample data into the database. After that entailments were
computed and stored. Finally, we submitted selected queries
about the sample data. Time was recorded during each stage:
loading time, inference time, and query response time.
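The following Java/JDBC fragment is a minimal sketch of this per-sample procedure. The connection string, schema, and staging-table name are placeholders, and the SEM_APIS calls reflect our reading of the Oracle 11g Semantic Technologies interface; the model and rulebase names match those used in the queries of Section IV.D:

import java.sql.*;

// Illustrative sketch of the per-sample test procedure.
class OracleTestProcedureSketch {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:oracle:thin:@//localhost:1521/orcl";   // placeholder connection details
        try (Connection c = DriverManager.getConnection(url, "univ", "univ");
             Statement s = c.createStatement()) {

            // 1. Application table and semantic model for this sample.
            s.execute("CREATE TABLE univ_rdf_data (triple SDO_RDF_TRIPLE_S)");
            s.execute("BEGIN SEM_APIS.CREATE_SEM_MODEL('univ','univ_rdf_data','triple'); END;");

            // 2. The staging table (here UNIV_STAGE) is assumed to have been filled
            //    from the N-Triple file by SQL*Loader; bulk-load it into the model.
            s.execute("BEGIN SEM_APIS.BULK_LOAD_FROM_STAGING_TABLE('univ','UNIV','UNIV_STAGE'); END;");

            // 3. Forward-chaining inference over OWLPRIME plus the custom rulebase UNIV_RB.
            s.execute("BEGIN SEM_APIS.CREATE_ENTAILMENT('univ_idx', " +
                      "SEM_MODELS('univ'), SEM_RULEBASES('OWLPRIME','UNIV_RB')); END;");

            // 4. Timed query (Query 1 of Section IV.D).
            long t0 = System.nanoTime();
            try (ResultSet r = s.executeQuery("SELECT count(*) FROM univ_rdf_data")) {
                r.next();
                System.out.println(r.getInt(1) + " triples, query took "
                                   + (System.nanoTime() - t0) / 1e9 + " s");
            }
        }
    }
}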
D. Rules and Queries
Each query of the custom query set we created answers a
question that requires reasoning over a specific rule or
sometimes more than one rule. Because the ScienceWeb
knowledge base does not yet exist, we cannot claim that
these rules and queries exercise the reasoning system in ways
that are representative of the eventual system. However, they
are designed to exhibit the key kinds of reasoning that we
anticipate; including summary statistics over large sets,
reasoning over transitive closures, and reasoning over deeply
recursive rules.
The following are the rules and the queries used in our
experiment :
Coauthor rule: Authors of a common publication are called “co-authors”
of one another.
[(?x :authorOf ?p) (?y :authorOf ?p) notEqual(?x,?y) -> (?x :coAuthor ?y)]
Collaborator Of rule: Researchers are “Collaborators” if they are co-authors or if one is the advisor of the other.
[(?x :coAuthor ?y)-> (?x :collaboratorOf ?y)]
[(?x :advisorOf ?y)-> (?x :collaboratorOf ?y)]
[(?x :advisedBy ?y)-> (?x :collaboratorOf ?y)]
Ground Breaking rule: A “Ground Breaking” researcher in a specific field is an author of a publication, published prior to 1990, in that research field with a Google citation count greater than 1,000.
[ (?x :authorOf ?p) (?p :inResearchField ?f) (?p :hasGCCount ?c) (?p :inYear ?n) greaterThan(?c,50000) lessThan(?n,1990) -> (?x :IsGroundBreaking ?f)]
Cool Project rule: A project that is funded with more than $500,000/yr is
said to be a “cool” project.
[ (?x :hasAmount ?c) (?x rdf:type :Projects) (?x :inResearchField ?f)
greaterThan(?c,"500000"^^xsd:int)-> (?x :coolProject ?f)]
Same Advisor rule: Researchers have the “same advisor” if they were
advised by the same person but they themselves are distinct people.
[(?x :advisedBy ?z) (?y :advisedBy ?z) notEqual(?x,?y) -> (?x
:sameAdvisor ?y)]
Colleague Of rule : Two faculty are “Colleagues” if they work for the
same department but are distinct people.
[(?x :worksFor ?z) (?y :worksFor ?z) (?x rdf:type :Faculty) (?y rdf:type
:Faculty) notEqual(?x,?y) -> (?x :colleagueOf ?y)]
Research Ancestor rule (transitive): A researcher is a “research
ancestor” of another researcher if the researcher is a (direct or indirect)
advisor of that researcher.
[(?a :advisorOf ?b) -> (?a :RAncestorOf ?b)]
[ (?a :advisorOf ?c) (?c :RAncestorOf ?b) -> (?a :RAncestorOf ?b)]
Multi-Generational Advisor rule (recursive): A “Multi-generational advisor” is a researcher who is the advisor of two different researchers where one of those researchers is an advisor of the other. Any research ancestor of a multi-generational advisor is a multi-generational advisor himself/herself.
[ (?z :advisorOf ?y) (?z :advisorOf ?w) (?w :advisorOf ?y) ->
(:Multi_generational rdf:type Class ) (?z rdf:subClassOf
:Multi_generational)]
[ (?z :RAncestorOf ?y) (?y rdf:subClassOf :Multi_generational) -> (?z
rdf:subClassOf :Multi_generational)]
The queries used for our experiment are all expressed as counts of a set of selected objects: to exercise summary capabilities, to simplify the evaluation, and to eliminate the time for formatting and output of large result sets as a factor in the timing. Again, formal statements can be found in [cite full version].
Query 1: returns and counts the number of triples in the model table
select count(*) from univ_rdf_data;

Query 2: returns and counts all triples in the model (including the inferred ones)
select count(*) from (SELECT s, p, o FROM TABLE(SEM_MATCH('(?s ?p ?o)', SEM_Models('univ'), SEM_Rulebases('OWLPRIME','UNIV_RB'), SEM_ALIASES(SEM_ALIAS('','http://www.owlontologies.com/OntologyUniversityResearchModel.owl#')), null)));

Query 3: returns and counts all co-authors (as a result of executing the co-author rule)
select count(*) from (SELECT a as "Author", b as "Co-Author" FROM TABLE(SEM_MATCH('(?a :coAuthorOf ?b)', SEM_Models('univ'), SEM_Rulebases('OWLPRIME','UNIV_RB'), SEM_ALIASES(SEM_ALIAS('','http://www.owlontologies.com/OntologyUniversityResearchModel.owl#')), null)));

Query 4: returns and counts all research ancestors (as a result of executing the research ancestor rule)
select count(*) from (SELECT a as "Advisor", b as "Advisee" FROM TABLE(SEM_MATCH('(?a :RAncestorOf ?b)', SEM_Models('univ'), SEM_Rulebases('OWLPRIME','UNIV_RB'), SEM_ALIASES(SEM_ALIAS('','http://www.owlontologies.com/OntologyUniversityResearchModel.owl#')), null)));

Query 5: returns and counts all multi-generational advisors (as a result of executing the multi-generational advisor rule)
select count(*) from (SELECT z as "Multi-generational" FROM TABLE(SEM_MATCH('(?z :Multi_generational ?z)', SEM_Models('univ'), SEM_Rulebases('OWLPRIME','UNIV_RB'), SEM_ALIASES(SEM_ALIAS('','http://www.owlontologies.com/OntologyUniversityResearchModel.owl#')), null)));
E. Data Samples
Three tests were conducted. In each test a number of data samples were generated. Each sample was studied alone: first loading the sample into the DB, then creating the entailment, and finally querying the data in that sample. Time was recorded during each stage: loading time, inference time, and query response time.
The first test is a direct examination of the effect of sample size upon performance time using a minimal domain-specific rule set. 24 data samples of different sizes were generated, with sizes ranging from 3,500 to 4 million triples. A single domain-specific rule was used in this test, the co-author rule, and only the co-author query was used.
The second test examined the variance in performance, due to random changes in the internal structure of the data set, on samples of approximately the same size. 24 samples were generated with the same input parameters but with different seed values each time, which results in multiple samples of similar size but varying internal connectivity. Again, a single domain-specific rule was used in this test, the co-author rule, and its corresponding query, the co-author query.
The third test focused on the effect of adding more
elaborate domain-specific rules. 16 samples of different sizes
were generated. These samples are more complicated than
the ones used in the previous tests. All domain-specific rules
were used in this test and all queries were executed. To
support this, the number of relations was increased among
some classes. In this test, each sample was tested (loaded in
the database, entailments generated, and queries applied) six
times, with the ordering of tests determined randomly. This
procedure was chosen to explore some unexpected variance
in observed times, as discussed later.
Oracle bulk load was used to load the sample data into
the database. It requires the data to be in N-Triple format, so
another tool, RDF2RDF [20], was used for format
translation. Loading the data into a semantic model in the
database is done in two steps. First, the data is loaded from the input file into a staging table using SQL*Loader, a tool from Oracle. After that, the data is loaded from the staging table into the semantic model by calling a specific PL/SQL procedure. The inference is done by executing a "create_entailment" procedure; inferred triples are generated and saved in the DB.
V. RESULTS AND DISCUSSIONS
The results from the first test are presented in Figures 3
and 4. In this test, inferencing was performed over OWL
Prime type relationships and a single domain-specific rule.
Figure 3 shows a faster than linear growth, as the sample
size increases, in the time required to load the sample into
the staging table, the time required to load it from the staging table into the DB, and the time required to reason about the data. A log-log plot of this same data has a linear
correlation coefficient of 0.98 and a linear regression slope
of 1.5, suggesting that the growth is polynomial and
significantly slower than the square of the sample size.
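The polynomial reading follows directly from this fit: for a regression line with slope b and intercept a on log-log axes,

\log t \;=\; a + b \log n \quad\Longrightarrow\quad t \;=\; e^{a}\, n^{b},

so the measured slope b ≈ 1.5 corresponds to growth on the order of n^{1.5}, between linear and quadratic in the sample size n.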
Figure 3. Load/Inference performance (Test 1)
Figure 4. Query performance & caching effect (Test 1)
Since all the reasoning is done at once when creating the entailment, with all the inferred data saved ahead of time in the database, querying does not take much time. Figure 4 shows
the query response times. Note that the total time required is
orders of magnitude smaller than the times shown in Figure
3. Again, faster than linear growth is observed as the sample
size increases. The logarithms of the sample size and the first
execution time have a linear correlation of 0.97 and a linear
regression slope of 1.75, again indicating a low-level
polynomial growth rate.
The same query was executed 3 times over the same
sample data to measure the possible effects of caching. The
3rd execution of the query takes less time than the 2nd, which in turn takes less time than the 1st execution. Any number
of executions after the 3rd one (not shown in the figure) will
result in approximately the same response time as the 3rd
one.
Figures 5 and 6 show the results of the second test. In this second test, UnivGenerator was fed with different seed values but the same values for the rest of the parameters. Hence, the samples generated have similar sizes and complexities. However, random selection results in samples with different internal connectivity. Each of the samples underwent the same testing procedure we conducted in the first test. The time for loading and inference was recorded and compared.
Figure 5. Load/Inference performance (Test 2)
Figure 6. Query performance & caching effect (Test 2)
Working with samples of similar sizes and similar complexities should result in similar performance metrics.
This is confirmed in Figure 5, which shows the
loading/inference time for these samples.
Figure 6 shows that querying over samples of similar sizes and similar complexities results in similar performance metrics. The figure also shows the caching
effect when executing the same query more than once over
each sample.
The results of the third test are shown in Figures 7, 8 and
9. With all rules now in the domain-specific rule base, and 5 queries, the testing procedure was conducted 6 times per sample. The idea behind executing the same sample 6 times is to test the system's consistency over the same data samples.
Figure 7. Load data from input file into S-Table (Test 3)
Figure 8. Load data from S-Table to model (Test 3)
Figure 9. Create entailment (Test 3)
When comparing the results of one sample, the time it
takes to load the data of that sample into a staging table is
almost the same from one run to another. This loading process takes place outside the Oracle DBMS, through the use of an external Oracle tool, SQL*Loader. With an
average standard deviation of 0.21 seconds for each of the 16
samples (executed 6 times each), the variance is negligible.
When loading the samples in the second step from the
staging table into the semantic model, moderate differences
were spotted from one run to another, with an average
standard deviation of 1.0 seconds.
After creating the entailments for the samples in this test,
we noticed significant differences in inference time among
the different runs of the same sample and among all samples
as well. For example, on the 2nd run of sample 2, the inference took 1,344 seconds. However, in the 4th run the inference took 180.16 seconds. The standard deviation of the inference times of sample 2 executed 6 times was 507 seconds, quite large compared to the mean of 804 seconds.
As the number of triples in a sample data set increases,
one would expect an increase in the inference time.
However, Figure 9 shows a different behavior. For example,
sample 8, of size 291,720 triples, has an average inference time of 2,664 seconds over six runs, whereas sample 9, of size 328,035 triples, has an average inference time of 1,942 seconds. The system was sometimes able to perform the
reasoning in less time for a bigger sample size!
We extended our experiment for this test to closely study
the variance observed in the inference stage of Test 3.
Suspecting possible caching effects, we varied the order in
which samples were tested. Suspecting possible interference
from other processes running on the server, we performed
the experiment multiple times over periods of several days
and at many different times of day. Neither of these
postulated effects appeared to be a significant contributor to
the high variance.
Suspecting an only partly understood relationship
between the complexity of the domain-specific rule base and
the variance, we excluded all domain-specific rules except one, the co-author rule, and conducted the whole testing procedure again on the same samples. Figure 10
shows the performance results of the inference. The system
shows consistency when running the same sample multiple
times, and as expected the bigger the size of the sample, the
more time it takes for the inference.
Hence, the number of domain-specific rules has an impact on the high variance of Oracle's inference times when reasoning over sample data.
We then selected one sample, using all custom rules, and ran it many times to see how often this behavior occurs. Figure 11 shows the variation in the
inference time among the different runs on the same sample.
The minimum inference time for sample 3, recorded on
the 5th run, was 3 minutes and 17.69 seconds, and the
maximum inference time recorded on the 33rd run was 20
minutes and 25 seconds. The overall standard deviation equals 332 seconds, a significant fraction of the expected value of 816 seconds. The reason for this large, very real variation is an open question.
Figure 10. Create entailment (one rule)
Figure 11. Create entailment on sample 3 w/all rules 35 times
VI. CONCLUSION
We have explored in this paper the feasibility of
developing a system to handle qualitative queries, using
ontologies and reasoning systems that can handle customized
rules. When working with a realistic model (ScienceWeb), a challenging issue of scalability arises when the size of the semantic data increases to millions of triples. Moreover, another serious challenge, real-time response, arises when many customized rules are used.
Oracle offers decent scalability for the kinds of datasets anticipated for ScienceWeb. The time required for critical operations generally grew more slowly than the square of the dataset size, and Oracle also offers good response times when querying the semantic data. This fast response comes in part from the heavy pre-computing of the entailed relationships at the beginning. Although this has a positive implication for query response times, it has a negative one for evolving systems. Any change in the ontology, the semantic data, or the rule set requires setting up the system from the beginning.
Of some concern is the absolute magnitude of the time
required for operations. Loading new information and
computing the consequent entailments is far too slow for an
interactive system, even though that time grows quite slowly.
Also of concern is the high degree of variance in the time required to perform inferencing for even a small number of rules, which
raises concerns about the consistent responsiveness of
systems based upon this platform.
REFERENCES
[1] T. Berners-Lee, J. Hendler, and O. Lassila, "The semantic web," Scientific American, vol. 284, no. 5, pp. 28-37, 2001.
[2] E. Prud'hommeaux and A. Seaborne, "SPARQL Query Language for RDF," W3C Recommendation, World Wide Web Consortium, 15 January 2008.
[3] I. Horrocks, P. Patel-Schneider, H. Boley, S. Tabet, B. Grosof, and M. Dean, "SWRL: A semantic web rule language combining OWL and RuleML," W3C Member Submission, vol. 21, 2004.
[4] Jena team and Hewlett-Packard, Jena Semantic Web Framework, 2010; http://jena.sourceforge.net/
[5] Pellet: The Open Source OWL 2 Reasoner, 2010; http://clarkparsia.com/pellet/
[6] Information Process Engineering (IPE), Institute of Applied Informatics and Formal Description Methods (AIFB), and I. M. G. (IMG), "KAON2 - Ontology Management for the Semantic Web," 2010; http://kaon2.semanticweb.org/
[7] Ontotext, OWLIM - OWL Semantic Repository, 2010; http://www.ontotext.com/owlim/
[8] Oracle Corporation, Oracle Database 11g R2, 2010; http://www.oracledatabase11g.com
[9] Lehigh University Benchmark (LUBM); http://swat.cse.lehigh.edu/projects/lubm/
[10] Y. Guo, Z. Pan, and J. Heflin, "LUBM: A benchmark for OWL knowledge base systems," Web Semantics: Science, Services and Agents on the World Wide Web, vol. 3, no. 2-3, pp. 158-182, 2005.
[11] L. Ma, Y. Yang, Z. Qiu, G. Xie, Y. Pan, and S. Liu, "Towards a Complete OWL Ontology Benchmark," The Semantic Web: Research and Applications, 2006, pp. 125-139.
[12] C. Lee, S. Park, D. Lee, J.-w. Lee, O.-R. Jeong, and S.-g. Lee, "A comparison of ontology reasoning systems using query sequences," Proc. of the 2nd International Conference on Ubiquitous Information Management and Communication, Suwon, Korea, ACM, 2008.
[13] B. Motik and U. Sattler, "A Comparison of Reasoning Techniques for Querying Large Description Logic ABoxes," Logic for Programming, Artificial Intelligence, and Reasoning, 2006, pp. 227-241.
[14] T. Gardiner, I. Horrocks, and D. Tsarkov, "Automated benchmarking of description logic reasoners," Proc. of the Workshop on Description Logics (DL'06), vol. 189 of CEUR, Lake District, UK, 2006, pp. 167-174.
[15] J. Bock, P. Haase, Q. Ji, and R. Volz, "Benchmarking OWL reasoners," Proc. of the ARea2008 Workshop, vol. 350, Tenerife, Spain, 2008.
[16] E. Franconi, M. Kifer, W. May, T. Weithöner, T. Liebig, M. Luther, S. Böhm, F. von Henke, and O. Noppens, "Real-World Reasoning with OWL," The Semantic Web: Research and Applications, vol. 4519, Springer Berlin / Heidelberg, 2007, pp. 296-310.
[17] K. Rohloff, M. Dean, I. Emmons, D. Ryder, and J. Sumner, "An evaluation of triple-store technologies for large data stores," Proc. of the 2007 OTM Confederated International Conference on On the Move to Meaningful Internet Systems, 2007, pp. 1105-1114.
[18] T. Weithöner, T. Liebig, M. Luther, and S. Böhm, "What's Wrong with OWL Benchmarks?," Proc. of the Second Int. Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS 2006), Athens, GA, USA, 2006, pp. 101-114.
[19] X. Lopez and S. Das, "Semantic Technologies in Oracle Database 11g Release 2: Capabilities, Interfaces, Performance," Sessions at the Semantic Technology Conference, June 2010.
[20] E. Minack, RDF2RDF, July 2010; http://www.l3s.de/~minack/rdf2rdf/
[21] H. Shi, K. Maly, S. Zeil, and M. Zubair, "Comparison of Ontology Reasoning Systems Using Custom Rules," WIMS 2011, Article No. 16, Sogndal, Norway, May 2011.
[22] H. Shi, "Adaptive Reasoning for Semantic Queries: a White Paper"; http://www.cs.odu.edu/~maly/papers/AdaptiveReasoning.docx