ordring2013 - Stream Reasoning

SLUBM: An Extended LUBM Benchmark for Stream Reasoning
Tu Ngoc Nguyen, Wolf Siberski
L3S Research Center, Universität Hannover, Germany
{tunguyen, siberski}@l3s.de
Outline
1. Motivation
2. Benchmark
   • Dataset
   • Methodology
3. Tested Systems
4. Evaluation
• Settings and Results
5. Conclusion
Motivation
RDF streams are everywhere
- social networks, feeds, financial markets, sensor networks
The need to process heterogeneous and noisy RDF
- Stream-based reasoners
Application developers have to choose a system
- Best practice
- Benchmark
Benchmark
Extended Lehigh University Benchmark [LUBM]
• Synthetic data, fixed list of 14 queries
• Can be scaled to arbitrary sizes
• Generates data for the University domain
• Familiar but non-trivial ontology
  • University, Faculty, Professors, Students, Courses, …
• Realistic structural properties
• Artificial literal data
  • “Professor1”, “GraduateStudent216”, “Course7”
Dataset
• Simulate temporal University data
  • Partition data by semesters
  • RDF triples + time annotations
    • e.g., (<GraduateStudent31, ub:takesCourse, GraduateCourse1>, semester2)
• Classification of predicates by how dynamic they are
  • Three classes: dynamic, near-dynamic and static
  • Examples:
    • Dynamic: teaches, takes course
    • Near-dynamic: has a member
    • Static: has a degree from
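
The time annotation and the three predicate classes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the benchmark's own code: the `AnnotatedTriple` type, the `PREDICATE_CLASS` table, the default class for unlisted predicates, and the mapping of the slide's labels to LUBM-style predicate names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Dynamics(Enum):
    DYNAMIC = "dynamic"
    NEAR_DYNAMIC = "near-dynamic"
    STATIC = "static"


# Assumed mapping of the slide's examples to LUBM-style predicate names.
PREDICATE_CLASS = {
    "ub:teacherOf": Dynamics.DYNAMIC,      # "teaches"
    "ub:takesCourse": Dynamics.DYNAMIC,    # "takes course"
    "ub:member": Dynamics.NEAR_DYNAMIC,    # "has a member"
    "ub:degreeFrom": Dynamics.STATIC,      # "has a degree from"
}


@dataclass(frozen=True)
class AnnotatedTriple:
    """An RDF triple plus the semester in which it was asserted."""
    subject: str
    predicate: str
    obj: str
    semester: int


def classify(triple, default=Dynamics.STATIC):
    """Look up the class of a triple's predicate (unlisted ones count as static here)."""
    return PREDICATE_CLASS.get(triple.predicate, default)


example = AnnotatedTriple("GraduateStudent31", "ub:takesCourse", "GraduateCourse1", 2)
```

Treating unlisted predicates as static is only a convenient default for the sketch; a real classification would have to cover every predicate in the ontology.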
Methodology
System pipeline
Methodology
• Data Generator:
  • Re-generate University-domain facts
  • A semester counter for the loop
Figure: generation loop with a semester++ counter; labels: ub:Student, ub:GradStudent, ub:Undergrad, ub:GradCourse, ub:takesCourse, rdfs:subClassOf
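
The regeneration loop with its semester counter might look roughly like the sketch below. It is an assumption-laden illustration: the function name `generate_semesters`, the random course assignment and the `courses_per_student` parameter are hypothetical, and only the dynamic enrolment facts are regenerated here.

```python
import random


def generate_semesters(students, courses, num_semesters,
                       courses_per_student=2, seed=0):
    """Yield (triple, semester) pairs, regenerating enrolments each semester."""
    rng = random.Random(seed)  # fixed seed keeps the synthetic data reproducible
    for semester in range(1, num_semesters + 1):  # the semester counter
        for student in students:
            # Each student takes a fresh random set of courses this semester.
            for course in rng.sample(courses, courses_per_student):
                yield (student, "ub:takesCourse", course), semester


facts = list(generate_semesters(["Student1", "Student2"],
                                ["Course1", "Course2", "Course3"],
                                num_semesters=5))
```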
Methodology
• RDF Handler:
  • Parse RDF stream into RDF triples
  • Annotate RDF with timestamp according to the semester counter
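
A minimal sketch of these two handler steps is shown below. It assumes, purely for illustration, that the stream arrives as one batch of simplified `s p o .` lines per semester; `parse_line` and `handle_stream` are hypothetical names, and the parser is deliberately naive rather than a full N-Triples implementation.

```python
def parse_line(line):
    """Parse one simplified 's p o .' line into a triple, or None if blank/comment."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    if line.endswith("."):
        line = line[:-1].rstrip()
    parts = line.split(None, 2)  # subject, predicate, rest-of-line as object
    if len(parts) != 3:
        raise ValueError(f"malformed triple line: {line!r}")
    return tuple(parts)


def handle_stream(batches, first_semester=1):
    """Yield (triple, semester) pairs; each batch is stamped with the next semester."""
    for semester, batch in enumerate(batches, start=first_semester):
        for line in batch:
            triple = parse_line(line)
            if triple is not None:
                yield triple, semester


batches = [
    ["<GraduateStudent31> <ub:takesCourse> <GraduateCourse1> ."],
    ["", "<GraduateStudent31> <ub:takesCourse> <GraduateCourse2> ."],
]
annotated = list(handle_stream(batches))
```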
Methodology
Outdated facts need to be removed before new facts are added
• Rules for dynamic facts (facts with dynamic predicates):
  • each fact has a time-to-live Δt
  • a produced fact is removed after Δt
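
The removal rule can be illustrated with a small fact store that drops dynamic facts once they are older than Δt before accepting a new fact. This is a sketch under assumptions: `ExpiringFactStore` is a hypothetical class, time is counted in semesters, and a fact is treated as expired as soon as its age reaches Δt (the slides do not say whether the bound is inclusive).

```python
class ExpiringFactStore:
    """Keeps triples; dynamic ones are dropped once their age reaches ttl."""

    def __init__(self, dynamic_predicates, ttl):
        self.dynamic_predicates = set(dynamic_predicates)
        self.ttl = ttl      # the lifetime delta-t, in semesters
        self.facts = {}     # triple -> semester in which it was last asserted

    def expire(self, now):
        """Remove and return dynamic facts whose age is at least ttl."""
        expired = [t for t, added in self.facts.items()
                   if t[1] in self.dynamic_predicates and now - added >= self.ttl]
        for t in expired:
            del self.facts[t]
        return expired

    def add(self, triple, now):
        """Expire outdated facts first, then store the new fact."""
        self.expire(now)
        self.facts[triple] = now


store = ExpiringFactStore({"ub:takesCourse"}, ttl=1)
store.add(("Student1", "ub:takesCourse", "Course1"), now=1)
store.add(("Student1", "ub:degreeFrom", "University0"), now=1)
store.add(("Student1", "ub:takesCourse", "Course2"), now=2)
```

After the third call, the semester-1 enrolment is gone while the static degree fact and the new enrolment remain.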
Tested Systems
1. BaseVISor
   • Forward-chaining inference engine
   • Based on the Rete algorithm
2. Pellet
   • OWL-DL reasoner
3. (Pellet)+Jena
   • RDF framework, supports triple-based abstraction
4. (Pellet)+OWLAPI
   • RDF framework, supports a higher-level OWL abstraction syntax (axioms)
5. C-SPARQL
   • Language for continuous queries over streams of RDF data
   • Promising, but no reasoning support yet
Evaluation Settings
• Intel(R) Xeon(R) E7520 1.87GHz processor
  80GB memory
  OpenJDK 1.6.0 24
  Linux 2.6.x 64 bit
• 14 LUBM Queries
• 1 dynamic predicate: takesCourse (approx. 10 percent of the generated data is dynamic)
• Metrics: load time, query response time
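
The slides do not describe the measurement harness, so the following is only a sketch of how the two metrics could be collected. The `load` and `queries` callables are hypothetical stand-ins for a reasoner's loading and query interfaces, and wall-clock timing with `time.perf_counter` is an assumption.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def run_benchmark(load, queries, data):
    """Measure load time once, then the response time of each named query."""
    store, load_time = timed(load, data)
    query_times = {}
    for name, query in queries.items():
        _, query_times[name] = timed(query, store)
    return load_time, query_times


# Toy stand-ins: a set as the "store" and a filter as "Query 14".
data = [("Student1", "rdf:type", "ub:UndergraduateStudent"),
        ("Student2", "rdf:type", "ub:GraduateStudent")]
queries = {"Q14": lambda s: [t[0] for t in s if t[2] == "ub:UndergraduateStudent"]}
load_time, query_times = run_benchmark(set, queries, data)
```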
Evaluation Results
BaseVISor query time for LUBM queries for extended LUBM (1,0,5), which is LUBM(1,0) over 5 semesters
• Query 5: (type Person ?X) (memberOf ?X http://www.Department0.University0.edu)
• Query 6: (type Student ?X)
• Query 13: (type Person ?X) (hasAlumnus http://www.University0.edu ?X)
• Query 14: (type UndergraduateStudent ?X)
BaseVISor query time for Query 14 for extended LUBM (1,0,5), (5,0,5), (10,0,5) and (50,0,5)
Evaluation Results
Query time for Query 14 for extended LUBM (10,0,5)
Load time for extended LUBM (5,0,5), (10,0,5), (20,0,5) and (50,0,5)
Evaluation Results
Query time for extended LUBM (1,0,5), (5,0,5), (10,0,5), (20,0,5) and (50,0,5) (for Query 14)
Conclusion
Identified a strong need for a stream-based reasoning benchmark
• For developers of stream-based applications and stream-based reasoners
Extended LUBM towards a stream-based benchmark
• Other benchmarks can be extended similarly
Preliminary experiment with (adapted) stream-based reasoners
• BaseVISor shows promising performance