Efficient Linked-List RDF Indexing in Parliament

GEOSPARQL IN PARLIAMENT
Terra Cognita
Dave Kolas
November 12, 2012

Parliament
- In continuous customer use for ~10 years (originally DAML-DB)
- Triple store with SPARQL support
- Implemented as a persistence layer for Jena/Sesame
- Includes spatial and temporal indexing/processing
- Open source! http://parliament.semwebcentral.org/
Design
[Architecture diagram. Components: Joseki, Model, IndexingGraph, Parliament Graph, Spatial Index Processor, Temporal Index Processor, Spatial Index (deegree), Temporal Index (BDB), Parliament (C++). Legend: Part of Jena, Parliament Framework, External Storage.]
Parliament’s Indexing Strategy
- Applications often require efficient statement insertion
- Goal: balance insertion performance, query performance, and space required
- Parliament stores triples using two components:
  - Resource dictionary
  - Statement table
- Additional indices can be added for specific purposes and vocabularies:
  - Spatial Index
  - Temporal Index
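
The two components named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Parliament's C++ implementation: the linked-list layout is inferred from the title of this deck, and every name here (`TinyStore`, `first_subject`, `next_subject`, and so on) is hypothetical. Each resource records the most recent statement that uses it in each position, and each statement points to the previous statement sharing the same subject, predicate, or object, so insertion only touches list heads.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple

NONE = -1  # sentinel meaning "no statement"


@dataclass
class Resource:
    value: str
    first_subject: int = NONE    # head of the list of statements using this resource as subject
    first_predicate: int = NONE  # ... as predicate
    first_object: int = NONE     # ... as object


@dataclass
class Statement:
    s: int  # resource ids
    p: int
    o: int
    next_subject: int = NONE    # next statement with the same subject
    next_predicate: int = NONE  # next statement with the same predicate
    next_object: int = NONE     # next statement with the same object


class TinyStore:
    """Hypothetical in-memory store: a resource dictionary plus a statement table."""

    def __init__(self) -> None:
        self.ids: Dict[str, int] = {}
        self.resources: List[Resource] = []
        self.statements: List[Statement] = []

    def intern(self, value: str) -> int:
        """Return the id for a resource, adding it to the dictionary if new."""
        rid = self.ids.get(value)
        if rid is None:
            rid = len(self.resources)
            self.ids[value] = rid
            self.resources.append(Resource(value))
        return rid

    def add(self, s: str, p: str, o: str) -> int:
        """Append a statement and push it onto three list heads (no duplicate check)."""
        si, pi, oi = self.intern(s), self.intern(p), self.intern(o)
        rs, rp, ro = self.resources[si], self.resources[pi], self.resources[oi]
        sid = len(self.statements)
        self.statements.append(
            Statement(si, pi, oi, rs.first_subject, rp.first_predicate, ro.first_object)
        )
        rs.first_subject = sid
        rp.first_predicate = sid
        ro.first_object = sid
        return sid

    def _walk(self, value: str, first_attr: str, next_attr: str) -> Iterator[Tuple[str, str, str]]:
        rid = self.ids.get(value)
        sid = NONE if rid is None else getattr(self.resources[rid], first_attr)
        while sid != NONE:
            st = self.statements[sid]
            yield tuple(self.resources[i].value for i in (st.s, st.p, st.o))
            sid = getattr(st, next_attr)

    def with_subject(self, value: str) -> List[Tuple[str, str, str]]:
        return list(self._walk(value, "first_subject", "next_subject"))

    def with_object(self, value: str) -> List[Tuple[str, str, str]]:
        return list(self._walk(value, "first_object", "next_object"))


store = TinyStore()
store.add("ex:school1", "rdf:type", "gn:Feature")
store.add("ex:school1", "geo:hasGeometry", "ex:geom1")
store.add("ex:geom1", "geo:asWKT", "POINT(0 0)")
```

In this sketch a lookup by subject or object follows one chain and never scans the whole table, and results come back newest first because new statements are pushed at the head of each list.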

Parliament’s Spatial Index
- First created before GeoSPARQL; used terms derived from GeoRSS
- Now supports most of the GeoSPARQL specification
- Index is based on the R-tree in the deegree library (deegree.org)
- Approach:
  - Explicit geometries, no qualitative reasoning
  - Optimization so far on triple patterns, not functions
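
To make the R-tree idea concrete, here is a minimal Python sketch of the pruning step such an index relies on. It is not the deegree API and does not show Parliament's code; the names `Node`, `leaf`, `group`, and `search` are hypothetical, and the tree is built by hand rather than by a real insertion algorithm. Each node stores the bounding box of everything beneath it, so a search can skip any subtree whose box misses the query box.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def intersects(a: Box, b: Box) -> bool:
    """True if the two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(boxes: List[Box]) -> Box:
    """Smallest box that encloses all the given boxes."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


@dataclass
class Node:
    box: Box
    children: List["Node"] = field(default_factory=list)
    item: str = ""  # only meaningful on leaf entries


def leaf(item: str, box: Box) -> Node:
    return Node(box=box, item=item)


def group(children: List[Node]) -> Node:
    return Node(box=union([c.box for c in children]), children=children)


def search(node: Node, query: Box) -> List[str]:
    """Collect leaf items whose boxes intersect the query, pruning whole subtrees."""
    if not intersects(node.box, query):
        return []
    if not node.children:
        return [node.item]
    hits: List[str] = []
    for child in node.children:
        hits.extend(search(child, query))
    return hits


tree = group([
    group([leaf("school_a", (0, 0, 1, 1)), leaf("school_b", (2, 2, 3, 3))]),
    group([leaf("river_c", (10, 10, 12, 12))]),
])
```

A bounding-box hit is only a candidate: an exact geometry test would still be needed afterwards to confirm a relation such as "within".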
GeoSPARQL Implementation
Parliament supports:
- Both GML and WKT literals, and can convert between them
- All three vocabularies for spatial relations (Simple Features, RCC8, and Egenhofer)
- Triple-pattern spatial relations
- Filter functions for spatial relations and spatial combinations
- A large number of coordinate reference systems
- RDFS reasoning
GeoSPARQL Missing Pieces
The following features of GeoSPARQL are not currently implemented in Parliament:
- Feature-to-feature spatial relations via query rewriting
- Optimization on FILTER functions
- Qualitative reasoning
- Standard properties for Geometry:
  - dimension, spatialDimension, isEmpty, isSimple, hasSerialization
- Function getSRID
Parliament’s Temporal Index
- Parallel to the spatial index
- Terminology taken from OWL-Time (using Allen relations for overlapping intervals, etc.)
- Uses the Java version of Berkeley DB for persisting the index
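
As a reminder of what the Allen relations distinguish, the following Python sketch classifies how one interval relates to another. It is illustrative only and does not show how Parliament's temporal index works internally; the function name `allen_relation` and the tuple representation of intervals are assumptions. The returned names follow the OWL-Time properties with the `interval` prefix dropped (for example, `overlaps` for `intervalOverlaps`).

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end) with start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation that holds from interval a to interval b."""
    (a0, a1), (b0, b1) = a, b
    if a0 >= a1 or b0 >= b1:
        raise ValueError("intervals must have start < end")
    if a1 < b0:
        return "before"
    if b1 < a0:
        return "after"
    if a1 == b0:
        return "meets"
    if b1 == a0:
        return "metBy"
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "startedBy"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finishedBy"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    # Remaining case: the intervals partially overlap and all endpoints differ.
    return "overlaps" if a0 < b0 else "overlappedBy"
```

Exactly one of the thirteen relations holds for any pair of proper intervals, which is what makes them usable as mutually exclusive query predicates.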
Build Process Improvements
- Until very recently, GeoSPARQL support was on a branch and required building for your desired platform
- GeoSPARQL support has been merged into the trunk, and prebuilt binaries are now available for Windows, Mac, and Linux
- The Parliament build structure has been improved again to require fewer dependencies
Examples
- Data on geosparql.bbn.com
- Data sets:
  - USGS data in Atlanta, GA
    - Rails, Rivers
  - Geonames data
    - Administrative areas
    - Points for buildings, such as schools
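Example Query 1
Find all schools within Georgia

PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX gn: <http://www.geonames.org/ontology#>
# The gu: prefix is not declared on the slide; declare it before running.
SELECT DISTINCT ?school
WHERE {
  GRAPH <http://example.org/data> {
    # get Georgia geometry
    gu:_1705317 geo:hasGeometry ?ga_geo .
    # get schools within Georgia
    ?school a gn:Feature ;
            geo:hasGeometry ?school_geo ;
            gn:featureCode gn:S.SCH .
    ?school_geo geo:sfWithin ?ga_geo .
  }
}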
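Example Query 2
Find Geonames features within 10 km of the Nixon Grove School

PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX units: <http://www.opengis.net/def/uom/OGC/1.0/>
SELECT ?x
WHERE {
  GRAPH <http://www.geonames.org> {
    # get the school's geometry and buffer it by 10,000 metres
    <http://sws.geonames.org/4212826/> geo:hasGeometry ?geo1 .
    ?geo1 geo:asWKT ?wkt1 .
    BIND (geof:buffer(?wkt1, 10000, units:metre) AS ?buff) .
    # keep features whose geometry lies inside the buffer
    ?x geo:hasGeometry ?geo2 .
    ?geo2 geo:asWKT ?wkt2 .
    FILTER (geof:sfContains(?buff, ?wkt2))
  }
}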