Features of an Enterprise-ready Triple Store Ben Szekely June, 2006

Metadata and Ontologies Workshop, © 2006 IBM Corporation, IBM Internet Technology

Most examples of RDF triple stores focus on specific difficult problems
• Focused on inference or standards
• Preoccupied with “Billions of Triples”
• Little thought given to the application programming model
• Not multi-user (limited security)

Boca Overview – Multi-user, distributed enterprise RDF repository
• Selective RDF replication from server to client machines
• Security, including named-graph-based RDF access control
• Audit trails of changes to data within named graphs
• Near real-time event notifications
• Sophisticated programming model

Named Graphs
• A named graph is the logical unit of RDF storage in Boca.
• Each triple exists in exactly one named graph
  – If the same triple appears in more than one named graph, it is stored once per graph.
  – Adding and removing triples is done in the context of a named graph
• Each named graph has a metadata graph, containing information such as ACLs
• Named graphs can be exposed via LSIDs, URLs, Web Services
• Named graph applications
  – LSID metadata
  – Workflow documents
  – Atom feeds
  – FOAF profiles

Underlying Technologies
• Relational Database (DB2, Oracle, MySQL)
  – RDF triples stored in a table (subject, predicate, object, graphid)
  – Space saved by normalizing URIs and strings to integer ids
  – Extra tables for history, ACLs, replication
• J2EE (Jetty, Tomcat, WebSphere)
  – Jetty: standalone server; check out from CVS and run for testing
  – WAS (WebSphere Application Server): enterprise-ready Web application server for real deployment
• JMS Server (ActiveMQ, WebSphere MQ)
  – Pub-sub messaging used for real-time notifications of triple updates

Replication
• Boca clients have a persistent local RDF store that mirrors a subset of the triples on the Boca server.
• Replicated subset specified by:
  – Triple patterns; e.g. (<http://tdwg.org/meetings/GUID-2#>, <http://tdwg.org/preds/hasParticipant>, *)
  – Named graph URIs
  – Triple patterns within named graphs
• When a replication is initiated, the service computes what has changed in the subset based on pattern and graph subscriptions.
• Replication can run as a background process on the client, or be explicitly initiated.
• Applications can query and write against graphs in both the local and server models.

Notification – maintaining the replica in real-time
• Updates to named graphs on the server are published in near real-time to clients.
• Local replicas can be kept up-to-date between replications.
• Notification is central to distributed RDF applications
  – Ex: workflow, collaboration

Access Controls
• Boca users can have the following system-wide permissions:
  – canInsertNamedGraphs: a user must have this permission in order to create a new named graph (i.e. insert statements into a graph that does not yet exist in the system)
• Boca users can have the following per-named-graph permissions (these also apply to the system graph):
  – canRead: a user with this permission may view the triples in the named graph and in its metadata graph
  – canAdd: a user with this permission may insert new triples into the named graph
  – canRemove: a user with this permission may remove triples from the named graph
  – canChangeNamedGraphACL: a user with this permission may change the ACL triples in the metadata graph
  – canRemoveNamedGraph: a user with this permission may entirely remove the named graph from the system

Versioning
• SVN-like approach to versioning
• When a triple is added to or removed from a named graph, a new revision of that named graph is created.
• Simple API for reading old revisions
• Provides a straightforward mechanism for concurrent distributed computing.
  – When a client submits an update to a named graph, it may specify the version number that it currently has. The update will fail if the graph has been modified more recently.

The Boca Programming Model
• Named Graphs
• Commands
• Transactions
• Versioning
• Replication
• Notification

Abandoned features – Collections, Statement ACLs & Reification
• Collections – a statement can exist in multiple collections
  – A more difficult programming model: what happens when I delete in the context of one collection?
  – Expensive to maintain
  – Not a widely accepted programming model (as named graphs are)
• Statement-level ACLs
  – Too expensive
  – Difficult to program
  – Not particularly useful, except for the odd, very important statement
  – In that case, such a statement can live in its own named graph
• Reification
  – Queries were very difficult to formulate
  – Most RDF applications do not deal with reification
  – Reification semantics often confused with true quoting
  – Reification is an arbitrary layer of indirection; the needs it addresses can be met with ontologies

Future Features
• Arbitrary query-based replication/notification
• Distributed servers
• Open source

Applications
• Executing OWL-S in a distributed fashion
• Storing annotations
• Providing LSID metadata
• Web 2.0 application backend
  – Wikis, Blogs, Tagging, Atom
• National Cancer Institute research platform
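
Sketch: revision-checked updates

The Versioning slide says that a client may submit the revision number it currently holds, and that the update fails if the named graph has been modified since. The following is a minimal Python sketch of that idea only. It is not Boca's API: the names NamedGraph, update, read, expected_revision and StaleRevisionError, and the example URIs, are hypothetical, and storing a full snapshot per revision is an assumption made to keep the example short.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class StaleRevisionError(Exception):
    """Raised when an update is based on an out-of-date revision."""


@dataclass
class NamedGraph:
    """Hypothetical in-memory named graph; not Boca's actual API."""

    uri: str
    # revisions[i] holds the full triple set at revision i; revision 0 is empty.
    # Keeping whole snapshots is an assumption made for brevity.
    revisions: List[FrozenSet[Triple]] = field(
        default_factory=lambda: [frozenset()]
    )

    @property
    def revision(self) -> int:
        """Number of the latest revision."""
        return len(self.revisions) - 1

    def read(self, revision: Optional[int] = None) -> FrozenSet[Triple]:
        """Return the triples at the given revision (latest by default)."""
        index = self.revision if revision is None else revision
        return self.revisions[index]

    def update(
        self,
        add: Iterable[Triple] = (),
        remove: Iterable[Triple] = (),
        expected_revision: Optional[int] = None,
    ) -> int:
        """Apply a change set as one new revision and return its number.

        If expected_revision is given and the graph has been modified
        since that revision, the update is rejected.
        """
        if expected_revision is not None and expected_revision != self.revision:
            raise StaleRevisionError(
                f"{self.uri}: client has revision {expected_revision}, "
                f"graph is at revision {self.revision}"
            )
        triples = (self.read() - frozenset(remove)) | frozenset(add)
        self.revisions.append(triples)
        return self.revision


# Two clients both start from revision r1 of the same graph.
alice = ("ex:meeting", "ex:hasParticipant", "ex:alice")
bob = ("ex:meeting", "ex:hasParticipant", "ex:bob")

graph = NamedGraph("http://example.org/graphs/meeting")
r1 = graph.update(add=[alice])

# The first client's update is based on the latest revision and succeeds.
r2 = graph.update(add=[bob], expected_revision=r1)

# The second client still holds r1, so its update is rejected.
conflict = False
try:
    graph.update(remove=[alice], expected_revision=r1)
except StaleRevisionError:
    conflict = True
```

In this sketch the rejected client would re-read the latest revision and resubmit, much as an SVN user updates a working copy before committing. Old revisions stay readable through read(r1).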
