Ontology Query

What is an Ontology
• Ontologies resemble faceted taxonomies but use richer semantic relationships among terms and attributes, as well as strict rules about how to specify terms and relationships. Because ontologies do more than just control a vocabulary, they are thought of as knowledge representation. The often-quoted definition of ontology is "the specification of one's conceptualization of a knowledge domain." (~Tom Gruber)
– iawiki.net/IAGlossary

Some Related Languages For Web Ontology
• XML provides a surface syntax for structured documents, but imposes no semantic constraints on the meaning of these documents.
• XML Schema is a language for restricting the structure of XML documents and also extends XML with data types.
• RDF is a data model for objects ("resources") and relations between them. It provides a simple semantics for this data model, and these data models can be represented in an XML syntax.
• RDF Schema is a vocabulary for describing properties and classes of RDF resources, with a semantics for generalization hierarchies of such properties and classes.
• OWL adds more vocabulary for describing properties and classes: among others, relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes.

Ontology Related Languages
Language | Full name                      | Description                                    | Query language
=========================================================================================================
XML      | Extensible Markup Language     | Structured documents                           | XQuery, XPath
RDF      | Resource Description Framework | Data model for objects                         | RDQL, RQL, Versa, Squish
OWL      | Web Ontology Language          | Data model + relations                         | OWL-QL, Jena
SWRL     | Semantic Web Rule Language     | Data model + relations + rules (OWL + RuleML)  | ??

RDQL
• Stands for RDF Data Query Language.
• Is a query language for RDF.
• Is declarative.
• Considers an RDF model as a set of triples:
– (Subject, Property, Value)
• Lets you specify patterns that are matched against the triples of the model to return a result.
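The idea of matching a pattern against a set of triples can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class TriplePatternSketch, its Triple type and the methods bind and match are hypothetical names invented here, not part of Jena or of any RDQL engine, and the property name "vcard:FN" is a shorthand string rather than a full URI. Terms that start with "?" are treated as variables; everything else must match exactly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of Jena or any real RDQL engine.
class TriplePatternSketch {

    // One (subject, property, value) statement, held as plain strings.
    static final class Triple {
        final String subject;
        final String property;
        final String value;

        Triple(String subject, String property, String value) {
            this.subject = subject;
            this.property = property;
            this.value = value;
        }
    }

    // Matches one pattern term against one triple term.
    // Terms starting with '?' are variables and get bound; others must be equal.
    static boolean bind(String patternTerm, String actual, Map<String, String> bindings) {
        if (!patternTerm.startsWith("?")) {
            return patternTerm.equals(actual);
        }
        String previous = bindings.putIfAbsent(patternTerm, actual);
        // A variable used twice in one pattern must bind to the same term.
        return previous == null || previous.equals(actual);
    }

    // Returns one variable-binding map per triple that matches the pattern.
    static List<Map<String, String>> match(List<Triple> model, Triple pattern) {
        List<Map<String, String>> results = new ArrayList<>();
        for (Triple t : model) {
            Map<String, String> bindings = new LinkedHashMap<>();
            if (bind(pattern.subject, t.subject, bindings)
                    && bind(pattern.property, t.property, bindings)
                    && bind(pattern.value, t.value, bindings)) {
                results.add(bindings);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Triple> model = new ArrayList<>();
        model.add(new Triple("http://somewhere/JohnSmith/", "vcard:FN", "John Smith"));
        model.add(new Triple("http://somewhere/SarahJones/", "vcard:FN", "Sarah Jones"));

        // Pattern with one variable: who has the full name "John Smith"?
        for (Map<String, String> row : match(model, new Triple("?x", "vcard:FN", "John Smith"))) {
            System.out.println(row); // prints {?x=http://somewhere/JohnSmith/}
        }
    }
}
```

A real engine would also join several patterns on their shared variables and apply filters; the sketch stops at a single pattern to keep the matching step visible.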
RDQL: RDF Data Query Language
• Basic Syntax:
    SELECT vars
    FROM documents
    WHERE Expressions
    AND Filters
    USING Namespace declarations

Clauses
• SELECT Clause
– Identifies the variables to be returned to the application. If the application does not need all the variables, listing only the required ones can reduce the memory needed for the result set and gives useful information to a query optimizer.
• FROM Clause
– Specifies the model by URI.
• WHERE Clause
– Specifies the graph pattern as a list of triple patterns.

Clauses
• AND Clause
– Specifies Boolean expressions.
– Indicates constraints that RDQL variables must follow.
• USING Clause
– A way to shorten URIs, which makes the syntax easier to read. This is not a namespace mechanism; it is simply an abbreviation mechanism that defines a string prefix for long URIs.

Clauses in-depth: SELECT
• The SELECT portion of the query lets you indicate which RDQL variables you want the query to return. If you use SELECT ?x,?y,?z you will receive an array of tuples containing values for ?x, ?y and ?z. You can use other variables in the query, such as ?a1, ?p, ?foo, but they won't be returned because they are not present in the SELECT part of the query.

Clauses in-depth: FROM
• The FROM part of the query indicates the RDF sources to be queried. Each source is enclosed in angle brackets (< and >). If you indicate more than one source, separate them with commas.

Clauses in-depth: WHERE
• The WHERE part is the most important part of the RDQL expression. In it you indicate the constraints that RDF triples (subject, predicate, object) must satisfy in order to be returned. The WHERE part is expressed as a list of restrictions separated by commas; each restriction takes the form (subject, predicate, object), where the subject, predicate and object can each be a literal value or an RDQL variable.
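The USING clause described in the clause overview is an abbreviation mechanism, so resolving an abbreviated name comes down to string substitution. A minimal sketch, assuming names are written as prefix:local: the class UsingPrefixSketch and its expand method are hypothetical names made up for this illustration, not part of any RDQL implementation, and the namespace http://foo.org# is a placeholder.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Jena or any RDQL engine.
class UsingPrefixSketch {

    // Expands "prefix:local" to a full URI by plain string concatenation.
    // Names whose prefix was never declared are returned unchanged.
    static String expand(String name, Map<String, String> prefixes) {
        int colon = name.indexOf(':');
        if (colon < 0) {
            return name; // no prefix at all
        }
        String uri = prefixes.get(name.substring(0, colon));
        if (uri == null) {
            return name; // e.g. "http://..." where "http" is not a declared prefix
        }
        return uri + name.substring(colon + 1);
    }

    public static void main(String[] args) {
        Map<String, String> prefixes = new HashMap<>();
        // Mirrors a declaration such as: USING dt for <http://foo.org#>
        prefixes.put("dt", "http://foo.org#");

        System.out.println(expand("dt:age", prefixes));
        // prints http://foo.org#age
        System.out.println(expand("http://somewhere/JohnSmith/", prefixes));
        // prints http://somewhere/JohnSmith/ (unchanged: "http" is not declared)
    }
}
```

Because the expansion is pure concatenation, the prefix carries no semantics of its own, which is the sense in which it is an abbreviation and not a namespace mechanism.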
Clauses in-depth: WHERE
• For the predicate you can express property names using a namespace declared in the USING section. For example, <dc:name> indicates that the predicate must match the local name "name" in the namespace declared as "dc" in the USING part.

Clauses in-depth: AND
• The AND part indicates constraints that RDQL variables must follow. In the PHP implementation the AND part is a PHP expression in which the variables are RDQL variables such as ?x, ?y, etc.

Clauses in-depth: USING
• The USING section declares all the namespaces that will be used for RDF properties. Declarations are separated by commas and use the notation:
    prefix for <namespace URI>
  as in dt for <http://foo.org#>.

Syntax: Where, And
• Where: indicates constraints that RDF triples (subject, predicate, object) must satisfy
• And: indicates constraints that RDQL variables must follow

Example
• q1 contains a query:
    SELECT ?x
    WHERE (?x <http://www.w3.org/2001/vcard-rdf/3.0#FN> "John Smith")
• To execute q1 against a model m1.rdf:
    java jena.rdfquery --data m1.rdf --query q1
• The outcome is:
    x
    =============================
    <http://somewhere/JohnSmith/>

Example
• Return all the resources that have the property FN, together with the associated values:
    SELECT ?x, ?fname
    WHERE (?x <http://www.w3.org/2001/vcard-rdf/3.0#FN> ?fname)
• The outcome is:
    x                              | fname
    ================================================
    <http://somewhere/JohnSmith/>  | "John Smith"
    <http://somewhere/SarahJones/> | "Sarah Jones"
    <http://somewhere/MattJones/>  | "Matt Jones"

Example
• Return the given names of everyone whose family name is Jones:
    SELECT ?givenName
    WHERE (?y <http://www.w3.org/2001/vcard-rdf/3.0#Family> "Jones"),
          (?y <http://www.w3.org/2001/vcard-rdf/3.0#Given> ?givenName)
• The outcome is:
    givenName
    =========
    "Matthew"
    "Sarah"

Other Examples
• Return the names of all individuals older than 20 (the age ?z is used only in the filter and is not returned):
    SELECT ?y
    FROM <people.rdf>
    WHERE (?x,<dt:age>,?z),(?x,<dt:name>,?y)
    AND ?z>20
    USING dt for <http://foo.org#>,
          rdf for <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

RQL - cont
• Class & Property Querying
– Which classes can
appear as domain and range of the property creates?
    SELECT $C1, $C2
    FROM {$C1}creates{$C2}

    SELECT X, Y
    FROM Class{X}, Class{Y}, {;X}creates{;Y}
– What if?
    SELECT X, Y
    FROM {X}creates{Y}

RQL - cont
• Querying Resource Descriptions
    SELECT X
    FROM Artist{X}

    SELECT X, Y
    FROM {X}creates{Y}

RQL - cont
    SELECT Y
    FROM {X}title{Y}
    WHERE X like "*www.artchive.com*"

Using from Java Code
• It is possible to run RDQL queries from a Java application.
• The following classes are used for this:
– Query
– QueryExecution
– QueryEngine
– QueryResults
– ResultBinding

Example
    SELECT ?x, ?fname
    WHERE (?x <http://www.w3.org/2001/vcard-rdf/3.0#FN> ?fname)

    import java.util.Iterator;
    import com.hp.hpl.jena.rdf.model.*;   // Model, Resource, Literal
    import com.hp.hpl.jena.rdql.*;        // Query, QueryEngine, QueryExecution, QueryResults, ResultBinding

    // Parse the RDQL query shown above and bind it to an existing Model named model
    Query query = new Query("SELECT...");
    query.setSource(model);
    QueryExecution qe = new QueryEngine(query);
    QueryResults results = qe.exec();
    // Each ResultBinding holds one value per selected variable
    for (Iterator iter = results; iter.hasNext();) {
        ResultBinding res = (ResultBinding) iter.next();
        Resource x = (Resource) res.get("x");
        Literal fname = (Literal) res.get("fname");
        System.out.println("x: " + x + " fname: " + fname);
    }
    // Release the resources held by the result set
    results.close();

Persistent Models
• Jena lets you create persistent models:
– for example, models stored in relational databases.
• Jena 2 supports:
– MySQL
– Oracle
– PostgreSQL
• To create a persistent model:
– ModelFactory.createModelRDBMaker(conn).createModel()

Example
    // Create a connection to the database (DB_URL, DB_USER, DB_PASS and DB_TYPE are defined elsewhere)
    DBConnection c = new DBConnection(DB_URL, DB_USER, DB_PASS, DB_TYPE);
    // Create a ModelMaker for persistent models
    ModelMaker maker = ModelFactory.createModelRDBMaker(c);
    // Create a new model
    Model model = maker.createModel("modelo_1");
    // Start a transaction
    model.begin();
    // Read RDF/XML into the model from the input stream in (opened elsewhere)
    model.read(in, null);
    // Commit the transaction
    model.commit();