Hands-on exercises in SPARQL, for querying RDF and OWL documents

1. Aims of Lab/Lecture
- to gain hands-on experience with using various endpoints for SPARQL
- to gain hands-on experience with using the 'SPARQL query' tab in Protégé
- to write simple queries in SPARQL: using prefixes and variables, using the dot ('.') as intersection, obtaining multiple results, using the semicolon (';') for a shared subject and the comma (',') for a shared subject and predicate, and dealing with optional graphs
- to write ASK, CONSTRUCT and DESCRIBE queries in SPARQL
- to learn how to explore existing RDF or OWL documents by using SPARQL to find concepts, list distinct RDF types, list OWL classes, list top-level OWL classes, and list root and derived concepts
- to explore the FOAF dataset
2. Protégé download
Some of these queries can be run in Protégé. If you want to do so, you need to install it first, from:
http://protege.stanford.edu/download/protege/4.3/installanywhere/Web_Installers/
After installation, if the SPARQL Query tab is unavailable in your Protégé workspace, make sure the SPARQL
Query item in the Window | Tabs menu is checked. Alternatively, you can add the Query view widget to any
other tab by selecting Window | Views | Query views and then placing the widget anywhere in a layout.
3. Endpoint: http://www.sparql.org/query.html
For these queries, you only need to go online to the endpoint above.
a) Simple query:
Select the title of one specific book (book1) from an RDF database.
SELECT ?title
WHERE { <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title }
b) Prefixes:
Select books from an RDF database.
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT $book
WHERE {$book title $title }
>> this gives an error. Why?
Versus:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT $book
WHERE {$book dc:title $title }
c) Variables:
Select books from an RDF database.
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book
WHERE {?book dc:title ?title }
Versus:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT $book
WHERE {?book dc:title ?title }
Or:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT $book
WHERE {$book dc:title $title }
Note the way the variables are written: SPARQL accepts both '?' and '$' as the variable marker, and the parser treats ?book and $book as the same variable.
d) Dot as AND/INTERSECT:
Select books (from an RDF database), which have an author and a title.
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book
WHERE { ?book dc:creator ?author . ?book dc:title ?title }
Versus:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?author
WHERE { ?book dc:creator ?author}
The latter reads: Select the authors of books (from an RDF database) which have an author; a title is not required.
Or:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title
WHERE {?book dc:title ?title }
The latter reads: Select the titles of books (from an RDF database) which have a title; an author is not required.
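The effect of the dot can also be tried offline. The following is only a toy sketch in Python, not a real SPARQL engine: the sample triples (book1 to book3) and the helper names match and solve are made up for illustration. It shows that two triple patterns joined by a dot keep only the books for which both patterns match with the same ?book.

```python
# Toy illustration only: made-up sample data, not the data behind the endpoint.
TRIPLES = [
    ("book1", "dc:title", "SPARQL Tutorial"),
    ("book1", "dc:creator", "alice"),
    ("book2", "dc:title", "Untitled Draft"),   # no dc:creator
    ("book3", "dc:creator", "bob"),            # no dc:title
]

def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it agrees with an earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            # Constant: must match exactly.
            return None
    return new

def solve(patterns, triples):
    """Evaluate the patterns left to right; each one extends earlier bindings."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                new = match(pattern, t, b)
                if new is not None:
                    extended.append(new)
        bindings = extended
    return bindings

# { ?book dc:creator ?author . ?book dc:title ?title }
both = solve([("?book", "dc:creator", "?author"),
              ("?book", "dc:title", "?title")], TRIPLES)
# { ?book dc:creator ?author }
only_creator = solve([("?book", "dc:creator", "?author")], TRIPLES)

print(sorted(b["?book"] for b in both))          # books with author AND title
print(sorted(b["?book"] for b in only_creator))  # books with an author
```

With this sample data only book1 survives the two-pattern query, while the single-pattern query also returns book3.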
Additional exercises: You can try out other queries there, but you need to use the Dublin Core vocabulary (such
as dc:creator and dc:title in the example):
http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=elements
e) Multiple results:
Select books, their author and title (from an RDF database).
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?author ?title
WHERE
{ ?book dc:creator ?author . ?book dc:title ?title }
Versus:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?author
WHERE
{ ?book dc:creator ?author}
The latter reads: Select books and their author (from an RDF database).
Or:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?title
WHERE
{?book dc:title ?title }
The latter reads: Select books and their title (from an RDF database).
If you want to find out what data is available for querying at the site above, you can list every triple with a pattern made only of variables:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?x ?title
WHERE
{ ?book ?x ?title }
f) Semicolon for same subject:
Select books, their author and title (from an RDF database).
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?author ?title
WHERE
{ ?book dc:creator ?author ; dc:title ?title }
Please note here that we have omitted the subject in the second triple, as it is shared with the first triple.
g) Comma for same subject and predicate:
Select books and their creators (from an RDF database), binding two variables to objects of dc:creator.
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?author ?creator
WHERE
{ ?book dc:creator ?author , ?creator }
Please note here that we have omitted the subject and predicate in the second triple, as they are shared with the first
triple.
Compare with:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?author ?creator
WHERE
{ ?book dc:creator ?author; ?creator ?author }
Please note here that we have omitted only the subject in the second triple, and that its predicate is a variable (?creator).
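To see what the two abbreviations stand for, the following small Python sketch writes them out as full triple patterns. The helper name expand and its input format are made up for this illustration; it only rewrites patterns and does not query anything.

```python
def expand(subject, predicate_objects):
    """Expand a ';' / ',' abbreviation into full triple patterns.

    predicate_objects is a list of (predicate, [object, ...]) pairs:
    each pair corresponds to one ';' group, each extra object to one ','.
    """
    return [
        (subject, predicate, obj)
        for predicate, objects in predicate_objects
        for obj in objects
    ]

# { ?book dc:creator ?author ; dc:title ?title }
semicolon = expand("?book", [("dc:creator", ["?author"]),
                             ("dc:title", ["?title"])])

# { ?book dc:creator ?author , ?creator }
comma = expand("?book", [("dc:creator", ["?author", "?creator"])])

for s, p, o in semicolon + comma:
    print(f"{s} {p} {o} .")
```

The semicolon form repeats only the subject, while the comma form repeats both the subject and the predicate.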
h) Dealing with optional graphs
Please note that a query still runs without error when no matching data is found; it simply returns an empty result.
Here we are looking for book, author, title and date of that book.
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?author ?title ?date
WHERE
{ ?book dc:creator ?author . ?book dc:title ?title . ?book dc:date ?date }
Above: if one of the triple patterns cannot be matched for a book, the query returns no result for that book, even if the
data for the other triple patterns is available.
Versus:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?author ?title ?date
WHERE
{ ?book dc:creator ?author . ?book dc:title ?title . OPTIONAL { ?book dc:date ?date } }
Above: the date is left optional, and will only be selected if it is found. The other two triples are matched
independently of whether the date exists.
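The difference between the two queries can be mimicked offline with a short Python sketch. The sample data is made up (book2 has no dc:date), the function names are hypothetical, and the sketch assumes at most one value per property, which real RDF data does not guarantee.

```python
# Made-up sample data: book2 has no dc:date.
BOOKS = {
    "book1": {"dc:creator": "alice", "dc:title": "SPARQL Tutorial", "dc:date": "2004"},
    "book2": {"dc:creator": "bob", "dc:title": "RDF Primer"},
}

def required_all(books):
    """All three patterns required: a book missing any property is dropped."""
    return [
        (name, p["dc:creator"], p["dc:title"], p["dc:date"])
        for name, p in sorted(books.items())
        if all(key in p for key in ("dc:creator", "dc:title", "dc:date"))
    ]

def optional_date(books):
    """dc:date is OPTIONAL: keep the row and leave ?date unbound (None) if missing."""
    return [
        (name, p["dc:creator"], p["dc:title"], p.get("dc:date"))
        for name, p in sorted(books.items())
        if "dc:creator" in p and "dc:title" in p
    ]

for row in required_all(BOOKS):
    print("required:", row)
for row in optional_date(BOOKS):
    print("optional:", row)
```

In the first function book2 disappears from the result; in the second it is kept, with an empty date column.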
Or, more complex queries:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX vc: <http://www.w3.org/2001/vcard-rdf/3.0#>
SELECT ?book ?author ?title ?fullname ?firstname ?surname
WHERE
{ ?book dc:creator ?author ; dc:title ?title .
OPTIONAL { ?author vc:FN ?fullname } .
OPTIONAL { ?author vc:N ?x} .
OPTIONAL { ?x vc:N ?y . ?y vc:Given ?firstname . ?y vc:Family ?surname}
}
ORDER BY ?book
What is the difference in the above query, when compared with the previous ones? Compare!
i) ASK queries:
Is there a book that has been created by an author (in an RDF database)?
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
ASK { ?book dc:creator ?author }
j) CONSTRUCT queries:
This is the regular query: Select books, their authors, titles, and, if they exist, the fullname, firstname and surname of
their authors.
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX vc: <http://www.w3.org/2001/vcard-rdf/3.0#>
SELECT ?book ?author ?title ?fullname ?firstname ?surname
WHERE
{ ?book dc:creator ?author ; dc:title ?title .
OPTIONAL { ?author vc:FN ?fullname } .
OPTIONAL { ?author vc:N ?x} .
OPTIONAL { ?x vc:N ?y . ?y vc:Given ?firstname . ?y vc:Family ?surname}
}
ORDER BY ?book
Versus:
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX vc: <http://www.w3.org/2001/vcard-rdf/3.0#>
CONSTRUCT { ?book vc:hasSurname ?surname }
WHERE
{ ?book dc:creator ?author ; dc:title ?title .
OPTIONAL { ?author vc:FN ?fullname } .
OPTIONAL { ?author vc:N ?x} .
OPTIONAL { ?x vc:N ?y . ?y vc:Given ?firstname . ?y vc:Family ?surname}
}
ORDER BY ?book
Compare the above query with the previous one!
CONSTRUCT allows the format of the data to be changed: instead of a table of variable bindings, it returns new triples built
from a template. The query will create a triple linking each book to the surname of its author, if this is available.
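What the CONSTRUCT template does with the solutions of the WHERE part can be sketched in a few lines of Python. The solution rows below are made up and the function name construct is hypothetical. The rule shown is the one in the SPARQL specification: a solution that leaves a template variable unbound contributes no triple.

```python
def construct(template, solutions):
    """Instantiate one triple template per solution; skip rows with unbound variables."""
    graph = set()  # an RDF graph is a set of triples, so duplicates collapse
    for row in solutions:
        triple = tuple(
            row.get(term) if term.startswith("?") else term
            for term in template
        )
        if None not in triple:
            graph.add(triple)
    return graph

# Made-up solutions of the WHERE part.
solutions = [
    {"?book": "book1", "?surname": "Smith"},
    {"?book": "book2"},  # the OPTIONAL part did not match: no ?surname
]

graph = construct(("?book", "vc:hasSurname", "?surname"), solutions)
for triple in sorted(graph):
    print(triple)
```

Only book1 yields a triple here, because book2 has no surname to fill into the template.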
k) DESCRIBE queries:
Describe the structure of the book element connected in a triple to its author (from an RDF database).
PREFIX books: <http://example.org/book/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
DESCRIBE ?book
WHERE
{ ?book dc:creator ?author }
4. Endpoint: http://sparql.org/sparql.html or Protégé
For these queries, you only need to go online to the endpoint above. The queries here can also be run at:
http://demo.openlinksw.com/sparql (make sure to set 'Sponging' to 'Retrieve remote RDF data for all
missing source graphs')
or at: http://librdf.org/query
a) Analysing RDF documents (same queries as in Exploration below)
e.g., a possible document to analyse is available at: http://athena.ics.forth.gr:9090/RDF/VRP/Examples/tap.rdf
Also try this and the following query with dataset: http://www.w3.org/People/Berners-Lee/card.
What is the structure of Berners-Lee's card?
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?x ?y ?z
FROM <http://www.w3.org/People/Berners-Lee/card>
WHERE { ?x ?y ?z }
b) FOAF
After we have found out the structure of an RDF document, we can go on to ask more specific queries, e.g.:
What are the names of the persons listed in the FOAF card?
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE { ?person foaf:name ?name }
5. Endpoint: http://dbpedia.org/snorql/
For these queries, you only need to go online to the endpoint above.
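The web form is enough for this lab, but the same queries can also be sent from a program. The sketch below uses only the Python standard library and follows the SPARQL Protocol convention of passing the query in a 'query' parameter and requesting JSON results through the Accept header. The service URL https://dbpedia.org/sparql is an assumption about the endpoint behind the SNORQL page, and the function name build_request is made up. The request is only built here; the part that sends it is left commented out because it needs network access.

```python
import urllib.parse
import urllib.request

# Assumed service URL behind the SNORQL page.
ENDPOINT = "https://dbpedia.org/sparql"

def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )

QUERY = "SELECT DISTINCT ?type WHERE { ?s a ?type } LIMIT 10"
request = build_request(QUERY)
print(request.full_url)

# To actually run the query (needs network access):
# import json
# with urllib.request.urlopen(request, timeout=30) as response:
#     for row in json.load(response)["results"]["bindings"]:
#         print(row["type"]["value"])
```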
a) Exploration:
i. Using SPARQL to find Concepts
The SPARQL web forms listed in SPARQL Online, such as the DBpedia SNORQL query explorer, the BBC Backstage
SPARQL Editor or GeoSparql, can be used to try out the following queries. Try the same at
http://librdf.org/query (don't use the FROM; instead, put the URI in the 'RDF content URIs' slot).
ii. Listing DISTINCT RDF Types
The following query lists the distinct rdf:types used in a dataset:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?type
WHERE { ?s rdf:type ?type }
iii. Listing OWL Classes
List distinct OWL classes:
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?class
WHERE { ?class a owl:Class . }
iv. Listing Top Level OWL Classes
To view top-level OWL classes, i.e. subclasses of owl:Thing, run the following query:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT * WHERE {?s rdfs:subClassOf owl:Thing }
v. Listing Root and Derived Concepts
Based on an example given by "Brandon Ibach" July 12, 2009 on pellet-users@lists.owldl.com
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?class
WHERE { ?class rdfs:subClassOf owl:Thing .
FILTER ( ?class != owl:Thing && ?class != owl:Nothing ) .
OPTIONAL { ?class rdfs:subClassOf ?super .
FILTER ( ?super != owl:Thing && ?super != ?class ) } .
FILTER ( !bound(?super) ) }
The query above should list Root-Concepts. To get Derived-Concepts, remove the "!" from the last line.
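The logic of this query (an OPTIONAL followed by a test on whether ?super is bound) can be traced offline with a small Python sketch. The class names and subclass pairs below are made up, and the function name split_concepts is hypothetical. A class counts as a root concept if it is a subclass of owl:Thing and has no other superclass apart from owl:Thing and itself.

```python
# Made-up (class, superclass) pairs standing in for rdfs:subClassOf triples.
SUBCLASS_OF = [
    ("Animal", "owl:Thing"),
    ("Dog", "owl:Thing"),
    ("Dog", "Animal"),
    ("Plant", "owl:Thing"),
    ("Plant", "Plant"),  # reflexive statement, ignored by the inner FILTER
]

def split_concepts(pairs):
    """Return (roots, derived) following the structure of the query above."""
    # ?class rdfs:subClassOf owl:Thing, excluding owl:Thing and owl:Nothing
    candidates = {c for c, sup in pairs if sup == "owl:Thing"}
    candidates -= {"owl:Thing", "owl:Nothing"}
    # OPTIONAL part: some ?super other than owl:Thing and the class itself
    has_other_super = {c for c, sup in pairs if sup not in ("owl:Thing", c)}
    roots = candidates - has_other_super    # FILTER ( !bound(?super) )
    derived = candidates & has_other_super  # FILTER ( bound(?super) )
    return roots, derived

roots, derived = split_concepts(SUBCLASS_OF)
print("roots:", sorted(roots))
print("derived:", sorted(derived))
```

With this sample data Animal and Plant are root concepts, and Dog is a derived concept because it also has Animal as a superclass.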