Semantic Web - Spring 2006
Computer Engineering Department
Sharif University of Technology
• XQuery
– Querying on XML Data
• RDQL
– Querying on RDF Data
• SPARQL
– Another RDF query language (under development)
David Maier, W3C XML Query Requirements:
• Closedness: output must be XML
• Composability: wherever a set of XML elements is required, a subquery is allowed as well
• Can benefit from a schema, but should also be applicable without one
• Retains the order of nodes
• Formal semantics
• In most query languages, there are two aspects to a query:
– Retrieving data (e.g., from … where … in SQL)
– Creating output (e.g., select … in SQL)
• Retrieval consists of
– Pattern matching (e.g., from … )
– Filtering (e.g., where … )
… although these cannot always be clearly distinguished
• A language for querying XML documents
• Data Model identical with the XPath data model
– documents are ordered, labeled trees
– nodes have identity
– nodes can have simple or complex types
(defined in XML Schema)
• XQuery can be used without schemas, but queries can be checked against DTDs and XML Schemas
• XQuery is a functional language
– no statements
– evaluation of expressions
<titles>
  {for $r in doc("recipes.xml")//recipe
   return $r/title}
</titles>
returns
<titles>
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>
<title>Ricotta Pie</title>
…
</titles>
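Anatomy of the query:
<titles>
  {for $r in doc("recipes.xml")//recipe
   return $r/title}
</titles>
– <titles> ... </titles>: part to be returned as it is given
– { ... }: part to be evaluated
– doc(String): returns the input document
– for $r in ...: iteration; $var denotes a variable
– doc("recipes.xml")//recipe and $r/title: XPath
– return ...: sequence of results, one for each variable binding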
• The result is a new XML document
• A query consists of parts that are returned as is
• ... and others that are evaluated (everything in {...})
• Calling the function doc(String) returns an input document
• XPath is used to retrieve node sets and values
• Iteration over node sets: for binds a variable to each node of a node set in turn
• Variables can be used in XPath expressions
• return returns a sequence of results, one for each binding of a variable
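The evaluation steps listed above can be imitated in Python. This is only a rough sketch using the standard library's ElementTree, not an XQuery engine; the inline XML is a hypothetical two-recipe stand-in for recipes.xml, and the function name titles_query is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for recipes.xml, reduced to two recipes.
RECIPES_XML = """
<collection>
  <recipe><title>Beef Parmesan with Garlic Angel Hair Pasta</title></recipe>
  <recipe><title>Ricotta Pie</title></recipe>
</collection>
"""

def titles_query(xml_text):
    doc = ET.fromstring(xml_text)         # plays the role of doc("recipes.xml")
    result = ET.Element("titles")         # the literal part, returned as is
    for r in doc.iter("recipe"):          # for $r in doc(...)//recipe
        for title in r.findall("title"):  # return $r/title
            result.append(title)          # one result per binding of $r
    return result

print(ET.tostring(titles_query(RECIPES_XML), encoding="unicode"))
```

The outer loop corresponds to the for clause, and the element built before the loop corresponds to the part of the query outside the braces.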
• doc("recipes.xml")//recipe[1]/title
returns an element:
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>
• doc("recipes.xml")//recipe[position()<=3]/title
returns a list of elements:
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>,
<title>Ricotta Pie</title>,
<title>Linguine Pescadoro</title>
• doc("recipes.xml")//recipe[1]/ingredient[1]/@name
→ attribute name {"beef cube steak"}
(an attribute node, written here as a constructor)
• string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)
→ "beef cube steak"
(a value of type string)
• <first-ingredient>
{string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)}
</first-ingredient>
→ <first-ingredient>beef cube steak</first-ingredient>
(an element with string content)
• <first-ingredient>
{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}
</first-ingredient>
→ <first-ingredient name="beef cube steak"/>
(an element with an attribute)
• <first-ingredient oldName="{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}">
Beef
</first-ingredient>
→ <first-ingredient oldName="beef cube steak">
Beef
</first-ingredient>
Inside an attribute value, the attribute node is cast to a string.
Syntax: for $var in xpath-expr
Example:
for $r in doc("recipes.xml")//recipe
return string($r)
• The expression creates a list of bindings for the variable $var
• If $var occurs in an expression exp, then exp is evaluated for each binding
• For-clauses can be nested:
for $r in doc("recipes.xml")//recipe
for $v in doc("vegetables.xml")//vegetable
return ...
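Nested for-clauses evaluate the return expression once for every combination of bindings. A Python comprehension with two for parts behaves the same way; the two lists below are hypothetical bindings, not data from recipes.xml or vegetables.xml.

```python
# Hypothetical bindings for $r and $v.
recipes = ["Ricotta Pie", "Linguine Pescadoro"]
vegetables = ["onion", "garlic"]

# for $r in ... for $v in ... return ($r, $v)
# The inner variable varies fastest, as with nested for-clauses.
pairs = [(r, v) for r in recipes for v in vegetables]

for pair in pairs:
    print(pair)
```

With two bindings for each variable, the body is evaluated four times.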
<my-recipes>
  {for $r in doc("recipes.xml")//recipe
   return
     <my-recipe title="{$r/title}">
       {for $i in $r//ingredient
        return
          <my-ingredient>
            {string($i/@name)}
          </my-ingredient>
       }
     </my-recipe>
  }
</my-recipes>
Returns my-recipes with titles as attributes and my-ingredients with names as text content
Syntax: let $var := xpath-expr
• binds variable $var to a list of nodes, with the nodes in document order
• does not iterate over the list
• allows one to keep intermediate results for reuse
(not possible in SQL)
Example:
let $ooreps := doc("recipes.xml")//recipe[.//ingredient/@name="olive oil"]
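To illustrate the difference from for, the sketch below binds the whole filtered list to one name and then reuses it twice. It relies on Python's ElementTree rather than XQuery, and the three-recipe document (titles A, B, C) is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of recipes.xml; recipe C uses olive oil
# only inside a compound ingredient.
XML = """<collection>
  <recipe><title>A</title><ingredient name="olive oil"/></recipe>
  <recipe><title>B</title><ingredient name="butter"/></recipe>
  <recipe><title>C</title>
    <ingredient name="dressing"><ingredient name="olive oil"/></ingredient>
  </recipe>
</collection>"""
doc = ET.fromstring(XML)

# let $ooreps := doc(...)//recipe[.//ingredient/@name="olive oil"]
# The variable holds the complete list; nothing is iterated yet.
ooreps = [r for r in doc.iter("recipe")
          if any(i.get("name") == "olive oil" for i in r.iter("ingredient"))]

# The intermediate result is reused without recomputing the filter.
count = len(ooreps)
titles = [r.findtext("title") for r in ooreps]
print(count, titles)
```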
Calories of recipes with olive oil:
<calory-content>
  {let $ooreps := doc("recipes.xml")//recipe[.//ingredient/@name="olive oil"]
   for $r in $ooreps
   return
     <calories>
       {$r/title/text()}
       {": "}
       {string($r/nutrition/@calories)}
     </calories>
  }
</calory-content>
Note the implicit string concatenation
The query returns:
<calory-content>
<calories>Beef Parmesan: 1167</calories>
<calories>Linguine Pescadoro: 532</calories>
</calory-content>
Syntax: where <condition>
• occurs before the return clause
• similar to predicates in XPath
• comparisons on nodes:
– "is" for node identity ("=" compares values)
– "<<" and ">>" for document order
• Example:
for $r in doc("recipes.xml")//recipe
where $r//ingredient/@name="olive oil"
return ...
• Syntax: some | every $var in <node-set> satisfies <expr>
• $var is bound to all nodes in <node-set>
• Test succeeds if <expr> is true for some/every binding
• Note: if <node-set> is empty, then “some” is false and “every” is true
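The behaviour on empty node sets matches Python's built-in any and all, which makes for a quick way to check one's intuition. The helper names some and every are chosen here to mirror the XQuery keywords; they are not part of any library.

```python
def some(items, pred):
    """some $x in items satisfies pred($x)"""
    return any(pred(x) for x in items)

def every(items, pred):
    """every $x in items satisfies pred($x)"""
    return all(pred(x) for x in items)

def is_positive(n):
    return n > 0

print(some([], is_positive))          # no binding can satisfy the test
print(every([], is_positive))         # no binding can violate the test
print(some([-1, 2], is_positive), every([-1, 2], is_positive))
```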
• Recipes that have some compound ingredient:
for $r in doc("recipes.xml")//recipe
where some $i in $r/ingredient satisfies $i/ingredient
return $r/title
• Recipes where every ingredient is non-compound:
for $r in doc("recipes.xml")//recipe
where every $i in $r/ingredient satisfies not($i/ingredient)
return $r/title
“To every recipe, add the attribute calories!”
<result>
  {let $rs := doc("recipes.xml")//recipe
   for $r in $rs
   return
     <recipe>
       {$r/nutrition/@calories}
       {$r/title}
     </recipe>
  }
</result>
Here {$r/nutrition/@calories} yields an attribute and {$r/title} yields an element.
The query result:
<result>
<recipe calories="1167">
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>
</recipe>
<recipe calories="349">
<title>Ricotta Pie</title>
</recipe>
<recipe calories="532">
<title>Linguine Pescadoro</title>
</recipe>
</result>
The function distinct-values(Node Set)
– extracts the values of a sequence of nodes
– creates a duplicate-free sequence of values
Note the coercion: nodes are cast as values!
Example:
let $rs := doc("recipes.xml")//recipe
return distinct-values($rs//ingredient/@name)
yields
"beef cube steak", "onion, sliced into thin rings",
...
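The effect of distinct-values can be approximated in Python as follows. The list of names is hypothetical, and the sketch keeps the first occurrence of each value, whereas XQuery leaves the order of the returned values up to the implementation.

```python
# Hypothetical attribute values, already extracted from their nodes.
names = ["olive oil", "garlic", "olive oil", "salt", "garlic"]

# dict.fromkeys keeps one entry per value, in order of first occurrence.
distinct = list(dict.fromkeys(names))
print(distinct)
```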
Syntax: order by expr [ ascending | descending ]
for $iname in doc("recipes.xml")//@name
order by $iname descending
return string($iname)
yields
"whole peppercorns",
"whole baby clams",
"white sugar",
...
The interpreter must be told whether the values should be regarded as numbers or as strings
(alphanumerical sorting is the default):
for $r in $rs
order by number($r/nutrition/@calories)
return $r/title
Note:
– The query returns titles ...
– but the ordering is according to calories, which do not appear in the output
This is not possible in strict SQL-92, where sort keys must appear in the select list.
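The difference between the two orderings shows up with the three calorie values from the earlier query result. The Python sketch below sorts the same titles once with the calories treated as strings and once as numbers; the dictionary is just a convenient container for the example.

```python
# Calorie values as they appear in the attributes, i.e. as strings.
calories = {
    "Beef Parmesan with Garlic Angel Hair Pasta": "1167",
    "Ricotta Pie": "349",
    "Linguine Pescadoro": "532",
}

# order by $r/nutrition/@calories (compared as strings)
by_string = sorted(calories, key=lambda title: calories[title])

# order by number($r/nutrition/@calories)
by_number = sorted(calories, key=lambda title: float(calories[title]))

print(by_string)
print(by_number)
```

As strings, "1167" sorts before "349" because "1" precedes "3"; as numbers, 1167 is the largest value. In both cases only the titles are returned, not the sort key.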
Aggregation functions: count, sum, avg, min, max
Example: the number of simple ingredients per recipe
for $r in doc("recipes.xml")//recipe
return
  <number>
    {attribute {"title"} {$r/title/text()}}
    {count($r//ingredient[not(ingredient)])}
  </number>
The query result:
<number title="Beef Parmesan with Garlic Angel Hair Pasta">11</number>,
<number title="Ricotta Pie">12</number>,
<number title="Linguine Pescadoro">15</number>,
<number title="Zuppa Inglese">8</number>,
<number title="Cailles en Sarcophages">30</number>
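The counting expression count($r//ingredient[not(ingredient)]) selects ingredients at any depth that have no ingredient child. A rough Python equivalent, assuming a made-up recipe with one compound ingredient, could look like this (count_simple is a hypothetical helper name):

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe: "dough" is compound, the other three are simple.
RECIPE = ET.fromstring("""<recipe><title>Demo</title>
  <ingredient name="dough">
    <ingredient name="flour"/>
    <ingredient name="water"/>
  </ingredient>
  <ingredient name="salt"/>
</recipe>""")

def count_simple(recipe):
    # $r//ingredient  -> recipe.iter("ingredient")
    # [not(ingredient)] -> the element has no direct ingredient child
    return sum(1 for i in recipe.iter("ingredient")
               if i.find("ingredient") is None)

print(count_simple(RECIPE))
```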
“The recipe with the maximal number of calories!”
let $rs := doc("recipes.xml")//recipe
let $maxCal := max($rs//@calories)
for $r in $rs
where $r//@calories = $maxCal
return string($r/title)
returns
"Cailles en Sarcophages"
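The two-step pattern of the maximal-calories query (compute the maximum with let, then filter with where) can be sketched in Python. The list below contains only the three recipes whose calorie values appear in the earlier results, so the winner here differs from the full document.

```python
# Subset of the recipes, with the calorie values shown earlier.
recipes = [
    ("Beef Parmesan with Garlic Angel Hair Pasta", 1167),
    ("Ricotta Pie", 349),
    ("Linguine Pescadoro", 532),
]

# let $maxCal := max($rs//@calories)
max_cal = max(cal for _, cal in recipes)

# for $r in $rs where $r//@calories = $maxCal return string($r/title)
# Every recipe that ties for the maximum is returned, not just one.
richest = [title for title, cal in recipes if cal == max_cal]
print(richest)
```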
• Galax is an open-source implementation of XQuery ( http://www.galaxquery.org/ )
– The main developers have taken part in the definition of XQuery
Querying on RDF data
• RDF Data Query Language
• JDBC/ODBC friendly
• Simple:
SELECT some information
FROM somewhere
WHERE this matches
AND these constraints
USING these vocabularies
• q1 contains a query:
SELECT ?x
WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, "John Smith")
• To execute q1 against a model m1.rdf:
java jena.rdfquery --data m1.rdf --query q1
• The outcome is:
x
=============================
<http://somewhere/JohnSmith/>
• Return all the resources that have property FN and the associated values:
SELECT ?x, ?fname
WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, ?fname)
• The outcome is:
x | fname
================================================
<http://somewhere/JohnSmith/> | "John Smith"
<http://somewhere/SarahJones/> | "Sarah Jones"
<http://somewhere/MattJones/> | "Matt Jones"
• Return the given names of all resources whose family name is Jones:
SELECT ?givenName
WHERE (?y, <http://www.w3.org/2001/vcard-rdf/3.0#Family>, "Jones"),
(?y, <http://www.w3.org/2001/vcard-rdf/3.0#Given>, ?givenName)
• The outcome is:
givenName
=========
"Matthew"
"Sarah"
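The Jones query joins two triple patterns through the shared variable ?y. The following Python sketch shows one naive way such matching can work; it is not how Jena evaluates RDQL, the node identifiers (_:n1 and so on) and the data are assumptions, and the helpers match and query are hypothetical.

```python
VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"

# Hypothetical model: three people with family and given names.
TRIPLES = [
    ("_:n1", VCARD + "Family", "Jones"), ("_:n1", VCARD + "Given", "Matthew"),
    ("_:n2", VCARD + "Family", "Jones"), ("_:n2", VCARD + "Given", "Sarah"),
    ("_:n3", VCARD + "Family", "Smith"), ("_:n3", VCARD + "Given", "John"),
]

def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):             # variable: bind it or check it
            if new.setdefault(p, t) != t:
                return None
        elif p != t:                      # constant: must be equal
            return None
    return new

def query(patterns, triples):
    """Return all bindings that satisfy every pattern (a simple join)."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                b2 = match(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return bindings

results = query([("?y", VCARD + "Family", "Jones"),
                 ("?y", VCARD + "Given", "?givenName")], TRIPLES)
given_names = [b["?givenName"] for b in results]
print(given_names)
```

Because ?y is already bound after the first pattern, the second pattern only matches triples about the same resource.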
• RDQL has a syntactic convenience that allows prefix strings to be defined in the USING clause :
SELECT ?x
WHERE (?x, vCard:FN, "John Smith")
USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>
SELECT ?givenName
WHERE (?y, vCard:Family, "Smith"),
(?y, vCard:Given, ?givenName)
USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>
• RDQL can filter the variable bindings with constraints in the AND clause:
SELECT ?resource
WHERE (?resource, info:age, ?age)
AND ?age >= 24
USING info FOR <http://somewhere/peopleInfo#>
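Conceptually, the WHERE clause produces candidate bindings and the AND clause discards those that fail the constraint. A minimal Python illustration, with invented ages for the three resources seen earlier:

```python
# Hypothetical bindings produced by (?resource, info:age, ?age).
bindings = [
    {"?resource": "http://somewhere/JohnSmith/", "?age": 25},
    {"?resource": "http://somewhere/SarahJones/", "?age": 22},
    {"?resource": "http://somewhere/MattJones/", "?age": 24},
]

# AND ?age >= 24, followed by SELECT ?resource
selected = [b["?resource"] for b in bindings if b["?age"] >= 24]
print(selected)
```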
SELECT
?title ?description ?orbit ?satellite ?sensor ?date
FROM
<http://earth.esa.int/showcase/ers/dublin.rdf>
WHERE
(?item <dc:title> ?title)
(?item <dc:description> ?description)
(?item <isc:orbit> ?orbit)
(?item <isc:satellite> ?satellite)
(?item <isc:sensor> ?sensor)
(?item <dc:date> ?date)
USING
isc FOR <http://earth.esa.int/standards/showcase/>
dc FOR <http://purl.org/dc/elements/1.1/>
rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
rdfs FOR <http://www.w3.org/2000/01/rdf-schema#>
• Jena
– http://jena.sourceforge.net/
• Sesame
– http://sesame.aidministrator.nl/
• RDFStore
– http://rdfstore.sourceforge.net/
• Does not take into account the semantics of RDFS
• For example:
ex:human rdfs:subClassOf ex:animal
ex:student rdfs:subClassOf ex:human
ex:john rdf:type ex:student
Query: “To which class does the resource John belong?”
Expected answer: ex:student, ex:human, ex:animal
However, the query:
SELECT ?x
WHERE (<http://example.org/#john>, rdf:type, ?x)
USING rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
Yields only:
<http://example.org/#student>
• Solution: Inference Engines
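What an inference engine adds for this example is essentially the transitive closure over rdfs:subClassOf. The Python sketch below hard-codes the three triples from the slide in two dictionaries and derives the expected answer; it covers only this one RDFS rule, and inferred_types is a hypothetical helper.

```python
# The three statements from the example, split by predicate.
SUBCLASS_OF = {"ex:student": ["ex:human"], "ex:human": ["ex:animal"]}
TYPES = {"ex:john": ["ex:student"]}

def inferred_types(resource):
    """Asserted classes plus all superclasses reachable via subClassOf."""
    result = []
    todo = list(TYPES.get(resource, []))
    while todo:
        cls = todo.pop(0)
        if cls not in result:             # also guards against cycles
            result.append(cls)
            todo.extend(SUBCLASS_OF.get(cls, []))
    return result

print(TYPES["ex:john"])           # what the plain query sees
print(inferred_types("ex:john"))  # what an inference engine would add
```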
• An RDF query language currently under development by the W3C
• Builds on previous RDF query languages such as rdfDB, RDQL, and SeRQL.
• Simple Query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?url
FROM <bloggers.rdf>
WHERE {
?contributor foaf:name "Jon Foobar" .
?contributor foaf:weblog ?url .
}
• Optional block:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?depiction
WHERE { ?person foaf:name ?name .
OPTIONAL { ?person foaf:depiction ?depiction . }
}
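An OPTIONAL block behaves much like a left outer join: a solution is kept even when the optional pattern finds nothing, and the variable simply stays unbound. A small Python sketch with two invented persons, where None stands for an unbound ?depiction:

```python
# Hypothetical data: both persons have a name, only one has a depiction.
names = {"_:a": "Alice", "_:b": "Bob"}
depictions = {"_:a": "http://example.org/alice.png"}

# ?person foaf:name ?name . OPTIONAL { ?person foaf:depiction ?depiction }
rows = [(name, depictions.get(person)) for person, name in names.items()]

for row in rows:
    print(row)
```

Without OPTIONAL, the second person would drop out of the result entirely.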
• Alternative matches:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?name ?mbox
WHERE {
?person foaf:name ?name .
{
{ ?person foaf:mbox ?mbox } UNION
{ ?person foaf:mbox_sha1sum ?mbox }
}
}
• SPARQL has many other features that are out of scope for this class.
See the references for more information.
• http://www.w3.org/TR/xquery/
• A Programmer's Introduction to RDQL
– http://jena.sourceforge.net/tutorial/RDQL/
• http://rdfstore.sourceforge.net/
• http://jena.sourceforge.net
• http://sesame.aidministrator.nl/
• http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/
• http://www-128.ibm.com/developerworks/java/library/j-sparql/