07.Query on Semantic Web.ppt


Querying on the Web:

XQuery, RDQL, SPARQL

Semantic Web - Spring 2006

Computer Engineering Department

Sharif University of Technology

Outline

• XQuery

– Querying on XML Data

• RDQL

– Querying on RDF Data

• SPARQL

– Another RDF query language (under development)


Requirements for an XML Query Language

David Maier, W3C XML Query Requirements:

• Closedness : output must be XML

• Composability : wherever a set of XML elements is required, a subquery is allowed as well

• Can benefit from a schema, but should also be applicable without

• Retains the order of nodes

• Formal semantics


How Does One Design a Query Language?

• In most query languages, there are two aspects to a query:

– Retrieving data (e.g., from … where … in SQL)

– Creating output (e.g., select … in SQL)

• Retrieval consists of

– Pattern matching (e.g., from … )

– Filtering (e.g., where … )

… although these cannot always be clearly distinguished


XQuery Principles

• A language for querying XML documents

• Data Model identical with the XPath data model

– documents are ordered, labeled trees

– nodes have identity

– nodes can have simple or complex types (defined in XML Schema)

• XQuery can be used without schemas, but can be checked against DTDs and XML schemas

• XQuery is a functional language

– no statements

– evaluation of expressions


Sample data


A Query over the Recipes Document

<titles>
{for $r in doc("recipes.xml")//recipe return $r/title}
</titles>

returns

<titles>

<title>Beef Parmesan with Garlic Angel Hair Pasta</title>

<title>Ricotta Pie</title>

</titles>
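For readers more at home in a general-purpose language, the for/return query above can be sketched in Python with the standard xml.etree.ElementTree module. This is only an emulation, not XQuery: RECIPES_XML is a hypothetical stand-in for recipes.xml (the sample data is not reproduced here), reduced to the two titles shown in the result, and the function name titles is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for recipes.xml (assumption: only titles matter here).
RECIPES_XML = """
<collection>
  <recipe><title>Beef Parmesan with Garlic Angel Hair Pasta</title></recipe>
  <recipe><title>Ricotta Pie</title></recipe>
</collection>
"""

def titles(xml_text):
    """Emulate <titles>{for $r in doc(...)//recipe return $r/title}</titles>."""
    doc = ET.fromstring(xml_text)
    result = ET.Element("titles")        # the part that is returned as is
    for r in doc.iter("recipe"):         # for $r in //recipe
        for t in r.findall("title"):     # $r/title
            result.append(t)             # one result per variable binding
    return result

print(ET.tostring(titles(RECIPES_XML), encoding="unicode"))
```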


Query Features

<titles>
{for $r in doc("recipes.xml")//recipe return $r/title}
</titles>

• <titles> … </titles>: part to be returned as it is given

• { … }: part to be evaluated

• doc(String): returns the input document

• for … in …: iteration

• $r: a variable ($var)

• //recipe and $r/title: XPath

• return: sequence of results, one for each variable binding


Features: Summary

• The result is a new XML document

• A query consists of parts that are returned as is

• ... and others that are evaluated (everything in {...} )

• Calling the function doc(String) returns an input document

• XPath is used to retrieve node sets and values

• Iteration over node sets: for binds a variable to each node in a node set in turn

• Variables can be used in XPath expressions

• return returns a sequence of results , one for each binding of a variable


XPath is a Fragment of XQuery

• doc("recipes.xml")//recipe[1]/title

returns an element:

<title>Beef Parmesan with Garlic Angel Hair Pasta</title>

• doc("recipes.xml")//recipe[position()<=3]/title

returns a list of elements:

<title>Beef Parmesan with Garlic Angel Hair Pasta</title>,
<title>Ricotta Pie</title>,
<title>Linguine Pescadoro</title>


Beware: XPath Attributes

• doc("recipes.xml")//recipe[1]/ingredient[1]/@name

→ attribute name {"beef cube steak"} (a constructor for an attribute node)

• string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)

→ "beef cube steak" (a value of type string)


XPath Attributes (cntd.)

• <first-ingredient>
{string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)}
</first-ingredient>

→ <first-ingredient>beef cube steak</first-ingredient> (an element with string content)


XPath Attributes (cntd.)

• <first-ingredient>
{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}
</first-ingredient>

→ <first-ingredient name="beef cube steak"/> (an element with an attribute)


XPath Attributes (cntd.)

• <first-ingredient oldName="{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}">
Beef
</first-ingredient>

→ <first-ingredient oldName="beef cube steak">
Beef
</first-ingredient>

Inside an attribute value, the attribute node is cast as a string


Iteration with the For-Clause

Syntax: for $var in xpath-expr

Example: for $r in doc("recipes.xml")//recipe return string($r)

• The expression creates a list of bindings for a variable $var

If $var occurs in an expression exp, then exp is evaluated for each binding

• For-clauses can be nested:

for $r in doc("recipes.xml")//recipe
for $v in doc("vegetables.xml")//vegetable
return ...


Nested For-clauses: Example

<my-recipes>
{for $r in doc("recipes.xml")//recipe return
  <my-recipe title="{$r/title}">
  {for $i in $r//ingredient return
    <my-ingredient>
    {string($i/@name)}
    </my-ingredient>
  }
  </my-recipe>
}
</my-recipes>

Returns my-recipes with titles as attributes and my-ingredients with names as text content
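The nested iteration can be pictured as two nested loops. The following is a minimal Python sketch using xml.etree.ElementTree, not a definitive translation: the miniature RECIPES_XML document and its ingredient names are made up for illustration, and my_recipes is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of recipes.xml; ingredient names are illustrative only.
RECIPES_XML = """
<collection>
  <recipe>
    <title>Ricotta Pie</title>
    <ingredient name="filling">
      <ingredient name="ricotta cheese"/>
      <ingredient name="eggs"/>
    </ingredient>
  </recipe>
</collection>
"""

def my_recipes(xml_text):
    doc = ET.fromstring(xml_text)
    out = ET.Element("my-recipes")
    for r in doc.iter("recipe"):                  # outer for-clause
        # the title becomes an attribute of my-recipe
        mr = ET.SubElement(out, "my-recipe", title=r.findtext("title"))
        for i in r.iter("ingredient"):            # inner for-clause: $r//ingredient
            # the ingredient name becomes text content
            ET.SubElement(mr, "my-ingredient").text = i.get("name")
    return out

print(ET.tostring(my_recipes(RECIPES_XML), encoding="unicode"))
```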


The Let Clause

Syntax: let $var := xpath-expr

• binds variable $var to a list of nodes, with the nodes in document order

• does not iterate over the list

• allows one to keep intermediate results for reuse (not possible in SQL)

Example:

let $ooreps := doc("recipes.xml")//recipe[.//ingredient/@name="olive oil"]


Let Clause: Example

<calory-content>
{let $ooreps := doc("recipes.xml")//recipe[.//ingredient/@name="olive oil"]
 for $r in $ooreps return
  <calories>
  {$r/title/text()}
  {": "}
  {string($r/nutrition/@calories)}
  </calories>}
</calory-content>

Returns the calories of recipes with olive oil. Note the implicit string concatenation


Let Clause: Example (cntd.)

The query returns:

<calory-content>

<calories>Beef Parmesan: 1167</calories>

<calories>Linguine Pescadoro: 532</calories>

</calory-content>
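The difference between let (bind once) and for (iterate) may be easier to see in a Python sketch. Everything below is an assumption made for illustration: the miniature document, its titles and calorie values, and the function name calorie_content.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of recipes.xml; titles, names and calories are invented.
RECIPES_XML = """
<collection>
  <recipe>
    <title>Recipe A</title>
    <ingredient name="olive oil"/>
    <nutrition calories="500"/>
  </recipe>
  <recipe>
    <title>Recipe B</title>
    <ingredient name="butter"/>
    <nutrition calories="300"/>
  </recipe>
</collection>
"""

def calorie_content(xml_text):
    doc = ET.fromstring(xml_text)
    # let $ooreps := ... : the whole list is bound once, without iterating
    ooreps = [r for r in doc.iter("recipe")
              if any(i.get("name") == "olive oil" for i in r.iter("ingredient"))]
    # for $r in $ooreps return ... : one result string per recipe
    return [f"{r.findtext('title')}: {r.find('nutrition').get('calories')}"
            for r in ooreps]

print(calorie_content(RECIPES_XML))
```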


The Where Clause

Syntax: where <condition>

• occurs before return clause

• similar to predicates in XPath

• comparisons on nodes:

– "=" for equality of node values ("is" tests node identity)

– "<<" and ">>" for document order

• Example:

for $r in doc("recipes.xml")//recipe
where $r//ingredient/@name="olive oil"
return ...


Quantifiers

• Syntax: some | every $var in <node-set> satisfies <expr>

• $var is bound to all nodes in <node-set>

• Test succeeds if <expr> is true for some/every binding

• Note: if <node-set> is empty, then “some” is false and “every” is true
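The behaviour on empty node sets matches Python's built-in any and all, which gives a quick way to check one's intuition. The helper names some and every are assumptions chosen to mirror the XQuery keywords.

```python
def some(nodes, pred):
    """some $var in nodes satisfies pred($var)"""
    return any(pred(n) for n in nodes)

def every(nodes, pred):
    """every $var in nodes satisfies pred($var)"""
    return all(pred(n) for n in nodes)

# On an empty sequence, "some" fails and "every" succeeds, whatever the test is.
print(some([], lambda n: True))    # False
print(every([], lambda n: False))  # True
```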


Quantifiers (Example)

• Recipes that have some compound ingredient:

for $r in doc("recipes.xml")//recipe
where some $i in $r/ingredient satisfies $i/ingredient
return $r/title

• Recipes where every ingredient is non-compound:

for $r in doc("recipes.xml")//recipe
where every $i in $r/ingredient satisfies not($i/ingredient)
return $r/title


Element Fusion

“To every recipe, add the attribute calories!”

<result>
{let $rs := doc("recipes.xml")//recipe
 for $r in $rs return
  <recipe>
  {$r/nutrition/@calories}
  {$r/title}
  </recipe>}
</result>

Here {$r/nutrition/@calories} yields an attribute and {$r/title} yields an element


Element Fusion (cntd.)

The query result:

<result>

<recipe calories="1167">

<title>Beef Parmesan with Garlic Angel Hair Pasta</title>

</recipe>

<recipe calories="349">

<title>Ricotta Pie</title>

</recipe>

<recipe calories="532">

<title>Linguine Pescadoro</title>

</recipe>

</result>


Eliminating Duplicates

The function distinct-values(Node Set)

– extracts the values of a sequence of nodes

– creates a duplicate-free sequence of values

Note the coercion: nodes are cast as values!

Example:

let $rs := doc("recipes.xml")//recipe
return distinct-values($rs//ingredient/@name)

yields

"beef cube steak", "onion, sliced into thin rings", ...
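A rough Python analogue of distinct-values is shown below, assuming the values have already been extracted as strings. Keeping the first occurrence of each value is a choice made for this sketch; XQuery itself leaves the order of the result implementation-dependent. The function name distinct_values and the sample list are assumptions.

```python
def distinct_values(values):
    """Return the values without duplicates, keeping first occurrences in order."""
    # dict keys are unique and remember insertion order
    return list(dict.fromkeys(values))

names = ["olive oil", "salt", "olive oil", "pepper", "salt"]
print(distinct_values(names))
```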


The Order By Clause

Syntax: order by expr [ ascending | descending ]

for $iname in doc("recipes.xml")//@name
order by $iname descending
return string($iname)

yields

"whole peppercorns",

"whole baby clams",

"white sugar",

...


The Order By Clause (cntd.)

The interpreter must be told whether the values should be regarded as numbers or as strings (alphanumerical sorting is default)

for $r in $rs
order by number($r/nutrition/@calories)
return $r/title

Note:

– The query returns titles ...

– but the ordering is according to calories , which do not appear in the output

Not possible in SQL!
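Why the number(...) call matters can be seen with the three calorie values that appear in the Element Fusion result. The Python sketch below only mimics the two orderings; the variable names are assumptions.

```python
# (title, calories) pairs taken from the Element Fusion result; calories kept
# as strings, which is how attribute values arrive.
recipes = [
    ("Beef Parmesan with Garlic Angel Hair Pasta", "1167"),
    ("Ricotta Pie", "349"),
    ("Linguine Pescadoro", "532"),
]

calories = [c for _, c in recipes]
as_strings = sorted(calories)             # compared character by character
as_numbers = sorted(calories, key=float)  # compared as numbers

# Order by calories but return only the titles.
titles_by_calories = [t for t, c in sorted(recipes, key=lambda rc: float(rc[1]))]

print(as_strings)   # ['1167', '349', '532']
print(as_numbers)   # ['349', '532', '1167']
print(titles_by_calories)
```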


Grouping and Aggregation

Aggregation functions count, sum, avg, min, max

Example: The number of simple ingredients per recipe

for $r in doc("recipes.xml")//recipe return

<number>

{attribute {"title"} {$r/title/text()}}

{count($r//ingredient[not(ingredient)])}

</number>


Grouping and Aggregation (cntd.)

The query result:

<number title="Beef Parmesan with Garlic Angel Hair Pasta">11</number>,

<number title="Ricotta Pie">12</number>,

<number title="Linguine Pescadoro">15</number>,

<number title="Zuppa Inglese">8</number>,

<number title="Cailles en Sarcophages">30</number>
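The predicate ingredient[not(ingredient)] selects ingredients without nested ingredients. A minimal Python sketch of the same count follows, run on a hypothetical single-recipe document; the data and the name count_simple_ingredients are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe with one compound ingredient ("sauce") and three simple ones.
RECIPE_XML = """
<recipe>
  <title>Recipe A</title>
  <ingredient name="sauce">
    <ingredient name="tomato"/>
    <ingredient name="basil"/>
  </ingredient>
  <ingredient name="pasta"/>
</recipe>
"""

def count_simple_ingredients(recipe):
    """count($r//ingredient[not(ingredient)])"""
    # an ingredient is simple if it has no ingredient child
    return sum(1 for i in recipe.iter("ingredient") if i.find("ingredient") is None)

print(count_simple_ingredients(ET.fromstring(RECIPE_XML)))  # 3
```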


Nested Aggregation

“The recipe with the maximal number of calories!”

let $rs := doc("recipes.xml")//recipe
let $maxCal := max($rs//@calories)
for $r in $rs
where $r//@calories = $maxCal
return string($r/title)

returns

"Cailles en Sarcophages"
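The same two-step pattern (compute an aggregate, then select the items that reach it) can be sketched in Python. The titles and calorie values below are invented, and richest is a hypothetical name. Note that the query returns every recipe that reaches the maximum, not just one.

```python
# Hypothetical (title, calories) pairs, with a tie for the maximum.
RECIPES = [("Recipe A", 500), ("Recipe B", 800), ("Recipe C", 800)]

def richest(recipes):
    max_cal = max(c for _, c in recipes)             # let $maxCal := max(...)
    return [t for t, c in recipes if c == max_cal]   # where ... = $maxCal

print(richest(RECIPES))  # ['Recipe B', 'Recipe C']
```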

Running Queries with Galax

• Galax is an open-source implementation of XQuery ( http://www.galaxquery.org/ )

– The main developers have taken part in the definition of XQuery


RDQL

Querying on RDF data

Introduction

• RDF Data Query Language

• JDBC/ODBC friendly

• Simple:

SELECT some information

FROM somewhere

WHERE this match

AND these constraints

USING these vocabularies


Example


Example

• q1 contains a query:

SELECT ?x

WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, "John Smith")

• To execute q1 against a model m1.rdf:

java jena.rdfquery --data m1.rdf --query q1

• The outcome is:

x
=============================
<http://somewhere/JohnSmith/>


Example

• Return all the resources that have property FN and the associated values:

SELECT ?x, ?fname

WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, ?fname)

• The outcome is:

x | fname
================================================
<http://somewhere/JohnSmith/> | "John Smith"
<http://somewhere/SarahJones/> | "Sarah Jones"
<http://somewhere/MattJones/> | "Matt Jones"


Example

• Return the given names of all people whose family name is Jones:

SELECT ?givenName

WHERE (?y, <http://www.w3.org/2001/vcard-rdf/3.0#Family>, "Jones"),

(?y, <http://www.w3.org/2001/vcard-rdf/3.0#Given>, ?givenName)

• The outcome is:

givenName
=========
"Matthew"
"Sarah"
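How two triple patterns are joined on the shared variable ?y can be sketched with a tiny matcher in plain Python. This is an illustration of the idea only, not how Jena evaluates RDQL; the subject identifiers n1 to n3, the triples themselves, and the function names match and query are all assumptions.

```python
VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"

# Hypothetical triples; the subject identifiers are made up for illustration.
TRIPLES = [
    ("n1", VCARD + "Family", "Jones"), ("n1", VCARD + "Given", "Matthew"),
    ("n2", VCARD + "Family", "Jones"), ("n2", VCARD + "Given", "Sarah"),
    ("n3", VCARD + "Family", "Smith"), ("n3", VCARD + "Given", "John"),
]

def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):                  # a variable
            if result.setdefault(p, t) != t:   # already bound to something else
                return None
        elif p != t:                           # a constant must match exactly
            return None
    return result

def query(patterns, triples):
    """Evaluate the patterns in order, joining them on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := match(pattern, t, b)) is not None]
    return bindings

rows = query([("?y", VCARD + "Family", "Jones"),
              ("?y", VCARD + "Given", "?givenName")], TRIPLES)
given_names = [r["?givenName"] for r in rows]
print(given_names)  # ['Matthew', 'Sarah']
```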

URI Prefixes : USING

• RDQL has a syntactic convenience that allows prefix strings to be defined in the USING clause :

SELECT ?x

WHERE (?x, vCard:FN, "John Smith")

USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?givenName

WHERE (?y, vCard:Family, "Smith"),

(?y, vCard:Given, ?givenName)

USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>


Filters

• RDQL allows the values of variables to be constrained with filter expressions in the AND clause:

SELECT ?resource

WHERE (?resource, info:age, ?age)

AND ?age >= 24

USING info FOR <http://somewhere/peopleInfo#>


Another Example

SELECT

?title ?description ?orbit ?satellite ?sensor ?date

FROM

<http://earth.esa.int/showcase/ers/dublin.rdf>

WHERE

(?item <dc:title> ?title)

(?item <dc:description> ?description)

(?item <isc:orbit> ?orbit)

(?item <isc:satellite> ?satellite)

(?item <isc:sensor> ?sensor)

(?item <dc:date> ?date)

USING
isc FOR <http://earth.esa.int/standards/showcase/>
dc FOR <http://purl.org/dc/elements/1.1/>
rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
rdfs FOR <http://www.w3.org/2000/01/rdf-schema#>


Implementations

• Jena

– http://jena.sourceforge.net/

• Sesame

– http://sesame.aidministrator.nl/

• RDFStore

– http://rdfstore.sourceforge.net/


Limitation

• Does not take into account semantics of RDFS

• For example:

ex:human rdfs:subClassOf ex:animal
ex:student rdfs:subClassOf ex:human
ex:john rdf:type ex:student

Query: “To which classes does the resource John belong?”

Expected answer: ex:student, ex:human, ex:animal

However, the query:

SELECT ?x

WHERE (<http://example.org/#john>, rdf:type, ?x)

USING rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

Yields only:

<http://example.org/#student>

• Solution: Inference Engines
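What an inference engine adds here is essentially the transitive closure over rdfs:subClassOf. A minimal Python sketch using the three statements from the example follows; the dictionary layout and the name inferred_types are assumptions, and a real RDFS reasoner covers many more rules than this one.

```python
# The three statements from the example, split by predicate.
SUBCLASS_OF = {"ex:student": {"ex:human"}, "ex:human": {"ex:animal"}}
TYPE_OF = {"ex:john": {"ex:student"}}

def inferred_types(resource):
    """All classes of resource, following rdfs:subClassOf transitively."""
    todo = list(TYPE_OF.get(resource, ()))
    seen = set()
    while todo:
        cls = todo.pop()
        if cls not in seen:
            seen.add(cls)
            todo.extend(SUBCLASS_OF.get(cls, ()))   # climb to the superclasses
    return seen

print(sorted(inferred_types("ex:john")))  # ['ex:animal', 'ex:human', 'ex:student']
```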


SPARQL

Introduction

• An RDF query language currently under development by W3C

• Builds on previous RDF query languages such as rdfDB, RDQL, and SeRQL.


Example RDF


Example

• Simple Query:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?url

FROM <bloggers.rdf>

WHERE {

?contributor foaf:name "Jon Foobar" .

?contributor foaf:weblog ?url .

}


Example (cont.)

• Optional block:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?depiction

WHERE { ?person foaf:name ?name .

OPTIONAL { ?person foaf:depiction ?depiction . }

}
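An OPTIONAL block behaves like a left outer join: a person without a depiction still appears in the result, with ?depiction left unbound. The Python sketch below illustrates this on invented data, simplified to at most one depiction per person; the identifiers p1 and p2, the names, the URL, and the function name are assumptions.

```python
# Hypothetical FOAF-like data: every person has a name, only some have a depiction.
NAMES = {"p1": "Alice", "p2": "Bob"}
DEPICTIONS = {"p1": "http://example.org/alice.png"}

def names_with_optional_depiction():
    # Every person with a name yields a row; None stands for an unbound ?depiction.
    return [(name, DEPICTIONS.get(person)) for person, name in NAMES.items()]

rows = names_with_optional_depiction()
print(rows)
```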


Example (cont.)

• Alternative matches:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?name ?mbox

WHERE {

?person foaf:name ?name .

{

{ ?person foaf:mbox ?mbox } UNION

{ ?person foaf:mbox_sha1sum ?mbox }

}

}

• SPARQL has many other features that are out of scope for this class. Refer to the references for more information.


References

• http://www.w3.org/TR/xquery/

• A Programmer's Introduction to RDQL

– http://jena.sourceforge.net/tutorial/RDQL/

• http://rdfstore.sourceforge.net/

• http://jena.sourceforge.net

• http://sesame.aidministrator.nl/

• http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/

• http://www-128.ibm.com/developerworks/java/library/j-sparql/

