CSE 636 Data Integration XML Query Languages XPath

XPath
• http://www.w3.org/TR/xpath (11/99)
• Building block for other W3C standards:
– XSL Transformations (XSLT)
– XML Link (XLink)
– XML Pointer (XPointer)
– XQuery
• Was originally part of XSL
Example for XPath Queries
<bib>
<book>
<publisher> Addison-Wesley </publisher>
<author> Serge Abiteboul </author>
<author>
<first-name> Rick </first-name>
<last-name> Hull </last-name>
</author>
<author> Victor Vianu </author>
<title> Foundations of Databases </title>
<year> 1995 </year>
</book>
<book price="55">
<publisher> Freeman </publisher>
<author> Jeffrey D. Ullman </author>
<title> Principles of Database and Knowledge Base Systems </title>
<year> 1998 </year>
</book>
</bib>
Data Model for XPath
[Figure: tree for the example document. Node kinds shown: Document (the root, /), XML PI, Comment, Element (bib, the root element; book; publisher; author), Text (Addison-Wesley, Serge Abiteboul, …)]
XPath: Simple Expressions
/bib/book/year
Result: <year> 1995 </year>
<year> 1998 </year>
/bib/paper/year
Result: empty
(there were no papers)
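A minimal sketch of these two queries using Python's standard xml.etree.ElementTree. It assumes a trimmed copy of the example document. ElementTree supports only a subset of XPath and evaluates paths relative to an element, so /bib/book/year is written as ./book/year starting from the bib element.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example document (only the elements needed here).
BIB = """
<bib>
  <book><title>Foundations of Databases</title><year>1995</year></book>
  <book price="55"><title>Principles of Database and Knowledge Base Systems</title><year>1998</year></book>
</bib>
"""
bib = ET.fromstring(BIB.strip())  # bib is the document element

# /bib/book/year, written relative to the bib element
years = [y.text for y in bib.findall("./book/year")]

# /bib/paper/year: there are no paper elements, so nothing matches
papers = bib.findall("./paper/year")

print(years)
print(papers)
```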
XPath: Restricted Kleene Closure
//author
Result:<author> Serge Abiteboul </author>
<author>
<first-name> Rick </first-name>
<last-name> Hull </last-name>
</author>
<author> Victor Vianu </author>
<author> Jeffrey D. Ullman </author>
/bib//first-name
Result:<first-name> Rick </first-name>
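The same two queries, sketched with Python's xml.etree.ElementTree on a trimmed copy of the example document. The leading . is an ElementTree requirement (paths are relative to the bib element), not part of the slide's expressions.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example document: authors only.
BIB = """
<bib>
  <book>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
  </book>
  <book price="55">
    <author>Jeffrey D. Ullman</author>
  </book>
</bib>
"""
bib = ET.fromstring(BIB.strip())

# //author: author elements at any depth
authors = bib.findall(".//author")

# /bib//first-name: first-name elements at any depth below bib
first_names = [f.text for f in bib.findall(".//first-name")]

print(len(authors))
print(first_names)
```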
XPath: Functions
/bib/book/author/text()
Result:
Serge Abiteboul
Victor Vianu
Jeffrey D. Ullman
Rick Hull doesn’t appear because his name is inside first-name and last-name child elements, not in text directly under author
Functions in XPath:
– text() = matches text nodes
– node() = matches any node on the axis (on the child axis: elements, text, comments; not attributes)
– name() = returns the name of the current node
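ElementTree has no text() node test, so the sketch below only approximates /bib/book/author/text(). It reads each author's .text (the text before the first child element) and skips whitespace-only text, which matches the reading on this slide. The tag attribute plays the role of name(). The document is a trimmed copy of the example.

```python
import xml.etree.ElementTree as ET

BIB = """
<bib>
  <book>
    <author>Serge Abiteboul</author>
    <author>
      <first-name>Rick</first-name>
      <last-name>Hull</last-name>
    </author>
    <author>Victor Vianu</author>
  </book>
  <book price="55">
    <author>Jeffrey D. Ullman</author>
  </book>
</bib>
"""
bib = ET.fromstring(BIB.strip())

# Approximation of /bib/book/author/text():
# keep authors that carry non-blank text directly under the element.
texts = [a.text.strip()
         for a in bib.findall("./book/author")
         if a.text and a.text.strip()]

# Approximation of name(): the tag of each selected element.
names = {a.tag for a in bib.findall("./book/author")}

print(texts)
print(names)
```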
XPath: Wildcard
//author/*
Result:
<first-name> Rick </first-name>
<last-name> Hull </last-name>
* Matches any element
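A short sketch of the wildcard with xml.etree.ElementTree, again on a trimmed copy of the example document and with the ElementTree-specific leading dot.

```python
import xml.etree.ElementTree as ET

BIB = """
<bib>
  <book>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
  </book>
</bib>
"""
bib = ET.fromstring(BIB.strip())

# //author/*: every element child of every author
children = [(c.tag, c.text) for c in bib.findall(".//author/*")]

print(children)
```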
XPath: Attribute Nodes
/bib/book/@price
Result: "55"
@price means that price has to be an attribute
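ElementTree paths cannot end in an attribute step, so this sketch approximates /bib/book/@price in two moves: select the book elements that have a price attribute, then read the attribute value. The document is a trimmed copy of the example.

```python
import xml.etree.ElementTree as ET

BIB = """
<bib>
  <book><title>Foundations of Databases</title></book>
  <book price="55"><title>Principles of Database and Knowledge Base Systems</title></book>
</bib>
"""
bib = ET.fromstring(BIB.strip())

# book[@price] keeps only books that carry the attribute;
# .get("price") then reads its value.
prices = [b.get("price") for b in bib.findall("./book[@price]")]

print(prices)
```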
XPath: Qualifiers
/bib/book/author[first-name]
Result:<author>
<first-name> Rick </first-name>
<last-name> Hull </last-name>
</author>
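The existence qualifier [first-name] is part of the XPath subset that xml.etree.ElementTree supports, so this query can be sketched almost verbatim (relative to the bib element, on a trimmed copy of the example document).

```python
import xml.etree.ElementTree as ET

BIB = """
<bib>
  <book>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
  </book>
</bib>
"""
bib = ET.fromstring(BIB.strip())

# /bib/book/author[first-name]: authors that have a first-name child
matches = bib.findall("./book/author[first-name]")
last_names = [a.findtext("last-name") for a in matches]

print(len(matches))
print(last_names)
```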
XPath: More Qualifiers
/bib/book/author[firstname][address[//zip][city]]/lastname
Result:<lastname> … </lastname>
<lastname> … </lastname>
XPath: More Qualifiers
/bib/book[@price < "60"]
/bib/book[author/@age < "25"]
/bib/book[author/text()]
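ElementTree does not support comparison qualifiers such as @price < "60", so the sketch below emulates two of them with ordinary Python filters. It assumes the XPath 1.0 reading that < compares numerically and that a missing attribute never satisfies the comparison. The document is a trimmed copy of the example.

```python
import xml.etree.ElementTree as ET

BIB = """
<bib>
  <book>
    <author>Serge Abiteboul</author>
    <title>Foundations of Databases</title>
  </book>
  <book price="55">
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
  </book>
</bib>
"""
bib = ET.fromstring(BIB.strip())
books = bib.findall("./book")

# Emulates /bib/book[@price < "60"]: books without a price never qualify.
cheap = [b.findtext("title") for b in books
         if b.get("price") is not None and float(b.get("price")) < 60]

# Emulates /bib/book[author/text()]: at least one author with non-blank text.
with_author_text = [b.findtext("title") for b in books
                    if any((a.text or "").strip() for a in b.findall("author"))]

print(cheap)
print(with_author_text)
```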
XPath: Summary
bib                 matches a bib element
*                   matches any element
/                   matches the root (the node above the document element)
/bib                matches a bib element under the root
bib/paper           matches a paper in bib
bib//paper          matches a paper in bib, at any depth
//paper             matches a paper at any depth
paper|book          matches a paper or a book
@price              matches a price attribute
bib/book/@price     matches price attribute in book, in bib
bib/book[@price<"55"]/author/lastname   matches…
XPath: More Details
• An XPath expression, p, establishes a relation between:
– A context node, and
– A node in the answer set
• In other words, p denotes a function:
– S[p] : Nodes → {Nodes}
• Examples:
– author/firstname
– . = self
– .. = parent
– part/*/*/subpart/../name = part/*/*[subpart]/name
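The function view S[p] can be sketched as a small Python helper over xml.etree.ElementTree: it takes a path and a context node and returns a list of nodes. The helper name S mirrors the slide's notation. The document and the two paths are illustrative stand-ins for the part/subpart example (ElementTree supports . and .. but not every XPath construct).

```python
import xml.etree.ElementTree as ET

BIB = """
<bib>
  <book>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
  </book>
</bib>
"""
bib = ET.fromstring(BIB.strip())

def S(p, context):
    """Evaluate path p at a context node and return the list of answer nodes."""
    return context.findall(p)

# . = self: the context node itself
self_nodes = S(".", bib)

# Stepping down to first-name and back up with .. selects the same
# authors as the qualifier [first-name].
via_parent = S("./book/author/first-name/..", bib)
via_qualifier = S("./book/author[first-name]", bib)

print(via_parent == via_qualifier)
```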
The Root and the Root
<bib>
<paper> 1 </paper>
<paper> 2 </paper>
</bib>
• bib is the “document element”
• The “root” is above bib
• /bib = returns the document element
• / = returns the root
• Why?
• Because we may have comments before and after <bib>
• They become siblings of <bib>
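The distinction can be seen in a DOM parser. The sketch below uses Python's xml.dom.minidom, where the Document object plays the role of the root and documentElement is bib. The two comments are added here for illustration and are not part of the slide's document.

```python
from xml.dom import minidom

# The slide's document, with an illustrative comment before and after <bib>.
doc = minidom.parseString(
    "<!-- before --><bib><paper>1</paper><paper>2</paper></bib><!-- after -->"
)

# Children of the root: the comments are siblings of the document element.
root_children = [n.nodeName for n in doc.childNodes]

# The document element is the single element child of the root.
document_element = doc.documentElement.tagName

print(root_children)
print(document_element)
```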
XPath: More Details
• We can navigate along 13 axes:
ancestor
ancestor-or-self
attribute
child
descendant
descendant-or-self
following
following-sibling
namespace
parent
preceding
preceding-sibling
self
We’ve only seen attribute (@), child (/), descendant-or-self (//), parent (..), and self (.) so far
XPath: More Details
• Examples:
– child::author/child::lastname = author/lastname
– child::author/descendant-or-self::node()/child::zip = author//zip
– child::author/parent::* = author/..
– child::author/attribute::age = author/@age
• What does this mean?
– /bib/book/publisher/parent::*/author
– /bib//address[ancestor::book]
– /bib//author/ancestor::*//zip
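The correspondence between abbreviated and axis syntax can be sketched as a string rewrite. The function name expand is hypothetical, and the sketch handles only paths without qualifiers; it is not an XPath parser. Note that the XPath 1.0 specification expands .. to parent::node(), which selects the same node as parent::* whenever the parent is an element.

```python
def expand(path):
    """Rewrite abbreviated location steps into axis::node-test form.

    Covers only the abbreviations shown on this slide, for paths
    without qualifiers.
    """
    # '//' abbreviates '/descendant-or-self::node()/'
    path = path.replace("//", "/descendant-or-self::node()/")
    steps = []
    for step in path.split("/"):
        if step == "":
            steps.append("")                 # keeps a leading '/'
        elif step == ".":
            steps.append("self::node()")
        elif step == "..":
            steps.append("parent::node()")
        elif step.startswith("@"):
            steps.append("attribute::" + step[1:])
        elif "::" in step:
            steps.append(step)               # already in axis form
        else:
            steps.append("child::" + step)   # child is the default axis
    return "/".join(steps)

for p in ["author/lastname", "author//zip", "author/..", "author/@age"]:
    print(p, "=", expand(p))
```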
XPath: Even More Details
• name() = the name of the current node
/bib//*[name()="book"] same as /bib//book
• What does this mean?
/bib//*[ancestor::*[name()!="book"]]
Is it equivalent to the following?
/bib//*
/bib//*[name()!="book"]//*
• Navigation axis gives us strictly more power!
XPath: Example
How do we evaluate this XPath expression?
/bib//*[name()!="book"]//*
Let’s take it one step at a time
[Figure: example document tree with nodes bib, A, B, book, C, D]
XPath: Example
/bib
returns the following list of one node:
[Figure: node list with a single entry, bib, drawn with its subtree; labels: bib, A, B, book, C, D]
XPath: Example
/bib//*
when executed on the previous node list, returns the following new list of nodes:
[Figure: node list; labels in extraction order: A, B, book, C, D, book, C, D, C, D]
XPath: Example
/bib//*[name()!="book"]
when executed on the previous node list, eliminates one node:
[Figure: node list; labels in extraction order: A, B, book, C, D, C, D]
XPath: Example
/bib//*[name()!="book"]//*
gives us the resulting node list of the XPath expression:
[Figure: node list; labels in extraction order: book, C, D, C, D]
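The step-by-step evaluation can be sketched in Python. The tree shape below is an assumption (A under bib, with text B and a book element containing C and D), chosen to be consistent with the node labels in these slides; the original figure may differ. Each step maps the previous node list to a new one, as on the slides.

```python
import xml.etree.ElementTree as ET

# Assumed tree: bib > A > (text "B", book > (C, D)).
bib = ET.fromstring("<bib><A>B<book><C/><D/></book></A></bib>")

def descendants(node):
    """All element descendants of node, in document order (node excluded)."""
    return [d for d in node.iter() if d is not node]

# Step 1: /bib
step1 = [bib]

# Step 2: /bib//*  (element descendants of every node from step 1)
step2 = [d for n in step1 for d in descendants(n)]

# Step 3: /bib//*[name()!="book"]  (drop the nodes named book)
step3 = [n for n in step2 if n.tag != "book"]

# Step 4: /bib//*[name()!="book"]//*  (descendants again, without duplicates)
step4, seen = [], set()
for n in step3:
    for d in descendants(n):
        if id(d) not in seen:
            seen.add(id(d))
            step4.append(d)

for name, nodes in [("step2", step2), ("step3", step3), ("step4", step4)]:
    print(name, [n.tag for n in nodes])
```

Under this assumed tree, book reappears in the last step because it is a descendant of A, even though the qualifier removed it in the step before.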
Keys in XML Schema
• We forgot something about XML Schema
– Keys
– Key References
• Why?
• XPath is used for keys and key references
Keys in XML Schema
XML:
<purchaseReport>
<regions>
<zip code="95819">
<part number="872-AA" quantity="1"/>
<part number="926-AA" quantity="1"/>
<part number="833-AA" quantity="1"/>
<part number="455-BX" quantity="1"/>
</zip>
<zip code="63143">
<part number="455-BX" quantity="4"/>
</zip>
</regions>
<parts>
<part number="872-AA">Lawnmower</part>
<part number="926-AA">Baby Monitor</part>
<part number="833-AA">Lapis Necklace</part>
<part number="455-BX">Sturdy Shelves</part>
</parts>
</purchaseReport>
XML Schema:
<key name="NumKey">
<selector xpath="parts/part"/>
<field xpath="@number"/>
</key>
Keys in XML Schema
XML Schema:
<xs:element name="purchaseReport">
<xs:complexType>
<xs:sequence>
<xs:element name="regions">
…
</xs:element>
<xs:element name="parts">
…
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:key name="numKey">
<xs:selector xpath="parts/part" />
<xs:field xpath="@number" />
</xs:key>
<xs:keyref name="numKeyRef" refer="numKey">
<xs:selector xpath="regions/zip/part" />
<xs:field xpath="@number" />
</xs:keyref>
</xs:element>
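What numKey and numKeyRef require of an instance can be sketched by hand in Python. This is not a schema validator; it only evaluates the two selector paths with xml.etree.ElementTree on the purchaseReport example and checks the resulting @number values. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

REPORT = """
<purchaseReport>
  <regions>
    <zip code="95819">
      <part number="872-AA" quantity="1"/>
      <part number="926-AA" quantity="1"/>
      <part number="833-AA" quantity="1"/>
      <part number="455-BX" quantity="1"/>
    </zip>
    <zip code="63143">
      <part number="455-BX" quantity="4"/>
    </zip>
  </regions>
  <parts>
    <part number="872-AA">Lawnmower</part>
    <part number="926-AA">Baby Monitor</part>
    <part number="833-AA">Lapis Necklace</part>
    <part number="455-BX">Sturdy Shelves</part>
  </parts>
</purchaseReport>
"""
report = ET.fromstring(REPORT.strip())

# key numKey: selector parts/part, field @number.
# Every selected part must have a number, and the numbers must be distinct.
key_values = [p.get("number") for p in report.findall("parts/part")]
key_ok = None not in key_values and len(set(key_values)) == len(key_values)

# keyref numKeyRef: selector regions/zip/part, field @number.
# Every referenced number must occur among the key values.
ref_values = [p.get("number") for p in report.findall("regions/zip/part")]
dangling = [r for r in ref_values if r not in set(key_values)]

print(key_ok)
print(dangling)
```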
Keys in XML Schema
• In general, two flavors:
<key name="someNameHere">
<selector xpath="p"/>
<field xpath="p1"/>
<field xpath="p2"/>
…
<field xpath="pk"/>
</key>
<unique name="someNameHere">
<selector xpath="p"/>
<field xpath="p1"/>
<field xpath="p2"/>
…
<field xpath="pk"/>
</unique>
Note
• All XPath expressions “start” at the element currently being defined
• The fields must identify a single node
Keys in XML Schema
• Unique = guarantees uniqueness
• Key = guarantees uniqueness and existence
• All XPath expressions are “restricted”:
– /a/b | /a/c    OK for selector
– //a/b/*/c      OK for field
• Note: better than DTD’s ID mechanism
Keys in XML Schema
• Examples
Recall: each person must have a single forename and a single surname
<key name="fullName">
<selector xpath=".//person"/>
<field xpath="forename"/>
<field xpath="surname"/>
</key>
<unique name="nearlyID">
<selector xpath=".//*"/>
<field xpath="@id"/>
</unique>
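The difference between unique and key can be sketched on the nearlyID example. The small document and the function names check_unique and check_key are hypothetical. The sketch applies the selector .//* with the field @id and treats a missing attribute as a missing field value.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: two elements carry an id, one does not.
doc = ET.fromstring('<doc><a id="1"/><b id="2"/><c/></doc>')

def check_unique(values):
    """unique: the values that are present must be pairwise distinct."""
    present = [v for v in values if v is not None]
    return len(set(present)) == len(present)

def check_key(values):
    """key: every selected node must have a value, and values are distinct."""
    return None not in values and check_unique(values)

# selector .//* with field @id
ids = [e.get("id") for e in doc.findall(".//*")]

print(ids)
print(check_unique(ids))  # uniqueness only
print(check_key(ids))     # uniqueness and existence
```

With this instance the unique constraint holds but a key on the same selector and field would not, because c has no id.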
Foreign Keys in XML Schema
• Examples
<keyref name="personRef" refer="fullName">
<selector xpath=".//personPointer"/>
<field xpath="@first"/>
<field xpath="@last"/>
</keyref>
References
• Lecture Slides
– Dan Suciu
– http://www.cs.washington.edu/homes/suciu/COURSES/590DS/06xpath.htm
– http://www.cs.washington.edu/homes/suciu/COURSES/590DS/14constraintkeys.htm
• BRICS XML Tutorial
– A. Moeller, M. Schwartzbach
– http://www.brics.dk/~amoeller/XML/index.html
• W3C's XPath homepage
– http://www.w3.org/TR/xpath
• W3C's XML Schema homepage
– http://www.w3.org/XML/Schema
• XML School
– http://www.w3schools.com