Quick Intro to XPath

1
Roger L. Costello
14 December, 2012
2
Objective
• XML Schema 1.1 uses XPath a lot, so if you don't know XPath then you're at a disadvantage.
• The purpose of this short tutorial is to teach you enough XPath that you won't be at a disadvantage.
3
XPath is not a standalone language
• XPath requires a host language. There are currently several XML languages that host XPath.
4
XPath is not a standalone language
Languages that host XPath: XML Schemas, XSLT, XQuery, XPointer, Schematron
5
This XML document can be represented as a tree, as shown below
<?xml version="1.0"?>
<FitnessCenter>
<Member>
<Name>Jeff</Name>
<FavoriteColor>lightgrey</FavoriteColor>
</Member>
<Member>
<Name>David</Name>
<FavoriteColor>lightblue</FavoriteColor>
</Member>
<Member>
<Name>Roger</Name>
<FavoriteColor>lightyellow</FavoriteColor>
</Member>
</FitnessCenter>
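Document /
  PI <?xml version="1.0"?>
  Element FitnessCenter
    Element Member
      Element Name
        Text Jeff
      Element FavoriteColor
        Text lightgrey
    Element Member
      Element Name
        Text David
      Element FavoriteColor
        Text lightblue
    Element Member
      Element Name
        Text Roger
      Element FavoriteColor
        Text lightyellow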
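The same tree can be printed with a few lines of Python. This is only a sketch, assuming the standard library's xml.etree.ElementTree rather than Oxygen XML, which the slides use. ElementTree exposes only elements and their text, so the Document and PI lines of the diagram do not appear, and whitespace-only text is skipped. The names FITNESS_XML, outline and tree_lines are made up for this example.

```python
import xml.etree.ElementTree as ET

FITNESS_XML = """<?xml version="1.0"?>
<FitnessCenter>
    <Member><Name>Jeff</Name><FavoriteColor>lightgrey</FavoriteColor></Member>
    <Member><Name>David</Name><FavoriteColor>lightblue</FavoriteColor></Member>
    <Member><Name>Roger</Name><FavoriteColor>lightyellow</FavoriteColor></Member>
</FitnessCenter>"""


def outline(element, depth=0):
    """Return one line per element and per non-blank text node, indented by depth."""
    lines = ["  " * depth + "Element " + element.tag]
    text = (element.text or "").strip()
    if text:
        # ElementTree stores a text node as the .text attribute of its parent element.
        lines.append("  " * (depth + 1) + "Text " + text)
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(FITNESS_XML)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Each Member contributes five lines (Member, Name, its text, FavoriteColor, its text), which mirrors the element and text boxes in the diagram.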
6
Terminology - node
The tree contains these kinds of nodes:
• Document node (/)
• Processing Instruction (PI) node (<?xml version="1.0"?>)
• Element nodes (FitnessCenter, Member, Name, FavoriteColor)
• Text nodes (Jeff, lightgrey, David, lightblue, Roger, lightyellow)
(Note: strictly, the XML declaration is not a processing instruction, so an XPath processor does not create a PI node for it.)
7
(Tree diagram of the FitnessCenter document from slide 5, with one node highlighted and this callout:)
With respect to this node, these are its children
8
(Tree diagram of the FitnessCenter document from slide 5, with this callout:)
These are its descendant nodes
9
(Tree diagram of the FitnessCenter document from slide 5, with one node highlighted and this callout:)
This is the context node
10
(Tree diagram of the FitnessCenter document from slide 5, with the context node highlighted and this callout:)
That's its parent
11
(Tree diagram of the FitnessCenter document from slide 5, with the context node highlighted and this callout:)
Those are its ancestors
12
(Tree diagram of the FitnessCenter document from slide 5, with the context node highlighted and this callout:)
It has 2 siblings
13
(Tree diagram of the FitnessCenter document from slide 5, with the context node highlighted and this callout:)
They are following siblings
14
(Tree diagram of the FitnessCenter document from slide 5, with the context node highlighted and this callout:)
It has no preceding siblings
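The terms from the last few slides (children, descendants, parent, ancestors, siblings) can be tried out in Python. This is a sketch under two assumptions: it uses the standard library's xml.etree.ElementTree, and it takes the first Member as the context node, which fits the callouts (2 siblings, no preceding siblings) but is not stated in the text. ElementTree has no document node and no parent pointers, so the ancestors list stops at FitnessCenter and a hypothetical parent_of map is built by hand.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<FitnessCenter>"
    "<Member><Name>Jeff</Name><FavoriteColor>lightgrey</FavoriteColor></Member>"
    "<Member><Name>David</Name><FavoriteColor>lightblue</FavoriteColor></Member>"
    "<Member><Name>Roger</Name><FavoriteColor>lightyellow</FavoriteColor></Member>"
    "</FitnessCenter>"
)

# Map every element to its parent (elements do not store this themselves).
parent_of = {child: parent for parent in root.iter() for child in parent}

context = root.find("Member")          # assumption: the first Member is the context node

children = list(context)               # Name, FavoriteColor
descendants = [e for e in context.iter() if e is not context]  # element descendants only
parent = parent_of[context]            # FitnessCenter

ancestors = []
node = context
while node in parent_of:               # climb until the root element is reached
    node = parent_of[node]
    ancestors.append(node)

all_children = list(parent)
position = all_children.index(context)
siblings = [e for e in all_children if e is not context]
preceding_siblings = all_children[:position]
following_siblings = all_children[position + 1:]

print([c.tag for c in children], parent.tag, len(siblings))
```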
15
Here are the capabilities of XPath
• XPath provides a syntax for:
– navigating around an XML document
– selecting nodes and values
– comparing node values
– performing arithmetic on node values
• XPath provides some functions (e.g., concat(), substring(), etc.) to facilitate the above.
16
This XML document can be represented as a tree, as shown below
<?xml version="1.0"?>
<Document classification="secret">
<Para classification="unclassified">
One if by land, two if by sea;
</Para>
<Para classification="confidential">
And I on the opposite shore will be,
Ready to ride and spread the alarm
</Para>
<Para classification="unclassified">
Ready to ride and spread the alarm
Through every Middlesex, village and farm,
</Para>
</Document>
Document /
  PI <?xml version="1.0"?>
  Element Document
    Attribute classification="secret"
    Element Para
      Attribute classification="unclassified"
      Text One if …
    Element Para
      Attribute classification="confidential"
      Text And I …
    Element Para
      Attribute classification="unclassified"
      Text Ready to …
See document.xml in the xpath
folder, within the examples folder.
17
Execute XPath using
Oxygen XML
Type your XPath expression here
Change this to XPath 1.0
18
Use XPath Builder
for long XPath
expressions
19
Please Run the XPath Expressions
• The following slides contain XPath expressions.
• It's important that you copy the expression on the slide and paste it into Oxygen XML to see what the expression does.
• First, copy the XML document on slide 16, save it to a file, then drag and drop the file into Oxygen XML.
20
Select all Para Elements
/Document/Para
21
/Document/Para
This is an absolute XPath expression
22
Establish a Context Node
Click on this to establish it as the "context node"
(any XPath expressions will be relative to it)
23
Relative XPath Expression
In Oxygen XML click on <Document> to
establish the “context node” and then type
this in the XPath box:
Para
24
Select all Para Elements
//Para
descendants
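The difference between the relative path Para (children of the context node) and //Para (Para elements at any depth) only shows up when a Para is nested. The sketch below uses a hypothetical document, not the one from slide 16, and Python's standard xml.etree.ElementTree. ElementTree expects paths relative to an element, so //Para is written as .//Para starting from the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document (not from the slides): one Para is nested inside a Section.
NESTED_XML = """<Document>
    <Para>top-level paragraph</Para>
    <Section>
        <Para>nested paragraph</Para>
    </Section>
</Document>"""

document = ET.fromstring(NESTED_XML)      # <Document> plays the role of the context node

child_paras = document.findall("Para")    # relative path: Para children only
all_paras = document.findall(".//Para")   # Para elements at any depth, like //Para

print(len(child_paras), len(all_paras))
```

With the slide 16 document both paths select the same three elements, because every Para there is a direct child of Document.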
25
Select the first Para
//Para[1]
26
Select the last Para
//Para[last()]
27
Select the classification
attribute of the first Para
//Para[1]/@classification
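A rough Python counterpart of the position and attribute steps, assuming the standard library's xml.etree.ElementTree, which understands [1] and [last()] but returns attributes through get() rather than through an @ step. Note that in XPath //Para[1] means "every Para that is the first Para child of its parent"; in this document that is a single element. The Para text is written on one line here for brevity.

```python
import xml.etree.ElementTree as ET

DOCUMENT_XML = """<Document classification="secret">
    <Para classification="unclassified">One if by land, two if by sea;</Para>
    <Para classification="confidential">And I on the opposite shore will be, Ready to ride and spread the alarm</Para>
    <Para classification="unclassified">Ready to ride and spread the alarm Through every Middlesex, village and farm,</Para>
</Document>"""

root = ET.fromstring(DOCUMENT_XML)

first_para = root.find("Para[1]")         # positions start at 1, as in XPath
last_para = root.find("Para[last()]")

# //Para[1]/@classification: read the attribute from the selected element
first_classification = first_para.get("classification")

print(first_classification)
print(last_para.text)
```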
28
Is the Document element’s classification top-secret?
/Document/@classification = 'top-secret'
29
Is the Document element’s classification top-secret or secret?
(/Document/@classification = 'top-secret') or (/Document/@classification = 'secret')
30
Logical Operators
A or B
A and B
not(A)
31
Select all Para elements with a secret classification
//Para[@classification = 'secret']
32
Check that no Para has a
top-secret classification
not(//Para[@classification = 'top-secret'])
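The comparison, or, predicate and not() examples can be mirrored in Python. This is a sketch assuming the standard library's xml.etree.ElementTree, which supports the [@attribute='value'] predicate but not the boolean operators, so those are done in Python. An empty result list plays the role of an empty node-set.

```python
import xml.etree.ElementTree as ET

DOCUMENT_XML = """<Document classification="secret">
    <Para classification="unclassified"/>
    <Para classification="confidential"/>
    <Para classification="unclassified"/>
</Document>"""

root = ET.fromstring(DOCUMENT_XML)
doc_class = root.get("classification")

# /Document/@classification = 'top-secret'
is_top_secret = doc_class == "top-secret"

# (... = 'top-secret') or (... = 'secret')
is_top_secret_or_secret = doc_class == "top-secret" or doc_class == "secret"

# //Para[@classification = 'secret']
secret_paras = root.findall(".//Para[@classification='secret']")

# not(//Para[@classification = 'top-secret']): true when nothing is selected
no_top_secret_para = not root.findall(".//Para[@classification='top-secret']")

print(is_top_secret, is_top_secret_or_secret, len(secret_paras), no_top_secret_para)
```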
33
Establish a New Context
Node
Make the second Para the context node
34
Select the Following Siblings
following-sibling::*
35
Select the First
Following Sibling
following-sibling::*[1]
36
Add Another Element
Add this <Test> element after the last Para
37
Select the Following
Para Siblings
following-sibling::Para
38
Select all Following Siblings
following-sibling::*
39
Select all Preceding Siblings
preceding-sibling::*
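The sibling axes can be emulated in Python by slicing the parent's list of children. This is a sketch under assumptions: it uses the standard library's xml.etree.ElementTree, which has no following-sibling or preceding-sibling axis of its own, and it uses an empty <Test/> element because the content of the slide's <Test> element is not shown in the text. The Para text is shortened.

```python
import xml.etree.ElementTree as ET

DOCUMENT_XML = """<Document classification="secret">
    <Para classification="unclassified">One if by land, two if by sea;</Para>
    <Para classification="confidential">And I on the opposite shore will be,</Para>
    <Para classification="unclassified">Ready to ride and spread the alarm</Para>
    <Test/>
</Document>"""

root = ET.fromstring(DOCUMENT_XML)
children = list(root)
context = root.find("Para[2]")            # the second Para is the context node
position = children.index(context)

following = children[position + 1:]                          # following-sibling::*
first_following = following[0] if following else None        # following-sibling::*[1]
following_paras = [e for e in following if e.tag == "Para"]  # following-sibling::Para

# preceding-sibling::* in document order; in XPath, preceding-sibling::*[1]
# would be the nearest one, i.e. preceding[-1] here.
preceding = children[:position]

print([e.tag for e in following], [e.tag for e in preceding])
```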
40
Make Document the Context
Click on Document to make it the
context node.
41
Equivalent!
Para[1]
child::Para[1]
42
Make Para[2] the context
Establish this as the context node.
43
Get parent element's
classification
../@classification
44
Equivalent!
../@classification
parent::*/@classification
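Going from a Para up to its parent can be sketched in Python as well, assuming the standard library's xml.etree.ElementTree. Elements there do not know their parent, so two routes are shown: a hand-built parent_of map (a made-up helper name), and ElementTree's own ".." step, which has to be evaluated from the root element rather than from the Para itself.

```python
import xml.etree.ElementTree as ET

DOCUMENT_XML = """<Document classification="secret">
    <Para classification="unclassified"/>
    <Para classification="confidential"/>
    <Para classification="unclassified"/>
</Document>"""

root = ET.fromstring(DOCUMENT_XML)
context = root.find("Para[2]")            # Para[2] is the context node

# Route 1: build a child -> parent map, then look the context node up.
parent_of = {child: parent for parent in root.iter() for child in parent}
parent_via_map = parent_of[context]

# Route 2: use the ".." step, starting from the root element.
parent_via_path = root.find("Para[2]/..")

# ../@classification
parent_classification = parent_via_map.get("classification")
print(parent_classification)
```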
45
Axis
following-sibling
preceding-sibling
child
parent
ancestor
descendant
self
46
Count the number of
Para elements
count(//Para)
47
Count the number of Para elements with secret classification
count(//Para[@classification = 'secret'])
48
Does the first Para element contain the string “SCRIPT”?
contains(//Para[1], 'SCRIPT')
49
Select all nodes containing the string “SCRIPT”
//node()[contains(., 'SCRIPT')]
In this expression node() is a node test (not a function). Because // steps along the child axis, it matches these nodes:
- element
- text
- comment
- processing instructions (PIs)
Note that it does not match these nodes:
- attribute
- document
50
Count the number of nodes containing the string “SCRIPT”
count(//node()[contains(., 'SCRIPT')])
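To see which nodes //node()[contains(., '...')] picks up, the search can be emulated in Python. This is a sketch assuming the standard library's xml.etree.ElementTree, which cannot evaluate node() or contains() itself. The helper nodes_containing is made up for this example and checks element nodes (by their string value) and text nodes only; hits are not returned in document order. Since "SCRIPT" does not occur in the document, the word "alarm" is also searched as an illustration.

```python
import xml.etree.ElementTree as ET

DOCUMENT_XML = """<Document classification="secret">
    <Para classification="unclassified">One if by land, two if by sea;</Para>
    <Para classification="confidential">And I on the opposite shore will be,
        Ready to ride and spread the alarm</Para>
    <Para classification="unclassified">Ready to ride and spread the alarm
        Through every Middlesex, village and farm,</Para>
</Document>"""


def nodes_containing(root, needle):
    """List element and text nodes whose string value contains needle."""
    hits = []
    for element in root.iter():
        # String value of an element: all descendant text, concatenated.
        if needle in "".join(element.itertext()):
            hits.append(("element", element.tag))
        # Text nodes that are children of this element: its .text and its children's .tail.
        for text in [element.text] + [child.tail for child in element]:
            if text and needle in text:
                hits.append(("text", text.strip()))
    return hits


root = ET.fromstring(DOCUMENT_XML)
script_hits = nodes_containing(root, "SCRIPT")
alarm_hits = nodes_containing(root, "alarm")

print(len(script_hits), len(alarm_hits))   # len() plays the role of count()
```

The "alarm" search finds the Document element and two Para elements (their string values contain the word) plus the two text nodes themselves, which is why an element and its own text are both counted.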
51
Select the first 20
characters of the first Para
substring(//Para[1], 1, 20)
52
What's the length of the
content of the first Para?
string-length(//Para[1])
53
Convert Document’s
classification to lowercase
translate(/Document/@classification, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')
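The string functions from the last few slides have close Python counterparts. This is a sketch assuming the standard library's xml.etree.ElementTree to fetch the values, with the functions themselves done in plain Python. The Para text is written without surrounding whitespace so the numbers are easy to check, and the "Top-SECRET" value is a hypothetical input added to show translate() changing something.

```python
import xml.etree.ElementTree as ET

DOCUMENT_XML = (
    '<Document classification="secret">'
    '<Para classification="unclassified">One if by land, two if by sea;</Para>'
    '<Para classification="confidential">And I on the opposite shore will be,</Para>'
    '</Document>'
)
root = ET.fromstring(DOCUMENT_XML)


def string_value(element):
    """XPath string value of an element: all descendant text, concatenated."""
    return "".join(element.itertext())


first_para = string_value(root.find("Para[1]"))

has_script = "SCRIPT" in first_para   # contains(//Para[1], 'SCRIPT')
first_20 = first_para[0:20]           # substring(//Para[1], 1, 20); XPath counts from 1
length = len(first_para)              # string-length(//Para[1])

# translate(): replace each character of the second argument by the character
# at the same position in the third argument.
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = "abcdefghijklmnopqrstuvwxyz"
to_lower = str.maketrans(UPPER, LOWER)
lowered = root.get("classification").translate(to_lower)
lowered_demo = "Top-SECRET".translate(to_lower)   # hypothetical input

print(repr(first_20), length, lowered, lowered_demo)
```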
54
Add a new <Cost> element
Add this element and establish
Document as the context node.
55
Multiply Cost by 2
Cost * 2
56
N mod X = the remainder of
dividing N by X
Cost mod 2
57
Arithmetic Operators
*
mod
- (leave space on either side)
div
+
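The arithmetic operators can be tried in Python with one caveat: XPath's mod keeps the sign of the left operand (it is the remainder of a truncating division), while Python's % keeps the sign of the right operand, so math.fmod is the closer match. This is a sketch assuming the standard library's xml.etree.ElementTree. The Cost value of 15 is hypothetical, since the slide's actual <Cost> element is not shown in the text.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical <Cost> value; the slide's own value is not available here.
root = ET.fromstring("<Document><Cost>15</Cost></Document>")

cost = float(root.findtext("Cost"))   # XPath converts the element's text to a number

doubled = cost * 2                    # Cost * 2
remainder = math.fmod(cost, 2)        # Cost mod 2
halved = cost / 2                     # Cost div 2
reduced = cost - 1                    # Cost - 1 (XPath needs spaces around the minus sign)

# Sign behaviour for a negative left operand:
xpath_style = math.fmod(-5, 2)        # matches XPath: -5 mod 2 is -1
python_style = -5 % 2                 # Python's % gives 1 instead

print(doubled, remainder, halved, reduced, xpath_style, python_style)
```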
58
Set this to XPath 2.0
59
Does Document’s classification match one in Classifications.xml?
/Document/@classification = doc('Classifications.xml')/Classifications/li
60
Do the first two Para's have
the same classification?
Para[1]/@classification eq Para[2]/@classification
61
Value Comparison Operators
eq means equal
ne means not equal
lt means less than
gt means greater than
le means less than or equal to
ge means greater than or equal to
62
If Document's classification is top-secret then there can be no Para with a classification not equal to top-secret
if (/Document/@classification eq 'top-secret') then not(//Para[@classification ne 'top-secret']) else true()
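The if/then/else rule reads naturally as a small Python function. This is a sketch assuming the standard library's xml.etree.ElementTree; the function name rule_holds and the three test documents are made up for illustration. A Para without a classification attribute is not flagged, which matches the XPath predicate: comparing a missing attribute with ne selects nothing.

```python
import xml.etree.ElementTree as ET


def rule_holds(document):
    """True unless a top-secret Document contains a Para classified otherwise."""
    if document.get("classification") != "top-secret":
        return True                       # the "else true()" branch
    offending = [
        para for para in document.iter("Para")
        if para.get("classification") is not None
        and para.get("classification") != "top-secret"
    ]
    return not offending                  # not(//Para[@classification ne 'top-secret'])


# Hypothetical documents for illustration.
secret_doc = ET.fromstring(
    '<Document classification="secret"><Para classification="unclassified"/></Document>'
)
mixed_doc = ET.fromstring(
    '<Document classification="top-secret">'
    '<Para classification="top-secret"/><Para classification="secret"/>'
    '</Document>'
)
uniform_doc = ET.fromstring(
    '<Document classification="top-secret"><Para classification="top-secret"/></Document>'
)

print(rule_holds(secret_doc), rule_holds(mixed_doc), rule_holds(uniform_doc))
```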
63
Two built-in functions
true()
false()
64
Cast a value to a numeric
type
number(Cost)
65
Check that Document's children are: multiple Para elements, 1 Test, and 1 Cost (and nothing else)
Para[2] and Test and Cost and empty(* except (Para, Test[1], Cost[1]))
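The same content check can be spelled out in Python, which makes each part of the XPath expression visible: Para[2] means "at least two Para children", and the except clause rules out any other element as well as a second Test or Cost. This is a sketch assuming the standard library's xml.etree.ElementTree; children_ok and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def children_ok(document):
    """True if the children are 2+ Para, exactly 1 Test, exactly 1 Cost, nothing else."""
    tags = [child.tag for child in document]
    return (
        tags.count("Para") >= 2                     # Para[2]
        and tags.count("Test") == 1                 # Test, and no Test beyond Test[1]
        and tags.count("Cost") == 1                 # Cost, and no Cost beyond Cost[1]
        and set(tags) <= {"Para", "Test", "Cost"}   # no other element children
    )


# Hypothetical documents for illustration.
valid = ET.fromstring("<Document><Para/><Para/><Para/><Test/><Cost>15</Cost></Document>")
one_para = ET.fromstring("<Document><Para/><Test/><Cost>15</Cost></Document>")
two_tests = ET.fromstring("<Document><Para/><Para/><Test/><Test/><Cost>15</Cost></Document>")
extra = ET.fromstring("<Document><Para/><Para/><Test/><Cost>15</Cost><Note/></Document>")

print(children_ok(valid), children_ok(one_para), children_ok(two_tests), children_ok(extra))
```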
66
The sum() function
<?xml version="1.0"?>
<numbers>
<number>23</number>
<number>5</number>
<number>-41</number>
<number>50</number>
<number>12</number>
</numbers>
sum(//number)
returns 49.0
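The result can be checked by hand in Python. This is a sketch assuming the standard library's xml.etree.ElementTree, with each <number> converted to a float before adding, much as XPath converts the element text to a number.

```python
import xml.etree.ElementTree as ET

NUMBERS_XML = """<numbers>
    <number>23</number>
    <number>5</number>
    <number>-41</number>
    <number>50</number>
    <number>12</number>
</numbers>"""

numbers = ET.fromstring(NUMBERS_XML)

# sum(//number): add up the numeric value of every <number> element
total = sum(float(n.text) for n in numbers.iter("number"))
print(total)
```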
67
Check that every Publisher
has a string-length le 140
<BookStore>
<Book>
<Title>My Life and Times</Title>
<Author>Paul McCartney</Author>
<Date>1998</Date>
<ISBN>1-56592-235-2</ISBN>
<Publisher>McMillin Publishing</Publisher>
<Author>John Ghostwriter</Author>
</Book>
<Book>
<Publisher>Dell Publishing Co.</Publisher>
<Author>Richard Bach</Author>
<Date>1977</Date>
<ISBN>0-440-34319-4</ISBN>
<Title>Illusions The Adventures of a Reluctant Messiah</Title>
</Book>
<Book>
<ISBN>0-06-064831-7</ISBN>
<Title>The First and Last Freedom</Title>
<Author>J. Krishnamurti</Author>
<Date>1954</Date>
<Publisher>Harper &amp; Row</Publisher>
</Book>
</BookStore>
68
Check that every Publisher
has a string-length le 140
every $i in //Publisher satisfies string-length($i) le 140
69
The XPath every
expression
• The form of the every expression is:
every variable in sequence satisfies boolean expression
• The result of the expression is either true or
false.
70
Equivalent
every $i in //Publisher satisfies string-length($i) le 140
not(//Publisher[string-length(.) gt 140])
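The equivalence between the every expression and the not(...) form is the same as the one between all() and "not any()" in Python. This is a sketch assuming the standard library's xml.etree.ElementTree; the BookStore below keeps only the Publisher elements from the slide, and the over-long publisher name is a hypothetical value added to show both forms failing together.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<BookStore>
    <Book><Publisher>McMillin Publishing</Publisher></Book>
    <Book><Publisher>Dell Publishing Co.</Publisher></Book>
    <Book><Publisher>Harper &amp; Row</Publisher></Book>
</BookStore>"""

bookstore = ET.fromstring(BOOKSTORE_XML)
publishers = [p.text for p in bookstore.iter("Publisher")]

# every $i in //Publisher satisfies string-length($i) le 140
every_form = all(len(name) <= 140 for name in publishers)

# not(//Publisher[string-length(.) gt 140])
not_exists_form = not any(len(name) > 140 for name in publishers)

# Hypothetical over-long name: both forms now report a violation.
too_long = publishers + ["X" * 141]
every_form_long = all(len(name) <= 140 for name in too_long)
not_exists_form_long = not any(len(name) > 140 for name in too_long)

print(every_form, not_exists_form, every_form_long, not_exists_form_long)
```

Both forms also agree on an empty sequence: with no Publisher elements at all, every is true and the not(...) form is true as well.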