05-XML

XML
Written by Dr. Yaron Kanza, Edited by Liron Blecher
Agenda
• What is XML
• Parsing XML
• DOM
• SAX
• XML Schema
• JAXB Binding
What is XML
• XML stands for eXtensible Markup Language
• It is a meta-language that describes the content of a
document (self-describing data)
• If Java = Portable Programs then XML = Portable Data
• XML does not specify the tag set or grammar of the
language
• Tag Set – markup tags that have meaning to a language
processor
• Grammar – rules that define correct usage of a
language’s tags
What is XML
• An XML file has the following syntax rules:
• All data is contained within tags
• Tags are written in angle brackets, <xxxxx>, where
xxxxx is the tag name
• Each tag must be closed (using a </xxxxx> tag)
• The data between the opening and closing tags is
the value of the tag
• Tags can be nested
What is XML
• If a tag does not contain any value it can be
opened and closed like this: <xxxxx />
• An XML document must contain a single root tag
• Each tag can also have multiple attributes defined
in it (where the tag is opened), for example:
<country name="Israel" capital="Jerusalem" />
• There are two attributes: name and capital
• Each attribute value must be enclosed in quotation marks
(single or double)
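The syntax rules above can be checked mechanically. The following is a minimal sketch, assuming the JDK's built-in DOM parser; the class name WellFormedDemo and the sample strings are illustrative and not part of the original slides. It parses the country example and then feeds the parser three inputs that each break one rule.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedDemo {

    // Parses the text; throws SAXException if it breaks a syntax rule.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors instead of printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    static boolean isWellFormed(String xml) throws Exception {
        try {
            parse(xml);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<country name=\"Israel\" capital=\"Jerusalem\" />");
        Element root = doc.getDocumentElement();
        // prints "country Jerusalem"
        System.out.println(root.getTagName() + " " + root.getAttribute("capital"));

        System.out.println(isWellFormed("<a><b></a></b>")); // false: tags closed in the wrong order
        System.out.println(isWellFormed("<a/><b/>"));       // false: more than one root tag
        System.out.println(isWellFormed("<a x=1/>"));       // false: attribute value not quoted
    }
}
```

A parser only reports that a rule was broken; it does not repair the input, so a document that fails any of these checks cannot be loaded at all.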
Parsing
• Parsing means reading some input and analyzing it
according to grammar rules
• In plain text, the grammar consists of line endings, word
spacing, etc.
[Figure: the Input and a Formal grammar are fed to the Parser, which produces the Analyzed Data]
• Analyzed Data: the structure(s) of the input, according to the
atomic elements and their relationships (as
described in the grammar)
Parsing XML
• There are two general-purpose methods for parsing XML files, plus a third that needs a schema:
• DOM – Document Object Model
• SAX – Simple API for XML
• JAXB Binding – only when you have an XML
definition (usually a schema file with the .xsd extension)
DOM
• Parser creates a tree object out of the document
• User accesses data by traversing the tree
• The tree and its traversal conform to a W3C
standard
• The API allows for constructing, accessing and
manipulating the structure and content of XML
documents
DOM – Example
[Figure: XML File → DOM Parser → DOM Tree in memory → API → Application]
DOM - Example
<?xml version="1.0"?>
<countries>
  <country continent="Asia">
    <name>Israel</name>
    <population year="2001">6,199,008</population>
    <city capital="yes"><name>Jerusalem</name></city>
    <city capital="no"><name>Ashdod</name></city>
  </country>
  <country continent="Europe">
    <name>France</name>
    <population year="2004">60,424,213</population>
  </country>
</countries>
DOM – Example (The DOM Tree)
Document
  countries
    country (continent = Asia)
      name: Israel
      population (year = 2001): 6,199,008
      city (capital = yes)
        name: Jerusalem
      city (capital = no)
        name: Ashdod
    country (continent = Europe)
      name: France
      population (year = 2004): 60,424,213
DOM – Example (Creating the tree)
• A DOM tree is generated by a DocumentBuilder
• The builder is created by a factory, so that the code stays
implementation independent
• The factory implementation is chosen according to the system
configuration
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("world.xml");
DOM – Example (configuring the factory)
• The methods of the document-builder factory let
you configure how documents are built
• You can also attach a schema to the factory for
additional validation of the XML structure
• For example:
• factory.setValidating(true)
• factory.setIgnoringComments(false)
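To see that a factory setting changes the tree the builder produces, here is a small sketch, assuming the JDK's default parser. The class name FactoryConfigDemo and the sample XML are hypothetical. It parses the same text twice, once keeping comments and once ignoring them, and counts the comment nodes under the root.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FactoryConfigDemo {

    // Hypothetical sample input: one comment and one element under the root.
    static final String XML = "<countries><!-- draft --><country/></countries>";

    // Counts the comment nodes directly under the root for a given setting.
    static int countComments(boolean ignoreComments) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setIgnoringComments(ignoreComments); // must be set before the builder is created
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));

        NodeList children = doc.getDocumentElement().getChildNodes();
        int comments = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.COMMENT_NODE) {
                comments++;
            }
        }
        return comments;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countComments(false)); // 1: the comment becomes a node in the tree
        System.out.println(countComments(true));  // 0: the parser drops the comment
    }
}
```

The settings belong to the factory, so they apply to every builder created from it afterwards; a builder that already exists is not affected by later changes.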
DOM – Example (the Node interface)
The nodes of the DOM tree include
• A special root (the document node)
• The Document returned by builder.parse(…) is itself a Node: the Document interface
extends the Node interface
• element nodes
• text nodes
• attributes
• comments
• and more ...
Every node in the DOM tree implements the Node
interface
DOM – Example (interfaces in the DOM tree)
Node
  DocumentFragment
  Document
  CharacterData
    Text
      CDATASection
    Comment
  Attr
  Element
  DocumentType
  Notation
  Entity
  EntityReference
  ProcessingInstruction
NodeList and NamedNodeMap (collection interfaces; they do not extend Node)
Figure as appears in “The XML Companion” - Neil Bradley
DOM – Example (interfaces in the DOM tree)
[Figure: a sample DOM tree. The root is a Document node; the nodes beneath it are labelled Document Type, Element, Attribute, Text, Comment and Entity Reference]
DOM – Example (Node Navigation)
Every node has a specific location in the tree
The Node interface specifies methods for tree navigation
• Node getFirstChild();
• Node getLastChild();
• Node getNextSibling();
• Node getPreviousSibling();
• Node getParentNode();
• NodeList getChildNodes();
• NamedNodeMap getAttributes();
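The navigation methods listed above can be combined as in the following sketch. It assumes the JDK's DOM parser; NavigationDemo and the sample XML are illustrative names, and the XML is deliberately written without line breaks so that every child of country is an element (indentation would otherwise show up as extra text nodes between the elements).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NavigationDemo {

    // Hypothetical sample input, without whitespace between the tags.
    static final String XML =
            "<country continent=\"Asia\"><name>Israel</name>"
            + "<city capital=\"yes\"><name>Jerusalem</name></city>"
            + "<city capital=\"no\"><name>Ashdod</name></city></country>";

    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Walks the children with getFirstChild / getNextSibling.
    static List<String> childElementNames(Node parent) {
        List<String> names = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add(n.getNodeName());
            }
        }
        return names;
    }

    // Reads one attribute through the NamedNodeMap returned by getAttributes.
    static String attribute(Node node, String name) {
        return node.getAttributes().getNamedItem(name).getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        Element country = parseRoot(XML);
        System.out.println(childElementNames(country));            // [name, city, city]

        Node lastCity = country.getLastChild();
        System.out.println(attribute(lastCity, "capital"));        // no
        System.out.println(attribute(lastCity.getPreviousSibling(), "capital")); // yes
        System.out.println(lastCity.getParentNode().getNodeName()); // country
    }
}
```

getAttributes returns null for nodes that are not elements, so the attribute helper is only safe to call on element nodes.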
DOM – Example (Node Navigation)
[Figure: a node and its neighbours, with arrows labelled getParentNode(), getPreviousSibling(), getNextSibling(), getFirstChild(), getLastChild() and getChildNodes()]
DOM – Example (Node Properties)
Every node has
• a type
• a name
• a value
• attributes
The roles of these properties differ according to the
node types
Nodes of different types implement different
interfaces (that extend Node)
DOM – Example (Node Type)
ELEMENT_NODE = 1
ATTRIBUTE_NODE = 2
TEXT_NODE = 3
CDATA_SECTION_NODE = 4
ENTITY_REFERENCE_NODE = 5
ENTITY_NODE = 6
PROCESSING_INSTRUCTION_NODE = 7
COMMENT_NODE = 8
DOCUMENT_NODE = 9
DOCUMENT_TYPE_NODE = 10
DOCUMENT_FRAGMENT_NODE = 11
NOTATION_NODE = 12
if (myNode.getNodeType() == Node.ELEMENT_NODE) {
//process node
…
}
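A common reason to test the node type is that the whitespace used to indent an XML file is stored in the tree as text nodes. The sketch below, assuming the JDK's default (non-validating) parser, counts nodes of a given type recursively; NodeTypeDemo and the sample XML are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeTypeDemo {

    // Hypothetical sample input, indented with line breaks and spaces.
    static final String XML =
            "<countries>\n  <country><name>Israel</name></country>\n</countries>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Counts the nodes of the given type in the subtree rooted at node.
    // Attributes are not children, so they are never visited here.
    static int count(Node node, short type) {
        int total = (node.getNodeType() == type) ? 1 : 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            total += count(c, type);
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println(count(doc, Node.ELEMENT_NODE)); // 3: countries, country, name
        System.out.println(count(doc, Node.TEXT_NODE));    // 3: "Israel" plus two whitespace-only nodes
    }
}
```

Code that expects only elements among the children should therefore filter on Node.ELEMENT_NODE, as the if statement above does, rather than assume the first child is an element.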
DOM – Node Manipulation
Children of a node in a DOM tree can be
manipulated - added, edited, deleted, moved,
copied, etc.
To construct new nodes, use the methods of
Document
• createElement, createAttribute, createTextNode, etc.
To manipulate a node, use the methods of Node:
• appendChild, insertBefore, removeChild, replaceChild,
setNodeValue, cloneNode(boolean deep), etc.
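The following sketch strings several of these methods together. It assumes the JDK's DOM implementation; ManipulationDemo and the element names are illustrative. It builds a small tree from scratch, then compares a shallow clone with a deep clone and replaces and removes a child.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class ManipulationDemo {

    // Builds <countries><country continent="Europe"><name>France</name></country></countries>.
    static Document build() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element countries = doc.createElement("countries");
        doc.appendChild(countries); // the root element is a child of the document

        Element country = doc.createElement("country");
        country.setAttribute("continent", "Europe");
        Element name = doc.createElement("name");
        name.appendChild(doc.createTextNode("France"));
        country.appendChild(name);
        countries.appendChild(country);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = build();
        Element countries = doc.getDocumentElement();
        Element country = (Element) countries.getFirstChild();

        // Shallow clone: the element and its attributes, but no children.
        Element shallow = (Element) country.cloneNode(false);
        System.out.println(shallow.getAttribute("continent")); // Europe
        System.out.println(shallow.hasChildNodes());           // false

        // Deep clone: the element together with its whole subtree.
        Element deep = (Element) country.cloneNode(true);
        System.out.println(deep.getTextContent());             // France

        countries.replaceChild(deep, country); // arguments: new child, then old child
        countries.removeChild(deep);
        System.out.println(countries.hasChildNodes());         // false
    }
}
```

A cloned node has no parent until it is inserted somewhere, which is why the deep clone can be passed to replaceChild directly.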
DOM – Example (Node Manipulation)
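[Figure: replaceChild swaps an Old child node for a New one; cloneNode with deep = false copies the node without its children, while deep = true copies the node together with its subtree]
Figure as appears in “The XML Companion” - Neil Bradley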
examples.xml
DEMO
SAX
• XML is read sequentially
• When a parsing event happens, the parser invokes
the corresponding method of the corresponding
handler
• The handlers are the programmer’s implementations of
standard Java API interfaces and classes
• We won’t cover this type of parser in detail: it is more
complicated to program and is not required for small XML files
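For orientation only, the event-and-handler idea can be sketched as follows, assuming the JDK's built-in SAX parser. SaxDemo and the event labels are hypothetical. The handler extends DefaultHandler and records each callback the parser makes while it reads the input from start to end.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxDemo {

    // Returns a log of the parsing events, in the order the parser reported them.
    static List<String> events(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                log.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver text in several pieces; skip blank ones.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text " + text);
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events("<country><name>Israel</name></country>")) {
            System.out.println(event);
        }
    }
}
```

No tree is kept in memory: once a callback returns, the parser moves on, so any state the application needs has to be collected by the handler itself.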
JAXB Binding
• Marshalling: writing Java objects out as XML
• Unmarshalling: reading XML into Java objects
JAXB Binding
DEMO