XML Technologies

XML
Dr Alexiei Dingli
What is XML?
• XML stands for EXtensible Markup Language
• XML is a markup language much like HTML
• XML was designed to carry data, not to display data
• XML tags are not predefined. You must define your own tags
• XML is designed to be self-descriptive
• XML is a W3C Recommendation
XML vs. HTML
• XML
– Is a meta language
– Focuses on the transport and storage of data
– Is not a replacement for HTML!
• HTML
– Is a vocabulary of SGML
– Focuses on display (formatting)
XML is just pure information
• XML was not designed to do anything by itself ...
• It just structures, stores and transports information
<stickynote>
  <to>Joseph</to>
  <from>Tom</from>
  <body>Purchase tickets!</body>
</stickynote>
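As a rough illustration of how an XML-aware program might pick the note apart, here is a minimal sketch using the xml.etree.ElementTree module from Python's standard library. The variable names are our own and are not part of the slides.

```python
import xml.etree.ElementTree as ET

# The sticky note from the slide, held as a plain string.
NOTE_XML = """
<stickynote>
  <to>Joseph</to>
  <from>Tom</from>
  <body>Purchase tickets!</body>
</stickynote>
"""

# fromstring() parses the text and returns the root element.
note = ET.fromstring(NOTE_XML)

# findtext() returns the text of the first child with the given tag.
recipient = note.findtext("to")
sender = note.findtext("from")
message = note.findtext("body")

print(f"{sender} -> {recipient}: {message}")
```

The XML itself does nothing; it is the program reading it that decides what the data means.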
Format
• The format of a .xml document is plain text
• Only XML-aware applications can interpret it correctly
• But it can easily be viewed and edited by anyone using a simple text editor
Tip: Internet Explorer can be used as a viewer and validator of XML (Eg1, Eg2)
Let’s be creative ...
• The tags in the example above (like <to> and <from>) are not defined in any XML standard. These tags are "invented" by the author of the XML document
• That is because the XML language has no predefined tags; it is a meta language!
• The tags used in HTML (and the structure of HTML) are predefined. HTML documents can only use tags defined in the HTML standard (like <p>, <h1>, etc.)
• XML allows authors to define their own tags and their own document structure
Definition
XML is a software- and hardware-independent tool for carrying information
Note: the specification of the language can be found at http://www.w3.org/XML/
Content Vs. Layout
• Displaying dynamic data in an HTML document takes a lot of work if the HTML has to be edited each time the data changes
• With XML, the data can be stored in separate XML files
• You can then concentrate on using HTML for layout and display, and be sure that changes in the underlying data will not require any changes to the HTML
• With a few lines of JavaScript, one can read an external XML file and update the data content of the HTML
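The slide mentions JavaScript for this step. The same separation of data and layout can be sketched in Python, the language used for the other examples here. The catalog content, the PAGE_TEMPLATE string and the render_page function below are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data; in practice this would live in its own .xml file.
DATA_XML = """
<catalog>
  <cd><title>First Album</title><artist>Some Artist</artist></cd>
  <cd><title>Second Album</title><artist>Another Artist</artist></cd>
</catalog>
"""

# The layout is kept apart from the data and never needs editing
# when the data changes.
PAGE_TEMPLATE = "<html><body><ul>\n{items}\n</ul></body></html>"


def render_page(xml_text: str) -> str:
    """Fill the fixed HTML layout with whatever data the XML holds."""
    root = ET.fromstring(xml_text)
    items = []
    for cd in root.findall("cd"):
        # escape() keeps characters such as < and & safe inside HTML.
        title = escape(cd.findtext("title", default=""))
        artist = escape(cd.findtext("artist", default=""))
        items.append(f"<li>{title} by {artist}</li>")
    return PAGE_TEMPLATE.format(items="\n".join(items))


page = render_page(DATA_XML)
print(page)
```

Adding a third <cd> element to the data changes the output without touching the template.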
Simple data sharing
• Most systems have data in incompatible formats
• XML is stored in plain text, thus it is software/hardware independent
• This makes it much easier to share information
XML is a meta language!
• As such, you can create new languages ...
– XHTML, a reformulation of HTML as an XML language
– WSDL for describing available web services
– WML, the markup language used with WAP on handheld devices
– RSS for news feeds
– RDF and OWL for describing resources and ontologies
– SMIL for describing multimedia for the web
XML Tree (1)
• All XML documents form a tree structure
<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
XML Tree (2)
XML Tree (3)
• Simple example ...
<stickynote>
  <to>Joseph</to>
  <from>Tom</from>
  <body>Purchase tickets!</body>
</stickynote>
XML Tree (4)
• Root element
<stickynote>
• Child elements
<to>
<from>
<body>
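The root and its children can also be listed programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the variable names are our own.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<stickynote><to>Joseph</to><from>Tom</from>"
    "<body>Purchase tickets!</body></stickynote>"
)

# The element returned by the parser is the root of the tree.
root_name = root.tag

# Iterating over an element yields its direct children in document order.
child_names = [child.tag for child in root]

print(root_name)
print(child_names)
```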
XML Tree (5)
Exercise
• Create the tree for ...
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>
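One way to check a hand-drawn tree is to let a program print an outline of the document. The sketch below assumes Python's standard xml.etree.ElementTree module; the outline helper is our own, and the sample is a shortened version of the bookstore so that the exercise is not given away.

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """Return one line per element, indented by its depth in the tree."""
    attrs = "".join(f' {key}="{value}"' for key, value in element.attrib.items())
    lines = ["  " * depth + element.tag + attrs]
    for child in element:
        # Recurse into each child, one level deeper.
        lines.extend(outline(child, depth + 1))
    return lines


sample = ET.fromstring(
    '<bookstore><book category="COOKING">'
    '<title lang="en">Everyday Italian</title>'
    "</book></bookstore>"
)
tree_lines = outline(sample)
print("\n".join(tree_lines))
```

Running the helper on the full bookstore document gives an outline to compare against your own drawing.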
XML Commandments
Commandment 1
For every opening tag, there must be a closing tag
Incorrect: <p>This is a paragraph
Correct: <p>This is a paragraph</p>
Commandment 2
XML Tags are case sensitive
<Message>This is incorrect</message>
<message>This is correct</message>
Commandment 3
XML elements must be properly nested
Incorrect: <b><i>This text is bold and italic</b></i>
Correct: <b><i>This text is bold and italic</i></b>
Commandment 4
XML Documents must have a root element
<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
Commandment 5
XML attribute values must be quoted
Incorrect: <stickynote date=1/10/2008>
Correct: <stickynote date="12/11/2007">
Commandment 6
Some characters have special meaning in XML
Shortcut   Symbol   Meaning
&lt;       <        less than
&gt;       >        greater than
&amp;      &        ampersand
&apos;     '        apostrophe
&quot;     "        quotation mark
<message>Meet me at Tom's place</message>
<message>Meet me at Tom&apos;s place</message>
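When XML is generated by a program, the shortcuts are usually inserted by a helper rather than by hand. The sketch below uses escape() from Python's standard xml.sax.saxutils module. By default it handles only &, < and >, so the apostrophe and quotation mark are passed in explicitly. The sample sentence is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "Fish & chips cost < 5 euro at Tom's place"

# escape() replaces &, < and > itself; the extra mapping adds ' and ".
safe = escape(raw, {"'": "&apos;", '"': "&quot;"})
print(safe)

# A parser turns the shortcuts back into the original characters.
parsed_text = ET.fromstring(f"<message>{safe}</message>").text
print(parsed_text)
```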
Commandment 7
Comments in XML
<!-- This is a comment -->
Elements vs. Attributes
<book category="CHILDREN">
  <title>Harry Potter</title>
</book>
• book is an element
– which can contain
• other elements (such as title)
• or text content (such as Harry Potter in title)
• category is an attribute
– whose value is CHILDREN
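The distinction also shows up in how a program reads the two. This is a minimal sketch with Python's standard xml.etree.ElementTree module; the variable names are our own.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    '<book category="CHILDREN"><title>Harry Potter</title></book>'
)

# Attributes are looked up by name on the element that carries them.
category = book.get("category")

# Child elements are found by tag; their text content is in .text.
title = book.find("title").text

print(category, title)
```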
What’s in a name?
• Naming rules ...
– Names can contain letters, numbers and other characters
– Names must not start with a number or punctuation character
– Names must not start with the letters xml (or XML, or Xml, etc)
– Names cannot contain spaces
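The rules above can be approximated with a small checker. This is a deliberately simplified sketch, not the full XML name grammar: it accepts only ASCII letters, digits, underscores, hyphens and periods, whereas real XML also allows colons and many non-ASCII letters. The function name is_reasonable_name is our own.

```python
import re

# Simplified assumption: ASCII only, starting with a letter or underscore.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_reasonable_name(name: str) -> bool:
    """Apply the naming rules from the slide to a candidate element name."""
    # Names must not start with "xml" in any mix of upper and lower case.
    if name.lower().startswith("xml"):
        return False
    # fullmatch() rejects spaces and a leading digit or punctuation mark.
    return _NAME_PATTERN.fullmatch(name) is not None


for candidate in ["first_name", "1st", "xmlData", "first name"]:
    print(candidate, is_reasonable_name(candidate))
```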
Best (Name) Practices
• Make names descriptive. Names with an underscore separator are nice: <first_name>, <last_name>.
• Names should be short and simple, like this: <book_title> not like this: <the_title_of_the_book_which_i_am_currently_reading>.
• Avoid "-" characters. If you name something "first-name", some software may think you want to subtract name from first.
• Avoid "." characters. If you name something "first.name", some software may think that "name" is a property of the object "first".
• Avoid ":" characters. Colons are reserved to be used for something called namespaces.
• XML documents often have a corresponding database. A good practice is to use the naming rules of your database for the elements in the XML documents.
• Non-English letters like éòá are perfectly legal in XML, but watch out for problems if your software vendor doesn't support them.
More into attributes ...
• Generally used to provide additional information that is not part of the data
• Put attribute values in quotes; for a quote within a quoted value, use &quot;
• Some limitations of attributes
– attributes cannot contain multiple values
– attributes cannot contain tree structures
– attributes are not easily expandable in future
• If in doubt use elements
Spot the difference ...
<note date="10/01/2008">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
<note>
  <date>10/01/2008</date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
<note>
  <date>
    <day>10</day>
    <month>01</month>
    <year>2008</year>
  </date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
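The practical difference between the three notes becomes visible when a program reads the date. The sketch below uses Python's standard xml.etree.ElementTree module on shortened versions of the three notes; the variable names are our own.

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring('<note date="10/01/2008"><to>Tove</to></note>')
as_element = ET.fromstring("<note><date>10/01/2008</date><to>Tove</to></note>")
as_structure = ET.fromstring(
    "<note><date><day>10</day><month>01</month><year>2008</year></date>"
    "<to>Tove</to></note>"
)

# Forms 1 and 2 both yield one string that still has to be interpreted:
# is 10/01 the tenth of January or the first of October?
date_1 = as_attribute.get("date")
date_2 = as_element.findtext("date")

# Form 3 names each part, so nothing has to be guessed.
date = as_structure.find("date")
day = int(date.findtext("day"))
month = int(date.findtext("month"))
year = int(date.findtext("year"))

print(date_1, date_2, (year, month, day))
```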
Well formed documents ...
1. XML documents must have a root
element
2. XML elements must have a closing tag
3. XML tags are case sensitive
4. XML elements must be properly nested
5. XML attribute values must be quoted
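A parser enforces these rules for you. As a minimal sketch, the helper below (the name is_well_formed is our own) asks Python's standard xml.etree.ElementTree parser to read a document and reports whether it succeeded. The samples reuse the snippets from the commandments.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


samples = [
    "<a/><b/>",                                    # no single root element
    "<p>This is a paragraph",                      # missing closing tag
    "<Message>This is incorrect</message>",        # case mismatch
    "<b><i>This text is bold and italic</b></i>",  # improper nesting
    "<stickynote date=1/10/2008></stickynote>",    # unquoted attribute value
    '<stickynote date="12/11/2007"></stickynote>', # well formed
]
results = [is_well_formed(sample) for sample in samples]
print(results)
```

Only the last sample passes; each of the others breaks exactly one of the five rules.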
Valid documents ...
• A valid document is a "Well Formed" XML document which also conforms to the rules of a Document Type Definition (DTD)
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
…
</note>
Example DTD
• A DTD is used to define the structure of an XML document, but it is not itself written in XML!
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
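Python's standard library does not include a DTD validator, so the following is only a hand-written check that mirrors the content model of this particular DTD. It is a sketch, not real validation: it ignores stray text between the child elements and any attributes. The function name follows_note_structure is our own.

```python
import xml.etree.ElementTree as ET

# Mirrors <!ELEMENT note (to,from,heading,body)>: these children, in this order.
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]


def follows_note_structure(xml_text: str) -> bool:
    """Check a document against the structure the note DTD describes."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not even well formed
    if root.tag != "note":
        return False
    if [child.tag for child in root] != EXPECTED_CHILDREN:
        return False
    # (#PCDATA) means text only, so no child may contain further elements.
    return all(len(child) == 0 for child in root)


good = (
    "<note><to>Tove</to><from>Jani</from>"
    "<heading>Reminder</heading><body>Hello</body></note>"
)
missing_heading = "<note><to>Tove</to><from>Jani</from><body>Hello</body></note>"
print(follows_note_structure(good), follows_note_structure(missing_heading))
```

Note that the second document is well formed but still fails the check, which is the difference between well-formed and valid.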
Example XML Schema
• An XML Schema (XSD) is an XML-based alternative to a DTD
<xs:element name="note">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="to" type="xs:string"/>
      <xs:element name="from" type="xs:string"/>
      <xs:element name="heading" type="xs:string"/>
      <xs:element name="body" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
Errors!
• Errors in XML documents will stop your XML applications
• XML software should be small, fast, and compatible
• HTML browsers will display documents with errors (like missing end tags)
• HTML browsers are big and incompatible because they have a lot of unnecessary code to deal with (and display) HTML errors
XML Viewing
• Just use a normal browser ...
– simple.xml
– cd_catalog.xml
– plant_catalog.xml
Better XML Viewing
• Just use a Cascading Style Sheet (CSS)
• Without CSS
• With CSS
The CSS
CATALOG { background-color: #ffffff; width: 100%; }
CD { display: block; margin-bottom: 30pt; margin-left: 0; }
TITLE { color: #FF0000; font-size: 20pt; }
ARTIST { color: #0000FF; font-size: 20pt; }
COUNTRY,PRICE,YEAR,COMPANY { display: block; color: #000000; margin-left: 20pt; }
Even better XML viewing
• Use XSLT
– XSLT is the recommended style sheet language of XML
– XSLT (eXtensible Stylesheet Language Transformations) is far more sophisticated than CSS
– One way to use XSLT is to transform XML into HTML before it is displayed
XSL Example
• The XML
• The XSL
• The result
Exercise
• Amazon has just commissioned you to create an XML file for the following book:
– Title: A.I. a modern approach
– Author: Russell and Norvig
– Publisher: Prentice Hall
– Date of Publication: 2000
– ISBN: 1234567
– Dimensions: 10 x 5
– Number of Pages: 500
– Comments: 2 in store 1, 3 in store 2
– Review: Quite interesting!
– Image: http://www.amazon.com/AIBook