Intro to XML

Understanding XML
An Introduction to XML
Sandeep Bhattaram
Summary of Introduction *


HTML was designed to ‘Display’ and
format data based on ‘Syntax’
XML is designed to ‘Describe’ and
structure data based on ‘Semantics’
Types of ‘Data’ *

Structured Data

Semi-Structured Data

Unstructured Data
Semi-Structured Data
- An Example


Schema information
is mixed with data
objects and values
No predefined
schema for the data
to conform to.
Unstructured Data


No structure for the data to conform to.
Example : HTML
<table>
<TR>
<TD> XML Class </TD>
<TD> very interesting</TD>
</TR>
</table>
Basics of XML *




XML stands for eXtensible Markup
Language
XML is a markup language
Users define their own tags in XML
XML is Self-Descriptive
Basic example of XML
<note>
<to>Everyone</to>
<from>Sandeep</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>


XML doesn’t “DO” anything
With XML your data is stored
outside your HTML
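
A minimal sketch of the point that XML does not "do" anything: the note only carries data, and a program has to read it. The sketch assumes Python's standard xml.etree.ElementTree module, which the slides do not mention.

```python
import xml.etree.ElementTree as ET

# The note from the slide, held as a plain string.
NOTE_XML = """<note>
<to>Everyone</to>
<from>Sandeep</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

note = ET.fromstring(NOTE_XML)

# The tag names describe the data; the program decides what to do with it.
fields = {child.tag: child.text for child in note}
print(fields["to"], "<-", fields["from"])
```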
Basics Contd..


Exchange of data between incompatible
systems via XML
XML can be used to Store Data
Definition:
XML is a cross-platform, software and
hardware independent tool for
describing and transmitting information.
XML Syntax *

The syntax rules of XML are very
simple, self-describing & very strict.
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Everyone</to>
<from>Sandeep</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
XML Syntax Rules

XML Data Model has TWO structuring
concepts
1. Elements
2. Attributes
XML Syntax Rules Contd...
All XML elements must have a closing tag
XML tags are case sensitive
All XML elements must be properly nested
NOT -> <B> <I> abc </B> </I>
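
One way to try these rules out is to hand small fragments to a parser and see which ones it rejects. This is only a sketch using Python's standard xml.etree.ElementTree; the helper name is_well_formed is made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<B><I>abc</I></B>"))  # properly nested
print(is_well_formed("<B><I>abc</B></I>"))  # overlapping tags
print(is_well_formed("<note>abc</Note>"))   # closing tag differs in case
print(is_well_formed("<note>abc"))          # closing tag missing
```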

XML Syntax Rules Contd...

All XML documents must have a
root element – Tree Model
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
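
The tree model can be made visible by walking the parsed document from the root down. A small sketch, assuming Python's standard xml.etree.ElementTree; the function name depth_of is hypothetical.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<root><child><subchild>text</subchild></child></root>")

def depth_of(element, level=0):
    """Yield (tag, depth) pairs for the element and all its descendants."""
    yield element.tag, level
    for child in element:
        yield from depth_of(child, level + 1)

# Print one tag per line, indented by its depth in the tree.
for tag, level in depth_of(doc):
    print("  " * level + tag)
```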
XML Syntax Rules Contd...

Attribute values must always be
quoted
<?xml version="1.0" encoding="ISO-8859-1"?>
<note date="12/11/2002">
<to>Everyone</to>
<from>Sandeep</from>
</note>
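
The quoting rule can be checked the same way: a parser reads a quoted attribute value but rejects an unquoted one. A sketch using Python's standard xml.etree.ElementTree; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

quoted = '<note date="12/11/2002"><to>Everyone</to><from>Sandeep</from></note>'
unquoted = '<note date=12/11/2002><to>Everyone</to><from>Sandeep</from></note>'

# A quoted attribute value is available through get().
note = ET.fromstring(quoted)
print(note.get("date"))

# An unquoted value makes the document not well formed.
try:
    ET.fromstring(unquoted)
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False
print(unquoted_ok)
```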

Comments in XML
<!-- This is a comment -->
XML Elements *

Elements classified w.r.t. contents:
1. element content
2. mixed content
3. simple content
4. empty content
XML Elements Contd...
Element – content example
<book>
<title>My First XML</title>
<prod id="33-657" media="paper"></prod>
<chapter>Introduction to XML
<para>What is HTML</para>
<para>What is XML</para>
</chapter>
</book>
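
The four content kinds can be told apart by looking at each element's text and children. The classifier below is a rough sketch over the book example, assuming Python's standard xml.etree.ElementTree; content_kind is a hypothetical helper and it treats whitespace-only text as no text.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring("""<book>
<title>My First XML</title>
<prod id="33-657" media="paper"></prod>
<chapter>Introduction to XML
<para>What is HTML</para>
<para>What is XML</para>
</chapter>
</book>""")

def content_kind(element):
    """Classify an element as element, mixed, simple, or empty content."""
    has_children = len(element) > 0
    # Text may sit before the first child or after any child (its tail).
    has_text = bool((element.text or "").strip()) or any(
        (child.tail or "").strip() for child in element
    )
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element"
    if has_text:
        return "simple"
    return "empty"

kinds = {el.tag: content_kind(el) for el in book.iter()}
print(kinds)
```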

XML Elements Contd...

XML Elements are Extensible
<note>
<to>Everyone</to>
<from>Sandeep</from>
<body>Don't forget me this weekend!</body>
</note>
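If <date>2004-04-08</date> is later added inside <note>, will the application crash? It should not, as long as it only reads the elements it already knows (<to>, <from>, <body>).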
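
A sketch of why an extra element need not break a reader: the function below looks up only the elements it knows by name, so a <date> child added to the note is simply ignored. It assumes Python's standard xml.etree.ElementTree; summarize is a hypothetical stand-in for "the application".

```python
import xml.etree.ElementTree as ET

ORIGINAL = """<note>
<to>Everyone</to>
<from>Sandeep</from>
<body>Don't forget me this weekend!</body>
</note>"""

# Same note with an extra <date> child added later.
EXTENDED = ORIGINAL.replace("<note>", "<note>\n<date>2004-04-08</date>")

def summarize(xml_text):
    """Read only the elements this application knows about."""
    note = ET.fromstring(xml_text)
    return "To {} from {}: {}".format(
        note.findtext("to"), note.findtext("from"), note.findtext("body")
    )

print(summarize(ORIGINAL))
print(summarize(EXTENDED))
```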
XML Element Naming Rules
XML elements must follow these basic naming rules:
Names can contain letters, numbers, and other characters
Names must not start with a number or punctuation character
Names must not start with the letters xml (or XML or Xml ..)
Names cannot contain spaces
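
The naming rules can be approximated with a short check. This is a deliberately simplified, ASCII-only sketch (real XML names also allow many Unicode letters and the colon); the function name and the regular expression are assumptions made for illustration.

```python
import re

# First character: letter or underscore; then letters, digits, '_', '.', '-'.
_SIMPLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

def looks_like_valid_name(name: str) -> bool:
    """Apply the slide's naming rules to an ASCII element name."""
    if name.lower().startswith("xml"):
        return False  # names must not start with xml, XML, Xml, ...
    return _SIMPLE_NAME.fullmatch(name) is not None

for candidate in ["note", "first-name", "1note", "-name", "xmlNote", "my note"]:
    print(candidate, looks_like_valid_name(candidate))
```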
XML Attributes *



<img src="computer.gif"> - HTML
<person sex="female"> - XML
So is this right??
<note day="12" month="11" year="2002"
to="Everyone" from="Sandeep"
heading="Reminder" body="Don't forget me
this weekend!"> </note>
XML Attributes Contd...


NO!
Use child elements:
1. <date>12/11/2002</date>
2. <date>
<day>12</day> <month>11</month>
<year>2002</year> </date>
Not <date day="12" month="11"...></date>
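
With the date split into child elements, each part can be read by name and converted. A small sketch, assuming Python's standard xml.etree.ElementTree and datetime modules:

```python
import xml.etree.ElementTree as ET
from datetime import date

date_el = ET.fromstring(
    "<date><day>12</day> <month>11</month><year>2002</year></date>"
)

# Each child element carries one part of the date.
parsed = date(
    int(date_el.findtext("year")),
    int(date_el.findtext("month")),
    int(date_el.findtext("day")),
)
print(parsed.isoformat())
```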
XML Attributes Contd...
Problems using attributes





attributes cannot contain multiple values
attributes are not easily expandable
attributes cannot describe structures
attributes are more difficult to manipulate by
program code
attribute values are not easy to test against a
Document Type Definition (DTD) - which is used to
define the legal elements of an XML document
When do we use attributes ?
XML Documents *


Basic object in XML
Types:
1. Data-Centric
2. Document-Centric
3. Hybrid XML Documents
Document Declaration
<?xml version="1.0" encoding="ISO-8859-1"
standalone="yes"?>
XML Document Type Definition


Well Formed XML Document = syntax
Valid XML Document
= Well Formed XML Document +
Conforms to DTD/XSD rules.
Definition
A DTD defines the legal elements of an XML
document.
XML DTD Contd...


Inline DTD Document
<!DOCTYPE root-element [element-declarations]>
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note> <to>Everyone</to> <from>Sandeep</from>
<heading>Reminder</heading> <body>Don't forget me this
weekend</body> </note>
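
Python's standard xml.etree.ElementTree parses a document with an inline DTD but does not validate against it, so the sketch below re-checks the content model (to,from,heading,body) by hand. It is an illustration of what validity means here, not a DTD validator; matches_note_model is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note> <to>Everyone</to> <from>Sandeep</from>
<heading>Reminder</heading> <body>Don't forget me this
weekend</body> </note>"""

EXPECTED_CHILDREN = ["to", "from", "heading", "body"]

def matches_note_model(note):
    """Hand-written check mirroring the DTD: exact children, text only."""
    same_order = [child.tag for child in note] == EXPECTED_CHILDREN
    text_only = all(len(child) == 0 for child in note)
    return note.tag == "note" and same_order and text_only

print(matches_note_model(ET.fromstring(DOC)))
print(matches_note_model(ET.fromstring("<note><from>Sandeep</from></note>")))
```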
XML DTD Contd...

External DTD
<!DOCTYPE root-element SYSTEM "filename">


<!DOCTYPE note SYSTEM "note.dtd">
<note>... </note>
<!DOCTYPE note SYSTEM
"http://www.uark.edu/dtd/note.dtd">
<note> … </note>
XML DTD Contd...

Building blocks of XML DTD
1. Elements
2. Tags
3. Attributes
4. Entities - &lt; &gt; &amp; &quot; &apos; (standing for <, >, &, ", ')
5. PCDATA – Parsed Character DATA
6. CDATA – Character Data
XML DTD Contd...

DTD Element Declarations (w.r.t. content)
1. Empty - <!ELEMENT element-name EMPTY>
2. Character - <!ELEMENT element-name (#PCDATA)>
3. Any - <!ELEMENT element-name ANY>
4. Children -
<!ELEMENT element-name
(child-element-name,child-element-name,.....)>

Occurrence markers on a child name: + (required, multivalued), * (optional, multivalued), ? (optional, single-valued), no marker (required, single-valued). The | symbol separates alternatives (or).
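
The occurrence markers boil down to a minimum and maximum count per child. The table and helper below are a sketch with made-up names (OCCURS, count_allowed); None stands for "no upper limit".

```python
# Marker -> (minimum, maximum) number of occurrences; "" means no marker.
OCCURS = {
    "": (1, 1),      # required, single-valued
    "?": (0, 1),     # optional, single-valued
    "+": (1, None),  # required, multivalued
    "*": (0, None),  # optional, multivalued
}

def count_allowed(marker: str, count: int) -> bool:
    """Return True if `count` occurrences satisfy the given marker."""
    low, high = OCCURS[marker]
    return count >= low and (high is None or count <= high)

print(count_allowed("+", 0))
print(count_allowed("*", 0))
print(count_allowed("?", 2))
```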
XML DTD Contd...

XML DTD Attribute
Declaration
<!ATTLIST element-name
attribute-name attribute-type
default-value>


Attribute Types
Default values: value, #REQUIRED,
#IMPLIED, #FIXED "value"
XML DTD Contd...
Limitations of DTD:
1. Data types in DTD are not very general
2. DTD needs specialized processors
3. Unordered elements are not permitted
XML SCHEMA *


XML Schema is used to structure the XML
Document into ‘legal’ blocks.
Advantages:
1. Supports data types
2. Written in XML
3. Facilitates secure data communication.
XML Schema – Key Points
XML Schema defines
Elements
Attributes
What are the data types of elements and attributes
Number of children, copies for an element
Which elements are children, or have text, or are empty
Order of children
XML Schema-Key Points contd
XML Schema supports
Name Spaces
df/f:note, df/f:note xmlns:df/f = “www…..”
Data Types
Extensible to future additions
XML Schema is a W3C Recommendation now!
XML SCHEMA- Example XML file
and DTD
XML SCHEMA - Example Schema
XML Schema – Reference to DTD,
XML schema
XML Schema - Simple Elements *



A simple element is an XML element that can
contain only text
<xs:element name="xxx" type="yyy"/>
Example - <lastname>Refsnes</lastname>
<xs:element name="lastname" type="xs:string"/>

XML Schema Datatypes:
xs:string,xs:decimal,xs:integer,xs:boolean,xs:date,xs:time


<xs:element …… default="aaa"/>
<xs:element …… fixed="bbb"/>
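
To give a feel for what the built-in datatypes mean, the sketch below maps each one to a Python conversion. The mapping is an approximation made up for illustration (for example, Python's int accepts some strings that xs:integer does not), and CONVERTERS and convert are hypothetical names.

```python
from datetime import date, time
from decimal import Decimal

def parse_boolean(text: str) -> bool:
    """xs:boolean accepts the literals true, false, 1 and 0."""
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise ValueError("not an xs:boolean literal: " + repr(text))

# Approximate Python counterparts of the datatypes listed on the slide.
CONVERTERS = {
    "xs:string": str,
    "xs:decimal": Decimal,
    "xs:integer": int,
    "xs:boolean": parse_boolean,
    "xs:date": date.fromisoformat,
    "xs:time": time.fromisoformat,
}

def convert(xs_type: str, text: str):
    """Convert element text according to its declared type."""
    return CONVERTERS[xs_type](text)

print(convert("xs:string", "Refsnes"))
print(convert("xs:date", "2004-04-08"))
```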
XML Schema – Attributes *






All attributes are declared as simple types.
Only complex elements can have attributes!
<xs:attribute name="xxx" type="yyy"/>
Example -<lastname lang="EN">Smith</lastname>
<xs:attribute name="lang" type="xs:string"/>
fixed = " ", default = " "
use = "optional" or "required"
XML Schema – Facets *

Facets are restrictions applied on Elements and
Attributes
Restrictions on Datatypes
Facet: Constraint used
1. Single value: minInclusive, maxInclusive, etc.
2. Set of values: enumeration
3. Series of values: pattern
4. White space: whiteSpace
5. Length: length, minLength, maxLength, etc.
XML Schema – Single value Facet,
Facet on set of values
XML Schema – Facets on Series
of values


Pattern constraint – [],[][]..,([][]..)*, ([][]..)+, ([]|[]|..),
[] {}
Example:
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-z]"/> <!-- or [a-zA-Z] -->
</xs:restriction>
</xs:simpleType>
</xs:element>
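
A pattern facet must match the whole value, which corresponds to a full match in a regular expression. The sketch below uses Python's re module as a stand-in; XML Schema's regular expression dialect differs from Python's in some details, though not for simple character classes like these. matches_pattern is a hypothetical helper.

```python
import re

def matches_pattern(value: str, pattern: str) -> bool:
    """Return True if the entire value matches the pattern."""
    return re.fullmatch(pattern, value) is not None

print(matches_pattern("a", "[a-z]"))      # one lowercase letter
print(matches_pattern("A", "[a-z]"))      # uppercase is rejected
print(matches_pattern("ab", "[a-z]"))     # two letters are rejected
print(matches_pattern("A", "[a-zA-Z]"))   # the wider class accepts it
```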
XML Schema – Facets on White
space characters



<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="preserve"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:whiteSpace value="replace"/>
<xs:whiteSpace value="collapse"/>
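
The three whiteSpace values can be sketched as a small function: preserve leaves the value alone, replace turns each tab, line feed and carriage return into a space, and collapse additionally squeezes runs of spaces and trims both ends. The function name apply_white_space is an assumption for illustration.

```python
import re

def apply_white_space(value: str, mode: str) -> str:
    """Normalize a value the way the whiteSpace facet describes."""
    if mode == "preserve":
        return value
    # replace: every tab, line feed and carriage return becomes a space
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if mode == "replace":
        return replaced
    if mode == "collapse":
        # collapse: also squeeze runs of spaces and trim both ends
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError("unknown whiteSpace mode: " + mode)

raw = "  12\tMain\n Street  "
for mode in ("preserve", "replace", "collapse"):
    print(repr(apply_white_space(raw, mode)))
```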
XML Schema – Facets on Length


Constraints = length, minLength, maxLength
Example:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
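
The length facets reduce to simple comparisons on the number of characters. A sketch with a hypothetical helper check_length, where any facet left as None is not applied:

```python
def check_length(value, length=None, min_length=None, max_length=None):
    """Return True if the value satisfies every length facet given."""
    n = len(value)
    if length is not None and n != length:
        return False
    if min_length is not None and n < min_length:
        return False
    if max_length is not None and n > max_length:
        return False
    return True

print(check_length("s3cretpw", length=8))   # exactly 8 characters
print(check_length("short", length=8))      # too short
print(check_length("short", min_length=5, max_length=8))
```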
XML Schema –
Facets for Datatypes
XML Schema –Complex Elements*


A complex element is an XML element that
contains other elements and/or attributes.
There are four kinds of complex elements:
1. empty elements
2. elements that contain only other elements
3. elements that contain only text
4. elements that contain both other elements and text
XML Schema – Complex Elements
example – Elements only

EXAMPLE:
<employee> <firstname> .. </firstname>
<lastname>..</lastname>
</employee>
<xs:element name="employee" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
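
An instance that fits the personinfo type can be built programmatically by adding the children in the order the sequence prescribes. A sketch using Python's standard xml.etree.ElementTree; the names John and Smith are placeholder values, not from the slides.

```python
import xml.etree.ElementTree as ET

# Build <employee> with its children in the order given by xs:sequence.
employee = ET.Element("employee")
ET.SubElement(employee, "firstname").text = "John"   # placeholder value
ET.SubElement(employee, "lastname").text = "Smith"   # placeholder value

xml_text = ET.tostring(employee, encoding="unicode")
print(xml_text)
```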
XML Schema – Complex Elements
example – Empty elements only

EXAMPLE:
<product prodid="1345" />
<xs:element name="product" type="prodtype"/>
<xs:complexType name="prodtype">
<xs:attribute name="prodid" type="xs:positiveInteger"/>
</xs:complexType>

Similarly for text-only elements and mixed elements.
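
What the prodtype declaration asks of an instance can be sketched by hand: no child elements, no text, and a prodid that, when present, is a positive integer. This is an approximation for illustration only (it ignores, for example, the rule that no other attributes are allowed); is_valid_product is a hypothetical helper, not a schema validator.

```python
import re
import xml.etree.ElementTree as ET

def is_valid_product(element) -> bool:
    """Approximate the prodtype rules for a <product> element."""
    if element.tag != "product":
        return False
    if len(element) > 0 or (element.text or "").strip():
        return False  # an empty element has no children and no text
    prodid = element.get("prodid")
    if prodid is None:
        return True   # attributes are optional unless use="required"
    # xs:positiveInteger: digits only (optional leading +), greater than 0
    return re.fullmatch(r"\+?[0-9]+", prodid) is not None and int(prodid) > 0

print(is_valid_product(ET.fromstring('<product prodid="1345" />')))
print(is_valid_product(ET.fromstring('<product prodid="-7" />')))
```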
XML Schema – Indicators *





Indicators control how elements
are used in documents.
Order Indicators: all, choice, sequence
Occurrence Indicators:
maxOccurs, minOccurs
Group Indicators:
Group name, attributeGroup name
See Text Book Example.
XML Documents & Databases
Review of Topics Covered






Introduction
Types of Data
XML Basics
XML DTD
XML Schema
XML and Databases
References



Database Management Systems,
Chapter 26
www.w3.org
www.w3schools.com