IDK0040 Võrgurakendused I (Network Applications I)
XML
Deniss Kumlander

XML intro
• XML stands for eXtensible Markup Language
• XML is a markup language much like HTML and was designed to describe data
• XML tags are not predefined, so developers can define their own tags
• XML uses either a Document Type Definition (DTD) or an XML Schema to describe the structure of a document's tags and their restrictions
• XML is a W3C Recommendation

Use
• Exchange data
• Store data
• Make data platform independent, i.e. reach a broader range of clients

Syntax
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <root>
      <child>
        <subchild>.....</subchild>
        <subchild>.....</subchild>
      </child>
      <child>
        <subchild>.....</subchild>
        <subchild>.....</subchild>
      </child>
    </root>

Example
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <mail>
      <note>
        <to>IDK0040</to>
        <from>TTU</from>
        <heading>Reminder</heading>
        <body>Don't forget to be at lectures!</body>
      </note>
      <note>
        <to>IDK0040</to>
        <from>TTU</from>
        <heading>Reminder 2</heading>
        <body>Exams are close!</body>
      </note>
    </mail>

Attributes
    <note date="31.12.2006">
      <to>IDK0040</to>
      <from>TTU</from>
      <heading>Reminder</heading>
      <body>Don't forget to be at lectures!</body>
    </note>

Avoid using attributes?
• Should we avoid using attributes?
Some of the problems with using attributes are:
• attributes cannot contain multiple values (child elements can)
• attributes are not easily expandable (for future changes)
• attributes cannot describe structures (child elements can)
• attributes are more difficult to manipulate by program code
• attribute values are not easy to test against a Document Type Definition (DTD), which is used to define the legal elements of an XML document

Initial validation
• XML documents must have a root element
• XML elements must have a closing tag
• XML tags are case sensitive
• XML elements must be properly nested
• XML attribute values must always be quoted

XML and CSS
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <?xml-stylesheet type="text/css" href="cd_catalog.css"?>
    <CATALOG>
      <CD>
        <TITLE>Empire Burlesque</TITLE>
        <ARTIST>Bob Dylan</ARTIST>
        <COUNTRY>USA</COUNTRY>
        <COMPANY>Columbia</COMPANY>
        <PRICE>10.90</PRICE>
        <YEAR>1985</YEAR>
      </CD>
      <CD>
        <TITLE>Hide your heart</TITLE>
        <ARTIST>Bonnie Tyler</ARTIST>
        <COUNTRY>UK</COUNTRY>
        <COMPANY>CBS Records</COMPANY>
        <PRICE>9.90</PRICE>
        <YEAR>1988</YEAR>
      </CD>
    </CATALOG>

cd_catalog.css:
    CATALOG { background-color: #ffffff; width: 100%; }
    CD { display: block; margin-bottom: 30pt; margin-left: 0; }
    TITLE { color: #FF0000; font-size: 20pt; }
    ARTIST { color: #0000FF; font-size: 20pt; }
    COUNTRY,PRICE,YEAR,COMPANY { display: block; color: #000000; margin-left: 20pt; }

XML Data Embedded in HTML ("XML Data Island")
IE only

note.xml:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>

The <xml> element only tells the browser about the XML file (it links the file); the data is actually used later, in the table.
    <html>
    <body>
    <xml id="note" src="note.xml"></xml>
    ...
    <table border="1" datasrc="#note">
      <tr>
        <td><span datafld="to"></span></td>
        <td><span datafld="from"></span></td>
      </tr>
    </table>
    ...
    </body>
    </html>

XML Namespaces
• Since element names in XML are not predefined, a name conflict occurs when two different documents use the same element name for different things, or when an element name coincides with an HTML tag (for example <table>)

XML Namespaces
    <f:table>
      <f:name>Work Desk</f:name>
      <f:width>700</f:width>
      <f:length>1200</f:length>
    </f:table>
Here the prefix f stands for "furniture", to differentiate this table from any other kind of table.

XML Namespaces
    <f:table xmlns:f="http://www.ttu.ee/furniture">
      <f:name>African Coffee Table</f:name>
      <f:width>80</f:width>
      <f:length>120</f:length>
    </f:table>

    <table xmlns="http://www.ttu.ee/furniture">
      <name>African Coffee Table</name>
      <width>80</width>
      <length>120</length>
    </table>
• Instead of using only prefixes, we have added an xmlns attribute to the <table> tag to give the prefix a qualified name associated with a namespace. In the second form, xmlns without a prefix declares a default namespace for the element and its unprefixed children.
• When a namespace is defined in the start tag of an element, all child elements with the same prefix are associated with the same namespace.
• Note that the address used to identify the namespace is not used by the parser to look up information. Its only purpose is to give the namespace a unique name. However, companies very often use the namespace as a pointer to a real Web page containing information about the namespace.

XML schema description
• DTD: Document Type Definition
• XML Schemas: an XML-based alternative to DTD

DTD
If the DTD is included in your XML source file, it should be wrapped in a DOCTYPE definition with the following syntax:
• internal
    <!DOCTYPE root-element [element-declarations]>
• external
    <!DOCTYPE root-element SYSTEM "filename">

Internal DTD Example
    <?xml version="1.0"?>
    <!DOCTYPE note [
      <!ELEMENT note (to,from,heading,body)>
      <!ELEMENT to (#PCDATA)>
      <!ELEMENT from (#PCDATA)>
      <!ELEMENT heading (#PCDATA)>
      <!ELEMENT body (#PCDATA)>
    ]>
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend</body>
    </note>

Why use a DTD?
• With a DTD, each of your XML files can carry a description of its own format with it.
• With a DTD, independent groups of people can agree to use a common DTD for interchanging data.
• Your application can use a standard DTD to verify that the data you receive from the outside world is valid.
• You can also use a DTD to verify your own data.

DTD: The building blocks
• Elements: the main building blocks of both XML and HTML documents, i.e. tags.
• Attributes: provide extra information about elements.
• Entities: variables used to define common text. Entity references are references to entities. Most of you will know the HTML entity reference "&nbsp;".
• PCDATA: parsed character data.
• CDATA: also character data, but text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.

DTD: Elements
Declared as
    <!ELEMENT element-name category>
or
    <!ELEMENT element-name (element-content)>
• Empty element (has no content; compare the HTML tag br):
    <!ELEMENT element-name EMPTY>
• Character data:
    <!ELEMENT element-name (#PCDATA)>

DTD: Elements
• With children:
    <!ELEMENT element-name (child-element-name)>
  or
    <!ELEMENT element-name (child-element-name,child-element-name,.....)>
• Example:
    <!ELEMENT note (to,from,heading,body)>
When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document. In a full declaration, the children must also be declared, and the children can also have children.
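The sequence rule above can be tried out with a small Python sketch. This is not DTD validation (Python's standard library does not validate against DTDs); it is a hand-written check that only mirrors the declaration <!ELEMENT note (to,from,heading,body)>. The function name children_in_declared_order and the two sample strings are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Child order required by <!ELEMENT note (to,from,heading,body)>
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]


def children_in_declared_order(xml_text, expected=EXPECTED_CHILDREN):
    """Return True if the root's children match the declared sequence exactly."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root] == list(expected)


valid_note = (
    "<note><to>Tove</to><from>Jani</from>"
    "<heading>Reminder</heading><body>Don't forget me this weekend</body></note>"
)
# Same elements, but <from> and <to> are swapped
swapped_note = (
    "<note><from>Jani</from><to>Tove</to>"
    "<heading>Reminder</heading><body>Don't forget me this weekend</body></note>"
)

print(children_in_declared_order(valid_note))    # True
print(children_in_declared_order(swapped_note))  # False
```

A validating parser would reject the second document for the same reason: the children are all present, but not in the declared order.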
DTD: Element
• Exactly one occurrence (must occur, and only once):
    <!ELEMENT element-name (child-name)>
    <!ELEMENT note (message)>
• Minimum one occurrence (can be more than 1):
    <!ELEMENT element-name (child-name+)>
    <!ELEMENT note (message+)>
• Zero or more occurrences:
    <!ELEMENT element-name (child-name*)>
    <!ELEMENT note (message*)>
• Zero or one occurrence:
    <!ELEMENT element-name (child-name?)>
    <!ELEMENT note (message?)>
• Either one or another:
    <!ELEMENT note (to,from,header,(message|body))>

DTD: Attributes
• Declaration
    <!ATTLIST element-name attribute-name attribute-type default-value>
    <!ATTLIST payment type CDATA "check">
• Attribute-type can be, for example: CDATA, an enumerated list (v1|v2|...), ID, IDREF, NMTOKEN, ENTITY
• Default-value can be: a literal value, #REQUIRED, #IMPLIED, or #FIXED value

DTD: Entity
An entity can be seen as a defined constant
• Syntax:
    <!ENTITY entity-name "entity-value">
• DTD Example:
  – Define
    <!ENTITY writer "Leo Võhandu">
    <!ENTITY copyright "TTU">
  – Use
    <author>&writer; &copyright;</author>

XML Schemas: XSD
Another, more modern way to describe XML structure
• Why use it instead of DTD:
  – XML Schemas are extensible to future additions
  – XML Schemas are richer and more powerful than DTDs
  – XML Schemas are written in XML
  – XML Schemas support data types
  – XML Schemas support namespaces

XSD: Example
    <xs:schema xmlns:xs="http://.../XMLSchema"
               targetNamespace="http://..."
               xmlns="...."
               elementFormDefault="qualified">
      <xs:element name="note">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="to" type="xs:string"/>
            <xs:element name="from" type="xs:string"/>
            <xs:element name="heading" type="xs:string"/>
            <xs:element name="body" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>

XSD: Simple elements
• A simple element is an XML element that can contain only text. It cannot contain any other elements or attributes (but the text can be of any type!)
• Declaration
    <xs:element name="xxx" type="yyy"/>
    <xs:element name="xxx" type="yyy" default="zzz"/>
    <xs:element name="xxx" type="yyy" fixed="zzz"/>
  default and fixed define the element's value: default is used when no value is given, fixed allows no other value.
• Built-in types:
  – xs:string
  – xs:decimal
  – xs:integer
  – xs:boolean
  – xs:date (YYYY-MM-DD, optionally followed by a time zone)
  – xs:time
  – xs:dateTime
  – xs:hexBinary
  – xs:base64Binary
  – xs:anyURI

XSD: Simple element example
• XML:
    <lastname>Võhandu</lastname>
    <age>36</age>
    <dateprof>1974-01-02</dateprof>
• XSD:
    <xs:element name="lastname" type="xs:string"/>
    <xs:element name="age" type="xs:integer"/>
    <xs:element name="dateprof" type="xs:date"/>

XSD: Attributes
• Declaration
    <xs:attribute name="xxx" type="yyy"/>
    <xs:attribute name="xxx" type="yyy" default="zzz"/>
    <xs:attribute name="xxx" type="yyy" fixed="zzz"/>
Note: simple elements cannot have attributes

XSD: Restrictions
Defines a value range for a number
    <xs:element name="percentage_int">
      <xs:simpleType>
        <xs:restriction base="xs:integer">
          <xs:minInclusive value="0"/>
          <xs:maxInclusive value="120"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

XSD: Restrictions (set)
Defines a set of allowed values for a string
    <xs:element name="car">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="Audi"/>
          <xs:enumeration value="VW"/>
          <xs:enumeration value="BMW"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

XSD: Restrictions pattern
    <xs:element name="initials">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:pattern value="[a-zA-Z][a-zA-Z][a-zA-Z][0-9]"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>
Pattern operators:
• * : zero or more
• + : at least one
• | : one or another
• {x} : exactly x characters, e.g. value="[a-zA-Z0-9]{8}"

XSD: Restriction length
    <xs:element name="password">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:minLength value="5"/>
          <xs:maxLength value="8"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

XSD: Complex type
• There are 4 kinds of complex elements:
  – empty elements
  – elements that contain only other elements
  – elements that contain only text
  – elements that contain both other elements and text
Note: complex elements may contain attributes.

Examples of XML to be described as complex types
• Empty element (in the example the value is defined via an attribute)
    <inventoryitem pid="1345"/>
• Element "employee" that contains only other elements:
    <employee>
      <firstname>Deniss</firstname>
      <lastname>Kumlander</lastname>
      <position>Software Architect</position>
    </employee>
• Element "module" that contains only text:
    <module type="COA_Dependent">Allocation</module>

XSD: Complex element description example for an element containing others
• Declaration: direct
    <xs:element name="employee">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="firstname" type="xs:string"/>
          <xs:element name="lastname" type="xs:string"/>
          <xs:element name="position" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
• Declaration using a "type"
    <xs:element name="employee" type="personinfo"/>
    <xs:element name="probationer" type="personinfo"/>

    <xs:complexType name="personinfo">
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
        <xs:element name="position" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  <xs:sequence> means the elements must occur in the given order.
• XML described by both declarations:
    <employee>
      <firstname>Deniss</firstname>
      <lastname>Kumlander</lastname>
      <position>Software Architect</position>
    </employee>

XSD: Complex element description example for the text only element
    <xs:element name="a_name">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:integer">
            ....
            ....
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
or
    <xs:element name="a_name">
      <xs:complexType>
        <xs:simpleContent>
          <xs:restriction base="xs:integer">
            ....
            ....
          </xs:restriction>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
The content is defined using either an extension or a restriction.

XSD: Complex element
    <xs:complexType mixed="true">
This means the content can be something like:
    <note_body>
      Dear <customer_name>Mr. Carlsson</customer_name>.
      Your order <orderid>123</orderid>...
    </note_body>
In other words, a mix of tags and text, where tags appear inside the text to give some kind of extra information.

XSD: Indicators
• Indicators control how elements are used
  – Declaration:
        <xs:complexType>
          <xs:xxx>
  – Order indicators
    • sequence: child elements must occur in the specified order
    • all: child elements may occur in any order, each at most once (exactly once by default)
    • choice: either one element or another occurs
  – Occurrence indicators (part of the element declaration)
    • minOccurs
    • maxOccurs
Example:
    <xs:element name="child_name" type="xs:string" maxOccurs="500" minOccurs="0"/>

XSD: Extensions
• The <xs:any> element enables us to extend the XML document with elements not specified by the schema.
• The <xs:anyAttribute> element enables us to extend the XML document with attributes not specified by the schema.

XSL
• XSL is an acronym for eXtensible Stylesheet Language, i.e. something like CSS for XML
• XSL consists of three parts:
  – XSLT: a language for transforming XML documents
  – XPath: a language for navigating in XML documents
  – XSL-FO: a language for formatting XML documents

XPath
• XPath is a syntax for navigating inside an XML document.
• It uses the tree-like structure of XML documents and works with nodes and their parents.
• It is very similar to navigating in folders (we have seen such paths in CSS and in HTML attributes such as href).

XPath nodes
• There are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document (root) nodes.

XPath relationship
• Parent: Each element and attribute has one parent.
    <employee>
      <firstname>Deniss</firstname>
      <lastname>Kumlander</lastname>
      <position>Software Architect</position>
    </employee>
"employee" is the parent of "firstname", "lastname" and "position"

XPath relationship
• Children: Each element can have 0, 1 or many children.
    <employee>
      <firstname>Deniss</firstname>
      <lastname>Kumlander</lastname>
      <position>Software Architect</position>
    </employee>
"firstname", "lastname" and "position" are children of "employee"

XPath relationship
• Siblings: nodes that have the same parent.
    <employee>
      <firstname>Deniss</firstname>
      <lastname>Kumlander</lastname>
      <position>Software Architect</position>
    </employee>
"firstname", "lastname" and "position" are siblings

XPath relationship
• Ancestors: a node's parent, parent's parent, etc.
• Descendants: a node's children, children's children, etc.

XPath expressions
Expression   Explanation
nodename     Selects the child elements of the current node that are named "nodename"; for example, "employee" selects the employee children of the current node
/            Selects from the root node
//           Selects nodes in the document from the current node that match the selection no matter where they are; for example, "//lastname" selects all lastname elements, wherever they are
.            Selects the current node
..           Selects the parent of the current node
@            Selects attributes
Example: personnel/employee selects all "employee" elements that are children of "personnel"

XPath expressions
• /personnel/employee[last()-1]
  – Selects the last but one employee child of personnel
• //employee[@branch='CODA Eesti']
  – Selects all employees whose branch attribute is CODA Eesti
• /personnel/employee[salary>10000]/lastname
  – Selects the employee children of personnel with a salary of more than 10000 and returns only their lastname elements

XPath
• Notice that this was just a short introduction!

XSLT
• XSLT is the most important part of XSL.
• XSLT is used to transform an XML document into another XML document, or another type of document that is recognized by a browser, like HTML and XHTML.
Normally XSLT does this by transforming each XML element into an (X)HTML element.
• With XSLT you can add elements and attributes to the output file or remove them from it. You can also rearrange and sort elements, perform tests and make decisions about which elements to hide and display, and a lot more.

XSLT template
• One could say that HTML (XHTML) is a style sheet for XML!

XSLT
XML document:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <?xml-stylesheet type="text/xsl" href="coda.xsl"?>
    <personnel>
      <employee>
        <firstname>Deniss</firstname>
        <lastname>Kumlander</lastname>
        <position>Software Architect</position>
      </employee>
    </personnel>

coda.xsl:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <html>
          <body>
            <h2>CODA Personnel</h2>
            <table border="1">
              <tr bgcolor="#9acd32">
                <th align="left">Name</th>
                <th align="left">Position</th>
              </tr>
              <xsl:for-each select="personnel/employee">
                <tr>
                  <td><xsl:value-of select="lastname"/></td>
                  <td><xsl:value-of select="position"/></td>
                </tr>
              </xsl:for-each>
            </table>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>

XML connected to XSLT
• To be associated with a stylesheet, the XML document should contain the following processing instruction, where "coda.xsl" is the user-defined name of the XSLT file:
    <?xml-stylesheet type="text/xsl" href="coda.xsl"?>

XSLT file: template
• The <xsl:template> element is used to build a template.
• The match attribute is used to associate a template with an XML element from the "source" file. The value of the match attribute is an XPath expression (note: match="/" matches the whole document).
    <xsl:template match="/">
      <html>
        <body>

XSLT file: "for-each"
• The <xsl:for-each> element is used to select each XML element of a specified set of nodes.
• Notice that "select" is nothing other than an XPath expression defining the elements to iterate over.
    <xsl:for-each select="personnel/employee">
      <tr>
        …..
      </tr>
    </xsl:for-each>

XSLT file: "value-of"
• The <xsl:value-of> element is used to get the value of an XML element and write it to the output stream.
• Notice that "select" is again an XPath expression defining the element to read
    <td><xsl:value-of select="lastname"/></td>
    <td><xsl:value-of select="position"/></td>

XSLT
XML document:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <?xml-stylesheet type="text/xsl" href="coda.xsl"?>
    <personnel>
      <employee>
        <firstname>Deniss</firstname>
        <lastname>Kumlander</lastname>
        <position>Software Architect</position>
      </employee>
    </personnel>

coda.xsl:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <html>
          <body>
            <h2>CODA Personnel</h2>
            <table border="1">
              <tr bgcolor="#9acd32">
                <th align="left">Name</th>
                <th align="left">Position</th>
              </tr>
              <xsl:for-each select="personnel/employee">
                <tr>
                  <td><xsl:value-of select="lastname"/></td>
                  <td><xsl:value-of select="position"/></td>
                </tr>
              </xsl:for-each>
            </table>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>

XSLT advanced: filtering
• It is possible to filter the output from the XML file by adding a criterion to the select attribute of the <xsl:for-each> element.
    <xsl:for-each select="personnel/employee[position='Developer']">
• Filter operators are:
  – = (equal)
  – != (not equal)
  – &lt; (less than)
  – &gt; (greater than)

XSLT advanced: "sort"
• The <xsl:sort> element is used to sort the output (XML is an ordered list of items, so it is possible to re-order them). It is placed inside <xsl:for-each>, right after its start tag.
    <xsl:for-each select="personnel/employee">
      <xsl:sort select="lastname"/>
      …
    </xsl:for-each>
"select" indicates the element to sort by

XSLT advanced: "if"
• The <xsl:if> element is used to apply a conditional test to the content of the XML file. It is placed inside the <xsl:for-each> element.
    <xsl:for-each select="personnel/employee">
      <xsl:if test="expression">
        …
      </xsl:if>
    </xsl:for-each>
For example, the expression can be "salary &gt; 20000".

XSLT advanced: "choose"
• The elements <xsl:choose>, <xsl:when> and <xsl:otherwise> are used like the "if .. then .. else" construction of major programming languages
    <xsl:choose>
      <xsl:when test="expression">
        ... an output ...
      </xsl:when>
      <xsl:otherwise>
        ... an output ....
      </xsl:otherwise>
    </xsl:choose>

XSLT advanced: choose
    <xsl:for-each select="personnel/employee">
      <tr>
        <td><xsl:value-of select="lastname"/></td>
        <xsl:choose>
          <xsl:when test="salary &gt; 20000">
            <td><b><xsl:value-of select="position"/></b></td>
          </xsl:when>
          <xsl:otherwise>
            <td><xsl:value-of select="position"/></td>
          </xsl:otherwise>
        </xsl:choose>
      </tr>
    </xsl:for-each>

XSLT advanced: choose
    <xsl:for-each select="personnel/employee">
      <tr>
        <td><xsl:value-of select="lastname"/></td>
        <xsl:choose>
          <xsl:when test="salary &gt; 20000">
            <td><b><xsl:value-of select="position"/></b></td>
          </xsl:when>
          <xsl:when test="salary &lt; 10000">
            <td bgcolor="red"><xsl:value-of select="position"/></td>
          </xsl:when>
          <xsl:otherwise>
            <td><xsl:value-of select="position"/></td>
          </xsl:otherwise>
        </xsl:choose>
      </tr>
    </xsl:for-each>

XSLT advanced: copy-of and variables
    <xsl:variable name="footer">
      <tr>
        <td></td>
        <td>property of CODA</td>
      </tr>
    </xsl:variable>

    <xsl:template match="/">
      <html>
        <body>
          Lower salary employees
          <table>
            <xsl:for-each select="personnel/employee">
              <tr>
                <xsl:if test="salary &lt; 10000">
                  <td><xsl:value-of select="lastname"/></td>
                  <td><xsl:value-of select="position"/></td>
                </xsl:if>
              </tr>
            </xsl:for-each>
            <xsl:copy-of select="$footer" />
          </table>
          <br />
          High salary employees
          <table>
            <xsl:for-each select="personnel/employee">
              <tr>
                <xsl:if test="salary &gt; 100000">
                  <td><xsl:value-of select="lastname"/></td>
                  <td><xsl:value-of select="position"/></td>
                </xsl:if>
              </tr>
            </xsl:for-each>
            <xsl:copy-of select="$footer" />
          </table>
        </body>
      </html>
    </xsl:template>
<xsl:copy-of> copies the selected nodes together with their children.
There is also <xsl:copy>, which copies only the current XML element itself, without its children.

XSLT advanced: apply-templates
• The <xsl:apply-templates> element applies a template to the current element or to the current element's child nodes.
• With a select attribute, only the child nodes matching its value are processed; using several <xsl:apply-templates> elements with different select values therefore defines the order in which the child nodes are processed.

XSLT advanced: apply-templates
• Show each article title as a level-1 heading
    ...
    <xsl:template match="article_title">
      <h1><xsl:apply-templates/></h1>
    </xsl:template>

XSLT advanced: apply-templates
XML document:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <?xml-stylesheet type="text/xsl" href="coda2.xsl"?>
    <personnel>
      <employee>
        <firstname>Deniss</firstname>
        <lastname>Kumlander</lastname>
        <position>Software Architect</position>
      </employee>
      <employee>
        <firstname>Veiko</firstname>
        <lastname>Laev</lastname>
        <position>Developer</position>
      </employee>
    </personnel>

coda2.xsl:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <html>
          <body>
            <h2>CODA Personnel</h2>
            <xsl:apply-templates/>
          </body>
        </html>
      </xsl:template>
      <xsl:template match="employee">
        <p>
          <xsl:apply-templates select="position"/>
          <xsl:apply-templates select="lastname"/>
        </p>
      </xsl:template>
      <xsl:template match="position">
        <b><i>Position: </i></b>
        <xsl:value-of select="."/><br />
      </xsl:template>
      <xsl:template match="lastname">
        <b>Name: </b>
        <xsl:value-of select="."/><br />
      </xsl:template>
    </xsl:stylesheet>
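
To get a feeling for what the stylesheet with the employee, position and lastname templates produces, the same steps can be imitated in plain Python. This is only a sketch: Python's standard library has no XSLT processor, so the functions below (render_employee and render_personnel, both names made up for this illustration) mimic the templates by hand, and whitespace in the real XSLT output may differ.

```python
import xml.etree.ElementTree as ET

PERSONNEL_XML = """
<personnel>
  <employee>
    <firstname>Deniss</firstname>
    <lastname>Kumlander</lastname>
    <position>Software Architect</position>
  </employee>
  <employee>
    <firstname>Veiko</firstname>
    <lastname>Laev</lastname>
    <position>Developer</position>
  </employee>
</personnel>
"""


def render_employee(employee):
    """Play the role of the template with match="employee":
    position is emitted first, then lastname; firstname is never selected."""
    position = employee.findtext("position")
    lastname = employee.findtext("lastname")
    return (
        f"<p><b><i>Position: </i></b>{position}<br />"
        f"<b>Name: </b>{lastname}<br /></p>"
    )


def render_personnel(xml_text):
    """Play the role of the template with match="/": wrap all employees in a page."""
    root = ET.fromstring(xml_text.strip())
    paragraphs = "".join(render_employee(e) for e in root.findall("employee"))
    return f"<html><body><h2>CODA Personnel</h2>{paragraphs}</body></html>"


html = render_personnel(PERSONNEL_XML)
print(html)
```

Note how the order of the two select attributes in the employee template, not the order of the elements in the XML file, decides that the position is shown before the name.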