CREATING AN XML DOCUMENT

INTRODUCING XML
• XML stands for Extensible Markup Language. A markup language specifies the structure and content of a document.
• Because it is extensible, XML can be used to create a wide variety of document types.
• XML is a subset of the Standard Generalized Markup Language (SGML), which was introduced in the 1980s. SGML is very complex and can be costly. These drawbacks led to the creation of Hypertext Markup Language (HTML), a more easily used markup language. XML can be seen as sitting between SGML and HTML: easier to learn than SGML, but more robust than HTML.

THE LIMITS OF HTML
• HTML was designed for formatting text on a Web page. It was not designed for dealing with the content of a Web page. Additional features have been added to HTML, but they do not solve data description or cataloging issues in an HTML document.
• Because HTML is not extensible, it cannot be modified to meet specific needs. Browser developers have added features making HTML more robust, but this has resulted in a confusing mix of different HTML standards.

XML VOCABULARIES

WELL-FORMED AND VALID XML DOCUMENTS
• An XML document is well-formed if it contains no syntax errors and fulfills all of the specifications for XML code as defined by the W3C.
• An XML document is valid if it is well-formed and also satisfies the rules laid out in the DTD or schema attached to the document.

THE STRUCTURE OF AN XML DOCUMENT
• XML documents consist of three parts:
  – Prolog: optional; provides information about the document itself
  – Document body: the document’s content in a hierarchical tree structure
  – Epilog: optional; contains any final comments or processing instructions

THE STRUCTURE OF AN XML DOCUMENT: THE XML DECLARATION
• The XML declaration is always the first line of code in an XML document. It tells the processor that what follows is written using XML.
  It can also provide information about how the parser should interpret the code.
• The complete syntax is:
    <?xml version="version number" encoding="encoding type" standalone="yes | no" ?>
• A sample declaration might look like this:
    <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>

THE STRUCTURE OF AN XML DOCUMENT: INSERTING COMMENTS
• Comments or miscellaneous statements go after the declaration. Comments may appear anywhere after the declaration.
• The syntax for comments is:
    <!-- comment text -->
• This is the same syntax used for HTML comments.

ELEMENTS
• Element names are case sensitive.
• Elements can be nested, as follows:
    <tracks>Kind of Blue
       <track>So What (9:22)</track>
       <track>Blue in Green (5:37)</track>
    </tracks>

WORKING WITH ATTRIBUTES
• An attribute is a feature or characteristic of an element. Attribute values are text strings and must be placed in single or double quotes. The syntax is:
    <element_name attribute="value"> … </element_name>

ELEMENTS AND ATTRIBUTES: ADDING ELEMENTS TO THE JAZZ.XML FILE
This figure shows the document elements added to the jazz.xml file

CHARACTER REFERENCES
This figure shows commonly used character reference numbers

CDATA SECTIONS
• A CDATA section is a large block of text the XML processor will interpret only as text.
• The syntax to create a CDATA section is:
    <![CDATA[ Text Block ]]>
• In this example, a CDATA section stores several HTML tags within an element named htmlcode:
    <htmlcode>
    <![CDATA[
       <h1>The Jazz Warehouse</h1>
       <h2>Your Online Store for Jazz Music</h2>
    ]]>
    </htmlcode>
This figure shows the CDATA section in the revised jazz.xml file

PARSING AN XML DOCUMENT

LINKING TO A STYLE SHEET
• Link the XML document to a style sheet to format the document. The XML processor will combine the style sheet with the XML document and apply any formatting codes defined in the style sheet to display a formatted document.
• There are two main style sheet languages used with XML:
  – Cascading Style Sheets (CSS) and Extensible Stylesheet Language (XSL)
• There are some important benefits to using style sheets:
  – By separating content from format, you can concentrate on the appearance of the document
  – Different style sheets can be applied to the same XML document
  – Any style sheet changes will be automatically reflected in any Web page based upon the style sheet

APPLYING A STYLE TO AN ELEMENT
• To apply a style to an element, use the following syntax:
    selector {attribute1:value1; attribute2:value2; …}
• selector is an element (or set of elements) from the XML document.
• attribute and value are the style attributes and attribute values to be applied to the document.

CREATING PROCESSING INSTRUCTIONS
• The link from the XML document to a style sheet is created using a processing instruction.
• A processing instruction is a command that gives instructions to the XML parser.
• For example:
    <?xml-stylesheet type="style" href="sheet" ?>
• style is the type of style sheet to access and sheet is the name and location of the style sheet.
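
As a minimal sketch of how the pieces above fit together, the document below places an xml-stylesheet processing instruction in the prolog, after the XML declaration. The file name jw.css and the tracks/track elements come from these slides; the type value text/css is the standard type for a CSS style sheet, and the excerpt itself is illustrative rather than the actual contents of jazz.xml.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!-- Illustrative excerpt only; the real jazz.xml is not reproduced here -->
<!-- The processing instruction below links this document to jw.css -->
<?xml-stylesheet type="text/css" href="jw.css" ?>
<tracks>Kind of Blue
   <track>Blue in Green (5:37)</track>
</tracks>
```

A rule in the linked style sheet written with the selector syntax shown earlier, for instance track {display:block; color:blue} (illustrative values, not taken from the actual jw.css), would then be applied to every track element when the document is displayed.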
THE JW.CSS STYLE SHEET
This figure shows the cascading style sheet stored in the jw.css file

LINKING TO THE JW.CSS STYLE SHEET
This figure shows how to link the jw.css style sheet to the jazz.xml file, using a processing instruction to access the jw.css style sheet

THE JAZZ.XML DOCUMENT FORMATTED WITH THE JW.CSS STYLE SHEET
This figure shows the formatted jazz.xml file

WORKING WITH XSLT

GENERATING A RESULT DOCUMENT
• An XSLT style sheet converts a source document of XML content into a result document by using the XSLT processor.

CREATING AN XSLT STYLE SHEET
• The general structure of an XSLT style sheet is:
    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
       Content of the style sheet
    </xsl:stylesheet>
• The <xsl:transform> tag can be used in place of the <xsl:stylesheet> tag; the two are synonyms.

WORKING WITH DOCUMENT NODES
• Under XPath, each component in the document is referred to as a node, and the entire structure of the document is a node tree.
• The node tree consists of the following objects:
  – the source document itself
  – comments
  – processing instructions
  – namespaces
  – elements
  – element text
  – element attributes

NODE TREE EXAMPLE

RELATIVE PATHS
• With a relative path, the location of the node is indicated relative to a specific node in the tree called the context node.

USING XPATH TO REFERENCE A NODE
• For an absolute path, XPath begins with the root node, identified by a forward slash, and proceeds down the levels of the node tree.
• An absolute path: /child1/child2/child3/…
• To reference an element without regard to its location in the node tree, use a double forward slash with the name of the descendant node.
• A descendant path: //descendant

REFERENCING GROUPS OF ELEMENTS
• XPath allows you to refer to groups of nodes by using the wildcard character (*).
• To select all of the element nodes in the node tree, you can use the path: //*
• The (*) symbol matches any element node, and the
  (//) symbol matches any level of the node tree.
• Example: /portfolio/stock/*

REFERENCING ATTRIBUTE NODES
• XPath uses a different notation to refer to attribute nodes.
• The syntax for an attribute node is:
    @attribute
  where attribute is the name of the attribute.
• Example: /portfolio/stock/name/@symbol

WORKING WITH TEXT NODES
• The text contained in an element node is treated as a text node.
• The syntax for referencing a text node is: text()
• To match all text nodes in the document, use: //text()

CREATING THE ROOT TEMPLATE
• A template is a collection of elements that define how a particular section of the source document should be transformed in the result document.
• The root template sets up the initial code for the result document.

CREATING A TEMPLATE
• To create a template, the syntax is:
    <xsl:template match="node set">
       styles
    </xsl:template>
  – where node set is an XPath expression that references a node set from the source document and styles are the XSLT styles applied to those nodes

CREATING A ROOT TEMPLATE
• To create a root template, the syntax is:
    <xsl:template match="/">
       styles
    </xsl:template>

EXTRACTING ELEMENT VALUES
• To insert a node’s value into the result document, the syntax is:
  – <xsl:value-of select="expression" />
  – where expression is an expression that identifies the node from the source document’s node tree
• If the node contains child elements in addition to text content, the text in those child nodes appears as well.

INSERTING A NODE VALUE EXAMPLE

PROCESSING SEVERAL ELEMENTS
• To process a batch of nodes, the syntax is:
    <xsl:for-each select="expression">
       styles
    </xsl:for-each>
  where expression is an expression that defines the group of nodes to which the XSLT and literal result elements are applied.

WORKING WITH TEMPLATES
• To apply a template in the result document, use the XSLT element
  – <xsl:apply-templates select="expression" />
  where expression indicates the
  node template to be applied

CREATING THE STOCK TEMPLATE EXAMPLE

SORTING NODE SETS
• By default, nodes are processed in document order, that is, in the order in which they appear in the document.
• To specify a different order, XSLT provides the <xsl:sort> element.
• This element can be used with either the <xsl:apply-templates> or the <xsl:for-each> element.
• The <xsl:sort> element contains several attributes to control how the XSLT processor sorts the nodes in the source document:
  – The select attribute determines the criteria under which the context node is sorted
  – The data-type attribute indicates the type of data (text or number)
  – The order attribute indicates the direction of the sorting (ascending or descending)
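
The XSLT elements covered above can be combined in one small style sheet. The sketch below is illustrative only: it assumes a source document whose root element is portfolio, containing stock elements that each have a name child with a symbol attribute (the shape suggested by the path /portfolio/stock/name/@symbol used earlier). The HTML list it produces is likewise an assumption, not the layout from the course files.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- Root template: sets up the initial code for the result document -->
   <xsl:template match="/">
      <html>
         <body>
            <ul>
               <!-- Apply the stock template to every stock element,
                    sorted alphabetically by the text of its name child -->
               <xsl:apply-templates select="portfolio/stock">
                  <xsl:sort select="name" data-type="text" order="ascending" />
               </xsl:apply-templates>
            </ul>
         </body>
      </html>
   </xsl:template>

   <!-- Stock template: writes one list item per stock element -->
   <xsl:template match="stock">
      <li>
         <!-- Value of the name element, then its symbol attribute -->
         <xsl:value-of select="name" />
         (<xsl:value-of select="name/@symbol" />)
      </li>
   </xsl:template>

</xsl:stylesheet>
```

Replacing the <xsl:apply-templates> element with <xsl:for-each select="portfolio/stock">, keeping the <xsl:sort> element as its first child and moving the list item inside it, should give the same result without a separate stock template.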