Introduction to XML
• XML stands for Extensible Markup Language. Because it is extensible, XML has been used to create a wide variety of different markup vocabularies.

CML: An XML Example

GolfML: Another XML Example
<?xml version="1.0" encoding="utf-8"?>
<golfml xmlns="http://pga.com/golfml">
  <course name="The Oaks">
    <tee num="1">
      <par>5</par>
      <handicap>15</handicap>
      <length units="yds">475</length>
    </tee>
    <tee num="2"> … </tee>
  </course>
</golfml>

Why is XML Important?
• XML gives us a way to create and maintain structured documents in plain text that can be rendered in a variety of different ways.
• A primary objective of XML is to completely separate content from presentation.
Example: The Asbury Park Press is a structured document containing pages, columns, etc. The news is just plain text (including the images). Newspapers can be rendered on paper or online.

Where XML Fits into Other Markup Languages

DTDs and XML Documents
• A DTD (Document Type Definition) or schema specifies the rules for what a legal XML document may contain.
• An XML document is well-formed if it contains no syntax errors and fulfills all of the specifications for XML code as defined by the W3C.
• An XML document is valid if it is well-formed and also satisfies the rules laid out in the DTD or schema attached to the document.

The Structure of an XML Document: The Prolog
• The XML declaration is always the first line of code in an XML document. It tells the parser that what follows is written using XML.
• The complete syntax is:
  <?xml version="version number" encoding="encoding type" standalone="yes | no" ?>
• The typical declaration is:
  <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>

The Structure of an XML Document: Elements
• Closed elements have the following syntax:
  <element_name>Content</element_name>
  For example: <Artist>Miles Davis</Artist>
• Open (empty) elements have the syntax:
  <element />
  For example: <Jazz_Music />
• Element names are case sensitive, must begin with a letter (or _), and may not contain spaces.
• Elements can be (properly) nested. For example:
  <playlist>
    <track>So What</track>
    <track>Blue in Green</track>
  </playlist>
• All elements must be nested within a single root element.
• Comments are enclosed in <!-- comment --> (like HTML).

The Structure of an XML Document: Attributes
• An attribute is a property of an element. Attribute values are text strings placed in single or double quotes. The syntax is:
  <element_name attribute="value">

The Element Hierarchy

Special Character References
Special symbols can be inserted into an XML document using either a character reference or an entity reference. For example, < can be written as &#60; or &lt;.

CDATA Sections
• Validators can get confused by some XML:
  <temperatureRange> > 100 degrees </temperatureRange>
• You must separate the file into PCDATA and CDATA.
• Parsed character data (PCDATA) is text to be parsed by a browser or parser (all the XML code: declarations, elements, attributes, comments).
• Unparsed character data (CDATA) is text not to be processed by the browser or parser.
• A CDATA section marks a block of text as CDATA so that parsers do not interpret any text within it as markup:
  <temperatureRange> <![CDATA[ > 100 degrees ]]> </temperatureRange>

Mark Your XML with the Sections You Don’t Want Parsed
[Figure: XML document with the CDATA section labeled]

Parsing an XML Document

Displaying an XML Document in a Web Browser
[Figure: browser output if the document is well-formed and if it is not well-formed]

Linking to a Style Sheet

Applying Styles to the XML Elements
• To apply a style to an element, use the syntax:
  selector {attribute1:value1; attribute2:value2; …}
  For example: artist {color:red; font-weight:bold}

Linking the XML to the Style Sheet
• The link from the XML document to a style sheet is created using a processing instruction. A processing instruction is a command that gives instructions to the XML parser.

The XML Document Formatted with the Style Sheet
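
Example: Checking Well-Formedness and Reading a CDATA Section
The slides on parsing and CDATA sections can be tried out with a short Python sketch. It uses the standard library module xml.etree.ElementTree. The helper name is_well_formed and the three sample strings are assumptions made up for illustration; they are not from the slides. The sketch tests only well-formedness, not validity against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses without a syntax error (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Sample documents, invented for this illustration.
with_cdata = "<temperatureRange><![CDATA[ > 100 degrees & rising ]]></temperatureRange>"
unclosed = "<playlist><track>So What</track>"  # root element is never closed
raw_ampersand = "<temperatureRange>hot & humid</temperatureRange>"  # bare & outside CDATA

# Inside a CDATA section the parser returns the characters literally,
# so the > and & come back as plain text.
root = ET.fromstring(with_cdata)
cdata_text = root.text

print(is_well_formed(with_cdata))
print(is_well_formed(unclosed))
print(is_well_formed(raw_ampersand))
print(repr(cdata_text))
```

The first document should be reported as well-formed and the other two as not well-formed: one is missing its closing root tag, and the other contains an unescaped & outside a CDATA section.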