ECT 360
Introduction to the Class
Prof. Robin Burke
ECT 360, Fall 2004

Outline
  Introductions
  Course and Syllabus
  XML
  XHTML
  Homework #1

Introductions
  Student information sheet

Administrivia
  Contacting me
    • Questions
    • CS&T 453
    • x 25910
    • rburke@cs.depaul.edu
  Try the course discussion forum
    • Automatically mailed to all students
    • I will also post announcements here
  Course web site
    • http://josquin.cs.depaul.edu/~rburke/courses/f04/ect360/

About Me
  3rd year at CTI
  PhD in AI, 1993
  Research
    • AI applications in E-Commerce
    • "smart catalogs"
  Taught web development since 1996
  What I hope to get out of teaching this class

Course
  Introduction to XML
    • concentrate on XML's uses for the web
    • many other uses!
  Three parts
    • XML standard
    • XML validation
    • XML transformations
  Some DOM programming

Course, cont'd
  Seven homework assignments
  Midterm project
  Final project
  Allocation
    • Homework – 40%
    • Midterm project – 30%
    • Final project – 30%

Midterm project
  Instead of a midterm
  10/6
  Two-person teams
  Choose an XML language and report on it
  Possibilities
    • SVG, VoiceXML, XSL-FO, MathML, SMIL, SOAP, WSDL, UDDI, BPEL4WS, XBRL
    • anything else you think interesting

Midterm project proposal
  Due next week, 9/15
  Email message with the following
    • Name / email address for each partner
    • 1st/2nd/3rd choice for XML application

Grading
  Three components
  Knowledge
    • Does the work display correct technical knowledge?
  Reasoning
    • Does the work indicate good problem-solving skills?
  Communication
    • Written work: Is the answer well-written English?
    • Code: Does the answer display good coding / documentation style?

Grading, cont'd
  A = Excellent work
    • Thorough knowledge of the subject matter
    • Well-considered and creative solutions
    • Well-written answers
    • Employment of impeccable coding style
  B = Very good work
    • Complete knowledge of the subject matter
    • No major errors of reasoning in problem solutions
    • Competent written answers
    • Readable coding style
  C = Average work
    • Some gaps in knowledge of subject matter
    • Some errors or omissions in problem solving
    • Written answers may contain grammatical and other errors
    • Coding may be stylistically awkward
  D = Below average work
    • Substantial gaps in knowledge of subject matter
    • Problem solving incomplete or incorrect
    • Poor English in written answers
    • Ineffective coding style

Resources
  Text
    • Carey, P. New Perspectives on XML (Comprehensive). Thomson Learning
  Tools
    • XML Spy 4.3, included with book
    • XML Spy Enterprise 2004, available in 7th floor lab

Discussion enhancement
  Card distribution

XML
  eXtensible Markup Language
  Misnomer
    • Not a language
    • Technology for creating languages

XML
  Looks a little bit like HTML
    • But with a wide variety of tag names
  Reason
    • HTML and XML have a common ancestor: SGML
    • SGML was developed for entry and management of very large documents

Why do we need XML?

Web publishing with HTML
  Develop content
  Determine how content should be displayed on pages
  Encode content in HTML
  Content available to users
  Problem
    • what happens when content changes: design decisions must be rethought
    • what happens when design changes: HTML must be rewritten
    • designer and author must work closely

Web publishing with XML
  Develop XML application for content
  Develop content
  Content encoded in XML
  Design pages
  Write stylesheet to render pages in HTML
  Content available to users
  Benefits
    • If design changes, only stylesheet is affected
    • Different pages / displays can be generated from the same content
    • Designer and author need not interact

Big picture
  Modularity is a good thing
    • decoupling of data's structure from its use in a particular application
    • lowers effort of repurposing data
  Modularity requires standards
    • non-application-specific data representation
    • not in the interest of any application vendor
  XML is the language in which such standards can be expressed

XML applications
  Purpose-specific languages that conform to the XML standard
  Many are standardized
  In-house languages easy to develop
  XML is becoming the default choice for data storage format
    • MS Office 2003

Example: Syllabus
  <syllabus xmlns="http://josquin.cs.depaul.edu/~rburke/namespaces/syllabus">
    <course>
      <course-number>ECT 360</course-number>
      <course-title>Introduction to XML</course-title>
      <prereqs>
        <note>One quarter of programming</note>
        <and>
          <or>
            <course-number>CSC 211</course-number>
            <course-number>CSC 261</course-number>
            <equivalent/>
          </or>
          <course-number>IT 130</course-number>
        </and>
      </prereqs>
    </course>
    ... see full example ...
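
To make the syllabus example above concrete, here is a minimal sketch of how an application might read it, using Python's standard xml.etree.ElementTree module. Assumptions: the elided part of the example is simply left out and the document is closed with </syllabus>; the namespace URI is written without the stray space that appears in the slide text; the "s" prefix is our own label and is not part of the document.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the slide example (assumed to contain no space).
# The "s" prefix is a local label chosen for this sketch.
NS = {"s": "http://josquin.cs.depaul.edu/~rburke/namespaces/syllabus"}

# The portion of the syllabus shown on the slide; the elided rest is omitted.
SYLLABUS = """\
<syllabus xmlns="http://josquin.cs.depaul.edu/~rburke/namespaces/syllabus">
  <course>
    <course-number>ECT 360</course-number>
    <course-title>Introduction to XML</course-title>
    <prereqs>
      <note>One quarter of programming</note>
      <and>
        <or>
          <course-number>CSC 211</course-number>
          <course-number>CSC 261</course-number>
          <equivalent/>
        </or>
        <course-number>IT 130</course-number>
      </and>
    </prereqs>
  </course>
</syllabus>
"""

root = ET.fromstring(SYLLABUS)
course = root.find("s:course", NS)

# Direct children of <course>: the course's own number and title.
number = course.find("s:course-number", NS).text
title = course.find("s:course-title", NS).text

# Every <course-number> nested at any depth under <prereqs>, in document order.
prereq_numbers = [e.text for e in course.findall("s:prereqs//s:course-number", NS)]

print(number, "-", title)
print("Courses mentioned in prereqs:", prereq_numbers)
```

Because the document declares a default namespace, every element lookup has to be namespace-qualified; a bare "course" would not match.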

Note
  Structure determined by needs of application
  Other design choices could be made
    • separate components of course number
    • text for prerequisites

Note
  Mixed content
  Use of external namespaces
  Entities
  Internal referencing

The rules of XML
  Documents consist of elements, attributes and content (and a few other things)
  Elements are set off by tags in angle brackets
    • start tag for element syllabus: <syllabus>
    • end tag for element syllabus: </syllabus>
  Anything in between the start tag and end tag is element content
  Attributes are additional data associated with an element
    • indicated by name/value pairs inside the start tag
    • <hwk ref="hwk2">

More rules
  Comments
    • enclosed by special character sequence <!-- -->
  Document prolog
    • before the first element
    • contains declarations, typically: declare that it is xml, declare the relevant document type
  Processing instructions
    • information that the XML parser doesn't use
    • passed along to the application
    • special tag <? ... ?>

Entities
  Special characters
    • Certain characters are part of the language
    • Need a way to indicate these
    • &lt; stands for <
  Entities can be defined as part of a document type
    • useful for inserting standard text
    • &copyright; might insert a standard copyright notice

Document tree
  Document is just one form of XML
  More useful for computation
    • Tree representation

XML Tree
  syllabus
    course
      coursenumber: TEXT "ECT 360"
      coursetitle: TEXT "Introduction to XML"
      prereqs
        note: TEXT "One quarter of programming"
        and
          or
            coursenumber: TEXT "CSC 211"
            coursenumber: TEXT "CSC 261"
            equivalent
          coursenumber: TEXT "IT 130"
      description: TEXT "This course is..."
    offering, etc.

Tree
  Nodes
    • elements
    • text nodes
  Attribute lists

Paths
  A path traverses the tree
  XPath provides syntax for tree traversal
  Example
    • /section[2]/meeting[1]/day

Transformation
  XML transformations change the XML tree
    • adding
    • deleting
    • changing contents

Well-formed vs valid
  A well-formed document is one that obeys the syntactic rules
    • it can be parsed
    • <foo bar="2"><baz>thud</baz>&zap;</foo> is syntactically well-formed (strictly, the entity zap must be declared somewhere, e.g. in an external DTD)
  A valid document has been validated against some standard
    • what is the entity zap?
    • is baz a legal subelement for foo?
    • unknown without a definition for foo

XML Validation
  Validation is the process of checking an XML document against a standard
  Different languages for defining such standards
    • DTD – document type definition
    • XML Schema
    • RELAX NG
    • others

Document type
  The document type specifies the legal structure of the XML document
    • order, contents of elements
    • legal attributes and default values
    • etc.
  Designing a document type means deciding what data will be stored and how

HTML
  HTML is not XML-compliant
  XML is case-sensitive
    • HTML is not
  XML requires quotes around attribute values
    • optional in HTML
  XML requires end tags
    • optional in some cases for HTML
  XML requires /> syntax for empty elements
    • HTML does not

XHTML
  Latest HTML standard
  Makes HTML XML-conformant
  Different flavors
    • Transitional: allows style information as part of the document (align attribute)
    • Frameset: allows frames
    • Strict: no frames, no style attributes; assumes use of a stylesheet for rendering

Benefits of XHTML
  XML decouples data from application
  XHTML decouples content from style for web documents

Modularity = More pieces
  HTML document -> Web browser
  XML data + XSLT stylesheet -> XHTML document + CSS stylesheet -> Web browser

More pieces = More flexibility
  XML data + XSL-FO stylesheet -> PDF document
  XML data + XSLT stylesheet -> SVG graphics

Converting to XHTML
  No sloppiness
    • tags must nest
    • end tags everywhere
    • quotes on attribute values
  Remove deprecated elements / attributes
    • replace with style
  XML-specific
    • /> for empty elements, like img
    • declarations: xml, doctype, namespace
    • lowercase

Example
  HTML to XHTML conversion

Validation
  on-line validator

Assignment #1
  Convert a file to XHTML
  Use the online validator or XML Spy
  Due before class time next week
  Submit to COL
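
Before sending a converted page to the online validator, the "no sloppiness" checklist can be spot-checked locally with any XML parser. The sketch below uses Python's xml.etree.ElementTree; the is_well_formed helper and both sample pages are made up for illustration. It tests well-formedness only; it does not check the page against an XHTML document type, so it is no substitute for the validator or XML Spy.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """Return (True, None) if markup parses as XML, else (False, parser message)."""
    try:
        ET.fromstring(markup)
        return True, None
    except ET.ParseError as err:
        return False, str(err)

# Hypothetical pre-conversion HTML: uppercase tag, unquoted attribute values,
# unclosed <p>, and empty elements written without />.
sloppy_html = "<html><body><P align=center>Hello<br><img src=logo.gif></body></html>"

# The same content after applying the checklist: lowercase names, quoted
# attribute values, every element closed, the deprecated align attribute
# removed, and the XHTML namespace declared.
xhtml = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Hello</title></head>"
    '<body><p>Hello<br /><img src="logo.gif" alt="logo" /></p></body>'
    "</html>"
)

ok_before, why = is_well_formed(sloppy_html)
ok_after, _ = is_well_formed(xhtml)

print("before conversion:", ok_before, "-", why)
print("after conversion: ", ok_after)
```

The parser stops at the first problem it finds, so a sloppy page usually has to be fixed and re-checked several times before it parses.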