Developing Web Applications With XHTML: Problems and Benefits

Ian GRAHAM
Senior Manager, eSolutions Group
Emfisys, Bank of Montreal
E: ian.graham@{bmo.com or utoronto.ca}
W: http://www.utoronto.ca/ian/talks/
T: 416.513.5656 / F: 416.513.5590

Talk Outline
  • Introduction
  • Technical and process issues
  • Browsers and XHTML
  • Server-side data management
  • Dynamic content generation
  • Conclusions

Browsers and XHTML
  • Data is delivered to browsers as one of two MIME types:
    – text/html (HTML data)
    – text/xml (XML data)
  • The two types are handled in very different ways

Browsers and XHTML
  HTML channel support
    • Navigator 1-6, Opera 3-6, Lynx, IE 3-5.5, ...
    • Basically every browser
  XML channel support
    • Navigator 6, IE 5/5.5, Opera 4
    • Fewer browsers, with caveats (rendering / processing problems)

HTML vs. XML Channels
  Entities
    HTML channel: large set of defined internal general entities for common non-ASCII characters and symbols
    XML channel:  only the 5 XML-standard predefined general entities (lt, gt, amp, apos, and quot)
  Default formatting
    HTML channel: default formatting properties for all standard HTML elements
    XML channel:  no default XHTML formatting properties (exception: Mozilla/Navigator 6)
  CSS selectors
    HTML channel: CSS support for id- and class-based selectors (e.g., div.special, pre#note)
    XML channel:  no support
  Functional elements
    HTML channel: hard-wired support for functional elements (links, replaced elements [img, object/applet], map, etc.)
    XML channel:  no support (some support in Mozilla)
  DTD internal subsets
    HTML channel: no internal DTD subsets; some browsers support DOCTYPE declarations with no internal subset (for rendering mode switching)
    XML channel:  supports internal DTD subsets and arbitrary entity declarations, both internal and (sometimes) external
  Validation
    HTML channel: no validation
    XML channel:  supports DTD-based (and sometimes schema-based) validation
  CDATA sections
    HTML channel: must omit CDATA sections
    XML channel:  supports CDATA sections
  XML declaration
    HTML channel: N/A
    XML channel:  supports an XML declaration
  Namespaces
    HTML channel: no namespace support
    XML channel:  supports namespaces

XHTML via XML Channel
  Mozilla/NN6
    • “Namespace” support for the XHTML namespace
      – default formatting properties
    • Restricted CSS support
    • No support (yet) for style, link elements
      – <?xml-stylesheet ....
        ?>
  IE 5/5.5
    • No special support for XHTML (use CSS+XML)
    • No standardized support for functional markup
    • Transform XML into HTML: XML + XSLT → HTML

Browser Conclusions: Deliver Data as text/html
  • Must avoid many XML features
    – CDATA sections
    – DTD internal subsets (and DTD functionality)
  • These are not supported by HTML processors

XHTML as text/html
  • Be aware of some XHTML features
    – xml:lang vs. lang attributes (use both)
    – XML declarations (avoid them)
    – empty-element tag notation (add a space before the trailing “/” character)
    – non-minimized attribute assignments (some bugs with Opera 3.6)
    – DTD declaration (rendering concerns)

“Safe” XHTML example
    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-ca" lang="en-ca">
    <head>
      <title> ... </title>
      <meta http-equiv="content-type" content="text/html;charset=utf-8" />
    </head>
    <body>
      .....
    </body>
    </html>
  Callout on the meta element: whitespace before slash

DOCTYPE Switching
  • New HTML processors render differently depending on the DOCTYPE value:
    – “quirks” mode: reproduces various CSS and other layout “errors” of older browsers
    – “standards” mode: correct behavior, no quirks (supposedly!)

DOCTYPE Switching
  • All XHTML DTDs get “standards” mode. Thus the following rule:
    – quirks mode: omit the DOCTYPE
    – strict mode: include an XHTML DOCTYPE declaration

Summary
  • Seems like a lot of problems....
  • Not so bad ...
    – Have to deal with most anyway (e.g., quirky vs.
      strict)
    – “Patching” is often easy to automate
  • Benefits
    – Reduced markup errors (well-formedness)
    – Migration path to future delivery channels

Markup Guidelines
  • Use external scripts and CSS style sheets
  • Avoid CDATA sections
    – External script, style sheet files
  • On output, auto-adjust syntax:
    – insert a space at the end of empty-element tags
    – duplicate lang-related attributes
    – strip/insert the appropriate DOCTYPE declaration
    – strip the XML declaration

Talk Outline
  • Introduction
  • Browsers and XHTML
  • Server-side data management
  • Dynamic content generation
  • Conclusions

Server-Side Management
  • XHTML reduces errors in the composition and rendering phases
    – Well-formedness ensures unambiguous processing by an HTML or XML processor
    – Barring CSS errors, this means fewer browser formatting problems
  • Problem: support by content generation tools

Issues:
  • Limited pool of XHTML authoring tools
    – Most popular page authoring tools are designed for HTML (Dreamweaver, FrontPage, etc.)
  • XHTML conversion adds process steps to authoring
    – Too complex for most Web authors/designers
  • Not an issue for organizations with existing XML/SGML processes

Processing Content
  A) Author / edit → View & verify → Publish content
     (loop back when a problem is detected)
  B) Author / edit → Convert to XHTML → View & verify → Publish content
     (loop back when a problem is detected)

Alternatives
  • Use XML-aware authoring tools
    – Issues: cost, non-WYSIWYG nature
  • Create tightly controlled page “templates” and limit authoring responsibilities
    – Reduces likelihood of markup error
  • Dynamic content generation
    – Fragment authoring only, with validation

Talk Outline
  • Introduction
  • Browsers and XHTML
  • Server-side data management
  • Dynamic content generation
  • Conclusions

Dynamic Content
  • Content from databases
    – SQL, XML, files (plain text or XML fragments)
  • Can enforce well-formedness in software
    – Either via structured markup-generating functions or true XML-based tools

Server-Side Management
  [Diagram: text strings and XHTML fragments, together with an HTML / XHTML page template, feed a page generation engine, which sends HTML / XHTML to the browser]
  Reduced number of
  markup/formatting errors

XHTML Fragments
  • XHTML helps reduce errors in the composition of pages from fragments:
    – Checking fragment well-formedness ensures well-formedness of the composed content
    – Even if the template is not well-formed, well-formedness of the fragments increases the rendering reliability of the composed product
    – Easier to control fragment authoring

Roles & Responsibilities
  • Fragment creation
    – Many individual writers; requires little markup knowledge; WYSIWYG often not as important
  • Template creation / management
    – Fewer, more technically skilled people; can enforce well-formedness

Fragment Creation
  • Common use case in content creation
    – e.g., newspaper articles, weblogs
  • Subset of XHTML (e.g., p, a, br, h3, b, i, img, em), few attributes
    – Easy to write
    – Easy to dynamically “correct” markup

Example Web Interface
  Advantages
    • Centralized code
    • Single interface
  Disadvantages
    • Crappy editor
    • Need to know markup to write content

Future Alternatives
  • Editable Web content (IE 5.5, Mozilla)
    – Accessing the DOM tree to build an editor
    – http://standardbrains.editthispage.com/
  • DOM ensures validity / well-formedness of markup
  • With Mozilla, will work with XML

Content Assembly
  • Many tools (JSP, PHP, ASP, Zope ...)
    – Not many that guarantee well-formed XML output
  • Obvious choice: XSLT
    – Guarantees well-formedness of output
    – HTML output mode allows for most issues described earlier

Server-Side Management
  [Diagram: text strings and XHTML fragments, together with an XSLT fragment template and an XSLT page template, feed an XSLT processor, which sends XHTML to the browser]
  Zero markup / formatting errors

XSLT Advantages
  • Always well-formed!
  • Abstraction layer between content and output
  • Easier adjustment for future delivery channels
  [Diagram labels: XSLT (1) → Output (1); XSLT (2) → Output (2)]

XSLT Caveats
  • Non-trivial language (well, at least to page developers/designers)
  • Unclean separation between designer and programmer
  • Lack of easy integration with common editing tools

Alternative Approach
  • “Procedural” page template designs
  • XHTML-compliant design tools
  • Escape to XSLT for content “components”
  • “Prototype-based” transformations

Markup Model
    ...regular markup ....
    <xsl:prototype>
      <table>
        ... Example table content goes here ...
      </table>
    </xsl:prototype>
    ... More XHTML markup ....

Talk Outline
  • Introduction
  • Browsers and XHTML
  • Server-side data management
  • Dynamic content generation
  • Conclusions

Conclusions
  • XHTML provides pragmatic advantages
    – Improved content reliability, at several layers; migration path to an XML-centric world
  • Several disadvantages in the near term
    – Poor integration with authoring tools; no easy XSLT-design separation
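
Appendix sketch: automating the checks
  The slides say that fragment well-formedness can be enforced in software and that the text/html “patching” is easy to automate. The following is a minimal Python sketch of both ideas, not the tooling used in the talk. The function names is_well_formed_fragment and adjust_for_text_html are hypothetical; the regular expressions are a simplification that assumes no ">" or "/>" inside attribute values; DOCTYPE stripping/insertion is left out.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_well_formed_fragment(fragment: str) -> bool:
    """Return True if an XHTML fragment parses as well-formed XML.

    The fragment is wrapped in a dummy root element because a fragment
    such as "<p>a</p><p>b</p>" has no single root of its own. HTML-only
    entities (e.g. &nbsp;) are rejected, since no DTD is loaded.
    """
    try:
        ET.fromstring(f'<div xmlns="{XHTML_NS}">{fragment}</div>')
    except ET.ParseError:
        return False
    return True


def _duplicate_lang(match: re.Match) -> str:
    """Add lang="..." next to xml:lang="..." unless lang is already set."""
    tag = match.group(0)
    if re.search(r"\slang\s*=", tag):
        return tag
    return re.sub(
        r"""\sxml:lang\s*=\s*("[^"]*"|'[^']*')""",
        lambda m: f"{m.group(0)} lang={m.group(1)}",
        tag,
        count=1,
    )


def adjust_for_text_html(page: str) -> str:
    """Apply three of the output adjustments from the Markup Guidelines slide."""
    # 1. Strip a leading XML declaration (but not <?xml-stylesheet ...?>).
    page = re.sub(r"^\s*<\?xml\s[^>]*\?>\s*", "", page, count=1)
    # 2. Insert a space before "/>" where an empty-element tag lacks one.
    page = re.sub(r"(?<=[^\s/<])/>", " />", page)
    # 3. Duplicate xml:lang as lang on start tags that carry only xml:lang.
    page = re.sub(r"<[A-Za-z][^<>]*\sxml:lang\s*=[^<>]*>", _duplicate_lang, page)
    return page


if __name__ == "__main__":
    sample = (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        f'<html xmlns="{XHTML_NS}" xml:lang="en-ca">'
        "<body><p>a<br/>b</p></body></html>"
    )
    print(adjust_for_text_html(sample))
    print(is_well_formed_fragment("<p>ok<br /></p>"))
    print(is_well_formed_fragment("<p>unclosed<br></p>"))
```

  A page generation engine could call is_well_formed_fragment when a writer submits a fragment and adjust_for_text_html just before the composed page is sent as text/html. An XML-aware serializer would be more robust than regular expressions; the regex form is used here only to keep the sketch short.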