xml-ubc - Ian Graham

<XML> and the Future of Internet-based Computing
11 March 2002
Ian GRAHAM
Emerging Business Strategy, Bank of Montreal
E: <ian.graham@bmo.com> or <ian.graham@utoronto.ca>
T: (416) 513.5656 / F: (416) 513.5590
Web: http://www.utoronto.ca/ian/talks/
Emerging Business Strategy, IBS
ian.graham@bmo.com / 416.513.5656
1
Overview
 A history lesson
– The Web and the birth of XML
– when, why, and who
 What does XML give us?
 Examples, illustrations, and applications
 The future
2
In The Beginning .....
 …. was the birth of the Web
(Tim Berners-Lee, 1992)
• HTML (data/display)
• HTTP (transfer)
• URLs (location, e.g. http://www.foo.org/boo.html )
[Diagram: Ftp, News, Email, and the Web running over Internet communication protocols; a Web server, connected to Db & other software, delivers an HTML page over HTTP. The sample page reads: "Hello There. Here’s a zippy HTML page, with lots of Colors and Links ...!!! Fun, Eh?"]
3
Three Core Concepts
 HTTP -- HyperText Transfer Protocol
– A protocol for transferring data between machines on the Internet
 URL -- Uniform Resource Locator
– A scheme for referencing, using a simple text string, the specific
location of a resource (Web page, audio file, program) somewhere on
the Internet (e.g. http://www.utoronto.ca/ian/talks/ )
 HTML -- HyperText Markup Language
– a markup language for encoding information to be read / viewed by
people
HTTP and URLs have stood the test of time pretty well.
But by 1996, HTML was already showing signs of age ....
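A small sketch of the URL idea, using Python's standard `urllib.parse` module (the choice of Python is an assumption of this note, not something the talk uses). It splits the example URL from this slide into the parts that say how to fetch the resource, from which machine, and where on that machine.

```python
from urllib.parse import urlsplit

# The example URL given on the slide.
url = "http://www.utoronto.ca/ian/talks/"
parts = urlsplit(url)

print(parts.scheme)  # the protocol used to fetch the resource
print(parts.netloc)  # the machine that hosts it
print(parts.path)    # the location of the resource on that machine
```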
4
Simple HTML Example
HTML (not XML) Markup
Browser Rendering
<HTML>
<HEAD>
<TITLE>The XML Specification Guide -Website Home Page </TITLE>
<LINK REL="stylesheet" HREF="style.css">
</HEAD>
<BODY BGCOLOR="#FFFFFF"
TEXT="black" LINK="#0066CB"
ALINK="#00A000" VLINK="#808080" >
<TABLE WIDTH="100%" CELLPADDING="0"
CELLSPACING="0" BORDER="0">
<TR>
<TD VALIGN="top" ALIGN="left"><FONT
CLASS="toolbar"
FACE="arial,helvetica" SIZE="-1">The XML
Specification Guide
</FONT></TD>
…….. More tags and text ….
5
The Problems with HTML
 HTML was designed to serve one role: simple hypertext documents,
with simple user interaction (forms, etc.). But people soon wanted
to display other types of data:
– mathematical expressions, literary text
– graphics, multimedia, interactive content ...
– commercial forms, purchase orders, generic data
 ... and “connect” these parts together (so they can interact)
 ... and dynamically mix/edit chunks of data together
 ... and build dynamic networks that exchange information
 ... and make sure this works reliably, anywhere.
6
HTML Scope was Too Limited
– Single model for data (hypertext text)
– Syntax too lenient ... It’s easy to create HTML that can be misprocessed by other systems
– Result:
• can’t create arbitrary custom data that can be universally understood
[Diagram: HTML Web evolution, pulled toward three needs: interchange of data between machines, modeling different types of data, and presentation of different types of data]
7
The Birth of XML...
 ..happened in 1996, when a group of experts assembled to try and
find a way out of the problem.
 First draft came out in late 1996 ... Final version of the XML 1.0
specification came out in February 1998
– Large Canadian contribution: 3 of the 18 WG members, plus one of the
three editors [Tim Bray]
– Followed in 1999 by a second ‘core’ XML specification, Namespaces in XML
(also with Tim Bray as co-editor)
Core Principles
– Simple
• But not as simple as HTML, in particular with stricter formal syntax
– Extensible
• So you can create your own tags, or elements
– Distributed-environment friendly
• like HTML, but better
8
An XML Example
<?xml version="1.0" ?>
<partorders
xmlns="http://myco.org/Spec/partorders" >
<order ref="x23-2112-2342"
date="25aug1999-12:34:23h">
<desc> Gold sprockel grommets,
with matching hamster
</desc>
<part number="23-23221-a12" />
<quantity units="gross"> 12 </quantity>
<deliveryDate date="27aug1999-12:00h" />
</order>
<order ref="x23-2112-2342"
date="25aug1999-12:34:23h">
. . . Order something else . . .
</order>
</partorders>
9
What is XML?
 Specification of a syntax for “encoding” text-based data (words,
phrases, numbers, ...), with strict syntax rules about how to do so.
 A text-based syntax -- written using printable characters (no
explicit binary data)
 Extensible -- you can define your own tags (essentially data
types), within the constraints of the syntax rules
 Universal -- the syntax rules ensure that all XML processing
software MUST identically handle a given piece of XML.
If you can read and process it, so can anybody else
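To illustrate the "strict syntax rules" point, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser (the language and the helper name `is_well_formed` are assumptions of this note). A conforming parser accepts well-formed XML and rejects anything else, rather than guessing at what the author meant, which is what HTML browsers of the time did.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = '<quantity units="gross"> 12 </quantity>'
unquoted = '<quantity units=gross> 12 </quantity>'  # attribute value not quoted
unclosed = '<desc> Gold sprockel grommets'           # missing end tag

for sample in (good, unquoted, unclosed):
    print(is_well_formed(sample), sample)
```

Because every parser must reject the same inputs, a document that one tool can read is one that any other tool can read.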
10
Example Revisited
[Labels on the slide: element; tags; attribute of this quantity element]
<partorders
xmlns="http://myco.org/Spec/partorders" >
<order ref="x23-2112-2342"
date="25aug1999-12:34:23h">
<desc> Gold sprockel grommets,
with matching hamster
</desc>
<part number="23-23221-a12" />
<quantity units="gross"> 12 </quantity>
<deliveryDate date="27aug1999-12:00h" />
</order>
<order ref="x23-2112-2342"
date="25aug1999-12:34:23h">
. . . Order something else . . .
</order>
</partorders>
Hierarchical, structured information
11
Processing XML -- creating data structures
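[Diagram: the XML below is parsed into a tree data structure]
<partorders xmlns="...">
<order date="..."
ref="...">
<desc> ..text..
</desc>
<part />
<quantity />
<delivery-date />
</order>
<order ref=".." .../>
</partorders>
Resulting tree: a partorders node (xmlns=) holding two order nodes (each with ref= and date=); the first order holds desc, part, quantity, and delivery-date nodes, with text as leaf nodes.
XML syntax rules guarantee the same result, always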
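The step from text to data structure can be sketched with Python's standard `xml.etree.ElementTree` module. This is an illustration under assumptions: the document is a shortened copy of the part-order example from this talk, and the helpers `local_name` and `outline` are names invented for this sketch.

```python
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0" ?>
<partorders xmlns="http://myco.org/Spec/partorders">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets, with matching hamster </desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <deliveryDate date="27aug1999-12:00h" />
  </order>
</partorders>"""

def local_name(tag: str) -> str:
    # ElementTree reports namespaced tags as "{namespace-uri}name".
    return tag.split("}", 1)[-1]

def outline(element, depth=0):
    """Return one indented line per element, mirroring the tree shape."""
    lines = ["  " * depth + local_name(element.tag)]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(doc)
tree_lines = outline(root)
print("\n".join(tree_lines))

# Attributes and text hang off the nodes of the tree.
first_order = root[0]
print(first_order.get("ref"), first_order[2].text.strip())
```

Any conforming parser, in any language, should build a tree of the same shape from this input.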
12
XML: Why it's this way
 Simple (like HTML)
– But not quite so simple
– Stricter syntax rules, to eliminate processing errors
– syntax defines structure (hierarchically), and names structural parts
(element names) -- it is self-describing data
 Extensible (unlike HTML, vocabulary is not fixed)
– Can create your own language of tags/elements, with rules
– Strict syntax ensures that custom tags can be reliably processed
 Designed for a distributed environment (like HTML)
– Can have data all over the place: can retrieve and use it reliably
 Can mix different data types together (unlike HTML)
– Can mix one set of tags with another set: resulting data can still be
reliably processed
13
Mixing dialects together: namespaces
<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:mt="http://www.w3.org/1998/Math/MathML" >
<head>
<title> Title of XHTML Document </title>
</head><body>
<div class="myDiv">
<h1> Heading of Page </h1>
<mt:math>
<mt:msup> ... MathML markup ... </mt:msup>
</mt:math>
<p> more html stuff goes here </p>
</div>
</body>
</html>
Default 'type' is xhtml (no prefix).
The mt: prefix indicates 'type' mathml (a different language).
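A sketch of how software sees the two dialects, using Python's standard `xml.etree.ElementTree` module. The document is a cut-down, hypothetical variant of the slide's example, and the helper `dialect` is a name invented here. The point is that the prefix itself does not matter to the parser: each element is identified by its namespace URI plus its local name.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"

doc = f"""<html xmlns="{XHTML}" xmlns:mt="{MATHML}">
  <body>
    <h1> Heading of Page </h1>
    <mt:math><mt:mi>x</mt:mi></mt:math>
    <p> more html stuff goes here </p>
  </body>
</html>"""

root = ET.fromstring(doc)

def dialect(element) -> str:
    # Tags arrive as "{namespace-uri}localname"; pull out the URI part.
    uri = element.tag[1:].split("}", 1)[0]
    return {XHTML: "xhtml", MATHML: "mathml"}.get(uri, "unknown")

# Map each local element name to the dialect it belongs to.
kinds = {el.tag.split("}", 1)[1]: dialect(el) for el in root.iter()}
for name, kind in kinds.items():
    print(f"{name}: {kind}")
```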
14
XML Specification(s) Chart
[Chart: XML 1.0 and XML names, both marked as W3C rec]
15
Classes of XML Dialects
 XML gives us a tool for expressing data in a universally shareable
way.
 Many XML 'dialects,' optimised for different roles.
 Can roughly break these down into five categories
– presentation & data: stuff people read, look at, or exchange
– metadata: for describing things; for use by other software
– distributed apps: data delivery; distributed applications, Web services
– XML utilities: XSLT, Schemas, …
– software utilities: variety of things …
 We’ll now look at some examples from the first three categories.
16
Classes of XML Dialects
 1) Presentational Language (for people/applications)
– SMIL -- for multimedia (RealPlayer Multimedia players)
– WML -- Wireless WAP-phones
– XUL -- user interface (Netscape 6)
– VoiceXML -- voice interfaces (telephone-based ...)
– XHTML -- XMLized version of HTML
– …
 Some languages with specific academic relevance:
– TEI -- Text encoding http://www.tei-c.org/
– MathML -- for mathematics http://www.w3.org/Math
– XHTML -- new HTML http://www.w3.org/MarkUp
– SVG -- for graphics http://www.w3.org/Graphics/SVG
– HEML -- historical events http://www.heml.org
17
TEI -- Text Encoding Initiative
 ... represent all kinds of literary and linguistic texts for online
research and teaching, using an encoding scheme that is
maximally expressive and minimally obsolescent. †
 Recently migrated to be compatible with XML (TEI-Lite)
– Namespaces let you re-use XHTML ‘links’
– XML also has its own more expressive linking/pointing mechanisms
 Some online examples via ....
[ www.utoronto.ca/ian/talks/11mar02/examples.html ]
 Gain: universally accessible literary/academic texts, with
networked capabilities
† From: TEI home page, http://www.tei-c.org, 16 Jan 2002
18
MathML, SVG: for Mathematics and Graphics
 XML dialects that model essential “types” of data for presentations
and display.
 “Namespace” mechanism lets you mix these different types of
information together, and with other dialects (like XHTML)
 Some online examples ....
[ www.utoronto.ca/ian/talks/11mar02/examples.html ]
 Advantages: Can communicate both structural and semantic
information (how it looks and what it means)
– Interactive mathematical example documents
– Interfaces with tools like Mathematica, Maple
– Non-proprietary languages, interfaces
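As a rough illustration of graphics expressed as XML, the sketch below builds a tiny SVG document with Python's standard `xml.etree.ElementTree` module. The drawing itself (one circle) is a hypothetical example, not one of the talk's online examples.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialize SVG elements without a prefix (as the default namespace).
ET.register_namespace("", SVG_NS)

svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "120", "height": "120"})
ET.SubElement(
    svg,
    f"{{{SVG_NS}}}circle",
    {"cx": "60", "cy": "60", "r": "50", "fill": "gold"},
)

svg_text = ET.tostring(svg, encoding="unicode")
print(svg_text)
```

The picture is ordinary text markup, so it can be generated, searched, and mixed with other dialects using the same tools as any other XML.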
19
HEML: Historical Event Markup and Linking
 ... elements that are flexible enough to represent most known
events in the past while working well with existing document
encoding schemes, such as XHTML, TEI-Lite and Docbook. †
 Online examples at ...
[ www.utoronto.ca/ian/talks/11mar02/examples.html ]
 A “web” of historical events, cross-linking documents with
resources, timelines, etc.
† From: HEML home page, http://www.heml.org, 16 Jan 2002
20
And others
 CML - Chemical Markup Lang
 CellML - biological models
 BSML - bioinformatic sequences
 MAGE-ML - Microarray Gene Expression
 XSTAR - for archaeological research
 XMLMARC - MARC in XML
 AML - astronomy markup language
 ... many (dozens and dozens) more ...
There has been an explosion of activity towards developing “universal” XML formats for encoding, exchanging and linking information.
“Evolutionary” forces still at play (many languages are born, but only a few will survive)
Prediction -- this will lead to a big change in how academic information is created, shared, and stored.
21
Informational Data: Metadata and Packages
 Can use XML to encode information about data
– Indexes, catalog records, etc.
– data about non-text resources (images, people, whatever)
 Can also use XML to package up information (data + catalog)
 Example: IMS Content packaging
– A standard for “packaging” Web content relevant to Web based
instructional applications
– Will allow for interoperable content -- so it can be moved between
different IMS-compliant learning systems.
– A growing number of learning systems, including WebCT, support
this standard
 One of the core components for creating learning objects
22
Distributed Data
 The networking of the data is becoming more important than the
data itself
 XML is becoming the tool for creating such networks, and for
transporting data from place to place in that network.
 The preceding example languages can sometimes do this sort of
thing, but there are also specific XML languages aimed at this role.
These ideas -- and some of the existing tools -- can be used in Portal /
Website development, creation of distributed databases, etc.
23
Distributed data application: Open Directory
 RDF -- Resource Description Framework
– A language for encoding metadata about resources
– Used by the Open Directory Project to create an open, shareable
directory of Web resources
– Can search the directory site (like Yahoo), or download the entire
directory and integrate it into your own.
 Current directory has:
– 46,000 human editors
– 45,000 categories
– millions of ‘resources’ catalogued
– re-used by ~290 sites around the world
 Online examples from ...
[ www.utoronto.ca/ian/talks/11mar02/examples.html ]
24
Open Directory Model
[Diagram: dmoz.org publishes RDF data feeds as <XML>; consumers such as infospace, Ask Jeeves, Google, and the UK Labour party download the XML data from a well-known location]
25
Distributed data application: RSS
 RSS -- Rich/Resource/RDF Site Summaries
– A language for encoding summary data about Web pages/sites, and
related metadata (update interval, etc.)
– Designed for syndicated distribution of information about pages
– Rather like headlines for newspapers
 There are currently 850+ syndicators of such data, and several
thousand RSS ‘feeds’
– News agencies
– Web sites with updated content
– individuals with ‘blogs’
 Online examples from ...
[ www.utoronto.ca/ian/talks/11mar02/examples.html ]
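A minimal sketch of what an RSS consumer does, using Python's standard `xml.etree.ElementTree` module. The feed below is a hypothetical one in the simple `rss`/`channel`/`item` layout; its titles and `example.org` URLs are invented for illustration.

```python
import xml.etree.ElementTree as ET

feed = """<rss version="0.91">
  <channel>
    <title>Example News</title>
    <link>http://www.example.org/</link>
    <description>Headlines from a hypothetical site</description>
    <item>
      <title>First headline</title>
      <link>http://www.example.org/story1.html</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://www.example.org/story2.html</link>
    </item>
  </channel>
</rss>"""

channel = ET.fromstring(feed).find("channel")

# Each item is one 'headline': a title plus a link back to the full page.
headlines = [
    (item.findtext("title"), item.findtext("link"))
    for item in channel.findall("item")
]

print(channel.findtext("title"))
for title, link in headlines:
    print(f"- {title} ({link})")
```

An aggregator would run the same loop over many feeds and merge the results.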
26
RSS Syndication Model
[Diagram: Web sites publish RSS; RSS consumers include an RSS aggregator, desktop apps (e.g., Headline Viewer), JavaScript components, and others (aggregator, ...). Black lines in the diagram are <XML>.]
‘One-way’ XML: simple querying of the ‘aggregator’ via URLs, e.g. http://ag.org/?news
27
Distributed data application: Jabber
 open, XML-based protocol for instant messaging and presence.
Jabber-based software is deployed on thousands of servers
across the internet and is used by over a million people worldwide.
 A complete XML-based distributed application toolset.
† From: TEI home page, http://www.tei-c.org, 16 Jan 2002
28
Jabber:
[Diagram: Jabber clients connect to a Jabber server, which connects to another Jabber server]
Services:
• Presence
• User directory
• Proxies to Yahoo, ICQ
• Other services
29
Jabber Example
[Diagram: two Jabber clients exchanging messages through Jabber servers, which consult a contact database]
• Connect: register presence
• Lookup user: contact database
• Send text message
Requests and responses all sent in XML
Generic XML protocol for exchanging messages, plus some services.
Can be extended to non-text messaging applications
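A simplified sketch of the "send text message" step, written with Python's standard `xml.etree.ElementTree` module. It builds and reads back a Jabber-style `message` element with a `body` child; the addresses and the helper names `build_message` and `read_message` are hypothetical, and connection setup, authentication, and presence are left out.

```python
import xml.etree.ElementTree as ET

def build_message(sender: str, recipient: str, text: str) -> str:
    """Encode a chat message as a small XML fragment."""
    message = ET.Element(
        "message", {"from": sender, "to": recipient, "type": "chat"}
    )
    ET.SubElement(message, "body").text = text
    return ET.tostring(message, encoding="unicode")

def read_message(stanza: str):
    """Decode the fragment back into (sender, recipient, text)."""
    message = ET.fromstring(stanza)
    return message.get("from"), message.get("to"), message.findtext("body")

stanza = build_message("alice@example.org", "bob@example.net", "Hello & welcome")
print(stanza)
print(read_message(stanza))
```

Special characters such as `&` are escaped on the way out and restored on the way in, so the receiving side gets the text the sender wrote.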
30
XML for networked applications
 XML for encoding data
 XML for transporting information between applications
 XML for encoding instructions to send to another application
– XML interfaces to other applications
 Creation of Web Services
– Software made available to others via a generic XML interface, with
supporting facilities (directory service for ‘finding’ them, etc.)
 XML is becoming the core tool for building distributed, dynamically
configured applications
31
How can this be used?
[Diagram: an Integrated Application uses an XML interface (SOAP, XML-RPC, other...) to draw on a Web site, News Feeds, Jabber/chat, and Banking]
• Web content distribution
• Calendar aggregation
• Portlets for Web sites
• Distributed catalogs / db’s
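To make "XML for encoding instructions to send to another application" concrete, here is a sketch using XML-RPC through Python's standard `xmlrpc.client` module. The method name `orders.quantity` and its arguments are hypothetical; no network call is made, the request is only encoded and decoded.

```python
import xmlrpc.client

# Encode a call to a (hypothetical) remote method as an XML document.
request_xml = xmlrpc.client.dumps((12, "gross"), methodname="orders.quantity")
print(request_xml)

# The receiving application decodes the same XML back into a call.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

In a real deployment the XML would travel over HTTP to a server that looks up the named method and returns its result, also encoded as XML.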
32
The result of all this activity
 Enormous drive to create all the XML technologies needed behind
the scenes
 Many “core” XML languages, plus many supporting standards
 Evolution has been very quick, as the new Web model is not that n
33
XML (and related) Specifications
[Chart: specifications grouped by area and keyed by status (W3C rec, W3C draft, industry std, ‘Open’ std)]
Areas: XML Core, APIs, Style, Protocols, Web Services, Application areas, Data/presentation
Specifications shown: XML 1.0, XML names, Xfragment, Canonical, Xpath, Xpointer, Xlink, XML base, XML signature, Infoset, XML schema, XML query, DOM 1, DOM 2, DOM 3, SAX 1, SAX 2, JDOM, JAXP, XSL, XSLT, CSS 1, CSS 2, CSS 3, SOAP, XML-RPC, WDDX, Jabber, UDDI, WSDL, Biztalk, ebXML, RDF, RSS, IFX, IMS, XMI, CellML, TEI, HEML, MathML, SMIL 1 & 2, VoiceXML, SVG, XHTML 1.0, XHTML basic, Modularized XHTML, XHTML events, Xforms, Docbook, XUL, ... and 100's more
34
In Conclusion
 XML is changing the way we think about ‘raw’ information
– Open
– Universal
– Shareable
– Distributable
– Collective, complex, and emergent
 ... and, together with the Internet model, is changing the way we think about
applications
– Networked (via XML) collections of individually simple apps.
– Value in aggregation, not the individual parts
35
Conclusion II
 “A large part of how we think about music is influenced by the
methods by which it has conventionally been distributed. We
think of pop songs as being three or four minutes long because 40
years ago that was all that could fit on one side of a vinyl single.”
Moby
 We think of Internet-based computing in the same way -- in terms
of what we know or knew -- not what it can be, or will become
 Our great opportunity is to help define this future
36