eCommerce Technology 20-751
Data Interchange
20-751 ECOMMERCE TECHNOLOGY SUMMER 2002 COPYRIGHT © 2002 MICHAEL I. SHAMOS

Outline
• The need for data interchange
  – Transactions imply data exchange
• XML for identifying data
  – Separation of
    • content
    • appearance
    • document structure
• Integrating with legacy applications
  – Legacy application: one you wish you could replace but can’t
• ASN.1 for self-describing data formats
  – Solves a different problem than XML does
  – Not what the data means but how it is encoded

eCommerce Data Exchange Needs
• RFQs
• Ship Notices
• Catalogs
• Letters of Credit
• Quotations
• Purchase Orders
• Electronic Payments
• Bills of Lading
• Invoices

Invoice Example
  <UnitPrice>6.05</UnitPrice>
SOURCE: PROF. JEROME YEN

Data Exchange Problem
Different systems and applications use different names and formats for the same information:

  Real Name         Application 1    Application 2    Application 3    Application 4
                    Field Name       Field Name       Field Name       Field Name
  Customer          Customer_name    Cust_name        Account          Cust
  Customer number   Client_number    Customer_num     Cust_num         Account_num
  Quantity          Par_amount       Trade_Quantity   Shares           Quantity

SOURCE: FTISOFT

How to Make Data Portable
• Tell what the data means
• Tell how the data is structured
• Tell how it should look
SO COMPUTERS CAN UNDERSTAND IT
• BUT DO THESE SEPARATELY. MIXING IS BAD
• The meaning -- XML
• The structure -- DTD (Document Type Definition)
• The formatting -- XSL (Extensible Stylesheet Language)
• Example: XML catalog structure – DTD, XSL
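The field-name mismatch in the Data Exchange Problem table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the assignment of names to "app1" and "app2", the shared tag names (Trade, Customer, CustomerNumber, Quantity) and the sample values are hypothetical, not taken from any real system.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-application field maps, using example names from the table.
# Which application uses which name is an assumption made for illustration.
FIELD_MAPS = {
    "app1": {"Customer_name": "Customer",
             "Client_number": "CustomerNumber",
             "Par_amount": "Quantity"},
    "app2": {"Cust_name": "Customer",
             "Customer_num": "CustomerNumber",
             "Trade_Quantity": "Quantity"},
}


def to_xml(app: str, record: dict) -> str:
    """Rename one application's fields to the shared tag names and emit XML."""
    root = ET.Element("Trade")  # hypothetical root tag
    for field_name, value in record.items():
        shared_tag = FIELD_MAPS[app][field_name]
        ET.SubElement(root, shared_tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Two applications describe the same trade with different field names ...
a = to_xml("app1", {"Customer_name": "Harbour Metals",
                    "Client_number": 1042, "Par_amount": 500})
b = to_xml("app2", {"Cust_name": "Harbour Metals",
                    "Customer_num": 1042, "Trade_Quantity": 500})

# ... and end up with identical, self-identifying XML.
print(a == b)
print(a)
```

Once both sides agree on the tag names, neither needs to know the other's internal field names; only the mapping at each edge is application-specific.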
XML at a glance
Well-Formed Document:
  <Book>
    <Author>George Soros</Author>
    <Title>The Crisis of Global Capitalism</Title>
    <Year>1998</Year>
    <Publ>Public Affairs</Publ>
    <Price>26.00</Price>
    <ISBN>1-891620-27-4</ISBN>
  </Book>
DTD: Document Type Definition
  <?xml version="1.0"?>
  <!DOCTYPE Book [
    <!ELEMENT Book (Author, Title, Year, Publ, Price, ISBN)>
  ]>
SOURCE: PROF. JEROME YEN

XML Recipe Example
  <?xml version="1.0"?>
  <Recipe>
    <Name>Apple Pie</Name>
    <Ingredients>
      <Ingredient>
        <Qty unit="pint">1</Qty>
        <Item>milk</Item>
      </Ingredient>
      <Ingredient>
        <Qty unit="each">10</Qty>
        <Item>apples</Item>
      </Ingredient>
    </Ingredients>
    <Instructions>
      <Step>Peel the apples</Step>
      <Step>Pour the milk into a 10-inch saucepan</Step>
      <!-- And so on... -->
    </Instructions>
  </Recipe>

Document Is Now Block-Structured
• The same recipe document, viewed as nested blocks
  – Recipe contains Name, Ingredients and Instructions
  – Ingredients contains Ingredient blocks, each holding a Qty and an Item
  – Instructions contains Step blocks
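Because the recipe is block-structured, a program can pull fields out by tag name rather than by position. A minimal sketch using Python's standard xml.etree.ElementTree follows; the function name shopping_list is hypothetical, and the attribute values are quoted because XML requires it.

```python
import xml.etree.ElementTree as ET

RECIPE = """<?xml version="1.0"?>
<Recipe>
  <Name>Apple Pie</Name>
  <Ingredients>
    <Ingredient><Qty unit="pint">1</Qty><Item>milk</Item></Ingredient>
    <Ingredient><Qty unit="each">10</Qty><Item>apples</Item></Ingredient>
  </Ingredients>
  <Instructions>
    <Step>Peel the apples</Step>
    <Step>Pour the milk into a 10-inch saucepan</Step>
  </Instructions>
</Recipe>"""


def shopping_list(xml_text: str) -> list:
    """Return (item, quantity, unit) for every Ingredient block."""
    root = ET.fromstring(xml_text)
    result = []
    for ingredient in root.iter("Ingredient"):
        qty = ingredient.find("Qty")
        result.append((ingredient.findtext("Item"), qty.text, qty.get("unit")))
    return result


print(shopping_list(RECIPE))
# [('milk', '1', 'pint'), ('apples', '10', 'each')]

# An unquoted attribute value is not well-formed XML and is rejected.
try:
    ET.fromstring("<Qty unit=pint>1</Qty>")
except ET.ParseError as err:
    print("not well-formed:", err)
```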
DTD (Document Type Definition)
XML is extensible because it allows user-defined tags

XML COMMENT
  <!-- Sample DTD -->
DEFINES TAG Recipe HAVING A Name AND 3 OPTIONAL TAGS
  <!ELEMENT Recipe (Name, Description?, Ingredients?, Instructions?)>
TAGS Name, Description CONTAIN ONLY CHARACTER DATA
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Description (#PCDATA)>
TAG Ingredients CONTAINS ZERO OR MORE Ingredient TAGS
  <!ELEMENT Ingredients (Ingredient)*>
Ingredient TAG HAS A Qty TAG FOLLOWED BY AN Item TAG
  <!ELEMENT Ingredient (Qty, Item)>
  <!ELEMENT Qty (#PCDATA)>
  <!ATTLIST Qty unit CDATA #REQUIRED>
TAG Item HAS TWO POSSIBLE ATTRIBUTES: optional (DEFAULT VALUE 0), isVegetarian (DEFAULT true)
  <!ELEMENT Item (#PCDATA)>
  <!ATTLIST Item optional CDATA "0"
                 isVegetarian CDATA "true">
TAG Instructions HAS ONE OR MORE Step TAGS
  <!ELEMENT Instructions (Step)+>
SOURCE: JAVAWORLD

Document Object Model (DOM) in XML
• An XML structured document can be treated and manipulated as an object
• DOM parser transforms the document into a parse tree
• Program can walk the tree, performing arbitrary transformations
• DOM API then converts the new tree to another XML file
• New file can be printed, sent over the net or used as input to another program
• Applications can now exchange data without knowing formats: the DTD contains everything necessary
SOURCE: JAVAWORLD

XSL (Extensible Stylesheet Language)
REWRITING RULES TELLING HOW TO MAP THE CONTENTS OF XML TAGS TO HTML
  <xsl>
    <rule>
      <target-element type="title"/>
      <H1 color="red" font-family="Arial">
        <children/>
      </H1>
    </rule>
  </xsl>
• <target-element type="title"/>: LOOK FOR ALL “TITLE” TAGS
• <H1 color="red" font-family="Arial">: PUT THE CONTENTS OF “TITLE” TAGS INTO RED HEADER FONT ARIAL
• <children/>: DO THE SAME FOR THE CHILDREN OF “TITLE” TAGS
SOURCE: WDVL.COM
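The DOM steps (parse into a tree, walk it, transform it, write it back out) can be sketched with Python's standard xml.dom.minidom. As the transformation, this sketch hand-codes the effect of the XSL rule above, turning every title tag into a red Arial H1. It is an illustration only, not an XSL processor; the function name titles_to_h1 and the sample document are hypothetical.

```python
from xml.dom import minidom


def titles_to_h1(xml_text: str) -> str:
    """Parse, rewrite every <title> as a styled <H1>, and serialize again."""
    doc = minidom.parseString(xml_text)            # DOM parser builds the tree
    for node in list(doc.getElementsByTagName("title")):
        h1 = doc.createElement("H1")
        h1.setAttribute("color", "red")
        h1.setAttribute("font-family", "Arial")
        while node.firstChild:                     # move the children across
            h1.appendChild(node.firstChild)
        node.parentNode.replaceChild(h1, node)     # swap the new node in
    return doc.documentElement.toxml()             # tree back to XML text


source = ("<book><title>The Crisis of Global Capitalism</title>"
          "<author>George Soros</author></book>")
print(titles_to_h1(source))
```

Tags other than title pass through untouched, which is the point of keeping formatting rules separate from content.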
XML Financial Implementations
• OFX - Open Financial Exchange
• FIXML - XML grammar for FIX (Financial Information Exchange). MSDW is a principal.
• FINXML - Capital markets
• SWIFT
• X12 - data exchange standard for business transactions
SOURCE: PROF. JEROME YEN

XML Implementations
• HL7 - health information (Health Level 7)
• EDIFACT/SimplEDI - syntax - repository
• IFX - Interactive Financial Exchange - personal banking
• – catalogues, supply chain automation
• IOTP – Internet Open Trading Protocol: buying, payments
• XBRL – Extensible Business Reporting Language
• XML-enabled product vendor list
SOURCE: PROF. JEROME YEN

Ubiquitous XML Architecture
[Figure not reproduced]
SOURCE: PROF. JEROME YEN

The Discovery Problem
• Broader B2B: A mid-sized manufacturer needs to create 400 online relationships with customers, each with its own set of standards and protocols
• Smarter Search: A flower shop in Australia wants to be “plugged in” to every marketplace in the world, but doesn’t know how
• Easier Aggregation: A B2B marketplace cannot get catalog data for relevant suppliers in its industry, along with connections to shippers, insurers, etc.
What is needed:
• Describe Services
• Discover Services
• Integrate Them Together
SOURCE: UDDI.ORG

UDDI (Universal Description, Discovery and Integration)
• Microsoft, IBM, Ariba formed uddi.org
• Announced August 31, 2000; endorsed by over 30 companies
• Global directory of companies; searchable by computer
• Companies publish machine-readable information about themselves AND how to conduct ebusiness with them
• Platform-neutral open standard based on XML
UDDI Registry Entries
• Entities register information about themselves
• Standards Bodies, Programmers, Publishers register information about their Service Types (specs)
SOURCE: MICROSOFT

UDDI Registry Contents
• Business name, general business description (in any number of languages)
• Contact info: names, phone numbers, fax numbers, web sites, etc.
• Known identifiers: D-U-N-S, Thomas, domain name, stock ticker symbol, other

• Business categories: name-value pairs
• 3 standard taxonomies in V1:
  – Industry: NAICS (Industry codes - US Govt.)
  – Product/Services: UN/SPSC (ECMA)
  – Location: Geographical taxonomy (ISO 3166)
• …more in upcoming releases

• “How to do eCommerce with us” (machine-readable)
• Business process (functional)
• Service specifications (technical)
• Binding information (implementation)
• Language/platform/implementation-agnostic
SOURCE: MICROSOFT

UDDI Operation
1. Harbour Metals creates online website with local ASP
2. ASP registers Harbour Metals with the UBR (UDDI Business Registry)
3. Marketplaces and search engines query UBR, cache Harbour Metals data, and bind to its services
4. Consumers and businesses discover Harbour Metals and do business with it
[Figure labels: SydneyNet.com, UDDI Registry]
SOURCE: UDDI.ORG

Impact of XML
• Interchange mechanism between applications
• Rapidly becoming the database language of the Web
• XML client/server/server transactions over HTTP
• Permits web data repositories
• XML properties:
  – Scalable
  – Maintainable
  – Easy to use (spreadsheet style skills)
  – Interoperable (exchange business components)
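The UDDI publish-and-discover flow (an ASP registers a business, a marketplace queries the registry and binds to a service) can be mimicked with a toy in-memory registry. This is a sketch under loud assumptions: ToyRegistry, BusinessEntry, their methods, the description text and the example URL are all hypothetical and bear no relation to the actual UDDI API or data model.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessEntry:
    """Hypothetical stand-in for one registry record."""
    name: str
    description: str
    categories: dict                                # taxonomy name -> value
    bindings: dict = field(default_factory=dict)    # service -> access point


class ToyRegistry:
    """Hypothetical in-memory directory; not the real UDDI interface."""

    def __init__(self):
        self._entries = {}

    def publish(self, entry: BusinessEntry) -> None:
        self._entries[entry.name] = entry

    def find(self, taxonomy: str, value: str) -> list:
        return [e for e in self._entries.values()
                if e.categories.get(taxonomy) == value]


registry = ToyRegistry()

# Step 2: the ASP registers Harbour Metals (all values are made up).
registry.publish(BusinessEntry(
    name="Harbour Metals",
    description="Metal supplier",
    categories={"ISO 3166": "AU"},
    bindings={"orders": "https://harbourmetals.example/orders"},
))

# Step 3: a marketplace queries by location and reads the binding information.
for business in registry.find("ISO 3166", "AU"):
    print(business.name, business.bindings["orders"])
```

The design point carried over from the slides is that categories are plain name-value pairs, so a searcher needs only the taxonomy name and a value, not prior knowledge of the business.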
Standard Encodings
• Business systems must exchange data in different formats:
  – Invoices, payment orders, checks, bills of lading, delivery instructions, authentication information
• Need standard notation to describe transmitted data in communication protocols
• BUT: what format is the data in?
  – What does 437573746F6D6572 (hex) mean?
  – Where do fields begin and end?
  – How about complex data structures (arrays, etc.)?

Need for Standard Encodings
• Interoperability
  – How can my program read your program’s data?
• Parties cannot always agree in advance on standards
• Encodings need to be changed
  – YYMMDD became YYYYMMDD (the Y2K problem)
• Minimize programmer and development time

Abstract Syntax Notation One (ASN.1)
• ASN.1 is a notation for describing data; combined with encoding rules such as BER, each element carries its own type and length, so the format can be decoded from the data itself
• ASN.1 is not a programming language
• It describes only data structures. No code, no logic.

Abstract Syntax Notation
• ASN.1 has primitive types: BOOLEAN, INTEGER, REAL, ENUMERATED, BIT STRING, IA5String, . . .
• ASN.1 has
  – SET (unordered) and SEQUENCE (fixed order) of primitive types
  – CHOICE for selecting alternative types (integer or real)
• Can define new types:
  Month ::= INTEGER (1..12)
  Day ::= INTEGER (1..31)
  Daily-stock-volume ::= SEQUENCE SIZE (31) OF INTEGER
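What the three type definitions constrain can be shown with plain Python checks. This is only a sketch of the constraints' meaning, not output from an ASN.1 compiler; the helper names constrained_int and daily_stock_volume are hypothetical.

```python
def constrained_int(low: int, high: int):
    """Build a checker for an ASN.1-style INTEGER (low..high) constraint."""
    def check(value):
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            raise ValueError(f"{value!r} is not an INTEGER ({low}..{high})")
        return value
    return check


Month = constrained_int(1, 12)   # Month ::= INTEGER (1..12)
Day = constrained_int(1, 31)     # Day ::= INTEGER (1..31)


def daily_stock_volume(values):
    """Daily-stock-volume ::= SEQUENCE SIZE (31) OF INTEGER."""
    values = list(values)
    if len(values) != 31:
        raise ValueError(f"need exactly 31 values, got {len(values)}")
    if not all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        raise ValueError("every element must be an INTEGER")
    return values


print(Month(12), Day(31), len(daily_stock_volume([0] * 31)))

try:
    Month(13)
except ValueError as err:
    print(err)
```

Note that these are pure descriptions of legal values, which matches the point that ASN.1 carries no code or logic of its own.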
Basic Encoding Rules (BER)
• Define how fields described in ASN.1 should be encoded
• Units of BER are data elements
• A data element is a triple: { identifier-type, length, value }
• Some type codes (identifier octets, in hex):
  BOOLEAN     01
  INTEGER     02
  IA5String   16  (7-bit ASCII characters, one per octet)
  SEQUENCE    30  (tag number 10 plus the constructed bit)
  SET         31  (tag number 11 plus the constructed bit)
• The string “Customer” would be encoded as
  16 08 43 75 73 74 6F 6D 65 72
  – 16: IA5String
  – 08: LENGTH 8
  – 43: HEX “C”, 75: HEX “u”, . . . , 72: HEX “r”

Basic Encoding Rules
• Content field may be primitive (value) or structured (content has subcomponents)

Basic Encoding Rules
  BBCard ::= SEQUENCE {
    name IA5String (SIZE (1..60)),
    team IA5String (SIZE (1..60)),
    age INTEGER (1..100),
    position IA5String (SIZE (1..60)),
    handedness ENUMERATED {left-handed(0), right-handed(1), ambidextrous(2)},
    batting-average REAL }

“Casey”, “Mudville Nine”, 32, “left field”, ambidextrous, 0.250 (47 bytes of text)

  302D1605 43617365 79160D4D 75647669 6C6C6520
  4E696E65 02012016 0A6C6566 74206669 656C640A
  01020903 80FE01
  (47 bytes in BER)
  – 43 61 73 65 79 = “Casey”; 4D 75 64 76 69 6C 6C 65 = “Mudville”

Using ASN.1/BER
• ASN.1 DEFINITIONS -> ASN.1 COMPILER -> SOURCE LANGUAGE (JAVA, C) DATA STRUCTURES and ENCODER/DECODER
• DATA STRUCTURES, ENCODER/DECODER and APPLICATION CODE -> (JAVA, C) COMPILER -> APPLICATION PROGRAM
• APPLICATION PROGRAM NOW READS AND WRITES DATA ACCORDING TO BER

ASN.1 Encoding Rules
• BER (Basic) from 1980s
  – Internet messaging, telephone billing
  – BER and Java
• DER (Distinguished)
  – Security applications requiring a unique method of encoding
• CER (Canonical)
  – For long messages. Encoding can begin before the whole message has been read
• PER (Packed)
  – Efficient encodings to reduce bandwidth requirements
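The identifier-length-value triples of BER can be reproduced by hand. The following is a minimal sketch, not a complete BER implementation: it assumes short-form lengths (under 128 bytes), non-negative integers, and positive REAL values whose base-2 exponent fits in one octet. All function names are hypothetical. It rebuilds both the “Customer” string and the 47-byte BBCard encoding.

```python
from fractions import Fraction


def tlv(tag: int, content: bytes) -> bytes:
    """One data element: identifier octet, length octet, value octets."""
    if len(content) >= 128:
        raise ValueError("sketch supports short-form lengths only")
    return bytes([tag, len(content)]) + content


def ia5string(text: str) -> bytes:
    return tlv(0x16, text.encode("ascii"))


def integer(value: int, tag: int = 0x02) -> bytes:
    if value < 0:
        raise ValueError("sketch supports non-negative integers only")
    size = (value.bit_length() + 8) // 8      # leaves room for the sign bit
    return tlv(tag, value.to_bytes(size, "big", signed=True))


def enumerated(value: int) -> bytes:
    return integer(value, tag=0x0A)


def real(value: float) -> bytes:
    """Binary REAL, base 2: value = mantissa * 2**exponent, mantissa odd."""
    frac = Fraction(value)                    # exact; denominator is 2**k
    if frac <= 0:
        raise ValueError("sketch supports positive values only")
    mantissa = frac.numerator
    exponent = -(frac.denominator.bit_length() - 1)
    while mantissa % 2 == 0:
        mantissa //= 2
        exponent += 1
    if not -128 <= exponent <= 127:
        raise ValueError("sketch supports one-octet exponents only")
    body = (bytes([0x80]) + exponent.to_bytes(1, "big", signed=True)
            + mantissa.to_bytes((mantissa.bit_length() + 7) // 8, "big"))
    return tlv(0x09, body)


def sequence(*elements: bytes) -> bytes:
    return tlv(0x30, b"".join(elements))


print(ia5string("Customer").hex(" ").upper())
# 16 08 43 75 73 74 6F 6D 65 72

card = sequence(ia5string("Casey"), ia5string("Mudville Nine"), integer(32),
                ia5string("left field"), enumerated(2), real(0.25))
print(len(card), card.hex().upper())
```

The field names of BBCard never appear in the bytes; only types, lengths and values do, which is why BER output is compact but needs the ASN.1 definition to be given meaning.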
Digital Certificates
• Digital certificates are encoded in DER
• ASN.1 Primer from RSA
  Certificate ::= SEQUENCE {
    tbsCertificate TBSCertificate,
    signatureAlgorithm AlgorithmIdentifier,
    signatureValue BIT STRING }
• Full ASN.1 definition

ASN.1 Applications
• Telephone billing information
  – Transferred Account Procedure (TAP3)
  – UMTS (3G phones)
• X9 financial services (checks, electronic funds transfer)
• Air-to-ground aircraft information
• Electric and gas utilities
• Automobile diagnostic monitoring systems
• Radio Frequency Identification (RFID)
• Biometric IDs (Proposed ANSI Standard X9.84)
  – Common Biometric Exchange File Format (CBEFF)
• Smart cards (ISO 7816-4)

XML and ASN.1
• Both XML and ASN.1 represent hierarchical (tree-structured) data
• Therefore, one can be translated into the other!
• IBM ASN.1/XML translator
• We also have XER, the XML encoding rules

Q&A