Accounting and the Web (XML and Web Services)
Stephen Burd
Associate Professor
Anderson Schools of Management
University of New Mexico
Prepared for the ALPFA 2003 Conference
October 4, 2003 – Albuquerque, New Mexico
Slides available online at http://averia.mgt.unm.edu

Objectives
Describe Internet and Web standards with particular attention to capabilities needed to access and use accounting information.
  - Historical Internet/Web standards – TCP/IP, HTTP, …
  - Representing data semantics (meaning)
  - The Internet as a platform for distributed accounting applications
Describe current and future Web-based standards for accounting information interchange.
  - XML and accounting-related standards based on XML
  - SOAP and accounting-related applications based on SOAP

The Web - What Makes it Tick?
Ubiquitous Telecommunication Connections
  - AOL, Comcast, AT&T – in all shapes and sizes for everyone from homes to multinational corporations.
Common Languages
  - Transmission Control Protocol / Internet Protocol (TCP/IP): A general-purpose language for computers to talk to one another.
  - Hypertext Transfer Protocol (HTTP): A language for reliable document transfer.
  - Hypertext Markup Language (HTML): A language for describing document format and content.

What Makes The Web Tick - Continued
Common Directories
  - Domain Name System (DNS): A directory of domains and computers.
      For example, microsoft.com and www.mgt.unm.edu
  - Uniform Resource Locator (URL): A unique name for every Web document and application.
      For example, http://averia.mgt.unm.edu/default.htm
Non-proprietary, free or cheap standards-based software
  - Web browsers (for example, Microsoft Internet Explorer)
  - Web server and application development software

The Internet and the Web
The Internet is a global network of networks tied together with TCP/IP, DNS, and a handful of other standards.
The Web is the set of documents and applications that can be accessed and interpreted via Web standards such as HTML and HTTP.
The Internet is the infrastructure over which Web resources are accessed and delivered.

Web Usage Scenarios
End-user information access
  - For example, viewing a document (Web page) in a browser and navigating from page to page using hyperlinks.
  - This is what the Web was originally designed to do and has been doing since the early 1990s.

Web Usage Scenarios - Continued
End-user access to application software via Web pages
  - For example, buying goods online at a site such as www.amazon.com.
  - This usage scenario developed in the late 1990s and is now nearly ubiquitous.
  - Extensions and additions to HTML and HTTP enabled this type of application (for example, HTML forms, JavaScript, CGI scripts, and Active Server Pages).

Web Usage Scenarios - Continued
Information access/exchange by applications
  - For example, an indexing service such as Google that automatically accesses many Web documents and “does something useful” with the information it extracts.
  - For example, a service that examines Web “price lists” to find the lowest price for a particular product.
  - This usage scenario began in the late 1990s, but has been hampered by limitations of HTML.

Web Usage Scenarios - Continued
Application components built from distributed Web software
  - For example, a brokerage application that combines order entry software on the broker’s server, inventory management software on vendor servers, credit verification software stored on another server, and delivery scheduling software stored on yet another server.
  - The application is little more than a shell (script) that calls other Web-accessible applications to do most of the actual work.
  - Early standards to address this usage scenario were cumbersome and limited – newly developed Web standards will help.
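
The “shell that calls other applications” idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each function below is a local stub standing in for a separate Web-accessible application, and all function names, field names, and the sample SKU are hypothetical rather than taken from any real brokerage system.

```python
from typing import Callable, Dict, List

Order = Dict[str, object]

# Each function is a local stub standing in for a remote,
# Web-accessible application (hypothetical names and fields).

def enter_order(order: Order) -> Order:
    """Broker's server: record the order."""
    return {**order, "status": "entered"}

def check_inventory(order: Order) -> Order:
    """Vendor's server: confirm the item is in stock."""
    return {**order, "in_stock": True}

def verify_credit(order: Order) -> Order:
    """Credit service: approve the customer's credit."""
    return {**order, "credit_ok": True}

def schedule_delivery(order: Order) -> Order:
    """Delivery service: schedule only if stock and credit checks passed."""
    ready = bool(order["in_stock"]) and bool(order["credit_ok"])
    return {**order, "status": "scheduled" if ready else "held"}

# The "application" itself is just this ordered list of calls.
PIPELINE: List[Callable[[Order], Order]] = [
    enter_order,
    check_inventory,
    verify_credit,
    schedule_delivery,
]

def process(order: Order) -> Order:
    """Pass the order through every component in turn."""
    for step in PIPELINE:
        order = step(order)
    return order

result = process({"sku": "EXAMPLE-1", "quantity": 1})
print(result["status"])
```

In a real deployment each stub would be replaced by a network call to the server that hosts that function, which is the gap the newer Web standards discussed later aim to fill.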

Sample Transaction in an HTML Document

HTML Source – Page 1
<table border=0 cellpadding=3 cellspacing=0 bgcolor='#FFFFFF' width=600>
<tr bgcolor='#CCCCCC'>
  <td width=10 class='btsb' nowrap>&nbsp;</td>
  <td class='btsb' nowrap>Product</td>
  <td class='btsb' nowrap align='right'>CDW</td>
  <td class='btsb' nowrap align='right'>Qty</td>
  <td class='btsb' nowrap align='right'>Price</td>
  <td class='btsb' nowrap align='right'>Ext Price</td>
</tr>
<TR>
  <td class='bts'>&nbsp;</td>
  <TD class='bts' nowrap> <a href='/shop/products/default.asp?EDC=488664'>US Robotics 56K PCI Faxmodem</a></TD>
  <TD class='bts' nowrap align='right'>488664</TD>
  <TD class='bts' nowrap align='right'>1</TD>
  <TD class='num' nowrap align='right'>$23.60</TD>
  <TD class='num' nowrap align='right'>$23.60</TD>
</TR>
<tr>
  <td colspan=4 align='left'> <img src='http://img.cdw.com/global/pixels/none.gif' height=1 width=1 alt=' ' border=0></td>
  <td colspan=2 align='right'> <img src='http://img.cdw.com/global/pixels/grey.gif' height=1 width='100%' alt=' ' border=0></td>
</tr>

HTML Source – Page 1
<TR>
  <td rowspan=6 class='bts' valign='top'>&nbsp;</td>
  <td colspan=3 rowspan=5 class='bts' valign='top'>Parcel Tracking # for Box 1 <a target=new href='http://track.airborne.com/atrknav.asp?ShipmentNumber=78314872160'>78314872160</a><BR></td>
  <TD align='right' class='btsb'>Sub - Total</TD>
  <TD class='num' nowrap align='right'>$23.60</TD>
</TR>
<TR>
  <TD align='right' class='btsb'>Freight</TD>
  <TD class='num' nowrap align='right'>$10.99</TD>
</TR>
<TR><TD align='right' class='btsb'>Tax</TD>
  <TD class='num' nowrap align='right'>$0.00</TD>
</TR>
<tr>
  <td colspan=2 align='right'> <img src='http://img.cdw.com/global/pixels/grey.gif' height=1 width='100%' alt=' ' border=0></td>
</tr>
<TR>
  <TD align='right' class='btsbRed'>TOTAL</TD>
  <TD class='numBoldRed' nowrap align='right'>$34.59</TD>
</TR>
</table>

HTML Formatting
HTML is intended to format information in a device-neutral way.
  - Content is interspersed with formatting commands called tags. For example, <table>, <tr>, and <td> are table formatting tags.
  - The display device (usually a Web browser) interprets the tags and makes decisions on how to display the information, for example:
      Is the window/screen wide enough for the table? If not, should the content be shrunk or scrolled?
      Are the requested colors available on this display? If not, what colors or grey shades should be substituted?
      What is a “large font” given the display capabilities and current user preferences?

HTML Capabilities and Limitations
HTML has achieved its primary goal – it’s made data accessible from and viewable on virtually any computer in the world.
HTML has also achieved secondary goals, including linking huge amounts of data on the Internet via hypertext references, or links.
HTML’s primary limitation is what it doesn’t provide (and what it was never intended to provide!) – information about the semantics, or meaning, of formatted data.

Extracting Semantic Information
Assume that the previous Web page was the source document for a journal entry. How would you:
  - determine the amounts to be recorded?
  - determine the currency unit(s)?
  - determine the transaction date?
  - determine the account to be debited?
Now assume that you want to program a computer to make the same journal entry. How will the program find the information it needs?

Human Semantic Data Extraction
A human can extract the information needed to make the appropriate journal entry via:
  - Key symbols, words, and phrases, such as “$”, “Tax”, and “Ext Price”.
  - Visual cues, including the placement of key symbols and phrases to the left or right of key numbers.
      For example, the word “Total” appears immediately to the left of the currency amount “$34.59”.
      For example, the word “Price” appears immediately above the currency amount “$23.60”.
  - Experience with the language and interpreting printed or displayed documents.
      For example, what are possible synonyms of “total” and where does a total usually appear on an invoice?

Automated Semantic Data Extraction
It is very difficult to program a computer to perform semantic data extraction using “human” methods.
  - Large dictionaries of key symbols/words/phrases are required (what about spelling errors and coffee spills on the page?).
  - Computers are notoriously poor at visual recognition, including character/symbol recognition and interpreting “placement cues”.
Any program that performs the task will generally have an unacceptable error rate.
  - For example, consider the error rate of most OCR programs when working with mixed-content documents.

XML
The World Wide Web Consortium (W3C) considered these problems in the 1990s and began an effort to address them.
They developed a new data representation language that addresses both format and semantics.
The new language is called eXtensible Markup Language (XML).
XML is a simplified subset of SGML, the language from which HTML was derived, and XML documents can be transmitted via HTTP.
XML is designed to be extended by language users.

XML and Data Semantics
XML enables users to define new tags to describe whatever they want, including the meaning of data embedded within XML documents.
For example, rather than guessing that a particular number in a document is an invoice total based on nearby key words or visual position, data meaning can be made explicit with a tag:
  - For example, <invoice-total>34.59</invoice-total> and <price-per-unit>23.60</price-per-unit>
The tags are ignored by browsers and application software that aren’t specifically programmed to recognize them.

XML and Data Users
The W3C defined XML, which essentially threw the problem of data semantics back to data users.
  - Users of data must develop and agree upon tags that are meaningful within their particular problem domain.
  - Users must develop software applications that “understand” the tags and manipulate tagged data in useful ways.
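
As a minimal sketch of how explicit tags simplify automated extraction, the following Python fragment reads amounts directly by tag name using the standard xml.etree.ElementTree module. The vocabulary is an assumption: only <invoice-total> and <price-per-unit> appear in the slides, while the <invoice>, <line-item>, <quantity>, <freight>, and <tax> elements and the currency attribute are hypothetical names invented for illustration. The amounts are those of the sample transaction shown earlier.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical invoice document. Only <invoice-total> and <price-per-unit>
# come from the slides; every other name here is an illustrative assumption.
INVOICE_XML = """
<invoice currency="USD">
  <line-item>
    <quantity>1</quantity>
    <price-per-unit>23.60</price-per-unit>
  </line-item>
  <freight>10.99</freight>
  <tax>0.00</tax>
  <invoice-total>34.59</invoice-total>
</invoice>
"""

def extract_amounts(xml_text: str) -> dict:
    """Look up each amount by its tag instead of guessing from layout."""
    root = ET.fromstring(xml_text)
    return {
        "currency": root.get("currency"),
        "quantity": int(root.findtext("line-item/quantity")),
        "unit_price": Decimal(root.findtext("line-item/price-per-unit")),
        "freight": Decimal(root.findtext("freight")),
        "tax": Decimal(root.findtext("tax")),
        "total": Decimal(root.findtext("invoice-total")),
    }

amounts = extract_amounts(INVOICE_XML)

# Because the meaning of each number is explicit, the program can also
# check that the parts add up to the stated total.
computed = (amounts["quantity"] * amounts["unit_price"]
            + amounts["freight"] + amounts["tax"])
print(amounts["currency"], amounts["total"], computed == amounts["total"])
```

No keyword dictionaries or placement cues are involved. The program works only because sender and receiver agree on what each tag means, which is exactly the agreement that data users must reach among themselves.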
Many industry-specific groups have been created to develop XML tag standards.
The organizations that develop the standards aren’t affiliated with or sanctioned by the W3C.

Accounting-Related XML Standards
Extensible Business Reporting Language (XBRL)
  - Formerly known as XFRML
Financial Information Exchange (FIX) protocol
  - Real-time electronic exchange of securities transactions
Mortgage Industry Standards Maintenance Organization (MISMO)
  - Electronic commerce standards for the mortgage industry
See www.xml.org for other standards and organizations.

XBRL
XBRL was developed by a large consortium of accounting-related companies and governmental organizations, sponsored by the AICPA.
XBRL has two current standards:
  - XBRL Financial Statements
  - XBRL General Ledger
The XBRL Financial Statements standard was developed first, and pilot projects have already been implemented.
Other XBRL standards are in progress in areas such as tax filings and credit reporting.

XBRL Financial Statement in Excel

XBRL Example – XML Source
<ci:statements.incomeStatement>
  <!-- INCOME -->
  <ci:netIncomeAvailableToCommon.netIncome numericContext="C0103">
    -436215000
  </ci:netIncomeAvailableToCommon.netIncome>
  <ci:netIncomeAvailableToCommon.netIncome numericContext="C0104">
    48625000
  </ci:netIncomeAvailableToCommon.netIncome>
  <ci:netIncomeAvailableToCommon.netIncome numericContext="C0105">
    17133000
  </ci:netIncomeAvailableToCommon.netIncome>
  <ci:incomeFromContinuingOperations.incomeTaxes numericContext="C0103">
    -1081000
  </ci:incomeFromContinuingOperations.incomeTaxes>
  <ci:incomeFromContinuingOperations.incomeTaxes numericContext="C0104">
    25367000
  </ci:incomeFromContinuingOperations.incomeTaxes>
  <ci:incomeFromContinuingOperations.incomeTaxes numericContext="C0105">
    10233000
  </ci:incomeFromContinuingOperations.incomeTaxes>

Current XBRL Limitations
XBRL is well-developed for financial reporting of results
  - Income statement
  - Balance sheet
  - Cash flow
XBRL or a companion protocol needs further work to address ongoing
operations
  - Transactions and journal entries
  - Auditing and compliance

Accounting-Related XML Applications
Submission and automated analysis of SEC and related financial statements
  - For example, export SEC filings as XBRL documents.
  - For example, analyze competitor financial results via automated analysis of their SEC filings.
Data migration among accounting software from different vendors and software packages
  - For example, migrate an entire general ledger from Peachtree to QuickBooks or Oracle Small Business.

What’s Next? - SOAP
Simple Object Access Protocol (SOAP) enables objects (program methods or subroutines) to call and execute one another via the Internet.

SOAP and XML
SOAP messages are XML documents.
SOAP messages are transmitted by HTTP – same as “ordinary Web pages”.
XML documents can contain other XML documents, so the call to a method or subroutine can pass a document as a parameter.
  - For example, a financial statement or general ledger.

SOAP Security
SOAP security standards are currently in flux.
SOAP messages can be encrypted through virtual private networks (VPNs) like any other Internet message type (e.g., email).
Digital signatures and certificates need to be better integrated into SOAP.

SOAP and Application Software
SOAP is the “glue” that will bind software distributed around the Internet into larger applications.
  - What SOAP does isn’t new, but the way it does it simplifies building distributed Internet applications (note the word “simple” in the acronym).
SOAP is currently supported by many vendors who are building software tools to write and host SOAP programs.
  - For example, Microsoft .NET
  - For example, SOAP-enabled Web servers to host SOAP programs, subroutines, and methods.
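
To make “SOAP messages are XML documents” concrete, here is a minimal sketch that builds a SOAP request whose parameter is itself an XML document. It uses Python’s standard xml.etree.ElementTree module and the SOAP 1.1 envelope namespace. Everything else is an assumption for illustration: the PostJournalEntry method, the urn:example:accounting namespace, the <journal-entry> vocabulary, and the account names are hypothetical and do not come from any product named in these slides. The sketch only constructs the message; sending it would be an ordinary HTTP POST to whatever server hosts the method.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ACCT_NS = "urn:example:accounting"  # hypothetical application namespace

def build_soap_request(method: str, payload: ET.Element) -> str:
    """Wrap an XML document in a SOAP envelope as the method's parameter."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{ACCT_NS}}}{method}")
    call.append(payload)  # the embedded document travels inside the call
    return ET.tostring(envelope, encoding="unicode")

# Hypothetical journal entry document (tag and account names are invented).
entry = ET.Element("journal-entry")
ET.SubElement(entry, "debit", account="Office Equipment").text = "34.59"
ET.SubElement(entry, "credit", account="Accounts Payable").text = "34.59"

message = build_soap_request("PostJournalEntry", entry)
print(message)
```

The receiving program parses the envelope, finds the method element in the body, and hands the embedded document to the named subroutine. A financial statement or an entire general ledger could be passed the same way.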

SOAP-Based Accounting Software
Solomon (Microsoft)

SOAP-Based Accounting Software
ObjAcct

What is an XML or XBRL SDK?
A software development kit (SDK) is a library of program functions that are designed to be built into application software (e.g., an accounts receivable system).
SDKs are intended for software developers who want to incorporate specific functions into their products without having to write programs to perform those functions.
SDKs are irrelevant to end users unless they use software developed in-house.

Future SOAP Applications
Automated audit review of financial transactions and statements.
  - Periodic transmission of all transaction records and statements to external auditors and governmental agencies.
  - Continuous export of general ledger data to external auditors and financial markets.
Outsourcing of one or more accounting-related business functions with direct data transfer back to the general ledger.
  - For example, receivables collection

SOAP and XML Interactions

Conclusions
XML-based accounting standards such as XBRL have already changed the nature of financial reporting. They’ll continue to advance in areas such as banking and taxation.
New XML-based standards are emerging to address real-time capture of transaction data.
These developments will enable simple accounting data transfer among users and software.
SOAP and related standards will enable parts of an accounting information system to be distributed across the Internet.
They’ll also enable outsourcing of entire accounting functions and integration of external service providers into internal accounting systems.

Linkography
Financial Information Exchange (FIX) Protocol (www.fixprotocol.org) - Real-time electronic exchange of securities transactions
NASDAQ/Microsoft Excel Investor’s Assistant (www.nasdaq.com/xbrl) - Demo of financial statement analysis in Excel using XBRL financial statements
ObjAcct Business Systems (www.objacct.com) - SOAP-based accounting software
Organization for the Advancement of Structured Information Standards (www.xml.org) - A clearinghouse for information about XML-related standards and organizations
Solomon Software (www.microsoft.com/BusinessSolutions/Solomon/default.mspx)
World Wide Web Consortium (www.w3c.org) - The source of many Web standards including XML
XBRL International (www.xbrl.org) - Developers of the Extensible Business Reporting Language standard