Efficient XML Report
Version 2.0

Gary Blackburn
Jason Craig
Aaron Braeckel

Feb 2013

This research is in response to requirements and funding by the Federal Aviation Administration (FAA). The views expressed are those of the authors and do not necessarily represent the official position of the FAA.

1 Table of Contents

1 Table of Contents
2 Executive Summary
3 Introduction:
4 Example of XML verbosity using a METAR report:
5 Why Binary XML?
  5.1 EXI
  5.2 FastInfoSet
6 W3C Benchmark environment for XML compression testing:
  6.1 Compressors evaluated:
    6.1.1 GZIP
    6.1.2 EXIficient
    6.1.3 Sun’s FastInfoset
    6.1.4 Efficient XML
  6.2 Framework Compressor descriptions:
7 Parsing API:
  7.1 Simple API for XML (SAX)
8 How to improve compression
  8.1 Problem with including raw text strings
  8.2 Floating point rounding errors
9 Results:
  9.1 Compression
    9.1.5 AgileDelta’s EfficientXML
    9.1.6 FastInfoSet
    9.1.7 GZIP
  9.2 Encoding CPU usage
10 High level analysis:
  10.1 Single report analysis:
  10.2 24 Hour analysis
  10.3 24 hours of data compression and parsing rankings
  10.4 Detailed results:
    10.4.1 METAR
    10.4.2 PIREP
    10.4.3 SIGMET/AIRMET
    10.4.4 TAF
11 APPENDIX A - ACRONYMS

2 Executive Summary

Using a binary representation of XML for typical weather products successfully addresses the verbosity of representing them in native XML format. Binary representations of XML compressed the data to as little as about 1% of its original size.
For example, 24 hours of METAR reports compressed from 427196 kilobytes down to 4720 kilobytes. Compression could have been improved even further had the original METAR Traditional Alphanumeric Code (TAC) string not been included in the Weather Exchange Model (WXXM) XML data set used in the analysis. Using appropriate floating point precision was also found to improve compression significantly in all cases.

The following table shows the improvement in compression when using 24 hours of data as input compared to single reports; it also shows that the binary XML size per report for 24 hours of data is significantly smaller than the original TAC:

METAR Compression Summary

Compression                          Bytes per report
24 hours
  Uncompressed XML                   2721
  Binary XML (EXI)                   30
  GZIP XML                           144
Single
  Uncompressed XML                   3957
  Binary XML (EXI)                   345
  GZIP XML                           1163
24 hours of TAC (sans headers)
  Original                           67

Note: Single report results may not be representative since they are not averages but a single report used as an example.

Using a binary XML format reduces the verbosity of XML documents and will reduce the cost of parsing, transmitting, and storing common weather products. It also hinders the use of ordinary text editors to view and edit the document, one of the cornerstones of XML. Hardware/software solutions could be developed to seamlessly decode binary output to the original XML for human readability if required.

The XML standards body, the World Wide Web Consortium (W3C), adopted a binary format as a Recommendation on 10 March 2011. This format, Efficient XML Interchange (EXI), was used by the top performing compressors in this study.

There is significant variability in the amount of CPU required to encode the XML data; generally, the better the compression, the longer it takes to encode. As its CPU encoding reference, the test framework uses the time required to simply regenerate native XML from uncompressed data received directly from the parser. This represents the worst compaction but the best CPU encoding time.
The same 24 hours of METAR data mentioned above takes 7147 milliseconds (ms) to encode compared to 6142 ms to regenerate XML from the parser. The study concludes that the impact on CPU requirements for processing is not a significant deterrent to using compressed binary representations of weather data.

Recent trends in the bandwidth available to home users and in commercially available processing performance suggest that, over time, network bandwidth will be a scarcer resource than processing power. Greater compression at the cost of processing power is likely to be a worthwhile tradeoff.

3 Introduction:

The eXtensible Markup Language (XML) is the de facto standard for data representation and exchange over the World Wide Web. Its self-describing property gives XML a flexibility that has led to wide acceptance. Unfortunately, this property is also the source of its main criticism: the sheer size of XML documents. Compactness issues negatively impact both bandwidth and latency, especially when sending data to mobile devices and aircraft. Transmission costs are recurring and can be very expensive depending on where the data is sent. Finally, storage and archival requirements are also impacted.

Binary XML has been developed to help mitigate problems with XML verbosity. Binary solutions unfortunately sacrifice human readability in favor of efficiency. Human readability is one of the reasons XML has been embraced as a standard. It therefore becomes critical that sufficient software exists to easily and transparently support translation back to a human-readable form.
To illustrate the verbosity issue, an Aviation Routine Weather Report (METAR), a very popular International Civil Aviation Organization (ICAO)/World Meteorological Organization (WMO) standardized format for the transmission of weather data, follows in both its raw and WXXM 1.1 XML representations:

4 Example of XML verbosity using a METAR report:

The raw METAR report:

METAR K7BM 100619Z AUTO 00000KT 1 1/4SM -SN BKN004 OVC009 M03/M05 A2958 RMK AO2

The same message in WXXM 1.1 XML format:

[WXXM 1.1 XML listing not included]

This example illustrates the large increase in data volume when compared to TAC messages. It also demonstrates the need for XML compression tools to reduce the bandwidth needed for data exchange, reduce the disk space required for storage, and reduce server latencies. XML compression can also effectively minimize the memory required for processing and querying XML documents. Several XML-conscious compression techniques have been under development to tackle this problem.

5 Why Binary XML?

Binary XML is a compact representation of XML. Using a binary XML format generally reduces the verbosity of XML documents, thereby also reducing the cost of parsing, but hinders the use of ordinary text editors and third-party tools to view and edit the document. There are several competing formats, but none has yet emerged as a de facto standard, although the World Wide Web Consortium adopted EXI as a Recommendation on 10 March 2011. ASN.1/PER is used as the basis of FastInfoset. The International Organization for Standardization (ISO) and the International Telecommunication Union (ITU) published the FastInfoset standard in 2007 and 2005, respectively. Both the EXI and ASN.1/PER binary formats were evaluated in this study.

Binary XML is typically used in applications where the performance of standard XML is insufficient, but the ability to convert the document to and from a form (XML) which is easily viewed and edited is valued.
Other advantages may include enabling random access and indexing of XML documents.

Alternatives to binary XML include using traditional file compression methods on XML documents (for example GZIP) or using an existing standard such as ASN.1. Traditional compression methods, however, offer only the advantage of reduced file size, without the advantage of decreased parsing time or random access.

5.1 EXI

EXI is a very compact representation for XML that is intended to simultaneously optimize performance and the utilization of computational resources. The EXI format uses a hybrid approach drawn from information theory and formal language theory, plus practical techniques verified by measurements, for entropy encoding XML information. The algorithm is amenable to fast and compact implementation and uses a small set of data type representations.

Even though EXI is capable of utilizing schema information to improve compactness and processing efficiency, it does not depend on accurate, complete, or current schemas to work. This is critically important for using EXI in an environment where data might be served across international boundaries that may have a different schematic representation of a product. For example, the Europeans more closely follow the ICAO standard for representing products like the METAR example above. In the United States several deviations from the standard have been introduced. If schemas are used as part of the binary compression mechanism, the end user requires knowledge of the same schema in order to decompress the data. If the lowest common schema is used for compression, in this example the ICAO standard, this will not be a problem.

Reference - http://www.w3.org/TR/2011/REC-exi-20110310

A program module called an EXI processor, whether it is software or hardware, is used by application programs to encode their structured data into EXI streams and/or to decode EXI streams to make the structured data accessible.
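The schema dependency described above can be illustrated with a toy encoder. This is a minimal sketch and not the EXI algorithm: it only assumes that encoder and decoder share a table of element names and transmit small integer codes instead of the names. The tables `ICAO_TABLE` and `US_TABLE` and their element names are hypothetical and do not come from any real schema.

```python
# Toy illustration only: this is NOT the EXI algorithm.
# Both tables and all element names are hypothetical.
ICAO_TABLE = ["report", "station", "wind", "visibility"]
US_TABLE = ["report", "station", "visibility", "wind", "remarks"]


def encode(names, table):
    """Replace each element name with its one-byte index in the shared table."""
    index = {name: i for i, name in enumerate(table)}
    return bytes(index[name] for name in names)


def decode(data, table):
    """Map each one-byte code back to the name at that position in the table."""
    return [table[code] for code in data]


events = ["report", "station", "wind", "visibility"]
encoded = encode(events, ICAO_TABLE)

same_schema = decode(encoded, ICAO_TABLE)   # decoder holds the same table
other_schema = decode(encoded, US_TABLE)    # decoder holds a different table

print(len("".join(events)), "characters of names ->", len(encoded), "bytes")
print("same table:     ", same_schema)
print("different table:", other_schema)
```

With the same table the names round-trip; with a different table the codes decode to the wrong names, which is the reason a decoder needs the schema the encoder used.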
5.2 FastInfoSet

FastInfoSet is a binary format that competes with EXI. Its encoding is specified using Abstract Syntax Notation One (ASN.1). This standard uses a flexible notation that describes rules and structures for representing, encoding, transmitting, and decoding data in telecommunications and computer networking. The ASN.1 XML Encoding Rules (XER) provide a textual encoding of data structures defined using ASN.1 notation.

Reference - http://www.itu.int/ITU-T/asn1/xml/finf.htm

6 W3C Benchmark environment for XML compression testing:

The EXI framework used in this report is a testing framework developed by the W3C EXI Working Group and is available for download from http://www.w3.org/XML/EXI/framework/exi-ttsfms.zip

This measurement test framework was created to obtain empirical data about the processing efficiency and compactness of several XML and binary XML candidates. The EXI framework is built on top of another framework called Japex, which provided the functionality for drawing the charts included in this report. The EXI framework includes drivers for several Java and C/C++ candidates submitted to the W3C EXI Working Group. The Java drivers use the Simple API for XML (SAX); the C/C++ drivers use either a SAX-like API or a typed API (data binding). For this report only the Java SAX drivers were used.

We ran both compactness tests (the final size of a data set after encoding to binary XML) and best-case performance tests (the amount of time used to encode to compressed XML). Both single reports and 24 hours’ worth of data were used as XML input to this framework. These data sets include METARs/Special Weather Reports (SPECI), Terminal Aerodrome Forecasts (TAFs), Significant Meteorological Information (SIGMETs), and Pilot Reports (PIREPs). The XML format used to represent these data sets was based on the Weather Exchange Model (WXXM) version 1.1.1.
The hardware used in this test was a system with two 6-core Intel Xeon X5650 64-bit processors. Processor speed was 2.66 GHz, with 12 MB of cache and Hyper-Threading turned off. The system had 6 x 4 GB (a total of 24 GB) of 1333 MHz RDIMM memory.

The EXI framework CPU performance test was configured to run 5 seconds’ worth of data as a warm-up, followed by 15 seconds’ worth of data used for the actual test. The reported time is calculated by dividing 15 seconds by the number of iterations performed in the time interval. This allows the test to measure only the time spent within the algorithm; it eliminates the time required for file reading, algorithm initialization, and cleanup tasks. With this setting the output numbers are much more consistent and show little deviation between evaluation runs.

We would like to acknowledge the support from AgileDelta when augmenting this framework to include Efficient XML as one of the candidate compressors. In addition to an evaluation license for the Efficient XML software, AgileDelta also provided technical advice and drivers so this technology could be included in the W3C test framework. In particular we would like to thank both John Schneider and Rich Rollman for supporting this effort.

6.1 Compressors evaluated:

6.1.1 GZIP

GZIP (GNU zip) is a popular open source compressor which falls in the category of general text compressors; it can be used because XML is stored as a regular text file. General purpose text compressors belong to the non-queryable group of compressors: their output cannot be queried in its compressed state.

6.1.2 EXIficient

EXIficient is an open source implementation of the W3C Efficient XML Interchange (EXI) format specification written in the Java programming language. It uses the EXI binary representation of XML as its encoding format. This technology is considered to be an XML-conscious compressor. This group of compressors is designed to exploit awareness of XML structure to improve compression.
Two subclasses further distinguish this group:

Schema dependent compressors - require that both the encoder and decoder have access to the document schema. Compression ratios are often improved with the use of schemas, but this advantage is offset by requiring their presence.

Schema independent compressors - do not require access to the document schema.

EXIficient falls into the class of queryable compressors. This generally increases compressed file size, but the increase is counterbalanced by the importance of being able to query documents without fully decompressing them, an important characteristic on CPU limited systems. Version 0.9, released on July 5, 2012, was used for this test.

Reference: http://exificient.sourceforge.net

6.1.3 Sun’s FastInfoset

Sun’s FastInfoset is an open source compressor that is considered an XML-conscious compressor and allows queries over compressed input. It can be either schema dependent or schema independent. Its encoding is specified using Abstract Syntax Notation One (ASN.1).

Reference - http://www.itu.int/ITU-T/asn1/xml/finf.htm

6.1.4 Efficient XML

Efficient XML is a commercial compressor developed by AgileDelta. It is also an XML-conscious compressor; it can be used in either a schema dependent or a schema independent mode and allows queries over compressed input. Like EXIficient, it uses the EXI format. The EXI format is derived from the AgileDelta Efficient XML format, and AgileDelta worked closely with the W3C to define this open standard. Version 5.0 was used in this study.

Reference: http://www.agiledelta.com/product_efx.html

6.2 Framework Compressor descriptions:

The W3C EXI framework was set up to use these compressors and compressor settings:

1. GZIP - XML using GZIP level 1
2. EXIficientSAX - EXI with no optimization (pure tokenization)
3. EXIficientSAX + Schema - EXI with schema optimizations but without deflate
4. EXIficientSAX + Document - EXI with deflate
5. FastInfoSetSAX - no optimization
6. FastInfoSetSAX + Document GZIP - using document analysis
7. FastInfoSetSAX + Schema - using schema optimizations
8. EfficientXMLFasterSAX - without schemas but optimized for speed
9. EfficientXMLFasterSchemaSAX - with schemas and optimized for speed
10. EfficientXMLSmallerSAX - without schemas and optimized for size
11. EfficientXMLSmallerSchemaSAX - with schemas and optimized for size

7 Parsing API:

This study focused on event-based APIs, specifically the Simple API for XML (SAX) class of parsers, and did not evaluate tree-based APIs such as the Document Object Model (DOM). Although not used in this study, the Streaming API for XML (StAX) class of parser is worth mentioning. In StAX, the application is in control rather than the parser: the application tells the parser when it wants to receive the next chunk of data, rather than the parser telling the client when the next chunk is ready. Furthermore, StAX goes beyond SAX by allowing programs both to read existing XML documents and to create new ones. Unlike SAX, StAX is a bidirectional API.

7.1 Simple API for XML (SAX)

SAX is a streaming API and is likely to be preferred in data access services (such as Web Feature Services). A streaming API uses much less memory than a tree API like DOM, since it does not have to hold the entire document in memory. Streaming parsers process the document in small pieces, which allows them to start generating output almost immediately, without waiting for the entire document to be read. SAX parsing is generally much faster than DOM parsing. A drawback of SAX parsers is that they are all push APIs: the content of the document is fed to the application as soon as the parser sees it, regardless of whether the application is ready to receive data or not.
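The push model described above can be sketched with the SAX implementation in Python's standard library (the study itself used Java SAX drivers). The parser calls the handler's methods as it encounters each piece of the document, and the handler keeps only the state it needs. The document layout in `SAMPLE` is a hypothetical simplification, not real WXXM; only the `rawText` element name is taken from WXXM 1.1.

```python
import xml.sax


class RawTextHandler(xml.sax.ContentHandler):
    """Collects the text of every <rawText> element as the parser pushes events."""

    def __init__(self):
        super().__init__()
        self.in_raw_text = False
        self.buffer = []
        self.reports = []

    def startElement(self, name, attrs):
        # Called by the parser when it sees an opening tag.
        if name == "rawText":
            self.in_raw_text = True
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self.in_raw_text:
            self.buffer.append(content)

    def endElement(self, name):
        # Called by the parser when it sees a closing tag.
        if name == "rawText":
            self.reports.append("".join(self.buffer))
            self.in_raw_text = False


# Hypothetical miniature document; not real WXXM.
SAMPLE = (
    b"<reports>"
    b"<metar><rawText>KDEN 100553Z 36005KT 10SM</rawText></metar>"
    b"<metar><rawText>K7BM 100619Z AUTO 00000KT</rawText></metar>"
    b"</reports>"
)

handler = RawTextHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.reports)
```

The handler never holds the whole document, only the text of the element currently being read, which is why memory use stays low regardless of document size.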
Unlike DOM parsers, SAX only allows one pass through the source document.

8 How to improve compression

8.1 Problem with including raw text strings

WXXM 1.1 includes the raw text in the XML output. A METAR example would look like:

<rawText>KDEN 100553Z 36005KT 10SM SCT090 BKN150 01/M06 A2961 RMK AO2 SLP004 T00111061 10089 20006 58005</rawText>

The following chart indicates the improvement in compression when the raw text is removed:

METAR 24 Hrs
With raw text       4,852,630
Without raw text    1,695,870

(Chart provided by AgileDelta)

If required, the raw string could be recreated algorithmically downstream. Results are similar for the other products.

8.2 Floating point rounding errors

Often the units of measure required by WXXM are different from those of the original input. Inattention to precision when converting to the required units causes excessive decimal digits in the XML output. More digits mean more bits. The initial WXXM input had 15 digits of precision when the original string had only 0 to 2 digits of decimal precision in the raw METAR format. For example:

29.61 Hg in METAR vs. 29.61023712158203 Hg in XML
1° C in METAR vs. 33.97999954223633° F in XML

Latitude/longitude values are another place where this problem can be found. The METAR input was corrected before being used in this study. The improvements were dramatic:

(Chart provided by AgileDelta)

9 Results:

9.1 Compression

Binary representations of XML compressed the data to as little as about 1% of its original size. For example, the best compression achieved for 24 hours of METAR reports was 4720 kilobytes, down from 427196. It was surprising that, of the top 4 ranked compressors, only one benefited from having schemas available. It is believed this could be improved by compressor configuration dedicated to improving schema usage.

The compressors based on the newly W3C recommended binary format, EXI, performed well, holding the top three rankings in this study. AgileDelta’s commercial implementation, Efficient XML, was the top performer.
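The figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch: `percent_of` is a hypothetical helper name, and the inputs are the 24-hour METAR totals from section 9.1 and the with/without raw text figures from section 8.1.

```python
def percent_of(original, reduced):
    """Return the reduced size as a percentage of the original size."""
    return 100 * reduced / original


# 24 hours of METAR reports, in kilobytes (section 9.1).
best = percent_of(427196, 4720)

# 24 hours of METAR with and without the raw text string (section 8.1 chart).
without_raw = percent_of(4852630, 1695870)

print(f"Best 24-hour METAR result: {best:.1f}% of the original size")
print(f"Without raw text: {without_raw:.1f}% of the size with raw text")
```

The first ratio is the basis for the "about 1%" figure; the second shows that dropping the raw text string alone removes roughly two thirds of the output in that comparison.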
The following findings emphasize optimizations without schemas.

9.1.5 AgileDelta’s EfficientXML

EfficientXML compressed better overall than the other compression techniques and appeared to be the most efficient. It provides the flexibility to emphasize either size or speed. If CPU is not an issue, EXIficient with deflate yields impressive compression results.

9.1.6 FastInfoSet

FastInfoSet using document GZIP performs almost as well as EXIficient with deflate and requires significantly less CPU. With the exception of FastInfoSetSAX+Document GZIP, our study found that this class of compressors performed much worse than the other techniques. The Efficient XML Interchange Working Group found that FastInfoSet did not meet the minimum requirements identified by the XML Binary Characterization Working Group, and so it is not a W3C recommended compression technique. Therefore we do not recommend the use of this class of compressors.

9.1.7 GZIP

GZIP performed almost as well as Efficient XML, EXIficient, and FastInfoSet when they were run with no optimizations. It is important to recognize the need to optimize all of these compressors and not to rely on default settings. The major drawback of GZIP is the inability to parse the document unless it is uncompressed.

Note that EXIficientSAX and AgileDelta’s EfficientXMLFasterSAX generated identical compaction results for both single and 24 hour testing. AgileDelta’s Efficient XML emphasizes minimal processor usage over compactness when configured in FasterSAX mode. However, EfficientXMLFasterSAX has much better encoding performance than EXIficientSAX.

9.2 Encoding CPU usage

The baseline used for comparison was the time required to handle the SAX events output by the SAX parser and regenerate XML, with no compression. Generally this results in the best encoding performance but the worst compaction.
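The baseline described above can be sketched with Python's standard library: SAX events are fed straight into a writer that regenerates XML text, and the call is timed with a warm-up phase followed by a measurement interval, in the spirit of the framework's 5 second and 15 second settings. This is an illustration under stated assumptions, not the framework's Java code; `regenerate_xml`, `average_ms`, and the `SAMPLE` document are hypothetical, and the intervals are shortened so the sketch runs quickly.

```python
import io
import time
import xml.sax
from xml.sax.saxutils import XMLGenerator


def regenerate_xml(xml_bytes):
    """Parse with SAX and write every event straight back out as XML text."""
    out = io.StringIO()
    xml.sax.parseString(xml_bytes, XMLGenerator(out, encoding="utf-8"))
    return out.getvalue()


def average_ms(encode, xml_bytes, warmup_s, measure_s):
    """Average milliseconds per call; both intervals must be positive."""
    # Warm-up calls are made but not counted.
    deadline = time.perf_counter() + warmup_s
    while time.perf_counter() < deadline:
        encode(xml_bytes)
    # Count how many calls complete during the measurement interval.
    iterations = 0
    start = time.perf_counter()
    while time.perf_counter() - start < measure_s:
        encode(xml_bytes)
        iterations += 1
    # Divide the measured elapsed time (rather than the nominal interval)
    # by the iteration count.
    return 1000 * (time.perf_counter() - start) / iterations


# Hypothetical miniature document; not real WXXM.
SAMPLE = b"<report><station>KDEN</station><temp uom='C'>1</temp></report>"

print(regenerate_xml(SAMPLE))
print(f"{average_ms(regenerate_xml, SAMPLE, 0.05, 0.2):.4f} ms per document")
```

Any compressor can be compared against this baseline by passing its encode function to `average_ms` with the same input and intervals.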
There is significant variability in the amount of CPU required to encode the XML data in this study; generally, the better the compression, the longer it takes to encode. The top three compressors consumed the most CPU. Adding a GZIP post-compression step to the output of FastInfoSetSAX (FastInfoSetSAX + Document GZIP) was found to provide the best combination of compression and CPU usage.

The study concludes that the impact on CPU requirements for processing is not a significant deterrent to using compressed binary representations of weather data.

10 High level analysis:

10.1 Single report analysis:

Because the information contained in single reports varies so much, it is important to be careful about how much weight is placed on the single report results. This is apparent from the lack of consistency in the results. One common trait is that all XML-aware compressors (EXIficient, FastInfoSet, and AgileDelta Efficient XML) compressed better than GZIP; the one exception was FastInfoSetSAX.

10.2 24 Hour analysis

Results were surprisingly consistent across all products when compressing 24 hours’ worth of the four test products. Compression to 1% of the original XML size was consistently achieved by the top three compressors in the following chart. PIREP compression results were slightly higher at 10%. The chart also illustrates that, generally, the better the compression, the more time the CPU spends parsing the message; these top performers were among the worst at parsing.
10.3 24 hours of data compression and parsing rankings

Compressor                        Average compression ranking   Average encoding time ranking
                                  across all products           across all products
EfficientXMLSmallerSAX            1                             9
EfficientXMLSmallerSchemaSAX      1                             10
EXIficientSAX + Document          1                             11
FastInfoSetSAX + Document GZIP    2                             5
GZIP                              3                             7
EXIficientSAX                     4                             6
EfficientXMLFasterSAX             4                             3
EXIficientSAX + Schema            5                             8
EfficientXMLFasterSchemaSAX       6                             4
FastInfoSetSAX                    7                             1
FastInfoSetSAX + Schema           8                             2

10.4 Detailed results:

10.4.1 METAR

10.4.1.1 Single

10.4.1.1.1 Compaction:

[Chart omitted]

10.4.1.1.2 CPU usage

[Chart omitted]

10.4.1.2 24 hours

10.4.1.2.1 Compression:

[Chart omitted]

10.4.1.2.2 CPU usage:

[Chart omitted]

10.4.1.3 METAR compression and encoding processor ranking results:

Single METAR

Compressor                    File size  Compression  Size ratio  CPU usage  CPU usage  CPU ratio
                              (bytes)    ranking      to XML      (ms)       ranking    to XML
Baseline XML                  3957       -            -           0.05       -          -
GZIP                          1163       10           0.29        0.14       9          2.80
EXIficientSAX                 1098       8            0.28        0.09       6          1.80
EXIficientSAX+Schema          366        1            0.09        0.61       10         12.20
EXIficientSAX+Document        792        4            0.20        0.61       10         12.20
FastInfoSetSAX                1800       11           0.45        0.02       1          0.40
FastInfoSetSAX+Document GZIP  1063       7            0.27        0.13       8          2.60
FastInfoSetSAX+Schema         898        6            0.23        0.02       1          0.40
EfficientXMLFasterSAX         1098       8            0.28        0.03       3          0.60
EfficientXMLFasterSchemaSAX   345        1            0.09        0.03       3          0.60
EfficientXMLSmallerSAX        792        4            0.20        0.09       6          1.80
EfficientXMLSmallerSchemaSAX  356        1            0.09        0.08       5          1.60

24 Hours of METAR

Compressor                    File size  Compression  Size ratio  CPU usage  CPU usage  CPU ratio
                              (bytes)    ranking      to XML      (ms)       ranking    to XML
Baseline XML                  437448875  -            -           6146.22    -          -
GZIP                          23087909   5            0.05        8246.64    9          1.34
EXIficientSAX                 24192238   6            0.06        4973.22    6          0.81
EXIficientSAX+Schema          30951421   9            0.07        6950.10    7          1.13
EXIficientSAX+Document        4991926    1            0.01        17641.23   11         2.87
FastInfoSetSAX                47356881   10           0.11        2006.63    1          0.32
FastInfoSetSAX+Document GZIP  7537816    4            0.02        3840.13    5          0.62
FastInfoSetSAX+Schema         54842070   11           0.13        2936.98    3          0.48
EfficientXMLFasterSAX         24192238   6            0.06        2596.34    2          0.42
EfficientXMLFasterSchemaSAX   24578095   6            0.06        3474.46    4          0.57
EfficientXMLSmallerSAX        4833185    1            0.01        7416.57    8          1.20
EfficientXMLSmallerSchemaSAX  4900812    1            0.01        8523.70    10         1.39

10.4.2 PIREP

10.4.2.1 Single

10.4.2.1.1 Compression:

[Chart omitted]
10.4.2.1.2 CPU usage:

[Chart omitted]

10.4.2.2 24 hours

10.4.2.2.1 Compression:

[Chart omitted]

10.4.2.2.2 CPU usage:

[Chart omitted]

10.4.2.3 PIREP compression and encoding processor ranking results:

Single PIREP

Compressor                    File size  Compression  Size ratio  CPU usage  CPU usage  CPU ratio
                              (bytes)    ranking      to XML      (ms)       ranking    to XML
Baseline XML                  1969       -            -           0.042      -          -
GZIP                          737        10           0.37        0.113      9          2.69
EXIficientSAX                 677        7            0.34        0.057      6          1.36
EXIficientSAX+Schema          226        3            0.14        0.590      11         14.50
EXIficientSAX+Document        511        5            0.26        0.417      10         9.93
FastInfoSetSAX                1133       11           0.58        0.014      1          0.33
FastInfoSetSAX+Document GZIP  686        9            0.35        0.104      8          2.48
FastInfoSetSAX+Schema         410        4            0.21        0.016      2          0.38
EfficientXMLFasterSAX         677        7            0.34        0.017      3          0.40
EfficientXMLFasterSchemaSAX   216        1            0.11        0.018      4          0.43
EfficientXMLSmallerSAX        511        5            0.26        0.068      7          1.62
EfficientXMLSmallerSchemaSAX  224        1            0.11        0.053      5          1.26

24 Hours of PIREPs

Compressor                    File size  Compression  Size ratio  CPU usage  CPU usage  CPU ratio
                              (bytes)    ranking      to XML      (ms)       ranking    to XML
Baseline XML                  9444309    -            -           370.72     -          -
GZIP                          3323869    5            0.35        674.64     7          1.82
EXIficientSAX                 4492884    6            0.48        528.61     6          1.43
EXIficientSAX+Schema          4977576    9            0.53        937.06     9          2.53
EXIficientSAX+Document        977322     1            0.10        2707.00    11         7.30
FastInfoSetSAX                7994883    10           0.85        134.46     1          0.36
FastInfoSetSAX+Document GZIP  1347254    4            0.14        458.44     5          1.24
FastInfoSetSAX+Schema         8859492    11           0.94        182.90     2          0.49
EfficientXMLFasterSAX         4492884    6            0.48        277.07     3          0.75
EfficientXMLFasterSchemaSAX   4685868    8            0.50        358.39     4          0.97
EfficientXMLSmallerSAX        954567     1            0.10        886.76     8          2.34
EfficientXMLSmallerSchemaSAX  955960     1            0.10        1402.80    10         3.78

10.4.3 SIGMET/AIRMET

10.4.3.1 Single

10.4.3.1.1 Compression:

[Chart omitted]

10.4.3.1.2 CPU usage:

[Chart omitted]

10.4.3.2 24 hours

10.4.3.2.1 Compression:

[Chart omitted]

10.4.3.2.2 CPU usage:

[Chart omitted]

10.4.3.3 SIGMET/AIRMET compression and encoding processor ranking results:

Single SIGMET

Compressor                    File size  Compression  Size ratio  CPU usage  CPU usage  CPU ratio
                              (bytes)    ranking      to XML      (ms)       ranking    to XML
Baseline XML                  3373       -            -           0.044      -          -
GZIP                          1326       6            0.37        0.193      9          4.39
EXIficientSAX                 1988       9            0.59        0.086      5          1.95
EXIficientSAX+Schema          900        2            0.14        0.643      10         14.61
EXIficientSAX+Document        1030       3            0.26        0.917      11         20.84
FastInfoSetSAX                2446       11           0.73        0.017      2          0.39
FastInfoSetSAX+Document GZIP  1209       5            0.35        0.152      8          3.45
FastInfoSetSAX+Schema         1681       8            0.50        0.015      1          0.34
EfficientXMLFasterSAX         1988       9            0.59        0.031      3          0.70
EfficientXMLFasterSchemaSAX   1488       7            0.44        0.033      4          0.75
EfficientXMLSmallerSAX        1030       3            0.26        0.125      7          2.84
EfficientXMLSmallerSchemaSAX  700        1            0.11        0.108      6          2.45

24 Hours of SIGMETs

Compressor                    File size  Compression  Size ratio  CPU usage  CPU usage  CPU ratio
                              (bytes)    ranking      to XML      (ms)       ranking    to XML
Baseline XML                  994371     -            -           4.441      -          -
GZIP                          117586     5            0.12        13.493     7          3.04
EXIficientSAX                 245982     6            0.25        8.975      5          2.02
EXIficientSAX+Schema          270504     9            0.27        18.685     10         4.21
EXIficientSAX+Document        42298      1            0.04        102.98     11         23.19
FastInfoSetSAX                392797     10           0.39        1.844      1          0.42
FastInfoSetSAX+Document GZIP  58950      4            0.06        13.477     6          3.03
FastInfoSetSAX+Schema         402230     11           0.40        2.612      2          0.59
EfficientXMLFasterSAX         245982     6            0.25        4.791      3          1.08
EfficientXMLFasterSchemaSAX   249002     6            0.25        6.127      4          1.38
EfficientXMLSmallerSAX        41917      1            0.04        18.464     9          4.16
EfficientXMLSmallerSchemaSAX  41459      1            0.04        16.767     8          3.78

10.4.4 TAF

10.4.4.1 Single

10.4.4.1.1 Compression:

[Chart omitted]

10.4.4.1.2 CPU usage:

[Chart omitted]

10.4.4.2 24 Hours

10.4.4.2.1 Compression:

[Chart omitted]

10.4.4.2.2 CPU Usage:

[Chart omitted]

10.4.4.3 TAF compression and encoding processor ranking result:

Single TAF

Compressor                    File size  Compression  Size ratio  CPU usage  CPU usage  CPU ratio
                              (bytes)    ranking      to XML      (ms)       ranking    to XML
Baseline XML                  2823       -            -           0.047      -          -
GZIP                          1078       8            0.38        0.132      9          2.81
EXIficientSAX                 1209       9            0.43        0.085      6          1.81
EXIficientSAX+Schema          541        2            0.19        0.61       11         12.99
EXIficientSAX+Document        803        4            0.28        0.746      10         15.87
FastInfoSetSAX                1709       11           0.61        0.02       1          0.43
FastInfoSetSAX+Document GZIP  980        7            0.35        0.116      8          2.47
FastInfoSetSAX+Schema         824        6            0.29        0.016      2          0.34
EfficientXMLFasterSAX         1209       9            0.43        0.028      3          0.60
EfficientXMLFasterSchemaSAX   543        2            0.19        0.026      4          0.55
EfficientXMLSmallerSAX        803        4            0.28        0.101      7          2.15
EfficientXMLSmallerSchemaSAX  397        1            0.14        0.077      5          1.64

24 Hours of TAFs

Compressor                    File size  Compression  Size ratio  CPU usage  CPU usage  CPU ratio
                              (bytes)    ranking      to XML      (ms)       ranking    to XML
Baseline XML                  187642751  -            -           2157.48    -          -
GZIP                          9910735    5            0.05        3099.33    8          1.44
EXIficientSAX                 8551255    5            0.05        1936.48    6          0.90
EXIficientSAX+Schema          11956090   9            0.06        2926.69    7          1.36
EXIficientSAX+Document        1544273    1            0.01        7151.67    11         3.31
FastInfoSetSAX                19490564   10           0.10        834.474    1          0.39
FastInfoSetSAX+Document GZIP  2610643    1            0.01        1421.26    4          0.66
FastInfoSetSAX+Schema         22427030   11           0.12        1162.10    2          0.54
EfficientXMLFasterSAX         8551255    5            0.05        1195.45    3          0.55
EfficientXMLFasterSchemaSAX   8583793    5            0.05        1767.32    5          0.82
EfficientXMLSmallerSAX        1488381    1            0.01        3394.05    9          1.57
EfficientXMLSmallerSchemaSAX  1470162    1            0.01        4478.49    10         2.08

11 APPENDIX A - ACRONYMS

AIRMET   Airmen's Meteorological Information
API      Application Programming Interface
CSS-WX   Common Support Services - Weather
DOM      Document Object Model
GML      Geography Markup Language
ICAO     International Civil Aviation Organization
IWXXM    ICAO Weather Exchange Model
METAR    Aviation Routine Weather Report
NCAR     National Center for Atmospheric Research
OGC      Open Geospatial Consortium
PIREP    Pilot Report
SAX      Simple API for XML
SIGMET   Significant Meteorological Information
SPECI    Special Weather Report
StAX     Streaming API for XML
TAC      Traditional Alphanumeric Code
TAF      Terminal Aerodrome Forecast
W3C      World Wide Web Consortium
WFS      Web Feature Service
WMO      World Meteorological Organization
WXXM     Weather Exchange Model
XML      Extensible Markup Language