Efficient XML Report
Version 2.0
Gary Blackburn
Jason Craig
Aaron Braeckel
Feb 2013
This research is in response to requirements and funding by the Federal Aviation Administration
(FAA). The views expressed are those of the authors and the authors do not necessarily
represent the official position of the FAA.
1 Table of Contents
2 Executive Summary
3 Introduction:
4 Example of XML verbosity using a METAR report:
5 Why Binary XML?
5.1 EXI
5.2 FastInfoSet
6 W3C Benchmark environment for XML compression testing:
6.1 Compressors evaluated:
6.1.1 GZIP
6.1.2 EXIficient
6.1.3 Sun’s FastInfoset
6.1.4 Efficient XML
6.2 Framework Compressor descriptions:
7 Parsing API:
7.1 Simple API for XML (SAX)
8 How to improve compression
8.1 Problem with including raw text strings
8.2 Floating point rounding errors
9 Results:
9.1 Compression
9.1.5 AgileDelta’s EfficientXML
9.1.6 FastInfoSet
9.1.7 GZIP
9.2 Encoding CPU usage
10 High level analysis:
10.1 Single report analysis:
10.2 24 Hour analysis
10.3 24 hours of data compression and parsing rankings
10.4 Detailed results:
10.4.1 METAR
10.4.2 PIREP
10.4.3 SIGMET/AIRMET
10.4.4 TAF
11 APPENDIX A - ACRONYMS
2 Executive Summary
Using a binary representation of XML to encode typical weather products successfully addresses the
verbosity of their native XML format.
Binary XML representations compressed the data to as little as about 1% of the original size.
For example, 24 hours of METAR reports compressed from 427196 kilobytes down to 4720 kilobytes.
Compression could have been improved even further had the original METAR Traditional Alphanumeric Code
(TAC) string not been included in the Weather Exchange Model (WXXM) XML data set used in the
analysis. Using appropriate floating point precision was also found to improve compression significantly
in all cases. The following table shows the improvement in compression when using 24 hours of data as
input compared to single reports; it also shows that the compressed size per report is significantly
smaller than the original TAC.
METAR Compression Summary

Compression                         Bytes per report
24 hours
  Uncompressed XML                  2721
  Binary XML (EXI)                  30
  GZIP XML                          144
Single
  Uncompressed XML                  3957
  Binary XML (EXI)                  345
  GZIP XML                          1163
24 hours of TAC (sans headers)
  Original                          67
Note: Single report results may not be representative since they are not averages but a single report
used as an example.
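As a worked check of the table above, the following Python sketch recomputes each compressed size as a percentage of the uncompressed XML, and compares the 24-hour EXI size with the original TAC size. The byte counts are copied from the table; the function and variable names are illustrative only.

```python
# Bytes per report, copied from the METAR Compression Summary table.
XML_24H, EXI_24H, GZIP_24H = 2721, 30, 144
XML_SINGLE, EXI_SINGLE, GZIP_SINGLE = 3957, 345, 1163
TAC_24H = 67


def percent_of(size: int, baseline: int) -> float:
    """Return size as a percentage of baseline."""
    return 100.0 * size / baseline


rows = [
    ("24 hours, EXI", EXI_24H, XML_24H),
    ("24 hours, GZIP", GZIP_24H, XML_24H),
    ("Single, EXI", EXI_SINGLE, XML_SINGLE),
    ("Single, GZIP", GZIP_SINGLE, XML_SINGLE),
]
for label, size, baseline in rows:
    print(f"{label}: {percent_of(size, baseline):.1f}% of uncompressed XML")

# EXI over 24 hours is also smaller than the original TAC text.
print(f"24 hours, EXI vs. TAC: {percent_of(EXI_24H, TAC_24H):.0f}% of TAC size")
```

The 24-hour EXI figure works out to roughly 1.1% of the uncompressed XML, which is the basis for the "about 1%" figure quoted in this summary.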
Using a binary XML format reduces the verbosity of XML documents and will reduce the cost of parsing,
transmitting, and storing common weather products. It also hinders the use of ordinary text editors
to view and edit the document, one of the cornerstones of XML. Hardware or software solutions could be
developed to seamlessly decode binary output to the original XML for human readability if required.
The XML standards body, the World Wide Web Consortium (W3C), adopted a binary format as a
Recommendation on 10 March 2011. This format, Efficient XML Interchange (EXI), was used by the top
performing compressors in this study.
There is significant variability in the amount of CPU required to encode the XML data; generally the
better the compression, the longer it takes to encode. The test framework uses, for CPU encoding
reference, the time required to simply regenerate native XML from uncompressed data received directly
from the parser. This would represent the worst compaction but the best CPU encoding time. The same
24 hours of METAR data mentioned above takes 7147 milliseconds (ms) to encode compared to 6142 ms
to regenerate XML from the parser.
The study concludes that the impact on CPU requirements for processing is not a significant deterrent
to using compressed binary representations of weather data.
Recent trends in bandwidth available to home users and the amount of processing performance
commercially available suggest that, over time, network bandwidth will be a scarcer resource than
processing power. Greater compression at the cost of processing power is likely to be a worthwhile
tradeoff.
3 Introduction:
The eXtensible Markup Language (XML) is the de facto standard for data representation and exchange
over the World Wide Web. Its self-describing property gives XML a flexibility that has led to its
wide acceptance. Unfortunately, this is also the source of its main criticism: the sheer size of XML
documents.
Compactness issues negatively impact both bandwidth and latency, especially when sending data to
mobile devices and aircraft. Transmission costs are recurring and can be very
expensive depending on where the data is sent. Finally, storage and archival requirements are also
impacted.
Binary XML has been developed to help mitigate problems with XML verbosity. Binary solutions
unfortunately cause a loss of the human-readability characteristic in favor of efficiency. Human
readability is one of the reasons XML has been embraced as a standard. It then becomes critical that
sufficient software exists to easily and transparently support human-readable translation.
To illustrate the verbosity issue, a representation of an Aviation Routine Weather Report (METAR), a
very popular International Civil Aviation Organization (ICAO)/World Meteorological Organization (WMO)
standardized format for the transmission of weather data, follows in both its raw and WXXM 1.1 XML
representations:
4 Example of XML verbosity using a METAR report:
The raw METAR report: METAR K7BM 100619Z AUTO 00000KT 1 1/4SM -SN BKN004 OVC009 M03/M05
A2958 RMK AO2
The same message in WXXM 1.1 XML format:
This example illustrates the large increase in data volume when compared to TAC messages. It also
demonstrates the need for XML compression tools to reduce the bandwidth needed for data exchange,
reduce the disk space required for storage, and reduce server latencies. XML compression can also
reduce the memory required for processing and querying XML documents.
Several XML-conscious compression techniques have been under development to tackle this problem.
5 Why Binary XML?
Binary XML is a compact representation of XML. Using a binary XML format generally reduces the
verbosity of XML documents thereby also reducing the cost of parsing, but hinders the use of ordinary
text editors and third-party tools to view and edit the document. There are several competing formats,
but none has yet emerged as a de facto standard, although the World Wide Web Consortium adopted
EXI as a Recommendation on 10 March 2011. ASN.1/PER is being used as the basis of FastInfoset. The
International Organization for Standardization (ISO) and the International Telecommunications Union
(ITU) published the FastInfoset standard in 2007 and 2005, respectively. Both EXI and ASN.1/PER binary
formats were evaluated in this study.
Binary XML is typically used in applications where the performance of standard XML is insufficient, but
the ability to convert the document to and from a form (XML) which is easily viewed and edited is
valued. Other advantages may include enabling random access and indexing of XML documents.
Alternatives to binary XML include using traditional file compression methods on XML documents (for
example GZIP); or using an existing standard such as ASN.1. Traditional compression methods, however,
offer only the advantage of reduced file size, without the advantage of decreased parsing time or
random access.
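The traditional-compression alternative can be sketched in a few lines of Python using the standard gzip module. The XML below is a hypothetical, heavily simplified stand-in and not actual WXXM; the element names and the helper gzip_size are assumptions made for illustration.

```python
import gzip

# Hypothetical, simplified weather-like XML (not actual WXXM).
SAMPLE = (
    "<observations>"
    + "".join(
        f"<observation><station>K{n:03d}</station>"
        f'<temperature uom="C">{n % 30}</temperature></observation>'
        for n in range(200)
    )
    + "</observations>"
).encode("utf-8")


def gzip_size(data: bytes, level: int = 1) -> int:
    """Return the size in bytes of data after GZIP compression."""
    return len(gzip.compress(data, compresslevel=level))


raw = len(SAMPLE)
packed = gzip_size(SAMPLE, level=1)
print(f"raw={raw} bytes, gzip={packed} bytes, ratio={packed / raw:.2f}")

# The compressed bytes are opaque: they must be decompressed in full
# before an XML parser can read them, so parsing cost is not reduced.
restored = gzip.decompress(gzip.compress(SAMPLE, compresslevel=1))
assert restored == SAMPLE
```

Repetitive markup compresses well under GZIP, but the result offers neither faster parsing nor random access, which is the limitation noted above.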
5.1 EXI
EXI is a very compact representation for XML that is intended to simultaneously optimize
performance and the utilization of computational resources. The EXI format uses a hybrid approach
drawn from the information and formal language theories, plus practical techniques verified by
measurements, for entropy encoding XML information. This algorithm is amenable to fast and
compact implementation and a small set of data type representations.
Even though EXI is capable of utilizing schema information to improve compactness and processing
efficiency, it does not depend on accurate, complete, or current schemas to work. This is critically
important for using EXI in an environment where data might be served across international
boundaries that may have a different schematic representation of a product. For example, the
Europeans more closely follow the ICAO standard for representing products like the METAR example
above. In the United States several deviations from the standard have been introduced. If schemas
are used as part of the binary compression mechanism, the end user requires knowledge of the same
schema in order to decompress the data. If the lowest common schema were used for
compression, in this example the ICAO standard, this would not be a problem.
Reference - http://www.w3.org/TR/2011/REC-exi-20110310
A program module called an EXI processor, whether it is software or hardware, is used by
application programs to encode their structured data into EXI streams and/or to decode EXI streams
to make the structured data accessible.
5.2 FastInfoSet
FastInfoSet is a binary format that competes with EXI. Its encoding is specified using
Abstract Syntax Notation One (ASN.1), a standard, flexible notation that describes
rules and structures for representing, encoding, transmitting, and decoding data in
telecommunications and computer networking. The ASN.1 XML Encoding Rules (XER) provide a
textual encoding of data structures defined using ASN.1 notation.
Reference - http://www.itu.int/ITU-T/asn1/xml/finf.htm
6 W3C Benchmark environment for XML compression testing:
The EXI framework used in this report is a testing framework developed by the W3C EXI working group
and is available for download from
http://www.w3.org/XML/EXI/framework/exi-ttsfms.zip
This measurement test framework was created for obtaining empirical data about the processing
efficiency and compactness of several XML and binary XML candidates. The EXI framework is built on
top of another framework called Japex, which provided the functionality for drawing the charts included
in this report. The EXI framework included drivers for several Java and C/C++ candidates submitted to
the EXI W3C working group. The Java drivers use the Simple API for XML (SAX) API; the C/C++ drivers use
either a SAX-like API or a typed API (data binding). For this report only Java Simple API for XML drivers
were used.
We ran both compactness tests (the final size of a data set after encoding to binary XML) and
best-case performance tests (the amount of time used to encode to compressed XML). Both single reports
and 24 hours' worth of data were used as XML input to this framework. These data sets include
METARs/Special Weather Reports (SPECI), Terminal Aerodrome Forecasts (TAFs), Significant
Meteorological Information (SIGMETs), and Pilot Reports (PIREPs). The XML format used to represent
these data sets was based on the Weather Exchange Model (WXXM) version 1.1.1.
The hardware used in this test was a system with two 6-core Intel Xeon X5650 64-bit processors.
Processor speed was 2.66 GHz, with 12 MB of cache and Hyper-Threading turned off. The system had
6x4GB (a total of 24GB of RAM) 1333MHz RDIMM memory modules.
The EXI framework CPU performance test was configured to run for 5 seconds as a warm-up,
followed by 15 seconds for the actual test. The reported time is calculated by dividing
15 seconds by the number of iterations performed in that interval. This allows the test to
measure only the time spent within the algorithm; it eliminates the time required for file reading,
algorithm initialization, and cleanup tasks. With this setting, output numbers are much more
consistent and show little deviation between evaluation runs.
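The warm-up and timed-interval scheme described above can be sketched in Python as follows. This is a simplified illustration and not the Japex or EXI framework code: the function time_per_iteration and its parameters are hypothetical names, and it divides the measured elapsed time, rather than a nominal 15 seconds, by the iteration count.

```python
import gzip
import time


def time_per_iteration(task, warmup_s=5.0, measure_s=15.0):
    """Average seconds per call of task(), measured after a warm-up phase."""
    # Warm-up: run the task repeatedly without recording anything.
    deadline = time.perf_counter() + warmup_s
    while time.perf_counter() < deadline:
        task()

    # Measurement: count completed iterations until the interval is over.
    iterations = 0
    start = time.perf_counter()
    while True:
        task()
        iterations += 1
        elapsed = time.perf_counter() - start
        if elapsed >= measure_s:
            break
    return elapsed / iterations


# The input is already in memory, so file reading is excluded from the timing.
payload = b"<report><station>KDEN</station></report>" * 1000
seconds = time_per_iteration(
    lambda: gzip.compress(payload, compresslevel=1),
    warmup_s=0.1,   # shortened here; the study used 5 seconds
    measure_s=0.3,  # shortened here; the study used 15 seconds
)
print(f"{seconds * 1000:.3f} ms per encode")
```

Keeping the input in memory and starting the clock only after the warm-up is what isolates the encoding cost from I/O and initialization.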
We would like to acknowledge the support from AgileDelta when augmenting this framework to include
Efficient XML as one of the candidate compressors. In addition to an evaluation license for Efficient XML
software, AgileDelta also provided drivers so this technology could be included in the W3C test
framework as well as technical advice. In particular we would like to thank both John Schneider and
Rich Rollman for supporting this effort.
6.1 Compressors evaluated:
6.1.1 GZIP
(GNU zip) is a popular open source compressor which falls in the category of general text
compressors; it can be used since XML is stored as a regular text file. General purpose text
compressors belong to the non-queriable group of compressors: their output cannot be queried
in its compressed state.
6.1.2 EXIficient
is an open source implementation of the W3C Efficient XML Interchange (EXI) format
specification written in the Java programming language. It reads and writes the EXI binary
representation of XML. This technology is considered to be an XML-conscious
compressor. This group of compressors is designed to exploit awareness of XML structure
to improve compression. Two subclasses further distinguish this group:
- Schema dependent compressors: require that both the encoder and decoder have
  access to the document schema. Compression ratios are often improved with the
  use of schemas, but this advantage is offset by requiring their presence.
- Schema independent compressors: do not require access to the document schema.
EXIficient falls into the class of queriable compressors. This generally increases compressed
file size, but the increase is counterbalanced by the importance of being able to query
documents without the need to fully decompress them, an important characteristic on CPU
limited systems.
Version 0.9 released on July 5, 2012 was used for this test.
Reference: http://exificient.sourceforge.net
6.1.3 Sun’s FastInfoset
is an open source compressor that is considered an XML-conscious compressor and that allows
queries over compressed input. It can be either schema dependent or schema independent. Its
encoding is specified using Abstract Syntax Notation One (ASN.1).
Reference - http://www.itu.int/ITU-T/asn1/xml/finf.htm
6.1.4 Efficient XML
is a commercial compressor developed by AgileDelta. It is also an XML-conscious compressor
that can be used in either a schema dependent or schema independent mode and allows queries over
compressed input. Like EXIficient, it is based on the EXI format. The EXI format is derived
from the AgileDelta Efficient XML format. AgileDelta worked closely with the W3C to define
this open standard.
Version 5.0 was used in this study.
Reference: http://www.agiledelta.com/product_efx.html
6.2 Framework Compressor descriptions:
The W3C EXI framework was set up to use these compressors and compressor settings:
1. GZIP - XML using GZIP level 1
2. EXIficientSAX - EXI no optimization (pure tokenization)
3. EXIficientSAX + Schema - EXI with schema optimizations but without deflate
4. EXIficientSAX + Document - EXI with deflate
5. FastInfoSetSAX - No optimization
6. FastInfoSetSAX + Document GZIP - using document analysis
7. FastInfoSetSAX + Schema - using schema optimizations
8. EfficientXMLFasterSAX - without schemas but optimized for speed
9. EfficientXMLFasterSchemaSAX - with schemas and optimized for speed
10. EfficientXMLSmallerSAX - without schemas and optimized for size
11. EfficientXMLSmallerSchemaSAX - with schemas and optimized for size
7 Parsing API:
This study focused on event-based APIs, specifically the Simple API for XML (SAX) class of parsers,
and did not evaluate tree-based APIs such as the Document Object Model (DOM). Although not
used in this study, the Streaming API for XML (StAX) class of parsers is worth mentioning. In StAX, the
application is in control rather than the parser: the application tells the parser when it wants to
receive the next data chunk, rather than the parser telling the client when the next chunk of data is
ready. Furthermore, StAX goes beyond SAX by allowing programs both to read existing XML documents and
to create new ones. Unlike SAX, StAX is a bidirectional API.
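StAX itself is a Java API, but the pull model it uses can be illustrated with the XMLPullParser class from Python's standard library. The sketch below is only an analogue of StAX, not StAX, and the document and element names are hypothetical. The point it shows is that the application asks for events and may stop asking at any time.

```python
from xml.etree.ElementTree import XMLPullParser

# Hypothetical, simplified report (not actual WXXM).
DOC = (
    "<report><station>KDEN</station>"
    '<visibility uom="SM">10</visibility></report>'
)

parser = XMLPullParser(events=("start", "end"))
parser.feed(DOC)   # hand the parser some input
parser.close()     # signal that no more input will follow

stations = []
# The application pulls events from the parser when it is ready for them.
for event, element in parser.read_events():
    if event == "end" and element.tag == "station":
        stations.append(element.text)
        break  # the application decides it needs nothing further

print(stations)
```

In a push API the equivalent early exit is awkward, because the parser, not the application, drives the loop.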
7.1 Simple API for XML (SAX)
SAX is a streaming API and is likely to be preferred in data access services (such as Web Feature
Services). A streaming API uses much less memory than a tree API such as DOM, since it does not have
to hold the entire document in memory. Streaming parsing techniques process the document in small
pieces, which allows them to start generating output almost immediately, without waiting for the
entire document to be read. SAX is much faster than a DOM parser. A drawback of SAX parsers is that
they are all push APIs: content of the document is fed to the application as soon as the parser sees
it, regardless of whether the application is ready to receive data or not. Unlike DOM parsers, SAX
parsers allow only one pass through the source document.
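The push model can be illustrated with the xml.sax module from Python's standard library. The study itself used Java SAX drivers, so this is a sketch of the same pattern rather than the code that was benchmarked; the handler class and the sample document are hypothetical.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Counts element occurrences as the parser pushes events at it."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser for every start tag, in document order.
        self.counts[name] = self.counts.get(name, 0) + 1


# Hypothetical, simplified input (not actual WXXM).
DOC = (
    b"<reports>"
    b"<report><station>KDEN</station></report>"
    b"<report><station>K7BM</station></report>"
    b"</reports>"
)

handler = TagCounter()
xml.sax.parseString(DOC, handler)  # single pass; no tree is built
print(handler.counts)
```

The handler keeps only the running counts, so memory use does not grow with document size, which is the property that makes streaming APIs attractive for large data sets.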
8 How to improve compression
8.1 Problem with including raw text strings
WXXM 1.1 includes the raw text in the XML output. A METAR example would look like:
<rawText>KDEN 100553Z 36005KT 10SM SCT090 BKN150 01/M06 A2961 RMK AO2 SLP004
T00111061 10089 20006 58005</rawText>
The following chart indicates the improvement in compression when the raw text is removed:
METAR 24 Hrs
With raw text       4,852,630
Without raw text    1,695,870
If required the raw string could be recreated algorithmically downstream.
Results are similar for the other products.
(Chart provided by AgileDelta)
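Removing the raw text before encoding can be sketched as a small preprocessing step. The Python below is an illustration under stated assumptions: the document is a simplified stand-in without namespaces (real WXXM elements are namespace-qualified, so the tag name would need adjusting), and strip_raw_text is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RAW = (
    "KDEN 100553Z 36005KT 10SM SCT090 BKN150 01/M06 A2961 RMK AO2 "
    "SLP004 T00111061 10089 20006 58005"
)
# Hypothetical, simplified observation (not actual WXXM).
DOC = (
    f"<observation><rawText>{RAW}</rawText>"
    "<station>KDEN</station>"
    '<altimeter uom="inHg">29.61</altimeter></observation>'
)


def strip_raw_text(xml_text: str, tag: str = "rawText") -> str:
    """Return the document with every element named tag removed."""
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        # Copy the child list so removal does not disturb iteration.
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


stripped = strip_raw_text(DOC)
print(len(DOC), "characters with raw text")
print(len(stripped), "characters without raw text")
```

The structured elements that remain carry the same observation data, which is why the raw string can be regenerated downstream if it is needed.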
8.2 Floating point rounding errors
Often the units of measure required by WXXM are different from those of the original input.
Inattention to precision when converting to the required units can cause excessive decimal digits
in the XML output. More digits mean more bits.
Initial WXXM input had 15 digits of precision when the original string had only 0 to 2 digits of
decimal precision in the raw METAR format. For example:
29.61 Hg in METAR vs. 29.61023712158203 Hg in XML
1° C in METAR vs. 33.97999954223633° F in XML
Latitude/Longitude values are another place where this problem can be found.
METAR input was corrected before being used in this study. Improvements were dramatic:
(Chart provided by AgileDelta)
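The effect can be reproduced with a short Python sketch. It assumes, as one plausible cause and not something the study states, that values passed through 32-bit floating point storage before being written as text. The helper names as_float32 and format_measure are hypothetical.

```python
import struct


def as_float32(value: float) -> float:
    """Round-trip a value through 32-bit storage (assumed cause of the extra digits)."""
    return struct.unpack("f", struct.pack("f", value))[0]


def format_measure(value: float, decimals: int) -> str:
    """Serialize a measurement with only the precision the source carried."""
    return f"{value:.{decimals}f}"


altimeter = as_float32(29.61)
print(repr(altimeter))               # many more digits than the METAR carried
print(format_measure(altimeter, 2))  # back to the two decimals of the source

# Unit conversion has the same problem: keep the source precision.
celsius = 1.1
fahrenheit = celsius * 9 / 5 + 32
print(format_measure(fahrenheit, 1))
```

Formatting at the point of serialization, using the number of decimals present in the source observation, keeps the XML text short without changing the reported value.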
9 Results:
9.1 Compression
The use of binary representations of XML compressed the data to as little as about 1% of the
original size. For example, the best compression achieved for 24 hours of METAR reports was 4720
kilobytes, down from 427196 kilobytes. It was surprising that, of the top 4 ranked compressors, only
one benefited from having schemas available. It is believed this could be improved by compressor
configuration dedicated to improving schema usage.
The compressors based on the newly W3C recommended binary format, EXI, performed well
holding the top three rankings in this study. AgileDelta’s commercial implementation, Efficient XML,
was the top performer.
The following findings emphasize optimizations without schemas.
9.1.5 AgileDelta’s EfficientXML
overall compressed better than the other compression techniques and seemed to be the most
efficient. It provides the flexibility to emphasize either size or speed. If CPU is not an issue,
EXIficient with deflate yields impressive compression results.
9.1.6 FastInfoSet
using document GZIP performs almost as well as EXIficient with deflate and requires
significantly less CPU. With the exception of FastInfoSetSAX+DocumentGZIP, our study found
that this class of compressors performed much worse than the other techniques. The Efficient
XML Interchange Working Group found that FastInfoSet did not meet the minimum
requirements identified by the XML Binary Characterization Working Group and so is not a W3C
recommended compression technique. Therefore we do not recommend the use of this class
of compressors.
9.1.7 GZIP
performed almost as well as Efficient XML, EXIficient and FastInfoSet when they were run with
no optimizations. It is important to realize the need to optimize all of these compressors and
not to rely on default settings. The major drawback with GZIP is the inability to parse the
document unless it is uncompressed.
Note that EXIficientSAX and AgileDelta’s EfficientXMLFasterSAX generated identical
compaction results for both single and 24 hour testing. AgileDelta’s Efficient XML was set to
emphasize minimal processor usage over compactness when configured in FasterSAX mode.
However, EfficientXMLFasterSAX has much better encoding performance than EXIficientSAX.
9.2 Encoding CPU usage
The baseline used for comparison was the time required to handle SAX events from the parser and
regenerate XML, with no compression. Generally this would result in the best
encoding performance but the worst compaction.
There is significant variability in the amount of CPU required to encode the XML data in this study;
generally the better the compression, the longer it takes to encode. The top three compressors
consumed the most CPU. Adding a GZIP post-compression step to the output of FastInfoSetSAX
(FastInfoSetSAX + Document GZIP) was found to provide the best
balance of compression and CPU usage.
The study concludes that the impact on CPU requirements for processing is not a significant
deterrent to using compressed binary representations of weather data.
10 High level analysis:
10.1 Single report analysis:
Because of the variability of the information contained in single reports, it is
important to be careful how much weight is placed on the single report results. This is
apparent when reviewing the lack of consistency in the results. One common trait is that all
XML-aware compressors (EXIficient, FastInfoSet, and AgileDelta Efficient XML)
compressed better than GZIP; the one exception was FastInfoSetSAX.
10.2 24 Hour analysis
Results were surprisingly consistent across all products when compressing 24 hours' worth of
the four test products. Compression to 1% of the original XML was consistently
achieved by the top three compressors in the following chart. PIREP compression results
were slightly higher at 10%. The chart also illustrates that generally the better the
compression, the more time spent by the CPU to parse the message; these top performers
were among the worst at parsing.
10.3 24 hours of data compression and parsing rankings
Compressor                       Average compression ranking   Average encoding time ranking
                                 across all products           across all products
EfficientXMLSmallerSAX           1                             9
EfficientXMLSmallerSchemaSAX     1                             10
EXIficientSAX + Document         1                             11
FastInfoSetSAX + Document GZIP   2                             5
GZIP                             3                             7
EXIficientSAX                    4                             6
EfficientXMLFasterSAX            4                             3
EXIficientSAX + Schema           5                             8
EfficientXMLFasterSchemaSAX      6                             4
FastInfoSetSAX                   7                             1
FastInfoSetSAX + Schema          8                             2
10.4 Detailed results:
10.4.1 METAR
10.4.1.1 Single
10.4.1.1.1 Compaction:
10.4.1.1.2 CPU usage:
10.4.1.2 24 hours
10.4.1.2.1 Compression:
10.4.1.2.2 CPU usage:
10.4.1.3 METAR compression and encoding processor ranking results:
Compressor                     File size   Compression  Ratio    CPU usage  CPU usage  Ratio
                               (bytes)     ranking      to XML   (ms)       ranking    to XML
Single METAR
Baseline XML                   3957        -            -        0.05       -          -
GZIP                           1163        10           0.29     0.14       9          2.80
EXIficientSAX                  1098        8            0.28     0.09       6          1.80
EXIficientSAX+Schema           366         1            0.09     0.61       10         12.20
EXIficientSAX+Document         792         4            0.20     0.61       10         12.20
FastInfoSetSAX                 1800        11           0.45     0.02       1          0.40
FastInfoSetSAX+Document GZIP   1063        7            0.27     0.13       8          2.60
FastInfoSetSAX+Schema          898         6            0.23     0.02       1          0.40
EfficientXMLFasterSAX          1098        8            0.28     0.03       3          0.60
EfficientXMLFasterSchemaSAX    345         1            0.09     0.03       3          0.60
EfficientXMLSmallerSAX         792         4            0.20     0.09       6          1.80
EfficientXMLSmallerSchemaSAX   356         1            0.09     0.08       5          1.60
24 Hours of METAR
Baseline XML                   437448875   -            -        6146.22    -          -
GZIP                           23087909    5            0.05     8246.64    9          1.34
EXIficientSAX                  24192238    6            0.06     4973.22    6          0.81
EXIficientSAX+Schema           30951421    9            0.07     6950.10    7          1.13
EXIficientSAX+Document         4991926     1            0.01     17641.23   11         2.87
FastInfoSetSAX                 47356881    10           0.11     2006.63    1          0.32
FastInfoSetSAX+Document GZIP   7537816     4            0.02     3840.13    5          0.62
FastInfoSetSAX+Schema          54842070    11           0.13     2936.98    3          0.48
EfficientXMLFasterSAX          24192238    6            0.06     2596.34    2          0.42
EfficientXMLFasterSchemaSAX    24578095    6            0.06     3474.46    4          0.57
EfficientXMLSmallerSAX         4833185     1            0.01     7416.57    8          1.20
EfficientXMLSmallerSchemaSAX   4900812     1            0.01     8523.70    10         1.39
10.4.2 PIREP
10.4.2.1 Single
10.4.2.1.1 Compression:
10.4.2.1.2 CPU usage:
10.4.2.2 24 hours
10.4.2.2.1 Compression:
10.4.2.2.2 CPU usage:
10.4.2.3 PIREP compression and encoding processor ranking results:
Compressor                     File size   Compression  Ratio    CPU usage  CPU usage  Ratio
                               (bytes)     ranking      to XML   (ms)       ranking    to XML
Single PIREP
Baseline XML                   1969        -            -        0.042      -          -
GZIP                           737         10           0.37     0.113      9          2.69
EXIficientSAX                  677         7            0.34     0.057      6          1.36
EXIficientSAX+Schema           226         3            0.14     0.590      11         14.50
EXIficientSAX+Document         511         5            0.26     0.417      10         9.93
FastInfoSetSAX                 1133        11           0.58     0.014      1          0.33
FastInfoSetSAX+Document GZIP   686         9            0.35     0.104      8          2.48
FastInfoSetSAX+Schema          410         4            0.21     0.016      2          0.38
EfficientXMLFasterSAX          677         7            0.34     0.017      3          0.40
EfficientXMLFasterSchemaSAX    216         1            0.11     0.018      4          0.43
EfficientXMLSmallerSAX         511         5            0.26     0.068      7          1.62
EfficientXMLSmallerSchemaSAX   224         1            0.11     0.053      5          1.26
24 Hours of PIREPs
Baseline XML                   9444309     -            -        370.72     -          -
GZIP                           3323869     5            0.35     674.64     7          1.82
EXIficientSAX                  4492884     6            0.48     528.61     6          1.43
EXIficientSAX+Schema           4977576     9            0.53     937.06     9          2.53
EXIficientSAX+Document         977322      1            0.10     2707.00    11         7.30
FastInfoSetSAX                 7994883     10           0.85     134.46     1          0.36
FastInfoSetSAX+Document GZIP   1347254     4            0.14     458.44     5          1.24
FastInfoSetSAX+Schema          8859492     11           0.94     182.90     2          0.49
EfficientXMLFasterSAX          4492884     6            0.48     277.07     3          0.75
EfficientXMLFasterSchemaSAX    4685868     8            0.50     358.39     4          0.97
EfficientXMLSmallerSAX         954567      1            0.10     886.76     8          2.34
EfficientXMLSmallerSchemaSAX   955960      1            0.10     1402.80    10         3.78
10.4.3 SIGMET/AIRMET
10.4.3.1 Single
10.4.3.1.1 Compression:
10.4.3.1.2 CPU usage:
10.4.3.2 24 hours
10.4.3.2.1 Compression:
10.4.3.2.2 CPU usage:
10.4.3.3 SIGMET/AIRMET compression and encoding processor ranking results:
Compressor                     File size   Compression  Ratio    CPU usage  CPU usage  Ratio
                               (bytes)     ranking      to XML   (ms)       ranking    to XML
Single SIGMET
Baseline XML                   3373        -            -        0.044      -          -
GZIP                           1326        6            0.37     0.193      9          4.39
EXIficientSAX                  1988        9            0.59     0.086      5          1.95
EXIficientSAX+Schema           900         2            0.14     0.643      10         14.61
EXIficientSAX+Document         1030        3            0.26     0.917      11         20.84
FastInfoSetSAX                 2446        11           0.73     0.017      2          0.39
FastInfoSetSAX+Document GZIP   1209        5            0.35     0.152      8          3.45
FastInfoSetSAX+Schema          1681        8            0.50     0.015      1          0.34
EfficientXMLFasterSAX          1988        9            0.59     0.031      3          0.70
EfficientXMLFasterSchemaSAX    1488        7            0.44     0.033      4          0.75
EfficientXMLSmallerSAX         1030        3            0.26     0.125      7          2.84
EfficientXMLSmallerSchemaSAX   700         1            0.11     0.108      6          2.45
24 Hours of SIGMETs
Baseline XML                   994371      -            -        4.441      -          -
GZIP                           117586      5            0.12     13.493     7          3.04
EXIficientSAX                  245982      6            0.25     8.975      5          2.02
EXIficientSAX+Schema           270504      9            0.27     18.685     10         4.21
EXIficientSAX+Document         42298       1            0.04     102.98     11         23.19
FastInfoSetSAX                 392797      10           0.39     1.844      1          0.42
FastInfoSetSAX+Document GZIP   58950       4            0.06     13.477     6          3.03
FastInfoSetSAX+Schema          402230      11           0.40     2.612      2          0.59
EfficientXMLFasterSAX          245982      6            0.25     4.791      3          1.08
EfficientXMLFasterSchemaSAX    249002      6            0.25     6.127      4          1.38
EfficientXMLSmallerSAX         41917       1            0.04     18.464     9          4.16
EfficientXMLSmallerSchemaSAX   41459       1            0.04     16.767     8          3.78
10.4.4 TAF
10.4.4.1 Single
10.4.4.1.1 Compression:
10.4.4.1.2 CPU usage:
10.4.4.2 24 Hours
10.4.4.2.1 Compression:
10.4.4.2.2 CPU Usage:
10.4.4.3 TAF compression and encoding processor ranking result:
Compressor                     File size   Compression  Ratio    CPU usage  CPU usage  Ratio
                               (bytes)     ranking      to XML   (ms)       ranking    to XML
Single TAF
Baseline XML                   2823        -            -        0.047      -          -
GZIP                           1078        8            0.38     0.132      9          2.81
EXIficientSAX                  1209        9            0.43     0.085      6          1.81
EXIficientSAX+Schema           541         2            0.19     0.61       11         12.99
EXIficientSAX+Document         803         4            0.28     0.746      10         15.87
FastInfoSetSAX                 1709        11           0.61     0.02       1          0.43
FastInfoSetSAX+Document GZIP   980         7            0.35     0.116      8          2.47
FastInfoSetSAX+Schema          824         6            0.29     0.016      2          0.34
EfficientXMLFasterSAX          1209        9            0.43     0.028      3          0.60
EfficientXMLFasterSchemaSAX    543         2            0.19     0.026      4          0.55
EfficientXMLSmallerSAX         803         4            0.28     0.101      7          2.15
EfficientXMLSmallerSchemaSAX   397         1            0.14     0.077      5          1.64
24 Hours of TAFs
Baseline XML                   187642751   -            -        2157.48    -          -
GZIP                           9910735     5            0.05     3099.33    8          1.44
EXIficientSAX                  8551255     5            0.05     1936.48    6          0.90
EXIficientSAX+Schema           11956090    9            0.06     2926.69    7          1.36
EXIficientSAX+Document         1544273     1            0.01     7151.67    11         3.31
FastInfoSetSAX                 19490564    10           0.10     834.474    1          0.39
FastInfoSetSAX+Document GZIP   2610643     1            0.01     1421.26    4          0.66
FastInfoSetSAX+Schema          22427030    11           0.12     1162.10    2          0.54
EfficientXMLFasterSAX          8551255     5            0.05     1195.45    3          0.55
EfficientXMLFasterSchemaSAX    8583793     5            0.05     1767.32    5          0.82
EfficientXMLSmallerSAX         1488381     1            0.01     3394.05    9          1.57
EfficientXMLSmallerSchemaSAX   1470162     1            0.01     4478.49    10         2.08
11 APPENDIX A - ACRONYMS
AIRMET   Airmen's Meteorological Information
API      Application Programming Interface
CSS-WX   Common Support Services - Weather
DOM      Document Object Model
GML      Geography Markup Language
ICAO     International Civil Aviation Organization
IWXXM    ICAO Weather Exchange Model
METAR    Aviation Routine Weather Report
NCAR     National Center for Atmospheric Research
OGC      Open Geospatial Consortium
PIREP    Pilot Report
SAX      Simple API for XML
SIGMET   Significant Meteorological Information
SPECI    Special Weather Report
StAX     Streaming API for XML
TAC      Traditional Alphanumeric Code
TAF      Terminal Aerodrome Forecast
W3C      World Wide Web Consortium
WFS      Web Feature Service
XML      Extensible Markup Language
WMO      World Meteorological Organization
WXXM     Weather Exchange Model