WHY TABLE DRIVEN CODE FORMS?
Langen, 17 April 2007
(Joël Martellet, WMO, World Weather Watch, Data Processing and Forecasting Systems)

Why table driven codes? Plan
• The past: fixed alphanumeric codes
• The present: fast evolution of science and technology
• Solution: table driven codes
• Objectives of this seminar
• What is different?

WHY WMO CODES?
• A WMO “code” is a way of representing data which are exchanged internationally between countries, i.e. a data representation form
• The purpose is the global exchange of meteorological data (and data associated with meteorology or hydrology, e.g. marine or oceanographic data) in an unambiguous and efficient way, for use by many different applications in different countries
• FUNDAMENTAL: IT IS THE RAW MATERIAL OF METEOROLOGY
• The design of the codes should take into account operational constraints

WMO DATA REPRESENTATION
The design of a data representation system for real time exchange and processing has to take into account operational constraints
• THE PAST: the constraints decades ago:
  – Telecommunication lines with low bandwidth (50 bits/s) led to:
    • character type
    • abbreviated coding
    • parameters often represented by a code figure rather than the exact value
  – Manual operation:
    • human readability
    • not too frequent
• Solution:
  – FIXED ALPHANUMERIC CHARACTER CODES: in the past, the meteorological codes were seen at all stages of the data flow!

CONSIDER THE WORLD WEATHER WATCH DATA FLOW (OLD!)
[Diagram: Global Observing System (GOS): observer and encoding on teletype; Global Telecommunication System (GTS); Global Data Processing System (GDPS): teletype, paper, manual plotting, forecaster]
• The old WMO “codes” were developed by data type, e.g. TEMP, SYNOP, SATOB, etc.
• And they have a «fixed» format.
• Examples: FM 42 AMDAR, FM 41 CODAR, FM 13 SHIP, FM 35 TEMP, FM 88 SATOB, FM 87 SARAD, FM 86 SATEM, FM 85 SAREP

Nowadays: faster technical evolution
• Expanding requirements for applications, e.g. NWP and global climate studies
• Increasing volume of data (e.g. satellite) and complexity of observations
• Higher accuracy of the measurements
• Higher resolution required for observations (e.g. radiosounding):
  – In time (more frequent)
  – In space (vertical resolution)
  – All radiosonde observations required (flight of the sonde, like an aircraft) for very high resolution non-hydrostatic models
• New types of data observed and exchanged (e.g. ozone, radiology, sea and water level, etc.)
• More frequent requests for changes for new data types

And how to modify a traditional “fixed” code, e.g. by inserting a new group?
• Such a modification would require a corresponding update to all software programs that encode or decode such reports; otherwise the software would either give incorrect values or fail completely.
• The reason is that the fixed format and the coding conventions which define the data have to be translated and built fully into the coding and decoding programs. It is this fact that renders the traditional alphanumeric code forms incapable of accommodating new types of data.

Difficulties, if not impossibilities:
• Changing requirements but fixed codes
• Change implies long time scales and costs, because automation came:
  – software fixes needed at all stages of the processing chain
  – equipment made by manufacturers
  – training required, etc.
• Approval of code changes by the official WMO body (CBS), and then implementation by Member Countries, takes 2 to 4 years, or even more
• This is incompatible with the modern fast rate of evolution of science and technology

Then blocking!
• Current (but old) Traditional Alphanumeric Codes (TAC) prevent the exchange of critical environmental information to NMHSs and their customers
• There is an unnecessarily inefficient and costly use of resources for information exchange
• They will increasingly constrain NMHS capabilities to exchange more accurate and timely information
• They will become increasingly costly to sustain in the face of expansion and growing scientific needs.

WHY TABLE DRIVEN CODES?
Operational meteorology evolves:
• More automation:
  – more computer processing
• Higher bandwidth (9.6 kb/s, 64 kb/s, 1 Mb/s)
• More frequently: new parameters (e.g. satellite, oceanography)
• Higher resolution required in time and space
• Higher accuracy, more precision required
Today the meteorological code does not need to be seen at all!

CONSIDER THE WORLD WEATHER WATCH DATA FLOW (ALL AUTOMATED):
[Diagram: Global Observing System (GOS): Automatic Weather Station (AWS), automatic encoding; Global Telecommunication System (GTS); Global Data Processing System (GDPS): automatic decoding, displaying program, forecaster]

Different functions within the data flow:
• The format used to represent data could be different at each stage, if it is more efficient.
• Do not confuse the functions of:
  – Data acquisition: AWS sensor value format (or observer reporting data)
  – Data collection: encoding the observer report (or the AWS sensor values) in a data representation form for national collection
  – Data transmission: keeping the format, or encoding with a view to international transmission
  – International transmission process: if possible do not change the data representation, just move a mail “envelope”
  – Data reception: decode the data representation form to feed a database or other processing applications
  – Data visualization: display the data values for a human reader, i.e. convert to the most appropriate display format for the user
• The WMO concern is the international exchange and the agreed formats to transmit data between countries.
• National formats could differ from the WMO recommended standards if more appropriate.
• But we need new efficient formats for international exchanges.

SOLUTION: Need for Data Representations which offer:
– EXPANDABILITY (ability to easily add new parameters, to change accuracy):
  • BUFR, CREX, GRIB Edition 2 (GRIB2)
– SELF DESCRIPTION (auto description of content):
  • BUFR, CREX, GRIB Edition 1 (GRIB1), GRIB2
– FLEXIBILITY (ability to vary the content):
  • BUFR, CREX, GRIB2
– SUSTAINABILITY (old archives readable):
  • BUFR, CREX, GRIB1, GRIB2
– COMPRESSION (for binary digital exchange):
  • BUFR, GRIB1, GRIB2

BUFR: Binary Universal Form for the Representation of meteorological data
• Designed in the early eighties; approved for operational implementation in November 1988
• Used for archives of all data types and for operational exchange of satellite data, ASDAR, AMDAR, wind profilers, tropical cyclone data, ARGOS data (buoy, XBT, XCTD, sub-surface floats)
• Starting to be used for translating Traditional Alphanumeric Codes (TAC: SYNOP, SHIP, TEMP, CLIMAT, etc.)

CREX: Character form for the Representation and Exchange of data
• Designed in the nineties (it is the image of BUFR in characters); approved for operational implementation in May 2000
• Designed for data types which do not have Traditional Alphanumeric Codes (TAC) and where BUFR cannot be used
• Used for operational exchange of ozone, radiological, tide gauge, hydrological and soil temperature data, squall lines (Africa) and tropical cyclone information
• Also a good tool to understand BUFR; could be an interim solution before BUFR, if binary connection and binary processing are not possible

Objectives of the seminar
• To understand fully the structure of BUFR
• To understand how to convert Traditional Alphanumeric Codes (TAC) data into BUFR
• To understand how to initiate additions to the Table Driven Code Forms and have them approved
• To understand how to use some of the existing decoder and encoder software and implement it in the data processing chains
• To understand how to organize the migration internationally and therefore nationally (migration plan)
• To understand what to do to start the migration operationally within the World Weather Watch System

TABLE-DRIVEN CODES FORMAT
• In a table driven code form, as in the Traditional Alphanumeric Codes, there is also a fixed structure, but it applies only to the shape of the «container» (the code layout or structure) rather than to the content of the «container».
• These structure rules are the same whatever the type of data: “only one physical format for all data types”.
• The presence and form of the data are described within the «container» itself. This is the concept of SELF-DESCRIPTION.
• To accomplish this, a BUFR, CREX or GRIB message has a DATA DESCRIPTION SECTION, where the type and form of the data contained within the message are defined. Once this section is read, the content that follows, the transmitted data (in the DATA SECTION), can be understood (and decoded).

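The self-description idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BUFR itself: the bit layout is simplified, the names `Element`, `TABLE`, `pack` and `unpack` are hypothetical, and the table entries are illustrative (the authoritative entries are those of the WMO Manual on Codes). The point it shows is that the decoder contains no knowledge of any particular report: it reads the list of element keys carried with the message and looks each one up in a table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One table entry: how a single datum is coded (hypothetical layout)."""
    name: str
    unit: str
    scale: int      # decoded value = (raw + reference) / 10**scale
    reference: int  # offset added to the raw integer
    width: int      # field width in bits


# Illustrative table entries, assumed for this sketch only.
TABLE = {
    "011001": Element("wind direction", "degree true", 0, 0, 9),
    "011002": Element("wind speed", "m/s", 1, 0, 12),
    "012001": Element("temperature", "K", 1, 0, 12),
}


def pack(descriptors, values, table=TABLE):
    """Encode values as one bit string, in the order of the descriptors."""
    bits = ""
    for key, value in zip(descriptors, values):
        elem = table[key]
        raw = round(value * 10 ** elem.scale) - elem.reference
        bits += format(raw, f"0{elem.width}b")
    return bits


def unpack(descriptors, bits, table=TABLE):
    """Decode a bit string using only the descriptor list and the table."""
    decoded, pos = [], 0
    for key in descriptors:
        elem = table[key]
        raw = int(bits[pos:pos + elem.width], 2)
        pos += elem.width
        decoded.append((elem.name, (raw + elem.reference) / 10 ** elem.scale, elem.unit))
    return decoded


if __name__ == "__main__":
    # The "container": a description part followed by a data part.
    description = ["011001", "011002", "012001"]
    data = pack(description, [260, 24.6, 275.3])
    for name, value, unit in unpack(description, data):
        print(f"{name}: {value} {unit}")
```

In this sketch, exchanging a new parameter means adding one entry to `TABLE` and one key to the description list; `pack` and `unpack` stay unchanged, which is the contrast with a fixed alphanumeric code.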
THE DATA DESCRIPTION SECTION
• This section contains a set of “pointers” (called “descriptors”) which refer to elements listed in predefined and internationally agreed TABLES (kept in the official WMO Manual on Codes and on the WMO web server).
• In the TABLES, the list of descriptors (elements) defines how every datum is and SHALL BE coded.
• Hence the name: “TABLE DRIVEN CODE FORMS”.
• All the data to be transmitted must be defined in the tables of the WMO Manual (e.g. name, unit (m/s, K, …), size: 6 bits, 9 bits, …).

Descriptors in the Data Description Section
• In the MESSAGE: the Data Description Section contains the list of descriptors
• In the WMO MANUAL: a set of TABLES containing sets of descriptors
• Each descriptor defines how a parameter (an element) shall be coded. Knowing that, I can decode the data in the Data Section (indeed, the data must be listed in the same order as the descriptors in the Data Description Section), and the Tables must be readable by the decoder (program or human)

General structure of a TDCF message
Table driven codes generally have this structure:
• Indicator: GRIB/BUFR/CREX
• Identification: date, time, originator, edition number, table version number ...
• Optional section: e.g. more metadata, private data …
• Data description section: what sort of data follows (pointers to WMO tables identifying the data to be transmitted)
• Data section: actual data here
• Closure: “7777”

• When there is a requirement for transmission of new parameters or new data types, items are simply added to the WMO defined tables (to be agreed by CBS), and new pointers (descriptors) are included in the data description section to reference the new items, if they must be exchanged.
• In principle, table driven codes can transmit an “infinity of information”. There is total FLEXIBILITY.
• Definition of new «codes» as such is no longer necessary. Expansion of the WMO tables is sufficient = EXPANDABILITY.
• The edition number of the format (physical structure of the message) and the version number of the WMO tables, recorded in the identification section at a permanently fixed position, enable the safe processing of archived data = SUSTAINABILITY.

BUFR versus CREX
IN THE DATA SECTION:
• IN BUFR, AN ITEM (PARAMETER CORRESPONDING TO A TRANSMITTED DATUM IN A REPORT) IS TRANSLATED INTO A SET OF BITS.
• IN CREX, AN ITEM IS TRANSLATED INTO A SET OF CHARACTERS (BYTES). CREX IS THE IMAGE IN CHARACTERS OF BUFR BIT FIELDS. THUS, IT IS ALSO VERY SIMPLE TO READ A CREX MESSAGE.
• IN BUFR AND CREX THE PARAMETERS ARE SIMPLY LISTED AS THE USER REQUIRES. THE DATA ARE LAID OUT ONE AFTER THE OTHER.

BUFR versus CREX: Compression (1)
• BUFR has an advantage over CREX: it offers condensation (packing) or compression; therefore voluminous data (e.g. satellites, ACARS, wind profilers) will require fewer resources for transmission and storage.
• But the big disadvantage is that a human cannot read it!

BUFR versus CREX: Compression (2)
• BUFR provides efficient data packing via:
  – “BUFR compression”, part of the BUFR (and GRIB) specifications, like run length compression. Compression is performed by an algorithm which is defined within the code regulations of BUFR (and GRIB).
  – And, to some extent, data condensation by:
    • Use of combined data descriptors, called “common sequences”
    • Use of scale and reference values for data descriptors
• CREX is not designed for packing efficiency, and is obviously not fit for bulky data

BUFR versus CREX
• CREX provides flexibility and human readability.
• It requires a substantial number of characters for the transmission of large or numerous reports. (However, considered as a bit stream, it can be compressed to some extent by the compression algorithms available today, like any computer file.)
• CREX tables will have the same parameters as BUFR tables, and similar rules, but CREX will be simpler. There is no packing, no associated data (flags, substituted values).
• CREX is the image of BUFR and provides a standard (but not the best) for visualisation (reading and decoding) of BUFR information.

VERY SIMPLE CREX EXAMPLE
  CREX++
  T000104 A000 B11001 B11002++
  260 0246+
  030 0137++
  7777
• Data Description Section:
  – CREX WMO Master Table 00, Edition 1, Version 4, surface data, wind direction, wind speed
• Data Section:
  – wind 1: direction 260 degrees, speed 24.6 m/s
  – wind 2: direction 30 degrees, speed 13.7 m/s

CODING OF SQUALL LINES IN WEST AFRICA IN CREX
Observations (3 points) and forecasted trajectory and evolution:
  CREX++
  T00020412// A007001 P00012000 U00 S001 Y20050823 H1830 D16060++
  2005 08 23 17 50 1500 -01000 070 00010 1900 -00840 1100 -01220 02 0038 0300++
  7777

DATA DESCRIPTION SECTION:
• T00020412//:
  – 00 = Master table for meteorology
  – 02 = CREX edition number
  – 04 = BUFR edition number
  – 12 = version of the BUFR/CREX tables
  – // = No local table
• A007001:
  – 007 = Synoptic features
  – 001 = Squall line
• P00012000:
  – 00012 = Originating centre = Dakar
  – 000 = No sub-centre
• U00: 00 = Sequence number of the message, 0 = first
• S001: 001 = Number of sub-sets in the report = 1
• Y20050823: Date of the message = year, month, day
• H1830: Hour of the message = hour, minute
• D16060: Common sequence defining a squall line (type 1)

DATA SECTION (Part 1):
• Time of observation:
  – D01011 Date: 2005 = Year, 08 = Month, 23 = Day
  – D01012 Hour: 17 = Hour, 50 = Minute
• Position of squall line centre:
  – B05002 Latitude: 1500 = 15 deg. North
  – B06002 Longitude: -01000 = 10 deg. West
  – B19005 Direction of moving feature: 070 = 070 degrees
  – B19006 Speed of moving feature: 00010 = 10 m/s

DATA SECTION (Part 2):
• Amplitude of feature, from most external point to centre point:
  – North side:
    • B05002 Latitude: 1900 = 19 deg. North
    • B06002 Longitude: -00840 = 8 deg. 40 West
  – South side:
    • B05002 Latitude: 1100 = 11 deg. North
    • B06002 Longitude: -01220 = 12 deg. 20 West
• B20048 Evolution of feature: 02 = Intensification
• B11041 Maximum burst expected: 0038 = 38 m/s
• B13055 Intensity of rain expected: 0300 = 300 mm/h

CODE TABLE 0 20 048: Evolution of feature
  Code figure   Meaning
  0             Stability
  1             Diminution
  2             Intensification
  3             Unknown
  4-14          Reserved
  14-99         Not used

EXAMPLE FOR SYNOP
A SYNOP message from an RA VI high altitude station, the station VAN (Turkey):
  SMTU10 LTAA 230600
  AAXX 23064
  17170 11665 60907 10025 21030 38444 48605 52026 69942 70262 83540
  333 21035 3/105 42002 70025 83635 86359=

The corresponding CREX message:
  KSMDii LTAA 230600
  CREX++
  T000104 A000 D07222++
  17 170 1 2005 11 23 06 00 3845 04332 16690 16620 08444 ///// 0026 02 08500 01605 14 090 0035 025 -030 067 1500 002 06 02 075 07 03 0105 35 24 10 0002 01 03 06 0105 02 06 03 0270 12 -052 00002 00025 -0012 00004 00205 -0012 //// -0012 -0350++
  7777
Note: The underlined groups in the data section represent additional information that is not included in the SYNOP report.

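As a rough illustration of how readable CREX is for a program as well as for a human, the VERY SIMPLE CREX EXAMPLE above can be decoded with a short Python sketch. It is not a CREX decoder: it handles only that edition 1 layout with element (B) descriptors, and the function name `decode_simple_crex` and the two-entry table `CREX_TABLE_B` are assumptions made for this example (name, unit, scale and character width taken from what the slide states about the two wind elements).

```python
# Illustrative table: descriptor -> (name, unit, scale, width in characters).
# Only the two elements used in the simple example are listed.
CREX_TABLE_B = {
    "B11001": ("wind direction", "degree true", 0, 3),
    "B11002": ("wind speed", "m/s", 1, 4),
}


def decode_simple_crex(message, table=CREX_TABLE_B):
    """Decode a minimal CREX message made of B descriptors only."""
    # Sections are separated by "++": indicator, description, data, end.
    parts = [p.strip() for p in message.split("++")]
    if len(parts) != 4 or parts[0] != "CREX" or parts[3] != "7777":
        raise ValueError("not a simple CREX message")

    groups = parts[1].split()
    header = groups[0]  # TttEEvv: master table, edition, table version
    if not (header.startswith("T") and len(header) == 7 and header[1:].isdigit()):
        raise ValueError("unsupported table group: " + header)
    # The A group (data category) is skipped; only B groups are decoded here.
    descriptors = [g for g in groups[1:] if g.startswith("B")]

    subsets = []
    # Within the data section, a single "+" separates the reports (subsets).
    for chunk in parts[2].split("+"):
        fields = chunk.split()
        if len(fields) != len(descriptors):
            raise ValueError("data do not match the description section")
        report = {}
        for key, field in zip(descriptors, fields):
            name, _unit, scale, width = table[key]
            if len(field) != width:
                raise ValueError(f"bad width for {key}: {field!r}")
            # A field made only of "/" marks a missing value.
            report[name] = None if set(field) == {"/"} else int(field) / 10 ** scale
        subsets.append(report)

    return {
        "master_table": header[1:3],
        "edition": int(header[3:5]),
        "table_version": int(header[5:7]),
        "subsets": subsets,
    }


if __name__ == "__main__":
    msg = "CREX++ T000104 A000 B11001 B11002++ 260 0246+ 030 0137++ 7777"
    print(decode_simple_crex(msg))
```

The data values come out in the same order as the descriptors, as required by the Data Description Section; sequence descriptors (D groups) such as D16060 or D07222 would need to be expanded from the tables first, which this sketch does not attempt.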
The Table Driven Code Forms (TDCF) BUFR/CREX
• Offer great advantages compared to the traditional alphanumeric codes (TAC) (1):
  – Self-description, flexibility, expandability, sustainability, packing (BUFR), readability (CREX = image of BUFR in alphanumeric code)
  – For new parameters or new data types, no need to change software, just additional table entries (this is fundamental in light of the fast evolution of science and technology: there are regular requests for representation of NEW DATA TYPES, METADATA, HIGHER RESOLUTION (TIME AND SPACE) AND HIGHER ACCURACY)

The Table Driven Code Forms (TDCF) BUFR/CREX
• Offer great advantages compared to the traditional alphanumeric codes (TAC) (2):
  – The systematic passing of metadata, including geographical coordinates (latitude, longitude, height), in every report, which can easily be done with the table driven codes, would alleviate the notorious WMO Volume A problems.
  – Volume A, containing station coordinates, is updated with too much delay: the WMO Secretariat receives the updates that the countries should send sometimes with considerable delay, and other times not at all.
  – The use of BUFR or CREX would solve the majority of the cases where there is a problem of wrong coordinates for a station.
  – The reliability of binary data transmission leads to the expectation of an increase in the quality and quantity of data received in meteorological centres.
  – More data of better quality lead to better data assimilation, and consequently better products: better forecasts, better climate studies and better products generated by data processing centres

SUMMARY: TABLE DRIVEN CODES, LIKE BUFR, GRIB 2 AND CREX, OFFER:
• SELF DESCRIPTION
• FLEXIBILITY
• EXPANDABILITY
• SUSTAINABILITY
• COMPRESSION (PACKING) FOR BUFR
• EASY READABILITY FOR CREX
• DATA OF BETTER QUALITY LEADING TO BETTER PRODUCTS
• IT IS A MUST TO USE THEM IN THE 21ST CENTURY FOR THE SAKE OF SCIENCE EVOLUTION AND PROGRESS

THANK YOU FOR YOUR ATTENTION
Questions???