Data Structures to Avoid When Designing and Using e-Business Messaging
X12C TG3

Content with Multiple Meanings.

Content with Ambiguous Meanings.

Redundant Construction and Structures.
The X12 PO1 segment, which begins the 850 Purchase Order line item detail loop, has room for up to 10 different Product/Service IDs (e.g., UPC/EAN code, SKU) and their qualifiers; any additional pairs would be continued on subsequent LIN segments. EDIFACT, with a more orthogonal design, tends to use repetitions of the single PIA segment to hold the (possibly numerous) product IDs in the line item detail.

Programmatically Hostile Structures.
The 820 Payment Order/Remittance Advice transaction set includes the BPR segment (Beginning Segment for Payment Order/Remittance Advice) in the header, and the BPR carries the total payment amount. The detail remittance information follows later in the transaction set. Generating the 820 would therefore require two passes over the data: one to accumulate the sum, and another to emit the remittance details. Another example may be the X12 837 Health Care Claim, where all information concerning providers, subscribers, and patients is commingled in a single HL loop. Implementation guidelines clarify which segments are used in which circumstances, by presenting different images of the HL depending on whether a provider, subscriber, or patient is being described, but the raw X12 transaction set on its own would be almost impossible to interpret correctly.

Artificial Constructs.
The most obvious example of an artificial construct is the LX segment, which is used merely to begin a segment loop that requires an assigned identification. Other examples are the LS and LE (Loop Start and Loop End) segments, introduced to avoid segment collision in X12. Sometimes segments are made mandatory, or given artificial repeat counts of 1, simply to resolve segment conflict.

Mimicking Paper Forms.
This is perhaps the most egregious offense that can be committed against an EDI transaction set or guideline! Designers take the word "document" to heart and literally design the electronic message to the form. This unnecessarily restricts the message to use with a particular company or agency; had it been designed in a general sense, with a view to the functional data, it could be used by everyone in a similar industry. Take as an example the X12 813 Electronic Filing of Tax Return Data transaction set: the first indicator of a problem is the TXS Tax Form segment loop, which is used "[t]o indicate the tax form or the type of tax form being reported." The IRS has its forms, each state has its own, and they all have different form names. The paper forms all differ, but there is a common set of information that each jurisdiction requires. The information should really be sent to the taxing authority as a set of ledger entries, and the receiving application should be responsible for placing the data into the appropriate structures.

Arithmetic Extensions - Data that can be calculated.
Summations and sub-totals often accompany the previous problem, mimicking paper forms. There's almost no need for sub-totals in an electronic message, as the recipient presumably has a computer for making the calculations!

Overburdened Code Lists.
X12 D.E. 355, Unit or Basis for Measurement Code, has code values that pertain not only to units of measurement such as Milliliter, Meter, Inch, and Ampere, but also to things like Loaf, Wheel, and Truckload.

Ordinality in Codes.
X12 contains element code values expressing ordinality; that is, the code is a synonym for a numeric ordinal, which could have been better expressed simply as a number.
For example, Data Element 236, Price Identifier Code, is an ID type field and contains a good number of individual codes that stand for different contract pricing tiers: code Cnn means “Contract Tier nn”, where nn is a number from 1 to 30! Other examples abound: D.E. 1254, Immunization Status Code, has codes 1 through 9, which mean “First Inoculation,” “Second Inoculation,” and so on through “Ninth Inoculation.” EDIFACT is not immune from this syndrome: D.E. 1227 (Calculation sequence indicator, coded) enumerates code values 1 through 9 for “First step of calculation” through “Ninth step of calculation.”

Multi-Part Codes.
X12 has some data elements, such as D.E. 103, Packaging Code, whose value is a single alphanumeric string composed of two parts concatenated together, forcing the application or translator to handle the code as a special case. For example, code “BAG79” means “Plastic Bag”: the first part, BAG, is obvious, and the 79 means “Plastic.” This usage is deprecated in X12 and has been replaced by multiple element qualifiers within a composite. EDIFACT avoided the problem altogether, as composite data elements were available from the outset.

Fragmented Semantic Dispersal.
EDIFACT, and to some extent X12, often require the use of multiple elements spread across a number of different segments to convey a single idea or concept. In EDIFACT, this is actually quite an elegant way of building up meaning from any number of segments drawn from a relatively small segment dictionary. The disadvantage is that implementation conventions or guidelines become even more important than they are in X12. For example, the EDIFACT segment group PAT-DTM would be required to communicate one payment term, which X12 conveys more economically with the single ITD segment.

Floating Free-Form Text Elements.
The bane of EDI is free-form text, which has to be mechanically processed by the recipient.
The ITD segment, for example, includes an 80-character “free-form description to clarify the related data elements and their content.” This is where people often stick in “Net 30,” perhaps expecting the receiving computer to process the term, rather than using the codes and specific data elements intended for building up the extended payment terms. The notorious “floating” NTE (Note) segment has been deprecated in X12.

Artificial Lengths and Repetitions.
Some data elements are fairly generic and are defined to be long enough to contain any piece of data that might belong in that position. For example, PO106 and PO107 are the qualifier and data, respectively, in the PO1 segment of the X12 purchase order. A UCC/EAN product code may be at most 14 digits long, but there’s no way you could infer that from reading the X12 book. Though both X12 and EDIFACT attempted to design “reasonable” segment and loop repeat counts, some are so artificial as to leave one befuddled. A max repeat count of 1 seems obvious for data of which a message couldn’t possibly use more than one instance (e.g., the CUR segment in the 850 header, which sets the default currency for the transaction set: you would not have two “defaults”). Max repeat counts of >1 (no limit) in X12, and of hundreds of thousands in EDIFACT (for all practical purposes an infinite repeat count), are also of obvious use. Even nice round numbers like 40 (for the 850 MEA Measurements segment) might make sense in the real world. But how would one ever explain the max repeat count of 43 for the YNQ Yes/No Question segment in the X12 242 Real Estate Information Report? It may in fact be a residue of designing to the standard form (see Mimicking Paper Forms, above).

Data Types not supported by XML Schema.
X12 supports two different numeric data types: N (numeric) and R (real).
Numeric is a holdover from COBOL: its specification has a fixed number of implied decimal positions (to save space by not carrying the decimal point). Fortunately, EDIFACT never contained this abomination. A number IS a number, whether “real” or “floating.” Numbers should just conform to recognized standards, such as the IEEE 754 representation of real numbers (with exponents), as in XML Schema. Other examples abound. Both EDIFACT and X12 provide a number of formats for dates, times, and ranges (e.g., CCYYMMDD or YYMMDDHHMM), selected by a qualifier code value. XML Schema enforces the use of ISO 8601, an unambiguous international standard that is adequate for representing dates and times without a separate qualifier, or adjective, to describe the format.

Syntax Relational Conditions.
Both EDIFACT and X12 include syntax relationals: rules that specify the interdependencies between elements and/or composites within a segment (these are the P, R, E, C, and L, or “PRECL”, syntax notes in X12). The concept of element relationals was not really needed in EDIFACT until recently, because composites tended to keep related data with mutual dependencies bunched together. Likewise, in XML the need will seldom arise, because qualifiers may very well be "bunched" within elements in the form of attributes.

Use of an Element for other than its Intended Purpose.
This problem usually arises in implementation conventions rather than in the base X12 transactions or EDIFACT UNSMs.

Problems With Dropping Optional Data.
Some users of X12 or EDIFACT EDI deliberately ignore optional data elements, or all elements in some optional segments, that they have not explicitly asked for. But that optional data may have been intended by the sender to convey necessary information. For example, in an 850 PO, the sender may have submitted an optional DTM segment indicating that the order line item is not to be fulfilled if it cannot be shipped by a certain date.
If the supplier ignores the DTM segments, which are otherwise optional per X12 rules, it could run into problems for having accepted the PO when the shipment goes out after the drop-dead date.

Others which have not been discussed in X12C TG3:

Rigid sequential structure.
Segment instances must appear in EDI data in precisely the order dictated by the X12 or EDIFACT message layout. For example, every segment and loop within the PO line item detail loop of the 850 Purchase Order is optional, save for the PO1 loop trigger itself. This necessarily implies that there would be no segment collisions even if the segments were presented out of order. For various reasons, it might be preferable to emit the LDT Lead Time loop before the SCH Line Item Schedule loop.
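
As a rough illustration of the point about segment order, the following Python sketch groups the segments of one PO1 loop by segment ID, so that the result is the same whether LDT precedes SCH or follows it. Everything in it is an assumption made for illustration: the function names parse_segments and group_line_item are hypothetical, the "*" and "~" delimiters are just one common choice, the sample element values are made up, and the SCH and LDT loops are reduced to single segments. It relies on the observation above that no segment ID within the loop collides with another.

```python
from collections import defaultdict

# Made-up sample data: the same PO1 line item with SCH and LDT in either order.
# Element values are placeholders, not taken from a real 850.
RAW_SCH_FIRST = "PO1*1*10*EA*9.95**UP*012345678905~SCH*10*EA***002*20240115~LDT*AF*5*DA~"
RAW_LDT_FIRST = "PO1*1*10*EA*9.95**UP*012345678905~LDT*AF*5*DA~SCH*10*EA***002*20240115~"


def parse_segments(raw, element_sep="*", segment_term="~"):
    """Split raw EDI text into (segment_id, elements) pairs."""
    segments = []
    for chunk in raw.split(segment_term):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece after the final terminator
        parts = chunk.split(element_sep)
        segments.append((parts[0], parts[1:]))
    return segments


def group_line_item(segments):
    """Group the segments of one PO1 loop by segment ID, ignoring their order.

    Assumes the first segment is the PO1 loop trigger and that the segment ID
    alone identifies each following segment (no collisions, no nested loops).
    """
    if not segments or segments[0][0] != "PO1":
        raise ValueError("line item loop must start with PO1")
    grouped = defaultdict(list)
    for seg_id, elements in segments[1:]:
        grouped[seg_id].append(elements)
    return {"PO1": segments[0][1], "children": dict(grouped)}


sch_first = group_line_item(parse_segments(RAW_SCH_FIRST))
ldt_first = group_line_item(parse_segments(RAW_LDT_FIRST))

# Dictionary comparison ignores key order, so both orderings compare equal.
print(sch_first == ldt_first)
```

A real translator would still need the message layout to handle nested loops and repeat limits; the sketch only suggests that, for this one loop, segment IDs alone are enough to sort out the content.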