Data Structures to Avoid
When Designing and Using e-Business Messaging
X12C TG3
Content with Multiple Meanings.
Content with Ambiguous Meanings.
Redundant Construction and Structures. The X12 PO1 segment, beginning the 850 Purchase Order
Line Item detail loop, has room for up to 10 different Product/Service IDs (e.g., UPC/EAN code, SKU,
etc.) and their qualifiers; any additional pairs would be continued on subsequent LIN segments.
EDIFACT, with a more orthogonal design, tends to use repetitions of the single PIA segment for holding
the (possibly numerous) product IDs in the Line Item detail.
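As a rough illustration of what the numbered-slot design costs the receiver, the following Python sketch flattens the qualifier/ID slots of one PO1 segment into an ordinary list. It assumes the segment terminator has already been stripped, that "*" is the element separator, and that the pairs begin at PO106/PO107; the function name and the sample values are made up for the example.

```python
def product_ids(po1_segment, element_sep="*"):
    """Return (qualifier, product_id) pairs from one PO1 segment string."""
    elements = po1_segment.split(element_sep)
    # elements[0] is the tag "PO1"; PO101..PO105 sit at indexes 1-5,
    # so the qualifier/ID pairs start at index 6 (PO106/PO107).
    pair_fields = elements[6:26]  # room for at most 10 pairs
    pairs = []
    for i in range(0, len(pair_fields) - 1, 2):
        qualifier, value = pair_fields[i], pair_fields[i + 1]
        if qualifier and value:  # skip unused slots
            pairs.append((qualifier, value))
    return pairs


# Hypothetical sample data, not taken from any real purchase order.
sample = "PO1*1*10*EA*9.95**UP*012345678905*VN*ABC123"
print(product_ids(sample))
```

With a repeating segment such as PIA, the same list would fall out of a plain loop over segment instances, with no knowledge of slot positions.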
Programmatically Hostile Structures. The 820 Payment Transaction Set includes the BPR segment in
the header (Beginning Segment for Payment Order/Remittance Advice) which includes the total payment
amount. The detail remittance information follows in the transaction set. Generating the 820 would
require two passes over the data: 1) once to accumulate the sum, and 2) once more to emit the
remittance detail.
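The two passes can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the "HEADER" and "DETAIL" strings are placeholders rather than real BPR or remittance segment syntax, and the function name and input shape are hypothetical.

```python
from decimal import Decimal


def build_payment(remittance_lines):
    """remittance_lines: iterable of (invoice_number, amount_string) tuples."""
    # The detail must be buffered, because the header needs the total first.
    lines = list(remittance_lines)

    # Pass 1: accumulate the total payment amount for the header.
    total = sum((Decimal(amount) for _, amount in lines), Decimal("0"))
    output = [f"HEADER total={total}"]

    # Pass 2: emit the remittance detail.
    for invoice, amount in lines:
        output.append(f"DETAIL invoice={invoice} amount={amount}")
    return output


for record in build_payment([("INV1", "100.50"), ("INV2", "24.50")]):
    print(record)
```

If the total trailed the detail instead, the generator could stream each line as it was read and never hold the whole remittance in memory.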
Another example may be the X12 837 Health Care Claim, where all information concerning providers,
subscribers and patients is commingled in a single HL loop. Though implementation guidelines clarify
which segments are used in certain circumstances - by presenting different images of the HL depending
on whether a provider, subscriber, or patient is being described - the raw X12 transaction set would be
almost impossible to interpret correctly on its own.
Artificial Constructs. The most obvious example of an artificial construct is the LX segment, which is
used merely to begin a segment loop that requires an assigned identification. Other examples are the LS
and LE (Loop Start and Loop End) segments introduced to avoid segment collision in X12. Sometimes
segments are made mandatory or have artificial repeat counts (of 1) simply to resolve segment conflict.
Mimicking Paper Forms. This is perhaps the most egregious offense that can be committed against an
EDI transaction set or guideline! Designers take the word "document" to heart and literally design the
electronic message to the form! This unnecessarily restricts the message to use with a particular
company or agency; had it been designed in a general sense, with a view to the functional data, it
could be used by everyone in the same industry. Take as an example the X12 813 Electronic Filing of Tax
Return Data transaction set: the first indicator that there's a problem here is the TXS Tax Form segment
loop, which is used "[t]o indicate the tax form or the type of tax form being reported." The IRS has its
forms, and each state has its own, and they all have different form names. The paper forms are all different,
but there is a common set of information that each jurisdiction requires. The information should really be
sent to the taxing authority as a set of ledger entries, and the receiving application should be responsible
for placing the data into the appropriate structures.
Arithmetic Extensions - Data that can be calculated. Summations and sub-totals often accompany the
previous problem - mimicking paper forms. There's almost no need for sub-totals in an electronic
message, as the recipient presumably has a computer for making the calculations!
Overburdened Code Lists. X12 D.E. 355, Unit or Basis for Measurement Code, has code values
that pertain not only to true units of measurement such as Milliliter, Meter, Inch, and Ampere, but also
to items such as Loaf, Wheel, and Truckload.
Ordinality in Codes. X12 contains element code values expressing ordinality – i.e., the code is a
synonym for a numeric ordinal, which could have been better expressed just as a number. For example,
Data Element 236, Price Identifier Code, is an ID type field, and contains a good number of individual
codes which stand for different contract pricing tiers, e.g., code Cnn means “Contract Tier nn”, where nn
is a number from 1 to 30! Other examples abound: D.E. 1254 - Immunization Status Code – has codes 1
through 9 which mean “First Inoculation,” “Second Inoculation,” and so on, through “Ninth Inoculation.”
EDIFACT is not immune from this syndrome: D.E. 1227 (Calculation sequence indicator, coded)
enumerates code values 1 through 9 for “First step of calculation” through “Ninth step of calculation.”
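The extra work an ordinal code imposes on the receiver can be seen in a small Python sketch that recovers the tier number from a Cnn code. It assumes nn is always written as two digits (C01 through C30); the function name is hypothetical.

```python
import re

# Assumption: tiers are written with two digits, e.g. "C07".
_TIER_CODE = re.compile(r"C(\d{2})")


def contract_tier(price_id_code):
    """Return the tier number hidden in a Cnn code, or None for any other code."""
    match = _TIER_CODE.fullmatch(price_id_code)
    if match is None:
        return None
    tier = int(match.group(1))
    return tier if 1 <= tier <= 30 else None


print(contract_tier("C07"))  # 7
print(contract_tier("C31"))  # None
```

Had the tier been carried as a plain number in its own element, none of this pattern matching or range checking would be needed.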
Multi-Part Codes. X12 has some data elements, such as D.E. 103 - Packaging Code, whose value is a single
alphanumeric string composed of two parts concatenated together, forcing the application or translator
to handle the code as a special case. For example, code “BAG79” means “Plastic Bag,” where the first part,
BAG, is obvious, and the 79 means “Plastic.” This is a deprecated X12 use that now has been replaced
by multiple element qualifiers within a composite. EDIFACT avoided this problem altogether, as
composite data elements were available from the outset.
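The special-case handling might look like the Python sketch below. It assumes the first part is always three characters and the remainder is the second part; the lookup tables hold only the single example given above, and all names are illustrative.

```python
# Lookup tables limited to the one example from the text.
PACKAGING_FORMS = {"BAG": "Bag"}
PACKAGING_MATERIALS = {"79": "Plastic"}


def split_packaging_code(code):
    """Split a concatenated packaging code into (form_part, material_part)."""
    # Assumption: the form is the first three characters.
    return code[:3], code[3:]


def describe_packaging(code):
    form_part, material_part = split_packaging_code(code)
    form = PACKAGING_FORMS.get(form_part, form_part)
    material = PACKAGING_MATERIALS.get(material_part, material_part)
    return f"{material} {form}".strip()


print(describe_packaging("BAG79"))  # Plastic Bag
```

With a composite, the two parts arrive as separate component elements and the slicing step disappears.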
Fragmented Semantic Dispersal. EDIFACT, and to some extent X12, often requires the use of multiple
elements among a number of different segments to convey a single idea or concept. In EDIFACT, this is
actually a quite elegant way of building up meaning from any number of segments from a relatively small
segment dictionary. The disadvantage is that implementation conventions or guidelines are even more
important than they are in X12. For example, the EDIFACT segment group PAT-DTM would be
required to communicate one payment term, which is more economically conveyed in X12 with the ITD
segment.
Floating Free-Form Text Elements. The bane of EDI is free-form text, which has to be mechanically
processed by the recipient. The ITD segment, for example, includes an 80 character “free-form
description to clarify the related data elements and their content.” This is where people often stick in
“Net 30,” perhaps expecting the receiving computer to process the term, rather than using the codes and
specific data elements which would be used to build up the extended payment terms.
The notorious “floating” NTE (Note) segment has been deprecated in X12.
Artificial Lengths and Repetitions. Some data elements are fairly generic and are defined as long
enough to contain any piece of data that might belong in that position. For example, PO106 and PO107
in X12 are the qualifier and data, respectively, in the PO1 segment of the PO. A UCC/EAN product code
has a maximum length of 14 digits, but there’s no way you’d be able to infer that by reading
the X12 book.
Though both X12 and EDIFACT attempted to design “reasonable” segment and loop repeat counts,
sometimes they’ve been so artificial as to leave one befuddled. Max repeat counts of 1 seem obvious, for
that data where a message couldn’t possibly have use for more than one instance (e.g., the CUR segment
in the 850 header, which sets the default currency for the transaction set: you would not have two
“defaults”). And Max repeat counts of >1 (no limit) in X12 and hundreds of thousands in EDIFACT (for
all practical purposes, an infinite repeat count) are also of obvious use. Even nice round numbers like 40
(for the 850 MEA Measurements segment) might make sense in the real world. But how would one ever
explain the max repeat count of 43 for the YNQ Yes/No Question in the X12 242 Real Estate Information
Report? It may in fact be a residue of designing to the standard form (see Mimicking Paper Forms, above).
Data Types not supported by XML Schema. X12 supports two different numeric data types –
N (numeric) and R (real). Numeric is a holdover from COBOL, and its specification has a fixed number
of implied decimal positions (in order to save space by not carrying the decimal point). Fortunately,
EDIFACT never contained this abomination. A number IS a number, whether “real” or “floating.”
Numbers should just conform to recognized standards, such as the IEEE 754 floating-point representation
(with exponents), as in XML Schema.
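To show what the implied decimal point demands of the receiver, here is a minimal Python sketch that decodes an N-type value. The number of implied decimal positions has to come from the data element definition, since nothing in the data itself carries it; the function name is hypothetical.

```python
import re
from decimal import Decimal


def decode_implied_decimal(raw, implied_decimals):
    """Decode an N-type value such as "12345" with 2 implied decimal positions."""
    if not re.fullmatch(r"-?\d+", raw):
        raise ValueError(f"not an implied-decimal numeric value: {raw!r}")
    # Shift the decimal point left by the number of implied positions.
    return Decimal(raw).scaleb(-implied_decimals)


print(decode_implied_decimal("12345", 2))  # 123.45
print(decode_implied_decimal("12345", 0))  # 12345
```

The same string decodes to different numbers depending on a definition held outside the message, which is exactly the ambiguity an explicit decimal point removes.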
Other examples abound. Both EDIFACT and X12 provide a number of formats for dates, times and
ranges, e.g., CCYYMMDD or YYMMDDHHMM, driven by a qualifier code value. XML Schema
enforces use of ISO 8601, an unambiguous international standard which is adequate for representing
dates and times without the use of a separate qualifier, or adjective, to describe the format.
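A converter from qualifier-driven layouts to ISO 8601 could be sketched in Python as below. The table is keyed by the layout names used above rather than by real qualifier code values, and covers only those two layouts. Note that Python's %y directive picks a century for two-digit years (values 69 through 99 become 19xx, the rest 20xx), which is an assumption a real translator would have to settle explicitly.

```python
from datetime import datetime

# Layout name -> (strptime pattern, whether a time of day is included).
# Keys are illustrative stand-ins for the qualifier code values.
LAYOUTS = {
    "CCYYMMDD": ("%Y%m%d", False),
    "YYMMDDHHMM": ("%y%m%d%H%M", True),
}


def to_iso8601(value, layout):
    """Convert an EDI date or date/time string to its ISO 8601 form."""
    pattern, has_time = LAYOUTS[layout]
    parsed = datetime.strptime(value, pattern)
    if has_time:
        return parsed.isoformat(timespec="minutes")
    return parsed.date().isoformat()


print(to_iso8601("20240315", "CCYYMMDD"))      # 2024-03-15
print(to_iso8601("2403151430", "YYMMDDHHMM"))  # 2024-03-15T14:30
```

The ISO 8601 result needs no accompanying qualifier: the form of the string itself says whether a time is present.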
Syntax Relational Conditions. Both EDIFACT and X12 include syntax relationals – rules that specify
the interdependencies between elements and/or composites within a segment (these are the P, R, E, C, and L
syntax notes in X12). The concept of element relationals was not really needed in EDIFACT until
recently because composites tended to keep related data that had mutual dependencies bunched together.
Likewise, in XML, the need will seldom arise because qualifiers may very well be "bunched" within
elements in the form of attributes.
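A translator's check of the five X12 relational conditions might be sketched in Python as follows. It assumes the usual reading of the condition codes: P (paired) means all or none of the listed elements are present; R (required) means at least one is; E (exclusion) means at most one is; C (conditional) means that if the first is present, all the others must be; L (list conditional) means that if the first is present, at least one of the others must be. The function name and the list-of-booleans input are illustrative only.

```python
def relational_ok(kind, present):
    """Check one syntax note.

    kind    -- one of "P", "R", "E", "C", "L"
    present -- booleans, one per element named in the note, in the note's order
    """
    count = sum(present)
    if kind == "P":  # paired: all or none
        return count in (0, len(present))
    if kind == "R":  # required: at least one
        return count >= 1
    if kind == "E":  # exclusion: at most one
        return count <= 1
    if kind == "C":  # conditional: first implies all the others
        return (not present[0]) or all(present[1:])
    if kind == "L":  # list conditional: first implies at least one other
        return (not present[0]) or any(present[1:])
    raise ValueError(f"unknown relational condition: {kind!r}")


# A qualifier sent without its paired value violates a P note.
print(relational_ok("P", [True, False]))  # False
```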
Use of an Element for other than its Intended Purpose. This problem usually arises in implementation
conventions, rather than the base X12 transactions or EDIFACT UNSMs.
Problems With Dropping Optional Data. Some users of X12 or EDIFACT EDI deliberately ignore
optional data elements, or all elements in some optional segments, for which they have not explicitly
asked. But that optional data may have been intended by the sender to indicate necessary information.
For example, in an 850 PO, the sender may have submitted an optional DTM segment indicating that the
order line item is not to be fulfilled if it cannot be shipped by a certain date. If the supplier ignores the
DTM segments, which are otherwise optional per X12 rules, it could run into problems having accepted
the PO when the shipment is sent after the drop-dead date.
Others which have not been discussed in X12C TG3:
Rigid sequential structure. Segment instances must be included in EDI data in precisely the order
dictated by the X12 or EDIFACT message layout. For example, every segment and loop within the PO
line item detail loop within the 850 Purchase Order is optional, save for the PO1 loop trigger itself. This
necessarily implies that there would be no segment collisions even if the segments were presented out of
order. For various reasons, it might be preferred to emit the LDT Lead Time loop before the SCH Line
Item Schedule loop.