T. Section-by-Section Rationale

T.I Information Systems Annex - Rationale

The Information Systems Annex was designed in response to Section 10 of the Ada 9X Requirements document (Requirements for Information Systems). These requirements dictated the principal areas of concern of the IS Annex:

   o Decimal Arithmetic and Representation
   o Character Set Definition and Manipulation
   o String Handling
   o Interface Facilities

The current version of the Rationale discusses the main alternatives considered for the decimal area and the reasons for the design decisions taken. The other topics will be addressed in a subsequent version.

The Information Systems Annex Rationale consists of two parts. The first part is organized in parallel with the sections of the IS Annex itself and describes the reasons for specific decisions. The second part (also to be supplied in a subsequent version) describes the general set of issues relating to decimal arithmetic, the principal alternatives considered and their tradeoffs, and the reasons for the choice of decimal model.

[Note: the March 1992 IS Annex Rationale contained a detailed version of the second part. It is currently undergoing revision to reflect the changes to the Ada 9X "core language" mapping since that time.]

(Part One: Section-by-Section Rationale)

T.I.1. Decimal Arithmetic and Representation

T.I.1.1. Decimal Types

There are two principal techniques for realizing decimal arithmetic. One is through a package, possibly generic, that provides appropriate type(s) and operations. The simplest package solution declares a private discriminated type, where the discriminants correspond to precision (number of decimal digits) and scale (number of digits to the right of the decimal point). The second approach is through Ada's numeric type facility; in particular, fixed point types with smalls that are powers of 10.

Financial systems using Ada 83 have adopted the package approach rather than fixed point.
To be usable for decimal computation in an IS application, fixed point requires 64-bit integer arithmetic, control over truncation versus rounding on a per-operation basis, and support for 'SMALL representation clauses for powers of 10. Existing Ada 83 implementations, however, have not met these requirements; hence a package approach has been necessary.

The principal benefits of the package solution are:

   o Easy separation from the rest of the language (thus easy encapsulation into an Annex)
   o Consistency with SQL semantics and terminology
   o Availability of I/O subprograms without the need for generic instantiations

On the other hand, the package approach has several problems. Semantically, it is difficult to obtain literals and other convenient operations intrinsic to numeric types. Pragmatically, it is difficult to obtain acceptable run-time performance. Stylistically, it is difficult to exploit some of Ada's principal software engineering advantages (strong typing, logical range specifications).

Fixed point illustrates the opposite tradeoffs. As a category of numeric type it immediately offers literals and other required operations; herculean optimizations are not needed to obtain acceptable performance; and type differentiation is a natural style. However, fixed point is not an especially simple part of the language, especially for programmers new to Ada. The need to perform per-type generic instantiations for common operations such as I/O makes for a rather heavy style and may lead to large executables in the absence of code sharing. Moreover, the original Ada 83 design saw fixed point more as a substitute for floating point in spartan target environments than as a mechanism for realizing COBOL-style arithmetic.

The choice between the two alternatives was based on a refinement of the decimal requirements, the preparation and comparison of specific solutions in the two areas, and an analysis of programming examples using the two techniques.
The result was a decision to use the fixed-point approach as the basis for decimal arithmetic. Details on the refined set of requirements, specific elements of the alternatives considered, and an evaluation of the approaches with respect to the requirements, appear in Part Two. These issues are also addressed in [Brosgol 92], [Dewar 90], [Emery 92], [Wichmann 91a] and [Wichmann 91b].

T.I.1.2. Package DECIMAL

Package DECIMAL comprises several implementation-defined constants as well as some types and exceptions needed for IS applications. Note that values of MAX_DELTA greater than 1.0 correspond to COBOL's "assumed" scaling ("P" in picture strings), allowing the programmer to represent large quantities without reserving storage for trailing 0's. Similarly, values of MIN_DELTA less than 10.0**(-MAX_DIGITS) allow the representation of quantities with small magnitude without reserving space for leading fractional 0's.

The private type EXTERNAL_REP can be "extended" through child packages. An implementation providing an interface to a specific database system, for example, can define a child package with appropriate deferred constants.

T.I.1.3. Decimal Computation

The capacity requirements -- MAX_DIGITS at least 18, MIN_DELTA at most 10.0**(-18), and MAX_DELTA at least 1.0 -- are sufficient for IS applications (and satisfy in particular the demands of large financial systems).

The conversion T(expr) and the attribute T'ROUND(expr) provide the needed control over truncation versus rounding. We also considered defining a non-biased rounding function (that is, one that rounds a midpoint value towards the closest even) but decided against this in the interest of simplicity. An implementation is permitted to provide a non-biased rounding operation, for example through a generic.

GENERIC_DIVIDE is in response to the requirement for a division operation that delivers both a quotient and a remainder.
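
The difference between truncation and rounding, and the result of a divide-with-remainder operation, can be illustrated numerically. The following is a minimal sketch in Python rather than Ada, using the standard decimal module as a stand-in for a decimal type with a delta of 0.01. The function names are hypothetical and are not part of the Annex. The sketch assumes that T(expr) truncates toward zero, that T'ROUND rounds midpoints away from zero (the biased behavior implied by the contrast with non-biased rounding), and that the remainder is the dividend minus the truncated quotient times the divisor.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP, ROUND_HALF_EVEN


def truncate_to(value: Decimal, delta: str) -> Decimal:
    """Assumed model of T(expr): drop digits beyond delta (toward zero).

    delta is assumed to be a power of ten, such as "0.01".
    """
    return value.quantize(Decimal(delta), rounding=ROUND_DOWN)


def round_to(value: Decimal, delta: str) -> Decimal:
    """Assumed model of T'ROUND(expr): midpoints round away from zero."""
    return value.quantize(Decimal(delta), rounding=ROUND_HALF_UP)


def round_unbiased(value: Decimal, delta: str) -> Decimal:
    """The non-biased alternative: midpoints round to the even neighbor."""
    return value.quantize(Decimal(delta), rounding=ROUND_HALF_EVEN)


def divide_with_remainder(dividend: Decimal, divisor: Decimal,
                          quotient_delta: str) -> tuple[Decimal, Decimal]:
    """Assumed model of GENERIC_DIVIDE: truncated quotient plus remainder.

    The remainder is chosen so that
    dividend = quotient * divisor + remainder.
    """
    quotient = truncate_to(dividend / divisor, quotient_delta)
    remainder = dividend - quotient * divisor
    return quotient, remainder


if __name__ == "__main__":
    midpoint = Decimal("2.665")
    print(truncate_to(midpoint, "0.01"))     # 2.66
    print(round_to(midpoint, "0.01"))        # 2.67
    print(round_unbiased(midpoint, "0.01"))  # 2.66
    q, r = divide_with_remainder(Decimal("100.00"), Decimal("3"), "0.01")
    print(q, r)                              # 33.33 0.01
```

For the midpoint value 2.665, biased and non-biased rounding disagree (2.67 versus 2.66), which is the case the non-biased function was meant to address. Dividing 100.00 by 3 gives a quotient of 33.33 and a remainder of 0.01, so no amount is lost.
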
An alternative to the generic was also considered; namely, to define an attribute T'DIVIDE for the decimal type T serving as dividend, with this attribute defined as a procedure with parameters of arbitrary decimal type for the divisor and (as out parameters) for quotient and remainder. We rejected this in favor of the generic for several reasons. First, a type attribute used as a procedure would be a new facility in the language. Second, division with remainder is needed sometimes but not often. Obtaining it through explicit generic instantiation rather than having it available implicitly is acceptable style and is easier to implement.

T.I.1.4. Decimal Representations

INTERNAL REPRESENTATION

The programmer can either let the compiler choose an internal representation for a decimal type or provide guidance in the form of an attribute definition for T'MACHINE_RADIX. Note that even where a MACHINE_RADIX is specified, the compiler has some flexibility in choosing exactly how data will be represented internally. This is consistent with the Ada design philosophy, since, in contrast with external representations, the choice of internal representations is in the compiler's rather than the programmer's domain.

EXTERNAL REPRESENTATION

A difficult aspect of the design for decimal arithmetic is to select a method for modeling external data representations and conversions to/from internal computational format. Several approaches and variations were considered. The principal alternatives, with associated advantages and disadvantages:

1. Encapsulating external representations as types in a package (as done in V4.0), corresponding to COBOL "display" formats and other representations that need to be handled

   Advantages
      o Gives explicit support for these data formats
      o Establishes the interpretation of a data item (zoned leading separate, e.g.)
        as part of the item's declaration
      o Data validation function easily provided for each type

   Disadvantages
      o Complexity
      o Needs special optimizations of discriminant storage for efficiency (if types are private), or yields anomalies on assignment (if array types are used)
      o Needs generic instantiation to convert from external representation to computational type

2. Supplying representation clauses corresponding to the various external representations; the programmer derives a new type from a decimal type T and applies the relevant representation clause to the derived type, and can use type conversion to convert between internal and external formats

   Advantages
      o Easy for programmer, since conversion is performed automatically and without need for generic instantiation
      o Automatically obtains arithmetic for derived type with display representation, similar to COBOL

   Disadvantages
      o Difficult to implement; worse, it adds complexity to the machine-dependent part of the compiler and not just to the front end
      o Implementation needs to allow a type with such a representation clause to be passed as an actual to a generic that takes a decimal type
      o It provides functionality (arithmetic on external representations) that goes beyond what is required

3.
   Treating external data as arrays of bytes (or perhaps as arrays of 4-bit "nibbles" in the case of packed decimal); the programmer specifies the format as a parameter to conversion functions (TO_DECIMAL and TO_EXTERNAL, in the generic DECIMAL_IO package) that translate between external and internal representation, and performs data validity checking through a function VALID

   Advantages
      o Simplifies the IS Annex by avoiding the need for a complicated package
      o Readable programming style: VALID is a convenient way to obtain data validity checking

   Disadvantages
      o Allows potentially error-prone flexibility, since the same external data item (typically a record field) can be interpreted as having different formats
      o Generality of formats as run-time parameters needs special treatment for optimized code generation

The decision to adopt alternative (3) was based on several considerations. It is simpler to implement than (2), which is important since we want to encourage compilers to support a wide variety of external data formats. And in comparison with (1), alternative (3) is simpler to present and to use.

EDITED OUTPUT

Several design issues relate to the topic of edited output. One is the way in which the programmer obtains this capability for a decimal type: the principal alternatives are to have this occur automatically (through the implicit provision of an attribute function) or to require an explicit generic instantiation.

We decided on the generic approach for several reasons. First, the number of decimal types is not likely to be that large, even for sizable programs. Thus there will not be a large number of instantiations. Second, the edited output facilities are a direct generalization of the generic TEXT_IO.FIXED_IO. The similarities are so strong that the generic TEXT_IO.DECIMAL_IO package fits in as a natural extension to the existing TEXT_IO facilities. Indeed, an approach using attributes would seem to be a rather special-case technique.
Third, a generic package is much easier for implementors than the attribute mechanism. As was observed earlier, the choice to realize edited output through attributes was based on the need for a straightforward and lightweight notation.

Two versions of IMAGE are defined: one that converts from internal representation, and another that converts from external representation. In the latter case, the particular format is specified as a parameter. The reason for providing the version that converts from external format is that it is often necessary to produce edited output for input data that will not be used computationally. It would be clumsy and inefficient to have to convert first from external to internal format, then convert from internal to edited output format.

T.I.1.5. Locale-Specific Characteristics

To be discussed in a future version of the Rationale.

T.I.2. Character Handling

To be discussed in a future version of the Rationale.

T.I.3. String Handling

To be discussed in a future version of the Rationale.

T.I.4. Interface Facilities

To be discussed in a future version of the Rationale.

[Part Two: Rationale for the Choice of Models for Decimal Arithmetic]

To be discussed in a future version of the Rationale.

Note: The March 1992 version of the IS Annex Rationale, although containing Ada 9X examples that no longer reflect the state of the mapping, may be consulted for a discussion of the issues surrounding the choice of models for decimal arithmetic.

Table of Contents

T. Section-by-Section Rationale
   T.I Information Systems Annex - Rationale
      T.I.1. Decimal Arithmetic and Representation
         T.I.1.1. Decimal Types
         T.I.1.2. Package DECIMAL
         T.I.1.3. Decimal Computation
         T.I.1.4. Decimal Representations
         T.I.1.5. Locale-Specific Characteristics
      T.I.2. Character Handling
      T.I.3. String Handling
      T.I.4. Interface Facilities