IBM Software Group
Enterprise COBOL Education Using Rational Developer for System z
Performance Optimization
Jon Sayles, IBM Software Group, Rational EcoSystems Team
© 2006 IBM Corporation

IBM Trademarks and Copyrights
© Copyright IBM Corporation 2007, 2008, 2009. All rights reserved.
Course Contributing Authors
Thanks to the following individuals for assisting with this course:
- Tom Ross, IBM/Rational
- Dave Myers, IBM/Rational

Purpose of This Document
Course Name: COBOL Foundation Training - with RDz
Course Description: Learn the COBOL language and RDz, and learn z/OS terms, concepts and development skills in this course.
Pre-requisites: Some experience in a 3rd or 4th Generation Language is expected. SQL is also recommended.
Course Length: 10 days
Topics (Agenda):
- Getting Started - installing and configuring RDz and the course materials, and using Eclipse to edit COBOL
- COBOL General Language Rules
- Basic COBOL Statements
- Structured Programming Concepts and Coding Patterns
- Data - numeric and character - deep dive
- Records and table handling - deep dive
- Input/Output and Sequential File patterns
- Debugging Programs - Note: Deep dive on using RDz for common COBOL programming errors (001, 0C4, 0C7, infinite loops, fall-thru, etc.)
- COBOL Subprograms and the Linkage Section
- Advanced Character Manipulation
- COBOL Intrinsic Functions, Date and Time coding patterns, and Language Environment calls
- Reports and report writing patterns
- OS/390 Concepts and JCL (here's where the Sandbox/mainframe access starts)
- Compile/Link & Run Procs on the mainframe
- Indexed file Coding Patterns
- Sort/Merge and Master File Update Coding Patterns
- Accessing DB2 Data and Stored Procedures
- COBOL in the Real World:
  - CICS - lecture only
  - IMS (DL/I and TM) - ditto
  - Batch processing - ditto
  - Java calling COBOL
  - COBOL and XML Statements
  - SOA and COBOL - creating and calling Web Services
  - Web 2.0 using Rich UI

Course Details
Audience
This course is designed for application developers who have learned or programmed in a 3rd or 4th generation language, and who need to build leading-edge applications using COBOL and Rational Developer for System z.
Prerequisites
This course assumes that the student has a basic understanding and knowledge of software computing technologies, and general data processing terms, concepts and vocabulary. Knowledge of SQL (Structured Query Language) for database access is assumed as well. Basic PC and mouse-driven development skills, terms and concepts are also assumed. Note that we will be covering RDz's mainframe-editor-compliant function key idiom in this unit.

Source of COBOL Optimization Strategies
Virtually all of the information presented in this course on COBOL language and compiler optimization guidelines was drawn from this IBM publication: http://www-01.ibm.com/support/docview.wss?uid=swg27001475
The paper was authored by Rick Arellanes.

Course Objectives
At the end of this course, you will be able to:
- Discuss the general tuning options for COBOL applications
- Define the trade-offs and functional characteristics of COBOL compiler options that affect run-time performance
- Distinguish COBOL statements that are candidates for performance re-writes
- Re-code candidate COBOL statements into better performing code patterns

Course Compiler Directives
Units:
- Using an Optimal Programming Style
- Choosing Efficient Datatypes
- Handling Tables Efficiently
- Compiler Options that Affect Run-time Performance
- Run-time Options that Affect Run-time Performance
- Efficient COBOL Coding Techniques
- Tuning CICS, IMS, DB2 and VSAM Access
- Appendices

Unit Objectives
The coding style you use can affect how the optimizer handles your code. You can improve optimization by using structured programming techniques, factoring expressions, using symbolic constants, and grouping constant and duplicate computations.
- Using structured programming
- Factoring expressions
- Using symbolic constants
- Grouping constant computations
- Grouping duplicate computations

Tuning your program – 1 of 2
When a program is "well-structured", you can assess its performance.
A program that has a tangled control flow is difficult to understand and maintain. The tangled control flow also inhibits the optimization of the code. Therefore, before you try to improve the performance directly, you need to assess certain aspects of your program:
- Examine the underlying algorithms for your program. For top performance, a sound algorithm is essential. For example, a sophisticated algorithm for sorting a million items can be hundreds of thousands of times faster than a simple algorithm.
- Look at the data structures. They should be appropriate for the algorithm. When your program frequently accesses data, reduce the number of steps needed to access the data wherever possible.
- After you have improved the algorithms and data structures, look at other details of the COBOL source code that affect performance.
You can write programs that result in better generated code sequences and use system services better.

Tuning your program – 2 of 2
The following areas affect program performance:
- Coding techniques. These include using a programming style that helps the optimizer, choosing efficient data types, and handling tables efficiently.
- Optimization. You can optimize your code by using the OPTIMIZE compiler option.
- Compiler options and USE FOR DEBUGGING ON ALL PROCEDURES. Certain compiler options and language features affect the efficiency of your program.
- Runtime environment. Carefully consider your choice of runtime options and other runtime considerations that control how your compiled program runs.
- Running under CICS, IMS, or using VSAM. Various tips can help make these programs run efficiently.

Using structured programming
Using structured programming statements, such as EVALUATE and inline PERFORM, makes your program more comprehensible and generates a more linear control flow. As a result, the optimizer can operate over larger regions of the program, which gives you more efficient code. Use top-down programming constructs.
Design your code using the notion of functional decomposition: http://en.wikipedia.org/wiki/Functional_decomposition

Using structured programming – specific recommendations
Misconception #1 – Inline PERFORM is faster than out-of-line (standard) PERFORM:
- Out-of-line PERFORM statements are a natural means of realizing a top-down structured design, with top-down programming.
- Out-of-line PERFORM statements can often be as efficient as inline PERFORM statements, because the optimizer can simplify or remove the linkage code.
Misconception #2 – GO TO and branching is faster than PERFORM chaining:
Avoid using the following constructs:
- GO TO, other than to a paragraph exit that is PERFORM'd THRU
- Backward branches, except as needed for loops for which PERFORM is unsuitable
- PERFORM procedures that involve irregular control flow, such as preventing control from passing to the end of the procedure and returning to the end of the PERFORM chain
Definitely avoid using ALTER, as it:
- Makes maintenance and support a nightmare
- Removes some important optimization options from the compiler

Factoring Expressions
By factoring expressions in your programs, you can potentially eliminate a lot of unnecessary computation. For example, the first block of code below is more efficient than the second block of code, because the multiplication by DISCOUNT is done once after the loop rather than on every iteration:

    MOVE ZERO TO TOTAL
    PERFORM VARYING I FROM 1 BY 1 UNTIL I = 10
      COMPUTE TOTAL = TOTAL + ITEM(I)
    END-PERFORM
    COMPUTE TOTAL = TOTAL * DISCOUNT

    MOVE ZERO TO TOTAL
    PERFORM VARYING I FROM 1 BY 1 UNTIL I = 10
      COMPUTE TOTAL = TOTAL + ITEM(I) * DISCOUNT
    END-PERFORM

Using Symbolic Constants
To have the optimizer recognize a data item as a constant throughout the program, initialize it with a VALUE clause and do not change it anywhere in the program. If you pass a data item to a subprogram BY REFERENCE, the optimizer treats it as an external data item and assumes that it is changed at every subprogram call.
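One way to keep a constant from being exposed to change across a call is to hand the subprogram a copy of it. The following is a minimal sketch, not taken from the course materials: the program names CONSTDMO and APPLYDSC and all data names are hypothetical, and BY CONTENT is used here as an assumption about how to pass a copy in a COBOL-to-COBOL call (BY VALUE is more restricted in the data types it accepts).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CONSTDMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Symbolic constant: set by VALUE and never changed afterwards.
       01  WS-DISCOUNT        PIC S9V99     COMP-3 VALUE 0.90.
       01  WS-TOTAL           PIC S9(7)V99  COMP-3 VALUE 200.00.
       01  WS-TOTAL-OUT       PIC Z(6)9.99.
       PROCEDURE DIVISION.
      * BY CONTENT passes a copy, so APPLYDSC cannot alter
      * WS-DISCOUNT; WS-TOTAL is passed BY REFERENCE to be updated.
           CALL "APPLYDSC" USING BY CONTENT   WS-DISCOUNT
                                 BY REFERENCE WS-TOTAL
           MOVE WS-TOTAL TO WS-TOTAL-OUT
           DISPLAY "TOTAL AFTER DISCOUNT: " WS-TOTAL-OUT
           GOBACK.

      * Hypothetical nested subprogram that applies the discount.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. APPLYDSC.
       DATA DIVISION.
       LINKAGE SECTION.
       01  LS-DISCOUNT        PIC S9V99     COMP-3.
       01  LS-TOTAL           PIC S9(7)V99  COMP-3.
       PROCEDURE DIVISION USING LS-DISCOUNT LS-TOTAL.
           COMPUTE LS-TOTAL ROUNDED = LS-TOTAL * LS-DISCOUNT
           GOBACK.
       END PROGRAM APPLYDSC.
       END PROGRAM CONSTDMO.
```

Whether the optimizer then treats WS-DISCOUNT as a constant throughout CONSTDMO is best confirmed from the compiler listing for your own compiler release.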
CALL … BY REFERENCE is the COBOL language default.
Moral: If you know that you are passing immutable values from one program to the next in a CALL statement, pass them by value.
If you move a literal to a data item, the optimizer recognizes the data item as a constant only in a limited area of the program after the MOVE statement.

Unit Objectives – Summary
Having finished this section, you should now be able to explain:
- How the coding style you use can affect how the optimizer handles your code.
- How you can improve optimization by using structured programming techniques, factoring expressions, using symbolic constants, and grouping constant and duplicate computations.

Unit Objectives
Choosing the appropriate data type and PICTURE clause can produce more efficient code, as can avoiding USAGE DISPLAY and USAGE NATIONAL data items in areas that are heavily used for computations.
Consistent data types can reduce the need for conversions during operations on data items, making code significantly more efficient at run time.
You can also improve program performance by carefully determining when to use fixed-point and floating-point data types.

Choosing efficient computational data items – 1 of 2
When you use a data item mainly for arithmetic or as a subscript, code USAGE BINARY (COMP) on the data description entry for the item. The operations for manipulating binary data are faster than those for manipulating decimal data.
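As an illustration of that guideline, here is a minimal sketch of a counter and a subscript declared as USAGE BINARY. The program name BINDEMO and the data names are hypothetical; the items are signed and use eight or fewer digits, in line with the recommendations later in this unit.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BINDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Subscript and accumulator: signed BINARY, eight or fewer
      * digits, because both are used only for integer arithmetic.
       01  WS-SUB             PIC S9(4) USAGE BINARY.
       01  WS-COUNT           PIC S9(8) USAGE BINARY VALUE ZERO.
       01  WS-TABLE.
           05  WS-ENTRY       PIC S9(4) USAGE BINARY
                              OCCURS 10 TIMES.
       PROCEDURE DIVISION.
      * Fill the table with 1 through 10 and total the entries.
           PERFORM VARYING WS-SUB FROM 1 BY 1 UNTIL WS-SUB > 10
               MOVE WS-SUB TO WS-ENTRY (WS-SUB)
               ADD WS-ENTRY (WS-SUB) TO WS-COUNT
           END-PERFORM
           DISPLAY "SUM OF TABLE ENTRIES: " WS-COUNT
           GOBACK.
```

Declaring the same items as USAGE DISPLAY would still work, but the arithmetic would then operate on converted copies of the data.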
Note that this is especially important for integer math, and not as hard and fast a rule for decimal computations, although USAGE DISPLAY data is almost always less efficient for doing math in COBOL.
If a fixed-point arithmetic statement has intermediate results with a large precision (number of significant digits), the compiler uses decimal arithmetic anyway, after converting the operands to packed-decimal form. You will see this in some of the labs you complete.
For fixed-point arithmetic statements, the compiler normally uses binary arithmetic for simple computations with binary operands if the precision is eight or fewer digits. Above 18 digits, the compiler always uses decimal arithmetic. With a precision of nine to 18 digits, the compiler uses either form.
[Figure: COBOL math variable type "efficiency-o-meter", ordered from generally more efficient to generally less efficient: USAGE BINARY (COMP), USAGE PACKED-DECIMAL (COMP-3), USAGE DISPLAY]

Choosing efficient computational data items – 2 of 2
To produce the most efficient code for a BINARY data item, ensure that it has:
- A sign (an S in its PICTURE clause)
- Eight or fewer digits
For a data item that is larger than eight digits or is used with DISPLAY or NATIONAL data items, use PACKED-DECIMAL (COMP-3) data items. The code generated for PACKED-DECIMAL data items can be as fast as that for BINARY data items in some cases, especially if the statement is complicated or specifies rounding.
To produce the most efficient code for a PACKED-DECIMAL data item, ensure that it has:
- A sign (an S in its PICTURE clause)
- An odd number of digits (9s in the PICTURE clause), so that it occupies an exact number of bytes without a half byte left over (when business requirements permit)
- 15 or fewer digits in the PICTURE specification, to avoid using library routines for multiplication and division

Using consistent data types
In operations on operands of different types, one of the operands must be converted to the same type as the other.
Each conversion requires several instructions. For example, one of the operands might need to be scaled to give it the appropriate number of decimal places.
You can largely avoid conversions by using consistent data types and by giving both operands the same usage and also appropriate PICTURE specifications. That is, you should ensure that two numbers to be compared, added, or subtracted not only have the same usage but also the same number of decimal places (9s after the V in the PICTURE clause).
This means that, for COBOL work areas in the DATA DIVISION (not file and database buffers), you should ensure consistent PIC clauses.
[Figure: before-and-after PIC clause example ("Change this" / "to this") not recoverable from the slide text]

Making floating point arithmetic expressions efficient
Computation of arithmetic expressions that are evaluated in floating point is most efficient when the operands need little or no conversion. Use operands that are COMP-1 or COMP-2 to produce the most efficient code.
Declare integer items as BINARY or PACKED-DECIMAL with nine or fewer digits to afford quick conversion to floating-point data. Also, conversion from a COMP-1 or COMP-2 item to a fixed-point integer with nine or fewer digits, without SIZE ERROR in effect, is efficient when the value of the COMP-1 or COMP-2 item is less than 1,000,000,000.

Making large exponentiations efficient
Use floating point for exponentiations with large exponents to achieve faster evaluation and more accurate results. For example, the first statement below is computed more quickly and accurately than the second statement:

    COMPUTE fixed-point1 = fixed-point2 ** 100000.E+00
    COMPUTE fixed-point1 = fixed-point2 ** 100000

A floating-point exponent causes floating-point arithmetic to be used to compute the exponentiation.
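The first form can be placed in a small, self-contained program to try it out. This is only a sketch: the program name EXPDEMO, the data names, and the base value 1.00001 are assumptions chosen so that the result fits the receiving field; the exponent literal is the one shown on the slide.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EXPDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical fixed-point base and result fields.
       01  WS-BASE            PIC S9V9(8)    COMP-3 VALUE 1.00001.
       01  WS-RESULT          PIC S9(3)V9(8) COMP-3.
       01  WS-RESULT-OUT      PIC ZZ9.9(8).
       PROCEDURE DIVISION.
      * The floating-point exponent literal causes the
      * exponentiation to be evaluated in floating point.
           COMPUTE WS-RESULT = WS-BASE ** 100000.E+00
           MOVE WS-RESULT TO WS-RESULT-OUT
           DISPLAY "RESULT: " WS-RESULT-OUT
           GOBACK.
```

Replacing the exponent with the integer literal 100000 gives the second form from the slide, which makes it easy to compare the two on your own system.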
Unit Objectives - Summary
Having finished this section, you should now be able to:
- Choose the appropriate data type and PICTURE clause
- Discuss the benefits of consistent data-type declarations
- Discuss the benefits of different floating point operations

Table Elements – 1 of 4
A table element has a collective name, but the individual items within it do not have unique data-names. To refer to an item, you have a choice of three techniques:
1. Use the data-name of the table element, along with its occurrence number (called a subscript) in parentheses. This technique is called subscripting.
2. Use the data-name of the table element, along with a value (called an index) that is added to the address of the table to locate an item (as a displacement from the beginning of the table). This technique is called indexing, or subscripting using index-names.
3. Use both subscripts and indexes together.

Table Handling – 2 of 4
You can use several techniques to improve the efficiency of table-handling operations, and to influence the optimizer. The return for your efforts can be significant, particularly when table-handling operations are a major part of an application.
The following two guidelines affect your choice of how to refer to table elements:
1. Use indexing rather than subscripting. Although the compiler can eliminate duplicate indexes and subscripts, the original reference to a table element is more efficient with indexes (even if the subscripts were BINARY). The value of an index has the element size factored into it, whereas the value of a subscript must be multiplied by the element size when the subscript is used.
The index already contains the displacement from the start of the table, and this value does not have to be calculated at run time. However, subscripting might be easier to understand and maintain. Note that this is especially important during development and testing.
2. Use relative indexing. Relative index references (that is, references in which an unsigned numeric literal is added to or subtracted from the index-name) are executed at least as fast as direct index references, and sometimes faster. There is no merit in keeping alternative indexes with the offset factored in.

Handling Tables – 3 of 4
Whether you use indexes or subscripts, the following coding guidelines can help you get better performance:
- Put constant and duplicate indexes or subscripts on the left. You can reduce or eliminate runtime computations this way. Even when all the indexes or subscripts are variable, try to use your tables so that the rightmost subscript varies most often for references that occur close to each other in the program. This practice also improves the pattern of storage references and paging. If all the indexes or subscripts are duplicates, then the entire index or subscript computation is a common sub-expression.
- Specify the element length so that it matches that of related tables. When you index or subscript tables, it is most efficient if all the tables have the same element length. That way, the stride for the last dimension of the tables is the same, and the optimizer can reuse the rightmost index or subscript computed for one table. If both the element lengths and the number of occurrences in each dimension are equal, then the strides for dimensions other than the last are also equal, resulting in greater commonality between their subscript computations. The optimizer can then reuse indexes or subscripts other than the rightmost.
- Avoid errors in references by coding index and subscript checks into your program.
If you need to validate indexes and subscripts, it might be faster to code your own checks than to use the SSRANGE compiler option.

Table Handling – General Performance Tips – 4 of 4
You can also improve the efficiency of tables by using these guidelines:
- Use binary data items for all subscripts. When you use subscripts to address a table, use a BINARY signed data item with eight or fewer digits. In some cases, using four or fewer digits for the data item might also improve processing time.
- Use binary data items for variable-length table items. For tables with variable-length items, you can improve the code for OCCURS DEPENDING ON (ODO). To avoid unnecessary conversions each time the variable-length items are referenced, specify BINARY for OCCURS . . . DEPENDING ON objects.
- Use fixed-length data items whenever possible. Copying variable-length data items into a fixed-length data item before a period of high-frequency use can reduce some of the overhead associated with using variable-length data items.
- Organize tables according to the type of search method used. If the table is searched sequentially, put the data values most likely to satisfy the search criteria at the beginning of the table. If the table is searched using a binary search algorithm, put the data values in the table sorted alphabetically on the search key field.

Subscripting – 1 of 2
The lowest possible subscript value is 1, which references the first occurrence of a table element. In a one-dimensional table, the subscript corresponds to the row number.
You can use a literal or a data-name as a subscript. If a data item that has a literal subscript is of fixed length, the compiler resolves the location of the data item.
When you use a data-name as a variable subscript, you must describe the data-name as an elementary numeric integer. The most efficient format is COMPUTATIONAL (COMP) with a PICTURE size that is smaller than five digits.
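A variable subscript in that format, together with a hand-coded range check of the kind suggested earlier as an alternative to SSRANGE, might look like the following sketch. The program name SUBCHECK, the data names, and the table size of five entries are all hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SUBCHECK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Subscript described as a signed COMP integer with fewer
      * than five digits.
       01  WS-MAX-ENTRIES     PIC S9(4) COMP VALUE 5.
       01  WS-SUB             PIC S9(4) COMP.
       01  WS-RATE-TABLE.
           05  WS-RATE        PIC S9(3)V99 COMP-3
                              OCCURS 5 TIMES.
       PROCEDURE DIVISION.
           INITIALIZE WS-RATE-TABLE
      * 7 is deliberately outside the table bounds, so the
      * hand-coded check takes the ELSE branch.
           MOVE 7 TO WS-SUB
           IF WS-SUB >= 1 AND WS-SUB <= WS-MAX-ENTRIES
               MOVE 1.25 TO WS-RATE (WS-SUB)
           ELSE
               DISPLAY "SUBSCRIPT OUT OF RANGE: " WS-SUB
           END-IF
           GOBACK.
```

The check costs two comparisons at this one reference, whereas a compiler-generated range check applies to table references throughout the program.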
You cannot use a subscript with a data-name that is used as a subscript. The code generated for the application resolves the location of a variable subscript at run time.
You can increment or decrement a literal or variable subscript by a specified integer amount. For example:

    TABLE-COLUMN (SUB1 - 1, SUB2 + 3)

Subscripting – 2 of 2
You can change part of a table element rather than the whole element. To do so, refer to the character position and length of the substring to be changed. For example:

    01 ANY-TABLE.
       05 TABLE-ELEMENT PIC X(10)
          OCCURS 3 TIMES VALUE "ABCDEFGHIJ".
    . . .
    MOVE "??" TO TABLE-ELEMENT (1) (3 : 2).

The MOVE statement in the example above moves the string '??' into table element number 1, beginning at character position 3, for a length of 2 characters.

Workshop – Proving out the theory
In the last three sections you have ingested quite a number of "Best Practices" for COBOL optimization. Wouldn't it be nice to actually see whether or not making certain changes to a program improved performance? Our bet… yes.
On the next few slides are steps for you to run a program that contains use cases as isolated variables in just such an experiment.
You will run this program on:
- Workstation RDz
- A z/OS mainframe
After completing the job you should:
- Verify each of the use cases, tying the coding example to one (or more) of the best practices in this section
- Confirm or deny that the best practice is just that

Workshop – Steps
- Create a new program in a Local COBOL Project, named: cobperf.cbl
- From the slide notes, copy/paste the code into cobperf.cbl
- Edit the program and, as described in the comments, lower the multiplier and multiplier-2 factors by removing at least one zero
- In your project:
  - Nominate cobperf.cbl as the Entry Point
  - Rebuild your project
  - Clean up any syntax errors
- Create a new Run Configuration for cobperf.exe
- Run the program
- Note the results in the Display (DOS box)
- Verify results by returning to the section in this course where the particular coding construct was described

Workshop – z/OS Results
If time permits, use the JCL in the Notes section to:
- Compile under z/OS
- Run under z/OS
When complete, check out the SYSOUT displays:
- Compare/contrast with the Workstation run
- Compare with the Best Practices
Note especially that:
- Binary fields and decimal math do not always get along
- Efficient table-handling depends on a large number of variables

Unit Objectives
At the end of this course, you will be able to:
- Discuss the general tuning options for COBOL applications
- Define the trade-offs and functional characteristics of COBOL compiler options that affect run-time performance

COBOL Application Tuning - Overview
There are many opportunities for the COBOL programmer to tune the COBOL application program and run-time environment for better CPU time performance and better use of system resources.
The COBOL programmer has many compiler options, run-time options, data types, and language features from which to select. The proper choice may lead to significantly better performance. Conversely, making the wrong choice can lead to significantly degraded performance.
The goal of this course is to make you aware of the various options that are available so that you (both the system programmer installing the product and the COBOL programmer responsible for the application) can choose the right ones for your application program, leading to the best performance for your environment.

Tuning the Run-Time - Overview
This section focuses on some of the options that are available for tuning an application, as well as the overall LE run-time environment. This in itself may not produce high performing code, since both the coding style and the data types can have a significant impact on the performance of the application.
In fact, the coding style and data types usually have a far greater impact on the performance of an application than tuning the application via external means such as:
- Compiler options
- Run-time options
- Space management tuning
- Placing the library routines in shared storage
- Adding buffers via DCB=BUFNO= or AMP= JCL statements to both input and output files
- Running in a high priority batch initiator
- Verifying that your files are not on high-traffic disk channels or packs
- Not writing data if you don't have to: the fastest I/O is the one you don't do
- Etc.
Compiler Directives – Overview and List
Here are the compiler options we'll be focusing on:
- ARITH - EXTEND or COMPAT
- AWO or NOAWO
- DATA(24) or DATA(31)
- DYNAM or NODYNAM
- FASTSRT or NOFASTSRT
- NUMPROC - NOPFD, MIG, or PFD
- OPTIMIZE(STD), OPTIMIZE(FULL), or NOOPTIMIZE
- RENT or NORENT
- RMODE - AUTO, 24, or ANY
- SSRANGE or NOSSRANGE
- TEST or NOTEST
- THREAD or NOTHREAD
- TRUNC - BIN, STD, or OPT
You will note that the use of the above is not one-dimensional, in that you will have to take many factors into account.

ARITH - EXTEND or COMPAT
The ARITH compiler option allows you to control the maximum number of digits allowed for decimal numbers (packed decimal, zoned decimal, and numeric-edited data items and numeric literals).
Options:
- With ARITH(EXTEND), the maximum number of digits is 31.
- With ARITH(COMPAT), the maximum number of digits is 18.
Issue: ARITH(EXTEND) will cause some degradation in performance for all decimal data types due to larger intermediate results. The amount of degradation that you experience depends directly on the amount of decimal data that you use.
Performance considerations using ARITH: On the average, ARITH(EXTEND) was 1% slower than ARITH(COMPAT), with a range of equivalent to 38% slower.
Default: ARITH(COMPAT)

AWO or NOAWO
The AWO compiler option causes the APPLY WRITE-ONLY clause to be in effect for all physical sequential, variable-length, blocked files, even if the APPLY WRITE-ONLY clause is not specified in the program.
Options:
- With APPLY WRITE-ONLY (AWO) in effect, the file buffer is written to the output device when there is not enough space in the buffer for the next record.
- Without APPLY WRITE-ONLY (NOAWO), the file buffer is written to the output device when there is not enough space in the buffer for the maximum size record.
If the application has a large variation in the size of the records to be written, using APPLY WRITE-ONLY can result in a performance savings since this will generally result in fewer calls to Data Management Services to handle the I/Os.
Performance considerations using AWO: One program using variable-length files and AWO was 88% faster than NOAWO. This faster processing was the result of using 98% fewer EXCPs to process the writes.
Default: NOAWO

Storage and its addressability
When you run COBOL programs, the programs and the data that they use reside in virtual storage. Storage that you use with COBOL can be either below the 16-MB line or above the 16-MB line but below the 2-GB bar. Two modes of addressing are available to address this storage: 24-bit and 31-bit.
- You can address storage below (but not above) the 16-MB line with 24-bit addressing.
- You can address storage either above or below the 16-MB line with 31-bit addressing.
- Unrestricted storage is addressable by 31-bit addressing and therefore encompasses all the storage available to your program, both above and below the 16-MB line.
Enterprise COBOL does not directly exploit the 64-bit virtual addressing capability of z/OS; however, COBOL applications running in 31-bit or 24-bit addressing mode are fully supported on 64-bit z/OS systems.
Addressing mode (AMODE) is the attribute that tells which hardware addressing mode is supported by your program: 24-bit addressing, 31-bit addressing, or either 24-bit or 31-bit addressing. This attribute is AMODE 24, AMODE 31, or AMODE ANY, respectively. The object program, the load module, and the executing program each has an AMODE attribute. All Enterprise COBOL object programs are AMODE ANY.
Residency mode (RMODE) is the attribute of a program load module that identifies where in virtual storage the program will reside: below the 16-MB line, or either below or above. This attribute is RMODE 24 or RMODE ANY.
DATA(24) or DATA(31)
Using DATA(31) with your RENT program will help to relieve some below-the-line virtual storage constraint problems.
- When you use DATA(31) with your RENT programs, most QSAM file buffers can be allocated above the 16 MB line.
- When you use DATA(31) with the run-time option HEAP(,,ANYWHERE), all non-EXTERNAL WORKING-STORAGE and non-EXTERNAL FD record areas can be allocated above the 16 MB line.
- With DATA(24), the WORKING-STORAGE and FD record areas will be allocated below the 16 MB line.
Notes:
1. For NORENT programs, the RMODE option determines where non-EXTERNAL data is allocated.
2. See “QSAM Files” on page 27 for additional information on QSAM file buffers.
3. DATA(24) can be combined with the run-time option ALL31.

DYNAM or NODYNAM
The DYNAM compiler option specifies that all subprograms invoked through the CALL literal statement will be loaded dynamically at run time. This allows you to share common subprograms among several different applications, allowing for easier maintenance of these subprograms since the application will not have to be re-link-edited if the subprogram is changed.
DYNAM also allows you to control the use of virtual storage by giving you the ability to use a CANCEL statement to free the virtual storage used by a subprogram when the subprogram is no longer needed.
However, when using the DYNAM option, you pay a performance penalty since the call must go through a library routine, whereas with the NODYNAM option, the call goes directly to the subprogram. Hence, the path length is longer with DYNAM than with NODYNAM.
Performance considerations using DYNAM with CALL literal (measuring CALL overhead only): On the average, for a CALL intensive application, the overhead associated with the CALL using DYNAM ranged from 16% slower to 100% slower than NODYNAM.

FASTSRT or NOFASTSRT
For eligible sorts, the FASTSRT compiler option specifies that the SORT product will handle all of the I/O and that COBOL does not need to do it.
This eliminates all of the overhead of returning control to COBOL after each record is read in or after processing each record that COBOL returns to sort.
The use of FASTSRT is recommended when direct access devices are used for the sort work files, since the compiler will then determine which sorts are eligible for this option and generate the proper code. If the sort is not eligible for this option, the compiler will still generate the same code as if the NOFASTSRT option were in effect. A list of requirements for using the FASTSRT option is in the COBOL programming guide.
Performance considerations using FASTSRT: One program that processed 100,000 records was 35% faster when using FASTSRT compared to using NOFASTSRT.

NUMPROC - NOPFD, MIG, or PFD – 1 of 2
Using the NUMPROC(PFD) compiler option generates significantly more efficient code for numeric comparisons. It also avoids the generation of extra code that NUMPROC(NOPFD) or NUMPROC(MIG) generates for most references to COMP-3 and DISPLAY numeric data items to ensure a correct sign is being used.
- With NUMPROC(NOPFD), sign fix-up processing is done for all references to these numeric data items.
- With NUMPROC(MIG), sign fix-up processing is done only for receiving fields (and not for sending fields) of arithmetic and MOVE statements.
- With NUMPROC(PFD), the compiler assumes that the data has the correct sign and bypasses this sign fix-up processing.

NUMPROC - NOPFD, MIG, or PFD – 2 of 2
NUMPROC(MIG) generates code that is similar to that of OS/VS COBOL. Using NUMPROC(NOPFD) or NUMPROC(MIG) may also inhibit some other types of optimization.
However, not all external data files contain the proper sign for COMP-3 or DISPLAY signed numeric data, and hence, using NUMPROC(PFD) may not be applicable for all application programs. For performance sensitive applications, NUMPROC(PFD) is recommended when possible.
Performance considerations using NUMPROC:
- On the average, NUMPROC(PFD) was 1% faster than NUMPROC(NOPFD), with a range of 20% faster to equivalent.
- On the average, NUMPROC(PFD) was 1% faster than NUMPROC(MIG), with a range of 9% faster to equivalent.
- On the average, NUMPROC(MIG) was equivalent to NUMPROC(NOPFD), with a range of 13% faster to equivalent.

47 OPTIMIZE(STD), OPTIMIZE(FULL), or NOOPTIMIZE – 1 of 3
To assist in the optimization of the code, you should use the OPTIMIZE compiler option. With the OPTIMIZE(STD) or OPTIMIZE(FULL) options in effect, you may receive optimizations that include:
- eliminating unnecessary branches
- simplifying inefficient branches
- simplifying the code for the out-of-line PERFORM statement, moving the performed paragraphs in-line, where possible
- simplifying the code for a CALL to a contained (nested) program, moving the called statements in-line, where possible
- eliminating duplicate computations
- eliminating constant computations
- aggregating moves of contiguous, equal-sized items into a single move
- deleting unreachable code

48 OPTIMIZE(STD), OPTIMIZE(FULL), or NOOPTIMIZE – 2 of 3
Additionally, with the OPTIMIZE(FULL) option in effect, you may also receive these optimizations:
- deleting unreferenced data items and the associated code to initialize their VALUE clauses

Many of these optimizations are not available with OS/VS COBOL, but are available with IBM Enterprise COBOL.

Application Development Considerations: NOOPTIMIZE is generally used while a program is being developed, when frequent compiles are necessary. NOOPTIMIZE also makes it easier to debug a program since code is not moved; NOOPTIMIZE is required when using the TEST compiler option with a value other than TEST(NONE). OPTIMIZE requires more CPU time for compiles than NOOPTIMIZE, but generally produces more efficient run-time code. For production runs, OPTIMIZE is recommended.
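One way to apply the production recommendation above is to code the options on a CBL (PROCESS) statement ahead of the IDENTIFICATION DIVISION. This is only a sketch: the program name PRODPGM is hypothetical, and many installations set these options in the compile JCL or as installation defaults instead.

```cobol
       CBL OPTIMIZE(STD),NOTEST
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PRODPGM.
       PROCEDURE DIVISION.
           DISPLAY 'COMPILED WITH PRODUCTION OPTIONS'
           GOBACK.
```

During development, the same statement could instead read CBL NOOPTIMIZE, so that compiles are cheaper and code is not moved while debugging.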
49 OPTIMIZE(STD), OPTIMIZE(FULL), or NOOPTIMIZE – 3 of 3
WARNING: If your program relies upon unreferenced level 01 or level 77 data items, you should not use OPTIMIZE(FULL), since OPTIMIZE(FULL) will delete all unreferenced data items. One way to prevent a data item from being deleted by the OPTIMIZE(FULL) option is to refer to the data item in the PROCEDURE DIVISION (for example, initialize the data item with a PROCEDURE DIVISION statement instead of with a VALUE clause).

Performance considerations using OPTIMIZE:
- On the average, OPTIMIZE(STD) was 1% faster than NOOPTIMIZE, with a range of 12% faster to equivalent.
- On the average, OPTIMIZE(FULL) was equivalent to OPTIMIZE(STD).
- One RENT program calling a RENT subprogram with 500 unreferenced data items with VALUE clauses was 9% faster with OPTIMIZE(FULL) or OPTIMIZE(STD) compared to NOOPTIMIZE.
- The same RENT program calling a RENT subprogram with 500 unreferenced data items with VALUE clauses, using the IS INITIAL clause on the PROGRAM-ID statement, was 90% faster with OPTIMIZE(FULL) compared to OPTIMIZE(STD).

50 RENT or NORENT
Using the RENT compiler option causes the compiler to generate some additional code to ensure that the program is reentrant. Reentrant programs can be placed in shared storage like the Link Pack Area (LPA) or the Extended Link Pack Area (ELPA). Also, the RENT option will allow the program to run above the 16 MB line. Producing reentrant code may increase the execution time path length slightly.
Note: The RMODE(ANY) option can be used to run NORENT programs above the 16 MB line.
Performance considerations using RENT: On the average, RENT was equivalent to NORENT.

51 RMODE - AUTO, 24, or ANY
The RMODE compiler option determines the RMODE setting for the COBOL program. When using RMODE(AUTO), the RMODE setting depends on the use of RENT or NORENT. For RENT, the program will have RMODE ANY. For NORENT, the program will have RMODE 24. When using RMODE(24), the program will always have RMODE 24.
When using RMODE(ANY), the program will always have RMODE ANY.
Note: When using NORENT, the RMODE option controls where the WORKING-STORAGE will reside. With RMODE(24), the WORKING-STORAGE will be below the 16 MB line. With RMODE(ANY), the WORKING-STORAGE can be above the 16 MB line.

52 SSRANGE or NOSSRANGE
Using SSRANGE generates additional code to verify that all subscripts, indexes, and reference modification expressions do not refer to data beyond the bounds of the subject data item. This in-line code occurs at every reference to a subscripted or variable-length data item, as well as at every reference modification expression, and it can result in some degradation at run time. In general, if you need to verify the subscripts only a few times in the application instead of at every reference, coding your own checks may be faster than using the SSRANGE option. For performance sensitive applications, NOSSRANGE is recommended (see the prior discussion of this in the coding section of this course).

Performance considerations using SSRANGE:
- On the average, SSRANGE with the run-time option CHECK(ON) was 1% slower than NOSSRANGE, with a range of equivalent to 27% slower.
- On the average, SSRANGE with the run-time option CHECK(OFF) was 1% slower than NOSSRANGE, with a range of equivalent to 9% slower.
- On the average, SSRANGE with the run-time option CHECK(ON) was 1% slower than SSRANGE with the run-time option CHECK(OFF), with a range of equivalent to 16% slower.

53 TEST or NOTEST – 1 of 3
The TEST compiler option produces object code that enables Debug Tool to perform batch and interactive debugging. The amount of debugging support available depends on which TEST suboptions you use. The TEST option also allows you to request that symbolic variables be included in the formatted dump produced by Language Environment. When using the SYM suboption of the TEST option, you can control where the symbolic information will be kept.
If you use TEST(,SYM,NOSEPARATE), the symbolic information will be part of the object module, which could result in a much larger object module. If you use TEST(,SYM,SEPARATE), the symbolic information will be placed in a separate file and will be loaded only as needed. Note: If you used the FDUMP option with VS COBOL II, TEST(NONE,SYM) is the equivalent option with IBM Enterprise COBOL. 54 TEST or NOTEST – 2 of 3 Using TEST with a value other than NONE can cause a significant performance degradation when used in a production environment since this additional code occurs at each COBOL statement. Hence, the TEST option with a value other than NONE should be used only when debugging an application. Additionally, when TEST is used with a value other than NONE, the OPTIMIZE option is disabled. For production runs, NOTEST or TEST(NONE) is recommended. Notes: With the latest levels of Debug Tool, you can step through your program even if there are no compiled-in hooks, by using the overlay hooks function of Debug Tool. However, you must compile with the NOOPTIMIZE and TEST(NONE) options to use this feature. You should also use the SYM and SEPARATE sub-options of TEST to get the symbolic debug information without substantially increasing the size of your load modules. Additionally, when using TEST(NONE,SYM) with a large data division and an abend occurs producing a CEEDUMP, a significant amount of CPU time may be required to produce the CEEDUMP, depending on the size of the data division. 55 TEST or NOTEST – 3 of 3 Performance considerations using TEST: On the average, TEST(ALL,SYM) was 20% slower than NOTEST, with a range of equivalent to 200% slower when not producing a CEEDUMP. On the average, TEST(NONE,SYM) was equivalent to NOTEST when not producing a CEEDUMP. On the average, TEST(NONE,SYM,NOSEPARATE) resulted in a 236% increase in the object module size compared to using NOTEST or TEST(NONE,SYM,SEPARATE), with a range of 9% larger to 670% larger. 
On the average, TEST(NONE,SYM,SEPARATE) resulted in an increase of approximately 200 bytes in the object module size compared to using NOTEST One program with a large data division (about 1 million items) using TEST(NONE,SYM) took 200 times more CPU time to produce a CEEDUMP with COBOL's formatted variables compared to using NOTEST to produce a CEEDUMP without COBOL's formatted variables. 56 THREAD or NOTHREAD The THREAD compiler option enables the COBOL program for execution in a Language Environment enclave with multiple POSIX threads or PL/I tasks. A program compiled with the THREAD compiler option can also run in a non-threaded environment, however there will be some degradation in the initialization and termination of the COBOL program due to the overhead of serialization logic that is done for these programs. The THREAD compiler option also requires the use of the IS RECURSIVE clause on the PROGRAM-ID statement. Performance considerations using THREAD (measuring CALL overhead only): One testcase (Assembler calling COBOL) using THREAD was 35% slower than using NOTHREAD. One testcase (COBOL statically calling COBOL) using THREAD was 30% slower than using NOTHREAD. One testcase (COBOL dynamically calling COBOL) using THREAD was 30% slower than using NOTHREAD. Note: The IS RECURSIVE clause was used in both the THREAD and NOTHREAD cases. 57 TRUNC - BIN, STD, or OPT – 1 of 3 When using the TRUNC(BIN) compiler option, all binary (COMP) sending fields are treated as either half-word, full-word, or double-word values, depending on the PICTURE clause, and code is generated to truncate all binary receiving fields to the corresponding half-word, full-word, or double-word boundary (base 2 truncation). The full content of the field is significant. This can add a significant amount of degradation since typically some data conversions must be done, which may require the use of some library subroutines. 
Considerations: BIN is usually the slowest of the three sub options for TRUNC. When using the TRUNC(STD) compiler option, the final intermediate result of an arithmetic expression, or the sending field in the MOVE statement, is truncated to the number of digits in the PICTURE clause of the binary (COMP) receiving field (base 10 truncation). This can add a significant amount of degradation since typically the number is divided by some power of ten (depending on the number of digits in the PICTURE clause) and the remainder is used; a divide instruction is one of the more expensive instructions. TRUNC(STD) behaves in a similar way as TRUNC in OS/VS COBOL. 58 TRUNC - BIN, STD, or OPT – 2 of 3 However, with TRUNC(OPT), the compiler assumes that the data conforms to the PICTURE and USAGE specifications and manipulates the result based on the size of the field in storage (half-word, full-word or double-word). Although TRUNC(OPT) most closely resembles the behavior of NOTRUNC in OS/VS COBOL and is recommended for compatibility with NOTRUNC, there are some cases where the result will be different. Please consult the COBOL Migration Guide and Programming Guide for additional details. TRUNC(STD) conforms to the ANSI and SAA standards, whereas TRUNC(BIN) and TRUNC(OPT) do not. TRUNC(OPT) is provided as a performance tuning option and should be used only when the data in the application program conforms to the PICTURE and USAGE specifications. For performance sensitive applications, the use of TRUNC(OPT) is recommended when possible. 59 TRUNC - BIN, STD, or OPT – 3 of 3 Performance considerations using TRUNC: On the average, TRUNC(OPT) was 24% faster than TRUNC(BIN), with a range of 88% faster to equivalent. On the average, TRUNC(STD) was 15% faster than TRUNC(BIN), with a range of 78% faster to equivalent. On the average, TRUNC(OPT) was 6% faster than TRUNC(STD), with a range of 65% faster to equivalent. 
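The difference between base 10 and base 2 truncation described above can be seen with a value that fits the storage of a binary field but not its PICTURE. The sketch below is illustrative only; the program name TRUNCDEM and the data names are hypothetical. Going by the definitions above, TRUNC(STD) should leave 2345 in WS-SMALL (truncation to four decimal digits), TRUNC(BIN) should keep 12345 because the value fits in a half-word, and with TRUNC(OPT) the result should not be relied on because the data does not conform to the PICTURE.

```cobol
       CBL TRUNC(STD)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRUNCDEM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Sending field: a full-word holding a five-digit value.
       01  WS-BIG              PIC S9(8) COMP VALUE 12345.
      * Receiving field: four digits in the PICTURE, stored
      * in a half-word that could physically hold 12345.
       01  WS-SMALL            PIC S9(4) COMP.
       PROCEDURE DIVISION.
           MOVE WS-BIG TO WS-SMALL
           DISPLAY 'WS-SMALL=' WS-SMALL
           GOBACK.
```

Recompiling the same source with CBL TRUNC(BIN) in place of the first line is one way to compare the two behaviors on your own system.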
Note: On the average, TRUNC(BIN) with COBOL for OS/390 & VM Version 2 Release 2 and later is 15% faster than TRUNC(BIN) with COBOL for OS/390 & VM Version 2 Release 1 and prior, with a range of equivalent to 95% faster.
- BEFORE: On the average, the 2.1.0 compiler with TRUNC(BIN) was 4.8 times slower than TRUNC(OPT), with a maximum of 30 times slower.
- AFTER: On the average, the 2.2.0 compiler with TRUNC(BIN) was 1.8 times slower than TRUNC(OPT), with a maximum of 13 times slower.

60 Summary
Having finished this section, you should now be able to select the proper compiler options, which is another factor that affects the performance of a COBOL application.

61 Checkpoint
Under construction

62 Course Compiler Directives
Units:
- Compiler Options that Affect Run-time Performance
- Run-time Options that Affect Run-time Performance
- Efficient COBOL Coding Techniques
- Appendices

63 Overview
Selecting the proper run-time options is another factor that affects the performance of a COBOL application. Therefore, it is important for the system programmer responsible for installing and setting up the LE environment to work with the application programmers so that the proper run-time options are set up correctly for your installation. Let's look at some of the options that can help to improve the performance of the individual application, as well as the overall LE run-time environment.

64 AIXBLD
The AIXBLD option allows alternate indexes to be built at run time. However, this may adversely affect the run-time performance of the application. It is much more efficient to use Access Method Services to build the alternate indexes before running the COBOL application, and then to run with the NOAIXBLD run-time option. Note that AIXBLD is not supported when VSAM datasets are accessed in RLS mode.
Performance considerations using AIXBLD: One VSAM program was 8% slower when using AIXBLD compared to using NOAIXBLD.
65 ALL31 – 1 of 2
The ALL31 option allows LE to take advantage of knowing that there are no AMODE(24) routines in the application. ALL31(ON) specifies that the entire application will run in AMODE(31). This can help to improve the performance for an all AMODE(31) application because LE can minimize the amount of mode switching across calls to common run-time library routines. Additionally, using ALL31(ON) will help to relieve some below the line virtual storage constraint problems, since less below the line storage is used.
With ALL31(ON), all EXTERNAL WORKING-STORAGE and EXTERNAL FD record areas can be allocated above the 16MB line if you also use the HEAP(,,ANYWHERE) run-time option and compile the program with either the DATA(31) and RENT compiler options or with the RMODE(ANY) and NORENT compiler options.
ALL31(OFF) is required for all OS/VS COBOL programs that are not running under CICS, all VS COBOL II NORES programs, and all other AMODE(24) programs.

66 ALL31 – 2 of 2
Note that when using ALL31(OFF), you must also use STACK(,,BELOW).
Note: Beginning with LE for z/OS Release 1.2, the run-time defaults have changed to ALL31(ON),STACK(,,ANY). LE for OS/390 Release 2.10 and earlier run-time defaults were ALL31(OFF),STACK(,,BELOW).
Performance considerations using ALL31 (measuring CALL overhead only):
- On the average, ALL31(ON) was equivalent to ALL31(OFF).
- One program with many library routine calls was 10% faster when using ALL31(ON).

67 CHECK
The CHECK option activates the additional code generated by the SSRANGE compiler option, which requires more CPU time resources for the verification of the subscripts, indexes, and reference modification expressions. Using the CHECK(OFF) run-time option deactivates this code but still requires some additional CPU time resources at every use of a subscript, index, or reference modification expression to determine that this check is not desired during the particular run of the program.
This option has an effect only on a program that has been compiled with the SSRANGE compiler option.
Performance considerations using CHECK: On the average, CHECK(ON) with SSRANGE was 1% slower than CHECK(OFF) with SSRANGE, with a range of equivalent to 16% slower.

68 DEBUG
The DEBUG option activates the COBOL batch debugging features specified by the USE FOR DEBUGGING declarative. This may add some additional overhead to process the debugging statements. This option has an effect only on a program that has the USE FOR DEBUGGING declarative.
Performance considerations using DEBUG: The eleven programs measured ranged from equivalent to 2080% slower when using DEBUG compared to using NODEBUG.
Notes: The programs measured in this test were modified to use WITH DEBUGGING MODE on the SOURCE-COMPUTER paragraph and to contain a USE FOR DEBUGGING ON ALL PROCEDURES declarative that did a DISPLAY DEBUG-ITEM. Since the debugging code in these cases is generated only for paragraph and section labels, other programs may have significantly different results.

69 RPTOPTS, RPTSTG
The RPTOPTS option allows you to get a report of the run-time options that were in use during the execution of an application. Generating the report can result in some additional overhead. Specifying RPTOPTS(OFF) will eliminate this overhead.
For batch: On the average, RPTOPTS(ON) was equivalent to RPTOPTS(OFF).
Note: Although the average for batch programs shows equivalent performance for RPTOPTS(ON), you may experience some degradation in a transaction environment (for example, CICS) where main programs are repeatedly invoked.
The RPTSTG option allows you to get a report on the storage that was used by an application. This report is produced after the application has terminated. The data from this report can help you fine tune the storage parameters for the application, reducing the number of times that the LE storage manager must make system requests to acquire or free storage.
Collecting the data and generating the report can result in some additional overhead. Specifying RPTSTG(OFF) will eliminate this overhead.
On the average, RPTSTG(ON) was 5% slower than RPTSTG(OFF), with a range of equivalent to 88% slower. Note that when using call intensive applications, the degradation can be 200% slower or more.

70 RTEREUS – 1 of 2
The RTEREUS option causes the LE run-time environment to be initialized for reusability when the first COBOL program is invoked. The LE run-time environment remains initialized (all COBOL programs and their work areas are kept in storage) in addition to keeping the library routines initialized and in storage. This means that, for subsequent invocations of COBOL programs, most of the run-time environment initialization will be bypassed. Most of the run-time termination will also be bypassed, unless a STOP RUN is executed or unless an explicit call to terminate the environment is made. (Note: Using STOP RUN results in control being returned to the caller of the routine that invoked the first COBOL program, terminating the reusable run-time environment.)
Because of the effect that the STOP RUN statement has on the run-time environment, you should change all STOP RUN statements to GOBACK statements in order to get the benefit of RTEREUS. The most noticeable impact will be on the performance of a non-COBOL driver repeatedly calling a COBOL subprogram (for example, an assembler driver that repeatedly calls COBOL applications). The RTEREUS option helps in this case.

71 RTEREUS – 2 of 2
However, using the RTEREUS option does affect the semantics of the COBOL application: each COBOL program will now be considered to be a subprogram and will be entered in its last-used state on subsequent invocations. If you want the program to be entered in its initial state, you can use the INITIAL clause on the PROGRAM-ID statement.
WARNING: This means that storage that is acquired during the execution of the application will not be freed.
Therefore, RTEREUS may not be applicable to all environments.
Performance considerations using RTEREUS (measuring CALL overhead only): One test case (Assembler calling COBOL) using RTEREUS was 99% faster than using NORTEREUS.

72 Modifying COBOL's Reusable Environment Behavior – 1 of 2
The IGZRREOP customization job can be used to improve the performance of running in a COBOL reusable environment.
REUSENV=COMPAT provides behavior that is compatible with VS COBOL II's reusable environment. With this setting, when a program check occurs while the COBOL reusable environment is dormant (i.e., after returning from the topmost COBOL program back to the non-Language Environment conforming assembler caller), an S0Cx abend will occur. However, this setting significantly degrades the performance of such an application running under Language Environment. The degradation is due to an ESPIE RESET being issued prior to the return from COBOL to the assembler driver and then an ESPIE SET upon each reentry to the topmost COBOL program.

73 Modifying COBOL's Reusable Environment Behavior – 2 of 2
REUSENV=OPT changes this behavior by allowing Language Environment to trap all program checks, including those that occur while the COBOL reusable environment is dormant. This option provides behavior that is not the same as VS COBOL II and will result in an abend 4036 if a program check occurs while the COBOL reusable environment is dormant. However, since an ESPIE RESET and ESPIE SET do not have to be issued between each invocation of the topmost COBOL program, performance will be improved over using REUSENV=COMPAT. Sample source code to make these changes is in members IGZERREO and IGZWARRE of the SCEESAMP dataset.
Performance considerations using IGZRREOP: When using IGZRREOP with REUSENV=OPT, assembler programs calling COBOL programs repeatedly under COBOL's reusable run-time environment can be 60 to 90% faster than using IGZRREOP with REUSENV=COMPAT.
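The two coding points from the RTEREUS discussion above (end with GOBACK instead of STOP RUN, and use the INITIAL clause when a program must start in its initial state) can be sketched in a small subprogram. This is a hedged illustration only: the name SUBPGM1 and the data names are hypothetical, and whether IS INITIAL is wanted depends on the application, since it adds initialization work on every call.

```cobol
       IDENTIFICATION DIVISION.
      * SUBPGM1 is a hypothetical name. Code IS INITIAL only if
      * every call must start from the initial state.
       PROGRAM-ID. SUBPGM1 IS INITIAL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CALL-COUNT       PIC 9(4) COMP VALUE ZERO.
       LINKAGE SECTION.
       01  LS-RESULT           PIC S9(9) COMP.
       PROCEDURE DIVISION USING LS-RESULT.
           ADD 1 TO WS-CALL-COUNT
           MOVE WS-CALL-COUNT TO LS-RESULT
      * GOBACK returns to the caller and leaves the reusable
      * environment in place; STOP RUN would terminate it.
           GOBACK.
```

With IS INITIAL coded, WS-CALL-COUNT starts from its VALUE clause on each call. Without it, the program would be entered in its last-used state under RTEREUS and the counter would keep growing across calls.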
74 STORAGE – 1 of 2
The first parameter of this option initializes all heap allocations, including all external data records acquired by a program, to the specified value when the storage for the external data is allocated. This also includes the WORKING-STORAGE acquired by a RENT program (unless a VALUE clause is used on the data item) when the program is first called or, for dynamic calls, when the program is canceled and then called again. Storage is not initialized on subsequent calls to the program. This can result in some overhead at run time depending on the number of external data records in the program and the size of the WORKING-STORAGE section.
Note: If you used the WSCLEAR option with VS COBOL II, STORAGE(00,NONE,NONE) is the equivalent option with Language Environment.
The second parameter of this option initializes all heap storage when it is freed.
The third parameter of this option initializes all DSA (stack) storage when it is allocated. The amount of overhead depends on the number of routines called (subroutines and library routines) and the amount of LOCAL-STORAGE data items that are used. This can have a significant impact on the CPU time of an application that is call intensive.

75 STORAGE – 2 of 2
Performance considerations using STORAGE:
- On the average, STORAGE(00,00,00) was 17% slower than STORAGE(NONE,NONE,NONE), with a range of equivalent to 130% slower. One RENT program calling a RENT subprogram using IS INITIAL on the PROGRAM-ID statement with a 40 MB WORKING-STORAGE was 28% slower. Note that when using call intensive applications, the degradation can be 200% slower or more.
- On the average, STORAGE(00,NONE,NONE) was equivalent to STORAGE(NONE,NONE,NONE). One RENT program calling a RENT subprogram using IS INITIAL on the PROGRAM-ID statement with a 40 MB WORKING-STORAGE was 4% slower.
- On the average, STORAGE(NONE,00,NONE) was equivalent to STORAGE(NONE,NONE,NONE).
One RENT program calling a RENT subprogram using IS INITIAL on the PROGRAM-ID statement with a 40 MB WORKING-STORAGE was 13% slower.
- On the average, STORAGE(NONE,NONE,00) was 17% slower than STORAGE(NONE,NONE,NONE), with a range of equivalent to 130% slower. One RENT program calling a RENT subprogram using IS INITIAL on the PROGRAM-ID statement with a 40 MB WORKING-STORAGE was 6% slower. Note that when using call intensive applications, the degradation can be 200% slower or more.

76 TEST
The TEST option specifies the conditions under which Debug Tool assumes control when the user application is invoked. Since this may result in Debug Tool being initialized and invoked, there may be some additional overhead when using TEST. Specifying NOTEST will eliminate this overhead.

77 TRAP
The TRAP option allows LE to intercept an abnormal termination (abend), provide the abend information, and then terminate the LE run-time environment. TRAP(ON) also assures that all files are closed when an abend is encountered and is required for proper handling of the ON SIZE ERROR clause of arithmetic statements for overflow conditions. TRAP(OFF) prevents LE from intercepting the abend. In general there will not be any significant impact on the performance of a COBOL application when using TRAP(ON).
Performance considerations using TRAP: On the average, TRAP(ON) was equivalent to TRAP(OFF).

78 Course Other Tuning Options
Units:
- Using an Optimal Programming Style
- Choosing Efficient Datatypes
- Handling Tables Efficiently
- Compiler Options that Affect Run-time Performance
- Run-time Options that Affect Run-time Performance
- Efficient COBOL Coding Techniques
- Tuning CICS, IMS, DB2 and VSAM Access
- Appendices

79 Running efficiently with CICS, IMS, or VSAM
You can improve performance for online programs running under CICS or IMS, or programs that use VSAM, by following these tips.
CICS: If your application runs under CICS, convert EXEC CICS LINK commands to COBOL CALL statements to improve transaction response time. IMS: If your application runs under IMS, preloading the application program and the library routines can help reduce the overhead of loading and searching. It can also reduce the input-output activity. For better system performance, use the RENT compiler option and preload the applications and library routines when possible. You can also use the Language Environment library routine retention (LRR) function to improve performance in IMS/TM regions. VSAM: When you use VSAM files, increase the number of data buffers for sequential access or index buffers for random access. Also, select a control interval size (CISZ) that is appropriate for the application. A smaller CISZ results in faster retrieval for random processing at the expense of inserts. A larger CISZ is more efficient for sequential processing. For better performance, access the records sequentially and avoid using multiple alternate indexes when possible. If you use alternate indexes, access method services builds them more efficiently than the AIXBLD runtime option. 80 Improving VSAM performance – 1 of 2 Your system programmer is most likely responsible for tuning the performance of COBOL and VSAM. As an application programmer, you can control the aspects of VSAM that are listed below. 81 Improving VSAM performance – 1 of 2 Application methods for improving VSAM performance …. 82 SQL Performance – 1 of 3 SQL Performance is a very big topic. If you are responsible for writing EGL code that must generate to high-performance SQL database access statements you should research this topic on the Internet, and with books written by experts such as Joe Celko and Craig Mullins (among others) Background SQL efficiency is a direct correlation to the # of rows physically accessed by the DBMS versus the # of rows needed by your business logic. 
As an example, if a list page only displays 10 rows at a time, but accesses 2,000 rows from the database per execution of the SQL function this is probably inefficient. However, if a batch process needs 225,000 rows for its business process, and accesses 300,000 rows to get them, this is relatively efficient. It’s the correlation of # of rows needed, versus DBMS work-to-be-done to get those rows that is the true measure of SQL efficiency. In order to determine the above, you must fully understand the service level requirements for your database access. Typically, you are concerned with web-page performance and user-wait time. Problems with pages occur when the SQL run to access the data for the page returns thousands (or tens or hundreds of thousands of rows). So, you will need to understand both the logical AND physical structure of your database – and its choices of data access, in order to completely understand how much effort it takes the database to get the data you need. This can be (read, “usually is”) a very complex and time-consuming process, unique to every different business application The best we can do in this tutorial is to give you general rules-of-thumb that will make queries “more” efficient. However, it is strongly recommended that you discuss this with your systems database administrator (DBA) – for any/all pages that require maximum SQL performance Note that there is no end of decent SQL performance articles – mostly database-specific on this topic available freely on the internet – and a # of good books as well. http://www-128.ibm.com/developerworks/db2/library/techarticle/0210mullins/0210mullins.html http://www.javaworld.com/javaworld/jw-02-2005/jw-0228-apm.html 83 SQL Performance – Rules of Thumb – 2 of 3 For Batch applications that process all or a significant % of table rows, index access is often inefficient. This is because the database must first read the index dataset – then read the table dataset to retrieve each row. 
Index access is efficient only when your SQL requirements are for a small percentage of the entire table (typically less than 20%).

For online/pages:
- Attempt to limit the number of rows retrieved per get statement to the minimum needed per page view. This will require you to do at least two things (neither of which is easy or quick):
  - Add additional WHERE clause conditions using SQL BETWEEN that reference database index columns, and in your EGL code manage the values so that the database is retrieving as close to the number of records needed per page view as possible.
  - Code your own programmatic paging by incrementing the values in the SQL BETWEEN clauses per page forward, and decrementing the values for page backward.
- Separate the database access from the page fields by using basicRecords to render the fields. In this tutorial, you have primarily dragged and dropped database records onto a form. This is not as efficient as doing the database access through sqlRecords, then moving the returned records into basicRecords.
- Using explicit SQL, select only the columns you need in your page view (i.e., if you are only showing 4 fields tied to a 30 column table row, delete the other 26 columns).
- Attempt to avoid ORDER BY unless it is on an indexed column AND the index exists in the same order you need for your sort process. Obviously this is driven by the business requirements, but database sorting can be very expensive.
Statements that force sorts include:
- ORDER BY
- GROUP BY
- TABLE JOINS
- UNION
- Certain Sub-selects

Avoid using SQL WHERE clause keywords that disallow index access:
- Computations in the WHERE clause
- LIKE, with the % or underscore before the literal data

84 SQL Performance – Rules of Thumb – 3 of 3
Use custom SQLRecords to limit the number of:
- Rows returned for SQL queries: use programmatic paging for SELECT statements that would return > 200 rows.
- Columns of information: create custom SQLRecords that only access the information needed by your page.

Attempt to limit the number of JSF dataTable cells rendered by EGL/JSF. A cell is the intersection of a row and column in a dataTable, so that if you have 5,000 rows and 20 columns, you are generating 100,000 cells. This will almost definitely cause severe page performance problems. As a rule of thumb, keep the number of total cells under 10,000 (i.e., 500 rows by 20 columns). Large-scale SQL performance improvements have been documented by so doing.

SQL Stored Procedures will always out-perform dynamic SQL access. So for your ultra-performance-intensive pages, consider using SQL Stored Procedures. This has been well-documented in IBM Redbooks.

Use the EGL MAXSIZE property to halt the fetch process from the database on potentially large queries. MAXSIZE, which you specify as a property of a dynamic array, interacts with the database engine's fetch looping and, once reached, closes the cursor and stops the database access process. You should note that the majority of the cost of the database access occurs when a cursor is opened. However, if your page only renders 10 rows at a time, and you are doing programmatic paging, specifying MAXSIZE = 10 will help performance somewhat.

Use any of your DBMS's SQL efficiency features.
For example, DB2 has several keywords that influence the Optimizer to access a stated # of rows:

    OPTIMIZE FOR n ROWS;       //tells the optimizer you will probably only fetch n rows
    FETCH FIRST n ROWS ONLY;   //tells the optimizer you will ONLY fetch n rows
    FOR READ ONLY              //tells the optimizer you will not do any update where current of commands
    WITH UR                    //allows you to read uncommitted rows

Grouping duplicate computations*** (probably remove)

When components of different expressions are duplicates, ensure that the compiler is able to optimize them. For arithmetic expressions, the compiler is bound by the left-to-right evaluation rules of COBOL. Therefore, either move all the duplicates to the left side of the expressions or group them inside parentheses.

If V1 through V5 are variables, the computation V2 * V3 * V4 is a duplicate (known as a common sub-expression) in the following two statements:

    COMPUTE A = V1 * (V2 * V3 * V4)
    COMPUTE B = V2 * V3 * V4 * V5

In the following example, V2 + V3 is a common sub-expression:

    COMPUTE C = V1 + (V2 + V3)
    COMPUTE D = V2 + V3 + V4

In the following example, there is no common sub-expression:

    COMPUTE A = V1 * V2 * V3 * V4
    COMPUTE B = V2 * V3 * V4 * V5
    COMPUTE C = V1 + (V2 + V3)
    COMPUTE D = V4 + V2 + V3

The optimizer can eliminate duplicate computations. You do not need to introduce artificial temporary computations; a program is often more comprehensible without them.

Grouping Constant Computations*** (probably remove)

When several items in an expression are constant, ensure that the optimizer is able to optimize them. The compiler is bound by the left-to-right evaluation rules of COBOL. Therefore, either move all the constants to the left side of the expression or group them inside parentheses.
For example, if V1, V2, and V3 are variables and C1, C2, and C3 are constants, the expressions on the left below are preferable to the corresponding expressions on the right:

    More efficient                    Less efficient
    V1 * V2 * V3 * (C1 * C2 * C3)     V1 * V2 * V3 * C1 * C2 * C3
    C1 + C2 + C3 + V1 + V2 + V3       V1 + C1 + V2 + C2 + V3 + C3

In production programming, there is often a tendency to place constant factors on the right-hand side of expressions. However, such placement can result in less efficient code because optimization is lost.