The Progress Transaction “Leak”

by David Takle

Have you ever wondered why it is that the total quantity on order in the inventory master record doesn’t match the detail records that support the figure? Or why the customer’s accounts receivable detail doesn’t add up to the balance due field in the customer record? You might hear an explanation offered such as “someone must have pressed control-c.” You might even have spent a few days looking for some bug by combing through the programs that can update those fields, and given up because the code looks perfect. It is very possible that what you are experiencing is the subtlety of “Transaction Leak”.

One of the great characteristics of the Progress DBMS is a feature called “transaction integrity.” Properly employed, it guarantees that a collection of record updates is either committed in its entirety to the database or completely rolled back. This provides application programmers an excellent means of encapsulating changes, and relieves them of the effort of writing data-nursing routines to clean up the nasty messes left over from programs that trip over unexpected conditions that would otherwise contaminate the data.

But in the real world where programmers are fallible and perhaps even unaware, it is very easy to write a procedure in which the transaction can “leak” by rolling back some database changes while committing others. In fact, due to the inherent block properties in Progress, it takes a conscious and deliberate effort on the part of the developer in order to prevent that possibility. And anywhere that a programmer fails to override the default properties in a transaction block, the application may be subject to partial transaction rollback.

The key to this phenomenon is the default error property that is part of the Progress architecture. Every Procedure block, Repeat block, For block, and Do-Transaction block has an implied error property that tells Progress what to do in the event of a Progress-level error. And as we shall demonstrate shortly, it is the default error logic that opens the door to transaction leak. Thus it is up to the programmer to take deliberate steps to override the default error properties in order to prevent this problem.

Let’s look at an example, using the infamous Sports database.

REPEAT:
    PROMPT-FOR cust-num.
    FIND customer EXCLUSIVE-LOCK USING cust-num.
    DISPLAY customer.name.
    IF credit-limit LT 5000
    THEN ASSIGN credit-limit = credit-limit * 2.
    IF remarks EQ "" THEN ASSIGN remarks = "(none)".
    IF contact EQ "" THEN ASSIGN contact = Rep-name.
END.

As soon as we choose a customer with a blank contact field, we have the possibility of raising the error condition (salespsn is not available). At that point Progress initiates an “Undo, Retry” action, because that is the default property of the Repeat block, and the entire transaction is backed out, including any update to the credit-limit or remarks. So far, so good.

Rule #1. The default error property for any block requesting user input should be UNDO, RETRY.

That is exactly the convention chosen by Progress when they designed the language. And since this is the default, nothing special needs to be done for blocks that request user input. They generally take care of themselves. The exception might be a “Do” block that is not also a transaction block, because it defers to the enclosing block for error handling. In that case you may need to explicitly add Undo, Retry to control the interaction (depending on the desired result).

The problem of transaction leak comes from the second part of the convention chosen by Progress. For whatever reason, they decided that blocks that did not have user input should default to “Undo, Leave.”

Returning to our example, let’s imagine that the process has become a little more complicated, and that we need to break up the code into separate procedures to manage this application.

REPEAT TRANSACTION:
    PROMPT-FOR cust-num.
    FIND customer EXCLUSIVE-LOCK USING cust-num.
    DISPLAY customer.name.
    RUN Update-Credit.
    RUN Update-Other-Stuff.
END.

PROCEDURE Update-Credit:
    IF credit-limit LT 5000
    THEN ASSIGN credit-limit = credit-limit * 2.
END PROCEDURE.

PROCEDURE Update-Other-Stuff:
    IF remarks EQ "" THEN ASSIGN remarks = "(none)".
    IF contact EQ "" THEN ASSIGN contact = Rep-name.
END PROCEDURE.

Just to be doubly sure that we control the transaction scope, the “Transaction” keyword has been added to the Repeat block. Does the application still work the way it did before? Not even close. If an error occurs in Update-Other-Stuff, the Progress error property is triggered as before. But the default of “Undo, Retry” is changed to “Undo, Leave” because there are no user interactions in the procedure, and no Retry functions. Consequently the calling routine (i.e. the Repeat block) knows absolutely nothing about the error that occurred in the called routine, and continues on its merry way, “completing” the transaction. So although the remarks field has been returned to its original value due to the Undo, Leave, the credit-limit field remains updated. The transaction has “leaked.” Part of it was committed, and part was not.

What’s more, this type of error is very easy to come by in a large, complex program with many procedure calls. In this particular example, a programmer modifying the Update-Other-Stuff procedure might be under the impression that the salesperson record is supposed to be available. Thus an error is introduced, the data is corrupted, and we have to write a special “data-fix” program to put things back in order.

Now this is admittedly a very simple example, and one could argue that little harm was done. But the same principle applies to more serious situations such as updating the total quantity on-order in one procedure, and writing a new order line in another procedure. An error in either one will cause the on-order value to be incorrect, even though a Transaction block was expressly created for the purpose of keeping on-order in synch with the order line detail.

The best solution to the problem is to override the default error property of the internal procedure:

PROCEDURE Update-Other-Stuff:
    DO ON ERROR UNDO, RETURN ERROR:
        IF remarks EQ "" THEN ASSIGN remarks = "(none)".
        IF contact EQ "" THEN ASSIGN contact = Rep-name.
    END.
END PROCEDURE.

or more simply,

PROCEDURE Update-Other-Stuff:
    IF RETRY THEN RETURN ERROR.
    IF remarks EQ "" THEN ASSIGN remarks = "(none)".
    IF contact EQ "" THEN ASSIGN contact = Rep-name.
END PROCEDURE.

Now if an error occurs, it is propagated back to the calling routine which in turn is able to undo the entire transaction and prompt the user again. So at this point, it again works like the original version where all of the updating was done in a single block. The same concept would apply if this were an external procedure called from anywhere within the main program. This can be generalized to ...

Rule #2. With few exceptions (see Rule #3), the default error processing for any block without user interaction should be overridden to be UNDO, RETURN ERROR.

“Undo, Retry” is fine for blocks that have user interaction. But “Undo, Leave” has the effect of hiding the error from all parent routines. On the other hand, “Undo, Return Error” helps to ensure that the error condition is propagated up the call stack to a level where it is actually useful (i.e. can back out an entire transaction). This is in fact the default error process for database trigger blocks, since there is little else that would make sense at that level of code. Unfortunately, that default does not apply to other procedure blocks, and any procedure block that does not explicitly propagate the error condition will simply do an Undo, Leave and “leak” the transaction.
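On the calling side, nothing special is needed once the called procedures return the error: the default Undo, Retry of the interactive Repeat block does the work. If the caller wants to tell the user something before retrying, one possible variation (an assumption on our part, not part of the example above) is to trap the error with NO-ERROR and then undo explicitly. In this fragment the label Entry-Loop and the message text are hypothetical, and the two procedures are assumed to RETURN ERROR on failure as described above.

```abl
/* Sketch only. Entry-Loop is a hypothetical label; Update-Credit and
   Update-Other-Stuff are the internal procedures from the example and
   are assumed to propagate errors with RETURN ERROR. */
Entry-Loop:
REPEAT TRANSACTION:
    PROMPT-FOR customer.cust-num.
    FIND customer EXCLUSIVE-LOCK USING customer.cust-num.
    DISPLAY customer.name.

    RUN Update-Credit.

    /* NO-ERROR suppresses the default handling so we can report first */
    RUN Update-Other-Stuff NO-ERROR.
    IF ERROR-STATUS:ERROR THEN DO:
        MESSAGE "Update failed. No changes were saved for this customer."
            VIEW-AS ALERT-BOX ERROR.
        /* back out the whole iteration, including the credit-limit change */
        UNDO Entry-Loop, RETRY Entry-Loop.
    END.
END.
```

Either way the important point is the same: the error reaches the block that owns the transaction, so the whole unit of work is undone rather than just the part inside the called procedure.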

The myth of the master error handler

Often when programmers are presented with the idea of propagating errors up the call stack, they object that this process only works if you have some sort of master error handling routine at the top end that can manage any type of error. But such a view ignores the obvious. Look again at the original routine, where all of the updating was performed within the confines of a single Repeat block. Any error that occurs there is handled just fine, without any special coding. That’s the beauty of the default error handling in Repeat blocks used for data entry. All errors unwind the transaction (or sub-transaction) and let the user try again.

With Rule #2 all we have done is preserve this simple yet elegant architecture, by propagating errors back from called procedure blocks. In other words, the best place to raise the error condition is in the block that requests user interaction, because the default error properties in Progress relieve the programmer from worrying about how to handle the errors. And the only way to raise the error condition there is to propagate the errors from all other blocks. (More precisely, the first place we must target with the error condition is the top-level transaction block so that the transaction is backed out. After that, we need to target the block with user interaction, so that the user can decide what to do next. But we must propagate errors from all other blocks in order to accomplish either of these goals.)

Aside from the initial startup procedure, nearly every process accessible to the user begins with some interaction (even if only a menu option). So the best default for any non-interactive routine is to raise the error condition in the parent routine. Because eventually, the error will propagate back to a routine that requested user interaction, and the default handling at that level will restore order. In the meantime, all transactions remain intact, and all of the data is protected.

Of course for processes that have no user interaction, such as a program run in background, it is important to add an error logging routine that can make a record of the problem so that it can be addressed. But this should be viewed as a step above and beyond the default processing. As it is, without explicitly raising the error condition, background processes (other than reports) can generate errors, produce transaction leaks, and keep right on running, perhaps creating thousands of partial transactions with no method in place to stop the problem, and no logging in effect to find the problem. Overriding the default error logic with the standard outlined above at least stops the process and backs out the current transaction in its entirety.

Logging is an issue either way, so really nothing has been lost and a lot has been gained by passing the error status back to the parent procedures. Consequently, the idea that propagating the error up the call stack requires a general error handler is just not true.

Given the conclusions reached thus far, there are a few additional considerations that must be covered:

1. how to properly enforce Rule #2.

2. what to do with processes that are primarily non-interactive in nature, such as reports and batch updates.

3. how event-driven models differ from the procedural model described above.

4. when the RETRY function is ignored.

Enforcing Rule #2

There are several things to keep in mind when overriding the default error process. First, decide on a standard override for all non-interactive blocks that have the error property. One method for doing this is to add the phrase “ON ERROR UNDO, RETURN ERROR” to each of the following block headers:

REPEAT ... END.

FOR EACH .... END.

DO TRANSACTION ... END.

That leaves only procedure blocks. The two methods for overriding procedure blocks are:

(a) Enclose all of the code in the block with DO ON ERROR UNDO, RETURN ERROR:, or

(b) Insert “IF RETRY THEN RETURN ERROR” as the first executable line of code in the procedure.

To be consistent, it may be best to insert “IF RETRY THEN RETURN ERROR” into all four of the above block types. This works, of course, because the Progress error process is always “undo, retry” if there is a retry function in the block prior to where the error occurs.
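As a sketch of the first method, the phrase attached to the block headers might look like the two independent fragments below. The table and procedure names are borrowed from the earlier examples and are only illustrative; the second fragment assumes a customer record is already available to the called procedures.

```abl
/* Fragment 1: a non-interactive FOR EACH. Unless a transaction is already
   active, each iteration is its own transaction; an error undoes the
   current iteration and raises ERROR in the caller of this procedure. */
FOR EACH customer EXCLUSIVE-LOCK ON ERROR UNDO, RETURN ERROR:
    IF customer.credit-limit LT 5000
    THEN ASSIGN customer.credit-limit = customer.credit-limit * 2.
END.

/* Fragment 2: a non-interactive transaction block. Everything inside is
   undone, and then the error is passed back to the caller. */
DO TRANSACTION ON ERROR UNDO, RETURN ERROR:
    RUN Update-Credit.
    RUN Update-Other-Stuff.
END.
```

A REPEAT block takes the same phrase in its header (REPEAT ON ERROR UNDO, RETURN ERROR:).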

Exception to rule #2: blocks that have the error property that are nested within a Transaction block.

Transaction-Block:
DO TRANSACTION:
    IF RETRY THEN RETURN ERROR.
    RUN Update-Something.
    FOR EACH customer EXCLUSIVE-LOCK:
        IF RETRY THEN RETURN ERROR.
        RUN Update-Customer-Fields.
    END.
END.

In this example, we see a second way for transaction leak to occur. If an error is propagated back from “Update-Customer-Fields” it will undo only the sub-transaction that is within the context of the FOR EACH. Whatever was updated in the “Update-Something” routine will be left intact, because the “Transaction-Block” was not undone. In a more general sense, this leak is the result of explicitly leaving or returning from a transaction block without undoing it. The correct approach is to change the “RETRY” inside the “FOR EACH” to one of the following forms:

IF RETRY THEN UNDO Transaction-Block, RETURN ERROR.

(or)

IF RETRY THEN UNDO Transaction-Block, RETRY Transaction-Block.

Although both of the above forms will correctly unwind the transaction, the last one is perhaps less desirable, because it assumes you have added the RETRY function to the transaction block as well. If it were missing, the “RETRY Transaction-Block” would be ignored, and Progress would do an “UNDO, LEAVE” instead (see the discussion below, “When the RETRY function is Ignored”). Remember that there are two goals here: (1) to undo the top-level transaction block, and (2) to raise the error condition in an interactive block. That brings us to Rule #3.

Rule #3: When overriding the default error property in a block that is nested within the top-level transaction block, be sure to explicitly undo the transaction block.
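Put together, a version of the nested example that follows Rule #3 might look like the sketch below. Update-Something and Update-Customer-Fields are the same hypothetical procedures used above and are assumed to RETURN ERROR on failure.

```abl
Transaction-Block:
DO TRANSACTION:
    IF RETRY THEN RETURN ERROR.
    RUN Update-Something.

    FOR EACH customer EXCLUSIVE-LOCK:
        /* Name the outer block so that the whole transaction is undone,
           not just the sub-transaction of this iteration. */
        IF RETRY THEN UNDO Transaction-Block, RETURN ERROR.
        RUN Update-Customer-Fields.
    END.
END.

/* The same intent can also be stated in the block header:
   FOR EACH customer EXCLUSIVE-LOCK
       ON ERROR UNDO Transaction-Block, RETURN ERROR:            */
```

With either form, an error inside the FOR EACH backs out the work done by Update-Something as well, and the error condition is then raised in whatever routine ran this code.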

Error Handling in Background processing

As stated earlier, the above rules are preferable to the default error handling even for background processes where no user interaction is present and when no universal error handler has been written. However, the ideal approach would be to capture all background errors in a log file for review by a developer. That way, bugs can be found earlier and easier, and no one has to wonder why the update did not finish as expected.

The easiest way to do this is to insert an include file as the first executable code in every block that has the error property (REPEAT, FOR, DO TRANSACTION, DO ON ERROR, procedure). This include file would replace the statement used above, “IF RETRY THEN RETURN ERROR”, and might look something like this:

{retry.i {&FILE-NAME} {&LINE-NUMBER} &Action=ERROR }

The first two arguments are standard Progress pre-processor names. The “&Action” argument allows the developer to state whether or not to propagate the error up the call stack or perform some other basic function. The content of the include file should be kept fairly simple so that the object files do not grow by any considerable amount. And one possibility for the file would be:

/*================================================================
retry.i == Standard Retry Function
Syntax: {retry.i {&FILE-NAME} {&LINE-NUMBER} &Action=ERROR }
================================================================*/
IF RETRY THEN DO:
    IF RETURN-VALUE NE "Propagating-Error"
    THEN RUN errorlog.p ( '{1}', {2} ).
    &IF "{&Action}" EQ "ERROR" &THEN
        RETURN ERROR "Propagating-Error".
    &ELSEIF "{&Action}" EQ "RETURN" &THEN
        RETURN.   /* exit without raising error */
    &ELSEIF "{&Action}" EQ "LEAVE" &THEN
        LEAVE.    /* imitate standard default handling */
    &ENDIF        /* otherwise no-op, try again */
END.
/* ---end--- retry.i */

With this include file in every procedure block that has the error property, every Progress-level error would be captured. The program errorlog.p could create a record in an error log table and insert all of the following information:

- date and time
- user ID
- source filename and line number where the Retry occurred (passed as parameters)
- complete call stack back to the startup routine (use the PROGRAM-NAME function)
- recent error numbers that were generated (use the ‘undocumented’ _MSG function)

If running in foreground, errorlog.p could also present an alert box to the user, indicating that they should write down the last interaction and the error message that had been displayed just prior to the one they are looking at.
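A minimal sketch of what errorlog.p might contain follows. It is not a definitive implementation. For simplicity it appends a line to a text file instead of creating a record in a table; the file name errorlog.txt, the stream name, and the variable names are all our own assumptions. It also assumes at least one database is connected, and that _MSG(1) returns the number of the most recent message.

```abl
/* errorlog.p : hypothetical sketch of the logging routine run by retry.i */
DEFINE INPUT PARAMETER src-file AS CHARACTER NO-UNDO.  /* from {&FILE-NAME}   */
DEFINE INPUT PARAMETER src-line AS INTEGER   NO-UNDO.  /* from {&LINE-NUMBER} */

DEFINE VARIABLE lvl        AS INTEGER   NO-UNDO.
DEFINE VARIABLE call-stack AS CHARACTER NO-UNDO.
DEFINE STREAM log-stream.

/* PROGRAM-NAME(1) is errorlog.p itself, so start with its caller at level 2.
   The function returns the unknown value (?) past the top of the stack. */
lvl = 2.
DO WHILE PROGRAM-NAME(lvl) NE ?:
    call-stack = call-stack
               + (IF call-stack EQ "" THEN "" ELSE " < ")
               + PROGRAM-NAME(lvl).
    lvl = lvl + 1.
END.

/* Append one line per logged error (file name is an assumption) */
OUTPUT STREAM log-stream TO VALUE("errorlog.txt") APPEND.
PUT STREAM log-stream UNFORMATTED
    STRING(TODAY) " " STRING(TIME, "HH:MM:SS")
    " user="     USERID(LDBNAME(1))   /* assumes a connected database     */
    " source="   src-file ":" src-line
    " stack="    call-stack
    " last-msg=" _MSG(1)              /* assumed: most recent message no. */
    SKIP.
OUTPUT STREAM log-stream CLOSE.
```

One reason to consider a text file is that output written to a file is not backed out by UNDO, whereas a log record created inside a transaction that is later undone would disappear along with everything else.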

On the down side, this approach does not relieve the developers of the task of understanding the basic error control processes. If we place this include inside the top-level REPEAT loop in our original example, we can no longer code “UNDO, RETRY” at various points in the loop where we want the user to try again. If we did, the include would treat the retry as an error that needed to be logged in the error-log table.

To get around this there are several options. One approach would be to leave the include file out of procedures that have user interaction, since the default error processing works just fine there. Another approach would be to code “RUN undo.p” instead of “UNDO, RETRY” and have “undo.p” simply do a “RETURN ERROR ‘Propagating-Error’.” to prevent the retry block from doing any mischief. It would raise the error condition in the Repeat block, but indicate that any required logging had already been done. In this case the “&Action” parameter would have to be set to “NO-OP” to prevent propagating the error or leaving the block prematurely.

On the plus side, “UNDO, RETRY” has now taken on a new purpose. You can simulate raising the error property at any time by coding “Undo, Retry” in a procedure. This could effectively be used to capture logical data errors that are detected by the application, and at the same time guarantee that any current transaction is backed out correctly.

Error Handling in Event-Driven models

At a strategic level, event-driven applications need to manage Progress-level errors in the same way that procedural applications need to: that is, they need to completely unwind any transaction in process and return the user to the place of the last interaction. From a tactical perspective, there are only a few minor differences from that of procedure models.

First, we need to address the default error handling within a trigger block. Progress decided on the same default which they used for any procedure block: UNDO, LEAVE. Unfortunately, this has the effect of permitting default trigger processing (e.g. moving the cursor on to the next field), even though an error has occurred. Consequently, the correct thing to do (as a default) is begin every trigger block with “IF RETRY THEN RETURN NO-APPLY”. This ensures that no default processing takes place after an error. A similar phrase would need to be included in any enclosed block that had the error property.

The second issue is that since there are no “top-level” repeat blocks containing user interactions, all procedures other than trigger blocks should be treated according to rules 2 and 3. And since they will propagate errors back to the trigger block where we will perform a “RETURN NO-APPLY”, all errors will be treated equally, and the user will regain control at the last point of interaction.
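As a rough illustration, a user-interface trigger written along these lines might look like the fragment below. The variable, frame, and procedure names (new-limit, f-main, Save-Limit) are hypothetical, and Save-Limit is assumed to follow Rules 2 and 3 so that it does a RETURN ERROR on any failure.

```abl
/* Fragment only: all names here are hypothetical. */
DEFINE VARIABLE new-limit AS DECIMAL NO-UNDO LABEL "Credit Limit".
DEFINE FRAME f-main new-limit WITH SIDE-LABELS.

ON LEAVE OF new-limit IN FRAME f-main DO:
    /* After an error, suppress the default action for the event
       (here, moving the cursor on to the next field). */
    IF RETRY THEN RETURN NO-APPLY.

    /* Save-Limit (not shown) owns the transaction and returns ERROR
       on failure, which raises the error condition in this block. */
    RUN Save-Limit.
END.
```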

When the RETRY function is Ignored

One last note on the RETRY function. Every once in a while a developer will attempt to write a routine that is supposed to capture all errors and keep trying the desired process until it succeeds. The most common variety of this involves the creation of a record with a unique key in a situation where the key is generated automatically and could possibly collide with another session. The code might look something like this:

FIND LAST whatever NO-LOCK.

ASSIGN temp-value = whatever.key1.

REPEAT ON ERROR UNDO, RETRY:

CREATE whatever.

ASSIGN temp-value = temp-value + 1

whatever.key1 = temp-value.

LEAVE.

END.

The theory here is that if another session beats you to the key value, you can simply undo your create and try again with another key value. However, this is not what happens. If an error occurs, Progress will leave the repeat block, even though the developer explicitly requested a Retry. And in this particular instance, after the LEAVE is performed the buffer does not contain a record, so that any other code that is expecting the record to be available will also fail. In order for a Retry to occur, there must be either a Retry function or a user interaction in the block prior to where the error is encountered. Otherwise, Progress overrides the error process and leaves the block. Since the larger procedure is unaware of the error, it is entirely possible that this type of structure could introduce Transaction Leak. Details on this behavior of Retry can be found in the Progress Programming Handbook, Condition Handling and Messages.

Benefits

The number one benefit of propagating errors up the call stack is transaction integrity. But in addition to that, there are two other spin-offs that are worth considering.

The first is the ability to perform an UNDO, RETRY (i.e. RETURN ERROR “Propagating-Error”) at any point in the application. This makes it easy to manage all types of logical integrity issues with a great deal of confidence. For example, standard low-level routines can validate parameters, test for the presence of supporting records, and so forth, and simply raise the error condition in the caller if they cannot complete the expected task. This avoids the tedious issue of setting, resetting, and testing flags, and the possible bugs that can be introduced when trying to manage transaction error recovery with explicit application code.

Finally, standardized error logging routines are so simple to produce, and provide such a valuable resource, that they should be adopted as the standard for all application development.

Conclusion

The ever-present threat of Transaction Leak is an intrinsic part of any application that does not propagate all errors back to top-level transaction blocks. The cost of repairing data that is damaged from partial transactions can be considerable. And that is assuming that it is even noticed at all and diagnosed correctly.

On top of that, the task of nursing data along in a database system that is fully capable of protecting itself from such problems is as boring as it is dangerous. Fixing program errors is one thing: we expect to do that sooner or later in any development environment. But with proper methods and standards, we should never have to write programs to fix partial transactions.

What would be really great is if we were given the option to make these defaults implicit in the error properties of the language itself (e.g. set with a startup parameter). Maybe someday Progress will realize the dangers of transaction leak and give us this feature. In the meantime, we need to adopt the necessary standards to ensure transaction integrity and protect the data.

David Takle

Find First Consultant, Inc. davidtakle@iname.com
