TOWARD A THEORY OF TEST DATA SELECTION John B

advertisement
TOWARD A THEORY OF TEST DATA SELECTION
John B. Go0denough
Susan L. Gerhart ~
SofTech, Inc., Waltham, Mass.
Ke]rwords and Phrases:
testing,
• W h a t a r e t h e p o s s i b l e s o u r c e s of f a i l u r e i n
a program?
• What test data should be selected to demonstrate that failures do not arise from these
sources ?
proofs of correctness
Abstract
This paper examines the theoretical and
p r a c t i c a l r o l e of t e s t i n g i n s o f t w a r e d e v e l o p m e n t .
We prove a fundamental theorem showing that
p r o p e r l y s t r u c t u r e d t e s t s a r e c a p a b l e of d e m o n strating the absence of errors in a program. The
theoremts proof hinges on our definition of test
reliability and validity, but its practical utility
hinges on being able to show when a test is actuaUy reliable.
We explain what makes tests
unreliable (for example, we show by example
why testing all program statements, predicates,
or paths is not usually sufficient to insure test
reliability), and we outline a possible approach
to developing reliable tests.
We also show how
the analysis required to define reliable tests can
help in checking a program's
design and specifications as wetl~as in preventing and detecting
implementation errors.
1.
1.1
of t h i s p a p e r i s :
• t o s u r v e y t h e p u r p o s e a n d l i m i t a t i o n of t e s t ing as presently conceived,
• to demonstrate a systematic procedure for
developing valid and reliable test data.
• to establish a view of the purposes and
l i m i t a t i o n s of t e s t i n g t h a t w i l l p e r m i t t e s t ing methods to be s~rstematically improved.
I n t h e r e m a i n d e r of t h i s I n t r o d u c t i o n , w e d e f i n e
some basic concepts relevant to software testing
a n d l a y t h e b a s i s f o r a t h e o r y of t e s t d a t a s e l e c t i o n . I n S e c t i o n 2, w e d i s c u s s t w o e x a m p l e s of
programs published in the literature which conrain errors.
Our purpose there is to illustrate
• the problems a theory of testing must deal with.
I n S e c t i o n 3, w e e x a m i n e w h a t o t h e r s h a v e s a i d
about the goals, methods, and difficulties of program testing before turning attention to our prop o s e d m e t h o d , w h i c h w e p r e s e n t i n S e c t i o n 4. I n
S e c t i o n 5, w e a p p l y o u r t h e o r e t i c a l c o n c e p t s t o
t h e r e s u l t s of S e c t i o n 4.
The fundamental
out this paper are:
questions
examined
*Supported in part under Contract
C0469.
'~Now at Duke University,
Durham,
Testing Concepts
Consider a program F whose input domain is
t h e s e t of d a t a D. F ( d ) d e n o t e s t h e r e s u l t o f e x e c u t i n g F w i t h i n p u t d ~ D. O U T ( d , F ( d ) ) s p e c i f i e s
t h e o u t p u t r e q u i r e m e n t f o r F , i. e . , O U T ( d , F ( d ) )
is true if and only if F(d) is an acceptable result.
We will write OK(d) as an abbreviation for
O U T ( d , F ( d ) ) . L e t T b e a s u b s e t o f D. T c o n s t i ~
t u t e s a n i d e a l t e s t i f OK(t) f o r a l l t ~ T i m p l i e s
OK(d) f o r a l l d E D, i. e . , i f f r o m s u c c e s s f u l e x e c u t i o n of a s a m p l e of t h e i n p u t d o m a i n w e c a n
conclude the program contains no errors,
then
the sample constitutes an ideal test.
Clearly,he
validity of this conclusion depends on how "thoroughly" T exercises F. Most papers on testing
[e. g. , Poole ( 1 9 7 3 ) . S t u c k i ( 1 9 7 4 ) ] e q u a t e a t h o r ough test with one that is exhaustive, i. e., one
f o r w h i c h T = D. B u t s u c h a d e f i n i t i o n g i v e s n o
insight into problems of test data selection. Ins t e a d , w e d e f i n e a " t h o r o u g h " t e s t , T, t o b e o n e
s a t i s f y i n g C O M P L E T E ( T , C), w h e r e C O M P L E T E
is a predicate that defines, in effect, how some
d a t a s e l e c t i o n c r i t e r i o n , C, i s u s e d i n s e l e c t i n g
a p a r t i c u l a r s e t of t e s t d a t a T . C d e f i n e s w h a t
properties of a program must be exercised to
c o n s t i t u t e a " t h o r o u g h " t e s t , i. e o , o n e w h o s e
successful execution implies no errors in a
tested program.
COMPLETE insures that T
satisfies all these properties.
(We w i l l d i s c u s s
l a t e r , i n S e c t i o n 5, t h e p r a g m a t i c r e a s o n s f o r
using an incomplete test. )
Introduction
The purpose
Fundamental
The purpose of testing is to determine whether
a program contains any errors.
An ideal test,
therefore,
succeeds only when a program conrains no errors.
In this paper, one of our goals
is to define the characteristics
of a n i d e a l t e s t i n
a way that gives insight into problems of testing.
We begin with some basic definitions.
The data selection criterion C must be def i n e d s o t e s t s s a t i s f y i n g C O M P L E T E (T~ C)
produce consistent and meaningful resnlts. We
express this requirement by saying C must be
reliable and valid. In general, reliability refers
to t~ consistency with which results are prod u c e d , r e g a r d l e s s o ~ w h e t h e r t h e r e s u l t s are
meaningful.
In particulars a data selection
through-
DAAA25-74N.C.
493
Given a p r o g r a m F, wlth domain D, output r e q u i r e m e n t OK(d) = OUT(d, F(d)) and t e s t data selection
c r i t e r i o n C:
(I)
SUCCESSFUL(T)
(2)
RELIABLE(C) = (VTi, T2CD)(COMPLETE(TI, C)^COMPLETE(T2, C)D(SUCCESSFUL(TI) SSUCCESSFUL(T2)) )
(3)
VALID(C) = (VdG D)~OK(d) D (~[T: D)(COMPLETE(T, C)^'~SUCCESSFUL(T)~
=
( V t i T ) OK(t)
Fundamental T h e o r e m
(~[TC~D)(~[C)(COMPLETE(T, C)~RELIABLE(C) ^VALID(C)~SUCCESSFUL(T)) D (Vd ~ D) OK(d)
Figure
Formal
1
Definitions and the Fundamental
criterion is reliable if and only if every T satisf y i n g C O M P L E T E ( T , C) i s p r o c e s s e d s u c c e s s fully by F, or if every such T is unsuccessfully
p r o c e s s e d ( s e e F i g u r e 1). I n s h o r t t o b e r e l i a b l e , C m u s t i n s u r e s e l e c t i o n of t e s t s t h a t a r e
consistent in their ability to reveal errors,
as
opposed to necessarily being able to detect all
errors.
N o t e t h a t i f C i s r e l i a b l e , i t i s only--"
necessary to test one complete set of test data-no further information will be derived from testing other complete sets of test data.
Theorem
of T e s t i n g
s t r a t e s t h a t t e s t s s a t i s f y i n g C O M P L E T E ( T , C}
where C satisfies RELIABLE and VALID are
"thorough" in the appropriate sense.
Note that
proving a data selection criterion to be reliable
and valid, and then finding and successfully executing a complete test satisfying this criterion
is just a way of proving the correctness
of the
program.
In effect, the theorem states that
i n s o m e c a s e s , a t e s t i.s_ a p r o o f o f c o r r e c t n e s s .
The proof of a selection criterion's validity
and reliabil~ty is easy in some cases.
For example, if C is defined so the only complete test is
a n e x h a u s t i v e o n e , i. e . , i f C O M P L E T E ( T , C ) D
(T = D), t h e n C o b v i o u s l y s a t i s f i e s R E L I A B L E
and VALID. Another interesting example is
w h e n C i s u n s a t i s f i a b l e b y a n y d e D. T h e n T
will be empty, i.e., no testing will be done,
In this case, such a C clearly satisfies RELIABLE(C).
Proof of Cts validity, however, is
more difficult. In fact, such a proof exists if
and only if the program contains no errors.
Hence in this case, proof of C'S validity is
equivalent to a direct proof of the program*s
correctness.
V a l i d i t y , i n c o n t r a s t to r e l i a b i l i t y , c u s t o m arily refers to the abilit 7 to produce meaningful
r e s u l t s , r e g a r d l e s s of h o w c o n s i s t e n t l y s u c h
results are produced.
Hence a test data selection criterion is valid to the extent it does not
f o r b i d s e l e c t i o n o f t e s t d a t a c a p a b l e of r e v e a l i n g
some error.
C i s v a l i d if a n d o n l y if f o r e v e r y
e r r o r in a p r o g r a m ,
there exists a complete set
o f t e s t d a t a c a p a b l e of r e v e a l i n g t h e e r r o r .
(3ee
Figure 1 for a formal definition. ) But note:
validity does not imply that every complete test
i s n e c e s s a r i l y c a p a b l e of d e t e c t i n g e v e r y e r r o r
in a prograrr~
This is the case only if the data
selection criterion is reliable as well as valid.
Validity only implies it is possible to selectdata
that will reveal an error; it does not guarantee
that such data will be selected.
A p r o o f of v a l i d i t y i s t r i v i a l , h o w e v e r , i f
C does not exclude any member of D from some
s e t o f t e s t d a t a , i. e. , i f i t c a n b e s h o w n t h a t f o r
a l l d ~ D, t h e r e e x i s t s a T c o n t a i n i n g d a n d s a t i s f y i n g C O M P L E T E ( T , C). I n t h i s c a s e , s o m e
testing must be performed,
and in general, the
proof of Cts reliability is not trivial.
The remainder of the paper will concentrate on what
must be known about programs to insure reliability in this case, and in Section 5 we will give
some guidelines for finding non-trivial reliable
test data selection criteria.
A l s o i n S e c t i o n 5,
we will give a specific example of a type of data
selection criterion and its correspondfng deftnitions of COMPLETE, RELIABLE, andVALID.
But to motivate this example, we first need to
look at sources of errors in programs,
both in
general (Section 1.2) and with reference to example programs
( S e c t i o n 2).
T h e f o r m a l d e f i n i t i o n s of R E L I A B L E a n d
VALID given in Figure 1 merely state precisely
what we already have said informally about
RELIABLE(C) and VALID(C). To show the
utility of the definitions, we use them in stating and proving the Fundamental Theorem on
w h i c h a l l t e s t i n g i s b a s e d ( s e e F i g u r e 1). I t s
proof is simple:
Assume there exists some d ~ D for which
F f a i l s (i. e. , - v OK(d)). T h e n V A L I D ( C )
i m p l i e s t h e r e e x i s t s a c o m p l e t e s e t of t e s t
data, T, that is not successful. RELIABLE(C)
implies that if one complete test fails, all
fail. But this contradicts the theorem~s
premise, i. e., that there exists a complete
test that is successfully executed. Q.E.D.
1.2
The theorem is called "basic" not because
it is deep--obviously we have constructed our
definitions of RELIABLE, VALID and COMPLETE
to permit the theorem to be proved.
Instead the
t h e o r e m i s b a s i c b e c a u s e of t h e i n s i g h t i t g i v e s
into the concepts of test reliability and validity,
and these concepts are basic to the construction
of "thorough" tests.
The theorem merely demon-
Types of Program
Errors
In general, software errors can be classed
as performance
errors (failure to produce results within specified or~desired time and space
limitations) and logic errors (production of incorrect results independent of the time and
space required).
It is useful to distinguish
various types of logic errors:
494
s const~'t~ction errors (failure to satisfy a
specification through error in an implementation);
@ specification errors (failure to write a
specification that correctly represents a
design);
@ design errors (failure to satisfy an understood requirement);
• requirements
errors (failure to satisfy the
real requirement).
ful for other purposes, e.g. p to understand the
effect of a programming language on software
reliabillty [Rubey (1968)], or to understand why
e r r o r s occur [Boehm (1975)]. But insight into
test reliability is given by the proposed classification. For example, consider the test data
selection criterion, "Choose data to exercise
a l l s t a t e m e n t s a n d b r a n c h c o n d i t i o n s i n a n i~mp l e m e n t a t i o n . " I n e v a l u a t l n g t h e r e l i a b i l i t y of
this criterion, we would ask, "WiUall construction,
specification, design, and requirements
errors
always be detected by exercising programs with
data satisfying this criterion?"
Clearly, if a
design error, for example, is manifested as a
missing path in an implementation,
then this
criterion for test data selection will not be reliable.
So o u r t y p o l o g y o f i m p l e m e n t a t i o n
errors is useful to help understand factors
impairing test reliability.
2. E x a m p l e s o f P r o g r a m E r r o r s
Thorough tests must be able to detect errors
a r i s i n g f o r a n y of t h e s e r e a s o n s , a n d t h i s m e a n s
that test data selection criteria must reflect information derived from each stage of software
development.
Ultimately, however, each type
of l o g i c e r r o r i s m a n i f e s t e d a s a n i m p r o p e r e f fect produced by an implementation,
and for this
reason, it is useful to categorize sources of
errors in implementation terms:
• Missing Cont'rbl Flow Paths - This type of
error arises from failure to test a particular
condition, and hence result in the execution (or
non-execution) of inappropriate actions.
For
example, failure to test for a zero divisor before executing a division may be a missing path
error.
Other examples will be given later. This
t y p e of e r r o r r e s u l t s f r o m f a i l i n g t o s e e t h a t
some condition or combination of conditions req u i r e s a unicLue s e q u e n c e o f a c t i o n s t o b e b a n d i e d
properly.
When a program contains this type of
error, it may be possible to execute all control
flow• p a t h s t h r o u g h t h e p r o g r a m w i t h o ~ - d e t e c t i n g
the error.
This is why exercising all program
paths does not constitute a reliable test.
In this section we wiU discuss errors in two
programs that have appeared in the published
l i t e r a t u r e t o i l l u s t r a t e t h e k i n d s of p r o b l e m s
t e s t i n g m u s t d e a l w i t h a n d to l a y t h e g r o u n d w o r k
for our proposed testing methodology.
2. 1
Example
I:
ASimple TextReformatter
In the article "Programming
by Action
C l u s t e r s , ,I p . N a u r (1969) p r e s e n t s t h e f o U o w i n g
problem:
"Given a text consisting of words separated
by BLANKS or by NL (new line) characters,
convert it to a line-by-line form in accordance with the following rules:
• Inappropriate Path Selection - This type of
error occurs when a condition is expressed incorrectly, and therefore, an action is some-times performed (or omitted) under inappropriate conditions.
For example, writing IF A
instead of IF A AND B means that when A is
true and B false, an inappropriate action will
be taken or omitted.
When a program contains
this type of error, it is quite possible to exercise all statements and ali branch conditions
without detecting the error.
This error canalso
o c c u r n o t m e r e l y t h r o u g h f a i l u r e to t e s t t h e r i g h t
combination of conditions, but through failure to
s e e t h a t t h e m e t h o d o f t e s t i s n o t a d e q u a t e , e. g . ,
testing for the equality of three numbers by writi n g ( X + Y + Z ) / 3 = X.
(I) line breaks must be made only where
the given text has BLANK or NL;
(2) e a c h l i n e i s f i l l e d a s f a r a s p o s s i b l e
as long as
(3) n o l i n e w i l l c o n t a i n m o r e t h a n h 4 A X P O S
characters. "
N a u r ' s p u r p o s e i s , i n p a r t , to j u s t i f y a n a p p r o a c h
to algorithm construction "on the basis of General
S n a p s h o t s n e e d e d t o p r o v e t h e a l g o r i t h m . ,i H e
does not intend to present a formal proof, but he
does present "prescriptions"
(assertions) and uses
t h e m t o g u i d e a n d j u s t i f y t h e c o n s t r u c t i o n of a c t i o n
clusters in a top-down manner.
So h e r e w e h a v e
~a p u b l i s h e d a r t i c l e w h i c h a t t e m p t s t o u s e t h e t o p down design method and the guidance of program
p r o v i n g , b u t t h e p r o g r a m ( s e e F i g u r e 2) t u r n s o u t
to have at least seven problems.
• Inappropriate or M.Issing Action - Examples
are calculating a value using a method that does
n o t n e c e s s a r i l y g i v e t h e c o r r e c t r e s u l t (e. g . ,
W * W i n s t e a d o f W+W), o r f a i l i n g t o a s s i g n a v a l u e
to a variable, or calling a function or procedure
with the wrong argument list. Some of these
errors are revealed when the action is executed under any circumstances.
Requiring all
statements in a program to be executed will
catch such errors.
But sometimes,
the action
is incorrect only under certain combinations
of conditions~ in this case, merely exercising
the action (or the part of the program where a
m i s s i n g a c t i o n s h o u l d a p p e a r ) wi~l n o t n e c e s sarily rev&al the error.
F o r exa~mple, t h i s i s
t h e c a s e i f W ~ W i s w r i t t e n i n s t e a d o f W+ W.
N__I: T h e p r o g r a m d o e s n o t t e r m i n a t e w h e n t h e
end of the given text is reached, although Naur
provides for termination (through the undefined
language construct "Alarm") if a word containing
more than h/AXPOS characters
(an oversizeword)
is seen.
(Note that in this case the specification
cannot be satisfied. ) Non-termination on end of
text will, of course, be discovered with any test
data not containing an oversize word. The effect
of processing an oversize word, however, would
only be seen if test data contained such a word:
A reliable test methodology will insure that oversize words are presented, to the program (even
though they are not mentioned inthe specification),
since such data are not excluded from the prog r a m V s i n p u t d o m a i n a n d t h e p r o g r a m WiU n o t
necessarily process oversize words correctly if
it processes shorter words correctly.
This classification of errors is useful because
our goal is to detect errors by constructing appropriate tests.
Other classifications are use-
495
bufpos := 01
outcharacter(LF);
fill := 01
next character :
incharacter (CW)
i f CW = B L A N K V C W = LF
then b e g i n
i_f fill + 1 + b u f p o s < M A X P O S
then b e g i n
o u t c h a r a c t e r (BIJOU) ;
fill := fill + 1 end
els,e b e q i n
o u t c h a r a c t e r (LF) ;
fill := 0 end;
f o r k := 1 step 1 until b u f p o s d ~
o u t c h a r a c t e r (buffer [k]);
fill := fill + b u f p o s ;
b u f p o s := 01
end
else
i_f b u f p o s = M A X P O S
th
e__n A l a r m
else b e g i n
b u f p o s := b u f p o s + i;
b u f f e r [bufpos]
:= CW end;
~_o t o next character;
Figure
1 Alarm := .false;
2 bufpos := 01
f£11 := O;
3
4 repeat
£ n c h a r a c t e r (CW) ;
5
I f C W = B L V CW = N L V C W = ET
6
then
7
if bufpos ~ 0
8
t h e n ,be~!n
9
I f fill + ' b u f p o s < M A X P O S A fill # 0
10
then b e g i n
11
o u t c h a r a c t e r (BL) ;
12
fill := fill + 11 end
13
else b e g i n
14
o u t c h a r a c t e r (NL) ;
15
fill := 0
end;
16
for k := 1 step 1 u n t i l b u f p o s d_o
17
o u t c h a r a c t e r (buffer [k]);
18
fill := fill + b u f p o s ;
19
b u f p o s := 0
end
20
else
21
i_f b u f p o s = M A X P O S
22
.then A l a r m := true;
23
.e.ls e b e g i n
24
b u f p o s :=. b u f p o s + I ;
25
b u f f e r [bufpos]
:= C W .end
26
27 ~intil A l a r m V C W = ET~
Figure
Z
Corrected
Naurls original program
3
version of Naur's program
a s s u m p t i o n a b o u t t h e f o r m o f t h e i n p u t (i. e . ,
no c o n s e c u t i v e BLANKS o r N L c h a r a c t e r s ) a n d
do s o n o t r e v e a l t h e e r r o r .
How can this sort
of error be discovered through a systematic
a p p r o a c h to t e s t i n g ?
N2: T h e l a s t w o r d i n t h e t e x t w i l l n o t be o u t
put unless it is followed by a BLANK or NL.
N3: A b l a n k w i l l a p p e a r b e f o r e t h e f i r s t w o r d
on the f i r s t line e x c e p t w h e n the f i r s t w o r d is
exactly MAXPOS characters long. This can
c a u s e a v i o l a t i o n of c o n s t r a i n t (2) f o r t h e f i r s t
line. The reason for this error is clear.
After
creating action clusters for a word buffer, Naur
makes the following assertion: "The input chara c t e r p r e c e d i n g t h e o n e h e l d i n b u f f e r [1] w a s a
B L A N K q r N L . T h i s h a s n o t b e e n o u t p u t " ( p . 252).
This assertion is false for the first word and is
never disproved.
T h u s a b o u n d a r y type of c a s e
(first character, first word) causes a proof
error.
Of c o u r s e t h i s e r r o r w i l l b e f o u n d f o r
any test data whose first word is less than
MAXPOS characters long.
N e : If t h e f i r s t w o r d o f t h e i n p u t
p r e c e d e d by a BL or NL, the output
rain either two blanks preceding the
or a line c o n t a i n i n g j u s t two b l a n k s ,
c o n s t r a i n t (2).
text is
will confirstword
violating
N7: T h e s p e c i f i c a t i o n s u s e
NL as the newline character but the program uses LF. This
e r r o r p r o b a b l y a r i s e s f r o m f a i l u r e to p r o o f r e a d , hut c o u l d a l s o b e t r a c e d to a f a i l u r e to
specify the character set of the problem.
If
LF and NL are distinct characters,
then any
input text containing a NL or LF character
will reveal the problem.
N4: W h e n t h e f i r s t w o r d i s M A X P O S c h a r a c t e r s long, it will be p r e c e d e d by two line b r e a k s
in the output, even though an empty line violates
c o n s t r a i n t (2).
What can we conclude from this example ?
N5: No p r o v i s i o n i s m a d e f o r p r o c e s s i n g
s u c c e s s i v e a d j a c e n t b r e a k s (e. g . , t w o b l a n k s ) .
This error arises because a word is defined as
the characters (other than NL or BLANK) appearing between successive NL or BLANK
characters,
and w o r d s of z e r o l e n g t h a r e s i m ply not c o n s i d e r e d in any d i s c u s s i o n of the
program's assertions.
S p e c i f i c a t i o n (Z) r e quires as many words as possible on a line,
and this specification makes no sense if zerolength words are permitted.
So a n i m p o r t a n t
case has not been considered in either the program, the input description, or the program's
informal proof. Naur's suggested test data for
t h i s p r o g r a m a p p e a r to m a i n t a i n t h i s i m p l i c i t
496
1.
Top-down construction and very informal
p r o o f s do n o t a l w a y s p r e v e n t p r o g r a n ~
errors.
Nor does the reading of the desc r i p t i o n of t h e p r o g r a m g u a r a n t e e t h a t
errors will be caught. Presumably this
paper went through some review process
and was proofread, yet the errors persisted.
The reviewer in Computing
R e v i e w s ( R e v i e w 19,420) m e n t i o n e d only
the gratuitous blank preceding the first
w o r d ( e r r o r N3). L o n d o n (1971) c o r r e c t e d
e r r o r s N1, N3, N4, a n d N7 b u t n o t e r r o r s
NZ, N5, a n d N6, e v e n t h o u g h h e " p r o v e s "
h i s v e r s i o n of the p r o g r a m c o r r e c t .
2.
If t h i s p r o g r a m had b e e n c o d e d and r u n
on s o m e t e s t data, s u c h a s t h a t u s e d to
i l l u s t r a t e t h e p r o g r a m o u t p u t (p. Z51),
e r r o r s N1, NZ, N3, a n d N7 w o u l d h a v e
been detected.
E r r o r s N4, Nb, a n d N6
would not have been revealed.
N7:
Since w e w i s h to u s e t h i s e x a m p l e to i l l u s t r a t e
other types of possible errors and then again in
Section 4 to illustrate a general method for selecting test data, we will clean up the specifications
and program:
Note that breaks preceding the first word
of the text will also be ignored.
L F w a s s y s t e m a t i c a l l y c h a n g e d to NL
throughout the program.
Note how sometimes several error symptoms are
corrected with a single program modification; it's
o f t e n not c l e a r w h e t h e r the n u m b e r of e r r o r s in a
p r o g r a m s h o u l d b e m e a s u r e d by the n u m b e r of c o r rections needed to produce a correct program, or
by the n u m b e r of s y m p t o m s d i s c o v e r e d .
G i v e n an input t e x t h a v i n g the f o l l o w i n g p r o p e r ties:
I I : I t i s a s t r e a m of c h a r a c t e r s , w h e r e t h e
characters are classified as break and
non-break characters.
A break charact e r is a BL (blank), NL (new line i n d i c a tor), or ET (end-of-text indicator).
IZ: T h e f i n a l c h a r a c t e r i n t h e t e x t i s E T .
I3: A w o r d i s a n o n e m p t y s e q u e n c e o f n o n break characters.
I4: A b r e a k i s a s e q u e n c e o f o n e o r m o r e
break characters.
(Thus the input can be viewed as a sequence
of words separated by breaks with possibly
leading and trailing breaks, and ending with
ET. )
We will now use this program to show how
t e s t i n g b a s e d s o l e l y on k n o w l e d g e of a p r o g r a m ' s
i n t e r n a l s t r u c t u r e c a n n o t l e a d to r e l i a b l e t e s t s .
To m a k e our d i s c u s s i o n m o r e r e a d i l y u n d e r s t a n d a b l e a n d to l a y t h e g r o u n d w o r k f o r f u r t h e r d i s c u s s i o n i n S e c t i o n 4, w e f i r s t c a s t t h e c o r r e c t e d p r o gram into the form of a limited entry decision
table [King(1969), Pooch(1974)] (see Table i). All
p r e d i c a t e s of the p r o g r a m a r e w r i t t e n in the c o n d i tion stub and all assignment and input/output statements in the action stub. The relevant outcomes
o f t h e p r e d i c a t e s a r e r e p r e s e n t e d a s Y, N, (Y), (N),
o r - - , w h e r e (Y) m e a n s t h e o u t c o m e i s n e c e s s a r i l y
t r u e , (N) m e a n s t h e o u t c o m e i s n e c e s s a r i l y f a l s e ,
and -- means the outcome can be either true or
false. A rule is a single column in the table and
s p e c i f i e s a s e q u e n c e of a c t i o n s to be p e r f o r m e d
f o r a p a r t i c u l a r c o n d i t i o n c o m b i n a t i o n , i. e . , a
p a r t i c u l a r s e t o f o u t c o m e s of t h e p r e d i c a t e s . F o r
s o m e r u l e s , the o u t c o m e s of c e r t a i n p r e d i c a t e s
imply that the outcomes of other predicates are
e i t h e r d e t e r m i n e d (and so of no i n t e r e s t ) o r a r e
irrelevant.
F o r e x a m p l e , w h e n C4 i s t r u e , C6
i s n e c e s s a r i l y f a l s e , a n d s o i n r u l e 1, C6 n e e d
not be tested.
Predicates whose outcomes are
r e p r e s e n t e d a s (Y), (N), o r - - a r e n o t a c t u a l l y
evaluated in the program represented by the decision table. For example, the table shows that
w h e n C1 i s Y, c o n d i t i o n s CZ a n d C6 a r e n o t e v a l u a t e d (see King (1969)).
The program's output should be the same
s e q u e n c e of w o r d s a s in the i n p u t w i t h the
following properties:
O1 : A n e w l i n e s h o u l d s t a r t o n l y b e t w e e n
w o r d s and at the b e g i n n i n g of the output
t e x t , if any;
OZ: A b r e a k i n t h e i n p u t i s r e d u c e d to a
s i n g l e b r e a k c h a r a c t e r in the output;
03: As many words as possible should be
p l a c e d on e a c h l i n e (i. e. , b e t w e e n
successive NL characters);
0 4 : No l i n e m a y c o n t a i n m o r e t h a n M A X P O S
characters (words and BLs);
0 5 : A n o v e r s i z e w o r d (i. e. , a w o r d c o n t a i n ing more than MAXPOS characters)
should cause an error exit from the
p r o g r a m (i. e. , a v a r i a b l e A l a r m s h o u l d
have the value TRUE);
The fact that the conditions specified in the
condition stub are not all independent is represented by the condition constraints given at the
r i g h t s i d e of the t a b l e ; it is t h e r e , f o r e x a m p l e ,
t h a t t h e r e l a t i o n b e t w e e n t h e t r u t h of C4 a n d t h e
f a l s i t y o f C6 i s a s s e r t e d .
The predicates assoc i a t e d w i t h t h e a c t i o n s t u b s (e. g. , ( C 1 V C Z ) A C 3 )
indicate under what conditions the associated
a c t i o n i s to b e p e r f o r m e d .
Specifying these conditions is not necessary, but provides useful
r e d u n d a n c y in c h e c k i n g t h e c o r r e c t n e s s o f a
table.
The c o r r e c t e d v e r s i o n of N a u r ' s p r o g r a m a p pears in Figure 3. First let's look at the correct i o n s f o r e r r o r s N1 t h r o u g h N7.
N1, N2: T h e e n d l e s s f l o o p c o n s t r u c t e d w i t h a
goto in Naur's program has been replaced
with a repeat-until having Alarm as a
Boolean variable and ET as an end-of-text
indicator.
N3: T h e c o n d i t i o n f i l l ~ 0 h a s b e e n c o n j o i n e d
with the condition fill + bufpos < MAXPOS
to p r e v e n t the output of a BL b e f o r e the
f i r s t w o r d and to p r o d u c e a N L i n s t e a d .
The condition fill ~ 0 holds only for the
first line, since every other line will contain at least one word. Equally well, we
c o u l d h a v e i n i t i a l i z e d fill to the v a l u e
lVIAXPOS, i n s u r i n g t h a t f i l l + b u f p o s <
IV[AXPOS w o u l d b e f a l s e t h e f i r s t t i m e .
N4: L i n e Z of N a u r ' s p r o g r a m , o u t c h a r a c t e r
(NL), i s r e m o v e d s o t h a t o n l y o n e N L
c h a r a c t e r w i l l p r e c e d e t h e f i r s t w o r d of
the output text.
N5, N6: A n e x t r a p r e d i c a t e , b u f p o s ~ 0, p r e vents output when two consecutive breaks
o c c u r ; the f i r s t b r e a k f o r c e s the w o r d to
be output and b u f p o s to be r e s e t to z e r o .
The d e c i s i o n t a b l e f o r m a t m a k e s it e a s i e r to
s e e how v a r i o u s s e t s of t e s t data s a t i s f y v a r i o u s
test data selection criteria° Below the program
a r e w r i t t e n f o u r s e t s of t e s t d a t a . E a c h s e t a s s u m e s
I~AXPOS = 3 and meets some test data selection
c r i t e r i o n b a s e d on the p r o g r a m ' s s t r u c t u r e .
D1
e x e r c i s e s a l l s t a t e m e n t s (by e x e r c i s i n g r u l e s 3, 5, 9
a n d 1 0 ) , DZ a l l s t a t e m e n t s a n d c o m p o s i t e p r e d i c a t e s ,
a n d D3 a l l s t a t e m e n t s a n d i n d i v i d u a l p r e d i c a t e s . ~
~An i n d i v i d u a l p r e d i c a t e is c o n s i d e r e d to b e e x e r cised when it is necessessarily
evaluated for
s o m e data and it t a k e s on b o t h t r u e and f a l s e
values. For example, data satisfying rules 1-4
a r e n o t c o n s i d e r e d t o e x e r c i s e CZ b e c a u s e CZ n e e d
n o t b e e v a l u a t e d . CZ i s e x e r c i s e d b y d a t a s a t i s f y i n g o n e o£ r u l e s 5 - 8 a n d e i t h e r r u l e 9 o r r u l e 10.
497
Table 1
DECISION T A B L E
Initial c0nditions:.
Cl:
C2:
C$:
CA:
C5:
C6:
AI a:
b:
A2 a:
b:
A3 a:
b:
c:
REPRESENTATION
"7 C3A-TCSA
CW= B L V C W = NL
CW = ET
OF PROGRAM
CW = incharacter
1
zl3
4
S
y
yIY
Y
(N))N', (N] (N)
819
10
N
N
NiNIN
X
Y
Y
Y
N
N
-
'-eC3~--1C6
N
C4~--~C6; --~C4~ (C3 VCS)
C3A "-7 C5 ~ C4; ~ C5 A C6=) "-7C4
C6 ~ C 3 ; C6~--~C4
N
Y
Y
Y
N
I'
Y
Y
N
(Y)
Y
Y
N
-
Y
,Y
N
-
(N)-
Y
N
-
Y
N
outcharacter (BL)
/ill := /111 + I
outcharacter (NL)
till := 0
-
(N) (N) (N)
X
X
-
(N) Y
Ix, !x
x
/ o r k := 1 untilbufpoJ
xlx'
x X x,
Ix~x
xlxx
fill := fill + bufpos
bufpos := 0
X
X
X
X
k4:
Alarm
A5 a:
b:
bufpol := bufpos + 1
buffer [bufposl := CW
A6 a:
b:
incharacter (CW)
Repeat table
A7:
Exit table
DI.1
DI.2
Test Data
A,AoA.A. ET
A.A,A. BL, B. BL. C, ET
DZ.I
DZ.2
A . A . A . A . ET
A. A,A. BL. B, BL, C. NL. ET
D3.1
D3.Z
A . A . A . A . ET
A, BL. B, NL. C. NL. ET
D4.1
D4.2"
D4. 3
D4.4
D4.5
A.A.A.A.ET
A. BL0 BL. B. BL. C. NL. ET
A. ET
A. BL, BoET
A. BL, B. NL. C0 BL, D. D. D. ET
X
X
X
X
C2
C2 ~ - 7 C l
C1~'-~
(CIV C2) AC3 A C4 A C5A'~C6
X
X
x
= FALSE.
77
f i l l + bufpo~ < MAXPOS
flU ~ 0
bufpos = MAXPOS
-
(CW) AAlarm
&
bufpoe ~ 0
(N) (N)
AND TEST DATA
X
X
(C1 VCZ) AC3 A (-~C4V-~C5)
(Cl VCZ) AC3
X
X
"7 Cl A "s CZ AC6
:= T R U E
"7CIA "7 C2 A"I C6
X!X
XiX
X
X
X
X
ClV (--~ClA--TC2A'-~C6)
X
X
X
:, CZ V(--TC1 AC6)
X X
Rule Exercised
j
Xs
X
X
X
X
X
X
N
X
I0. I0, I 0 . 9
X
X
10, 10, 10, 3, 10, 3, 10, 1, 8
X
X X
X
:X
X
Ixllx
BI.
Ill*
.N' .|
.
I~L
X
X N L.X I l l , X I l L
X
D3 a l s o e x e r c i s es all loop iteration paths, i.e., all pa ths
through a loop that are possible on some iteration
of the loop; by exercising all individual predicates
a s w e l l , D3 e n s u r e s a l l l o o p e x i t i n g c o n d i t i o n s a r e
also tested.
D4 exercises
all rules of the table.
Note that exercising all rules is equivalent to
exercising
each statement under all condition
combinations considered relevant by the writer
of the program.
Table Z shows more clearly
what combination of rules need to be exercised
to satisfy the various test data selection criteria.
as weU as how the data actually satisfy the criteria.
F o r e x a m p l e , t h i s ta~ble s h o w s t h a t t h e c o m p o s i t e
p r e d i c a t e C 4 A C 5 (i. e . , f i U + b u f p o s < l v I A X P O S ^
f i l l ~0) i s t r u e f o r a n y d a t a e x e r c i s i n g r u l e s I
a n d 5 a n d f a l s e f o r a n y d a t a e x e r c i s i n g r u l e s Z,
3, 6 o r 7. F o r d a t a e x e r c i s i n g a n y o t h e r r u l e ,
the v a l u e o f t h i s c o m p o s i t e
predicate is irrelevant.
T h e a c t u a l t e s t d a t a e x e r c i s e r u l e s I a n d 3.
498
Rule Sequence Exercised
I0, I0. I0.9
I0. I0. I0.3, I0.3, I0.5
X
X
X
X
X
I0. I0. I0.9
I0,2, I0, I. I0, 3, 8
10,10,10,9
10, 2, 4, 10, 1, 10, 3, 8
10,6
10,2, 10, 5
10,2,10,1,10,3,10,10,10,7
We wish to show that none of these exercising criteria are completely reliable,
i.e.,
itis
p o s s i b l e to e x e r c i s e a p r o g r a m
containing an
error using any of these criteria without necessarily discovering
the error.
This shows
that tests based solely on a programts
internal
structure are unreliable; their success is poor
evidence that a program
contains no errors.
Five examples of errors
are summarized
in
T a b l e 3. T h e e r r o r s
can be characterized
as
follow s:
1) a n i n c o r r e c t p r e d i c a t e ( f i l l + b u f p o s ~ M A X P O S
instead of fill + bufpos < MAXPOS) causing
rules 1 or 5 to be selected under circumstances where rules 3 or 7 should have been
selected.
This is an inappropriate
path selection type of construction error.
It w i l l n o t b e
detected by any of the four test data sets. To
Table 2
1.
R U L E S E X E R C I S E D T O SATISFY V A R I O U S C O M P L E T E N E S S
Composite Predicates
Composite
Rules To Be Exercised
Predicates
(I-8)
CI V C 2
Alarm VC2
C3
(5-9)
(1-3, 5-7)
(1, 5)
|
(9)
C4 A C5
J
C6
Z.
Rules Exercised By D2
TRUE
FALSE
FALSE
(9, 10)
(1-4, 10)
(4.8)
(z. 3, 6.7)
(10)
1, 3
8, 9
lr3
1
9
Rules Exercised By DI
TRUE
9, 10
1, 3, 10
8
3
10
FALSE
3
9, 10
5.9
10
3,5
5
3
9
10
Individual Predicates
Rules Exercised By D3
TRUE
FALSE
i, 2
3, 8, 9, I0
R u l e s To B e E x e r c i s e d
Individual
Predicates
TRUE
FALSE
CIBL
(I-4)
(I-10)
CINL
(1-4)
(5-10)
3
cz
(5-8)
(9, lO)
8
c3
(l-3, 5-7)
c4
(1, z, 5, 6)
i
Rules Exercised ByD2
TRUE
FALSE
I
3, 8, 9, I 0
8, 9, I0
3
8, 9, 10
9,10
8
9, 10
(4, 8)
1, Z. 3
8
1,3
8
(3, 7)
1, Z
3
1
3
(2, 6)
1
Z
1
C5
l
(1, 5)
C6
!
(9)
(10)
9
10
9
10
(9)
(10)
9
10
9
10
Alarm
3.
I
TRUE
CRITERIA
Loop Iteration Paths
Distinct Paths
Rules Exercised By D 3
1
(I, 5)
(Z, 3, 6, 7)
2,3
(4, 8)
(9)
8
9
(lO)
10
Table 3
E F F E C T OF ERRORS
Error in Program
I.
Change
f i l l + bufpos
< MAXPOS
to
f i H ÷ bufpos
< ~.AXPOS
2. O m i t l i n e 13
flU:= f i l l + 1
3. O m i t bufpos ~ 0
t e s t (line 8)
(This i s N a u r ' s
e r r o r N5, 6. )
4. O m i t
from
(This
error
flU 4 0
l i n e 10
is Naur's
N3. )
5. O m i t CW= E T
f r o m line 6
E r r o r Type
inappropriate path
selection
missing action
missing path
E f f e c t on T a b l e
C o n d i t i o n C4
changed similarly.
No m a r k on
Action Alb.
C o n d i t i o n C3 l i n e
w i n be r e m o v e d a n d
therefore rules 4
and 8 can be dropped.
inappropriate path
selection
C o n d i t i o n C5 r e m o v e d
from table. Rules
2 and 6 are eliminated,
being subsumed under
r u l e s I a n d 5.
inappropriate path
selection
R u l e s 5-8 e l i r a i n a t e d
and rule 9 is modified.
See T a b l e 4.
499
Table
DECISION
TABLE
REPRESENTATION
Illzl3i4
I
Cl:
CW=BLVCW=NL
CZ:
CW = ET
C3:
C4:
bufpos ~ 0
C5:
C6:
bufpos = M A X P O S
A 1 a:
b:
IY
N
fill +,bufpos < MAXPOS
-
(Y)
(N)
fill ~ 0
Y
N I
-
-
-
(N)
Y'
outcharaeter
(BL)
fill := fill + I
AT:
Exit table
. _C4~-7C6;'-~C4~ (C3 VC5)
. '-7C3A'-IC5=~C4;':'7C5
N
, C6DC3;
AC6~--YC4
C6~-~C4
C I A C3 A C4 AC5A--wC6
X
X
C1 AC3 A (-'~C4V-7C5)
IX
X
X
X
X
C1 AC3
X
X
X
X
X
X
-~C1A C6
X
X
X
,,
buffer [bufpos] := CW
i n c h a r a c t e r ICW)
Repeat table
CZ=:~--tCl
~-IC3~'-vC6
X
bufpos := bufpos + 1
A6 a:
: b:
CI~'wCZ
X
A l a r m := TRUE
b:
N.,
N
N
f o r k := 1 u n t l l b u f p o s
"--outcha r a c ~ (buffer [k])
fill := fill + bufpos
bufpos := 0
A 5 a:
N
Y
Y
N
(N) (N) -
5
5 1 6 1 7 el9' lO
I
Y
Y
A3a:
A4:
YIYIY
P R O G R A M CONTAINING ERROR
Y
Y
o u t c h a r a c t e r (NL~
fill:= 0
c:
I
I ( N ) ) N } (N) (N)
AZ a:
b:
b:
4
OF
X
X
X
X
X
X
X
X
X
X
•-rC1A-7 Cb
X
X
C1 V ('-'tC6A--sCZ)
o
I
I
X
X
cZ v ( ~ C l ^ C 6 )
r e p l a c e d w i t h f i l l := M A X P O S ) . T h i s e r r o r
is easily detected by any data whose first
word is less than MAXPOS characters long.
N o t e t h a t D1 a n d D2 do n o t d e t e c t t h i s e r r o r
precisely because the first word is exactly
MAXPOS characters long. In fact, data set
DZ e x e r c i s e s a l l i n d i v i d u a l p r e d i c a t e s , i n t e rior loop paths, and statements in the erroneous program without showing this error.
It is a l s o r e a d i l y p o s s i b l e to d e v i s e d a t a to
exercise all rules in a decision table representation of the erroneous program without
revealing the error.
show this error, the loop must be executed
w i t h f i l l + b u f p o s = M A X P O S a n d f i l l ~ 0. T h i s
r e q u i r e s at l e a s t M A X P O S + 2 i t e r a t i o n s of
the loop and the word lengths must be just
right. Although this is an "oil-by-one" type
of error, not just any data such that fill+bufpos
= M A X P O S w i l l r e v e a l it, e . g . , D1.Z a n d
DZ.Z d o n o t d o s o .
Z) a m i s s i n g a c t i o n ( f i l l := f i l l ÷ 1) y i e l d i n g e s s e n tially the same effect as error 1 when there
a r e o n l y ttwo w o r d s o n a l i n e . T h i s e r r o r a l s o
w i l l not be d e t e c t e d by any of the f o u r t e s t d a t a
sets.
5) a m i s s i n g p r e d i c a t e (CW = E T i n l i n e 6) c a u s ing inappropriate path selection.
This is
e s s e n t i a l l y e r r o r N2. T h i s e r r o r w i l l n o t
b e f o u n d b y D2 o r D3. It a l s o i s n o t n e c e s sarily found by data exercising all rules in
the decision table representing the altered
p r o g r a m ( s e e f o r e x a m p l e , T a b l e 4; d a t a
D4.1 a n d D 4 . 2 e x e r c i s e a l l t h e r u l e s i n t h i s
table without revealing an error).
This
error shows that even when all conditions
relevant to the correct operation of a program appear explicitly in the program,, test
data selection criteria based solely on a
program's internal structure will not necessarily guarantee tests that will reveal the
error.
3) a m i s s i n g p r e d i c a t e ( b u f p o s ~ 0) y i e l d i n g a
m i s s i n g p a t h t y p e of e r r o r t h a t w o u l d not be
d e t e c t e d w i t h t h e D1 d a t a s e t . ( T h i s i s o n e o f
Naur's errors. ) Although the other data sets
would detect this error, different data sets
can be constructed that will not detect the
error and yet will satisfy the various exercising criteria for the erroneous program.
F o r e x a m p l e , D1 a s i t s t a n d s w o u l d e x e r c i s e
all statements and composite predicates in
t h e m o d i f i e d p r o g r a m ( s e e T a b l e 2 w i t h C3
eliminated).
A l s o , t e s t D4. 2 c o u l d t h e n
b e e l i m i n a t e d a n d t h e o t h e r D4 d a t a w o u l d
then be sufficient to exercise all rules, into.
rior loop paths, and individual predicates
without revealing the error.
T h i s a n a l y s i s of f i v e p o s s i b l e e r r o r s s h o w s
the unreliability of test data selected merely to
exercise all statements, compopite predicates,
individual predicates, loop iteration paths, or
rules in a program's decision table representation. These exarnDles of errors show that:
4) a m i s s i n g p r e d i c a t e ( f i l l ~ 0) c a u s i n g a n i n appropriate path selection. (This is also one
of Naur's errors; ) Alternatively, this error
could be characterized as an inappropriate
a c t i o n ( n a m e l y , f i l l := 0 i n l i n e 3 s h o u l d b e
5O0
Table 5
A L L F E A S I B L E R U L E S I M P L I E D BY T A B L E 1
$1L
CW - B L V C W • N L
C l ~ - I C2
~3.*
b u r as iI 0
•-~ C3:~ -'i C6
C4:
C5.,=.:~
f l - + bu,f o$ < M A X P O S
t i l l II 0
--i C3A --i CS ~ C4; ~ C5 A C 6 ~ "-~iC4
C6:
buf~__._.___...=...M__~A X P O 3
C6 ~ C 3 : C 6 ~ " I C 4
AI a:
Table I Rule Numborl
ou~charactor
BL
( C l V C 2 ) A C 3 A C4 A C S A ' V C 6
b:
~2 a:
b:
~3 at
C4~--tC6I ~C4=3 (C3 V C$)
f i l l := R I I ÷ 1
~
outcharactor
.
.
~
~
t i l l := f i l l + b _ ~ _ p . ~ _ ~
e:
but 08 :- 0
f~4.~
Alarm
:= T R U E
h5 a :
buf.o~
bur
b:
buffer
b:
( C l VC2) A C 3
k :u I u n t i l bu.fpo8
--outcharac~er
buffee" k
~
b:
&6 a :
( e l VC2) A C 3 A (-'JC4 V ' ~ C 5 )
f i l l :• 0
bu.f o s
lncharacter
--VCl A--IC2 A C 6
• + 1
'
"aCt A --'l C2 A--B C6
:= CW
~_~
C I V ~"~ C l A ~ C 2 / ~
C6)
Re e a t t a b l e
C2 V ( ~ C l
AT_,~___Extt t a b l e
Test I~bL
AC6)
l~alo S e q u e n c e E x e r c i s e d
Rules E x e r c i s e d
27 20 20 22
27,20.20 5
S 6 25.10
27 20 20 5 25 6 2$ 1 17
27,2,7,25.1,26°6,16
27,11
27.2. Z5.10
1) T o d e t e c t e r r o r s r e l i a b l y , i t i s i n g e n e r a l
n e c e s s a r y to e x e c u t e a s t a t e m e n t u n d e r
m o r e t h a n o n e c o m b i n a t i o n o f c o n d i t i o n s to
v e r i f y that its e f f e c t is a p p r o p r i a t e u n d e r all
circumstances.
Exercising any statement
just once is usually inadequate.
e f f e c t c o m b i n e s r u l e s 12, 13, 2"1, a n d ZZ to f o r m
r u l e 9' a n d r u l e s 10, 11, 1 4 - 1 8 to f o r m r u l e 8 ' . I f
f o r e r r o r s 3 a n d 4, c o n d i t i o n s C5 a n d C3 a r e r e tained in Table 5 even though they are not tested
in the p r o g r a m , t e s t data to d e t e c t t h e s e e r r o r s
w i l l n e c e s s a r i l y b e g e n e r a t e d i f a l l 27 c o n d i t i o n
combinations are exercised.
Errors 1 and 2 will
be found if the c o n d i t i o n f i l l + bufpos = M A X P O S
i s a d d e d to t h e t a b l e . D a t a to e x e r c i s e t h i s c o n dition in combination with all the others will
n e c e s s a r i l y r e v e a l e r r o r s 1 a n d 2. ( T h i s c o n d i t i o n a d d s o n l y s i x r u l e s to T a b l e 5 b e c a u s e i t
i s not i n d e p e n d e n t of the o t h e r c o n d i t i o n s . )
2) E q u a l l y w e l l , t h e s a m e p a t h t h r o u g h a l o o p
w i l l u s u a l l y h a v e to be e x e r c i s e d m o r e t h a n
once before the right combfnation of condit i o n s i s f o u n d to r e v e a l a m i s s i n g p a t h o r i n appropriate path selection error. For examp l e , e r r o r 5 ( s e e T a b l e 3) r e q u i r e s e x e r c i s i n g r u l e 8' o f T a b l e 4 w i t h b u f p o s ¢ 0 to s h o w
the error.
This path should also be exerc i s e d w i t h b u f p o s = 0 to g u a r d a g a i n s t s o m e
o t h e r e r r o r , e . g . , e r r o r 3, e v e n t h o u g h
bufpos's value is not even tested when executing r u l e 8'.
In s h o r t , the s e c r e t of r e l i a b l e t e s t i n g is to
f i n d a l l c o n d i t i o n s r e l e v a n t to a p r o g r a m l s c o r r e c t o p e r a t i o n a n d to e x e r c i s e a l l p o s s i b l e c o m b i n a t i o n s o f t h e s e c o n d i t i o n s . In S e c t i o n 4, w e
will illustrate how relevant conditions can be
discovered.
In s h o r t , a r e l i a b l e t e s t i s d e s i g n e d n o t s o m u c h
to e x e r c i s e p r o g r a m a t l a s
to e x e r c i s e a t e
under circumstances such that an error is detectable if one exists. Tests based solely on the
i n t e r n a l s t r u c t u r e of a p r o g r a m a r e l i k e l y to be
unreliable.
2.3
E x a m p l e 2:
An Exam Scheduler
This program is presented by Hoare in Dahl
(1972) a s a n e x a m p l e of t h e u s e o f v a r i o u s a b s t r a c t data s t r u c t u r e s and an i n f o r m a l p r o o f of
correctness.
To find data that reliably reveal errors like
those illustrated here, all the conditions relev a n t t o t h e c o r r e c t o p e r a t i o n of a p r o g r a m m u s t
b e k n o w n a n d t e s t d a t a m u s t b e d e v i s e d to e x e r c i s e a l l p o s s i b l e c o m b i n a t i o n s of t h e s e c o n d i t i o n s ,
w h e t h e r or not the p r o g r a m l s i n t e r n a l s t r u c t u r e
indicates certain condition combinations cause
d i f f e r e n t s e q u e n c e s o f a c t i o n s to b e p e r f o r m e d .
For example,, Tables 1 and 4 actually represent
27 d i s t i n c t r u l e s i f a l l " d o n t t c a r e " c o n d i t i o n s
a r e w r i t t e n out e x p l i c i t l y as d i f f e r e n t r u l e s .(see
T a b l e 5). T e s t d a t a r e q u i r e d to e x e r c i s e a l l 27
r u l e s w i l l d e t e c t e r r o r 5 b e c a u s e t h i s e r r o r in
T h e p r o b l e m i s to c o n s t r u c t a. t i m e t a b l e f o r
university examinations such that
1) T h e n u m b e r o f s e s s i o n s i s k e p t s m a l l , t h o u g h
no a b s o l u t e m i n i m u m i s s t i p u l a t e d .
Z) E a c h e x a m i s s c h e d u l e d f o r o n e s e s s i o n .
3) No e x a m i s s c h e d u l e d f o r m o r e t h a n o n e
session.
4)No session involves more than K exams.
S) No s e s s i o n i n v o l v e a m o r e t h a n h s t u d e n t s .
6)No student takes more than one exam in a
session.
501
-t
The solution p r o p o s e d b y H o a r e t r i e s a l l c o m b i n a t i o n s o f u n s c h e d u l e d e x a m s t h a t s a t i s f y (Z)
t h r o u g h (6) a n d s e l e c t s t h e c o m b i n a t i o n t h a t
m a x i m i z e s h.
p r o o f of c o r r e c t n e s s
bilit~r.
3.
a p p r o a c h to s o f t w a r e r e l i a -
Views on Testing
Having discussed some fundamental definitions
in the t e s t i n g a r e a and e x a m i n e d e x a m p l e s of t h e
p r o b l e m s to b e s o l v e d b y a n a p p r o a c h to t e s t i n g ,
we will now look at what others have said about
the goals, methods, and difficulties of program
t e s t i n g b e f o r e t u r n i n g o u r a t t e n t i o n to o u r p r o posed method.
The specifications and assertions, except
(1) a r e p r e c i s e l y a n d f o r m a l l y s t a t e d s u c h t h a t
a formal proof could be performed.
However,
a n e r r o r c r e e p s i n r e l a t e d to (1). It i s i m p o s s i b l e to s t a t e (1) p r e c i s e l y s i n c e t h e r e i s n o w a y
to p r e d e t e r m i n e the n u m b e r of s e s s i o n s , so it
i s a s s u m e d t h a t t h e s o l u t i o n m e t h o d w i l l do t h e
best it can. The error arises from a variable
" r e m a i n i n g " w h i c h i s to c o n t a i n all t h o s e e x a m s
w h i c h do n o t a p p e a r i n t h e t a b l e s o f a r , i. e. ,
unscheduled exams.
The error is manifested
in two ways:
3.1
"Exhaustive"
testing
"Software certification.., ideally.., means
checking all possible logical paths through
a p r o g r a m . " [ B o e h m (1973)]
"Exhaustive testing, i. e. , testing all possible
(input)... " [Poole (1973), p. 310]
1) P e r f o r m a n c e .
E x a c t l y 500 e x a m s a r e
a l w a y s p r o c e s s e d e v e n if t h e r e a r e f e w e r
t h a n 500 e x a m s t o b e s c h e d u l e d . S i n c e t h e
program is recursive and searches all comb i n a t i o n s of u n s c h e d u l e d e x a m s , the p e r f o r m
anc~ will be degraded by processing "empty
e x a m s " ( e x a m s f o r w h i c h no s t u d e n t i s r e g i s tered).
Exhaustive testing, defined either in terms of
program paths or a program's input domain (alt h o u g h t h e s e d e f i n i t i o n s a r e n o t e q u i v a l e n t ) S, i s
usually kited as the impractical means of achieving a reliable and valid test. The importance of
the cited statements is in the questions which
arise from them:
Z) V i o l a t i o n o f c o n s t r a i n t s .
Empty exams still
h a v e to b e s c h e d u l e d in s o m e s e s s i o n , s o i t
i s p o s s i b l e t h a t t h e l i m i t a t i o n i m p l i c i t i n (1)
could be violated unnecessarily.
That is,
the p r o g r a m a s it s t a n d s m a y be u n a b l e to
produce a solution where it should if empty
exams were not processed.
• Is t h e r e a w a y to t e s t a p r o g r a m w h i c h is
e q u i v a l e n t t o e x h a u s t i v e t e s t i n g (in t h e s e n s e
of b e i n g r e l i a b l e and v a l i d ) ?
• Is there a practical
tive testing ?
approximation
to e x h a u s -
• If so, how good is the a p p r o x i m a t i o n ?
T h i s e r r o r i s e a s y to c o r r e c t by i n i t i a l i z i n g " r e m a i n i n g " to i n c l u d e only t h o s e e x a m s f o r w h i c h
at least one student has registered.
It s h o u l d b e n o t e d t h a t e x h a u s t i v e t e s t i n g o f a
program:s input domain is a process not necess a r i l y g u a r a n t e e d to t e r m i n a t e .
There are some
p r o g r a m s w h o s e b e h a v i o r (e. g . , w h e t h e r t h e y
stop) is i m p o s s i b l e to v e r i f y by t e s t i n g o r any
other means and some programs have infinite
input domains, so exhaustive testing can never
be completed.
This error is very different from those of
Example 1 and is important for the following
reasons:
1) T h i s p r o g r a m w o u l d b e v e r y h a r d t o u n d e r s t a n d and to t e s t t h o r o u g h l y w i t h o u t s o m e
s o r t o f a n i n f o r m a l p r o o f to g u a r a n t e e t h a t
all c o m b i n a t i o n s a r e p r o p e r l y h a n d l e d . But
t h e p r o o f o n l y c o n s i d e r s a s s e r t i o n s (2) - (6)
a n d i g n o r e s (1) b e c a u s e i t c a n ' t b e f o r m a l i z e d . T h e e r r o r t h e n a r i s e s b e c a u s e (1) i s
not checked out informally even if it can't be
formally stated and proved.
Even formally
s p e c i f i a b l e n e c e s s a r y c o n d i t i o n s f o r (1) t o b e
s a t i s f i e d a r e n o t p r o v e d , e . g . , t h a t no s e s s i o n s c h e d u l e s an e m p t y e x a m .
This error
s h o w s t h e d a n g e r o f c o n c e n t r a t i n g on t h e
d i f f i c u l t p a r t s of a p r o g r a m a n d t h e f o r m a l i z e d s p e c i f i c a t i o n s to t h e e x c l u s i o n o f t h e
easier parts.
Although some programs cannot be exhaustively tested, a basic hypothesis for the reliab i l i t y and v a l i d i t y of t e s t i n g is that the input
domain of a program can be partitioned into a
f i n i t e n u m b e r of e q u i v a l e n c e c l a s s e s s u c h t h a t
a t e s t o f a r e p r e s e n t a t i v e of e a c h c l a s s w i l l , b y
induction, t e s t the e n t i r e c l a s s , and h e n c e , the
e q u i v a l e n t of e x h a u s t i v e t e s t i n g of the input d o main can be performed.
If s u c h t e s t s a l l t e r m inate and if the p a r t i t i o n i n g is a p p r o p r i a t e , a
c o m p l e t e l y r e l i a b l e t e s t of the p r o g r a m will h a v e
been performed.
T h i s i s n o t , of c o u r s e , a n o v e l
i d e a . H o a r e [in B u x t o n ( 1 9 7 0 ) , p . 21] h a s p o i n t e d
o u t t h a t t h e e s s e n c e o f t e s t i n g i s to e s t a b l i s h t h e
b a s e p r o p o s i t i o n of a n i n d u c t i v e p r o o f (we p u r s u e
t h i s i d e a f u r t h e r i n S e c t i o n 5). T h i s p i n p o i n t s
the f u n d a m e n t a l p r o b l e m of t e s t i n g - - t h e i n f e r e n c e
f r o m t h e s u c c e s s of o n e s e t o f t e s t d a t a t h a t o t h e r s
will also succeed, and that the success of one
Z) T h i s p r o g r a m i s s i m i l a r t o m a n y o t h e r p r o grams where the correct answer can't be
f o r m a l l y c h a r a c t e r i z e d and w h e r e it w o u l d
b e d i f f i c u l t to e v e n d e t e r m i n e w h e t h e r t h e
produced timetable has the fewest number
of sessions.
A s s u m i n g no p r o o f e x i s t e d o r a s c o n f i r m a t i o n
of the p r o o f , w h a t type of t e s t d a t a s h o u l d be s e l e c t e d ? What c r i t e r i a f o r t e s t data can h o p e to
y i e l d a r e l i a b l e t e s t of t h i s p r o g r a m ?
Our prop o s e d a p p r o a c h i s n o t y e t s u f f i c i e n t l y p o w e r f u l to
answer these questions.
We c i t e t h i s e x a m p l e t o
show the difficulties that must be resolved by a
c o m p l e t e t h e o r y of t e s t data s e l e c t i o n a s w e l l a s
to s h o w t h a t t h e s e d i f f i c u l t i e s c h a l l e n g e e v e n a
SExhaustive testing of paths will not necessarily
find all errors.
For example, both paths in the
following program can be tested without necessarily revealing the error:
IF ( X + Y + Z ) / 3 = X
THEN
P R I N T ("X, Y, and Z are equal in value");
E L S E P R I N T ("X, Y, and Z are not equal in
value");
5~
t e s t data s e t i s e q u i v a l e n t t o s u c c e s s f u l l y c o m pleting an exhaustive test of a program's
input
domain.
3. Z
Is Proving Really Better
What testing lacks (and what this paper is
attempting to provide a start towards) is a theoreticaUy sound (but practical) definition of what
constitutes an adequate test. When our unders t a n d i n g of w h a t m a k e s a n a d e q u a t e t e s t i s a s
g o o d a s o u r u n d e r s t a n d i n g of w h a t m a k e s a n ' a d e quate proof, testing can assume its proper role
in the theory and practice of software development.
Than Testing?
"Program testing can be used to show the
presenc_e of_bugs, but never to show their
absence! "[in Dahl (1972), p. 6]
" [ W h e n ] y o u h a v e g i v e n t h e p r o o f o f [a p r o gram'sJ correctness .... [you] can dispense
with testing altogether:'[inNaur
(1969b),p. 517
3. 3
Just because testing is not a completely reliaMe means of demonstrating program correctness
does not mean it's sensible to rely solelyonproofs.
Proofs aren't completely reliable either. Proofs
c a n o n l y p r o v i d e a s s u r a n c e of c o r r e c t n e s s i f a l l
the following are true:
on
"Testing must be planned for throughout
the entire design and coding process"
[ H e t z e l (1973), p. 18]
Testing must not only he planned for, it is in
fact performed in one way or another at every
stage of the programming process, not just when
a program has been completed.
Specifications
m u s t b e t e s t e d a g a i n s t e x a m p l e s to s e e i£ t h e y
are complete, consistent, and unambiguous; programs are desk checked by informally consideri n g w h a t w i l l h a p p e n w h e n c e r t a i n k i n d s of d a t a
are processed; a proof must partition data into
cases that affect the validity of assertions differe n t l y . T e s t i n g , i n t h e s e n s e of i d e n t i f y i n g d i s tinguishable cases and evaluating their effect on
the programming
requirement,
specifications,
c o d e , o r a p r o o f , o c c u r s i n e v e r y p h a s e of p r o gramming.
Moreover, the need to test affects
t h e p r o c e s s of p r o g r a m c o n s t r u c t i o n i n v a r i o u s
ways:
a. There is a complete axiomatization of the
e n t i r e r u n n i n g E n v i r o n m e n t of t h e p r o g r a m - all language, operating system, and hardware processors.
b. The processors
are proved consistent
the axiomatization,
T h e E f f e c t of T e s t i n g C o n s i d e r a t i o n s
Program Development
with
c. T h e p r o g r a m i s c o m p l e t e l y a n d f o r m a l l y
implemented in such a way a proof can be
performed or checked mechanically.
d. The specifications are correct in that if
every program in the system is correct
with respect to its specifications, then the
entire system performs as desired.
a) T e s t i n g b e i n g i n e v i t a b l e , i t i s g o o d p r a c t i c e
to identify testing needs, e.g., weak or
critical links in a system, early in program
d e s i g n (e. g . , s e e H a n s e n (1973)).
These requirements
a r e f a r b e y o n d t h e s t a t e of
t h e a r t of p r o g r a m s p e c i f i c a t i o n a n d m e c h a n i c a l
theorem proving, and we must be satisfied in
practice with informal specifications, axiomatizations, and proofs.
Then problems arise when
proofs have errors, specifications are incomplete,
a m b i g u o u s , o r u n f o r m a l i z a b l e ( a s i n e x a m p l e Z),
and systems are not axiomatizable.
The two
examples already discussed have clearly shown
that an incomplete attempt at a program proof
does not assure a program will not fail. These
examples are very realistic in terms of the state
of t h e a r t of p r o g r a m p r o v i n g f o r r e a l p r o g r a m s .
b) Programs
should be structured so logical
t e s t i n g o f v a r i o u s a b s t r a c t i o n s of t h e p r o g r a m c a n r e d u c e a c t u a l t e s t i n g of t h e f i n a l
program.
c) S p e c i f i c a t i o n s
testable.
must be precise
enough to be
d) T h e n e e d t o g e t s p e c i a l i n f o r m a t i o n j u s t t o
verify the successful execution of a test run
a f f e c t s p r o g r a m d e s i g n , (e. g . , t e s t i n g a n
operating system scheduler requires access
to information not ordinarily available).
Despite the practical fallibility of proving,
attempts at proof are nonetheless valuable. They
can reveal errors and assist in their prevention.
Furthermore,
programs cannot be proved unless
they can be understood.
Facilitating provability
sets a worthy standard for good languages, program structure,
specifications , and documentation, and thereby assists in preventing errors.
I m p r o v e d u n d e r s t a n d i n g o f t h e n a t u r e of t e s t ing will help in developing and performing tests at
all stages in the program development process.
3. 4
Conclusions
There is a general concept which unifies all
the activities involved in demonstrating that a
program does not fail. That concept is case
analysis.
Case analysis occurs in
In short, neither proofs nor tests can, in
practice, provide complete assurance that prog r a m s w i l l n o t f a i l . E v e n s o , a t t e m p t s to p r o v e
programs,
like attempts to test them completely,
a r e e s s e n t i a l t o t h e d e v e l o p m e n t of u s a b l e p r o grams.
Tests have the advantage of providing
accurate information about a program's
actual
behavior in its actual environment; a proof is
limited to conclusions about behavior in a p0stu 1~ted environment.
Both have an essential role
In program development, though each is a fallible
technique.
Testing and proving are complementary methods for decreasing the likelihood of program failure.
a) D e s i g n a n d s p e c i f i c a t i o n w h e r e i t i s n e c e s s a r y
to exclude cases of data where the program is
not expected to operate or to identify cases in
the input or output which require Special treatment.
I n E x a m p l e 1, t h e s e w o u l d i n c l u d e s u c cessive break characters and termination characters, and in Example Z the empty exam.
b) P r o g r a m c o n s t r u c t i o n , w h e r e t h e s o l u t i o n t o
the problem dictates that certain cases require
503
special treatment and other cases can be
lumped together.
These are iUustrated in
Example 1 where the predicates bufpos ~ 0
and fill ~ 0 form paths which treat successive
break characters and the first word on the
first line, respectively,
and in Example Z
where the empty exam must be filtered out
of t h e e x a m s t o b e p r o c e s s e d .
c) P r o g r a m p r o v i n g , w h e r e a s s e r t i o n s m u s t b e
made about special cases or generalized to
cover several cases and where proofs break
into cases.
Example 1 shows an assertion
which must have a case which covers the first
word as well as the case that covers all nonfirst words and Example Z shows that the
assertion which is not formalized must be
checked out in a different manner from the
more formalized assertions.
d) T e s ~ ,
where it must be demonstrated that
the total of all cases involved in program
specification, design, construction, and
proof have been correctly and completely
handled.
Ideally, the cases created during
design specification, construction, and
proof should be an adequate basis for test
data selection, although the cases may need
further analysis and combination in order to
obtain a reasonably sized and still effective
set of test data.
There is a great need for more examples to
m a k e v a g u e t e s t i n g c o n c e p t s (e. g . , " b o u n d a r y "
condition) at least more intuitively clear and for
theories of test data selection which try to anal y z e t h e f a c t s a b o u t t e s t i n g i n r e l a t i o n to e a c h
o t h e r f o r t h e p u r p o s e of g u i d i n g t h e s e l e c t i o n o f
test data toward greater effectiveness.
Section
4 will illustrate a technique for selecting test
data and Section S will analyze this technique
i n t e r m s o f t h e t h e o r y s k e t c h e d i n S e c t i o n 1.
4.
Developing Effective Program
Tests
Our goal is to define how to select test data
for a particular program such that if the program
processes all test data correctly, we can be reas o n a b l y c o n f i d e n t t h e p r o g r a m Will p r o c e s s al_~
data correctly.
For this to be possible, the
test data must cover all errors in the program
b e i n g t e s t e d , i. e . , t e s t d a t a s e l e c t i o n c r i t e r i a
must ensure that data revealing any errors will
be selected.
Our approach to testing (and indeed, any
approach) is based on assumptions about how
program errors occur.
Our approach assumes
errors are primarily due to 1)inadequateunderstanding of all conditions a program must deal
with, and Z)failure to realize that certain comb i n a t i o n s of c o n d i t i o n s r e q u i r e s p e c i a l t r e a t m e n t .
We distinguish:
• Test data, the actual values from a program's
input domain that collectively satisfy some
test data selection criterion;
• Test predicates,
a description of conditions
a n d c o m b i n a t i o n s of c o n d i t i o n s r e l e v a n t t o
the program's
correct operation.
In short, test predicates describe what aspecte
of a p r o g r a m a r e t o b e t e s t e d ; t e s t d a t a c a u s e
these aspects to be tested.
O u r p u r p o s e i n t / U s p a r t of t h e p a p e r i s 1 ) t o
iUustrate the l~roblems and issues faced in developing reliable and valid tests for a particular
p r o g r a m s o t h a t l a t e r , i n S e c t i o n 5, w h e n w e
discuss formal criteria for reliability and validity, examples illustrating the formal crite.riawill
b e a t h a n d ; a n d 2) t o s h o w i n f o r m a l l y , w h a t m a y ,
with further research,
become a practical approach to defining reliable tests.
4. 1
An Overview
of t h e M e t h o d
There are several sources of information
about the conditions and combinations of conditions relevant to a programts operation:
• the general requirement a program is to
satisfy;
• the program's
specification;
• general characteristics
of t h e i m p l e m e n tation method used, including special
conditions relevant to data structures
and how data is represented; and
• the specific internal structure of an actual
implementation.
N o n e of t h e s e s o u r c e s of i n f o r m a t i o n i s b y i t s e l f
sufficient for generating a reliable test. Inforro.ation from all these sources must be used to
i n s u r e a l l c o m b i n a t i o n s of c o n d i t i o n s a r e i d e n t i fied.
Our technique for developing and describing
test predicates is derived from decision table
techniques, and will be called the condition table
method.
A condition table looks like the top half
of a decision table. Each column in a condition
table is a logically possible combination of conditions that can occur when executing the program
being described.
Each column is our representation of a test predicate.
The basic idea is to ident i f y c o n d i t i o n s d e s c r i b i n g s o m e a s p e c t of t h e
p r o b l e m o r p r o g r a m to b e t e s t e d , a n d t h e n b u i l d
condition tables for groups of conditions.
Sometimes separate condition tables are combined
into a single table; when to combine tables and
when to leave them separate is an important
issue needing further study.
4. Z
Deriving
Test Cases from Specifications
A program's
specifications are an important
source of testbecause data satisfying suchpredicates are able to detect missing path and inappropriate path selection construction errors.
Such
data can also detect design and specification errors,
as we shall see. We will useNaur's
specifications
for Example 1 to illustrate test case development
from specifications.
Then, using knowledge of
an actual implementation (either that shown in
F i g u r e Z o r F i g u r e 3), w e w i l l s h o w h o w t o e l i m inate some test predicates without impairing test
reliability.
(It m a y a l s o h a p p e n t h a t s o m e ' t e s t
predicates will have to be added when information
about an actual implementation becomes available.
but this does not occur here. )
T h e f i r s t p a r t of N ~ u r ' s
specification
states:
(1) " G i v e n a t e x t , c o n s i s t i n g of w o r d s s e p a r a t e d
by BLANKs or NL (new line)characters .... "
This clause attempts to describe the program's
input domain.
We begin to develop a condition
Table 6
FIRST CONDITION TABLE FOR INPUT DOMAIN
•
6~
Cls Pre~h~s~SL, ML, Ot] B L 8L B ~ BL N L ML N L HL IO t i O t
Ot
t
i
cz, c = c 1 , ~ d s ~ N L . o t ]
S L S L I C t 1. s c ~ c c e 1. I B ~ ~ L l o t
Ot
1'
1'
I'
1'
1.
s~
~.,
c¢
1.
ZZ. 23
Table 7
REVISED
1
el, PrlrvCba.~BL.NL. Ot]
BL
2
$
BL B L
4
S
BL B L
i C2: ~ r C h a d B L . NL, Or, ET} B L BL N L NL dot
C3: F i r s t Word?
Y
H
Y
N
•
CONDITION
TABLE
FOR
INPUT
DOMAIN
24
15
2'6 127
z8
BL B L B L B L B L B L
BL I ML N L N L N L : N L N L N L N L NL N L N L NL
I
!
Gt
Ot
Ot
Ot
Ot
Ot
ETBL
BL ~ L
NL ~
~
Y'
N
N
r
6
?
10Jl1
819
i lZ [ ~3
14
16
17
18
19
20
11
29130;31
3z
l
ET!ET
r
ET
Y
N
N
Y
N
Y
N
>1
1
>1
' 1'
?
1'
1'
Y
~
I~
N
i
E T E T E T IE ~
i
N
Yiy
>1
!
N
N
iOt
1,
1, [ 1.
1,
B L N L Ot
ET
BL NL 0t
E~
(1') (Y) ~ )
(Y] (N) (N) (1,1:) ('M)
I
C4,
Sink
lensth[1,>l]
. I
1' i 1'
?
?
1 ' >1
'1
>1 J 1
1
>1
1
>1
1
>1
1, I ?
?
1.
1.
1'
1'
C o n s t r a i n t 8 b e t w e e n ¢onditiol~s
C 2 ~ L ) V CZ(NL) : C4(1.)
Cl(Ot) : G3
c1(1') =-~c3
table by loolcing for conditions relevant to the input domain that are also relevant to the processi n g r e q u i r e d b y t h e s p e c i f i c a t i o n , i. e . , w e t r y t o
extract conditions from the specification that we,
as programmers,
would consider relevant to deciding when it is appropriate to perform certain
actions.
F o r e x a r n p l e , i f w e t h i n k i n t e r m s of
scanning the input text character by character,
then it's possible and reasonable to describe the
i n p u t i n t e r m s of t h e c h a r a c t e r c u r r e n t l y b e i n g
~canned and the one irnrnediately preceding it.
This approach gives rise to the condition table
s h o w n i n T a b l e 6. T h e n o t a t i o n t h e r e s h o w s
t h a t t h e v a l u e s of P r e v C h a r ( c o n d i t i o n C I ) a n d
C u r C h a r ( c o n d i t i o n CZ) c o n s t i t u t e c o n d i t i o n s
relevant to the program's
correct operation. In
particular,
t h e p o s s i b l e v a l u e s f o r C1 a n d CZ a r e
partitioned into three subsets, the value BL (for
BLANK), the value NL, and all other character
v a l u e s , O t . A n y c o n d i t i o n m a y , of c o u r s e , b e
undefined (represented in the table by a question
mark), and so there are four possibilities to be
c o n s i d e r e d f o r e a c h c o n d i t i o n , y i e l d i n g 16 p r e d i c a t e s i n a11.
f o r d e a l i n g w i t h v a r i o u s t y p e s of i n p u t m e d i a ( e . g . ,
cards or paper tape) and this interpretation subsumes the more restricted case of single break
characters,
s o w e a d o p t i t . Of c o u r s e , t h e a m biguity should be checked with the specification
writer, since our reasoning for permitting data
satisfying these predicates may not be valid in
t h e c o n t e x t of t h e a c t u a l u s e o f t h e p r o g r a m .
It
can be seen here how preparation of a condition
table is a natural way to check a specification
f o r c o r n p l e t e n e s s a n d l a c k of a m b i g u i t y , a s w e l l
as for identifying characteristics
of test datathat
should be Presented to the program during the
test phase.
Next we need to check the test predicates for
r e l i a b i l i t y , i. e . , i f w e s e l e c t d a t a t o c o v e r e a c h
predicate, wiU we be forced to select data capa b l e o f r e v e a l i n g al'l e r r o r s i n a n i m p l e m e n t a t i o n ?
To answer this question, we must decide whether
any conditions relevant to the correct operation
of t h e p r o g r a m a r e m i s s i n g f r o m t h e t a b l e a n d
whether value sets have been partitioned appropriately.
We must also decide whether the test
p r e d i c a t e s a r e i n d e p e n d e n t , i. e . , w h e t h e r t h e
sequence in which test predicates are exercised
when processing test data is potentially signifi c a n t to t h e c o r r e c t n e s s
of the program.
For
example, we can see that data chosen to exercise all the predicates in Table 6 will have to
include data where words are separated by
at least two break characters,
but all predicates can be exercised without necessarily
having exactly one break character between any
w o r d s . ~ I s t h e o c c u r r e n c e of a b r e a k of l e n g t h
one significant to the correct operation of a
program?
If a program correctly processes
t e x t c o n t a i n i n g b r e a k s of l e n g t h t w o o r g r e a t e r ,
will it necessarily correctly process text conraining breaks of length one? What about breaks
of l e n g t h o n e a n d t w o p r e c e d i n g t h e f i r s t w o r d i n
The next step is to check if all these predicates can be satisfied.
Clearly the current or
previous character may have an undefined value.
CZ i s u n d e f i n e d w h e n ~
end of the text is found.
C1 i.s u n d e f i n e d w h e n t h e c u r r e n t c h a r a c t e r i s t h e
f i r s t c h a r a c t e r i n t h e t e x t . P r e d i c a t e 16 t h e r e fore is satisfied by an empty text. To make this
interpretation of undefined values clearer, in
later condition tables we will redefine the value
set for CurChar to include the pseudo-character
E T , d e s i g n a t i n g e n d - o f - t e x t , a n d n o t p e r m i t C2
to be undefined.
L o o k i n g f u r t h e r , a t p r e d i c a t e s 1 a n d 2, w e
immediately see an ambiguity in the specification.
It's not ciear whether words are separated by a
single BLANK or NL character, or whether several
such break characters are permitted.
If o n l y o n e
is permitted,
t h e n p r e d i c a t e s 1, Z, 5, a n d 6 a r e n o t
satisfiable.
Also, if breaks are not permitted
after the last word in the text or before the first
w o r d , t h e n p r e d i c a t e s 4, 8, 13, a n d 1 4 c a n b e . e l i m inated as unsatisfiable.
Permitting all these
predicates to be satisfied seems the most reasonable approach since it provides more flexibility
~For example, if predicate 9 is exercised foll o w e d b y p r e d i c a t e 3, a b r e a k o f l e n g t h o n e h a s
been processed,
b u t i f t h e s e q u e n c e i s 9, 1, 3,
t h e n t h e b r e a k i s t w o c h a r a c t e r s l o n g . So
length of breaks is represented as a particular
s e q u e n c e of p r e d i c a t e s b e i n g e x e c u t e d .
505
?
the text or following the last word?
From experience, we know that errors involving all these
s o r t s of c o n d i t i o n s a r e q u i t e p o s s i b l e , s o w e w i 1 1
a d d c o n d i t i o n s to f o r c e s e l e c t i o n o f d a t a t h a t
check these error possibilities.
suggested.
Examining the clause with respect
t o T a b l e 7, h o w e v e r , s h o w s t h a t a l i n e b r e a k c a n
never be made before the first word in the text
unless the first word is preceded by one or more
break characters.
Naur's program itself violates
this clause because it always puts out a line break
• at the beginning of the output. Undoubtedly, the
i n t e n t of t h e c l a u s e w a s n o t t o f o r b i d a n i n i t i a l i z ing line break in the output, but rather to say that
a word in the input text must be contained completely on a single line in the output; here is
another specification error (failure to correctly
e x p r e s s t h e i n t e n t of t h e d e s i g n e r ) r e a d i l y d e t e c t e d
by case analysis.
To distinguish breaks before the first word,
we add a condition "First Word?", which is false
u n t i l o n e Ot c h a r a c t e r i s s e e n a n d t h e n t r u e .
To
ensure test data contain breaks of length I, the
condition "Break length" is added; its values
r e f e r to t h e l e n g t h of a b r e a k p r e c e d i n g E T o r Ot.
Table 7 results; note the conditions are not aU
independent.
W h e n ' C l i s u n d e f i n e d , C3 i s f a l s e ,
etc. Constraints between conditions are indicated
below the condition table.
The remainder of the specification states that
"each line is filled as far as possible as long as
no line will contain more than MAXPOS characters. " We will not construct a condition table for
this part of the specification.
Such a table would
c o n t a i n c o n d i t i o n s d e s c r i b i n g t h e l e n g t h Of w o r d s ,
current and future lengths of lines, and possibly,
t h e n u m b e r of w o r d s o n a l i n e . T h e t a b l e w i l l
show how to exercise a program for aU relevant
combinations of boundary conditions.
Arguments
discussing what constitutes a proper set of test
p r e d i c a t e s f o r t h i s p a r t of t h e s p e c i f i c a t i o n a r e
complex and alrnost require a paper in itself. We
only intend to sketch an approach to test predicate
construction here.
Are the test predicates sufficiently independent
n o w ? Do s o m e s e q u e n c e s o f t e s t p r e d i c a t e s e x e c u t i o n c o r r e s p o n d ~o s p e c i a l p r e d i c a t e s t h a t s h o u l d
be included?
It can be seen that data consisting
of a t m o s t o n e w o r d w o u l d s u f f i c e t o e x e r c i s e a l l
the test predicates.
Equally well, all test data
c o u l d c o n s i s t of t e x t s c o n t a i n i n g m o r e t h a n o n e
word.
It's highly unlikely that a program will
process words correctly for texts several words
long but not for texts one word long, or vice versa, although such a program could of course be
specially written.
We will be content, for now,
w i t h n o t i n g t h a t t h e n u m b e r of w o r d s i n a n i n p u t
text might be a significant condition for some
programs; later when an implementation is availa b l e w e c a n d e c i d e w h e t h e r t h e n u m b e r of w o r d s
in the text is a condition for which an error may
exist; if so, the condition table can be modified
t o i n s u r e t h i s a s p e c t of t h e p r o g r a m i s t h o r o u g h l y
tested.
4. 3
Test predicates are the motivating force for
data selection.
A test predicate is found by identifying conditions and combinations of conditions
relevant to the correct operation of a program.
Conditions arise first and primarily from the
specifications for a program.
As implementations
are considered, further conditions and predicates
may be added.
Conditions can arise from many
sources, and as they come to mind, they are writt e n d o w n i n t h e c o n d i t i o n s t u b of a t a b l e .
The
columns of the table are then generated and
checked.
The checking process may suggest further conditions to be added. If the number of
predicates becomes excessive, it is better to
wait until information from an actual implementation is available to reduce the number of predicates so the impact of the reduction on test reliability can be explicitly assessed.
T e s t p r e d i c a t e s c a n b e e l i m i n a t e d (i. e . , t h e i r
exercising made unnecessary) if it can be shown
that data satisfying the predicates are treated the
same by the actual program and from the viewpoint of the specifications,
For example, distinguishing between BL and NL in Table 7 is un~necessary.
C u r s o r y e x a m i n a t i o n of t h e p r o g r a m s
in either Figure 1 or Figure 2 shows that every
t i m e CW i s t e s t e d f o r e q u a l i t y w i t h B L i t i s a l s o
tested for equality with NL; the specifications,
moreover,
do n o t r e q u i r e d i f f e r e n t e f f e c t s d e p e n d ing on whether a BL or NL is the value of a break
character.
So w e c a n s a f e l y p a r t i t i o n t h e v a l u e
set for CurChar into BL or NL, ET, and Ot and
the value set for PrevChar into BL or NL, and
Ot without reducing the reliability of this set of
test predicates for these particular programs.
This will reduce the number of test predicates
i n T a b l e 7 f r o m 32 t o 16 a n d t h e n u m b e r o f t e s t
runs required from ten (one for each ET condition) to six. This illustrates how the amount
of t e s t i n g c a n b e s a f e l y r e d u c e d a f t e r a s e t o f
test predicates has been developed independently
of program structure.
P r o o f s of s i m p l e p r o g r a m p r o p e r t i e s c a n r e d u c e t h e a m o u n t of t e s t i n g
required.
The next part of Naur's
specification
The major problem in our approach as illustrated so far is that conditions are considered
as they spring to mind.
This may mean wasting
effort on conditions unlikely to be connected with
errors in the actual implementation to be tested.
Nonetheless,
s u c h c o n d i t i o n s do t e s t s o m e t h i n g
about a program.
T h e q u a l i t y of t h e t e s t p r e d i cates cannot be decided at the exact moment of
their conception.
Test predicate analysis must
be carried out in its entirety and then checked
for overall reliability before deleting individual
test predicates.
states:
"... line breaks (in the output) must be made
only where the (input) text has BLANK or
The emphasis is on a systematic approach,
and the condition table fulfills this purpose by
recording conditions and their combinations in
an orderly and mechanically checkable fashion.
It forces attention toward specifications and
what the program should do rather than an
NL... "
T h i s c l a u s e states a constraint o n w h e n
Summary
There are two main features of the method of
test data selection presented here--the use of
test predicates and the use of a condition table.
the action
of o u t p u t t i n g a N L c h a r a c t e r i s v a l i d . S i n c e w e
have already distinguished BLANKs and NL chara c t e r s i n T a b l e 7, n o n e w t e s t c o n d i t i o n s a r e
506
a c t u a l p r o g r a m s t r u c t u r e and w h a t a p r o g r a m
s e e m s to doe I t a v o i d s the f l a w s of t e s t i n g
m e t h o d s that f o c u s s o l e l y on the i n t e r n a l s t r u c t u r e of a p r o g r a m , but i s not n e c e s s a r i l y d i vorced from a program's internal structure
since all p r o g r a m p r e d i c a t e s m u s t ultimately
be r e p r e s e n t e d in the c o n d i t i o n ' t a b l e i f the
t a b l e i s to d e f i n e a r e l i a b l e s e t of t e s t p r e d i cates.
s a i d to b e t r u e . With t h i s de~finition in m i n d ,
C O M P L E T E ( T , C) i s d e f i n e d a s f o l l o w s :
i f TC_D, C O M P L E T E ( T , C) =
( V c e C)(~It GT)c(t) h (Vt • T)(~ice C)c(t)
This definition states that every test predicate
b e l o n g i n g t o C m u s t b e s a t i s f i e d by a t l e a s t one
t e T, and e v e r y t m u s t ss~fisfy a t l e a s t one t e s t
predicate.
Both r e q u i r e m e n t s a r e n e c e s s a r y ,
s i n c e p r o o f of Cts v a l i d i t y w i l l r e q u i r e p r o v i n g
the c o r r e c t n e s s o f a p r o g r a m f o r data that s a t i s fy no t e s t p r e d i c a t e , a n d h e n c e , a r e i n c l u d e d in
no s e t of t e s t d a t a T.
Th e type of r e a s o n i n g u s e d in s e l e c t i n g
t e s t p r e d i c a t e s i s m u c h l i k e that u s e d in
c r e a t i n g a s s e r t i o n s , and h e n c e the a p p r o a c h
f o c u s e s a t t e n t i o n on the a b s t r a c t p r o p e r t i e s of
the p r o g r a m and i t s s p e c i f i c a t i o n s . A t e s t
p r e d i c a t e a n a l y s i s of a p r o g r a m m a y be a p r a c tical f i r s t step toward p r o g r a m proving with
the a d v a n t a g e t h a t both t e s t i n g and p r o v i n g
c ou l d be p e r f o r m e d s e q u e n t i a l l y o r in p a r a l l e l .
5.
T h e ex~amples in S e c t i o n 2 showe~ t h a t it i s
r e a d i l y po~lsible to c h o o s e data t h a t e x e r c i s e
t e s t p r e d i c a t e s in an o v e r l a p p i n g m a n n e r (e. g . ,
s e e T a b l e !5). T h i s s u g g e s t s a n a t u r a l p a r t i t i o n i n g of t h e i n p u t d o m a i n into r e l a t e d e q u i v a l e n c e c D i s s e s , E ( C ' ) d e f i n e d as f o U o w s :
L e t C ~_~C.
T h e T h e o r y of T e s t i n B
Th e m e t h o d o l o g y i l l u s t r a t e d in the p r e c e d i n g
s e c t i o n c o n s t i t u t e s an i n f o r m a l a p p l i c a t i o n of t h e
t h e o r e t i c a l conce15ts d e f i n e d in S e c t i o n I , i. e . ,
S e c t i o n 4 d e m o n s t r a t e s th e u s e of C O M P L E T E ( T , C )
R E L I A B L E ( C ) , and V A L I D ( C ) , a s a m e a n s of d e vising thorough t e s t s when the t e s t data s e l e c t i o n
c r i t e r i o n , C, c o n s i s t s of a s e t of t e s t p r e d i c a t e s .
In t h i s S e c t i o n , w e w i l l p o i n t out how o u r u s e of
t h i s s o r t of t e s t d a t a s e l e c t i o n c r i t e r i o n s a t i s f i e s
these previously defined theoretical concepts.
E(C' '} = { d, D I (Vc, C,)c(d)^(vc, c - c , he(d))_
i. e. , E ( C ' ) c o n t a i n s a l l d ~ D t h a t s a t i s f y a l l a n d
only t h o s e t e s t p r e i l i c a t e s b e l o n g i n g to C% N o t e
t h a t data b e l o n g i n g t o the e q n i v a l e n c e c l a s s E(~)
s a t i s f y n o t e s t p r e d i c a t e . The r e l a t i o n b e t w e e n
these equivalence classes for a C containing
t h r e e t e s t p r e d i c a t e s i s shown s c h e m a t i c a l l y in
F i g u r e 4. Note that data b e l o n g i n g to, f o r e x a m p l e , E ( C i 2 ) s a t i s f y j u s t t e s t p r e d i c a t e s c I and c~,
and t h a t i f t e s t T c o n t a i n s data b e l o n g i n g to
E(C12), T n e e d not a l s o c o n t a i n data b e l o n g i n g
to E ( C i ) and E ( C 2 ) to s a t i s f y C O M P L E T E . Note
t h a t a c o m p l e t e t e s t , t h e r e f o r e , i s e q u i v a l e n t to
s e l e c t i n g one d a t u m f r o m a s u b s e t of the e q u i v a l e n c e c l a s s e s w i t h o u t s e l e c t i n g any data f r o m E(d).
Since a test p r e d i c a t e is a condition c o m b i n a t i o n in a c o n d i t i o n t a b l e , s e l e c t i n g data to s a t i s f y
a t e s t p r e d i c a t e m e a n s s e l e c t i n g d a t a to e x e r c i s e
the c o n d i t i o n c o m b i n a t i o n i n th e c o u r s e of e x e c u ting F . N o t a t i o n a l l y , i f a d a t u m d E D s a t i s f i e s
t e s t p r e d i c a t e c ~ C i n t h i s s e n s e , t h e n c(d) i s
C
= ( c | , c2, c~
C123 = C
--
,
\
c23
= (°2
C1
= (c a
/
E(Cz3)
E(C 3)
•
E(¢)
Figure 4
S t r u c t u r e s h o w in g r e l a t i o n s h i p s b e t w e e n e q u i v a l e n c e c l a s s e s i n d u c e d
by a s e t of t e s t p r e d i c a t e s , C. If C is reli~Lble, the c o r r e c t p r o c e s s i n g of data d r a w n f r o m one e q u i v a l e n c e cDLss (e. g . , E(C~2)) p r o v e s
th e c o r r e c t n e s s of the p r o g r a m f o r data d r ~ w n f r o m c e r t a i n o t h e r
c l a s s e s ( e . g . , E(C~) and E(C2)), i . e . , s u c c e s s p r o p a g a t e s d o w n w a r d s
in the l a t t i c e of e q u i v a l e n c e c l a s s e s . S i m i l a r l y , f a i l u r e p r o p a g a t e s
u p w a r d . F o r e x a m p l e , the i n c o r r e c t p r o c e a s i n g of data d r a w n f r o m
E(C3) i m p l i e s data d r a w n f r o m E(Cz3), E( C1 3 ) , and E(C1z3) w i l l a l s o
be p r o c e s s e d i n c o r r e c t l y .
507
Of c o u r s e , not ju~st any s e t of t e s t p r e d i c a t e s
w i l l c o n s t i t u t e a r e l i a b l e and v a l i d t e s t data s e l e c t i o n c r i t e r i o n . The f o l l o w i n g r e l a t i o n s h i p c a n b e
u s e d , h o w e v e r , to sho~w that C w i l l s a t i s f y t h e
g e n e r a l d e f i n i t i o n of V A L I D g i v e n i n F i g u r e I .
(Vd e E(¢)) OK(d)
D VAI~ID(C)
T h i s m e a n s C i s v a l i d i~ a p r o g r a m i s c o r r e c t f o r
a l l d a t a s a t i s f y i n g n o t e s t p r e d i c a t e . With r e s p e c t to the c o n d i t i o n t a b l e t e c h n i q u e , t h i s v a l i d i t y
c r i t e r i o n r e q u i r e s that c o n s t r a i n t s a m o n g c o n d i t i o n s be p r o p e r l y d e s c r i b e d ; e v e r y c o n d i t i o n c o m b i n a t i o n e x c l u d e d a s i m p o s s i b l e m u s t a c t u a l l y be
i m p o s s i b l e . M o r e o v e r , t h e d o m a i n of v a l u e s
a s s o c i a t e d with s o m e v a r i a b l e m u s t be c o r r e c t l y
d e s c r i b e d , e . g . , CW in S e c t i o n 4 m u s t a t m o s t
t a k e on the v a l u e s B L , NL, E T , and O t h e r . A l t h o u g h a n o n - e m p t y E(~) c a n b e v i e w e d as w a r n ing t h a t a p r o g r a m c a n e x e c u t e a r e l i a b l e t e s t
s u c c e s s f u l l y and s t i l l c o n t a i n e r r o r s b e c a u s e i t
i s i n v a l i d , i t c a n a l s o be v i e w e d m e r e l : ~ a s i n d i catting that t h e c o r r e c t n e s s of t h e p r o g r a m f o r
s u c h data i s to be a s s e s s e d b y m e a n s o t h e r than
t h e t e s t s i m p l i e d by C. S e p a r a t e v e r i f i c a t i o n ,
f o r e x a m p l e , by p r o o f , m a y b e e a s i e r t h a n v e r i t i c a t i o n b y t e s t i n g , o r i t m a y b e that t h e d a t a in
E(~) h a v e a l r e a d y b e e n v e r i f i e d b y p r i o r t e s t s .
It m a y be t h a t a c o n s c i o u s d e c i s i o n h a s ]been
m a d e not to e x e r c i s e c e r t a i n t e s t p r e d i c a t e s
b e c a u s e it i s " o b v i o u s " the p r o g r a m will• c o r r e c t l y p r o c e s s data s a t i s f y i n g t h e m . Thins c a n
r e d u c e the t o t a l t e s t i n g e f f o r t . In a n y e v e n t ,
the p r o g r a m V s c o r r e c t n e s s f o r d a t a b e l o n g i n g
to E(~) i s not d e t e r m i n e d by t h e t e s t s d e f i n e d
by C.
It m a y s o m e t i m e s be i m p r a c t i c a l to c o n d u c t
c o m p l e t e t e s t s . In t h i s c a s e , if C i s kno,~,n to
be v a l i d b e c a u s e E(~) i s e m p t y , t h e n t e s t
data should be c h o s e n j u s t f r o m equivalLence
c l a s s e s t h a t s a t i s f y t e s t p r e d i c a t e s l i k e l y to be
e n c o u n t e r e d in p r a c t i c e . W h il e s u c h a t e s t w i l l
not n e c e s s a r i l y b e v a l i d , i t w i l l s t i l l be~ r e l i a b l e i f C i s r e l i a b l e , and e s t i m a t e s of o v e r a l l
p r o g r a m r e l i a b i l i t y c a n be d e v i s e d by e s t l m a t i n g
t h e f r e q u e n c y w i t h w h i c h data s a t i s f y i n g u n e x e r c i s e d t e s t p r e d i c a t e s w i l l be e n c o u n t e r e d i n a c t u a l
u s e of t h e p r o g r a m . T h i s s o r t of a n a l y l J i s c a n
s e t a l o w e r bound on the e s t i m a t e d f r e q m e n c y of
f a i l u r e in the p r o g r a m V s a c t u a l o p e r a t i o n , e v e n
w h e n a p r o g r a m i s t e s t e d only i n c o m p l e t e l y .
T h e f o r m a l d e f i n i t i o n of R E L I A B L E ( ( : ) i m p l i e d by the u s e of t e s t p r e d i c a t e s i s r a t h e r
c o m p l e x , b u t the e s s e n t i a l i d e a i s to d e t i r ~ e t e s t
p r e d i c a t e s so s u c c e s s o r f a i l u r e when e x e c u t i n g
a p r o g r a m F with a p a r t i c u l a r datum d depends
only on w h a t i n d i v i d u a l t e s t p r e d i c a t e s d s ~ t i s t i e s , not the p a r t i c u l a r c o m b i n a t i o n s a t i s f i e d .
T h i s i s the s e n s e in w h i c h t e s t p r e d i c a t e s 1Trust
be independent. F o r example, using the d e f i n i t i o n of C g i v e n in F i g u r e 4, i f d12c E(C12) and
OK(dlZ), then the d e f i n i t i o n of C O M P L E T E ~ s a y s
t h a t t h e r e i s no n e e d to t e s t F w i t h d a t a b e l o n g i n g to E ( C i ) o r E(C2). H e n c e O K ( d i e ) i m p l ' i e s
OK(d1) an d OK(de), W h e r e d I e E(C I) a n d d 2 c E( Cz)
i f C i s r e l i a b l e . S i m i l a r l y , OK(d1) and OKA[d2)
i m p l i e s O K ( d l 2 ) , s i n c e one c o m p l e t e t e s t c o u l d
I n c l u d e d12 and a n o t h e r , b o t h d I and dz; r e H a r d l e s s of w h i c h c o m p l e t e t e s t i s a c t u a l l y p e r f 0 , r m e d ,
t h e s u c c e s s of e i t h e r c o m p l e t e t e s t m u s t I m | ~ l y
t h e s u c c e s s of a l l o t h e r s i f C i s r e l i a b l e . N,ote
a l s o t h a t the d e f i n i t i o n of C O M P L E T E i m p l l e ; s
t h a t to b e r e l i a b l e , n e i t h e r t h e s e q u e n c e of t e s t
p r e d i c a t e e x e c u t i o n n o r t h e f r e q u e n c y of e x e c u tion can be r e l e v a n t to the s u c c e s s f u l p r o c e s s i n g
of t e s t data b e c a u s e data a r e s a i d to s a t i s f y a
t e s t p r e d i c a t e w h e t h e r the data c a u s e the p r e d i c a t e t o l b e e x e r c i s e d one t i m e o r m a n y t i m e s ,
and no m a t t e r w h a t s e q u e n c e s of p r e d i c a t e s a r e
e x e r c i s e d b e f o r e o r a f t e r a p a r t i c u l a r one. If
f r e q u e n c y o r s e q u e n c e of t e s t p r e d i c a t e e x e c u t i o n i s p o t e n t i a l l y r e l e v a n t to an e r r o r ' s e x i s t ence, s e p a r a t e test p r e d i c a t e s m u s t be defined
to c o v e r a l l r e l e v a n t s e q u e n c i n g o r f r e q u e n c y
p o s s i b i l i t i e s . ( T h i s w a s t h e r e a s o n we a d d e d
" b r e a k l e n g t h " and " f i r s t w o r d ? " to t h e c o n d i t i o n t a b l e c r e a t e d i n S e c t i o n 4. )
T h e f o r m a l d e f i n i t i o n of R E L I A B L E ( C ) , w h e n
C i s a s e t of t e s t p r e d i c a t e s , t h e r e f o r e i s :
RELIABLE(C)
= (VC 1 c C)(VC z c C)(
(~t I ¢ E(Ci)) (~t2¢ E(C2))
(OK(tl)~,OK (t2)) D
(VC' C C 1 U C z)(c'~t~ D
(Vd' e E(C')OK(d') ) )
This m e a n s that if a p r o g r a m is c o r r e c t for some
d a t u m , d, ir~ e q u i v a l e n c e c l a s s e s o t h e r t h a n E(~),
t h e n i t i s c o r r e c t f o r an y d a t a s a t i s f y i n g a (none m p t y ) s u b s e t of the t e s t p r e d i c a t e s e x e r c i s e d by
d. N o t e t h a t t e s t s c o n t a i n i n g only d a t a s a t i s f y i n g
l a r g e n u m b e r of t e s t p r e d i c a t e s a r e m o r e
p o w e r f u l i n the s e n s e t h a t d a t a s e l e c t e d f r o m
only a f e w e q u i v a l e n c e c l a s s e s s u f f i c e s to d e m o n strate program correctness.
Such t e s t s a r e u s e ful w h e n t h e p r o b a b i l i t y of f i n d i n g an e r r o r i s
d e e m e d low, as in r e g r e s s i o n t e s t i n g [ E l m e n d o r f
(1969)]. T e s t s u s i n g data s a t i s f y i n g only a f e w
t e s t p r e d i c a t e s c a n f a i l f o r f e w e r r e a s o n s and so
a r e u s e f u l in e i t h e r l o c a l i z i n g a f a i l u r e o r in
a t t e m p t i n g to i d e n t i f y as m a n y d i f f e r e n t e r r o r s
a s p o s s i b l e in a s e t of t e s t r u n s . M i n i m i z i n g
t h e c o s t of t e s t i n g r e q u i r e s j u d g e m e n t in d e c i d ing w h i c h e q u i v a l e n c e c l a s s e s to c h o o s e f r o m i n
s a t i s f y i n g C O M P L E T E ( T , C).
We d i s c u s s e d e a r l i e r why p r o v i n g v a l i d i t y n e e d
n o t be d i f f i c u l t . P r o v i n g r e l i a b i l i t y i s h a r d e r .
Such a p r o o f s u f f e r s f r o m t h e s a m e p r o b l e m s as
d i r e c t p r o o f s of p r o g r a m c o r r e c t n e s s , i n s o f a r a s
• i n s u r i n g t h e c o r r e c t n e s s of t h e p r o o f and a x i o m a t i z a t i o n of the r u n - t i m e e n v i r o n m e n t i s c o n c e r n e d .
A p r o o f of r e l i a b i l i t y m a y be s o m e w h a t e a s i e r t h a n
a d i r e c t p r o o f of p r o g r a m c o r r e c t n e s s , h o w e v e r ,
s i n c e t h e p r o o f c a n a s s u m e the p r o g r a m w o r k s
c o r r e c t l y f o r s o m e d at a. T h e p r i n c i p a l p r o b l e m
in the p r o o f i s to show that d a t a s a t i s f y i n g e a c h
t e s t p r e d i c a t e a r e t r e a t e d in s u f f i c i e n t l y the s a m e
way that successfully exercising each test predic a t e o n c e i s s u f f i c i e n t to i m p l y s u c c e s s f u l e x e cution~
a l l data s a t i s f y i n g the p r e d i c a t e . At a
m i n i m u m , this r e q u i r e s understanding all condit i o n s r e l e v a n t to the s u c c e s s f u l p r o c e s s i ~ , g o f a
t e s t d a t u m , b u t i t d o e s not r e q u i r e p r o v i n g t h a t
the t e s t datum will be s u c c e s s f u l l y e x e c u t e d - that is proved b y t e
s u c c e s s of a t e s t run. T h i s
sh o w s t h a t p r o v i n g t e s t r e l i a b i l i t y i s t h e i n d u c t i o n
s t e p i n an i n d u c t i v e p r o o f of p r o g r a m c o r r e c t n e s s
u s i n g t h e f u n d a m e n t a l t h e o r e m ; the t e s t i t s e l f i s ,
of c o u r s e , the basis st ep .
D e f e c t s in a p r o o f of r e l i a b i l i t y , a s i d e f r o m
s i m p l e l o g i c a l e r r o r s , a r e m o s t l i k e l y to s t e m
f r o m f a i l u r e to find a l l c o n d i t i o n s r e l e v a n t to t h e
s u c c e s s f u l p r o c e s s i n g of data. F o r e x a m p l e , t h e
s e t of t e s t p r e d i c a t e s d e s c r i b e d by a c o n d i t i o n
t a b l e w i l l f a i l to b e r e l i a b l e i f t h e d o m a i n of
v a l u e s a s s o c i a t e d with s o m e v a r i a b l e is not
c o r r e c t l y p a r t i t i o n e d i n t ~ r e l e v a n t c l a s s e s of
v a l u e s (e. g . , t h e t e s t p r e d i c a t e s d e v e l o p e d i n
S e c t i o n 4 would be u n r e l i a b l e i f t h e o n l y r e l e v a n t
v a l u e s of CW w e r e not B L , N L , E T , a n d ' H e r ,
i. e . , i f t h e s e w e r e n o t the only v a l u e s r e l e v a n t
w h e n c e r t a i n a c t i o n s a r e to h e p e r f o r m e d ) o r i f
some v a r i a b l e or condition w e r e completely
m i s s i n g f r o m t h e t a b l e (e. g . , th e p r e d i c a t e
fi11+bufpos = MAXPOS). Note t h e d i f f e r e n c e
h e r e b e t w e e n a r e l i a b i l i t y e r r o r and a v a l i d i t y
e r r o r . R e l i a b i l i t y r e q u i r e s p a r t i t i o n i n g the
v a l u e s e t f o r CW c o r r e c t l y i n t o s e t s of v a l u e s
" t r e a t e d the s a m e " b y t h e p r o g r a m . V a l i d i t y
r e q u i r e s the p o s t u l a t e d v a l u e .set f o r CW i n c l u d e
a l l p o s s i b l e v a l u e s f o r CW.
5) t h e t e s t p r e d i c a t e s m u s t b e i n d e p e n d e n t , e. g . ,
a l t data s a t i s f y i n g a p a r t i c u l a r t e s t p r e d i c a t e
m u s t e x e r c i s e the s a m e path in the p r o g r a m
and t e s t the s a m e b r a n c h p r e d i c a t e s .
Note t h a t c o n s t r a i n t s 1 , 2 , and 3 a r e s a t i s f i e d only
w i t h k n o w l e d g e of the d e t a i l s of a p r o g r a m * s i m p l e mentation. Satisfying c o n s t r a i n t 5 r e q u i r e s knowing s o m e t h i n g of the p r o g r a m l s i n t e r n a l s t r u c t u r e .
V e r i f y i n g c o n s t r a i n t 4 m a y r e q u i r e both i n t e r n a l
and e x t e r n a l k n o w l e d g e of a p r o g r a m . U n d o u b t e d l y ,
m o r e c o n s t r a i n t s than t h e s e f i v e n e e d to b e s a t i s f i e d
to i n s u r e r e l i a b i l i t y , but t h i s i s a s u b j e c t f o r f u t u r e
work.
Clearly, satisfying all these c o n s t r a i n t s can
l e a d to l o t s of t e s t p r e d i c a t e s r e q u i r i n g a l a r g e
s e t of t e s t data f o r a c o m p l e t e t e s t . If an u n r e a s o n a b l e a m o u n t of t e s t data i s r e q u i r e d , a
j u d i c i o u s c o m b i n a t i o n of d i r e c t p r o g r a m p r o v i n g
and e m p i r i c a l j u d g e m e n t c a n r e d u c e s i z e of a
c o m p l e t e t e s t . A s l o n g as t e s t p r e d i c a t e s a r e
e l i m i n a t e d only a f t e r a11 c o n d i t i o n s p o t e n t i a l l y
r e l e v a n t to the c o r r e c t o p e r a t i o n of the p r o g r a m
h a v e b e e n i d e n t i f i e d , any r e d u c t i o n s in t e s t v a l i d i t y a r e at l e a s t b e i n g m a d e c o n s c i o u s l y r a t h e r
than unwittingly.
I t ' s i m p o s s i b l e to f o r m u l a t e g e n e r a l n e c e s s a r y
conditions for reliability since reliability means
t h a t a l i e r r o r s in a p r o g r a m w i l l be d e t e c t e d by
a c o m p l e t e t e s t . If a p r o g r a m in f a c t h a s no e r r o r s,
a ny s e t of t e s t p r e d i c a t e s i s r e l i a b l e . F o r e x a m p l e , e x e r c i s i n g a l l s t a t e m e n t s is not a n e c e s s a r y
c o n d i t i o n f o r r e l i a b i l i t y as l o n g a s u n e x e r c i s e d
s t a t e m e n t s a r e n e v e r r e s p o n s i b l e f o r an e r r o r .
But s u p p o s e t h a t t e s t p r e d i c a t e s a r e d e f i n e d so
data b e l o n g i n g to a p a r t i c u l a r e q u i v a l e n c e c l a s s
s o m e t i m e s e x e r c i s e a p a r t i c u l a r s t a t e m e n t and
s o m e t i m e s do not. T h e n p r o v i n g the r e l i a b i l i t y
of t h e s e t e s t p r e d i c a t e s w i l l be h a r d e r b e c a u s e
it m u s t be shown that the s t a t e m e n t s not e x e r c i s e d by s o m e data in t h e e q u i v a l e n c e c l a s s n e v e r
c a u s e an e r r o r .
O t h e r w i s e , i f an e r r o r i s a s s o c i a t e d with the o c c a s i o n a l l y e x e c u t e d s t a t e m e n t ,
s o m e data in the c l a s s w i l l e x e c u t e s u c c e s s f u l l y
and s o m e w i l l not, and this c o n t r a d i c t s R E L I A BLE(C)o So, in g e n e r a l , we c a n only d e s c r i b e
c e r t a i n c h a r a c t e r i s t i c s t h a t if not s a t i s f i e d e i t h e r
m a k e a p r o o f of r e l i a b i l i t y h a r d e r o r m a k e i t
m o r e l i k e l y the t e s t p r e d i c a t e s a r e not r e l i a b l e
f o r t h e p r o g r a m to be t e s t e d . F o r e x a m p l e , it
a p p e a r s t h a t a s e t of t e s t p r e d i c a t e s m u s t at
l e a s t s a t i s f y t h e f o l l o w i n g c o n d i t i o n s to h a v e a
r e a s o n a b l e c h a n c e of b e i n g r e l i a b l e :
6.
Summary
In t h i s p a p e r , we h a v e exa~_ined s o m e f u n d a m e n t a l q u e s t i o n s and p r o b l e m s c o n c e r n i n g p r o gram testing:
1. What is t h e p r o p e r r o l e of t e s t i n g in p r o g r a m
d e v e l o p m e n t ? How d o e s t e s t i n g c o m p a r e to
p r o g r a m p r o v i n g as a m e a n s of i m p r o v i n g
software reliability?
2. What a r e t h e w e a k n e s s e s of t e s t i n g and to
w h a t e x t e n t can t h e y be o v e r c o m e ?
3. What i s a s y s t e m a t i c m e t h o d f o r s e l e c t i n g
t e s t data and h o w can the m e t h o d be evaluated
a s to i t s r e l i a b i l i t y and v a l i d i t y ?
4. I s t h e r e a t h e o r y of t e s t i n g that can p o i n t the
w a y to i m p r o v e d m e t h o d s and p r i n c i p l e s of
testing?
In a n s w e r i n g t h e s e q u e s t i o n s we h a v e shown the
following.
1. Th e R o l e of T e s t i n g - T e s t i n g h a s an e s s e n t i a l r o l e in p r o g r a m d e v e l o p m e n t if f o r no o t h e r
r e a s o n than the p e r v a s i v e n e s s of c a s e a n a l y s i s
in a n a l y z i n g r e q u i r e m e n t s , d e f i n i n g s p e c i f i c a t i o n s ,
i m p l e m e n t i n g p r o g r a m s , and p r o v i n g p r o g r a m
c o r r e c t n e s s . We h a v e d i s c u s s e d h o w the p r a c t i c £
of a t t e m p t i n g f o r m a l o r i n f o r m a l p r o o f s of p r o g r a m c o r r e c t n e s s is useful for improving r e l i a b i l i t y , but s u f f e r s f r o m the s a m e t y p e s of e r r o r s
as p r o g r a m m i n g and t e s t i n g , n a m e l y , f a i l u r e to
find and v a l i d a t e a l l s p e c i a l c a s e s (i. e . , c o m b i n a t i o n s of c o n d i t i o n s ) r e l e v a n t to a d e s i g n , i t s
s p e c i f i c a t i o n s , t h e p r o g r a m , and i t s p r o o f . N e i t h e r
testing: n o r p r o g r a m p r o v i n g can in p r a c t i c e p r o v i d e c o m p l e t e a s s u r a n c e of p r o g r a m c o r r e c t n e s s ,
but the t e c h n i q u e s c o m p l e m e n t e a c h o t h e r and
t o g e t h e r p r o v i d e g r e a t e r a s s u r a n c e of c o r r e c t n e s s than e i t h e r alone.
1 ) e v e r y i n d i v i d u a l b r a n c h i n g c o n d i t i o n in the
p r o g r a m m u s t be" r e p r e s e n t e d by an e q u i v a l e n t "~ c o n d i t i o n in the t a b l e ;
2) e v e r y p o t e n t i a l t e r m i n a t i o n c o n d i t i o n in th e
p r o g r a m [Sites(1974)] e . g . , o v e r f l o w , m u s t
be r e p r e s e n t e d by a c o n d i t i o n in the t a b l e ;
3) e v e r y v a r i a b l e m e n t i o n e d in a c o n d i t i o n
m u s t h a v e b e e n p a r t i t i o n e d c o r r e c t l y into
c l a s s e s t h a t a r e " t r e a t e d the s a m e " by t he
program;
4 ) e v e r y c o n d i t i o n r e l e v a n t to the c o r r e c t o p e r a t i o n of the p r o g r a m t h a t i s i m p l i e d by the
s p e c i f i c a t i o n , k n o w l e d g e of th e p r o g r a m ' s
data s t r u c t u r e s , o r k n o w l e d g e of t h e g e n e r a l
m e t h o d b e i n g i m p l e m e n t e d by t h e p r o g r a m
m u s t be r e p r e s e n t e d as a c o n d i t i o n in the
table;
2. T e s t W e a k n e s s e s - Th e w e a k n e s s of t e s t i n g
l i e s in c o n c l u d i n g that f r o m t h e s u c c e s s f u l e x e c u t i o n of s e l e c t e d t e s t d at a, a p r o g r a m is c o r r e c t
f o r a11 data. C r i t e r i a f o r t e s t data s e l e c t i o n b a s e d
solely, on i n t e r n a l p r o g r a m s t r u c t u r e a r e c l e a r l y
too w e a k to i n s u r e c o n f i d e n c e in the r e s u l t s of a
successful test. since such testing methods a r e
~T h i s m e a n s v e r i f y i n g that the c o n d i t i o n s r e p r e s e n t e d in the c o n d i t i o n t a b l e a r e b e i n g a c c u r a t e l y
t e s t e d i n s i d e the p r o g r a m ; t h i s c h e c k i s n e e d e d
to i n s u r e that a t a b l e p r e d i c a t e , e . g . , "X, Y, and
Z a r e aU e q u a l " i s not a c t u a l l y e v a l u a t e d in the
p r o g r a m as ( X + Y + Z ) / 3 = X .
509
Hetzel, W.C. P r o g r a m T e s t Methods. l~rentice Hall, E n g l e w o o d Cliffs, N . J . , 1973.
Hansen," P~ B. Testing m u l t i p r o g r a r n m i n g s y s t e m s .
Software P r a c t i c e a n d Experience 3 , 2 ( A p r i l - J u n e
too weak to n e c e s s a r i l y r e v e a l a l l d e s i g n e r r o r s
o r m a n y t y p e s of c o n s t r u c t i o n e r r o r s . 11eliabilitty
i s the k e y to m e a n i n g f u l t e s t i n g and l a c k of r e l i a b i l i t y is the r e a s o n for the c u r r e n t w e a k n e s s of
t e s t i n g as a m e t h o d for i n s u r i n g s o f t w a r e c o r r e c t n e s s . F u r t h e r w o r k is n e e d e d to show when a t e s t
is reliable.
f973), 145-150.
K i n g , 1n, J . H . The i n t e r p r e t a t i o n of l i m i t e d e n t r y
decision t a b l e f o r m a t and r e l a t i o n s h i p s a m o n g
c o n d i t i o n s . C o m i m t e r J o u r n a l 12, 4 (November
3. T e s t Methodology - We b e l i e v e m o s t s o f t w a r e
e r r o r s r e s u l t f r o m f a i l i n g to see or deal c o r r e c t ly with a l l c o n d i t i o n s and c o m b i n a t i o n s of c o n d i t i o n s r e l e v a n t to the d e s i r e d o p e r a t i o n of a p r o gram. An effective methodology for reliable
p r o g r a m d e v e l o p m e n t m u s t focus on u n c o v e r i n g
t h e s e c o n d i t i o n s and t h e i r c o m b i n a t i o n s . This
i s the m o t i v a t i o n for o u r u s e of c o n d i t i o n t a b l e s - it m a k e s it e a s i e r to find a n d a n a l y z e c o n d i t i o n
c o m b i n a t i o n s . E x p e r i m e n t a l u s e of c o n d i t i o n
t a b l e s and f u r t h e r study of the m e t h o d is n e e d e d ,
howeverp b e f o r e i t s full b e n e f i t s c a n be r e a l i z e d
i n p r a c t i c e , In a n y event, a s y s t e m a t i c a p p r o a c h
to t e s t data s e l e c t i o n i s c l e a r l y n e c e s s a r y to o b t a i n effective t e s t data with r e a s o n a b l e effort.
M o s t s y s t e m a t i c m e t h o d s p r o p o s e d to date have
b e e n b a s e d s o l e l y on knowledge of a p r o g r a m t s
i n t e r n a l s t r u c t u r e , and y e t such k n o w l e d g e i s
i n s u f f i c i e n t by i t s e l f to y i e l d a d e q u a t e l y r e l i a b l e
tests.
1969), 3Z0-326.
London• R . L . Software r e l i a b i l i t y t h r o u g h p r o v i n g
p r o g r a m s c o r r e c t . I E E E Int. Syrnp. on F a u l t To!er:a n t Computing• h4arch 1971.
N a u r , P. P r o g r a m m i n g by a c t i o n c l u s t e r s . BIT 9~ 3
(1969a), 250,258.
N a u r , P . , and 11ande11, B. ( e d s . ) Software E n g i n e e r r
O S c i e n t i f i c A f f a i r s D i v i s i o n • N~ATO• B r u s s e l s 39,
gJ.u.m, J a n u a r y 1969b.
P o o c h , U. W, T r a n s l a t i o n of d e c i s i o n t a b l e s . Com......:
p u t i n g S u r v e y s 6, 2 (June 1974), 125°151.
P o o l e , I:~. C. Delmgging and t e s t i n g ° I n B a u e r , F . L.
A d v a n c e d C o u r s e on Software E n g i n e e r i n g • S p r i n g e r V e r l a g , N e w York~ 1973, Z78-318.
11ubey, 11. J. et. al. C o m p a r a t i v e E v a l u a t i o n of
PL/I___~• AD 6 6 9 0 ~ , A p r i l 1968.
4. T h e o r y of T e s t i n g - We have m a d e a s t a r t
t o w a r d a t h e o r y of t e s t i n g by d e f i n i n g s o m e
b a s i c p r o p e r t i e s of t h o r o u g h t e s t s - - r e l i a b i l i t y ,
v a l i d i t y j and c o m p l e t e n e s s . We have a l s o d e v e l o p e d a s p e c i f i c v e r s i o n of the t h e o r y i n which
the t e s t data s e l e c t i o n c r i t e r i o n c o n s i s t s of s e t s
of t e s t p r e d i c a t e s o r g a n i z e d i n a c o n d i t i o n t a b l e .
O t h e r t h e o r e t i c a l d e v e l o p m e n t s could p r o c e e d b y
d e f i n i n g o t h e r types of t e s t data s e l e c t i o n c r i t e r i a .
Sites, 11. L. P r o v i n g that c o m p u t e r p r o g r a m s t e r m i n a t e c l e a n l y . S t a n f o r d Univ. • C o m p u t e r Sci.
Dept, • S T A N - C S - 7 4 - 4 1 8 , May 1974.
Stucki, L . G . and Svegel, N . P . Software A u t o m a t e d
V e r i f i c a t i o n S y s t e m Study, M c D o n n e l l D o u g l a s
A s t r o n a u t i c s C o . , AD-7~4086, J a n u a r y 1974.
The c o n c e p t of p r o v i n g t e s t r e l i a b i l i t y , v a l i d ity, and c o m p l e t e n e s s and t h e n t e s t i n g a p r o g r a m
( r a t h e r t h a n a t t e m p t i n g to p r o v e the p r o g r a m c o r r e c t d i r e c t l y o r u s i n g a n o n - r i g o r o u s a p p r o a c h to
t e s t i n g ) i s i m p o r t a n t for s o f t w a r e r e l i a b i l i t y . P r e s u m a b l y s e v e r y p i e c e of s o f t w a r e u n d e r g o e s s o m e
type of t e s t i n g , y e t the f u n d a m e n t a l t e c h n i q u e s of
t e s t data s e l e c t i o n have not p r e v i o u s l y b e e n as
c a r e f u l l y and c r i t i c a l l y a n a l y z e d as have s o m e
l e s s w i d e l y u s e d t e c h n i q u e s such as p r o o f of
c o r r e c t n e s s . We know l e s s about the t h e o r y of
t e s t i n g , which we do oftenp t h a n about the t h e o r y
of p r o g r a m p r o v i n g , which we do s e l d o m . This
p a p e r is a step t o w a r d r e d r e s s i n g this i m b a l a n c e .
P~efer e n c e s
Boehm, B. Software and its i m p a c t : a q u a n t i t a t i v e
a s s e s s m e n t . D a t a m a t i o n 19(May 1973), 48-59.
Boehm, B . W . etoal. Some e x p e r i e n c e with a u t o m a t e d aids to the d e s i g n of l a r g e - s c a l e r e l i a b l e
s o f t w a r e . In this p r o c e e d i n g s , A p r i l 1975,
Buxton, J . N . and R a n d e l l , B. Software E n g i n e e r ing TechniqLL~s, S c i e n t i f i c A f f a i r s D i v i s i o n , NATO,
B r u s s e l s 39, [~elgium, A p r i l 1970.
Dzthl, O. -J. , D i j k s t r a , E. W . , and H o a r e , C. A. R.
S t r u c t u r e d P r o g r a n l m i n g . A c a d e m i c P r e s s , New
York, 1972.
Elmendr~rf0 ~V. R. C o n t r o l l i n g the f u n c t i o n a l t e s t ing of an o p e r a t i n g s y s t e m . I E E E T r a n s . Sys. Sci.
and C y b . ~ S S C - 5 , 4 (October 1969); 284-289.
510
Download