Uploaded by bbhkqlkjpmzgkijadf

4.data-testing

advertisement
Dependable Software Systems
Topics in
Data-Flow Testing
Material drawn from [Beizer, Mancoridis, Vokolos]
© SERG
Data-Flow Testing
• Data-flow testing uses the control
flowgraph to explore the unreasonable
things that can happen to data (i.e.,
anomalies).
• Consideration of data-flow anomalies leads
to test path selection strategies that fill the
gaps between complete path testing and
branch or statement testing.
© SERG
Data-Flow Testing (Cont’d)
• Data-flow testing is the name given to a
family of test strategies based on selecting
paths through the program’s control flow in
order to explore sequences of events related
to the status of data objects.
• E.g., Pick enough paths to assure that:
– Every data object has been initialized prior to its
use.
– All defined objects have been used at least once.
© SERG
Data Object Categories
• (d) Defined, Created, Initialized
• (k) Killed, Undefined, Released
• (u) Used:
–
–
(c) Used in a calculation
(p) Used in a predicate
© SERG
(d) Defined Objects
• An object (e.g., variable) is defined when it:
–
–
–
–
–
appears in a data declaration
is assigned a new value
is a file that has been opened
is dynamically allocated
...
© SERG
(k) Killed Objects
• An object is killed when it is:
– released (e.g., free) or otherwise made
unavailable (e.g., out of scope)
– a loop control variable when the loop exits
– a file that has been closed
– ...
© SERG
(u) Used Objects
• An object is used when it is part of a
computation or a predicate.
• A variable is used for a computation (c)
when it appears on the RHS (sometimes
even the LHS in case of array indices) of an
assignment statement.
• A variable is used in a predicate (p) when it
appears directly in that predicate.
© SERG
Example: Definition and Uses
What are the definitions and uses for the program
below?!
1.
2.
3.
4
read (x, y);
z = x + 2;
if (z < y)
w = x + 1;
else
5.
y = y + 1;
6. print (x, y, w, z);
© SERG
Example: Definition and Uses
1.
2.
3.
4
read (x, y);
z = x + 2;
if (z < y)
w = x + 1;
else
5.
y = y + 1;
6. print (x, y, w, z);
Def
C-use
x, y
z
x
P-use
z, y
w
x
y
y
x, y,
w, z
© SERG
Data-Flow Anomalies
• A data-flow anomaly is denoted by a two
character sequence of actions. E.g.,
– ku: Means that an object is killed and then
used.
– dd: Means that an object is defined twice
without an intervening usage.
© SERG
Example
• E.g., of a valid (not anomalous) scenario
where variable A is a dpd:
A = C + D;
if(A > 0)
X = 1;
else
X = -1;
A = B + C;
© SERG
Two Letter Combinations for
dku
•
•
•
•
•
•
•
•
•
dd: Probably harmless, but suspicious.
dk: Probably a bug.
du: Normal situation.
kd: Normal situation.
kk: Harmless, but probably a bug.
ku: Definitely a bug.
ud: Normal situation (reassignment).
uk: Normal situation.
uu: Normal situation.
© SERG
Single Letter Situations
• A leading dash means that nothing of
interest (d, k, u) occurs prior to the action
noted along the entry-exit path of interest.
• A trailing dash means that nothing of
interest happens after the point of action
until the exit.
© SERG
Single Letter Situations
• -k: Possibly anomalous:
– Killing a variable that does not exist.
– Killing a variable that is global.
• -d: Normal situation.
• -u: Possibly anomalous, unless variable is
global.
• k-: Normal situation.
• d-: Possibly anomalous, unless variable is
global.
• u-: Normal situation.
© SERG
Data-Flow Anomaly State Graph
state of variable
action
K
k
u,k
d
u
U
d
u
D
d,k
anomalous
state
A
k,u,d
© SERG
Data-Flow Anomaly State Graph
with Variable Redemption
u
KU
u
k
k
u
K
d
k
d
DK
k
d
D
k
u
U
d
d
k
u
u
DD
d
© SERG
Static vs Dynamic
Anomaly Detection
• Static Analysis is analysis done on source
code without actually executing it.
• E.g., Syntax errors are caught by static
analysis.
© SERG
Static vs Dynamic
Anomaly Detection (Cont’d)
• Dynamic Analysis is analysis done as a
program is executing and is based on
intermediate values that result from the
program’s execution.
• E.g., A division by 0 error is caught by
dynamic analysis.
• If a data-flow anomaly can be detected by
static analysis then the anomaly does not
concern testing. (Should be handled by the
compiler.)
© SERG
Anomaly Detection Using
Compilers
• Compilers are able to detect several dataflow anomalies using static analysis.
• E.g., By forcing declaration before use, a
compiler can detect anomalies such as:
– -u
– ku
• Optimizing compilers are able to detect
some dead variables.
© SERG
Is Static Analysis Sufficient?
•
•
•
•
Questions:
Why isn’t static analysis enough?
Why is testing required?
Could a good compiler detect all data-flow
anomalies?
• Answer: No. Detecting all data-flow
anomalies is provably unsolvable.
© SERG
Static Analysis Deficiencies
• Current static analysis methods are
inadequate for:
– Dead Variables: Detecting unreachable
variables is unsolvable in the general case.
– Arrays: Dynamically allocated arrays contain
garbage unless they are initialized explicitly.
(-u anomalies are possible)
© SERG
Static Analysis Deficiencies
(Cont’d)
– Pointers: Impossible to verify pointer values at
compile time.
– False Anomalies: Even an obvious bug (e.g.,
ku) may not be a bug if the path along which
the anomaly exists is unachievable.
(Determining whether a path is or is not
achievable is unsolvable.)
© SERG
Data-Flow Modeling
• Data-flow modeling is based on the control
flowgraph.
• Each link is annotated with:
– symbols (e.g., d, k, u, c, p)
– sequences of symbols (e.g., dd, du, ddd)
• that denote the sequence of data operations
on that link with respect to the variable of
interest.
© SERG
Control Flowgraph Annotated for
X and Y Data Flows
1
3
4
5
6
7
8
9
10
11
12
13
2
INPUT X,Y
Z:= X+Y
Y:= X-Y
IF Z>=0 GOTO SAM
JOE: Z:=Z-1
SAM: Z:=Z+V
U:=0
LOOP
B(U),Q(V):=(Z+V)*U
IF B(U)=0 GOTO JOE
Z:=Z-1
2
IF Z=0 GOTO ELL
U:=U+1
END
UNTIL U=z
B(U-1):=B(U+1)+Q(V-1)
ELL: B(U+Q(V)):=U+V
IF U=V GOTO JOE
IF U>V THEN U:=Z
YY:Z:=U
END
1
dcc
JOE
3
4
5
LOOP
B(U)?
6
7
SAM
Z?
ELL
13
12
11
YY
U,V?
U,V?
10
U,Z?
9
Z?
8
© SERG
Control Flowgraph Annotated for
Z Data Flows
1
3
4
5
6
7
8
9
10
11
12
13
2
INPUT X,Y
p
Z:= X+Y
d
1
3 p
Y:= X-Y
IF Z>=0 GOTO SAM
Z?
JOE: Z:=Z-1
SAM: Z:=Z+V
U:=0
LOOP
d
B(U),Q(V):=(Z+V)*U
2
13 c 12
IF B(U)=0 GOTO JOE
END
YY
U,V?
Z:=Z-1
IF Z=0 GOTO ELL
U:=U+1
UNTIL U=z
B(U-1):=B(U+1)+Q(V-1)
ELL: B(U+Q(V)):=U+V
IF U=V GOTO JOE
IF U>V THEN U:=Z
YY:Z:=U
END
JOE
4
LOOP
cd
5 cd
SAM
ELL
11
U,V?
10
6
B(U)?
7
c
cd
p
p
9
U,Z?
p
8 Z?
p
© SERG
Definition-Clear Path Segments
• A Definition-clear Path Segment (w.r.t.
variable X) is a connected sequence of links
such that X is defined on the first link and
not redefined or killed on any subsequent
link of that path segment.
© SERG
Definition-Clear Path Segments
for Variable Z (Cont’d)
p
d
1
3
JOE
p
4
Z?
cd
5
cd
LOOP
B(), U?
6
7
c
SAM
cd
p
2
END
d
ELL
13
YY
c
12
11
U,V?
U,V?
10
p
9
U,Z?
p
8
Z?
p
© SERG
Non Definition-Clear Path
Segments for Variable Z (Cont’d)
p
d
1
3
JOE
p
4
Z?
cd
5
cd
LOOP
B(), U?
6
7
c
SAM
cd
p
2
END
d
ELL
13
YY
c
12
11
U,V?
U,V?
10
p
9
U,Z?
p
8
Z?
p
© SERG
Simple Path Segments
• A Simple Path Segment is a path segment
in which at most one node is visited twice.
– E.g., (7,4,5,6,7) is simple.
• Therefore, a simple path may or may not be
loop-free.
© SERG
Loop-free Path Segments
• A Loop-free Path Segment is a path
segment for which every node is visited at
most once.
– E.g., (4,5,6,7,8,10) is loop-free.
– path (10,11,4,5,6,7,8,10,11,12) is not loop-free
because nodes 10 and 11 are visited twice.
© SERG
du Path Segments
• A du Path is a path segment such that if the
last link has a use of X, then the path is
simple and definition clear.
© SERG
def-use Associations
• A def-use association is a triple (x, d, u,), where:
x is a variable,
d is a node containing a definition of x,
u is either a statement or predicate node
containing a use of x,
and there is a sub-path in the flow graph from d to u
with no other definition of x between d and u.
© SERG
Example: Def-Use Associations
1!
2!
3
T
w=x+1
4!
read (x, y)
z=x+2
Some Def-Use Associations:!
(x, 1, 2), (x, 1, 4), …!
z<y
(z, 2, (3,t)),...!
F
5!
(y, 1, (3,t)), (y, 1, (3,f)), (y, 1, 5), …!
y=y+1
6! print (x,y,w,z)
© SERG
Example: Def-Use Associations
What are all the def-use associations for the program below?
read (z)
x=0
y=0
if (z ≥ 0)
{
x = sqrt (z)
if (0 ≤ x && x ≤ 5)
y = f (x)
else
y = h (z)
}
y = g (x, y)
print (y)
© SERG
Example: Def-Use Associations
def-use associations for variable z.!
read (z)
x=0
y=0
if (z ≥ 0)
{
x = sqrt (z)
if (0 ≤ x && x ≤ 5)
y = f (x)
else
y = h (z)
}
y = g (x, y)
print (y)
© SERG
Example: Def-Use Associations
read (z)
def-use associations for variable
x=0
x.!
y=0
if (z ≥ 0)
{
x = sqrt (z)
if (0 ≤ x && x ≤ 5)
else
}
y = g (x, y)
print (y)
y = f (x)
y = h (z)
© SERG
Example: Def-Use Associations
read (z)
def-use associations for variable y.!
x=0
y=0
if (z ≥ 0)
{
x = sqrt (z)
if (0 ≤ x && x ≤ 5)
y = f (x)
else
y = h (z)
}
y=g (x, y)
print (y)
© SERG
Definition-Clear Paths
• A path (i, n1, ..., nm, j) is called a definition-clear
path with respect to x from node i to node j if it
contains no definitions of variable x in nodes
(n1, ..., nm , j) .
• The family of data flow criteria requires that the
test data execute definition-clear paths from each
node containing a definition of a variable to
specified nodes containing c-use and edges
containing p-use of that variable.
© SERG
Data-Flow Testing Strategies
•
•
•
•
•
•
•
All du Paths (ADUP)
All Uses (AU)
All p-uses/some c-uses (APU+C)
All c-uses/some p-uses (ACU+P)
All Definitions (AD)
All p-uses (APU)
All c-uses (ACU)
© SERG
All du Paths Strategy (ADUP)
• ADUP is one of the strongest data-flow
testing strategies.
• ADUP requires that every du path from
every definition of every variable to every
use of that definition be exercised under
some test All du Paths Strategy (ADUP).
© SERG
Example: pow(x,y)
/* pow(x,y)
This program computes x to the power of y, where x and y are integers.
INPUT: The x and y values.
OUTPUT: x raised to the power of y is printed to stdout.
*/
1
void pow (int x, y)
2
{
3
float z;
4
int p;
b
5
if (y < 0)
6
p = 0 – y;
a
f
d
7
else p = y;
1
5
8
9
8
z = 1.0;
9
while (p != 0)
10
{
c
e
11
z = z * x;
12
p = p – 1;
13
}
14
if (y < 0)
15
z = 1.0 / z;
16
printf(z);
17
}
g
14
16
i
17
h
© SERG
Example: pow(x,y)
du-Path for Variable x
/* pow(x,y)
This program computes x to the power of y, where x and y are integers.
INPUT: The x and y values.
OUTPUT: x raised to the power of y is printed to stdout.
*/
1
void pow (int x, y)
2
{
3
float z;
4
int p;
b
5
if (y < 0)
6
p = 0 – y;
a
f
d
7
else p = y;
1
5
8
9
8
z = 1.0;
9
while (p != 0)
c
10
{
e
11
z = z * x;
12
p = p – 1;
13
}
14
if (y < 0)
15
z = 1.0 / z;
16
printf(z);
17
}
g
14
16
i
17
h
© SERG
Example: pow(x,y)
du-Path for Variable x
/* pow(x,y)
This program computes x to the power of y, where x and y are integers.
INPUT: The x and y values.
OUTPUT: x raised to the power of y is printed to stdout.
*/
1
void pow (int x, y)
2
{
3
float z;
4
int p;
b
5
if (y < 0)
6
p = 0 – y;
a
f
d
7
else p = y;
1
5
8
9
8
z = 1.0;
9
while (p != 0)
c
10
{
e
11
z = z * x;
12
p = p – 1;
13
}
14
if (y < 0)
15
z = 1.0 / z;
16
printf(z);
17
}
g
14
16
i
17
h
© SERG
Example: pow(x,y)
du-Path for Variable y
/* pow(x,y)
This program computes x to the power of y, where x and y are integers.
INPUT: The x and y values.
OUTPUT: x raised to the power of y is printed to stdout.
*/
1
void pow (int x, y)
2
{
3
float z;
4
int p;
b
5
if (y < 0)
6
p = 0 – y;
a
f
d
7
else p = y;
1
5
8
9
8
z = 1.0;
9
while (p != 0)
c
10
{
e
11
z = z * x;
12
p = p – 1;
13
}
14
if (y < 0)
15
z = 1.0 / z;
16
printf(z);
17
}
g
14
16
i
17
h
© SERG
Example: pow(x,y)
du-Path for Variable y
/* pow(x,y)
This program computes x to the power of y, where x and y are integers.
INPUT: The x and y values.
OUTPUT: x raised to the power of y is printed to stdout.
*/
1
void pow (int x, y)
2
{
3
float z;
4
int p;
b
5
if (y < 0)
6
p = 0 – y;
a
f
d
7
else p = y;
1
5
8
9
8
z = 1.0;
9
while (p != 0)
c
10
{
e
11
z = z * x;
12
p = p – 1;
13
}
14
if (y < 0)
15
z = 1.0 / z;
16
printf(z);
17
}
g
14
16
i
17
h
© SERG
Example: pow(x,y)
du-Path for Variable y
/* pow(x,y)
This program computes x to the power of y, where x and y are integers.
INPUT: The x and y values.
OUTPUT: x raised to the power of y is printed to stdout.
*/
1
void pow (int x, y)
2
{
3
float z;
4
int p;
b
5
if (y < 0)
6
p = 0 – y;
a
f
d
7
else p = y;
1
5
8
9
8
z = 1.0;
9
while (p != 0)
c
10
{
e
11
z = z * x;
12
p = p – 1;
13
}
14
if (y < 0)
15
z = 1.0 / z;
16
printf(z);
17
}
g
14
16
i
17
h
© SERG
Example: Using du-Path Testing
to Test Program COUNT
• Consider the following program:
/* COUNT
This program counts the number of characters and lines in a text file.
INPUT: Text File
OUTPUT: Number of characters and number of lines.
*/
1
main(int argc, char *argv[])
2
{
3
int numChars = 0;
4
int numLines = 0;
5
char chr;
6
FILE *fp = NULL;
7
© SERG
Program COUNT (Cont’d)
8
if (argc < 2)
9
{
10
printf(“\nUsage: %s <filename>”, argv[0]);
11
return (-1);
12
}
13
fp = fopen(argv[1], “r”);
14
if (fp == NULL)
15
{
16
perror(argv[1]);
17
return (-2);
18
}
/* display error message */
© SERG
Program COUNT (Cont’d)
19
while (!feof(fp))
20
{
21
chr = getc(fp);
/* read character */
22
if (chr == ‘\n’)
/* if carriage return */
23
++numLines;
24
else
25
++numChars;
26
}
27
printf(“\nNumber of characters = %d”, numChars);
28
printf(“\nNumber of lines = %d”, numLines);
29
}
© SERG
The Flowgraph for COUNT
1
8
11
14
17
19
22
23
24
26
29
• The junction at line 12 and line 18 are not
needed because if you are at these lines then
you must also be at line 14 and 19
respectively.
© SERG
du-Path for argc
p
1
d
8
p
11
14
17
19
22
23
24
26
29
argc?
© SERG
du-Path for argc
p
1
d
8
p
11
14
17
19
22
23
24
26
29
argc?
© SERG
du-Path for argv[]
c
1
d
8
argc?
c
11
14
c
17
19
22
23
24
26
29
fp?
© SERG
du-Path for argv[]
1
d
8
c
11
14
17
19
22
23
24
26
29
argc?
© SERG
du-Path for numChars
c
fp?
1
d
8
argc?
11
14
fp?
17
19
22
23
24
cd
26
29
chr?
© SERG
du-Path for numChars
c
fp?
1
d
8
argc?
11
14
fp?
17
19
22
23
24
cd
26
29
chr?
© SERG
du-Path for numChars
c
fp?
1
d
8
argc?
11
14
fp?
17
19
22
23
24
cd
26
29
chr?
© SERG
du-Path for numLines
c
fp?
1
d
8
argc?
11
14
fp?
17
19
22
cd
23
24
26
29
chr?
© SERG
du-Path for numLines
c
fp?
1
d
8
argc?
11
14
fp?
17
19
22
cd
23
24
26
29
chr?
© SERG
du-Path for numLines
c
fp?
1
d
8
argc?
11
14
fp?
17
19
22
cd
23
24
26
29
chr?
© SERG
du-Path for chr
fp?
1
d
8
argc?
11
14
fp?
17
19
p
d
22
p
23
24
26
29
chr?
© SERG
du-Path for chr
fp?
1
d
8
argc?
11
14
fp?
17
19
p
d
22
p
23
24
26
29
chr?
© SERG
du-Path for fp
d
1
d
8
argc?
11
p
14
fp?
p
fp? p
17
19 pc
22
23
24
26
29
chr?
© SERG
du-Path for fp
d
1
d
8
argc?
11
p
14
fp?
p
fp? p
17
19 pc
22
23
24
26
29
chr?
© SERG
du-Path for fp
d
1
d
8
argc?
11
p
14
fp?
p
fp? p
17
19
pc
22
23
24
26
29
chr?
© SERG
All Uses Strategy (AU)
• AU requires that at least one path from
every definition of every variable to every
use of that definition be exercised under
some test.
• Hence, at least one definition-clear path
from every definition of every variable to
every use of that definition be exercised
under some test.
• Clearly, AU < ADUP.
© SERG
All p-uses/Some c-uses
Strategy (APU+C)
• APU+C requires that for every variable and
every definition of that variable include at
least one definition-free path from the
definition to every predicate use.
• If there are definitions of the variable that
are not covered by the above prescription,
then add computational-use test cases to
cover every definition.
© SERG
All c-uses/Some p-uses Strategy
(ACU+P)
• ACU+P requires that for every variable and
every definition of that variable include at
least one definition-free path from the
definition to every computational use.
• If there are definitions of the variable that
are not covered by the above prescription,
then add predicate-use test cases to cover
every definition.
© SERG
All Definitions Strategy (AD)
• AD requires that for every variable and
every definition of that variable include at
least one definition-free path from the
definition to a computational or predicate
use.
• AD < ACU+P and AD < APU+C.
© SERG
All p-uses (APU)
All c-uses (ACU)
• APU is the same as APU+C without the C
requirement.
• APU < APU+C.
• ACU is the same as ACU+P without the P
requirement.
• ACU < ACU+P.
© SERG
Relationship among DF criteria
ALL-PATHS!
ALL-DU-PATHS!
ALL-USES!
ALL-P-USES/SOME-C-USES!
ALL-C-USES/SOME-P-USES!
ALL-C-USES!
ALL-P-USES!
ALL-DEFS!
ALL-EDGES!
ALL-NODES!
© SERG
Feasible Data Flow Criteria
• What happens if we eliminate all un-executable
associations from consideration?
• If we eliminate all un-executable associations from
consideration, then there are significant differences
between the Data Flow criteria and the Feasible Data Flow
criteria.
• For a large class of “well behaved programs”, the Feasible
DF criteria All-p-uses, All-p-uses/some-c-uses, and All-uses
bridge the gap between All-edges and All-paths.
• However, for certain programs with anomalies there are
tests which satisfy All-p-uses without satisfying All-edges.
© SERG
An Example
1!
read (x)
2!
x = sqrt(x)
6
F
T
7!
3
read (y)
x<0
F
T
5!
4!
8!
9!
• Node 4 is un-executable; y is un-initialized.!
• Essentially, only def-use of concern is (x, 2, 3)!
6
T
y≠0
F
• Outcome of node 6 can be either T or F.!
• All-p-uses satisfied, but not all-edges.!
© SERG
Effectiveness of Strategies
• Ntafos compared Random, Branch, and
All uses testing strategies on 14 Kernighan
and Plauger programs.
• Kernighan and Plauger programs are a set
of mathematical programs with known bugs
that are often used to evaluate test
strategies.
• Ntafos conducted two experiments:
© SERG
Results of 2 of the 14
Ntafos Experiments
Strategy
Mean Number
of Test Cases
Percentage of
Bugs Found
Random
35
93.7
Branch
3.8
91.6
All Uses
11.3
96.3
Strategy
Mean Number
of Test Cases
Percentage of
Bugs Found
Random
100
79.5
Branch
34
85.5
All Uses
84
90.0
© SERG
Data-Flow Testing Tips
• Resolve all data-flow anomalies.
• Try to do all data-flow operations on a
variable within the same routine (i.e., avoid
integration problems).
• Use strong typing and user defined types
when possible.
© SERG
Data-Flow Testing Tips (Cont’d)
• Use explicit (rather than implicit)
declarations of data when possible.
• Put data declarations at the top of the
routine and return data objects at the bottom
of the routine.
© SERG
Summary
• Data are as important as code.
• Define what you consider to be a data-flow
anomaly.
• Data-flow testing strategies span the gap
between all paths and branch testing.
© SERG
Summary
• AU has the best payoff for the money. It
seems to be no worse than twice the number
of required test cases for branch testing, but
the results are much better.
• Path testing with Branch Coverage and
Data-flow testing with AU is a very good
combination.
© SERG
Download