Summarizing “Structural” Testing
• Now that we have learned to create test cases through both:
– a) Functional (blackbox) and
– b) Structural (whitebox) testing methodologies,
we might ask how much is “enough” or “when should we stop testing.”
Some potential answers to “when is enough” Testing
• We stop testing:
1. When we run out of time
2. When no more failures are encountered during testing
3. When no more defects are revealed by testing
4. When we have executed all the designed test cases
5. When we cannot think of any more test cases to run
6. When we reach a point of “diminishing” return
7. When all faults are discovered (undecidable), or when the preset % of “fault seeds” is found (see the last slide)
Explanation of “when to stop” testing
1. Unfortunately, “when we run out of time” is an often-used criterion to stop testing! Think of the following:
– Customer satisfaction
– Increased customer support cost and fix cost
2. Some quality-conscious organizations use reliability theory, and stop testing “when no more, or few, failures or defects can be revealed.” (This is hard to do.)
3. “When we have executed all the designed test cases” is fine if the designed test cases provide good coverage; otherwise, it is just a convenient statement to meet schedule.
4. “When we cannot think of any more test cases,” after properly analyzing the test case coverage, would be another acceptable solution.
5. “When we reach a point of diminishing return” is a good management solution, similar to the reliability theory of not revealing any more new defects or failures. (Otherwise, “diminishing return” needs to be defined.)
6. “When all faults are discovered” is not possible theoretically, and especially so for large systems.
[Figure: total number of bugs found (y-axis) versus time or total test cases run (x-axis), with points marked “start considering terminating testing” and “terminate testing”.]
– Are there gaps and redundancies?
– Have we covered all the relevant situations?
We will use the Triangle Problem as an example to look at these questions
Previous Sample Triangle Pseudo-code
1. Program Triangle
2. Declare a, b, c as Integer
3. Declare IsTriangle as Boolean
4. Output ( “enter a, b, and c integers”)
5. Input (a, b, c)
6. Output (“side 1 is”, a)
7. Output (“side 2 is”, b)
8. Output (“side 3 is”, c)
9. If (a<b+c) AND (b<a+c) AND (c<b+a)
10. then IsTriangle = True
11. else IsTriangle = False
12. endif
13. If IsTriangle
14. then if (a=b) AND (b=c)
15. then Output (“equilateral”)
16. else if (a NE b) AND (a NE c) AND (b NE c)
17. then Output ( “Scalene”)
18. else Output (“Isosceles”)
19. endif
20. endif
21. else Output (“not a triangle”)
22. endif
23. end Triangle
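As a rough illustration, the pseudo-code above could be translated into Python along the following lines. This is only a sketch: the function name `classify_triangle` and the returned label strings are assumptions made for this example, and the input/output statements (lines 4 to 8) are replaced by function arguments and a return value.

```python
def classify_triangle(a: int, b: int, c: int) -> str:
    """Classify three integer sides, mirroring pseudo-code lines 9-22."""
    # Lines 9-12: each side must be shorter than the sum of the other two.
    is_triangle = (a < b + c) and (b < a + c) and (c < b + a)

    # Lines 13-22: classify only when the sides form a triangle.
    if is_triangle:
        if a == b and b == c:
            return "Equilateral"
        elif a != b and a != c and b != c:
            return "Scalene"
        else:
            return "Isosceles"
    return "Not a Triangle"


if __name__ == "__main__":
    for sides in [(100, 100, 100), (3, 4, 5), (100, 100, 1), (100, 100, 200)]:
        print(sides, classify_triangle(*sides))
```

Note that the Scalene check compares all three pairs of sides; checking the same pair twice would classify some isosceles triangles as scalene.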
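Condensation Graph from pseudo code
[Figure: condensation graph of the Triangle pseudo-code. Nodes: 1-8 (first), 9, 10 (IsTriangle = True), 11 (IsTriangle = False), 12, 13, 14, 15 (equilateral), 16, 17 (scalene), 18 (isosceles), 19, 20, 21 (not a triangle), 22 (last). From node 13, the “Triangle” edge leads to 14 and the “~Triangle” edge leads to 21.]
• Statement coverage: 4 paths
• Branch (DD-path) coverage: 4 paths
• Cyclomatic number = 4 + 1 = 5, so 5 linearly independent paths
• All combinations: 8 paths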
All Combination paths?
• Let’s look at all 8 combination paths:
1. P1: <8,9,10,12,13,14,15,20,22> (Equilateral)
2. P2: <8,9,10,12,13,14,16,17,19,20,22> (Scalene)
3. P3: <8,9,10,12,13,14,16,18,19,20,22> (Isosceles)
4. P4: <8,9,10,12,13,21,22> (not possible)
5. P5: <8,9,11,12,13,14,15,20,22> (not possible)
6. P6: <8,9,11,12,13,14,16,17,19,20,22> (not possible)
7. P7: <8,9,11,12,13,14,16,18,19,20,22> (not possible)
8. P8: <8,9,11,12,13,21,22> (Not a triangle)
- So, there are 4 decision-to-decision (dd) paths (branch testing) that make sense.
- These are P1, P2, P3, and P8.
- We should at least test these four paths.
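The enumeration above can be reproduced mechanically. The sketch below combines the two outcomes of the first decision (node 10 or node 11) with the four possible endings, and marks a path as feasible only when the value assigned to IsTriangle agrees with the branch taken at node 13. The names `FIRST_BRANCH`, `TAILS`, and `combination_paths` are made up for this illustration.

```python
from itertools import product

# Node 10 sets IsTriangle = True; node 11 sets it to False.
FIRST_BRANCH = {10: True, 11: False}

# (label, nodes after node 12, whether this ending needs IsTriangle = True)
TAILS = [
    ("Equilateral", [13, 14, 15, 20, 22], True),
    ("Scalene", [13, 14, 16, 17, 19, 20, 22], True),
    ("Isosceles", [13, 14, 16, 18, 19, 20, 22], True),
    ("Not a triangle", [13, 21, 22], False),
]


def combination_paths():
    """Return (name, nodes, label, feasible) for all 8 combinations."""
    paths = []
    combos = product(FIRST_BRANCH.items(), TAILS)
    for index, ((branch, is_triangle), (label, tail, needs_triangle)) in enumerate(
        combos, start=1
    ):
        nodes = [8, 9, branch, 12] + tail
        # A path is feasible only if node 13 sees the value set at node 10/11.
        feasible = is_triangle == needs_triangle
        paths.append((f"P{index}", nodes, label, feasible))
    return paths


for name, nodes, label, feasible in combination_paths():
    status = label if feasible else "not possible"
    print(f"{name}: <{','.join(map(str, nodes))}> ({status})")
```

The four infeasible paths arise because the decision at node 13 depends entirely on the assignment made at node 10 or 11, so the two decisions are not independent.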
Compare against Boundary Value Test
(15 test cases for Triangle problem)
Remember the boundary: 1 ≤ TriangleSide ≤ 200
Test case   a     b     c     Expected output   Path
1           100   100   1     Isosceles         P3
2           100   100   2     Isosceles         P3
3           100   100   100   Equilateral       P1
4           100   100   199   Isosceles         P3
5           100   100   200   Not Triangle      P8
6           100   1     100   Isosceles         P3
7           100   2     100   Isosceles         P3
8           100   100   100   Equilateral       P1
9           100   199   100   Isosceles         P3
10          100   200   100   Not Triangle      P8
11          1     100   100   Isosceles         P3
12          2     100   100   Isosceles         P3
13          100   100   100   Equilateral       P1
14          199   100   100   Isosceles         P3
15          200   100   100   Not Triangle      P8
Let’s analyze this table in more detail (next chart).
• A potential “gap” exists in the Boundary Value Test. When we look at the equivalence classes (or logic table) of the outputs, we see that the Scalene triangle is not covered.
– Path P2 is not covered by the 15 Boundary Value test cases!
• There are, however, lots of “duplications”:
– P3 is covered 9 times (Isosceles triangle)
– P1 is covered 3 times (Equilateral)
– P8 is covered 3 times (Not Triangle)
Clearly, boundary value (functional) testing is not enough here; is it possible that it is also not as efficient?
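The path counts above can be checked with a short script. The sketch below regenerates the 15 boundary value cases (each side takes the values 1, 2, 100, 199, 200 while the other two stay at the nominal value 100) and tallies which feasible path each case exercises. The helper names `triangle_path` and `boundary_value_cases` are assumptions for this example, and the path labels follow the P1/P2/P3/P8 naming used in these slides.

```python
from collections import Counter

NOMINAL = 100
BOUNDARY_VALUES = [1, 2, 100, 199, 200]  # min, min+, nominal, max-, max


def triangle_path(a: int, b: int, c: int) -> str:
    """Return the feasible path (P1, P2, P3 or P8) taken for sides a, b, c."""
    is_triangle = (a < b + c) and (b < a + c) and (c < b + a)
    if not is_triangle:
        return "P8"  # Not a triangle
    if a == b and b == c:
        return "P1"  # Equilateral
    if a != b and a != c and b != c:
        return "P2"  # Scalene
    return "P3"  # Isosceles


def boundary_value_cases():
    """Vary c, then b, then a, holding the other two sides at NOMINAL."""
    cases = []
    for position in (2, 1, 0):
        for value in BOUNDARY_VALUES:
            sides = [NOMINAL, NOMINAL, NOMINAL]
            sides[position] = value
            cases.append(tuple(sides))
    return cases


counts = Counter(triangle_path(*case) for case in boundary_value_cases())
for path in ("P1", "P2", "P3", "P8"):
    print(path, counts[path])
```

Because two sides are always held at 100, no boundary value case can have three different side lengths, which is why the Scalene path is never reached.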
Comparison Metrics of Functional vs. Structural Test Effectiveness
• Assume:
1. Functional Test M generates m test cases.
2. Structural Test S generates s structural elements (structural elements = the chosen paths for the S test).
3. When all of the m test cases are executed, n of the s structural elements are traversed or covered, where n ≤ s.
• Then three metrics for evaluating the testing “effectiveness” of functional with respect to structural testing are:
– Coverage of M with respect to S: C(M,S) = n/s
– Redundancy of M with respect to S: R(M,S) = m/s
– Net redundancy of M with respect to S: NR(M,S) = m/n
Comparison for the Triangle Example
• The Boundary Value Test, M, generated 15 test cases; so m = 15.
• The dd-path (or Branch) Test generated 4 paths for test cases; so s = 4.
• The 15 M test cases cover 3 of the 4 paths from the S test; so n = 3.
The 3 comparisons of effectiveness of M to S show:
Coverage(M,S) = 3 / 4 : 75% coverage effectiveness
Redundancy(M,S) = 15 / 4 : 375% redundancy
NetRed(M,S) = 15 / 3 : 500% net redundancy
Note the penalty here
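The three ratios are simple enough to compute by hand, but a small helper makes the definitions explicit. The sketch below is illustrative only: the function name `effectiveness_metrics` and the dictionary keys are chosen for this example, and it assumes n is at least 1 so that net redundancy is defined.

```python
from fractions import Fraction


def effectiveness_metrics(m: int, s: int, n: int) -> dict:
    """Coverage, redundancy and net redundancy of M with respect to S.

    m: number of functional test cases
    s: number of structural elements (chosen paths)
    n: number of structural elements covered by the m test cases
    """
    if not 0 < n <= s:
        raise ValueError("need 0 < n <= s")
    return {
        "coverage": Fraction(n, s),        # C(M,S) = n/s
        "redundancy": Fraction(m, s),      # R(M,S) = m/s
        "net_redundancy": Fraction(m, n),  # NR(M,S) = m/n
    }


# Triangle example: 15 boundary value cases, 4 dd-paths, 3 of them covered.
metrics = effectiveness_metrics(m=15, s=4, n=3)
for name, value in metrics.items():
    print(f"{name}: {value} = {float(value):.0%}")
```

Net redundancy is never smaller than redundancy because n ≤ s: the same test cases are spread over fewer covered paths.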
Relative Efforts (Test complexity) Comparison within Structural Test Methodologies
[Figure: effort to identify test coverage elements (y-axis) versus sophistication in methodology (x-axis), for dd-path (branch), basis path, d-u path, and slice testing.]
Should we consider Structural Test Complexity when Designing?
• If so -----
– Since program slice testing takes more effort, should we have fewer program slices in our programs?
– If we do have program slices, should the slice sizes (# of statements) be small?
What was that “fault seeding” stop criterion?
• Fault seeding is a technique for
– i) determining when to stop and/or for
– ii) projecting “escaped” bugs.
• Fault seeding technique:
– Develop a number of bugs (e.g. 20 bugs) and seed them into the product, without letting the testers know.
– Pick a % (e.g. 90%) of discovery of the seeded faults by the test team to be considered as the stopping criterion.
– Run the tests and see if 0.9 x 20 = 18 of the seeded bugs are found. Stop testing only if 90% is reached.
– If the total number of unique problems found is Z (e.g. 45, NOT including the 18 seeded faults), then we may roughly project the total number of non-seeded problems, Y, and from it the remaining ones:
- Z/Y = 18/20, i.e. 45/Y = 18/20
- Y = 50
- remaining non-seeded problems = 50 - 45 = 5
We project that there are 5 more undetected problems remaining.
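The stopping check and the projection can be sketched in a few lines. This is only an illustration of the proportion used above: it assumes that testers find seeded and non-seeded faults at the same rate, and the function names `should_stop` and `project_remaining_faults` are invented for this example.

```python
def should_stop(seeded_total: int, seeded_found: int, threshold_percent: int = 90) -> bool:
    """True when the preset % of seeded faults has been found."""
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return seeded_found * 100 >= threshold_percent * seeded_total


def project_remaining_faults(seeded_total: int, seeded_found: int, real_found: int):
    """Estimate total and remaining non-seeded faults.

    Uses the proportion real_found / Y = seeded_found / seeded_total,
    where Y is the projected total number of non-seeded faults.
    """
    if not 0 < seeded_found <= seeded_total:
        raise ValueError("need 0 < seeded_found <= seeded_total")
    estimated_total = real_found * seeded_total / seeded_found
    return estimated_total, estimated_total - real_found


# Example from the slide: 20 seeded, 18 of them found, 45 non-seeded found.
print(should_stop(20, 18))
total, remaining = project_remaining_faults(20, 18, 45)
print(total, remaining)
```

The projection is only as good as the seeding: if the seeded faults are easier or harder to find than real ones, the estimate will be off in the corresponding direction.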