Summarizing Structural Testing


Summarizing “Structural” Testing

• Now that we have learned to create test cases through both:

– a) Functional (black-box) and
– b) Structural (white-box) testing methodologies,

we might ask how much testing is “enough,” or “when should we stop testing?”

Some potential answers to “when is enough” testing

• We stop testing:


1. When we run out of time
2. When no more failures are encountered during testing
3. When no more defects are revealed by testing
4. When we have executed all the designed test cases
5. When we cannot think of any more test cases to run
6. When we reach a point of “diminishing” return
7. When all faults are discovered (undecidable), or when the preset % of “fault seeds” are found (see last slide)

Explanation of “when to stop” testing

1. Unfortunately, “when we run out of time” is an often-used criterion for stopping testing! (Think of the following):
– Customer satisfaction
– Increased customer support cost and fix cost

2. Some quality-conscious organizations use reliability theory and stop testing “when no more, or only a few, failures or defects can be revealed.” (This is hard to do.)

3. “When we have executed all the designed test cases” is fine if the designed test cases provide good coverage; otherwise, it is just a convenient statement to meet schedule.

4. “When we cannot think of any more test cases,” after properly analyzing the test case coverage, would be another acceptable solution.

5. “When we reach a point of diminishing return” is a good management solution, similar to the reliability-theory criterion of no longer revealing new defects or failures. (Otherwise, “diminishing return” needs to be defined.)

6. “When all faults are discovered” is not possible theoretically, and especially so for large systems.

Diminishing Return

[Figure: total number of bugs found (vertical axis) plotted against time or total test cases run (horizontal axis), with two points marked: “start considering terminating testing” and “terminate testing.”]

Test Case Coverage

• For us, test case coverage is a key issue in determining when to stop testing. We stop testing when our tests have covered all that we want to cover.

Ask:
– Are there gaps and redundancies?
– Have we covered all the relevant situations?

We will use the Triangle Problem as an example to look at these questions.

Previous Sample Triangle Pseudo-code

1. Program Triangle
2. Declare a, b, c as Integer
3. Declare IsTriangle as Boolean
4. Output (“enter a, b, and c integers”)
5. Input (a, b, c)
6. Output (“side 1 is”, a)
7. Output (“side 2 is”, b)
8. Output (“side 3 is”, c)
9. If (a<b+c) AND (b<a+c) AND (c<b+a)
10.    then IsTriangle = True
11.    else IsTriangle = False
12. endif
13. If IsTriangle
14.    then if (a=b) AND (b=c)
15.       then Output (“equilateral”)
16.       else if (a NE b) AND (a NE c) AND (b NE c)
17.          then Output (“Scalene”)
18.          else Output (“Isosceles”)
19.       endif
20.    endif
21.    else Output (“not a triangle”)
22. endif
23. end Triangle2
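As a rough illustration, the decision logic of pseudo-code lines 9 to 22 could be sketched in Python as follows. The function name `classify_triangle` and the returned label strings are assumptions made for this sketch; the input prompts and echo statements (lines 4 to 8) are left out, and no range check on the sides is performed.

```python
def classify_triangle(a: int, b: int, c: int) -> str:
    """Classify three integer sides, mirroring pseudo-code lines 9-22."""
    # Lines 9-12: the triangle inequality decides the IsTriangle flag.
    is_triangle = (a < b + c) and (b < a + c) and (c < b + a)

    # Line 13 / 21: the flag selects the triangle or non-triangle branch.
    if not is_triangle:
        return "Not a triangle"

    # Lines 14-15: all three sides equal.
    if a == b and b == c:
        return "Equilateral"

    # Lines 16-18: no two sides equal means scalene, otherwise isosceles.
    if a != b and a != c and b != c:
        return "Scalene"
    return "Isosceles"


if __name__ == "__main__":
    for sides in [(100, 100, 100), (3, 4, 5), (100, 100, 1), (100, 100, 200)]:
        print(sides, classify_triangle(*sides))
```

Each of the four return statements corresponds to one of the four outputs in the pseudo-code, which makes it easy to see which test input exercises which branch.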

Condensation Graph from pseudo code

[Figure: condensation graph of the pseudo-code. Nodes: 1-8 (first); 9; 10 (Is_Triangle = True); 11 (Is_Triangle = False); 12; 13, with the Triangle branch leading to 14 and the ~Triangle branch leading to 21 (not triangle); 15 (equilateral); 16; 17 (scalene); 18 (isosceles); 19; 20; 22 (last).]

Statement coverage: 4 paths
Branch (DD-path) coverage: 4 paths
Cyclomatic # = 4 + 1 = 5: 5 linearly independent paths
All combinations: 8 paths

All Combination paths?

• Let’s look at all 8 combination paths:

1. P1: <8,9,10,12,13,14,15,20,22> (Equilateral)
2. P2: <8,9,10,12,13,14,16,17,19,20,22> (Scalene)
3. P3: <8,9,10,12,13,14,16,18,19,20,22> (Isosceles)
4. P4: <8,9,10,12,13,21,22> (not possible)
5. P5: <8,9,11,12,13,14,15,20,22> (not possible)
6. P6: <8,9,11,12,13,14,16,17,19,20,22> (not possible)
7. P7: <8,9,11,12,13,14,16,18,19,20,22> (not possible)
8. P8: <8,9,11,12,13,21,22> (Not a triangle)

- P4 through P7 are not possible because the branch taken at node 13 contradicts the IsTriangle value set at node 10 or 11.
- So, there are 4 decision-to-decision (dd) paths (branch testing) that make sense.
- These are P1, P2, P3, and P8.
- We should at least test these four paths.
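The enumeration of the 8 combination paths can be reproduced with a small sketch. The dictionaries `FIRST_HALF` and `SECOND_HALF` and the function `enumerate_paths` are illustrative names, not part of the original material; the node sequences are copied from the path list above. A combination is treated as feasible only when the branch taken at node 13 agrees with the flag set at node 10 or 11.

```python
from itertools import product

# Node sequences up to the decision at node 13, keyed by the IsTriangle flag.
FIRST_HALF = {
    True: [8, 9, 10, 12, 13],   # flag set to True at node 10
    False: [8, 9, 11, 12, 13],  # flag set to False at node 11
}

# Node sequences after node 13, keyed by the output produced.
SECOND_HALF = {
    "Equilateral": [14, 15, 20, 22],
    "Scalene": [14, 16, 17, 19, 20, 22],
    "Isosceles": [14, 16, 18, 19, 20, 22],
    "Not a triangle": [21, 22],
}


def enumerate_paths():
    """Return (path, outcome, feasible) for all 2 x 4 combinations."""
    results = []
    for is_triangle, outcome in product(FIRST_HALF, SECOND_HALF):
        path = FIRST_HALF[is_triangle] + SECOND_HALF[outcome]
        # Node 13 tests the flag, so a triangle outcome needs flag == True
        # and the "Not a triangle" outcome needs flag == False.
        feasible = is_triangle == (outcome != "Not a triangle")
        results.append((path, outcome, feasible))
    return results


feasible_paths = [(p, o) for p, o, ok in enumerate_paths() if ok]

if __name__ == "__main__":
    for index, (path, outcome, ok) in enumerate(enumerate_paths(), start=1):
        status = outcome if ok else "not possible"
        print(f"P{index}: {path} ({status})")
```

Because the dictionaries keep insertion order, the combinations come out in the same order as P1 to P8.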

Compare against Boundary Value Test (15 test cases for the Triangle problem)

Remember the boundary: 1 ≤ TriangleSide ≤ 200

Test case   a     b     c     Expected output   Path
1           100   100   1     Isosceles         P3
2           100   100   2     Isosceles         P3
3           100   100   100   Equilateral       P1
4           100   100   199   Isosceles         P3
5           100   100   200   Not Triangle      P8
6           100   1     100   Isosceles         P3
7           100   2     100   Isosceles         P3
8           100   100   100   Equilateral       P1
9           100   199   100   Isosceles         P3
10          100   200   100   Not Triangle      P8
11          1     100   100   Isosceles         P3
12          2     100   100   Isosceles         P3
13          100   100   100   Equilateral       P1
14          199   100   100   Isosceles         P3
15          200   100   100   Not Triangle      P8

Let’s analyze this table in more detail (next chart).
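The 15 boundary value cases and the paths they exercise can be regenerated with a short sketch. The constants `LOW`, `HIGH`, and `NOMINAL`, the helper `classify`, and the mapping `PATH_OF` are assumptions introduced for illustration; the nominal value of 100 and the bounds 1 and 200 are taken from the table above.

```python
from collections import Counter

LOW, HIGH, NOMINAL = 1, 200, 100  # bounds and nominal value used in the table


def classify(a, b, c):
    """Same decision logic as the triangle pseudo-code."""
    if not (a < b + c and b < a + c and c < a + b):
        return "Not Triangle"
    if a == b and b == c:
        return "Equilateral"
    if a != b and a != c and b != c:
        return "Scalene"
    return "Isosceles"


def boundary_value_cases():
    """Vary one side over its five boundary values; hold the others at nominal."""
    values = [LOW, LOW + 1, NOMINAL, HIGH - 1, HIGH]
    cases = []
    for position in (2, 1, 0):  # vary c first, then b, then a, as in the table
        for value in values:
            sides = [NOMINAL, NOMINAL, NOMINAL]
            sides[position] = value
            cases.append(tuple(sides))
    return cases


# Map each expected output to the path label used in this section.
PATH_OF = {"Equilateral": "P1", "Scalene": "P2",
           "Isosceles": "P3", "Not Triangle": "P8"}

path_counts = Counter(PATH_OF[classify(*case)] for case in boundary_value_cases())

if __name__ == "__main__":
    for path in ("P1", "P2", "P3", "P8"):
        print(path, path_counts[path])
```

Since only one side moves away from the nominal value at a time, at least two sides are always equal, so no generated case can be scalene.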

Comparison Summary

• A potential “gap” exists in the Boundary Value Test. When we look at the equivalence classes (or logic table) of the outputs, we see that the Scalene triangle is not covered.
– Path P2 is not covered by the 15 Boundary Value test cases!

• There are, however, lots of “duplications”:
– P3 is covered 9 times (Isosceles triangle)
– P1 is covered 3 times (Equilateral)
– P8 is covered 3 times (Not Triangle)

Clearly, boundary value (functional) testing is not enough here; is it possible that it is also not as efficient?

Comparison Metrics of Functional vs. Structural Test Effectiveness

• Assume:
1. Functional Test M generates m test cases.
2. Structural Test S generates s structural elements (structural elements = the chosen paths for the S test).
3. When all of the m test cases are executed, n of the s structural elements (where n ≤ s) are traversed or covered.

• Then consider 3 metrics for evaluating the testing “effectiveness” of functional with respect to structural:
– Coverage of M with respect to S: C(M,S) = n/s
– Redundancy of M with respect to S: R(M,S) = m/s
– Net redundancy of M with respect to S: NR(M,S) = m/n

Comparison for the Triangle Example

• The Boundary Value Test, M, generated 15 test cases; so m = 15.
• The dd-path (or Branch) Test generated 4 paths for test cases; so s = 4.
• The 15 M test cases cover 3 of the 4 paths from the S test; so n = 3.

The 3 comparisons of effectiveness of M to S show:

Coverage(M,S) = 3 / 4 : 75% coverage effectiveness
Redundancy(M,S) = 15 / 4 : 375% redundancy
NetRed(M,S) = 15 / 3 : 500% net redundancy

Note the penalty here.
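The three metrics are simple ratios, so the triangle figures can be checked with a few lines of Python. The function names below are chosen for this sketch only; m, s, and n follow the definitions given for the metrics.

```python
def coverage(n: int, s: int) -> float:
    """C(M,S) = n/s: share of structural elements covered by M's test cases."""
    return n / s


def redundancy(m: int, s: int) -> float:
    """R(M,S) = m/s: test cases per structural element."""
    return m / s


def net_redundancy(m: int, n: int) -> float:
    """NR(M,S) = m/n: test cases per structural element actually covered."""
    return m / n


if __name__ == "__main__":
    m, s, n = 15, 4, 3  # boundary value cases, dd-paths, dd-paths covered
    print(f"Coverage(M,S)   = {coverage(n, s):.0%}")
    print(f"Redundancy(M,S) = {redundancy(m, s):.0%}")
    print(f"NetRed(M,S)     = {net_redundancy(m, n):.0%}")
```

Net redundancy divides by n rather than s, so it grows whenever the functional test cases miss structural elements; that is why it is the largest of the three figures here.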

Relative Efforts (Test Complexity) Comparison within Structural Test Methodologies

[Figure: effort to identify test coverage elements (vertical axis) plotted against sophistication in methodology (horizontal axis) for four methodologies: dd-path (branch), basis, d-u path, and slice.]

Should we consider Structural Test Complexity when Designing?

• If so:
– Since program slice testing takes more effort, should we have fewer program slices in our programs?
– If we do have program slices, should the slice size (# of statements) be small?

What was that “fault seeding” stop criterion?

• Fault seeding is a technique for
– i) determining when to stop and/or
– ii) projecting “escaped” bugs.

• Fault seeding technique:
– Develop a number of bugs (e.g. 20 bugs) and seed them into the product, without letting the testers know.
– Pick a % (e.g. 90%) of discovery of the seeded faults by the test team to be considered as the stopping criterion.
– Run the tests and see if 0.9 × 20 = 18 of the seeded bugs are found. Stop testing only if 90% is reached.
– If the total number of unique problems found is Z (e.g. 45, NOT including the 18 seeded faults), then we may roughly project the total number of non-seeded problems, Y, by assuming that non-seeded problems are found at the same rate as seeded ones:
- Z / Y = 45 / Y = 18 / 20
- Y = 50
- remaining non-seeded problems = Y − Z = 50 − 45 = 5
– So we project that there are 5 more undetected problems remaining.
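The stopping check and the projection can be sketched as two small functions. The names `should_stop` and `project_remaining`, and the parameter names, are assumptions for this illustration; the projection rests on the same rough assumption as above, that non-seeded problems are discovered at the same rate as seeded ones.

```python
def should_stop(seeded_total: int, seeded_found: int,
                threshold_percent: int = 90) -> bool:
    """True once the preset percentage of seeded faults has been found."""
    # Integer arithmetic avoids floating-point comparison issues.
    return seeded_found * 100 >= threshold_percent * seeded_total


def project_remaining(seeded_total: int, seeded_found: int,
                      nonseeded_found: int) -> float:
    """Project undetected non-seeded problems from Z / Y = found / seeded."""
    if seeded_found == 0:
        raise ValueError("no seeded faults found yet; cannot project")
    estimated_total = nonseeded_found * seeded_total / seeded_found  # Y
    return estimated_total - nonseeded_found                         # Y - Z


if __name__ == "__main__":
    print(should_stop(20, 18))            # 18 of 20 seeded faults found
    print(project_remaining(20, 18, 45))  # Z = 45 non-seeded problems found
```

With 20 seeded faults, 18 of them found, and 45 non-seeded problems found, the sketch reproduces the projection of 5 remaining problems.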
