Are You Sure What Failures Your Tests Produce?
Lee White
Results on Testing GUI Systems
• CIS (Complete Interaction Sequence) approach
for testing GUI systems: applied to four large
commercial GUI systems
• Testing a GUI system in different environments:
operating system, CPU speed, memory
• Modified CIS approach applied to regression-test
two versions of a large commercial GUI system
Three Objectives for this Talk
• Use of memory tools during GUI testing
discovered many more defects, revealing
observability problems
• In GUI systems, defects manifested
themselves as different failures (or not at
all) in different environments
• In GUI systems, many more behaviors
reside in the code than the designer intended.
Complete Interaction Sequence
(CIS)
• Identify all responsibilities (GUI activity that
produces an observable effect on the
surrounding user environment).
• CIS: Operations on a sequence of GUI objects
that collectively implement a responsibility.
• Example (assume a file is already open):
File_Menu -> Print -> Print_Setup_Selection ->
Confirm_Print
FSM for a CIS
(Finite State Model)
• Design an FSM to model each CIS
• Creating the FSM model requires experience
• To test for all effects in a GUI, all paths
within the CIS must be executed
• Loops may be repeated, but not
consecutively (see the sketch below)
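To make the FSM modeling concrete, here is a minimal sketch in Python. The state and event names are invented, loosely following the talk's Print example, and the "loops may be repeated, but not consecutively" rule is approximated by capping visits per state and forbidding the same transition twice in a row; the talk does not specify the exact enumeration algorithm.

```python
# A minimal sketch of modeling a CIS as an FSM and enumerating
# test paths. State and event names are hypothetical.

from collections import defaultdict

class CisFsm:
    def __init__(self):
        # state -> list of (event, next_state) transitions
        self.transitions = defaultdict(list)

    def add(self, src, event, dst):
        self.transitions[src].append((event, dst))

    def paths(self, start, finish, max_visits=2):
        """Enumerate event sequences from start to finish."""
        results = []

        def walk(state, events, visits, last):
            if state == finish:
                results.append(tuple(events))
                return
            for event, dst in self.transitions[state]:
                # Allow loop repetition, but never take the same
                # transition twice consecutively.
                if visits[dst] >= max_visits or (event, dst) == last:
                    continue
                visits[dst] += 1
                walk(dst, events + [event], visits, (event, dst))
                visits[dst] -= 1

        walk(start, [], defaultdict(int), None)
        return results

# Hypothetical Print CIS (assume a file is already open).
fsm = CisFsm()
fsm.add("Ready", "File_Menu", "MenuOpen")
fsm.add("MenuOpen", "Print", "PrintDialog")
fsm.add("PrintDialog", "Print_Setup_Selection", "SetupDone")
fsm.add("SetupDone", "Confirm_Print", "Finish")

for path in fsm.paths("Ready", "Finish"):
    print(" -> ".join(path))
```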
[Figure 1 Edit-Cut-Copy-Paste CIS FSM: states include Init, Open File, Name File, Select File, Edit (Cut, Copy, Paste), Move Cursor, Highlight, File Ready, and Finish, with second-pass states Open2, Name File2, Select File2, and Move Cursor2]
How to Test a CIS?
• Design tests: Generated from the FSM model
based on the design of the CIS.
• Implementation tests: In the actual GUI, check
all CIS object selections and follow every
transition to another GUI object within the CIS;
add these transitions, along with any new inputs
or outputs to/from the CIS, to the FSM model
and generate tests from it (see the sketch below).
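The implementation-test step can be sketched as a diff between the design FSM and what the GUI actually does. The transition tuples below are invented for illustration; a real pass would record them by exercising every object selection in the GUI, as in Figure 3 where extra transitions and a new output appear beyond the design of Figure 2.

```python
# A hedged sketch of the implementation-test step: diff the design
# FSM against transitions actually observed in the GUI. Tuples are
# hypothetical (state, next_state) pairs.

design = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")}

# Hypothetical result of exploring the actual GUI.
observed = design | {("D", "B"), ("C", "O2")}

# Non-design transitions must be added to the FSM model, and new
# tests generated to cover them.
non_design = observed - design
print("Non-design transitions needing new tests:", sorted(non_design))
```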
[Figure 2 Design Tests for a Strongly Connected Component: nodes A, B, C, D with inputs I1, I2 and output O1]
Design test paths:
[(I1,B,C,D,A,B,C,O1), (I2,A,B,C,D,A,B,C,O1)]
[Figure 3 Implementation Tests for a Strongly Connected Component: nodes A, B, C, D with inputs I1, I2, I3, outputs O1, O2, and a starred transition (A*)]
Implementation test paths:
[ (I1,B,C,D,B,C,D,A,B,C,D,A*,B,C,O1),
(I1,B,C,D,B,C,D,A,B,C,D,A*,B,C,D,O2),
(I2,A,B,C,D,B,C,D,A,B,C,D,A*,B,C,O1),
(I2,A,B,C,D,B,C,D,A,B,C,D,A*,B,C,D,O2),
(I3,D,A,B,C,D,B,C,D,A*,B,C,O1),
(I3,D,A,B,C,D,B,C,D,A*,B,C,D,O2) ]
Table 1 Case Study of 4 Systems

System                  Application      Team (size)
1) Real Network Suite   RealJukeBox      Team A (3)
                        RealDownload     Team B (4)
                        RealPlayer       Team B (4)
2) Adobe Suite          PhotoDeluxe      Team B (4)
                        EasyPhoto        Team A (3)
                        Acrobat Reader   Team A (3)
3) Inter WinDVD         WinDVD           Team C (3)
4) Multi-Media DB       GVisual          Team D (4)
                        VStore
                        AdminSrvr
                        ObjectBrowser
Design and implementation test results:

GUI System              GUI       Design            Impl.
                        Objects   # Tests # Faults  # Tests # Faults
Real Networks           443       84      9         242     19
Adobe PS / Acrobat R.   507       223     2         612     10
Inter WinDVD            112       56      0         154     3
Multi-Media DB          294       98      0         241     9
Memory Tools
• Memory tools monitor memory changes,
CPU changes, and register changes
• Used to detect failures that would otherwise
have eluded detection; these account for 34%
of the faults found in these empirical studies
• Two such tools were used: Memory Doctor and
Win Gauge from Hurricane Systems (the kind of
check they perform is sketched below).
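The talk does not describe the internals of Memory Doctor or Win Gauge, so the following is only a minimal sketch, assuming the psutil library, of the kind of observability such tools add: sample the GUI process's memory before and after each test step and flag unexpected growth. The threshold and the click() driver are hypothetical.

```python
# Hedged sketch of a memory-observability check around a GUI
# test step; threshold and step driver are illustrative.

import psutil

def memory_mb(pid):
    return psutil.Process(pid).memory_info().rss / (1024 * 1024)

def run_step_with_memory_check(pid, step, threshold_mb=5.0):
    before = memory_mb(pid)
    step()  # drive one CIS transition in the GUI under test
    after = memory_mb(pid)
    if after - before > threshold_mb:
        print(f"possible hidden fault: memory grew {after - before:.1f} MB")

# Usage (hypothetical):
# run_step_with_memory_check(player_pid, lambda: click("Confirm_Print"))
```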
Table 2 Hidden Faults Detected by Memory Tools

GUI System              Hidden Faults   All Faults   Percent
Real Network            7               19           37%
Adobe PS / Acrobat Rd   4               10           40%
Inter WinDVD            1               3            33%
Multi-Media DB          2               9            22%
Total Faults            14              41           34%
Failures of GUI Tests
on Different Platforms
Lee White and Baowei Fei
EECS Department
Case Western Reserve University
Environment Effects Studied
• Environment effects: operating system,
CPU speed, memory size
• Same software tested: RealOne Player
• 950 implementation tests
• For the OS comparison, the same computer
was used, running Windows 98 and
Windows 2000 (see the sketch below)
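One way to see the environment effect is to run the same test suite in each environment, record failing test IDs, and report failures unique to one environment. The IDs and counts below are invented for illustration, not the study's data.

```python
# Sketch of cross-environment failure comparison with two
# hypothetical operating systems and invented test IDs.

failures = {
    "Windows 98":   {"t012", "t045", "t101"},
    "Windows 2000": {"t045", "t200"},
}

common = set.intersection(*failures.values())
print("Failures seen in every environment:", sorted(common))
for env, fails in failures.items():
    # With two environments, fails - common is exactly the
    # failures unique to that environment.
    print(f"Failures seen only under {env}:", sorted(fails - common))
```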
Table 3 Faults detected by implementation tests for different operating systems

OS             Surprises   Defects   Faults
Windows 98     96          35        131
Windows 2000   37          24        61
Table 4 Faults detected by implementation tests for different CPU speeds

PC    Surprises   Defects   Faults
PC1   31          19        50
PC2   34          19        53
PC3   37          24        61
Table 5 Faults detected by implementation tests for different memory sizes

Configuration   Surprises   Defects   Faults
PC3 (256 MB)    96          35        131
PC3 (192 MB)    99          36        135
PC3 (128 MB)    101         38        139
Regression Testing GUI
Systems
A Case Study to Show the
Operations of the GUI Firewall for
Regression Testing
GUI Features
• Feature: A set of closely related CISs
with related responsibilities
• New features: Features in a new version
that were not in previous versions
• Totally modified features: Features so
drastically changed in a new version that
the change cannot be modeled as an
incremental change; a simple firewall cannot
be used (see the selection sketch below).
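The talk does not spell out the firewall algorithm, so the following is a hedged simplification of its selection step, assuming each CIS is mapped to the GUI objects it touches: retest only CISs whose objects overlap objects changed in the new version. All feature, CIS, and object names are invented; totally modified features bypass this step and are retested from scratch.

```python
# Simplified sketch of GUI-firewall regression test selection.
# Object and CIS names are hypothetical.

changed_objects = {"ChannelList", "PlaylistPane"}

cis_objects = {
    "AddChannel": {"ChannelList", "AddButton"},
    "PlayTrack":  {"PlaylistPane", "PlayButton"},
    "OpenFile":   {"FileMenu", "FileDialog"},
}

# Select for retest every CIS touching a changed object.
retest = sorted(cis for cis, objs in cis_objects.items()
                if objs & changed_objects)
print("CISs selected by the firewall:", retest)
```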
Software Under Test
• Two versions of RealPlayer (RP) and
RealJukeBox (RJB): RP7/RJB1 and RP8/RJB2
• 13 features; RP7: 208 objects, 67 CISs, 67
design tests, 137 impl. tests; RJB1: 117 objects,
30 CISs, 31 design tests, 79 impl. tests
• 16 features; RP8: 246 objects, 80 CISs, 92
design tests, 176 impl. tests; RJB2: 182 objects,
66 CISs, 127 design tests, 310 impl. tests.
[Figure 4 Distribution of Faults Obtained by Testers T1 and T2: RP7/RJB1 (53 faults in the original system) to RP8/RJB2 with 16 features; 8 features retested through the firewall (21 and 17 faults), 3 new features tested by T1 (0 faults), 5 totally modified features tested from scratch by T2; 59 faults in total]
Failures Identified
in Version 1, Version 2
• We could identify identical failures in
Version 1 and Version 2.
• This left 9 failures in Version 2 and 7
failures in Version 1 unmatched.
• The challenge was to show which pairs of
these failures might be due to the same
fault (a matching sketch follows).
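A simple way to frame the matching step is to pair failures across versions by the CIS and test that exposed them; anything left unmatched becomes a candidate for the "different failure, same fault" analysis on the next slides. The failure records below are invented for illustration.

```python
# Sketch of pairing failures across two versions by (CIS, test id).
# Records are hypothetical.

v1 = {("ViewTrack", "t07"): "freezes when album cover included",
      ("AddChannel", "t12"): "no effect while RJB running"}
v2 = {("ViewTrack", "t07"): "album cover lost",
      ("Forward", "t31"): "crash before stream plays"}

matched = v1.keys() & v2.keys()
for key in sorted(matched):
    print(key, ":", v1[key], "vs", v2[key])
print("Unmatched in V1:", sorted(v1.keys() - matched))
print("Unmatched in V2:", sorted(v2.keys() - matched))
```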
Different Failures in
Versions V1, V2 for the Same Fault
• V1: Viewing a track in RJB freezes if the
album cover is included
• V2: Viewing a track in RJB loses the album cover
• Env. problem: Graphical settings from V2
were needed for testing V1
Different Failures (cont.)
• V1: Add/Remove channels in RP does not
work when RJB is also running
• V2: Add/Remove channels loses previously
added items
• Env. problem: A personal browser is used in
V1, but V2 uses a special RJB browser
Different Failures (cont.)
• V1: No failure present
• V2: In RP, pressing Forward crashes the
system before playing a stream file
• Env. problem: The Forward button can only
be pressed during play in V1, but in V2 the
Forward button can be selected at any
time; regression testing now finds this fault
Conclusions for Issue #1
• The use of memory tools revealed
extensive observability problems in testing
GUI systems:
• In testing four commercial GUI systems,
34% of faults would have been missed
without these tools.
• In regression testing, 85% and 90% would
have been missed.
• Implication: GUI testing can miss defects
or surprises (or produce only minor failures).
Conclusions for Issue #2
• Defects manifested as different failures (or
not at all) in different environments:
• Discussed in the regression testing study
• Also observed in the testing case studies, as
well as in testing across different HW/SW
environments.
Implication for Issue #2
• When testing, you think you understand
what failures will occur for certain tests and
defects in the same software. But you do
not know what failures (if any) will be
seen by the user in another environment.
Conclusions for Issue #3
• Differences between design and
implementation tests are due to non-design
transitions in the actual FSM for each
GUI CIS:
• Observed in both case studies
• Implication: Faults are commonly
associated with these unknown FSM
transitions, and are not due to the design.
Question for the Audience
• Are these same three effects valid to this
extent for software other than GUI
systems?
• If so, why haven't we seen many reports
and papers in the software literature
documenting this?