A FIREWALL MODEL FOR TESTING
USER-CONFIGURABLE SOFTWARE SYSTEMS
By
BRIAN P. ROBINSON
Submitted in partial fulfillment of the requirements
For the degree of Doctor of Philosophy
Dissertation Advisor: Dr. Lee J. White
Department of Electrical Engineering and Computer Science
CASE WESTERN RESERVE UNIVERSITY
May, 2008
CASE WESTERN RESERVE UNIVERSITY
SCHOOL OF GRADUATE STUDIES
We hereby approve the thesis/dissertation of
Brian P. Robinson___________________________________
candidate for the Ph.D. degree *.
(signed) Lee J. White______________________________
(chair of the committee)
Andy Podgurski____________________________
Vincenzo Liberatore________________________
Ken Loparo_______________________________
(date) March 6th, 2008
*We also certify that written approval has been obtained for any
proprietary material contained therein.
Acknowledgements
This work is dedicated to my wife Heidi, for all the love and support provided.
She is an amazing person and I could never have done this work without her.
I would like to first acknowledge and thank my advisor Dr. Lee White for all his
help over the years. Taking his class in Software Engineering really got me excited about
this area of research and also showed me how behind industry really is. This became the
motivation for both my research in this area and my research program at ABB. Dr. White
also showed me that youth and speed are great in many sports, but precision matters most
in racquetball. That lesson was repeated constantly throughout our time working together.
We have analyzed some very interesting software testing problems over these last six
years, and I thank him for all of the time, dedication, and patience he gave me.
Next, I would like to thank Dr. Vincenzo Liberatore. I got to know Dr. Liberatore
in his research networking class, which really opened my eyes to other areas of research
in Computer Science. To this day I am amazed that my class project ended up being published
as a short paper at an international workshop. I have really enjoyed the research,
proposals, and discussions with him over the years, and I thank him for all of the
opportunities and help. Also, I would like to thank Dr. Andy Podgurski. His knowledge
of software engineering and testing is very helpful to me personally and to my research
program at ABB. Many of the ideas and work he has done have refined how I look at
software testing, and I am grateful to him for all of the help and support. Dr. Ken Loparo
has also been a great committee member. His knowledge of the systems and products that
I studied for this work, as well as their market and use, has been a great benefit to my
research and the committee. I am very grateful for his support and help.
Another group who deserves thanks is the EECS student affairs team. Without
their support getting forms signed and questions answered, this dissertation might never
have come about.
Outside of Case, I would like to thank ABB. Without the experiences I gained
while working there, not to mention the funding, this research would not have happened.
I am very fortunate to have that knowledge and data available to me, and I hope that this
research will be put to good use in ABB soon.
My family also deserves many thanks. Their encouragement and support were
very helpful and really enabled me to get this work done. They all had to put up with my
trips to the library, requests for quiet, frequent absences, and various other things that
came up over these years. A special thanks to my son Aidan and my daughter Arwyn.
Aidan had many nights where he was not able to get my full attention, and I thank him
for being patient. Arwyn had to learn not to push the mouse and hit a key when I was
holding her, no matter how fun it looks. It is amazing how fast hundreds of pages can be
deleted and I think this was the real reason that Undo was invented.
Last, and most important, I would like to thank my wife Heidi. Without her
encouragement and support I would never have pursued this degree. She sacrificed time,
both her own and ours together, to make sure I could take courses, study for tests, do this
research, and be successful. She is my role model, and her work ethic and study habits
are what I strive for. Her love and support enabled this research to be possible, and I am
forever in her debt.
Table of Contents
List of Tables ..................................................................................................................... 3
List of Figures.................................................................................................................... 4
1. Introduction................................................................................................................... 6
2. Proposed Solution ....................................................................................................... 14
2.1 Solution Overview .................................................................................................. 14
2.2 Using the Solution................................................................................................... 17
2.3 Example Applications of the Solution .................................................................... 18
3. Needed Firewalls ......................................................................................................... 24
3.1 Traditional Firewall ................................................................................................ 24
3.2 Extended Firewall ................................................................................................... 30
3.3 COTS Firewall ........................................................................................................ 35
3.4 Deadlock Firewall................................................................................................... 41
3.5 Other Future Firewalls ............................................................................................ 50
4. Configuration and Settings Firewall........................................................................ 52
4.1 Settings Changes..................................................................................................... 56
4.2 Configuration Changes ........................................................................................... 62
4.3 Constructing a Firewall for Settings Changes ........................................................ 66
4.4 Constructing a Firewall for Configuration Changes............................................... 77
4.4.1 Constructing a Firewall for New Configurable Elements .................................... 81
4.4.2 Constructing a Firewall for Previously Used Configurable Elements ................. 87
4.4.3 Constructing a Firewall for Removed Configurable Elements ............................ 91
4.5 Time Complexity of the Configuration and Settings Firewall................................ 93
4.6 Future Improvements on the Configuration and Settings Firewall......................... 99
5. A Process to Support the Configuration and Settings Firewall............................ 101
5.1 Current Industry Testing Process.......................................................................... 101
5.2 Modified Industry Testing Process ....................................................................... 103
5.3 Time Study of the Proposed Release Testing Process .......................................... 107
5.4 Future Additions to the Proposed Release Testing Process.................................. 108
6. Empirical Studies of User Configurable Software Firewalls................................ 109
6.1 Empirical Studies Overview ................................................................................. 109
6.2 Limitations of Empirical Studies .......................................................................... 114
6.3 First Case Study .................................................................................................... 114
6.3.1 First Customer Study – Embedded Controller................................................... 115
6.3.2 Second Customer Study – Embedded Controller .............................................. 119
6.3.3 Additional Customer Studies – Embedded Controller....................................... 123
6.4 Second Case Study................................................................................................ 126
6.4.1 First Customer Study – GUI System ................................................................. 127
6.4.2 Second Customer Study – GUI System ............................................................. 130
6.4.3 Additional Customer Studies – GUI System ..................................................... 132
6.5 Third Case Study................................................................................................... 136
6.5.1 First Testing Study – GUI System ..................................................................... 137
6.5.2 Second Testing Study – GUI System ................................................................ 139
6.5.3 Summary of Third Case Study........................................................................... 141
6.6 Fourth Case Study................................................................................................. 141
6.6.1 Taxonomy Overview ......................................................................................... 142
6.6.2 Embedded Controller Defect Classification ...................................................... 145
6.6.3 GUI System Defect Classification ..................................................................... 148
6.7 Fifth Case Study.................................................................................................... 151
7. Conclusions and Future Work................................................................................. 159
8. References.................................................................................................................. 163
List of Tables
Table 1. Results of Procedural Firewall Testing at ABB.................................................. 29
Table 2. Results of Object-Oriented Firewall Testing at ABB......................................... 29
Table 3. Results of EFW Testing at ABB......................................................................... 32
Table 4. Effort Required for EFW Testing at ABB .......................................................... 33
Table 5. Results of EFW Testing at Telecom Company .................................................. 34
Table 6. Effort Required for EFW Testing at Telecom Company.................................... 34
Table 7. COTS Firewall, First Study Results at ABB ...................................................... 38
Table 8. COTS Firewall, Second Study Results at ABB.................................................. 39
Table 9. COTS Firewall, Third Study Results at ABB..................................................... 40
Table 10. COTS Firewall, Fourth Study Results at ABB ................................................. 40
Table 11. Summary of Case Study 1 .............................................................................. 126
Table 12. Summary of Case Study 2 .............................................................................. 135
Table 13. Results from the Third Case Study ................................................................. 141
Table 14. Beizer’s Taxonomy’s Major Categories ......................................................... 142
Table 15. Beizer’s Taxonomy’s Functional Bugs........................................................... 143
Table 16. Beizer’s Taxonomy’s Functionality as Implemented Bugs............................ 143
Table 17. Beizer’s Taxonomy’s Structural & Data Bugs ............................................... 144
Table 18. Beizer’s Taxonomy’s Implementation & Integration Bugs............................ 144
Table 19. Beizer’s Taxonomy’s System and Test Bugs ................................................. 145
Table 20. Summary of Source Metrics ........................................................................... 153
Table 21. T-test Results for Call Depth .......................................................................... 154
Table 22. T-test Results for Fan In ................................................................................. 155
Table 23. T-test Results for Fan Out............................................................................... 156
Table 24. T-test Results for LOC / Method .................................................................... 157
Table 25. T-test Results for Cyclomatic Complexity ..................................................... 157
List of Figures
Figure 1. Example Procedural Firewall Graph [13].......................................................... 26
Figure 2. Example Object-Oriented Firewall Graph......................................................... 27
Figure 3. Example Extended Firewall Graph ................................................................... 31
Figure 4. Example COTS Firewall Graph [14]................................................................. 36
Figure 5. Example Deadlock Graph, Two-Way ............................................................... 42
Figure 6. Example Deadlock Graph, Three Way.............................................................. 43
Figure 7. Example Deadlock Graph with Message Queues ............................................. 44
Figure 8. Example Deadlock Firewall Graph for a Modified Task .................................. 45
Figure 9. Example Deadlock Firewall Graph, First Study Original ................................. 46
Figure 10. Example Deadlock Firewall Graph, First Study Changed .............................. 46
Figure 11. Example Deadlock Firewall Graph, Second Study ......................................... 47
Figure 12. Example Deadlock Firewall Graph, Third Study ............................................ 48
Figure 13. Example Deadlock Firewall Graph, Fourth Study .......................................... 49
Figure 14. Example of a Settings Change......................................................................... 58
Figure 15. Example Configuration Addition .................................................................... 65
Figure 16. Process Diagram for a Settings Change .......................................................... 67
Figure 17. Example List of Settings ................................................................................. 71
Figure 18. Example Settings Change GUI........................................................................ 72
Figure 19. Example Difference of Two Configurations, Settings .................................... 73
Figure 20. Example Settings Code Definition .................................................................. 74
Figure 21. An Example EFW from a Settings Change..................................................... 76
Figure 22. General Process Diagram for Configuration Changes .................................... 78
Figure 23. Process Diagram for a New Configurable Element ........................................ 82
Figure 24. Example Configuration ................................................................................... 83
Figure 25. Example Difference of Two Configurations, Adding ..................................... 84
Figure 26. Example Source for a Configurable Element.................................................. 85
Figure 27. Process Diagram for a Previously Used Configurable Element...................... 88
Figure 28. The V-Model [48].......................................................................................... 102
Figure 29. Release Testing, Old and New Methods ....................................................... 105
Figure 30. Case Study 1, Configuration Change with Latent Defect ............................. 118
Figure 31. Case Study 1, Settings Change with Latent Defect....................................... 119
Figure 32. Case Study 1, Added Configuration Change................................................. 122
Figure 33. Classification of Embedded Controller Defects ............................................ 146
Figure 34. Classification of GUI System Defects........................................................... 148
A Firewall Model for Testing User-Configurable Software
Systems
Abstract
by
Brian P. Robinson
User-configurable software systems present many challenges to software testers.
These systems are created to address a large number of possible uses, each of which is
based on specific configurations. Configurations are made with combinations of
configurable elements and settings, leading to a huge number of possible combinations.
Since it is infeasible to test all combinations at release, many latent defects remain in the
software once deployed. An incremental testing approach is presented, where each
customer configuration change requires impact analysis and retesting. This incremental
approach involves cooperation and communication between the customer and the
software vendor. The process for this approach is presented along with detailed examples
of how it can be used on various user-configurable systems in the field. The overall
efficiency and effectiveness of this method are shown by a set of empirical studies
conducted with real customer configuration changes running on two separate
commercially released ABB software systems. These two systems together contained
~3000 configurable elements and ~1.4 million Executable Lines of Code. In these five
case studies, 460 failures reported by 100 different customers were analyzed. These
empirical studies show that this incremental testing method is effective at detecting latent
defects which are exposed by customer configuration changes in user-configurable
systems.
1. Introduction
The testing of user-configurable software systems presents significant challenges
to practitioners in the field. These systems allow a huge number of possible user
configurations in the system, each of which can affect its execution. These configurations
are composed of user specified combinations of configurable elements, including
individual settings values which exist inside the elements themselves. Due to this
combinatorics problem, it is infeasible to completely test these systems before release
[30, 31], resulting in many latent defects remaining in the software when it is deployed to
the field.
Recently, there have been a few approaches to try to address this problem. The
first approach combines statistical design of experiments, combinatorial design theory,
and software engineering in an attempt to cover important, fault-revealing areas of the
software [26, 32, and 34]. One study of open source software by NIST shows that these
techniques can be effective when tests can cover a large number of pairs [40]. Another
recent study shows a technique which prioritizes configurations, allowing earlier
detection of defects but leading to a decrease in overall defect detection [33]. These
studies were conducted on an open source system and a small set of test cases from the
Software-artifact Infrastructure Repository [41], respectively. Neither of these studies
was conducted on large or industrial systems.
Another approach relies on parallelism and continuous testing to reveal faults in
the system. This system, named Skoll [52], was developed at the University of Maryland by Porter et al.
Skoll runs multiple configurations in parallel on separate systems, allowing for a larger
number of combinations to be tested. In addition, the system employs search techniques
to explore the configuration space and uses feedback to modify the testing as it is being
performed. Also, this system continues testing configurations after the release of the
product. This is a very promising approach to try on user-configurable systems. However,
it may suffer from scalability problems when used on hardware-limited systems, such as
those running on a custom embedded hardware platform, or when multiple release
baselines are being maintained.
In practice, industry testers first verify the system with common configurations
that are created with expert knowledge and sample simulated field data. These
configurations are created to test areas perceived to be high risk, an idea taught by James
Bach [4]. Bach also created the ALLPAIRS program [3], which allows industrial testers to
generate a small set of pair-wise tests which satisfy a coverage standard. Once the system
is verified using these methods, each new customer’s configuration is used in a very
extensive testing activity. This testing is conducted when the software is first delivered,
installed, and commissioned [5], and involves running the software thoroughly on the
specific installed system, including its final settings and configurations, through both
normal run modes as well as any error cases that can be injected into the system.
While these forms of testing may work for the initial commissioning and
installation, users of these software systems often make changes to their configurations
throughout the lifetime of the installation. These changes to the configuration of the
software can cause failures related to defects that are latent and hidden within the initial
released version. These latent software defects were never detected in the release testing
of the system, where the simulated example customer configurations were run, and also
remained hidden for other customers, most frequently due to other customers running
different configurations or settings. As a result, customers who have been running failure-free for years, from their point of view, now face a major risk and potential quality
problem when changes to the configuration and settings are made. This issue is
exacerbated by the fact that no code within the software has changed, which is the event
customers usually associate with the risk of new failures. In many cases, only a few
configuration or settings items were added or modified, leading to this new defect
affecting the software’s stability for that customer.
Before going further into the proposed solution for this problem, a better
understanding of user configurable systems will be presented. User configurable systems
are software programs (or groups of programs) that are created as a general purpose
solution to address a broad market need by presenting the ability to address many specific
needs that individual customers may have. Each customer within this market has a
smaller set of specific problems and needs, and this type of software addresses them by
taking the general purpose software and specializing it. This specialization is
accomplished by using configurations that direct the execution of the program to solve
the exact problem or need each customer has. In order to provide the ability to solve such
varying and diverse issues within this broad market, the system is usually made up of a
large number of configurable, library-like components, called configurable elements.
These elements are only executed when the customers configure the system to include
them in their running configuration. In addition, each of these elements can contain a
number of settings whose values further refine the actions that the element performs.
Configuring systems such as these usually involves connecting or grouping the elements
to process different events and actions, usually in a programming environment. These
groupings can be set up either graphically or programmatically, depending on the
implementation of the software and the needs of the specific market.
A real-time control system is an example of a user-configurable software system.
These systems are used to control the operation of factories, power plants, chemical
plants, and pharmaceutical manufacturing. Users of these kinds of systems purchase a
base set of software that contains the many different functions and rules which, either
independently or in cooperation with the vendor, can be used to configure the system to
the customer’s specific process needs. All of the systems installed by customers contain
the same base configurable elements, but each customer uses a subset of them and groups
them in different ways, leading to very different execution patterns for each customer.
For example, there are many different control algorithms, such as the Proportional-Integral-Derivative (PID) algorithm [37], each of which exists as a configurable element.
Within each control algorithm’s configurable element, there are many settings that can
change the operation of the mathematical function used. This function directly feeds the
output value used by other elements in the configuration.
Another example of a user-configurable software system is an Enterprise
Resource Planning (ERP) system [38], such as the ones developed at SAP™. These
systems are meant to model and manage a company’s business process flow. These ERP
systems contain base libraries and functions that are needed to implement and run a
business process. Users of ERP systems can configure the software, either independently
or jointly with the vendor, for their individual business process needs. These types of
systems are becoming more common and more widely deployed. An example of a
configurable element in an ERP system is a business rule. Each company has different
business rules for each process they are implementing, most of which are based on one of
a set of patterns. Companies can configure the system with the specific business rule
element for the pattern that they wish to use. An example of configuring one of these
types of systems is the specific accounting model a company wants the system to use.
There are many accounting models available, such as First-In-First-Out or Last-In-First-Out, and each of them is a configurable element in the system. Each of these
configurable elements has settings, and these settings influence the execution of that
element. In the case of the accounting model, these settings select how the company
wants to organize the accounting model, either by region, cost center, code, country, or
business unit.
A new approach to testing user-configurable software systems is presented,
specifically aimed at finding latent defects that customers would detect. In this approach,
each initial configuration is tested before its initial use. Instead of trying to cover as many
other configurations as possible, additional testing is postponed until the users of the
software make changes to their configurations. By using a method completely based on
user changes, only defects of relevance to a customer will be revealed. Data collected
from failure reports at ABB show that configuration-based failures found by internal
testing are only fixed 30% of the time, compared to non-configuration-based failures
which have an overall fix rate of 75%. Configuration-based defects are often postponed
until customers in the field report them.
The proposed approach can be considered a new or modified form of regression
testing and, as such, there is a need to determine the testing required for each type of
customer change to verify that the system still performs correctly and the change did not
expose a latent defect. While the purpose and data needed for this new form of regression
testing are different, the main steps that must be followed are the same. These steps
include identifying the specific change itself, determining the impact of that change, and
finally selecting tests that cover that impact to verify that the system still meets its
requirements after the change and that no latent defects are exposed.
Regression testing involves selective retesting of a system to verify that
modifications have not caused unintended effects and that the system still complies with
its specified requirements [1]. There exist many regression test selection (RTS)
techniques that minimize the time and resource costs of retesting changed software.
Methods for test selection and reduction are highly desirable, as a complete retest of a
system is cost prohibitive [20]. Many RTS methods make use of control flow information
to determine the impact of a change, such as [8, 11, 17, 18, and 23]. Many of these
methods were later expanded to support object-oriented systems [9, 24, 25, and 28], which complicate basic control flow methods. Thomas Ball improved these control-flow-based methods with algorithms that achieve a much finer-grained selection of the areas
which require retesting [16]. Besides control flow, many other dependencies have been
used for RTS methods. The first, data flow, expands the impact along longer data flow
dependencies which would otherwise be missed. These techniques, such as [13, 22, 27,
29, and 39], take longer to determine the impact of the change, but allow for the detection
of defects related to these data flow paths. In addition, dependencies dealing with global
variables [12], COTS components [14], and GUI systems [10] have been studied. Finally,
other concepts have been used to augment regression test selection, such as program
slices [21] and guided semantics [19].
These existing RTS methods are all intended to detect regression defects coming
from code changes within the software under test. In the case of changes in user
configurations, these methods do not directly apply, since there is no actual change to the
software itself. In addition, these current RTS methods all assume no latent defects remain
in the system, since their focus is regression defects that are the result of a code change
within the software product. This is not the case with user configuration changes. Finally,
these systems lack an overall complete test suite that can be used to select regression tests
from, since fully testing a system with this many combinations of configuration elements
is not feasible. These problems will be solved by creating a new RTS method which
addresses configuration changes and latent defects in these types of software systems.
In addition to providing a solution to this specific problem, the new RTS method
provides an industrial organization the opportunity to significantly change their release
testing activities for these user configurable systems. Since each software configuration
change will lead to retesting at the customer site based on the results of this new RTS
method, it becomes less important to do exhaustive testing before release. Each
customer’s current configuration could be tested at each software release, verifying that
no traditional regression defects are introduced in the code for running customers. In
addition, some other unused parts of the system could be tested for general use; for
example, testing could be planned for common changes to these customer configurations,
verifying that likely changes the customer will make do not contain regression defects,
both traditional and latent. The exhaustive testing of these unused areas can be postponed
until a customer configures the system to use these features. This RTS method supports,
in effect, a test-as-the-software-is-used approach. This will lead to faster times to market,
as many large configurable systems have many features that remain unused for years.
This can also lead to more satisfied customers, as each customer would know the system
works for their specific configuration and usage, and have that software released out to
them faster.
2. Proposed Solution
This chapter presents a high level description and overview of the proposed
solution, shown in Section 2.1. A description of how to use the method is presented in
Section 2.2. Finally, Section 2.3 presents high level examples of how to use the method
in practice.
2.1 Solution Overview
User-configurable software systems present unique and challenging
problems for all of software engineering, and for testing in particular. These systems can be
configured in a large number of ways by using the system’s configurable elements, each
of which may contain many settings. These configurable elements and settings lead to a
system with a huge number of possible executions, resulting in a high probability that
latent defects exist. Even after release testing is complete and the software has been
executing in the field, many of these defects remain. Latent defects, such as these, can be
exposed at a later time by customers changing their running configuration. Because these
defects never previously caused visible failures, many customers and software providers treat
them as regression faults when they do surface. Since these defects are not the result of code changes,
traditional regression testing is not able to detect them. Since this problem is different
than previously researched problems in regression testing, current RTS methods do not
directly apply. The events which expose these defects are based on a change, however,
and solving this problem involves using the same principles. The solution involves an
extension to a current regression test selection method, specifically the Traditional
Firewall originally developed by White and Leung [13]. The firewall concept is based on
building design, where hardened walls are created that prevent a fire from spreading
from one area to another. This should not be confused with network firewalls, which
were named in the same way. This new RTS method is called the Configuration and
Settings Firewall.
Before the Traditional Firewall can be extended to work with this type of change,
a better understanding of RTS methods is presented. RTS methods can be broken down
into a set of common steps. The first step determines the specific areas of the system
that changed. In many RTS methods, this step is accomplished by performing a code or
binary differencing between the newly changed version of code and the previously tested
version. This step is often automated by a tool which compares the two software versions
and generates a list of changes. Next, each difference is identified at a specifically
defined granularity level, depending on the RTS method used. Frequently, the desired
granularity involves identifying the specific function or object that contains the change
and then marking that entire function or object as changed. Once the changes are
identified, the impact of the change on the surrounding system is determined. This step
requires analysis and system knowledge, as dependencies and relationships that exist
within the system must be determined. These relationships include simple concepts such
as control flow, which are related to paths in the code, and more complicated concepts
such as data flow, which describe relationships involving variable passing and program
state. Once the impact from these relationships is identified, it is used in the final step in
RTS methods. This step involves the selection of tests that are needed to verify that this
change does not adversely affect the system. This is accomplished by selecting
previously created tests which cover the affected areas, or creating new tests when
previous tests do not exist or are insufficient to test the impact of the change.
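These four steps can be summarized with a small, purely illustrative sketch. The following Python fragment is not taken from any RTS tool discussed in this work; versions, the call graph, and test coverage are modeled as simple dictionaries so that the flow from differencing to test selection is visible in one place.

```python
# Illustrative sketch of the common RTS steps: diff, lift to unit granularity,
# determine impact, and select tests. All data structures and names are
# simplified assumptions, not the representation used by any real RTS tool.

def changed_units(old_version, new_version):
    """Steps 1 and 2: diff two versions ({unit_name: source_text}) and report
    the differences at the function/object granularity level."""
    units = set(old_version) | set(new_version)
    return {u for u in units if old_version.get(u) != new_version.get(u)}

def impacted_units(changes, callers):
    """Step 3: a simple control-flow impact, here the callers of each change."""
    impact = set(changes)
    for unit in changes:
        impact |= callers.get(unit, set())
    return impact

def select_tests(impact, coverage):
    """Step 4: reuse tests that cover the impact; flag units with no tests."""
    selected = {t for t, covered in coverage.items() if covered & impact}
    tested = set().union(*coverage.values()) if coverage else set()
    return selected, impact - tested

if __name__ == "__main__":
    old = {"read_input": "v1", "scale": "v1", "alarm": "v1"}
    new = {"read_input": "v1", "scale": "v2", "alarm": "v1"}   # scale() edited
    callers = {"scale": {"alarm"}}                             # alarm() calls scale()
    coverage = {"test_scale": {"scale"}, "test_alarm": {"alarm"}}
    impact = impacted_units(changed_units(old, new), callers)
    print(select_tests(impact, coverage))  # both tests selected, nothing uncovered
```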
A solution to the problem of configurable systems and latent defects involves
creating a new type of regression test selection method for configuration and settings
changes by extending current, code-based RTS methods. This new RTS extends the
firewall method for use on user-configurable systems which is accomplished by making
changes to the actions taken in each step of RTS methods. The first RTS step determines
the changes within the system. In the new firewall, changes are derived from the
configurations themselves by comparing the new configuration to the previously running
and tested configuration. Differences are identified and categorized as either settings
changes or configuration changes. The differences between these types of change are
discussed in Chapter 4. Once all configuration differences have been identified and
categorized, the impact of these changes is determined. This is done by starting at the
change itself and determining all relationships which exist from the change to the rest of
the system, similar to how code-based firewalls are created. Any time there is a
dependency between the current function and other functions in the system, the current
function is marked as changed. Once complete, the analysis continues on the related
function and this process repeats until no more dependencies are present. Dependencies
that are checked for include control flow, data flow, and other relationship types for
which there are firewalls available. The process stops only when there are no more
dependencies to include, which means there is no further impact to the current part of the
system from the changed part of the system. Finally, once the impact has been
determined, tests are selected and created to completely validate the impacted part of the
system. These tests can be either new tests or reused tests, depending on if the change
affects parts of the system that have been tested before. This step is very similar to the
various code-based firewall RTS methods. More detail will be presented on each of these
steps and how they are accomplished in Section 2.3 and Chapter 4.
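As a rough sketch of these steps (the concrete construction rules are given in Chapter 4), the fragment below models a configuration as a set of element instances, each with a settings dictionary, plus the links between instances. It diffs two configurations into configuration changes and settings changes, and then propagates impact along the links until no new elements are reached. The data layout, the element names, and the undirected treatment of links are illustrative assumptions only.

```python
# Illustrative sketch only: a configuration is modeled as element instances
# (each with a settings dict) plus links between instances. The precise
# construction rules of the Configuration and Settings Firewall are in Chapter 4.

def diff_configurations(old_cfg, new_cfg):
    """Classify differences as configuration changes (elements added or removed)
    or settings changes (same element, different settings values)."""
    old_elems, new_elems = old_cfg["elements"], new_cfg["elements"]
    added = set(new_elems) - set(old_elems)
    removed = set(old_elems) - set(new_elems)
    settings_changed = {e for e in set(old_elems) & set(new_elems)
                        if old_elems[e] != new_elems[e]}
    return added, removed, settings_changed

def impacted_elements(seeds, links):
    """Follow dependencies transitively until no new elements are reached."""
    impact, frontier = set(seeds), set(seeds)
    while frontier:
        neighbors = {dst for (src, dst) in links if src in frontier}
        neighbors |= {src for (src, dst) in links if dst in frontier}
        frontier = neighbors - impact
        impact |= frontier
    return impact

if __name__ == "__main__":
    old_cfg = {"elements": {"AI1": {"range": 100}, "PID1": {"gain": 2.0}},
               "links": {("AI1", "PID1")}}
    new_cfg = {"elements": {"AI1": {"range": 100}, "PID1": {"gain": 3.5},
                            "ALARM1": {"limit": 90}},
               "links": {("AI1", "PID1"), ("PID1", "ALARM1")}}
    added, removed, changed = diff_configurations(old_cfg, new_cfg)
    seeds = added | removed | changed            # {'ALARM1', 'PID1'}
    print(impacted_elements(seeds, new_cfg["links"]))  # AI1, PID1, and ALARM1
```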
This new Configuration and Settings Firewall makes use of numerous code-change-based testing firewalls developed in the past for different types of dependencies.
These firewall models include the Traditional Firewall, which handles control-flow
dependencies, the Extended Firewall, which handles data-flow dependencies, the
Deadlock Firewall, and the COTS Firewall, for use where source code in not available. In
addition to these firewalls, current research is ongoing to further extend the firewall
concept for dependencies related to memory leaks and performance issues. Each of these
future firewall models is useful for both code changes and configuration and settings
changes. Before the Configuration and Settings Firewall can make use of any of these
code-based firewalls, each must be shown to be effective for code-change-based
regression testing of industrial systems. This effectiveness is shown by a number of
empirical studies conducted on user-configurable software systems at ABB. These
studies are presented in Chapter 3.
2.2 Using the Solution
Customers must understand that the effectiveness of the configuration and settings
firewall is dependent on it being used on each change to the configuration of the system.
For each set of changes made, the system must be analyzed and retested with the new
firewall to verify no latent defects are exposed by the change. This is especially true if the
proposed changes to release testing shown in Chapter 5 are used by the software vendor.
Also, the software vendor must have a good relationship with frequent communication
with the customers of the software, as each change to the configuration will need to have
some testing done to verify that no defects exist in the new configuration. The testing
itself can be done by either the original vendor of the software or, assuming they agree,
the customer who made the change to the configuration.
Each software configuration change will lead to some retesting, so it becomes less
important to do exhaustive testing before release. Customer configurations can be
retested at each release with traditional code-change-based RTS methods. Currently
unused portions of the system can be tested for very general use, if time permits, to show
that they meet their requirements and have no defects in sample configurations. The detailed
testing of these unused functions can wait until a customer decides to add them to their
configuration. At that time, a configuration and settings firewall is created to verify that
no latent defects were exposed by the change. By deferring this testing until customers
configure it, the product can have faster times to market and more satisfied customers, as
customers primarily care that the system works for their configuration.
2.3 Example Applications of the Solution
To better illustrate the concept and use of the Configuration and Settings Firewall,
a few high level examples are presented. These examples show when and how the
firewall can be used and include configurable control and ERP systems. Further details
on the configuration and settings firewall, including how it is built, are presented in
Chapter 4 and will not be discussed in this Section.
The first system is an embedded process controller. These controllers are
configured to do many forms of process control on various input values and types.
Configuring this system involves creating a graph which contains both function code
blocks, which are configurable elements represented by nodes, and relationships between
them, which represent dynamic linkings between the components and are represented
graphically by arcs. Relationships can be unidirectional or multidirectional depending on
the requirements and usage needed between these function blocks. Each function code
contains code which is executed only when an instance of it exists in the configuration
that is currently running in the controller. There are a large total number of function
blocks that can be used to configure a system, many of them dealing with specific data
types and algorithms used to control a process. Within each function code block reside
many settings that affect the way the specific function block executes. These settings take
the form of values assigned in the configuration of the block itself. In general, settings act
as parameters, refining the control block to the specific action required. Settings include
values which define timing values, modes of operation, and even simple labels.
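A small, hypothetical fragment of such a configuration might be pictured as follows; the block types, names, and settings are invented for illustration and do not correspond to any specific ABB product.

```python
# Hypothetical fragment of a controller configuration: function code block
# instances (the nodes), their settings, and the connections between them
# (the arcs). Names and values are purely illustrative.
controller_configuration = {
    "blocks": {
        "AI_TEMP_01":  {"type": "AnalogInput",  "settings": {"scan_ms": 250,
                                                             "units": "degC"}},
        "PID_TEMP_01": {"type": "PIDControl",   "settings": {"gain": 1.8,
                                                             "integral_s": 12.0,
                                                             "mode": "auto"}},
        "AO_VALVE_01": {"type": "AnalogOutput", "settings": {"range_pct": (0, 100),
                                                             "label": "Cooling valve"}},
    },
    # Each connection feeds the output of one block into an input of another.
    "connections": [
        ("AI_TEMP_01.out", "PID_TEMP_01.pv"),
        ("PID_TEMP_01.out", "AO_VALVE_01.in"),
    ],
}
```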
An example based on the first system starts with a controller module that has been
running in the field for five years with no documented issues reported from the customer.
This customer has been steadily adding new configurable elements to the configuration
for the past five years in order to track additional values in the system. These values are
needed by the plant operators and were discovered as important in the process of running
the plant after it was commissioned. Each time a new value is added to the configuration,
it is loaded into the process controller and put into execution. In the most recent change
made, there were a number of new configurable elements added. These additions were
new analog data points that the plant operators wanted to view and monitor as the plant
was running. The plant engineer made the required changes to the configuration and
loaded it into the controller. After the new configuration started running, the controller went into an
error state which caused the plant to shut down. This customer now has a major software
quality issue, including a loss of production, even though the system has been running
failure-free for this customer and no code changes have been made to the software.
Applying the Configuration and Settings Firewall to this example would have
prevented the plant from shutting down by determining the impact of the change and
testing that area for latent software defects. Creating the firewall for this example starts
by identifying the configuration and settings changes made by comparing the new and
old configurations. The details for each step will be shown in Chapter 4. Once the
changes have been identified, a mapping of the configurable elements to the code that
implements them is created. Using this mapping and the differences found, a Traditional
Firewall is created that treats each configuration and settings change as a code change.
Each of the changes is analyzed to determine if additional relationships are present. If so,
other firewalls are created, each of which is described in Chapter 3. Using the results of
the various firewalls, tests are either selected or created to verify that these changes did
not expose any latent defects. After testing is complete, the configuration is loaded into
the customer’s process controller and the plant starts running with the changed
configuration. The defect in this example is due to adding a new configurable element to the
last linking slot of a different, previously configured element. A grouping of up to fifteen
analog input points is created by connecting each point to another configurable element
which allows up to fifteen connections. In this case, a full fifteen analog values had never
been present in the configuration before, since the previous configuration had only ever
used fourteen. This software defect was latent within the code that handles the
fifteenth element and was never uncovered since the code for that specific case had never
been run.
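The kind of latent defect described here can be pictured with a small, purely hypothetical fragment: the grouping element accepts up to fifteen connections, but an off-by-one error in the code that stores the last slot is only reachable once a fifteenth point is actually configured.

```python
# Purely hypothetical illustration of a latent boundary defect. The code path
# for the fifteenth connection slot is only reached when a configuration
# actually fills all fifteen slots, so the defect stays hidden for years.

MAX_POINTS = 15

class AnalogGroup:
    def __init__(self):
        # Off-by-one defect: one slot too few, so index 14 (the fifteenth
        # point) lies outside the list.
        self.slots = [None] * (MAX_POINTS - 1)

    def connect(self, index, point):
        if not 0 <= index < MAX_POINTS:
            raise ValueError("group supports at most 15 points")
        self.slots[index] = point        # fails only when index == 14

group = AnalogGroup()
for i in range(14):                      # fourteen points: runs without failure
    group.connect(i, f"AI_{i:02d}")
group.connect(14, "AI_15")               # the new configuration exposes the defect
```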
Another example user-configurable system deals with human system interface
(HSI) applications. The specific HSI application in this example runs in the Windows OS
on standard desktop PCs. This system is responsible for configuring the physical and
logical organization of the process control system, including physical hardware, network
infrastructure, and control logic. Configuring this system involves adding different
configurable elements from various libraries into a configuration, and then linking these
software elements to the actual physical elements they are representing graphically in the
HSI. In many cases there are multiple configurable elements, such as PIDs and shaping
functions, which are all linked to one specific physical element, such as a temperature or
pressure measurement device.
In this example, the HSI is running the originally released software version and
has reported no defects in the system. The HSI’s configuration is changed every few
months to match physical changes throughout the plant. Recently, the configuration was
changed to add new display and alarm elements for the plant operators. These display
elements are linked to previously existing configurable elements, representing physical
devices in the plant, specifically a temperature sensor and a pressure sensor. The alarm
elements are linked to the new display elements, and their settings contain values describing
when to enter the alarm state and what action to take when this alarm state is reached.
The configuration is loaded into the system and when the alarm event occurs, the
specifically defined action does not occur due to a latent software defect. Thankfully, the
plant operator noticed the value should have triggered an alarm and was able to shut down
the plant, or a larger issue would have occurred.
By using the Configuration and Settings Firewall in this situation, the area
affected by the change would have been identified and tested before the system started
executing the change, and the defect would have been detected earlier. For this example,
a Configuration and Settings Firewall is constructed. In this case, the new alarm and
display elements were linked to previously existing elements and added to the
configuration. From this change information, a Traditional Firewall is created, along with
any other needed firewalls, which identified the impact of the change. Then this impact is
tested, revealing any latent defects exposed in the changed configuration. The defect in
this HSI example is related to the actual response to the event that was configured in the
system, and this response is a selection of possible actions that can be taken to remedy
the event. Inside this HSI system, the action selected by this changed configuration had
never been run before, and this action did not respond properly due to the latent defect in
the code. This defect existed in an area of the code that was contained in the impact
determined by the Configuration and Settings Firewall, which would have led to it being
detected earlier.
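The shape of this kind of defect can be sketched as follows; the action names and the bug are invented purely for illustration, but they show how an alarm action that has never been selected by any configuration can hide a defect until a configuration change finally routes execution through it.

```python
# Hypothetical sketch: an alarm handler whose individual actions execute only
# when a configuration selects them. The defect in close_valve() stays latent
# until a configuration change routes execution through that action.

def notify_operator(tag):
    print(f"ALARM {tag}: operator notified")

def close_valve(tag):
    # Latent defect: concatenating None raises a TypeError, so the valve
    # command is never issued when this action is finally selected.
    command = "CLOSE_VALVE:" + None
    print(command)

ALARM_ACTIONS = {"notify": notify_operator, "close_valve": close_valve}

def handle_alarm(tag, configured_action):
    ALARM_ACTIONS[configured_action](tag)

handle_alarm("PT-101", "notify")         # the only action ever configured so far
handle_alarm("PT-101", "close_valve")    # newly configured action exposes the defect
```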
A final example of configurable systems is an ERP system. The ERP used in this
example is a SAP system and is configured to run centralized accounting functions. This
system was running for over four years with no major quality problems. The customer
decided to change from the International Financial Reporting Standards to the Generally
Accepted Accounting Principles that the US markets use. This required configuration
changes, which led to a small error affecting financial records. This failure was found
internally by audits, but it now represents a major defect, since the Sarbanes-Oxley Act
requires all financial records to be certified correct by the CEO and executives of the
company. By constructing a Configuration and Settings Firewall, defects of these types
can be detected before the configuration change goes into live use.
These examples make it clear that the problem of latent software defects
contained in user-configurable systems is critical. Applying the Configuration and
Settings Firewall is an effective way to address this problem and can detect these defects
before they cause major customer problems. Chapter 3 briefly presents the underlying
code-based firewalls that are used in the Configuration and Settings Firewall,
highlighting significant new empirical studies that show their effectiveness in code-based
regression testing. Chapter 4 presents the details of the Configuration and Settings
Firewall, including the differences between configurable elements and settings, a detailed
listing of the action for each of the steps, when and how the firewalls in Chapter 3 are
used, and a detailed set of examples on how to construct this new firewall. Chapter 5
presents a modified release testing process to augment use of this firewall after release.
Finally, Chapter 6 presents the setup and results of a set of empirical studies that show
the effectiveness and efficiency of this method in practice.
3. Needed Firewalls
The Configuration and Settings Firewall utilizes code-change-based firewalls to
determine impact. In order for the new firewall to be accurate, an empirical investigation
of previously developed firewall models has been conducted, along with the creation of a
new firewall to handle a type of impact propagation not covered by the existing models. The
Traditional, Extended, and COTS Firewall ideas were developed by others outside the
scope of this research. The Deadlock Firewall and all of the empirical investigations of
these firewalls were developed as part of this research. In addition, firewalls are proposed
for additional types of impact propagation that are not currently supported.
This chapter is organized as follows. Section 3.1 contains a brief overview of the
Traditional Firewall [13], as well as the details of the industrial empirical studies
conducted on it. Section 3.2 presents the Extended Firewall and the results of new
empirical studies on its use in industrial practice. Next, Section 3.3 presents the COTS
Firewall for third party components and the results of studies on its effectiveness. Section
3.4 discusses the newly created Deadlock Firewall and presents the results of empirical
studies of its use in industry. Finally, Section 3.5 discusses additional future firewalls
that would be beneficial for both code-change-based regression and the new
Configuration and Settings Firewall.
3.1 Traditional Firewall
The Traditional Firewall RTS method (TFW) can be applied to both Procedural
software [13] and Object-Oriented software [9]. These methods both involve determining
the difference between the code of a previously tested software version and a changed
version that needs to be tested. This difference is usually created with some kind of
differencing tool, such as Araxis Merge [42]. Each individual change in the source code
is mapped to the function or object in which it resides and this function or object is
marked as changed. This mapping sets the granularity of the change to the level required
for the RTS method, in this case the function or object level. Once all of the differences
have been determined, an analysis of the impact the changes have on the software is
performed. The impact analysis used in the Traditional Firewall method, both procedural
and OO versions, involves starting at the function or object identified as changed and
then selecting each function or object that is one level away, in a control flow graph, from
the change as needing to be tested. Each of these functions or objects, including the
change itself, is considered to be inside the testing firewall. Only the functions or objects
that have a calling relationship with the changed entity will need to be tested. Data flow
relationships are not considered in this firewall model. Each function or object identified
as needing testing is mapped to test cases based on the type of change. These types
include checked, requiring only a few tests to check the functionality of the component,
changed, requiring a complete retest of the component, or affected, which requires all
interfaces to the changed component to be thoroughly tested. Based on this classification,
tests are selected to cover the functions and objects within the firewall and nothing
outside the firewall requires any retesting.
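A compact way to picture this construction is the sketch below: given a call graph and the set of changed functions, it collects everything one call level away and labels the units inside the firewall. The representation is an illustrative assumption and not the original formulation in [13]; in particular, the "checked" category depends on the kind of modification and is not modeled here.

```python
# Illustrative sketch of Traditional Firewall construction: changed units plus
# their direct callers and callees (one level away in the call graph) form the
# firewall; everything outside it needs no retesting.

from collections import defaultdict

def build_traditional_firewall(call_pairs, changed):
    """call_pairs: iterable of (caller, callee); changed: set of unit names."""
    callers, callees = defaultdict(set), defaultdict(set)
    for caller, callee in call_pairs:
        callers[callee].add(caller)
        callees[caller].add(callee)

    firewall = {unit: "changed" for unit in changed}
    for unit in changed:
        for neighbor in callers[unit] | callees[unit]:
            firewall.setdefault(neighbor, "affected")   # one level away only
    return firewall

if __name__ == "__main__":
    call_graph = [("main", "parse"), ("main", "run"),
                  ("run", "scale"), ("run", "log"), ("scale", "clamp")]
    firewall = build_traditional_firewall(call_graph, {"scale"})
    print(firewall)  # scale is changed; run and clamp are affected; the rest is outside
```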
For procedurally designed systems, the firewall is represented graphically as a
calling tree where functions are the nodes and function calls are the arrows. Each code
change is marked on the node of the graph representing the changed function. The
boundaries of the firewall are represented as bold arcs on the function calls leading into
the functions requiring test. In practice, when complete calling graphs do not exist, these
graphs are generated ad-hoc, by starting at the change and following the control flow
dependencies which exist in the code. An example of a procedural firewall graph is
shown in Figure 1.
Figure 1. Example Procedural Firewall Graph [13]
For object oriented systems, the firewall is represented graphically as a class
relationship diagram. Each class is represented as a node and each relationship is
represented as an arc. Different styled arrows represent each kind of OO relationship
supported in the model, namely inheritance, composition, association, and usage. The
arrows that mark the edge of the testing required are bold as in the procedural version.
These firewalls, when used in industry, are created in a manner similar to the procedural
method, by starting at the changed object and building the graph along each control flow
dependency until there are no additional relationships left to add. Figure 2 shows an
example of an Object-Oriented firewall graph.
Figure 2. Example Object-Oriented Firewall Graph
These Traditional Firewalls, both procedural and object oriented, were never
empirically studied on real systems in industry. Before they can be used in the
Configuration and Settings Firewall, their effectiveness on real industrial systems was
verified as part of this research. This empirical evaluation was accomplished by selecting
iterative versions of many different software systems at ABB and creating Traditional
Firewalls on them [7, 39]. The TFW was run on fourteen different major releases of
software, and the object-oriented firewall was run on fifteen different major releases, four
at ABB and eleven at another company. These selected versions represented new releases
of the software at the time of the study and used the firewall method to select tests
required to verify the system had no regression defects. In order to determine the
effectiveness and efficiency of the firewall method, the original ABB method of selecting
tests was run in parallel. This original method involves guesswork based on feedback
from previous releases and both system and developer-based expert knowledge. This
previous ABB method was, in effect, a very rudimentary RTS method. The analysis
conducted here is somewhat different from that used in other studies on regression, as
the effectiveness is determined by comparing the ABB expert guess method against the
TFW method. Previous studies compared the retest-all case to the specific RTS
method under evaluation. Complete retesting of these large systems takes such a long
time that using it as a comparison for TFW would result in a reduction so large that it
would be meaningless. For example, a complete retest for a specific software product at
ABB takes three and a half man-years of effort. The ABB expert guess method identified
three man-weeks of regression testing. The TFW for a small change may select three
man-days' worth of effort, leading to a 99.87% reduction in test time when compared to the
retest-all case. A more accurate comparison of the TFW to the current ABB RTS method
yields an 85.8% reduction in test time. By comparing the TFW to the current ABB
method, we can present a benefit that is more meaningful to management and developers.
The Traditional Firewall for procedural software was empirically validated with
case studies performed at ABB for the releases listed in Table 1. The time required to test
the TFW, shown in the firewall test time column, includes both analysis time, listed in the
second column, and time required to execute the tests identified by the TFW. Using the
TFW for these releases at ABB led to an average reduction of 42% in test cases executed
and an average reduction of 36% in test time, when compared to the original test
selection method used in ABB [7]. In addition to savings in test cases and calendar time,
an additional 14 defects were detected that would not have been detected in the original
testing, as well as all 28 of the defects that would have been detected using the original
test selection method. The increase in defects detected is due to the developers and testers
selecting incorrect areas to retest based on both their past data and their expert
knowledge. In addition to these savings and defects detected, no customer regression
defects were reported against the versions in this study after release, showing that the
firewall is effective at selecting the correct test cases.
Table 1. Results of Procedural Firewall Testing at ABB

Project | Analysis Time (Hours) | Files Modified | Builds Tested | Firewall Test Time (Days) | Orig. Test Time | % Time Savings | Original Tests | Firewall Tests | % Test Savings
1      | 10    | 10  | 2  | 4   | 10  | 60%   | 75   | 42   | 44%
2      | 6     | 13  | 1  | 3   | 5   | 40%   | 25   | 10   | 60%
3      | 13    | 2   | 9  | 8   | 20  | 60%   | 60   | 40   | 33%
4      | 9     | 17  | 3  | 10  | 20  | 50%   | 150  | 84   | 44%
5      | 23    | 22  | 5  | 15  | 30  | 50%   | 140  | 66   | 53%
6      | 18    | 7   | 11 | 25  | 35  | 29%   | 500  | 305  | 39%
7      | 3.5   | 5   | 1  | 10  | 5   | -100% | 30   | 45   | -50%
8      | 1     | 7   | 5  | 3   | 8   | 63%   | 110  | 60   | 45%
9      | 1     | 4   | 1  | 4   | 5   | 20%   | 50   | 20   | 60%
10     | 0.5   | 2   | 2  | 5   | 10  | 50%   | 60   | 30   | 50%
11     | 0.5   | 2   | 4  | 5   | 10  | 50%   | 60   | 30   | 50%
12     | 60    | 31  | 17 | 15  | 20  | 25%   | 200  | 90   | 55%
13     | 10    | 113 | 8  | 25  | 35  | 29%   | 250  | 155  | 38%
14     | 15    | 93  | 27 | 20  | 25  | 20%   | 200  | 130  | 35%
Totals | 170.5 | 338 | 96 | 152 | 238 | 36%   | 1910 | 1107 | 42%
The use of the object oriented TFW at ABB led to a reduction in test cases
executed by 63% and a reduction in test time of 71%, shown in Table 2. The data was
collected and created in the same way that it was for the procedural TFW. In addition, no
new regression defects were detected by the customers after release in the two object-oriented systems analyzed. The results of using TFW on object-oriented software are
shown in Table 2. Additional results are shown in Tables 3, 4, 5, and 6.
Table 2. Results of Object-Oriented Firewall Testing at ABB

Project | Analysis Time (Hours) | Files Modified | Builds Tested | Firewall Test Time (Days) | Orig. Test Time | % Time Savings | Original Tests | Firewall Tests | % Test Savings
1      | 0.5 | 3 | 1 | 3 | 10 | 70% | 130 | 51 | 61%
2      | 1.5 | 4 | 1 | 2 | 7  | 71% | 60  | 20 | 67%
Totals | 2   | 7 | 2 | 5 | 17 | 71% | 190 | 71 | 63%
Since this TFW method only takes control flow relationships into account, it is
not considered “safe”, which is defined as guaranteeing that all possible defect finding
tests remain in the regression test suite [6]. Even with that limitation, these empirical
results show that the method is effective while still being very efficient for use on real
software systems in industry [7].
3.2 Extended Firewall
The main shortcoming of the Traditional Firewall is that its impact analysis stops
one level away from the change. This limits the effectiveness of the method to
only control flow dependencies. While no defects were missed in the releases studied for
the TFW at ABB, a few regression defects were found by customers using other products
or versions that were not analyzed with the firewall method. Some of these defects would
not have been detected in the TFW, as these defects were due to changes made in long
data flow relationships within the software.
The Extended Firewall (EFW) was developed to increase the effectiveness of the
TFW by extending the impact more than one level away from the change when certain
data flow paths are present. This method extends the firewall model from just control
flow dependencies to include these longer data flow dependencies which exist in large
user-configurable software systems. Regression testing of data flow paths is not new and
has been used in past research, such as [13, 22, 27, 29, and 39].
The key additions to the Traditional Firewall are the ideas of external
dependencies and handling of return values. External dependencies occur when inputs to
a function or object are based on another function or component output, creating a data
flow dependency between the components. Return values from functions are also now
30
analyzed to determine if they form a unique data flow path in the code. The TFW
assumes that all inputs to changed functions have no external dependencies and therefore
selecting objects or functions one level away is enough.
The EFW starts with a standard TFW, as shown in Section 3.1. As the TFW is
created, each input to a changed function or object is inspected to determine if it has
any external dependencies. These dependencies include bi-directional paths where both
callers and return paths are included. If dependencies exist, then this dependent function
or component is included in the firewall and its callers are checked to see if they have
external dependencies or not. This continues until the chain breaks by reaching a function
or object that has no external dependencies, which becomes the edge of the firewall. This
node represents the start of the data dependency and the whole path, from this starting
node all the way back to the change, is identified as a data flow path that needs to be
retested. An example of an Extended Firewall is shown in Figure 3.
Figure 3. Example Extended Firewall Graph
Figure 3 shows an example EFW graph created for an Object Oriented system.
This graph represents a subset of the total relationships that exist in the system and only
includes objects within the firewall that require retesting. A data flow path exists, starting
at the node labeled P and ending with the node labeled K. Notice that there are numerous
nodes labeled A or C that do not include longer data flow paths. For these nodes, the one
level away method of the Traditional Firewall is sufficient to detect regression defects.
The TFW is contained completely within the EFW, so all tests that the TFW selects are
contained in the set of tests that EFW selects. In addition, the EFW only adds tests for
these additional data flow paths that are identified in the analysis.
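The extension step itself can be sketched as a simple traversal over an external-dependency map. The Python sketch below is only an illustration of the chain-following rule described above, not the analysis procedure used in the studies; the dependency map and all identifiers are assumed inputs produced by the (largely manual) impact analysis.

    # Minimal sketch of the EFW expansion, assuming 'external_deps' maps each
    # function or object to the components whose outputs feed its inputs.
    # All identifiers are illustrative.

    def extend_firewall(tfw, changed, external_deps):
        """Grow a Traditional Firewall along chains of external data dependencies."""
        efw = set(tfw)
        frontier = list(changed)
        while frontier:
            node = frontier.pop()
            for producer in external_deps.get(node, []):
                if producer not in efw:
                    efw.add(producer)           # the producer joins the firewall
                    frontier.append(producer)   # keep walking until a node with no
                                                # external dependencies is reached
        return efw

    tfw = {"K", "M"}                            # K changed; M is its direct caller
    external_deps = {"K": ["J"], "J": ["P"]}    # K's input comes from J, J's from P
    print(extend_firewall(tfw, {"K"}, external_deps))
    # The EFW adds J and P; the path P -> J -> K is a data flow path to retest.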
In order to determine the effectiveness of the Extended Firewall, an empirical
study was conducted on a large user configurable software system. This empirical study
was carried out on ABB software as well as a separate study on software from a large
telecommunication company in another research project.
The first empirical study of the EFW was conducted on a large user configurable
software system at ABB as part of this research project. This study was completed on a
system with over one million lines of code and more than 1,000 classes. Due to the size
of the system, only two incremental versions were analyzed. The results are shown in
Tables 3 and 4.
Table 3. Results of EFW Testing at ABB

Ver. | Modified Classes | TFW Methods | EFW Methods | TFW Classes | EFW Classes | Faults Detected
1    | 29 : 12 new | 181 | 239 | 86 | 101 | 6 : 1 new
2    | 18 : 3 new  | 145 | 163 | 82 | 94  | 10 : 2 new
Table 4. Effort Required for EFW Testing at ABB

Ver. | TFW Tests | EFW Tests | Add. EFW Tests | TFW Analysis (Hrs) | EFW Analysis (Hrs) | TFW Test Time (Hrs) | EFW Test Time (Hrs) | Add. EFW Time
1    | 168 | 219 | 30% | 20 | 25 | 35 | 48 | 37%
2    | 112 | 141 | 26% | 17 | 20 | 21 | 27 | 29%
This data shows the number of classes modified as well as the number of methods
that were identified as affected in both the TFW model and the EFW model. The
Extended Firewall added approximately 28% more test cases and 34% more test time
when compared to the Traditional Firewall. This additional time is due to each input to
the changed object or function having to be checked to see if it contained external
dependencies. In addition, if it does contain these dependencies, more tests must be rerun.
The EFW will always detect all defects that the TFW does, as the EFW only adds
additional retesting areas to the results of the TFW. This is shown with the TFW and
EFW columns in Table 3. Each version’s TFW numbers are less than or equal to the
numbers in the EFW. For all of the extra time spent, there are a total of three defects that
are only detected by the EFW.
The second empirical study involved a large telecommunication system and was
conducted as part of a separate research study by another researcher. It is listed here only
to show the effectiveness of the EFW method. This study contained 11 incremental
software releases each with around 66 classes and 11K lines of C++ code. Each version
had both a TFW and an EFW constructed for it. The results of the empirical study are
shown in Tables 5 and 6.
Table 5. Results of EFW Testing at Telecom Company

Builds | Modified Classes | TFW Methods | EFW Methods | TFW Classes | EFW Classes | Faults Detected (Failures)
Ver. 1 |         |    |    |    |    |
1.1    | 2       | 5  | 13 | 5  | 7  | 2
1.2    | 1 (new) | 2  | 3  | 2  | 3  | 1
Total  | 3       | 7  | 16 | 7  | 10 | 2
Ver. 2 |         |    |    |    |    |
2.1    | 1       | 3  | 3  | 3  | 3  | 1 new
2.2    | 1       | 1  | 1  | 1  | 1  | 1
2.3    | 1       | 2  | 3  | 2  | 3  | 2
2.4    | 1       | 3  | 5  | 3  | 4  | 2 : 1 new
2.5    | 1 (new) | 1  | 2  | 1  | 2  | 3
Total  | 5       | 10 | 14 | 10 | 13 | 2 : 1 new
Ver. 3 |         |    |    |    |    |
3.1    | 1       | 1  | 2  | 1  | 2  | 2
3.2    | 1       | 4  | 4  | 4  | 4  | 2
3.3    | 1       | 1  | 3  | 1  | 2  | 3
3.4    | 1       | 1  | 2  | 1  | 2  | 2 : 1 new
Total  | 4       | 7  | 11 | 7  | 10 | 23 : 4 new
Table 6. Effort Required for EFW Testing at Telecom Company

Ver. | TFW Tests | EFW Tests | Add. EFW Tests | TFW Analysis (Hrs) | EFW Analysis (Hrs) | TFW Test Time (Hrs) | EFW Test Time (Hrs) | Add. EFW Time
1    | 88 | 115 | 31% | 6 | 13 | 15.7 | 20.2 | 52%
2    | 90 | 105 | 17% | 7 | 11 | 16.0 | 18.5 | 28%
3    | 80 | 110 | 38% | 5 | 10 | 14.3 | 19.3 | 52%
This study showed that the EFW added approximately 28% more test cases and
40% more test time. This time includes both analysis time and test time. While EFW
testing required more time than the standard TFW, four new EFW defects were found in
the two versions that the TFW would have missed. This data is similar to the data
collected from ABB in the first study.
These empirical studies show the Extended Firewall to be effective for industrial
software systems when extended data flow paths are present and affected by the change.
This method is less efficient overall than the TFW, as it takes more time to identify and
map the external dependencies in the software, but the benefit of this additional time is an
increase in the effectiveness of the model when compared to the TFW. Because of the
larger amount of effort to complete this firewall, it is recommended that the EFW be used
only when these data flow paths are present in the system, which can be determined as
the TFW is being created.
3.3 COTS Firewall
The regression test selection methods listed in Sections 3.1 and 3.2 base their
analysis on source code. Many software systems, including user configurable systems,
use third party commercial-off-the-shelf (COTS) components. These components are sold
to the user with no code, only containing executable images such as library files or
DLLs and some user documentation. The users of these COTS components must
integrate these black boxes into their system by creating glue code that interfaces
between the COTS component and the system which uses it.
These COTS components often change, sometimes as frequently as every eight
months [43]. When these components change, customers that use them often need to
conduct regression testing to verify that the changes in the COTS component do not
adversely affect the customer's own system, which contains and uses these components.
This is made difficult by the vendors, who do not include any reliable change information
with the new version, leaving customers without the details needed for a code-based RTS method.
Without source code available, a retest all or an expert knowledge guess method
are needed to select test cases for regressing these systems. A solution to this problem is
to extend the firewall model to support COTS software, which was developed as a joint
collaboration with North Carolina State University (NCSU) and ABB [14]. This
extension works directly with the binary image files instead of the source code, as source
is not available for COTS analysis. The changes in the binary files must be identified, in
this case by a direct differencing of the images. Changes to the internals of the
component must then be mapped to the top level API functions that use the changes,
which are marked as affected. These affected API functions can then be compared to the
customer glue code so customer impact can be identified. Once this is complete, a user
can retest the parts of their system that use these affected top level API functions. All
other API functions that do not use the changed parts of the COTS component do not
require regression testing. Figure 4 shows an example COTS Firewall graph.
Figure 4. Example COTS Firewall Graph [14]
This figure shows how changes internal to the COTS component are mapped to the
top level API functions. Changes are propagated from the changed function to the
externally exposed API functions one caller at a time. Changed function N is called by
E4, which is then marked as changed. Once the top level API functions are identified as
either changed or unchanged, each caller in the customer's application that calls a
changed COTS function is marked as changed, shown as G2 and G4. Additionally,
TFWs and EFWs are created around each, which leads to G3 and the unlabeled nodes
directly connected to G2 and G4 requiring testing.
This firewall is constructed following the same steps as the previous firewalls,
namely differencing, impact, and test selection. The difference is determined by
comparing the previously used and tested binary image of the COTS component to the
newly changed binary image. This difference, since it contains compiled binary code, can
have many sources of change, only a few of which may be due to actual source changes. These
source changes are the only differences that are important for this analysis, so a student at
NCSU created a method to remove this other unneeded information. The removed
information includes address table changes, specific calling addresses that move within
the library, and other compiler related flags and options. More information on the details
of how this method works can be found in [14].
Once the changed functions are identified, the impact must be determined. The
impact analysis begins the same way that the Traditional Firewall does, by starting at the
change. Instead of stopping one level away, as the TFW does, or determining data
dependencies, as the Extended Firewall does, this method goes up the calling tree to
determine the highest level API functions that call this change. Those API functions are
considered affected and needing retesting to verify that there are no changes in the COTS
component that break the customer application. This calling tree is created by using the
various address tables that exist within the different types of components. Each of the
affected API functions must then be retested by executing the parts of the customer
application that use these functions. No other API functions need to be retested.
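The impact-propagation step can be sketched as a reverse reachability computation over the recovered call graph. The Python sketch below is illustrative only and assumes the binary differencing has already produced the set of changed internal functions; the caller map, the exported API set, and the glue-code usage map are invented inputs standing in for data recovered from the component's address tables and the customer's code.

    # Sketch of the COTS Firewall selection step. All identifiers are illustrative
    # and mirror the node names used in Figure 4.

    def affected_apis(changed_internal, callers, exported):
        """Propagate changes upward to the component's exported API functions."""
        affected, work, seen = set(), list(changed_internal), set(changed_internal)
        while work:
            func = work.pop()
            if func in exported:
                affected.add(func)
            for caller in callers.get(func, ()):
                if caller not in seen:
                    seen.add(caller)
                    work.append(caller)
        return affected

    def glue_code_to_retest(affected, glue_calls):
        """Select customer glue-code functions that call an affected API."""
        return {g for g, apis in glue_calls.items() if apis & affected}

    callers = {"N": {"E4"}}                       # internal function N is called by API E4
    exported = {"E1", "E2", "E3", "E4"}
    glue_calls = {"G1": {"E1"}, "G2": {"E4"}, "G4": {"E4"}}
    apis = affected_apis({"N"}, callers, exported)
    print(apis)                                   # E4 is the affected exported function
    print(glue_code_to_retest(apis, glue_calls))  # G2 and G4 must be retested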
In order to verify the effectiveness of this COTS Firewall, there were a total of
four different empirical studies conducted at ABB. The first study was conducted on a
757 thousand lines of code (KLOC) ABB application written in C/C++, using a 67
KLOC internal ABB software component in library (.lib) files written in C. This internal
ABB component was considered the COTS component in this study as it was created and
built in a different location and then the .lib file was used in the main application. No
source code for the .lib file was available to the developers of the main application. The
result of the first case study indicates that this COTS Firewall can reduce the required
regression test cases by 40% on average [14]. Some releases required no retesting, as no
changes in the component affected the APIs that the product was using. The detailed
results are shown in Table 7.
Table 7. COTS Firewall, First Study Results at ABB

Metrics | 1 vs. 2 | 2 vs. 3 | 3 vs. 4 | 4 vs. 5 | 5 vs. 6
Changed component functions | 164 | 668 | 1 | 664 | 2
Added component functions | 3 | 2 | 0 | 0 | 1
Deleted component functions | 4 | 2 | 0 | 0 | 0
Affected exported component functions | 331 | 331 | 2 | 331 | 39
Affected functions in the application | 60 | 60 | 0 | 60 | 0
Total test cases needed | 592 | 592 | 0 | 592 | 0
% of reduced test cases | 0% | 0% | 100% | 0% | 100%
The results in Table 7 show that, for this component and its changes, either all the
API functions that are accessed by the customer need to be retested or none of them. In
two of the revisions, there was no regression impact identified to the main ABB
application at all, while in the other three versions, all of the APIs the customer code used
were affected and needed to be retested.
The second study was conducted on a 400 KLOC ABB application written in
C/C++. This product uses a 300 KLOC internal ABB software component in library (.lib)
files written in C. The full, retest-all strategy takes over four man months of effort to run.
Five incremental releases of the component were analyzed and compared to study the
effectiveness of the COTS Firewall method at reducing regression test cases. The results
of the study are shown in Table 8.
Table 8. COTS Firewall, Second Study Results at ABB

Metrics | 1 vs. 2 | 2 vs. 3 | 3 vs. 4 | 4 vs. 5
Total changed functions identified | 388 | 1238 | 4 | 13
True positive ratio | 99.46% | 98.39% | 100% | 100%
Affected exported component functions | 84 | 122 | 1 | 8
% of reduced affected exported component functions | 31.71% | 0% | 99.18% | 93.44%
Affected user functions in the application | 38 | 59 | 1 | 6
% of reduced affected user functions | 17.39% | 0% | 98.31% | 89.33%
Total test cases needed | 151 | 215 | 11 | 20
% of reduced test cases | 30% | 0% | 95% | 91%
The first release reduced testing by 30%, a significant reduction compared to the
original testing, a full retest-all, that was performed on this release without the COTS
Firewall. The second release had no reduction because all of the API functions were
affected by internal changes. For both of these first two releases there were a large
number of internal changes to core functions. The final two releases had a significant
reduction in the regression test cases needed, saving considerable time over the retest-all case.
The third empirical study was conducted with the same ABB application that was
used in the first case study. This application uses many different components, and this
study looked at a different component within this application, specifically a three KLOC
internal ABB software component. This component is a DLL file written in C. Four
incremental releases of this component were analyzed with the COTS Firewall method.
The results for this study are shown in Table 9.
The results from this study show another example where either complete testing
or no testing is required. The first and third comparisons showed no reduction in testing
needed due to all the top level API functions being affected by the internal change. The
second comparison showed that no testing was needed, as none of the used API functions
were affected.
Table 9. COTS Firewall, Third Study Results at ABB

Metrics | 1 vs 2 | 2 vs 3 | 3 vs 4
Affected exported component functions | 45 | 9 | 44
True positive ratio | 100% | 100% | 100%
% of affected exported component functions | 91.8% | 18.4% | 84.6%
Affected glue code functions | 2 | 0 | 2
% of affected glue code functions | 100% | 0% | 100%
Total test cases needed | 31 | 0 | 31
% of test cases reduction | 0% | 100% | 0%
Actual regression failures found | 1 | 0 | 0
Regression failures detected by reduced test suite | 1 | 0 | 0
The final empirical study was conducted on a 405 KLOC ABB application written
in C/C++. This application incorporates 115 different internal ABB software
components, of which 104 are .dll format and 11 are .ocx format. These components were
written in C/C++. Four of these components were selected for study. Each is
implemented in the Component Object Model (COM) [44], three of which are packaged
in a DLL file and one is packaged in an OCX file. The results of this study are shown in
Table 10.
Table 10. COTS Firewall, Fourth Study Results at ABB

Metrics | 1 vs 2 | 2 vs 3 | 3 vs 4
Same linker? | Yes | Yes | Yes
Affected exported component functions | 3 | 10 | 3
True positive ratio | 100% | 100% | 100%
% of affected exported component functions | 42.9% | 66.7% | 75%
% of test cases reduction | 93.4% | 97.6% | 90.4%
Actual regression failures found | 1 | 1 | 1
Regression failures detected by reduced test suite | 1 | 1 | 1
These results show an average reduction in test cases of over 90% for these
releases. In addition to the savings, one regression defect was found in each release by
using this COTS Firewall.
The final results of the four empirical studies show that the COTS Firewall is
effective in reducing the number of tests needed to retest the customer software due to
changes in third party COTS components. There are some factors that limit the
effectiveness of this method. The first limitation is shared with most other RTS methods
and deals with the design of the component itself. If the component is highly coupled,
even a simple change can have a very large impact on the rest of the component. This
was the case in the components studied, where the test reduction was either 100% or 0%.
Another factor which limits the effectiveness of this method is legal in nature. Breaking
down a component into its constituent parts and determining relationships between them
could be considered reverse engineering, which goes against the End User License
Agreement (EULA) that the COTS components are sold under. It is therefore very
important to work with the vendor when using this method on third-party COTS
software [45].
3.4 Deadlock Firewall
There are other types of relationships, such as the data flow relationships in the
Extended Firewall, which are not handled in the Traditional Firewall method. One of
these has been an issue for user-configurable software systems at ABB, specifically
relationships that lead to deadlock. Deadlock occurs when two or more processes request
the same set of resources in a different order. Since these resources are held by processes
which request additional resources, none of the contending processes can make progress
in their activity. In the real-time software studied at ABB, deadlock can occur with many
types of relationships. These relationships include the traditional case of tasks and
semaphores, which are the processes and resources, respectively, as well as slightly
different cases dealing with message queues and task interaction, which is a specific case
of a blocking system call in the software. Any place in the code where the system or
application waits on a resource while holding another resource at the same time is a
candidate for deadlock, depending on the order the resources were taken in.
The concept of deadlock and its detection is not new, so only a quick overview
will be presented. The first case to consider involves a system with two tasks and two
semaphores, shown in Figure 5. It is arbitrarily assumed that task T1 requests and takes
semaphore S1 before task T2. At a point later in time, T1 will request semaphore S2, and
it does this after T2 has requested S2. After T2 requests S2, it requests semaphore S1 at a
time after it has been taken by T1. Neither tasks T1 nor T2 can continue to execute
because T1 holds S1 and T2 holds S2 and both are waiting for each other's resource
without releasing it. T1 needs S2, T2 needs S1, and so no progress can be made by either
task. This is known as two way deadlock. This precise ordering of the events in tasks T1
and T2 are what cause the deadlock to occur. If the order in which the semaphores were
taken was consistent throughout the software system, for example, always requesting S1
before S2, no deadlock could occur. This important ordering is why deadlock rarely
happens at release, since the development team has a greater understanding of the usage
of resources in the system. Once the software is released and maintenance begins, that
system resource knowledge can be lost and regression defects injected.
Figure 5. Example Deadlock Graph, Two-Way
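The ordering condition just described can be checked structurally by building a "taken-before" relation from per-task semaphore acquisition sequences and looking for a cycle. The Python sketch below is illustrative only; the task traces are invented, and it assumes, as a simplification, that every earlier semaphore in a task's sequence is still held when a later one is requested.

    # Sketch of a lock-order check over per-task semaphore acquisition sequences.
    # A cycle in the taken-before relation indicates a potential deadlock.
    from itertools import combinations

    def lock_order_edges(acquisitions):
        """Collect (held, requested) pairs from each task's acquisition sequence."""
        edges = set()
        for order in acquisitions.values():
            # Simplifying assumption: earlier semaphores are still held when a
            # later one in the sequence is requested.
            for held, requested in combinations(order, 2):
                edges.add((held, requested))
        return edges

    def has_order_cycle(edges):
        """Detect a cycle in the taken-before graph with a depth-first search."""
        graph = {}
        for a, b in edges:
            graph.setdefault(a, set()).add(b)

        def visit(node, on_stack):
            if node in on_stack:
                return True
            on_stack.add(node)
            found = any(visit(nxt, on_stack) for nxt in graph.get(node, ()))
            on_stack.discard(node)
            return found

        return any(visit(node, set()) for node in graph)

    traces = {"T1": ["S1", "S2"], "T2": ["S2", "S1"]}   # inconsistent ordering
    print(has_order_cycle(lock_order_edges(traces)))     # True -> potential deadlock

If the two tasks took S1 and S2 in the same order, no cycle would exist and no deadlock would be reported.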
Another example involves three-way deadlock which can occur in real-time
software. The three-way deadlock graph is shown in Figure 6. Three tasks T1, T2 and T3
are involved as well as three semaphores S1, S2 and S3. T1 takes S1 first, and later
requests S2. T3 takes S2 first, and later requests S3. T2 takes S3 first, and later requests
S1. This is an example of a cyclical deadlock where each task holds one unique resource
and is waiting for a different resource already held by another task. It is possible to have a
k-way deadlock, where k = 3, 4, 5…n. In practice, this would be very inefficient to
design and operate, and k would be limited by the number of tasks or semaphores,
whichever is smaller.
Figure 6. Example Deadlock Graph, Three Way
The real-time systems studied at ABB have shown that there are many other
sources of deadlock that do not deal directly with tasks and semaphores. Deadlocks can
arise from the use of message queues or other forms of task interaction via messages,
such as signals, as long as any blocking calls are present. More generally, deadlock can
occur whenever a number of processes share any type of global resources, assuming that
the task waits for access to that resource.
With message queues, deadlock can occur when the queues fill up. This is a
condition very difficult to preclude or predict. A common example of this in software
today is TCP sockets with blocking waits. These sockets have a sliding window buffer
which, when full, will not accept more data, putting the process in a blocked state. Figure
7 shows a different scenario that will exhibit deadlock. If any message queue fills up,
there is a corresponding task that cannot process an incoming message since it is stuck
waiting to send to a filled message queue, which backs up the cycle and causes deadlock
to occur. Specifically, if Q1 fills up, task T1 is unable to process incoming messages
since it is blocked waiting for an outgoing message to be sent. This exhibits the necessary
condition for deadlock where a task must be both a producer and consumer of messages.
Figure 7. Example Deadlock Graph with Message Queues
Within the software studied, initial designs rarely contained conditions for
deadlock, but frequently these deadlock conditions were the result of hasty revisions of
the code without careful testing or analysis. This led to regression defects in test or in the
field due to the changes in the software. This issue becomes even more important when
studied in user-configurable software systems, such as those developed by ABB. These
systems may not exhibit deadlock in thousands of runs of the software since this deadlock
may be dependent on a specific configurable element to be running, or even a specific
setting value being used. Unlike the firewalls discussed so far, which are testing firewalls,
the firewalls for detecting deadlocks will be based on structural analysis, as opposed to
actual testing. The structural analysis will consist of labeled graphs, an example of which
is shown in Figure 8. This figure shows a 2-way deadlock situation with three tasks and
three semaphores.
Figure 8. Example Deadlock Firewall Graph for a Modified Task
Creating this firewall follows the same methodology as the previous firewalls.
The first step is to create a Traditional Firewall. The analysis for deadlock starts at the
changed nodes in the Traditional Firewall and determines if the change affects any use of
shared global resources. If the change does not, then no Deadlock Firewall is needed. If
the change does include a shared resource then it is considered affected and all users of it
are checked for other resources that they can hold or take at the same time as this affected
resource. This continues until all related dependencies have been mapped. If this
uncovers a cycle then the cycle is checked to see if it contains the ordering issues that
lead to deadlock.
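The expansion step of this analysis can be sketched as a fixed-point computation over the tasks and the shared resources they can hold concurrently. The data and names in the Python sketch below are invented for illustration; the ordering check from the earlier sketch would then be applied to the tasks and resources collected here.

    # Sketch of the Deadlock Firewall expansion step. 'uses' maps each task to the
    # shared resources it can hold at the same time; 'touched_by_change' lists the
    # resource uses affected by the modified code. All data is illustrative.

    def deadlock_firewall(touched_by_change, uses):
        """Collect every task and resource reachable from the changed resource uses."""
        resources, tasks = set(touched_by_change), set()
        changed = True
        while changed:
            changed = False
            for task, held in uses.items():
                # A task enters the firewall if it can hold an affected resource...
                if task not in tasks and held & resources:
                    tasks.add(task)
                    changed = True
                # ...and everything it can hold concurrently then becomes affected too.
                if task in tasks and not held <= resources:
                    resources |= held
                    changed = True
        return tasks, resources

    uses = {"T1": {"S1", "S2"}, "T2": {"S2", "S3"}, "T3": {"S4"}}
    tasks, resources = deadlock_firewall({"S2"}, uses)
    print(tasks)      # T1 and T2 enter the firewall; T3 never touches an affected resource
    print(resources)  # S1, S2, S3 are the candidates for the ordering check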
There were four major empirical studies conducted of the Deadlock Firewall as
part of this research. These studies involved analyzing code from ABB's current real-time systems product line, including real-time process controllers, communication
modules, smart sensors, and data servers. The first step was to identify previously
released software versions that contained deadlock. The code for the version with a
known deadlock defect and the previous version were acquired and then the Deadlock
Firewall method was applied to the two software versions. This Deadlock Firewall was
then checked to verify that the deadlock defect was detected in the firewall analysis.
Since each of these software revisions did not have deadlock graphs constructed for the
previous versions, they were created as part of the firewall process.
The first study involved a communications gateway, which was an object-oriented
software product. The release chosen for analysis contained changes to six source files.
Within those six source files, two objects were affected. Each object had six methods
changed. A firewall graph was constructed that showed the system without the new
change. This is shown in Figure 9.
Figure 9. Example Deadlock Firewall Graph, First Study Original
This graph only shows the semaphores relevant to the Traditional Firewall that was
constructed for this release. It was determined by a Traditional Firewall analysis that a code change added a
semaphore to task T1, which now takes semaphores S1 and S2 in that order. The new
semaphore, S2, was accessed by two callers before the change. After this original graph
was created, it was updated to show the new semaphore dependency, which is shown in
Figure 10.
Figure 10. Example Deadlock Firewall Graph, First Study Changed
After the update was completed, the resulting deadlock graph was analyzed for
any of the deadlock patterns that were identified before. Figure 10 shows that task T1 and
task T2 form a cycle with semaphores S1 and S2. T1 takes S1 and then S2, while T2
takes S2 then S1. This shows a potential deadlock between tasks T1 and T2, which is the
deadlock discovered by the customer in the released product version.
The second study involved a real-time communications gateway, which is a
procedurally designed module. This release contained 31 files changed, including code
changes to 72 functions. A firewall graph was constructed that shows the system before
the change. It was determined by a Traditional Firewall that a code change added two
blocking message queue calls on message queue A to task T1. These new message
queues were added to the graph, which is shown in Figure 11.
Figure 11. Example Deadlock Firewall Graph, Second Study
The firewall analysis shows that task T1 now both sends and receives on message
queue A. Since both the message queue send and receive operations are blocking, this
situation leads to a message queue deadlock, as described in Section 5.4. This problem
was detected in the field by customers in this software version and reported to ABB. In
addition to this one defect, four other potential deadlock conditions involving message
queues were identified in the system. These had not yet been seen in the field, but might
have appeared in the future.
The third study involved another communications module, which was also a
procedurally designed module. This release contained 22 changed files, which contained
41 modified functions. A firewall graph was constructed that shows the system before the
change. It was determined by a Traditional Firewall analysis that tasks T1 and T2 now
use a blocking message queue send. The new message queues were added to the graph;
the results of the firewall are shown in Figure 12.
Figure 12. Example Deadlock Firewall Graph, Third Study
The firewall analysis shows that task T1 sends a message to task T3 via message
queue A. Then task T3 sends a message to task T4 on message queue B. After that, task
T4 sends a message to task T1 on message queue C. This cyclical deadlock was
the same deadlock detected by customers in the field and reported to ABB in this
software version. The Deadlock Firewall also detected two additional deadlock cases in
this module similar to the one found in the field. These were corrected before the
software was released, preventing them from ever being found by the customer.
The final case study involved using the Deadlock Firewall analysis on a new
piece of changed software. This software had passed all of its unit, integration,
regression, and systems tests and was certified as ready for release by the test department.
This new software had 93 files modified, including 112 functions. The Deadlock Firewall
analysis was conducted and yielded the firewall graph shown in Figure 13.
Figure 13. Example Deadlock Firewall Graph, Fourth Study
This graph shows that task T1 sends a message to task T2 on message queue A.
Then task T2 sends signal 1 to task T3 and then waits for signal 2 from T3. Task T3
sends a message to T1 on message queue B, and also signals back to task T2. This is a
cyclical deadlock case using both signals and message queues. As soon as any one
message queue is full, all three tasks will have deadlock. This deadlock was successfully
detected by the firewall method, and a major defect was detected prior to release of the
software. In addition, three other deadlock conditions were detected, two of the three
dealt with message queue additions, as discussed in study 2, and the other dealt with a
more traditional semaphore deadlock. These graphs are not shown here, as they are
similar to the previously described studies.
These empirical studies show that the Deadlock Firewall can be effective at
detecting regression deadlock defects that escape to customers in the field. In
addition to detecting deadlock previously found in the field, this method was able to
identify other deadlock dependencies that existed due to a change that were not yet found
by the customers. The Deadlock Firewall model was a key addition to the firewall suite,
as deadlock was a relationship that was not handled in existing firewall models.
3.5 Other Future Firewalls
There are additional dependency types which can propagate impact to other areas
of the system that are not covered by firewalls today. This section describes needed
firewalls to address these dependencies and relationships. These firewalls will be useful
in both traditional code based regression testing as well as in the Configuration and
Settings Firewall. Until they are created, these represent limitations on the effectiveness
of both the code based firewall suite and the new Configuration and Settings Firewall.
There are many forms of impact that can occur from software changes [15], three of
which are discussed in this section.
User-configurable systems often face regression defects due to memory leaks.
Since many software systems continue running for long periods of time, often only
shutting down once or twice a year, even a very small infrequent memory leak can lead to
a major customer defect. In the software studied at ABB, memory leaks were identified
as a major cause of long-term customer dissatisfaction. This dissatisfaction is partly due to
these defects being difficult to detect, debug, and fix, leading to long fix times and
recurring downtime for the customer. One key assumption is that all of the memory leaks
must be based on code that is accessible to the firewall team. No third party or operating
system memory leaks will be detected with this method. This Memory Leak Firewall
represents work that needs to be completed in the future and is outside the scope of this
work.
An additional key dependency in user-configurable systems, especially in
embedded systems, involves global variables. Current industrial practice treats global
variables as a data dependency and uses the EFW to determine impact. This has not been
proven effective and is only done since no other firewall exists for global variables.
Future research needs to be performed in order to understand and model the dependencies
present when global variables are used.
A final key dependency in user-configurable systems is performance. When code
changes are made to a system the overall performance can be impacted. In this case,
performance means the response time, cycle time, maximum load, throughput, and other
quantifiable measures of the system limits. In software studied at ABB, regression defects
of this kind are becoming more frequent. Besides traditional performance defects from
code changes, latent software defects can be exposed from configuration changes which
impact the performance of the system. The impact on other software areas from a
performance change represents a dependency which is not covered in firewalls today. In
addition, performance testing is often costly, so regression test selection will be very
beneficial. Firewalls for these dependencies need to be created, but they are outside the
scope of this research.
4. Configurations and Settings Firewall
In order to address the problem of users changing system configurations and
settings, which expose failures related to latent defects as well as traditional regression
defects, an extension to current regression test selection methods is presented. Instead of
looking at changes within the software code itself as the only source of impact within the
system, this approach analyzes changes to the user’s configuration, including both
configurable elements and settings, that determine the specific way the software behaves
in that user's environment. This new analysis is conducted whenever the configuration or
settings in the application change, matching the way that current RTS methods are
applied to software for every code change.
Latent software defects can exist in many different parts of a system. These
include long data flow paths, where a configurable element’s specific action or output
result is dependent on a value computed in a different configurable element, or in a code
path internal to a configurable element that was previously dormant but, due to a change
in the configuration, is now executed. It is also possible that latent defects were
previously hidden from view due to either environmental, performance, or other
configurable elements but are now exposed due to a change in the configuration.
Additional types of changes that go beyond the configuration of the system can also
expose latent defects. These include process changes, changes to the way a user interacts
or interfaces with the system, changes to the hardware the system is running on, such as
upgrading PCs, or changes to other software that runs on the same machine or network,
including the operating system or other third party systems. These change types are not
handled in this firewall method and will require additional research to address.
This new firewall method identifies the different types of changes that exist in the
configuration, including both settings changes and configurable element changes, and
selects specific code-based firewalls to model these changes from those listed in Chapter
3. The specific firewalls selected for use depend entirely on the types of changes that
were made to the configuration. Since the code itself does not change, a set of differences
in the configuration and settings must be determined instead. After these differences are
identified, each change, including both changes to settings and changes to configurable
elements, is mapped to the parts of the source code that represent it within the system.
This step will be discussed in more detail in Sections 4.2 and 4.3. Once this mapping is
completed, the source code representing the difference is marked as changed and a
selection of one or more of the code based firewall models is made. Each needed firewall
is then created, using both the data present in the configuration as well as the source code
and design documents for the system. After the selected firewalls have been created, a
test selection or test creation activity is performed to cover the impact to the system
identified in the model.
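The differencing and classification step of this process can be sketched as follows. The configuration format, element names, and setting names in the Python sketch are invented for illustration; real systems store configurations in product-specific files or databases, but the classification into element changes and settings changes is the same.

    # Sketch of the Configurations and Settings Firewall differencing step.
    # Configurations are modelled as {element_id: {setting: value}}; the mapping of
    # elements and settings to code is assumed to be supplied by the analyst.

    def diff_configurations(old, new):
        """Classify differences as element additions/removals or settings changes."""
        added   = [e for e in new if e not in old]
        removed = [e for e in old if e not in new]
        setting_changes = []
        for element in old.keys() & new.keys():
            for name in old[element].keys() | new[element].keys():
                if old[element].get(name) != new[element].get(name):
                    setting_changes.append((element, name))
        return added, removed, setting_changes

    old = {"PID_1": {"gain": 10, "limit": 5}, "IO_1": {"rate": 100}}
    new = {"PID_1": {"gain": 20, "limit": 5}, "AI_2": {"range": "0-10V"}}
    print(diff_configurations(old, new))
    # (['AI_2'], ['IO_1'], [('PID_1', 'gain')])
    # Each difference is then mapped to code and the appropriate code-based
    # firewalls from Chapter 3 are constructed around the marked changes.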
It is important to be able to identify and differentiate between a setting and a
configurable element when looking at the changes in a configuration. This classification
of setting or configurable element must be done for each identified change between two
versions of a configuration. Settings, for the purpose of this research, are defined as
values that exist inside a configurable element which are visible and changeable by the
user. Some settings changes can be made when the system is offline and execution has
been stopped while other changes can be made while a system is online and currently
executing. In effect, these settings and the specific values they hold resemble and act as
parameters in procedural code, or as attributes in object-oriented code, to the configurable
elements they reside in. Similar to parameters and attributes, these settings can define the
specific behavior of the configurable element such as a specific internal code path that is
executed when a function or method is called or the return value that an internal
algorithm computes when other objects call this element. Settings will be further
discussed in Section 4.1, where settings changes are presented.
Configurable elements, on the other hand, are defined as individual parts of the
system that can be added to or removed from the system’s configuration. These elements
are represented in the software as a specific grouping of code, and can be thought of as,
and compared to, a class. In fact, the execution of a configurable element in a system acts
just as a class does. Adding a configurable element to the configuration creates an
instance of that class with its own settings and memory space, just as creating an instance
of a class in code. Similarly, if a class exists in the code but is never instantiated, its code
will never be executed. This is the same for configurable systems, where the code for the
configurable element exists in the system, but if it is never added to a configuration, that
code will never be executed. Even when an element does exist in the configuration, the
possibility that it will actually be executed depends on the specific settings, events, or
user interactions that are present during the execution of the system by the user. This is
also similar to a class, where instantiated objects are only called in response to the
occurrence of specific events in the system. From a testing point of view, making a
change to a configurable element in the system’s configuration may add new code to the
system that has the potential to be executed and can be treated the same as adding a new
class to the code of the system. The specific details of the change itself also determine the
overall impact to the system. For example, adding a configurable element that has never
been used previously in the system adds new code that has the possibility of being
executed. Conversely, the configurable element added could have been used previously
in other parts of the system. Either of those cases can lead to new failures due to latent
defects being revealed in the system, but the specific impact and risk of each change are
different. Configurable elements will be further defined and broken down when
discussing configuration changes in Section 4.2.
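The analogy between configurable elements and classes can be made concrete with a small illustration. The element type, its setting, and the loading loop below are all invented; the point is only that configured elements behave like instantiated classes, while element types absent from the configuration contribute no executable behavior.

    # Illustrative analogy: a configurable element behaves like a class, and adding
    # it to a configuration behaves like creating an instance. Names are invented.

    class AnalogInputElement:
        """Library code that exists in the system whether or not it is configured."""
        def __init__(self, settings):
            self.high_limit = settings.get("high_limit", 100.0)

        def process(self, raw_value):
            # This code path only ever runs if the element is part of a configuration.
            return min(raw_value, self.high_limit)

    configuration = [
        {"type": "AnalogInputElement", "settings": {"high_limit": 50.0}},
    ]

    # Loading the configuration instantiates the configured elements; any element
    # type not present in the configuration is never executed.
    elements = [AnalogInputElement(entry["settings"]) for entry in configuration]
    print(elements[0].process(75.0))   # 50.0 -- the setting shapes the behaviour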
Before this new firewall can be used, a few key assumptions must be made. The
first assumption is that all code usages and dependencies of an individual setting can be
identified in the system. This usually requires the source code, the design documents, and
the user’s specific configuration that is currently executing in the field. This first
assumption is needed since a hidden use of a setting will propagate the impact to another
area of the software and the impact analysis will be incomplete, potentially missing
affected areas that may contain latent defects. A second needed assumption is similar to
the first and requires that the code implementing a specific configurable element can be
identified. Also, all of its interactions and dependencies within the system must be
determined, usually requiring access to both the code and the running configuration. This
second assumption is needed to address issues that arise when an unknown dependency
exists between two areas of the system, causing the impact analysis to not identify an
affected area. It is important to note that the first two assumptions both require source
code, design documents, and the user configuration to be available for the analysis to be
effective. The third assumption states that the focus of this new firewall is on detecting
latent software defects and traditional regression defects that exist inside the source code
of the system that are revealed due to a change in the configuration by either a
configurable element or setting value. All other latent and regression defects that are not
affected or revealed by one of these changes are outside the scope of this research and
method. Any errors in the logic of the configuration that the user creates or changes are
also outside the scope of this research. Specific testing and regression impact due to
change on the validity and correctness of the configuration itself are different research
areas and not the focus of this work. Finally, the system should not be designed in such a
way that any single change impacts the whole system. This means that a fully connected
system, where every object or function is dependent or related to every other object or
function, will not benefit from this or any other traditional RTS method, since the impact
from any one change will propagate to all other areas of the system and require a
complete retest.
A detailed description of changes to settings and configurations is presented in
Sections 4.1 and 4.2, respectively. Actual construction of the firewalls for these change
types is shown in Sections 4.3 and 4.4. Section 4.5 will discuss the time complexity of
using this firewall method. Finally, Section 4.6 presents some future enhancements that
could be made to improve the efficiency and effectiveness of this method. Within the
descriptions and constructions shown in the following sections, an example real time
industrial control system will be referenced. This system is the same system initially
described and used in the examples in Chapter 2.
4.1 Settings Changes
A settings change is a change to a specific value that resides inside a configurable
element that is both visible to and changeable by the user. Changes to these values can
sometimes occur without the need for recompilation by just changing a configuration file
or by using a human system interface, such as a GUI interface. Since these changes can
be made easily, users often overlook the possible risk that comes from changing settings
values in a currently executing system. In addition, some settings changes can be made to
the system while it is executing which can lead to serious failures due to latent defects
being exposed in a running system. Because of this risk, all changes to the settings should
be first done in a test environment using this firewall method to determine impact and to
verify the absence or presence of latent defects. These values can describe the behavior
that the configurable element will exhibit, similar to the way values supplied to an object
can determine the behavior that object will exhibit. Specifically, the values supplied to a
configurable element by its settings can affect the output of a specific method in the item
which is called by other items up or down a calling path. A settings value could also
affect the internal code path that the object takes in response to either a method call or
event. Finally, a setting value could have no real effect on the system at all due to the
internal usage of that setting inside the configurable element that contains it. An
example of this deals with a change to a setting that either affects or is used by code paths
that are not currently executing due to other settings or configurable elements in the
system.
Settings reside internal to configurable elements and changes to them can affect
the internal operation of that element. The only way that a settings change can affect
external configurable elements is through a data dependency as control flow
dependencies are fully contained inside configurable elements similar to classes. Data
flow dependencies can occur in two ways. The first dependency is very common in user-
configurable systems and occurs when the output of the configurable element is affected
by the settings change and is either used as an input to another element connected to it or
used as a parameter value to other function calls in the system. This forms the same kind
of data dependency existing in traditional procedurally designed software systems, where
all of the data is passed as parameter inputs to the next function. An example of a setting
change affecting this kind of data dependency is shown in Figure 14.
The other
dependency type is less frequent and involves retained state information. This state
information can exist as an internal state, such as previous state values, weighted
averages, global variables, or data stored in a database.
Figure 14. Example of a Settings Change
Figure 14 shows a settings change occurring within the configuration of the
system. This change to setting 1 is mapped to the code that represents it in the code
which, in this case, is variable A. This variable is used to determine whether code path 1
or code path 2 is executed whenever method X is called. Before the change to the system,
the configuration contained setting 1 with a value of ten. Setting 1 is represented in the
code by variable A, so A’s value is also ten. The settings change modified the value of
setting 1 to twenty, which means the internal variable A is now also twenty. Variable A is
used in the code to determine which code path, either path 1 or path 2, is executed
whenever method X is called. This change to setting 1 leads to path 1 being executed in
the new configuration, whereas code path 2 was executed in the previous configuration.
In this example, code path 1 was never run before at the customer site and could contain a
latent defect. This setting change impacts a data flow dependency, as it can change the
output of the configurable element. One possible latent defect in code path 1 would cause
the output of method X to be incorrect for specific input values passed into the
configurable element.
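The situation in Figure 14 can be illustrated with a small sketch, where setting 1 is represented by a member variable and selects between the two internal code paths of method X. The class, threshold, and defect below are invented for illustration only.

    # Sketch mirroring Figure 14: setting 1 maps to variable A, and A selects which
    # internal code path of method X executes.

    class ConfigurableElement:
        def __init__(self, setting_1):
            self.a = setting_1          # setting 1 is represented in code by A

        def method_x(self, value):
            if self.a > 15:             # code path 1: newly reachable after the change
                return value * 2 + 1    # an invented latent defect that was never executed
            else:                       # code path 2: the path the old configuration used
                return value * 2

    old = ConfigurableElement(setting_1=10)   # value before the settings change
    new = ConfigurableElement(setting_1=20)   # value after the settings change
    print(old.method_x(3), new.method_x(3))   # 6 7 -- the output seen downstream differs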
For settings changes, a Traditional Firewall is created first to model the change.
This firewall is constructed by mapping the settings that have changed into the variables
within the code of the configurable element that they reside in. Finding the code for
configurable elements and settings is dependent on the implementation of the system and
varies for each system. In general, it can be done by either expert knowledge, using
design documents and code models if they exist, searching with a predominantly manual
brute force effort, or by using some automated techniques recently created in the
Information Retrieval and Program Analysis communities, such as [35, 36]. Once the
code for the configurable element has been found, the specific variables are identified,
and they are marked as code changes. Once all of the changes have been identified, a
Traditional Firewall is constructed for each. The TFW is fully contained inside the code
of the configurable element, with the exception of calls to helper functions, operating
system calls, calls to third party components, or use of global variables.
Settings changes can also affect data values that are computed elsewhere in the
system if they are connected by a data flow dependency. Due to this, it is important to
determine if the changed setting impacts any existing data dependencies. If any of these
dependencies exist in the code, the specific changed setting is checked to see if it impacts
that dependency, either with a change to control flow or the value used in the data
dependency. If the dependency is impacted, an Extended Firewall is created to determine
the impact. Otherwise, the EFW is not needed. Determining the functions affected by
either control flow or data flow is currently a manual process with some automated tool
support. There are many promising research areas that are currently looking at and
automating the impact analysis for both control flow and data flow, such as [36]. These
techniques and tools will help automate the mostly manual analysis that this method
needs, which should make it even more feasible for this method to be used in the future.
In addition to traditional control flow and data flow dependencies, settings
changes can impact other dependencies in the system. These dependencies include
semaphore and other blocking calls, which could lead to deadlock, as well as memory
allocations and deallocations, which could lead to a memory leak occurring, as well as
performance changes and third party components. Each of these dependencies could be
affected by the setting change and, as a result, they must be checked for. If any of these
dependencies exist, a firewall for that dependency must be created. Checking for these
types of impact in the code is accomplished in the same way as for code changes,
specifically treating the changed settings as changed code.
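The outcome of these dependency checks can be summarized as a simple selection of firewall types to construct for a given settings change. The sketch below is illustrative; the dependency flags stand in for the results of the code inspection described above.

    # Sketch of the firewall-selection step for one changed setting. The dependency
    # flags would come from inspecting the code that uses the setting; here they are
    # supplied directly as illustrative booleans.

    def select_firewalls(deps):
        firewalls = ["TFW"]                      # always model the change itself
        if deps.get("external_data_flow"):
            firewalls.append("EFW")              # value flows to other elements
        if deps.get("blocking_calls"):
            firewalls.append("Deadlock")         # semaphores, queues, blocking sends
        if deps.get("allocations"):
            firewalls.append("MemoryLeak")       # future firewall, see Section 3.5
        if deps.get("timing_sensitive"):
            firewalls.append("Performance")      # future firewall, see Section 3.5
        return firewalls

    print(select_firewalls({"external_data_flow": True, "blocking_calls": True}))
    # ['TFW', 'EFW', 'Deadlock']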
Once all of the various dependencies have been modeled with firewalls, the
determined impact is used to select tests, if previous tests exist, or create new tests, if
they do not currently exist. These tests must cover all of the impact identified by the
TFW and any of the additional firewalls used, such as an EFW.
An example will now be presented using the control system example shown in
Chapter 2. In this example a setting, whose value selects the specific input shaping
function to execute on an input value, is changed. This function exists inside the same
configurable element as the setting and shapes inputs connected to it by applying a
translation function to them. Within the code of the configurable element, the specific
shaping function is selected by a case statement that uses the value of the setting to
determine the correct algorithm to apply. Applying this to Figure 14, the code uses
variable A to determine the specific function to apply, which is based on the value of
setting 1 selected by the user. In this example, no additional changes occur in the
configuration and no code was changed. There are, however, existing code paths which
have never been run by the customer in this way, any of which may contain latent
software defects. Therefore, this settings change requires regression testing to determine
if any latent defects are exposed by the change.
Another example deals with an ERP system which was also discussed in Chapter
2. In these types of configurable systems, settings are usually represented as parameters,
configuration files, and database values that are passed into or used by the various
configurable elements of the system. These configurable elements exist in code libraries
containing objects and functions that are needed to perform the user’s action. The settings
themselves are used within the ERP libraries in a similar way as the settings in a control
system are used. This includes usages where ranges of values are treated differently,
specific events or values trigger defined responses by the system, and many options are
available to select that specialize the general solution provided. The specific way the two
systems are configured, graphically for the control system and programmatically for the
ERP system, does not affect either the use of this method or the validity of its results. It
only affects the technical details as to how to compare two configurations and how to
map the settings into the code.
Settings changes usually involve a smaller impact than that of an added
configurable element. The impact of the settings change is completely internal to the
configurable element it resides in, with only system calls, data dependencies,
performance dependencies, memory dependencies, and global variables having an impact
outside the boundaries of the configurable element itself. In addition, any dependencies
that do cross the boundary must be impacted inside the configurable element by the
setting change. In the worst case, a setting change could affect an output that is used by
the entire system, thus requiring a full retest of the entire system. This is extremely rare: it would require either a setting change that affects code fully coupled to the code in every other configurable element in the system, or a configuration in which one configurable element is connected to every other configurable element. Either of these cases is very unlikely, and if one did exist, a full retest
would be required as the impact would propagate to every part of the system. In practice
large distributed control systems monitor many smaller independent control areas and
would never be fully connected to any one configurable element in the system.
4.2 Configuration Changes
Configuration changes, unlike settings changes, deal with changes to configurable
elements which either add them to or remove them from a preexisting configuration.
These configurable elements are internally represented by the system as classes or
functions. The specific code implementing the logic for these configurable elements must
exist somewhere inside the system, usually in library-like constructs. Even though the
code always exists inside the system, until it is used inside the configuration it will never
be executed. Placing a specific configurable element inside a configuration creates an
instance of that item inside the system in the same way that declaring an object in code
creates an instance of that class. Each configurable element contains zero or more settings
that can control or impact its execution. It is possible for a configurable element to
contain no settings at all, if its behavior does not depend on any other values or inputs.
Changes to a configuration can be categorized into different change types where
each type involves different steps and can have different impact on the system. The first
type of configuration change involves adding a new configurable element that does not
exist elsewhere in the configuration. In this case, the code for this element existed in the
system but it was not possible for this code to run, as no instances of it had been used in
the configuration before. This type of change requires the most testing, as it has the
highest potential risk for new failures from latent defects due to the large amount of
unexecuted code now present in the configuration.
Using the control system example, adding an I/O module with connected data
points to the system, each of which is represented in the system as a configurable
element, would represent a configuration change. I/O modules are external physical
devices which act as bridges between the controller device and the physical measurement
devices, such as thermocouples. These I/O modules read in values, format or convert the
data, and transmit it back to the controller. If the added configurable element which
represents the I/O module was not used elsewhere in the system, then the change is
classified as adding a new configurable element. This change would allow the execution
of a new set of code within the system. This code, while previously present in the system,
can be considered new since it is now possible to execute it. Similarly in an ERP system,
adding a new business process to an existing system would represent the addition of a
new configurable element, as these processes are composed of configurable elements
represented by functions stored in libraries. Each business process may contain very
specific configurable elements, leading to many changes where previously unused
configurable elements are added.
A second type of configuration change involves the addition of previously used
configurable elements into the customer’s configuration, many times with different
settings than other instances have used before. This is the most common change type
encountered in the field, as users often extend a system by adding more of the
configurable elements they have already used in their configuration previously. Since
some of the code that represents the configurable element has run before in this
configuration, there is a lower risk of failure from latent defects, as there is less new code
to run.
It is important to determine how different this new instance of the configurable
element is from other instances. Therefore, these previously used configurable elements
that are added must have their settings compared to the previous instances that exist in
the configuration. If the settings are exactly the same, then the configurable element itself
and all external callers just need to be checked. Checking a function here means executing some very basic tests on it. If there are only a few differences in settings
between the newly added element and the previous instances, a Settings Firewall is
created to check which internal code paths and data values of the element are affected.
This includes checking to see if the settings are involved in any data dependencies, which
require an EFW, or any other dependencies that would propagate the impact to areas
outside of the configurable element itself. If the settings in the new instance are
completely different from the settings in the previous instances, the whole element is
considered completely new and must use the firewall for adding new configurable
elements.
Figure 15. Example Configuration Addition
In the control system example, adding a previously used configurable element, in
this case an I/O module, to the configuration represents this type of configuration change.
This is shown in Figure 15, where two previously used configurable elements are added
to allow input and output of new values to be calculated. This type of change happens
frequently, as customers often add additional data values to existing configurations which
they did not know would be important until the plant was running. The specific settings
in the new instance must be compared to the settings used in previous instances. If the
settings are very different, it will be necessary to fully test the added configurable
elements, since new code paths may be present which contain latent defects.
The final type of configuration change is removing configurable elements. This is
the least common type of configuration change, as customers rarely remove pieces of
their previously running configuration, usually doing so only when they have to replace a piece with a different configurable element type. This action does not allow for any new code to be executed and in fact removes code from within a previously executing block. As a result,
the objects or functions directly related to the removed item, either within the code or due
to the configuration, are all marked as changed and firewall models are constructed
around them. These areas are all retested for both latent defects and regression defects,
looking closely for changes in data paths and values. In this case it is very important to
remember the assumptions with this firewall. The goal is to detect latent defects in the
code, not defects in the configuration. The details of how the Configuration and Settings
Firewall is created for each case will be described more completely in Section 4.4.
The remainder of this chapter describes the algorithms for creating firewalls for
each of the types of configuration and setting change. Section 4.3 will detail the
construction of the Settings Firewall, used for settings changes. Section 4.4 will present
constructing the Configuration Change Firewalls, one for each type of possible change.
4.3 Constructing a Firewall for Settings Changes
Settings changes require a firewall to be created for each change within the user’s
configuration. Constructing firewalls for settings changes involves following a set of steps which together define the firewall creation process. The full process is shown
in Figure 16, where each process step is represented by a circle, each valid transition is
represented by an arrow, and any specific conditions that must be true to take a transition
are listed as labels on the arrow.
Initially, customers have a previously created and tested configuration running in
their environment. The customer decides to make a change to the configuration involving
one or more settings values. A copy of the current configuration is created, the changes
are made, and the new configuration is saved. Once the changes are made, the next step
depends on the access the user has to the internal system source code. If the source code
is available then the user can create the Settings Firewall directly. If not, then the user
must send both the original and changed configuration to the software vendor for analysis
and testing. These examples and steps assume the vendor is doing the analysis, as many
systems do not make the source code available for the users of the system. In the case
where the source is available to the user, the steps to create the firewall are the same.
Figure 16. Process Diagram for a Settings Change
Once the software vendor receives the configurations, a difference between the
two configurations must be determined. The specific way that two configurations can be
compared to one another depends completely on the details of the specific system being
used. At a high level, this step is the same as computing a difference between two source
code files. There are two primary ways that configurations are presented to the user. The
first represents configurable elements and settings graphically, using pictures and lines to
represent the elements and relationships, as well as GUI windows to display the setting
values contained inside the elements. The other way that configurations can be
represented is programmatically, where functions or objects are the configurable
elements and parameters represent the settings. The purpose of this step is to determine the
executable changes that exist within the settings of configurable elements inside the two
separate configurations. This step has the same purpose as code differencing in the
Traditional Firewall. Identifying the differences can be accomplished by either a text
based differencing tool, looking for changes in variables, parameters, or files, or by using
a custom tool provided from the software vendor, looking for differences in values
contained inside the configurable elements. Determining if a change affects execution is
important as some changes, such as element names and comments, do not have any effect
on the system and will not expose latent defects. This determination can be hard and will
require analyzing the source code of the configurable element to see how that setting is
used. All changes that do affect execution are added to a list. A detailed example is
presented later in this section that describes these steps for a specific control system
example.
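As an illustration of this differencing step, the sketch below models a configuration as a dictionary of elements and their setting values and reports only the changes that can affect execution. The data shapes and the list of non-executable settings are assumptions made for the example, not the format used by any particular system.

# Sketch of configuration differencing that keeps only executable changes.
# A configuration is modeled as {element_id: {setting_name: value}}.
NON_EXECUTABLE = {"name", "description", "comment"}   # assumed ignorable settings

def diff_settings(old_cfg, new_cfg):
    """Return (element_id, setting, old_value, new_value) tuples for
    settings changes that can affect execution."""
    changes = []
    for elem_id, new_settings in new_cfg.items():
        old_settings = old_cfg.get(elem_id, {})
        for setting, new_value in new_settings.items():
            old_value = old_settings.get(setting)
            if old_value != new_value and setting not in NON_EXECUTABLE:
                changes.append((elem_id, setting, old_value, new_value))
    return changes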
After the specific settings changes are identified, the source code representing
each must be identified. This step is also dependent on the specific system and how
configurable elements and settings are implemented within the source code. In a
programmatic system, the setting values are contained in parameters or configuration
files that are passed into the system at a specific time or in response to a defined event.
For these types of systems, finding the users of settings involves tracing the file or
parameter from its input, usually a file or database, to its usage in the code. If the system
is graphically configured, a similar traceability is conducted from the GUI window into
the source in the system which uses the settings values. In either case, settings are
contained in configurable elements, and the variables themselves reside in the code
implementing the configurable element. A detailed example of mapping the values inside
a graphical configurable element to the source code that uses them is presented later in
this section. Once the code using the changed setting is identified, it is marked as a code
change.
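A simple, hedged sketch of this mapping step is shown below: it searches a source tree for occurrences of a setting identifier and records the files and lines that use it. The identifier format and file extensions are assumptions; a real system would rely on its own naming conventions or on a vendor-provided cross-reference.

# Sketch of mapping a changed setting to the code that uses it by searching
# the source tree for the setting's identifier (e.g. "FC222_S5" is hypothetical).
import os

def find_setting_usages(source_root, setting_identifier):
    """Return (path, line_number, line_text) tuples where the setting appears."""
    usages = []
    for dirpath, _dirs, files in os.walk(source_root):
        for name in files:
            if not name.endswith((".c", ".cpp", ".h")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, errors="ignore") as source_file:
                for lineno, line in enumerate(source_file, start=1):
                    if setting_identifier in line:
                        usages.append((path, lineno, line.strip()))
    return usages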
Each area of source marked as changed requires a Traditional Firewall to be
constructed. Creating the TFW for these changes is the same as creating one for a real
code change, following the steps listed in Chapter 3.1. For a settings change, all of the
impact identified by the TFW is contained inside the configurable element itself. In
addition to the TFW, analysis must be done to determine whether other dependencies exist which
are impacted by the setting change. The main dependencies to look for include data
dependencies using the changed setting, blocking calls that are affected by the change,
and any third party components that might be affected. Other dependencies include
changes in memory allocations or deallocations, as well as any changes that might impact
the performance of the system as a whole. Each of these dependency types is searched for, and any that are identified require the corresponding firewall model to be created. It is important to
understand that some relationships and dependencies between configurable elements
themselves and also with system functions are created dynamically when the
configuration is loaded. These dependencies are really only dynamic when looking at the
code since, once loaded, they remain static throughout the entire execution of the system.
As a result, the configuration itself must be used when identifying relationships in the
needed firewall models.
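The core of the TFW construction described here can be sketched as a one-level propagation over a call graph, where the call edges include both statically extracted calls and the relationships resolved from the loaded configuration. The representation below is an assumption used for illustration.

# Sketch of Traditional Firewall construction: impact propagates exactly one
# level from each function marked as changed by the settings change.
def traditional_firewall(changed_functions, call_edges):
    """call_edges is a set of (caller, callee) pairs, including edges that
    only exist once the configuration is loaded."""
    impacted = set(changed_functions)
    for caller, callee in call_edges:
        if callee in changed_functions:
            impacted.add(caller)   # direct callers of changed code
        if caller in changed_functions:
            impacted.add(callee)   # functions called by changed code
    return impacted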
After all the different types of impact have been determined from the created
firewalls, tests must be created or selected from previous testing which cover these
impacted areas. If tests exist already for an area, they can be reused. These tests can come
from many places including previous testing activities for that user, testing completed for
other users which would work for this current user, and tests that were completed for
product release testing. Various coverage and other test completeness measures are useful
here as they can determine the area of the system that the test executes. If previous tests
are not available for a specific impact, these tests can be created to cover that new area
completely. Once the tests have been completed, they are executed on the system to
determine if any latent defects were exposed from the change.
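Test selection over the identified impact can be sketched as a simple coverage intersection, assuming per-test coverage data is available from earlier release or customer testing; any impacted area left uncovered needs new tests.

# Sketch of coverage-based test selection for the impacted areas.
def select_tests(impacted_functions, test_coverage):
    """test_coverage maps a test name to the set of functions it executes."""
    selected = {name for name, covered in test_coverage.items()
                if covered & impacted_functions}
    covered = set().union(*(test_coverage[name] for name in selected))
    still_uncovered = impacted_functions - covered   # these areas need new tests
    return selected, still_uncovered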
In order to show the details involved in creating a firewall for settings changes, an
example on a real system will be shown. This example system is the control system
discussed in Chapters 2 and 4 and follows the steps shown in Figure 16. This specific
system uses a graphical representation of configurable elements which are called function
codes. These function codes are inter-connectable blocks of logic that, when joined
together, form a solution to the specific controls problem the users of the system need.
Each function code contains a number of different settings, some of which can affect the
actions, values, and events produced. Figure 17 contains a screen capture of a GUI
window showing all of the settings that exist inside a specific function code, along with
their current values. Any changes to these settings will require the creation of a
Configuration and Settings Firewall.
Figure 17. Example List of Settings
Making changes to these settings for a graphical function code involves using the
GUI window shown in Figure 18. Once this window is open, clicking on any setting will
open an additional GUI window specific to the setting clicked. An example of this is
shown in Figure 19. This window shows the values allowable for a specific setting and
the value can be changed by clicking on any of the options presented in the menu. It is
also possible to change the settings value directly by replacing the number or string in the
Value column in Figure 17. Settings can only be changed if the user has the permissions
required to change the system. Once all of the desired settings changes have been made,
the new configuration file is saved and both new and previous configurations are sent off
to the vendor of the software for analysis.
Figure 18. Example Settings Change GUI
When the vendor receives the two configurations, the set of changes made
between the previous and the new configurations must be determined. Since the new
configuration was saved as a separate file, it can be compared to the previous
configuration file. The details of how to actually difference two configurations vary
depending on the system itself. For an ERP system, the configuration of the system is
often done programmatically by configuration files, database entries, or even writing glue
and wrapper code that uses and customizes the specific library functions the system
provides. Within that system, settings can either reside as parameters passed into the
library functions or objects used, or they exist in configuration files read into the system
at setup. As a result, both the code files and the configuration files are compared to the
originally running configuration using a text differencing tool, in the same way that code is compared for the current code-based firewalls. In the control system example, the system itself
supports identifying all the changes that were made between two configurations by using
an application that is delivered with the software. An example difference between two
configurations is shown in Figure 19.
Figure 19. Example Difference of Two Configurations, Settings
Now that the differences between the new and old configuration have been
identified, a mapping must be made between the changed settings and the internal code
representing these settings. Usually, settings are represented internally as variables or
attributes inside functions or objects. In an ERP or other programmatically configured
system, this mapping is easy as the values are passed into the system as parameters or in
specific configuration files on the disk. However, control systems and other graphically configured systems require more analysis for this mapping, as the code that implements
each graphical element must be determined. Some system knowledge, documentation, or
code searching must be available to support this analysis. For the control system
example, each function code is well defined in the source code, including the variables
and attributes representing the specifications. In addition, there are well defined functions
that extract the settings from the file and assign those values to the attributes in the
function code that uses them. Since the code is so well defined, simple searching for the
number of the function code and the setting number from the GUI window is effective.
A code example showing the internal variables used to hold the specification values is
shown in Figure 20.
Figure 20. Example Settings Code Definition
Once the code representing the settings has been identified, each one is marked as
a code change. Each function using these changed variables in the code is marked as
affected. Once this is complete for every change, a TFW is constructed around each
function identified. After the TFW is complete, analysis is conducted to determine if any
additional forms of impact which go beyond simple code flow dependencies are present
in the changed functions. Since settings changes in the system deal with variables, each
function that is affected must be checked to determine if a data dependency exists
between any outputs of the function and the changed variable. If any such data
dependencies exist, the dependent output is marked as affected also. Once complete, an Extended Firewall is
constructed for each dependency, using both internal source code and the configuration
file itself to determine the impact. The configuration file is needed in order to resolve the
dynamic linking of configurable elements into static relationships which can be used for
this analysis. These relationships are considered static as the configuration remains the
same throughout the execution of the software.
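The data-dependency propagation used by the EFW can be sketched as a reachability walk over data-flow edges, where the edge set includes both edges extracted from the source and edges derived from the connections in the configuration file. The representation is again an illustrative assumption.

# Sketch of Extended Firewall propagation: unlike the TFW, impact follows the
# data path until it stops, rather than halting one level from the change.
def extended_firewall(changed_variables, data_edges):
    """data_edges is a set of (producer, consumer) pairs over functions or
    variables, including edges resolved from the configuration."""
    impacted = set(changed_variables)
    frontier = list(changed_variables)
    while frontier:
        node = frontier.pop()
        for producer, consumer in data_edges:
            if producer == node and consumer not in impacted:
                impacted.add(consumer)
                frontier.append(consumer)
    return impacted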
Another possible dependency involves settings changes impacting blocking
calls. It is possible that a setting value selects a specific action for the configurable
element it resides in to perform. If that changed action involves code that takes any
semaphores or performs any blocking calls, either now or before the change, a Deadlock
Firewall must be created. Similarly, if the setting change affects the creation, use, or
deallocation of memory, then a Memory Leak Firewall must be constructed. Each of
these firewalls is created in the same way as it is for a code change, starting from
the functions that were marked as changed and then analyzing each dependency affected
by that change. In this control system example, a data dependency is found and an
Extended Firewall graph is created. This graph is shown in Figure 21 and shows that one
of the settings changes was involved in two data dependencies with other functions in the
system.
Figure 21. An Example EFW from a Settings Change
In this figure, functions are represented as circles and calls are marked as arrows.
Function A is involved in a data dependency with Function C using the changed
specification in the data path. Function B is in a data dependency with Function D and
also uses the changed specification in its data path. Since both data dependencies use the
changed specification, they are both considered affected and require retesting to check for
the absence of latent defects.
The final step uses the impact identified in the firewalls constructed so far to
select or create additional tests needed to verify the system works correctly and does not
contain any latent defects. If tests exist from previous customers or testing, they can be
selected and reused, otherwise new tests need to be created. These tests must cover all of
the affected areas and determine the presence or absence of latent defects in these
affected areas. Once the tests are created, they must be run on the customer’s changed
configuration. If the tests detect any defects, they should be corrected and a new version
of the system can be sent out to the customer. Otherwise, the results of the testing can be
sent back to the customer, allowing them to load the new configuration into their system
and run it in their environment.
4.4 Constructing a Firewall for Configuration Changes
Configuration changes require one or more firewalls to be created each time the user's configuration changes. Constructing these firewalls follows steps similar
to those described in the previous section for settings changes. The main differences
between settings changes and configuration changes are the size and reach the impact has
into the surrounding system and the general scope of the impact and testing needed. The
main steps in creating a firewall for a configuration change are shown in Figure 22. Just
as in the process diagram for settings changes, each of the process steps is represented by
circles and the figure is followed by a more detailed description of each step.
The process starts at the same point that the settings process did, namely a
previously created and tested configuration running in the customer’s environment. The
customer then decides that a change to the configuration is needed and adds or removes a
configurable element from it. A copy of the current configuration is created, the changes
are made, and the file is saved as the new configuration. As with settings, the next step
depends on the system itself and the access the user has to the internal system code. If the
code is available to the user, they can follow all of these steps themselves. If the code is
not available to the user, then they can send the original and changed configuration to the
vendor company for analysis. This description will assume the vendor is doing the
analysis, as many systems do not make the source available for the users of the system
and as a result, the user will send the original and new configurations to the company.
Figure 22. General Process Diagram for Configuration Changes
Once these configurations are received, the difference between them must be
determined. Only executable changes between the two configurations need to be identified, as other
change types do not affect execution and do not require testing. Determining the
difference between the two configurations can be accomplished by either a text based
differencing tool, looking for added method or function calls, or by using a custom tool
provided from a software vendor, looking for added or removed configurable elements.
Determining if a change is executable is important and may require checking the source
code that implements the specific settings that change. Any changes that do not affect the
execution of the system are ignored, similar to the way RTS methods ignore
comment changes within source files since they are not executable changes. A few
detailed examples of configurable elements, both addition and removal, are presented
later in this section to describe these steps for a control system.
Once the changes to the configurable elements have been identified they are
categorized as one of three types of changes: adding a new configurable element which
does not exist elsewhere in the configuration, adding a previously used configurable
element to the system, or deleting a configurable element. Each configurable element
added to the system must be mapped to the underlying source that executes when that
item is used. This code is marked as changed and added to a list. In addition, all
relationships that exist in the configuration from the added configurable element to other
configurable elements are marked as changed, and the code for each of these configurable
elements is marked as changed and added to a list. For all removed configuration items,
the configuration must be checked to see what dependencies to other configurable
elements were affected by the removal of a specific configurable element. The code for
these configurable elements is marked as changed and added to a list.
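The categorization and marking steps above can be sketched as follows, modeling a configuration as a mapping from instance identifiers to configurable element types; the type labels returned are hypothetical names for the three cases described in this section.

# Sketch of classifying configuration changes into the three types above.
def classify_changes(old_cfg, new_cfg):
    """old_cfg and new_cfg map instance ids to configurable element types."""
    added = set(new_cfg) - set(old_cfg)
    removed = set(old_cfg) - set(new_cfg)
    previously_used_types = set(old_cfg.values())
    changes = []
    for inst in added:
        if new_cfg[inst] in previously_used_types:
            changes.append((inst, "add_previously_used_element"))  # Section 4.4.2
        else:
            changes.append((inst, "add_new_element"))              # Section 4.4.1
    for inst in removed:
        changes.append((inst, "remove_element"))                   # Section 4.4.3
    return changes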
Once the list of configurable element changes is complete, a Traditional Firewall is created, just as it was for a settings change. Each item that was added to the list of changed code requires that a Traditional Firewall be created. Besides the TFW, some
additional analysis must be done to determine if there is other impact from this
configuration change to the system as a whole and to determine what that impact is if it
exists. The main types of impact that need to be looked for when dealing with a
configuration change include data flows using the changed configurable element, new
blocking calls that exist within the source, new memory allocations or deallocations, as
well as any performance impact the added or removed configurable elements may have
on the system as a whole. Each of these additional forms of impact is searched for within
the code marked changed and if any are identified, they are added to a list. Each element
on that list will need a corresponding firewall model associated with it. Once all of the
additional impact types have been identified, the specific firewall model that addresses
each one is created.
Similar to other firewalls, each area of impact identified by the firewalls needs tests to be selected or created that cover it. If tests exist already for an area, they
can be reused. These tests can come from many places including previous testing
activities for that user, testing completed for other users which would work for this
current user, and tests that were completed for product release testing. Various coverage
and other test completeness measures are useful here as they can determine the area of
the system that the test executes. If previous tests are not available for a specific impact,
these tests can be created to cover that new area completely. Once the tests have been
completed, they are executed on the system to determine if any latent defects were
exposed from the change.
Now that the general steps have been defined for a configuration change, a few
example firewalls will be constructed for each of the possible types of configuration
change. Section 4.4.1 will involve constructing a Configuration Firewall for a change
involving new configurable elements, 4.4.2 will show creating a Configuration Firewall
for previously used configurable elements, and 4.4.3 will create a firewall for removing
configurable elements.
4.4.1 Constructing a Firewall for New Configurable elements
Constructing a firewall for the addition of new configurable elements into a
configuration modifies and further refines the general process, shown in Figure 23. The
main difference between the process shown in Figure 23 and the generic process shown
in Figure 22 is how the data needed to create the TFW is gathered. For a new
configurable element firewall, both the connected configurable elements in the configuration and the element's source code location must be identified. These two steps run in parallel
with both acting as input to the TFW. These two actions are shown inside the box in
Figure 23.
An example system will be used to show the details of how this method works.
The example system is the same one used in the Settings Firewall example in Section 4.3,
but instead of showing settings changes, this example will involve adding new
configurable elements into the system. An example configuration for this system is
shown in Figure 24. In that figure, the different rectangular shapes are configurable
elements, or function codes as they are called in the system, and the arrows represent the
relationships between these elements, called connections. A function code is a graphical
unit that represents code in the system that performs a specific function. These function
codes are linked together with arrows that describe a specific relationship between them.
An example relationship involves an external temperature input function code connected
to a thermocouple conversion function code which takes the value from the thermocouple
and converts it into a temperature value. The thermocouple conversion function code is
then connected to a display function code which allows the plant operators to view the
value from the control room. Any new function codes added that do not exist elsewhere
in the configuration will require this firewall to be created.
Figure 23. Process Diagram for a New Configurable Element
In this example, the change involves adding a new function code to the diagram in
Figure 24. The specific function codes added do not exist elsewhere in this configuration.
Adding these function codes to the configuration is accomplished by graphically adding
the function codes to the configuration shown in Figure 24. Once these function codes are added, they can be connected to any other configurable elements that are related to them in
some way. These connections represent many types of relationships such as a data
relationship, a logical relationship, or even a physical relationship. In addition, these
relationships can be in a single direction, denoted by an arrow at the end of the connector showing the direction of the relationship; bi-directional, shown with an arrow on both ends of the connector; or an association, where the two blocks are able to access data within each other, shown with no arrows on the connection. After the changes have been
added, the configuration is saved in a new file.
Figure 24. Example Configuration
Once the new configuration is complete, the changes between the new and old
configuration must be identified. This is done by comparing the original version and the
new version of the configuration to each other. Just as in the Settings Firewall in Section
4.3, this control system example supports a tool that reports the differences between two
configurations. Each function code type that is added to this configuration must be
checked to see if it already exists in the system elsewhere. If it is completely new or
needs to be treated as completely new, it is included in this firewall construction. If it
previously existed, it will be included in the firewall shown in Section 4.4.2. The
differences from this new function code configuration example are shown in Figure 25.
This differencing tool shows the new function codes that were added to the configuration.
Figure 25. Example Difference of Two Configurations, Adding
Now that the newly added function codes have been identified, a mapping must
be made between these new function codes and the internal code that implements their
functionality. Usually configurable elements, such as the function codes in this example
system, are represented internally as objects or library functions stored in a set of
common components that are loaded into the system when needed. In an ERP system,
configurable elements are usually functions or objects and mapping them to the code is
easy since the interface to these functions will be visibly called from the user code that
implements their solution. For control systems, which are graphically configured, this
mapping may not be as intuitive. For this specific control system example, each function
code is well defined in the source code, containing a list of all the methods, variables, and
interfaces that exist within it. These interfaces usually just get assigned by the
configuration to connect the function code with whichever function code has been
connected to it. In addition to the source code of the added configurable element, the
relationships between that item and any other items must be analyzed. Each item
connected to the changed item will have its source code marked as changed also. A code
example showing the details of a function code is shown in Figure 26. Besides
determining the source code that implements the functionality of the configurable
element, all connected function codes must be identified. This is accomplished by
checking the configuration file for these relationships.
Figure 26. Example Source for a Configurable Element
A Traditional Firewall must now be created, treating each new function code
added to the configuration as changed code. In order to determine the relationships these
new function codes have to the rest of the configuration, both the internal code and the
configuration must be analyzed. As a result, creating the TFW becomes more
complicated, as configurations set up dynamic calling relationships between the function
codes themselves, as well as system calls, when the configuration file is first loaded.
Internal state variables, semaphores, and other relationships that exist statically must be
included in the firewall, as well as all dynamic relationships that exist from the
configuration connections. These connections link the calling function, shown
graphically in Figure 24 as the input and output lines in the blocks, to the functions they
need to access. Internal to the code, these connections are usually just addresses which
get assigned when the configuration file is loaded into the system. Once these
relationships are identified and understood, the TFW can be created.
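One way to picture the resolution of these dynamic relationships is the sketch below, which turns the configuration's connections into extra call-graph edges before the one-level TFW propagation is applied. The connection record format and direction labels are assumptions; in the real system the connections are addresses assigned when the configuration is loaded.

# Sketch of resolving configuration connections into static call-graph edges.
def connection_edges(connections):
    """connections is a list of (source, target, kind) tuples where kind is
    'forward', 'bidirectional', or 'association'."""
    edges = set()
    for src, dst, kind in connections:
        edges.add((src, dst))
        if kind in ("bidirectional", "association"):
            edges.add((dst, src))   # impact can flow in both directions
    return edges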
Once the TFW is complete, analysis is conducted to determine if any additional
forms of impact beyond code flow are present in the changed functions. Since adding
new function codes to the system allows new code to be run, every function and variable
inside the function code must be checked to determine if a data dependency exists
between any outputs of the function code and the external existing function codes in the
system. If any data dependencies exist, an Extended Firewall is constructed. Similar
analysis must be performed for each additional type of dependency discussed in Chapter
3. If any of these forms of dependency are present, the corresponding firewall must be
created. Just as with TFWs, these firewalls must take into account both the source code
dependencies as well as the dynamic dependencies that are created from the
configuration.
Now that the firewalls are created, tests must be selected or created to both
exhaustively test the newly added function code and all related configurable elements and
source code identified. These tests need to be run on the customer’s new configuration
and should focus on the specific changed and affected methods listed in the firewalls.
Some tests for these areas may exist from other customers and can be reused to some
extent.
4.4.2 Constructing a Firewall for Previously Used Configurable elements
When the user’s configuration change contains the addition of a previously used
configurable element, a slightly different firewall construction is used. Constructing this
firewall follows the same general process shown in Figure 22, with some small changes
which are shown in Figure 27. First, the settings values of the new instance of the
configurable element need to be compared to all the other instances in the configuration.
The one with the fewest differences is recorded. If all settings are different, then the
element is considered as new and the firewall in Section 4.4.1 is created for it. If none of
the settings are different, then only a few tests are created to check its basic behavior.
Finally, if only some of the settings are different then a Settings Firewall is created for
each and the results are aggregated together into a TFW. These changes are shown in the
box in Figure 27.
The same example system used in Section 4.4.1 will be used when describing the
construction of this firewall. For this firewall, the user adds a new instance of a function
code that has other instances elsewhere in the configuration. Instead of completely new
function codes being added, which enables large amounts of new code in the system for
execution, such as those shown in Section 4.4.1, these changes add an instance of an
element that has already executed inside the customer’s system and configuration. Since
this code was already enabled for execution in the system, the main source of latent
defects is the differences in the settings values between the instances themselves.
Figure 27. Process Diagram for a Previously Used Configurable Element
Creating a firewall for this type of change follows the process steps shown in
Figure 27. Once the user adds a new instance of a previously used configurable element,
the configuration is saved as a new version and sent off to the vendor for analysis. Once
received, the differences between the two configurations must be determined. Each new
instance of a previously used configurable element is identified and added to the list of
changes.
After all of the new instances of configurable elements have been identified, each
must have its settings values compared to every other instance in the system. The
instance with the smallest number of differences is recorded. If none of the settings
values are different, then a few simple checking tests are written and executed. No
additional testing is required. If all of the settings values are different, then this instance
should be treated as a new configurable element and the process shown in Section 4.4.1 is
used. Finally, if only some of the settings values are different, then Settings Firewalls are
created for each of the different settings following the steps shown in Section 4.3.
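The instance comparison just described can be sketched as follows; the setting dictionaries and the returned decision labels are illustrative assumptions.

# Sketch of deciding which firewall construction to use for a previously
# used configurable element, based on how different its settings are.
def decide_firewall(new_settings, existing_instances):
    """existing_instances is a list of setting dictionaries for other
    instances of the same configurable element type."""
    diffs = [sum(1 for key in new_settings if new_settings[key] != other.get(key))
             for other in existing_instances]
    fewest = min(diffs) if diffs else len(new_settings)
    if fewest == 0:
        return "basic_checks_only"             # identical settings
    if fewest == len(new_settings):
        return "treat_as_new_element"          # all settings differ (Section 4.4.1)
    return "settings_firewall_per_difference"  # some settings differ (Section 4.3)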
In addition to determining the differences between instances of the configurable
element, a mapping is made between the new configurable elements and the source code
that represents them, as well as determining what other configurable elements and system calls are connected to them in the configuration. For this specific control system example,
each function code is well defined in the source code, containing a list of all the methods,
variables, and interfaces that exist within it. These interfaces usually just get assigned by
the configuration to connect the function code with whichever function code has been
connected to it. In addition, each relationship to any other function code must be
identified and the code for those function codes will be marked as changed also. This
mapping is the same as for new configurable elements described in Section 4.4.1.
Once all of the source code and configuration dependencies have been identified,
a TFW is created. As with new configurable elements, creating the TFW is more complicated, as
configurations set up dynamic calling relationships between the function codes
themselves, as well as system calls, when the configuration file is first loaded. Internal
state variables, semaphores, and other relationships that exist statically must be included
in the firewall, as well as all dynamic relationships that exist from the configuration
connections. These connections link the calling function, shown graphically in Figure 24
as the input and output lines in the blocks, to the functions they need to access. Internal to
the code, these connections are just addresses, and they are assigned by having the calling
function in the first function code get the address assigned to it from the other end of the
connection. Once the relationships are understood and modeled, the Traditional Firewall
can be completed, stopping one level away from each affected function.
Now that the Traditional Firewall is complete, analysis is conducted to determine
if any additional dependencies besides code flow are affected by the added instances.
Each created Settings Firewall must be checked for data dependencies which might exist
between any outputs of the new instance of the function code and the external connected
function codes in the configuration. If any data dependencies exist, an EFW is
constructed. Similar analysis must be performed for each additional type of dependency
discussed in Chapter 3. If any of these forms of dependency are present, the
corresponding firewall is created.
Finally, tests must be selected or created to test the function codes in their new
use. The testing for these reused function codes can be further reduced, when compared
to the new function code case, if the settings that control the way the function code
operates are the same in the new use as in the old use. If the settings are the same, then
the function code itself only needs to be quickly checked, and more exhaustive testing
will be performed on the interfaces to the existing function codes in the system. If the
settings are different, a Settings Firewall can be made for each changed setting to further
reduce the testing.
4.4.3 Constructing a Firewall for Removed Configurable elements
Constructing the firewall for removed configurable elements differs slightly from
the other cases. It follows the general process steps in Figure 22, but the details of some
of the steps have changed. Changes of this type either involve removing individual
configurable elements and connecting the surrounding system back together around it or
removing a whole connected part of logic completely. The first type, removing an
individual element, is more common. An example of this involves the changing of a field
device to a new field device which now has the ability to convert its own thermocouple
value to a temperature directly. The user would remove the function code that currently converts the temperature from the configuration and connect the input value function
code to the display function code directly. The second type involves the removal of an
entire logical part of the system completely. An example of this would be the removal of
an old processing line from the system. If the physical plant shuts down a line, the
configuration will be updated by removing all of the logic for that line. For both of these
cases, it is important to remember that the aim is to detect internal software defects, such
as parameter problems or invalid calculation. It is not meant to address the problem of
user configuration defects, such as whether the process remains stable with the new
configuration. Each time a configurable element is removed from the system a firewall of
this type must be created.
Determining the differences in the configurations uses the same system tool as
shown in the previous examples. The new configuration will have some function codes
removed and some connections changed. The changed connections are the elements that
are considered changed, and the function code at each end of them is marked as changed.
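The marking step for a removal can be sketched as follows: every connection that touched a removed instance is treated as changed, and the function codes at both ends of such a connection are marked as changed. The connection representation is an assumption made for the example.

# Sketch of marking impact for removed configurable elements.
def mark_removal_impact(removed_ids, connections):
    """connections is a set of (source_id, target_id) pairs taken from the
    previously running configuration."""
    removed = set(removed_ids)
    changed_connections = {(src, dst) for src, dst in connections
                           if src in removed or dst in removed}
    # the function codes at each end of a changed connection are marked changed
    changed_elements = {end for conn in changed_connections for end in conn}
    return changed_connections, changed_elements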
After all the impact to source code mapping has been completed, a Traditional
Firewall is created. In order to determine the impact the removal of these function codes
had on the rest of the system, both the internal code and the configuration must be analyzed.
Again, creating the calling tree becomes more complicated for graphically configured
systems, as the graphical relationships represent dynamic calling relationships between
the function codes that get loaded at runtime from the configuration. Internal state
variables, semaphores, and other relationships that exist statically must be included in the
firewall, as well as all dynamic relationships that exist from the configuration
connections. Since the removal of a certain function code could change the data being
used in other connected function codes, each function code that used to connect to the
removed one will be considered changed. Once the relationships are understood and
modeled, the Traditional Firewall can be completed. Next, analysis is conducted to
determine if any additional forms of impact beyond code flow are present in the changed
functions. Since removing a function code changes the data and control flow in the function codes it was connected to, each affected function must be checked to determine if a data dependency exists between any of its outputs and the external existing
function codes in the system. If any data dependencies exist, an Extended Firewall is
constructed. Similar analysis must be performed for each additional type of dependency
discussed in Chapter 3. If any of these forms of dependency are present, the
corresponding firewall is created.
Finally, tests must be selected or created to test the remaining connected function codes in their new use. The testing for these function codes can be further reduced, when compared to the new function code case, if the settings that control the way each function code operates are the same as before the removal. If the settings are the same, then
only the function code itself needs to be quickly checked, and more exhaustive testing
will be performed on the interfaces to the existing function codes in the system. If the
settings are different, a Settings Firewall can be made for each changed setting to further
reduce the testing.
4.5 Time Complexity of the Configuration and Settings Firewall
In order for this firewall to be successful, it must be both effective and efficient
when used in industrial practice. The effectiveness of this firewall is measured by its
ability to detect defects exposed when real customer configuration changes are made,
shown in Chapter 6. The efficiency of this firewall is shown by analyzing the time
required to use this method for each customer change in the field. This time is described
in the following equation, Te = At + Ta, where At is the time required to complete the
firewall analysis, and Ta is the time needed to test the impact identified by the analysis.
Traditional regression test selection (RTS) methods analyze efficiency by taking
Te and subtracting it from the originally needed test time, To. This difference represents
either the time savings or time lost when using this method, depending on whether the
resulting difference is positive or negative, respectively. This new firewall method is
different from traditional RTS methods since, in general, no testing currently occurs
when customers make changes to their configuration, as the customer assumes the
software is fully tested and contains no defects. Because of this, using the new firewall
only adds testing time for each customer configuration change. A small number of
customers, who run either critical applications or configurations that revealed defects
with previous changes, may do some form of black box testing based on their past data
and understanding of the software when they make changes. In effect, these customers
guess at the impact of the change in a somewhat directed way. With these customers,
their original testing becomes To and the time savings or time loss can be determined, but
the number of customers in this category is very small.
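Restating the cost comparison compactly, using the symbols already defined (the grouping of customers is taken directly from the discussion above):

$$T_e = A_t + T_a, \qquad \Delta T = T_o - T_e = T_o - (A_t + T_a)$$

For the majority of customers, who currently do no testing after a configuration change, $T_o = 0$ and $\Delta T = -(A_t + T_a)$, so the firewall can only add time; for the small group that already performs directed black box testing, $\Delta T$ is positive whenever their current testing time exceeds $A_t + T_a$.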
Since using the Configuration and Settings Firewall in practice will, in most
cases, result in additional time for both analyzing and testing the impact of the change,
using it must be made as efficient as possible. This will limit the overhead imposed on
both the customers and the company developing the software, who must work together to
test the software for each configuration change. The two main components of time that
arise when using this firewall are At, the time it takes to perform the analysis required,
and Ta, the time needed to test the impact of the change on the software. Ta
depends on the size of the impact identified in the analysis, which is influenced by the
code dependencies which exist in the implementation of the program under test and the
setup and execution time of the tests themselves. The accuracy of the impact analysis is
shown in the empirical studies in Chapter 3 for the individual code-based firewalls, and
Chapter 6 when applying this new firewall to customer changes. Issues arising from the
implementation and the overall analysis time will be discussed in the remainder of this
section.
The analysis time of this new firewall method can be described by the time
complexity of the algorithm used. The algorithm itself takes each change, either a setting
or a configurable element, and determines the impact on the system from that change by
creating one or more firewalls. More formally stated, the algorithm takes each changed
element E from the set of elements that make up the configuration C and creates a
Traditional Firewall for it. Creating the TFW involves finding all control flow paths to
and from E and marking them as changed. The propagation of the change expands out
exactly one level for each control flow dependency, so this analysis can be done in linear
time based solely on the number of control flow dependencies present. Once the TFW is
created, the algorithm must determine if one or more additional firewall models must be
created, depending on the specific change and its implementation. If an Extended
Firewall is needed, element E is checked for additional data and control flow paths which
extend out past the one level that the TFW requires. This checking involves determining
if a function or data value is dependent on something outside the TFW, either up the call
stack or through previously calculated state values. The details of how the EFW is created are
discussed in Section 3.2. When an EFW is created, the propagation of the change is not
constant, and the time required to check for the existence of these paths is dependent on
the number of control flow paths and the number and length of data flow paths, Pc, Pd,
and Pl respectively, that exist in the system through the change. Specific values of Pc, Pd,
and Pl were collected in the empirical studies and are shown in Chapter 6.
This algorithm is further refined with logical bounds for each value used in the
algorithm. Configurations, denoted in this analysis as C, can contain a maximum of N
elements. Each configuration element, labeled E, in the configuration C exists as an
instantiation of one of M possible configurable elements. The number of base
configurable elements, called M, is usually small due to the configurable elements being
highly encapsulated. For example, the ABB control system has 247 configurable
elements available for use in configurations. Most configurable systems allow many
elements in C to be based on the same base configurable element, much as object
oriented design allows many instantiations of the same class. The number of
specifications, S, in each configurable element E depends on the specific configurable
element that E instantiated. Usually S is kept small, just as the number of member
variables in a class should be kept small, since each specification in element E should be
well encapsulated. Finally, Pd and Pc need to be considered for the program under test,
since the time needed to construct an EFW is influenced by these values.
For the worst case analysis of this algorithm, the software system has to be
implemented in such a way that the implementation of all of the configurable elements
and settings in the system is fully coupled together. As a result of this, Pd, Pc, and Pl
would be large, since the high coupling must exist in the code as either control flow or
data flow dependencies. This full coupling of all objects would require the creation of an
EFW for each changed element E in C. In addition, the algorithm for the EFW would
propagate the change through the entire code base for each of these changed elements.
Also, each configuration must contain a large number of elements, N, and each element
must have a large number of specifications, S. Finally, each one of these elements and
settings must be changed by the customer. For this example, the algorithm would
operate on all N elements in C, since they were all changed. Each individual changed
element E in C, including both settings and configurable elements, would require an
EFW to be created. This Extended Firewall will grow to the size of the entire software
program, leading to an algorithm with an O((N*S)^(Pd*Pc*Pl)) runtime, which has exponential
growth and depends on the number of configurable elements and settings, N and S
respectively, the number of control flow paths, Pc, and the number and length of data
flow paths, Pd and Pl respectively, that exist in the system.
This worst case analysis is infeasible for real systems for a number of reasons.
First, completely coupled software programs are incredibly hard to create, especially large
ones that have to perform actual complex functions in the real world. Second,
configurable systems tend to have groups of configurable elements that work exclusively
with each other. In the control system example, analog inputs have configurable elements
created for each of the possible device types you can connect them to, with specific code
to convert the values and communicate with that specific device. This code can not be
fully coupled with all other methods and classes, as it is specific to that one type of
device. The ERP system also contains these sets of similar elements, an example of
which is process specific functions that depend on the specifically selected process type
being implemented. Since this fully coupled system is infeasible, the EFW created for
each changed element E would never have to propagate across all existing Pd and Pc
paths. Finally, a customer would never change every setting and every configurable
element in the system at once. Since many of these configurable elements represent real
world objects, such as field devices in control systems and computer hardware in ERP
systems, a complete change of the configuration would mean a complete change of the
physical environment also. In this case, an entirely new commissioning effort would be under way, and a completely new testing effort would be required anyway.
The best case analysis for this algorithm involves a software system where the
code is completely encapsulated and minimal coupling exists. This would lead to a small
Pd, Pl, and Pc, since these paths represent the couplings that exist in the system. Since
there is minimal coupling, only a TFW would need to be created for each changed
element in the configuration. Two different customer change patterns will be examined: a single changed configurable element and multiple changed configurable elements. The single element
change would cause the algorithm to select the only changed element E from the set C.
For that one element, the algorithm would create a TFW only, Pd and Pl would be zero
and Pc would be one since there is minimal coupling in the system. Creating this TFW
takes constant time, as the propagation stops one level from the change by definition.
Since there is only one element changed, this step happens exactly once, which leads to a
runtime of O(1). Looking at this example with multiple customer changes, or N changes,
the algorithm selects each changed E from C and creates a TFW only, which is created in
constant time. This step is done N times, so the runtime for this example is O(N).
Finally, an average case analysis for this algorithm involves a system more representative of one found in industry. This system has an average fan-in of two and an average fan-out of five. The system has around ten thousand elements in the configuration, and the customer makes an average of eighteen total changes, of which six are configuration changes and twelve are settings changes. Since the algorithm creates firewalls for each changed element E in C, eighteen TFWs would be created. In addition to these TFWs, one EFW is created per fifteen TFWs, on average. These numbers were derived empirically from real customer configurations and are shown to be statistically accurate in Chapter 6. Creating this number of TFWs and EFWs represents a very reasonable amount of time and effort for the vendor in order to verify a customer change.
4.6 Future Improvements on the Configuration and Settings Firewall
There are many current research projects in the software engineering community that could be combined with this method to increase its efficiency and effectiveness. Efficiency increases could be obtained by incorporating techniques and automation from Requirements Engineering, Program Analysis, and Information Retrieval, such as [35] and [36] mentioned in Sections 4.1 and 4.2. These automated methods would replace much of the human expert knowledge and manual work that is currently required to determine the impact of these customer changes.
It may be possible in the near future to fully automate this process, where
customers can submit their current and new configurations to an automated system. This
system would build the firewalls and return to the customer the impact of their changes,
allowing them to test their own changes quickly. This would enable this firewall method
to scale to any number of deployed systems while still protecting the implementation
details from customers and competitors.
In addition to improving the efficiency of the analysis with new methods and automation, this new firewall can be improved by adding dynamic information, such as execution profiles of the currently running configuration. These execution profiles can be used to reduce the testing required due to a setting or configuration change. Comparing the new execution profile to the already tested execution profiles of configurations running at other customers can identify areas of the system that are executing in the same way and have already been tested. These areas do not require retesting, but in the current firewall they would be retested. This additional reduction in testing will reduce the time it takes to get a new change verified and into execution at the customer site. Many studies have looked at capturing, grouping, and differencing profiles, such as [46, 47].
Finally, some improvements in effectiveness can be made by developing new
firewall models for impact types not currently handled. Impact types that do not currently
have firewalls include defects related to memory leaks, starvation, and performance.
These firewalls, once developed, can be added to this Configuration and Settings Firewall
in the same way that other firewalls were added.
5. A Process to Support the Configuration and Settings Firewall
While the Configuration and Settings Firewall presented in Chapter 4 works well for configuration changes in the field, software companies still release new software versions with code modifications. This chapter presents a product release testing process for user configurable software systems which, when used together with the firewall after release, prevents latent and configuration-based defects from being discovered by customers in the field while reducing redundant testing whenever possible. This process is based on current industry release testing processes, with a few modifications dealing with the specific configurations and settings that must be tested at release time. Many of these modifications aim to offset the additional testing needed when using the firewall for configuration and settings changes throughout the product's lifecycle.
This chapter is divided into the following sections. Section 5.1 presents an overview of current processes for testing in industry. Section 5.2 presents a modified process for release testing to support the Configuration and Settings Firewall. Section 5.3 discusses the time savings of using this proposed release process. Finally, Section 5.4 presents future changes and research additions that can improve the efficiency and effectiveness of this process.
5.1 Current Industry Testing Process
Testing in industry usually follows the V-Model, shown in Figure 28, or a process that is very similar to it. The early phases of testing, which are labeled in the figure as coding, unit testing, and integration testing, are focused on verification activities. The coding phase, which would not seem to include any verification, includes static analysis and code reviews, two effective early methods of defect removal. These early phases require no changes when testing user configurable software systems, since they are focused on removing as many defects as possible at a low level in the development process. Later testing phases, labeled in the figure as system testing and acceptance testing, focus more on validation activities. The system testing phase actually includes two activities, product-level and system-level testing. These later phases are the ideal ones to modify for user configurable software systems, as they deal with showing that the software meets its customer requirements.
Figure 28. The V-Model [48]
The requirements created for configurable systems are not received directly from the customers. Instead, they are created by the product management teams, with the help of the marketing team, who have the responsibility of understanding the market the product is sold in. It is done this way because the very nature of configurable systems allows many different customers to refine a general solution into the specific solution that will address their individual needs. In effect, these systems are sold as general purpose, off-the-shelf software systems with a goal of meeting the needs of an entire broad, diverse market. Due to the lack of direct requirements from any specific customer, the product requirements for a specific customer are best described by the configuration that customer is currently running, as this configuration contains the functions and features the customer cares about most at that point in time. Since the overall system requirements come from the market as a whole, late phase testing can be shifted from exclusively validating these market requirements to validating the currently running customer configurations.
5.2 Modified Industry Testing Process
Completely testing each customer's configuration would be a monumental task and would provide limited benefit relative to the overall effort. Instead of fully testing each configuration, a traditional code-based regression analysis will be performed on the system. The impacted areas determined from the code-based analysis will then be compared to the customer configurations in use in the field. Only the configurations that contain configurable elements impacted by the code changes for the release will be retested, looking for any latent, regression, or new defects in the system. In addition to these configurations, a set of common changes to these configurations could be tested if time remains. These common changes should be based on both the overall system requirements and changes that other customers have made in the past. This additional testing should allow for detection of defects that have a high probability of being seen by customers in the field soon after release. Finally, this new release testing process can lead to a reduction in the time that a release requires to finish testing, thus getting the software out to the customers faster.
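A minimal sketch of this selection step is shown below. The data shapes and names are assumptions made for illustration; they are not part of any actual release tooling.

    def configurations_to_retest(customer_configs, impacted_elements):
        """Select the customer configurations affected by a release.

        customer_configs: dict mapping customer name -> set of configurable
        elements currently in use; impacted_elements: set of configurable
        elements touched by the code-based regression analysis."""
        to_retest = {}
        for customer, elements in customer_configs.items():
            overlap = elements & impacted_elements
            if overlap:
                # Only the impacted portion of this configuration needs retesting.
                to_retest[customer] = overlap
        return to_retest

    # Example with invented data:
    configs = {
        "plant_a": {"PID", "AnalogIn", "Totalizer"},
        "plant_b": {"AnalogIn", "Alarm"},
        "plant_c": {"Sequencer"},
    }
    print(configurations_to_retest(configs, impacted_elements={"PID", "Alarm"}))
    # {'plant_a': {'PID'}, 'plant_b': {'Alarm'}}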
Testing features not currently in use by customers, and correcting defects found in those features, will be postponed until customers start using them by adding the corresponding configurable elements to their configuration. Using the Pareto principle [48], also known
as the 80/20 rule, as a rough guide, it is likely that around 80% of the customers are only
using 20% of the features of the software. Other studies of software, including
prioritization and prediction, use this principle in a similar way [49].
It is safe to postpone the testing of features not currently configured by customers
as they can only be added later in one of two possible ways. The first way a previously
unused feature could be added involves the customer changing their configuration to
include it. This would require the use of the Configuration and Settings Firewall, which
would detect the new feature from the change and trigger testing of that new area. The
only other way a previously unused and untested feature can be added is when a new
customer buys the system and needs that new functionality in their configuration. In this
case, the extensive commissioning testing that occurs before the software goes online will
detect these defects.
Modifying the release testing process for these user configurable systems has the potential to both reduce time and increase the number of defects detected. The actual measured improvement for a specific team or company will vary, depending on the balance between finding defects and reducing time that currently exists in the company. If the release schedule has been the most important driver for releases, then the major gain will be in defects detected, as previous testing did not have enough time to address the large number of possible configurations and settings. If detecting as many defects as possible was the driving factor instead, then a significant improvement in the release schedule can be made, as previous testing was potentially examining features that are not currently used in the field. Most companies do not use either of these factors to the exclusion of the other, so some measurable gain in both should be possible.
Once the software is released, the Configuration and Settings Firewall will be used on each customer configuration and settings change. The details on how to create the firewall for each of those changes were presented in Chapter 4. Customers must work with the vendor to determine the impact of their change, and the vendor will test the software with the new configuration and settings values and correct any defects that are found. This process also has a positive side effect. Customers know that a change will require them to send a new version to the vendor and wait for validation, leading them to make fewer ad hoc changes to the configuration, similar to how developers stop making ad hoc code changes when companies use controlled Configuration Management processes. More deliberate changes will help the customer achieve better reliability with the software, and the vendor will receive fewer defect reports on the system.
Figure 29. Release Testing, Old and New Methods
In Figure 29, both the traditional and proposed release testing processes are
shown side-by-side. The x-axis represents time, the y-axis represents the range of
possible configurations, each black x represents a defect, the small dots represent the
configurations tested, and the vertical line represents the point in time where the software
is released to the customers. Each dot represents a set of tests and the coverage those tests
had on the execution of the software. Any x’s which are not detected before release
become latent defects.
The top image shows the coverage achieved for user configurable software when
tested with a traditional release testing process. Only a few different configurations are
tested, and those tested are very similar to each other. This leads to many tests being
redundant with regard to configuration and execution, and many latent defects not being
covered by tests. The bottom image shows the coverage achieved when testing these types of systems with the proposed new release testing process. In this case, testing is spread across the product lifecycle, leading to less redundancy, greater coverage of possible configurations and executions, and greater detection of customer-relevant latent defects.
The overall number of tests run is the same for both the traditional and proposed testing methods shown in Figure 29. The traditional release testing case has all of the tests and configurations run in a short period of time before release. For the proposed method, the total testing time is the same, but this time is spread out over the life of the product. It is important to note that, over the life of the product, the proposed release testing method may require the same or even more total time to test the system than the traditional method. This additional time amounts to the extra testing and fix time required for each customer configuration change. As previously unused features are configured, testing must be conducted on those features and the defects found must be corrected.
This new proposed release testing process, combined with the Configuration and Settings Firewall, amounts to a test-as-you-use model, where the costs of testing and fixing defects are spread out over the life of the product. Even when the combined method does cost more over time than traditional testing, there is still a guarantee that only areas of the software that are in use are being tested and that the defects detected are the ones that pose the greatest risk to customers running the system in the field.
5.3 Time Study of the Proposed Release Testing Process
In order for this new release testing method to be considered effective, it must detect the defects injected by code changes that customers found just after release. Notice that this does not involve a configuration change but a code change, as the customers are running the same configuration but have updated the software version. This evaluation will be shown empirically in Chapter 6.
For the new process and method to be considered efficient, it must save as much time as possible. Determining this savings from the theoretical side is simple, as the number of possible configurations and settings to test is prohibitively large. Since there are far fewer customers than combinations of configurations and settings, it will certainly take less time to test each of their configurations than to test all possible combinations before release. On the practical side, since it is infeasible to test all possible combinations, current industry testing selects a specific subset to validate the system with before release. This subset is based on a combination of expert system knowledge, guesswork, and conservatism, often leading to a set that is larger than it needs to be in order to detect as many defects as possible. Since this subset varies for each product, the savings in time will also vary and need to be determined for each system. Empirical studies on the time savings of this method will be presented in Chapter 6.
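A back-of-the-envelope calculation makes the gap concrete. The counts below are invented for illustration; only the ten thousand elements figure echoes the average case discussed in Chapter 4.

    # Rough illustration of why exhaustive configuration testing is infeasible.
    n_elements = 10_000         # configurable elements in an average configuration
    settings_per_element = 5    # assumed settings per element
    values_per_setting = 10     # assumed distinct meaningful values per setting
    n_customers = 1_000         # assumed number of deployed customer configurations

    # Every combination of settings values across all elements:
    total_combinations = values_per_setting ** (n_elements * settings_per_element)
    print(len(str(total_combinations)) - 1)   # 50000: the exponent of the search space

    # Testing one currently running configuration per customer instead:
    print(n_customers)                        # 1000 configurations to cover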
5.4 Future Additions to the Proposed Release Testing Process
Additional time savings could be found by applying other research areas in software engineering to this modified release testing process. One potential savings could come from comparing the executions of a number of similar customer configurations to determine if their execution patterns are the same. In addition, logging and comparing previously tested execution patterns with newly changed configurations may lead to a further reduction in the change-based testing done after release. The goal is to test only the first of these similar configurations completely and then simply load and execute the other configurations. This kind of execution profiling and comparison could be based on system execution profiling information and clustering techniques similar to those presented in [46, 47]. Another area of improvement would be better automation of the testing of customer configurations. If a set of parallel systems could be set up and augmented with automated load and test driver software, this testing would become much less burdensome.
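As a simple illustration of the profile comparison idea, the sketch below models an execution profile as the set of code blocks it covers and skips full retesting when a sufficiently similar profile has already been tested. The similarity measure and threshold are arbitrary assumptions, not part of the techniques in [46, 47].

    def jaccard(profile_a, profile_b):
        """Similarity of two coverage profiles, each a set of covered code blocks."""
        if not profile_a and not profile_b:
            return 1.0
        return len(profile_a & profile_b) / len(profile_a | profile_b)

    def needs_full_test(new_profile, tested_profiles, threshold=0.95):
        """Require full testing only if no previously tested profile is similar enough."""
        return all(jaccard(new_profile, old) < threshold for old in tested_profiles)

    # Example with invented coverage data:
    tested = [{"f1", "f2", "f3", "f4"}]
    print(needs_full_test({"f1", "f2", "f3", "f4"}, tested))   # False: already covered
    print(needs_full_test({"f1", "f7", "f8", "f9"}, tested))   # True: new behavior to test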
6. Empirical Studies of User Configurable Software Firewalls
This chapter examines the effectiveness and efficiency of the Configuration and
Settings Firewall on user-configurable software in more detail by presenting a set of
empirical studies that were conducted on various software products developed at ABB.
These products are large real-time user-configurable software systems currently in use at
thousands of locations around the world for industrial control. These systems are
configured to run various types of process control applications such as power generation,
chemical and beer production, and pharmaceutical manufacturing. The first part of this
chapter, Section 6.1, presents an overview of the empirical studies that were conducted.
Section 6.2 discusses the limitations of the studies that were performed. Sections 6.3 and
6.4 present the first and second case studies and their results. In Section 6.5, a breakdown of the types of configuration-based defects found by customers in the field is presented. Finally, Section 6.6 presents a case study performed to show how efficiently this method can be used in practice.
6.1 Empirical Studies Overview
In order to validate that the Configuration and Settings Firewall is effective and
efficient, a number of empirical studies were performed. These studies involved a few
different approaches, depending on the goal of each study. The first approach, used in the
first and second case studies, involves applying the Configuration and Settings Firewall
to a large number of past customer configuration changes and then comparing the
identified impact to any known defects the customer found when those changes were made. This approach is useful for showing the effectiveness of the change determination, code mapping, and impact analysis steps of the firewall at determining the correct areas of the software to test. Not running the tests removes test effectiveness from consideration when evaluating this new firewall method, and it also increases the number of changes that can be analyzed in a set period of time.
A second approach, used in the third case study, takes a smaller subset of the
customer changes used in the first two case studies, applies the Configuration and
Settings Firewall just as before, but now includes execution of the tests. The goal of this
approach is to show any additional defects that pose a future risk for the customer that
have not yet been detected in the field.
The third approach shown in the fourth case study involves looking at each latent
defect found by customers and classifying it using the Beizer Defect Taxonomy [2]. Once
classified, an analysis of the defect types found by customer configuration change is
presented. This analysis provides insight into the number of defects found by customers
in each defect type. These defect types are each assigned to a code change firewall based
on the dependency involved with that defect. Once this assignment is done, the number of
defects that can be detected by each firewall is calculated.
The final approach presented in the fifth case study involves measuring and
recording static metrics of the code analyzed by the firewall for the first two sets of
changes. These measures, including Pc, Pd, Pl, and the frequency of EFWs, will help
describe the time complexity of the firewall.
The first and second case studies aim to show the effectiveness of the new
firewall model in detecting latent defects when the configuration and settings change.
The first study is conducted on an embedded process controller module which is
procedurally designed and implemented as a mix of C and C++. The second study is
conducted on an HSI console product running on a standard PC under Windows. This
HSI is implemented in C++ and C# following Object-Oriented design principles. Each of
these case studies starts with a released version of the specific software and a running
customer configuration. Many customers are inherently secretive with their specific
configuration, as the actual running of their process is often a trade secret. Due to this,
ABB only has a few opportunities to get real customer configurations. One configuration
commonly available is the initial configuration used when the plant was first
commissioned. In addition, customers submit their currently running configuration when
field failures are detected in the software. These failures, if caused by latent software
defects, are detectable in the submitted customer configuration. To prevent any bias in the analysis, no information about the customer-found defect is available at the time the firewall is created; that information is only used later to evaluate the firewall.
There are two ways available to get the specific changes the customer made to the
running configuration, both of which were used in these studies. The first method is
possible only when the customer submits detailed steps of the actions they took that
caused the failure to occur, as well as the configuration they were running when the
failure occurred. In this case, the submitted configuration is opened and the reported changes are removed, leaving a configuration similar to the one the customer was running before the failure. The changes are then re-applied and a second configuration is saved, representing the changed configuration. This method provides the precise configuration and settings changes which caused the failure and exposed the latent defect, allowing for more accurate logging of the time required to analyze the changes. This accuracy comes from the analysis using only the actual set of changes which exposed the defect, as opposed to a grouped set of all changes over time that ABB knows about. This is explained further in Section 6.3.
The second method for determining change information from the customer
involves taking a previously known configuration and using it as the base configuration.
The submitted customer configuration, when compared to this base configuration,
contains all of the changes to the configuration the customer made over some period of
time. This may include many years' worth of changes, depending on when the last field failure was reported for that customer. As a result, this method is slightly less representative of a single atomic change set, but it does allow more studies to be performed
when the detailed change information is not submitted by the customer. When using this
second method, the difference between the new and base configurations is broken into a
set of small grouped changes. Each of the changes in the groups has a Configuration and
Settings Firewall created for it, approximating incremental changes coming from the
customer.
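A minimal sketch of this second method is given below: diff a base configuration against the submitted one and split the result into small groups that stand in for incremental customer changes. The data shapes, group size, and random grouping are assumptions made for illustration.

    import random

    def diff_configurations(base, submitted):
        """Configurations are modeled as dicts: element id -> dict of settings."""
        changes = []
        for element_id, settings in submitted.items():
            if element_id not in base:
                changes.append(("added_element", element_id))
            else:
                for name, value in settings.items():
                    if base[element_id].get(name) != value:
                        changes.append(("setting_change", element_id, name))
        for element_id in base:
            if element_id not in submitted:
                changes.append(("removed_element", element_id))
        return changes

    def group_changes(changes, group_size=5, seed=0):
        """Split the flat change list into small, randomly grouped change sets;
        a Configuration and Settings Firewall is then created for each group."""
        rng = random.Random(seed)
        shuffled = list(changes)
        rng.shuffle(shuffled)
        return [shuffled[i:i + group_size] for i in range(0, len(shuffled), group_size)]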
In the first and second case studies, once all of the impact areas have been
identified by the Configuration and Settings Firewall, reported customer failures which
are due to latent software defects are analyzed and checked against the impact identified
by the change. If they exist within the areas identified as needing retesting, the defects are
considered detected. If they exist outside the impacted areas, the defects will be studied to
determine if they were related to the configuration change. If they are related, the defects
are considered missed, and if not, they are considered outside of the scope of this firewall
method and discounted.
The overall effectiveness is measured by the percentage of customer-reported latent software defects that were detected by the Configuration and Settings Firewall. Any
additional defects found are considered new defects that have not yet been detected in the
field, but do exist in the system as future risks. In addition, the time required to perform
the analysis and create the needed firewalls is recorded.
The third case study takes a small set of changes from the second case study and
involves running the tests themselves in addition to just checking impact. A few changes
from the GUI configuration product are used, with the goal of detecting additional defects that currently pose a risk to that customer at that point in time. If a defect is found, it will
be checked against the known defect list for the product to determine if it was already
detected or is still latent in the software. The goal of this study is to show that the firewall
can actually detect the identified latent defects by testing, and also determine if additional
defects in and around the change can also be found.
The fourth case study takes all of the customer reported defects in the embedded
controller module and classifies them using the Beizer Taxonomy [2]. Instead of the full four levels of detail used in the Beizer taxonomy, only the first two levels are used. More information on the taxonomy and the customizations is given in Section 6.6. The fourth
case study shows the types of defects that are revealed by customer configuration
changes as well as the types of defects that the different firewalls can detect.
Finally, the fifth case study presents a set of static metrics measured from the
software that was used for these empirical studies. These measures were taken using both
the source code as a whole and using just the code representing the configurable
elements. The measures collected include the fan-in and fan-out of each configurable element (either class or function), the maximum calling tree depth containing a class or method in a configurable element, the cyclomatic complexity of the functions inside the configurable element, and the number of external values used in each function.
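As a small illustration of two of these measures, the sketch below computes fan-in and fan-out from a call graph. The call graph representation and function names are hypothetical; the sketch is only meant to make the measures concrete.

    from collections import defaultdict

    def fan_in_out(call_graph):
        """call_graph: dict mapping caller -> list of callees."""
        fan_out = {caller: len(callees) for caller, callees in call_graph.items()}
        fan_in = defaultdict(int)
        for callees in call_graph.values():
            for callee in callees:
                fan_in[callee] += 1
        return dict(fan_in), fan_out

    # Example with invented function names:
    calls = {
        "pid_update": ["read_setting", "clamp_output"],
        "analog_in": ["read_setting"],
        "clamp_output": [],
        "read_setting": [],
    }
    fan_in, fan_out = fan_in_out(calls)
    print(fan_in)    # {'read_setting': 2, 'clamp_output': 1}
    print(fan_out)   # {'pid_update': 2, 'analog_in': 1, 'clamp_output': 0, 'read_setting': 0}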
6.2 Limitations of Empirical Studies
It has not been determined yet if results on ABB systems are representative of all
real-time industrial software applications. In addition, real-time systems themselves may
behave in a way that is different than other types of applications. Also, the configurations
used by ABB customers are treated as trade secrets and there is no way to know exactly
how all of the changes were performed over time. Currently, the only way ABB knows
about customer configuration changes is when a failure is observed and reported to
technical support. Since time sequence data for each change is not available, the total
changes made to the customer’s configuration are split arbitrarily into a set of smaller
changes. This could lead to a larger amount of time for analysis and testing, due to
overlapping of the firewalls. A final limitation of the study is that the test time component
of the efficiency data is based upon a small number of test runs, as it was not possible to
run tests for each of the studies. The static metrics collected in Section 6.7 were
calculated on the entire system and support the claim of efficient test creation and execution time.
6.3 First Case Study
The first case study was conducted using an embedded process controller which is
implemented as a hybrid containing both OO designed C++ code and procedurally
designed C code. This software includes 761 files, 4831 functions, 49 classes, 533,002
Executable Lines of Code (ELOC), and 247 configurable elements. This software runs on
a custom ABB designed hardware board running a proprietary embedded operating
system. Since this system runs a proprietary OS, there are no third party components in
the system that would require a COTS Firewall to be created. This case study is broken
up into many smaller studies involving different customers and configuration changes.
The main goal of these smaller studies is to show that the Configuration and Settings
Firewall is effective at detecting latent software defects exposed by configuration change
at the customer site. This is accomplished by creating the required firewalls for all of the
changes and then determining if they contain the failure reported by the customer.
Configurations for this system are created graphically and compiled by a tool into
files which are loaded into the specified controllers. Inside these files, the configurations
are represented as a list of configurable elements in the order they are to be executed.
Each configurable element in this list contains values for each of its settings, which are
then assigned to the internal variables that represent them when the configuration is
downloaded to the controller.
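To make this structure concrete, the sketch below models a compiled configuration as an ordered list of configurable elements, each carrying its settings values. The class names and fields are assumptions for illustration, not the actual ABB file format.

    from dataclasses import dataclass

    @dataclass
    class ConfiguredElement:
        block_number: int      # unique ordering ID; elements execute in this order
        element_type: str      # e.g. "PID" or "AnalogIn" (invented type names)
        settings: dict         # setting name -> value

    def download(configuration, controller_variables):
        """On download, each settings value is assigned to the internal variable
        that represents it (modeled here as a flat dict of variables)."""
        for element in sorted(configuration, key=lambda e: e.block_number):
            for name, value in element.settings.items():
                key = f"{element.element_type}_{element.block_number}.{name}"
                controller_variables[key] = value

    # Example configuration with invented elements and settings:
    config = [
        ConfiguredElement(1, "AnalogIn", {"range_high": 100.0, "units": "degC"}),
        ConfiguredElement(2, "PID", {"gain": 1.2, "integral_time": 30.0}),
    ]
    variables = {}
    download(config, variables)
    print(variables)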
6.3.1 First Customer Study – Embedded Controller
The first configuration studied is from a customer that has used the system for
many years and is very familiar with how it works. There were two separate failure reports submitted by this customer against the same version of the software.
Each failure was caused by separate latent defects in the software and contained the
configuration that was used to observe the defect. The most recent configuration was
compared to the originally installed configuration and the changes identified were
grouped randomly into three sets. It was not known which set contained the two reported
defects and the specific details of the defects themselves were not known before the
analysis was done. Prior to these changes, this specific customer had been running for
many years without reported failures.
The first set of changes included five configurable elements being added as well
as changes to nine settings. The settings changes were examined first. Each setting was
mapped to the internal data variable in the code representing it. This variable was marked
as a code change, and a TFW model was created for it. As the TFW was being created,
each control flow path was checked for any data flow dependencies. Since no dataflow
dependencies were found, an EFW was not needed. In addition, the settings changes did
not cause any change to blocking calls in the system, so no Deadlock Firewall was
needed. For the added configurable elements, four of them were previously used
elsewhere in the configuration and the final element was new to the configuration. The
code representing each element was considered code changed and a TFW was created for
each element. Just as with the settings changes, no dataflow or blocking calls were
affected, so only the TFWs were created.
The second set of changes from this customer included two added configurable
elements as well as five settings changes. Each settings change involved determining
which internal variables represented the setting, marking them as changed, and creating a
TFW model. As the TFWs were being created, each control flow path was checked for
any data flow dependencies. Since none were present, no EFWs were created. Each of the
added configurable elements was used elsewhere in the configuration. The code
representing it was marked as a code change, and a TFW model was created. When
creating the model, it was determined that one of the changes included a new semaphore
call, so a Deadlock Firewall was also created. There were no affected data flow
dependencies in these added configurable elements, so no EFWs were needed.
The final set of changes included only nine settings changes. Each setting was mapped to the variables that represented it, and each variable was marked as code changed. While the TFWs were being created, these variables were checked to see if they belonged to a dataflow path with any other parts of the system. It was found that two of the changed variables were included in a longer data flow path which could lead to impact spreading more than one level away. As a result, EFWs were created for these two settings changes. The remaining seven only had TFWs created for them.
Once all of the changes were studied and the required firewalls created, the
customer defects were analyzed and their code locations determined. Each of the defects
was compared to the impact identified by the firewalls, and if it fell inside the impact, it was considered detected.
The first set of changes contained one latent defect reported by the customer. This
defect resulted from a change where a previously used configurable element was added at
the end of the configuration and its output value was being passed to an element that
existed earlier in the configuration. This type of change is valid, but the system executes
code in the order it exists in the configuration, specifically by a unique ordering ID called
a block number. Since the new configurable element is at a higher ID, it is executed after
the existing configuration element that uses its output. The value being passed from the
new element back to the existing element was not initialized properly and the existing
element had no check inside it to verify that its connected data providers were executed
before it. This led to a potential error when the system is first started up where the
uninitialized value can cause the system to perform incorrectly or crash. This defect is
only observable when the source configurable element is added after the receiver element
which is dependent on its value, since the initial value is never needed otherwise. The
specific configuration change is shown in Figure 30.
Figure 30. Case Study 1, Configuration Change with Latent Defect
Figure 30 shows a new configurable element being added whose output is used by
a previously existing element. The latent defect exists within the new element and only
occurs when the new element is added in such a way that its execution happens after the
execution of the existing element. In this case, the currently existing element uses the
value from the new element before the new element has written to that value. This defect
is considered detected by the TFWs, since the output function from the new configurable element was marked as code changed and the existing configurable element, specifically the interface from the new element to the existing one, was marked as needing to be checked.
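A simplified reconstruction of this execution-order problem is sketched below. The element names and values are invented; only the block-number ordering and the missing initialization check mirror the defect described above.

    def run_first_scan(configuration):
        """configuration: list of (block_number, kind, source_block) tuples.
        Elements execute strictly in block-number order."""
        outputs = {}
        for block_number, kind, source_block in sorted(configuration):
            if kind == "consumer":
                # Latent defect: no check that the connected data provider has
                # already executed, so on the first scan this value is still
                # uninitialized.
                value = outputs.get(source_block)      # None on the first scan
                print("consumer read:", value)
            else:
                outputs[block_number] = 42.0           # producer writes its output

    # The new producer was added with a higher block number than the existing
    # consumer that uses its output:
    run_first_scan([
        (5, "consumer", 9),    # existing element, reads the output of block 9
        (9, "producer", None), # newly added element, executes after block 5
    ])
    # prints "consumer read: None" on the first scan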
The second set of changes contained no defects reported by the customer. In the
third set of changes, one of the settings changes led to a customer defect. This setting
change affected the calculation of the output value for the configurable element it resided
in. This output was, in turn, passed between many configurable elements until it was
finally used to compute a final value that was then output to the system. This change is
shown in Figure 31.
Figure 31. Case Study 1, Settings Change with Latent Defect
Figure 31 shows an existing configurable element with a settings change. There
was a latent defect in configurable element C which was exposed due to the settings
change affecting the output value from configurable element Z. This defect was detected
by the EFW created for this change, as it includes this latent defect within the data flow
path marked to be retested.
This first customer configuration study shows that latent defects existing in the
code base can be detected by the Configuration and Settings Firewall. In total, two latent
defects were detected by this firewall. These defects were originally detected by
customers in the field and required development and test rework to correct. In addition,
no defects were missed by the firewall, as no additional latent defects were reported by the customer.
6.3.2 Second Customer Study – Embedded Controller
The second configuration studied involved a different customer with a completely
different configuration. This customer had also been running the software for a long time and was very familiar with the workings of the system. There were three failure reports submitted by this customer on the same version, each of which was caused by a latent software defect. For this study, four sets of changes were created. Prior to the first set of changes, the software had been running continuously, failure free, for a number
of years.
The first set of changes included three settings changes and the addition of one
configurable element used previously in this configuration. Each setting was mapped to
the internal data variable in the code representing it. This variable was marked as a code
change, and a TFW model was created for it. As the TFW was being created, each
control flow path was checked for any data flow dependencies. No data dependencies
were found so no EFWs were created. For the added configurable element, a TFW was
created. As this TFW was created, each control flow path was checked for data flow
dependencies. No dataflow paths were affected, so no EFWs were created. In addition, no
blocking calls were affected by either type of change.
The second set of changes included the addition of four configurable elements
which were used previously in the configuration. No settings were changed. The code
dealing with the new configurable elements was marked as a code change and a TFW
was created. While creating the TFW, no blocking calls or dataflow paths were affected,
so no other firewall models were created.
The third set of changes included changes to three settings values as well as the
addition of three configurable elements which were new to the configuration. The
settings changes were mapped to the internal variables, which were marked as code
changes, and then TFWs were created. It was determined, while creating the TFWs, that
one data flow dependency was affected by one of the changed settings. This required the
creation of an EFW for that dependency. The code for the added configurable elements
was marked as code changed and a TFW was created around them. No blocking calls or
data flow dependencies were affected by the added configurable elements.
The final set of changes contained only five settings changes and one added new
configurable element. These settings changes were mapped to code variables, marked as
code changed, and then a TFW was created. No data flow dependencies or blocking calls
were affected, so no additional firewalls were created. The added configurable element
was new to the configuration and performed a smoothing operation on an input value.
The code for the configurable element was identified and a TFW was created. No data
flow dependencies or blocking calls were affected by the change, so no additional
firewalls were needed.
A latent defect was exposed by one of the settings changes in the first set. This
setting controlled which operation a configurable element performed, and when changed,
affected which code inside that element was executed. Specifically, the element was a
mathematical shaping function used to smooth analog input values and the defect existed
in a code path only executed for the mode of operation selected with the settings change.
The defect involved the accuracy of the shaping function for a certain range of values and
the failure report indicated that it caused process issues for the customer. This defect is
described in Section 4.1 and shown in Figure 14. The firewall model created for this
settings change contained the newly selected execution path within its boundaries, so this
defect is considered detected.
In the second set, one of the added configurable elements previously used in the
configuration exposed a latent defect in the software. It involved adding this configurable
element with its default values, which are automatically set by the configuration tool. If
no changes are made to the settings and the initial defaults are loaded, the configuration
will fail right away. The default values include a value which is specifically not allowed
in the configurable element, but the configurable element does not check this value correctly when it is changed while the controller is offline. This defect was detected by this method, as the entire configurable element was selected for retesting by
the Configuration and Settings Firewall. An example of this kind of change is shown in
Figure 32. In this figure, a new configurable element is added without changing its
default settings. It is just dragged into the configuration page and saved. The other
instances of this configurable element in the configuration had their settings changed
before the system was run.
Figure 32. Case Study 1, Added Configuration Change
The third set of changes contained one latent defect reported by the customer.
This defect involved messages being missed in the system, due to increased processing
time required by the newly added configurable elements. TFWs and one EFW were
created for this change, but the defect was not contained inside the impact. Once the failure report was analyzed, the underlying cause of the failure was determined to be a performance defect. Performance defects such as this will require a Performance Firewall in order to be detected reliably.
In the final set of changes, two separate defects existed. The first defect involved
a settings change which led to a latent defect being found in the software. The latent
defect itself prevented any changes to the settings of this element from taking effect until a restart was performed, where normally no restart is required. Therefore, when the settings change was made, it did not take effect initially. This defect was identified by the Configuration and Settings Firewall, since both the setting itself and the startup routine were marked as changed. As a result, this defect is considered detected by the firewall. The other
defect involved the addition of a configurable element which was new to the
configuration. This new element takes an input value and applies a mathematical function
to smooth its value out. This element was added in response to captured data values from
the input, showing that the physical device providing the input was causing variance in
the input value that did not actually exist in the process.
This second configuration study showed similar results to the first study. Both
settings changes and configuration changes can lead to latent defects in the field. The
Configuration and Settings Firewall was successful at identifying the correct area to test
after the changes were made.
6.3.3 Additional Customer Studies – Embedded Controller
In order to prevent additional repetition, all of the additional customers studied
are summarized in Table 11 at the end of this section. In addition, each reported customer
defect and the configuration or settings change which exposed it are described separately.
The same process used for the studies in Sections 6.3.1 and 6.3.2 is used here, but a description of
the steps followed for these additional studies is omitted for the sake of brevity. The
overall data collected for all of the studies conducted on this embedded controller are
shown in Table 11.
The third customer studied had a number of settings changes, including a settings
change to an advanced PID configurable element inside their existing configuration. This
change involved the customer changing the increment and decrement limit settings used
by the PID element. The customer changed these values by a large amount and the
process variable spiked rapidly, leading to the controller entering an error state. The
cause was an internal code defect, where the system would first disable increments and
decrements for the PID algorithm, forcing the output to remain the same. This was
accomplished by using a copy of the last output value as the output value for the PID
element. The internal algorithm did not stop calculating the error between the set point
and the current plant value, since it was updating its actual output, leading to larger and
larger changes of the output to correct the perceived error. Once the changes were
complete to the limit settings, the held output value was cleared, and the actual output
was connected. This led to a very large PV value being output from the PID element and
the controller, detecting it, entered the error state. This defect was detected by a TFW
built around the setting that was changed and its users, as a change in the setting value
executed the control flow path which held the output value steady.
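The sketch below is a simplified, hypothetical stand-in for this held-output behavior: while the limit settings are being changed, the element keeps recomputing its internal output, and the externally visible output jumps when the hold is released. The gains and values are invented.

    def simulate_hold_defect(setpoint, plant_value, gain, hold_cycles):
        """Toy model of the defect: the external output is frozen during the
        settings change, but the internal output keeps accumulating the error."""
        internal_output = 0.0
        held_output = internal_output          # value the plant actually sees
        for _ in range(hold_cycles):
            error = setpoint - plant_value     # error keeps being calculated...
            internal_output += gain * error    # ...and the internal output keeps growing
            # held_output never moves while the limits are being changed
        return held_output, internal_output   # reconnecting jumps to internal_output

    held, internal = simulate_hold_defect(setpoint=50.0, plant_value=40.0,
                                           gain=0.5, hold_cycles=100)
    print(held, internal)   # 0.0 vs 500.0: a very large step once the hold clears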
The fourth customer studied changed only settings values in their configuration.
This customer had been improving the overall physical process with better materials and
up front quality control. When the physical quality was good enough, the customer made
a few changes to the advanced PID configurable element to allow for tighter control since
the process had less variation. These changes affected the values of the proportion and
integration settings used by the PID algorithm. Once the changes were made, the process
would drift by 4%, even though the underlying process did not warrant it. Internal to the PID algorithm, there was a small rounding error in the calculation of the new output value, leading to the observed instability. This defect was also detected by a
TFW built around the changed setting values and their uses inside the configurable
element.
The fifth customer studied added a set of previously used configurable elements
to the system. These elements used different values for a small number of settings, so
TFWs were created for each of the settings values that differed from previous
usages. When the customer loaded this configuration and started running it, the controller
crashed and went into error mode. The defect was related to one of the new settings
values used in the added configurable element. There is a latent code defect which is only
revealed when the settings value is set to a number above 16384. The customer had set
the setting value to 18726, which caused the error to be revealed. This defect was
contained inside the TFW created for the setting and its users inside the configurable
element.
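The study does not give the root cause of the 16384 threshold, so the sketch below invents one plausible mechanism, a scaled value overflowing a signed 16-bit intermediate, purely to make this kind of boundary-triggered latent defect concrete. It is a hypothetical illustration, not the actual defect.

    def scale_for_output(setting_value):
        """Hypothetical scaling step: the setting is doubled and stored in what is
        assumed to be a signed 16-bit intermediate, so large values wrap around."""
        scaled = (setting_value * 2) & 0xFFFF
        return scaled - 0x10000 if scaled >= 0x8000 else scaled

    print(scale_for_output(16000))   # 32000: still fits in 16 bits, behaves correctly
    print(scale_for_output(18726))   # -28084: 2 * 18726 exceeds 32767 and wraps negative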
The sixth customer studied also added a set of previously used configurable
elements to their configuration. These elements represented new physical IO devices that
were added to the system, and IO values which are read in from them. The settings values
used were mostly the same between the new instances of the elements and the previous usages. As a result, TFWs were created for the configurable elements and only a few settings. Three EFWs were created, as the new elements were connected in the configuration to previously configured elements by data values being passed to them. The customer loaded this configuration into the controller and started running it. Every once in a while, the data sent to the previously used elements would go bad and then recover a few
seconds later. This defect exists in the interface between the device bus and the
configurable elements, but is only detectable by elements that are connected to it. This
defect was detected by one of the EFWs, as the previously existing configurable elements
were involved in a data relationship with the newly added elements.
Table 11. Summary of Case Study 1

Embedded      # of      # of     # of Added  # of     # of Added  # of     Analysis Time  #     #     # Deadlock  # 3rd
Controller:   Settings  Defects  Used CEs    Defects  New CEs     Defects  (Hours)        TFWs  EFWs  FWs         Pty
Cust 1:          23        1         6          1         1          0         4            30     2      1          0
Cust 2:          11        2         5          1         3          1         1.5          19     1      0          0
Cust 3:          18        1         0          0         0          0         1            18     0      0          0
Cust 4:           8        1         0          0         0          0         0.5           8     0      0          0
Cust 5:           4        0         9          1         0          0         1.5          13     1      0          0
Cust 6:           0        0        21          1         0          0         2            21     3      0          0
Total:           64        5        41          4         4          1        10.5         109     7      1          0
Table 11 shows the summarized results for the entire first case study. In all, 64
settings were changed and 45 configurable elements were added, 41 of which were
instances of previously used elements. These changes led to the creation of 109 TFWs, 7
EFWs, and one Deadlock Firewall. Creation of all of these firewalls took only 10.5 hours,
as there were few EFWs and Deadlock Firewalls created. These firewalls were able to
detect 10 latent software defects originally detected in the field, missing only one
performance defect which requires an additional future firewall to detect.
6.4 Second Case Study
The second case study was conducted on a graphical configuration program that is
used by customers to configure the entire system, from the embedded controllers to
Human System Interface displays and graphics. This system is implemented as a hybrid
of OO designed C++ code and procedurally designed C code. This software includes
5121 files, 39655 functions, 3229 classes, 767431 Executable Lines of Code (ELOC),
2398 configurable elements, and 17 third party components. This software runs on a
standard PC running the Windows operating system. This case study is broken up into
many smaller studies containing different customers and configuration changes. The
main goal of these smaller studies is to show that the Configuration and Settings Firewall
is effective at detecting latent defects found at customer sites that were exposed by
configuration changes. This is accomplished by creating the required firewalls for all of
the changes and then determining if they contain the defect reported by the customer.
Configurations for these systems contain configurable elements which are used to
create the files that are downloaded to the controllers, Human System Interfaces, and
other software products in the system. These configurations are stored as projects
containing a physical layout of the process. Each physical part of the process contains a
link to files that contain a graphical list of configurable elements, settings values, and the
relationships between them. Customers use this product to create the graphical
configurations, compile them, and then load them into the various software products that
use them.
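To make the project structure concrete, the sketch below models a project as a physical layout whose parts link to pages of configurable elements and the references between them. The field names are assumptions for illustration, not the actual project file format.

    from dataclasses import dataclass, field

    @dataclass
    class ConfigurableElement:
        name: str
        element_type: str
        settings: dict = field(default_factory=dict)

    @dataclass
    class ConfigurationPage:
        elements: list = field(default_factory=list)     # graphical list of elements
        references: list = field(default_factory=list)   # (source, target) element links

    @dataclass
    class Project:
        # physical part of the process -> the configuration pages linked to it
        physical_layout: dict = field(default_factory=dict)

    # Example project with invented names:
    boiler_page = ConfigurationPage(
        elements=[ConfigurableElement("TI101", "AnalogIn", {"units": "degC"}),
                  ConfigurableElement("TIC101", "PID", {"gain": 1.2})],
        references=[("TI101", "TIC101")],
    )
    project = Project(physical_layout={"boiler_1": [boiler_page]})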
6.4.1 First Customer Study – GUI System
The first customer study for this GUI-based system involves a customer who was
adding new graphical display elements into their configuration. These changes involved
the addition of previously used configurable elements into the configuration. These added
elements were connected to previously used input values from a field device and allow
these values to be displayed in the Human System Interface by plant operators.
The customer made a set of changes to their configuration to allow a large number
of internal input values to be displayed by the HSI. These were values that were found to
be important after the initial configuration of the plant was complete. The project files
before and after the change were analyzed, and a list of changes was created. This list
was split randomly into two smaller groups of changes and each group had Configuration
and Settings Firewalls created for it.
The first group of changes contained three settings changes and the addition of
two previously used configurable elements. For the settings changes, a set of Settings
Firewalls were created. Settings were mapped to the internal variables in the code which
represent them. These variables were marked as code changed, and TFWs were created for
them. As the TFWs were being created, each changed variable was checked to see if it
was involved in any data flow dependencies. For these settings changes, no data
dependencies were found, and no EFWs were created.
Each of the added configurable elements was previously used elsewhere in the
configuration, so their settings were compared to the previously used instances of these
elements. Only a few settings had changed for each, so Settings Firewalls were created
around those variables and uses, resulting in a set of TFWs. When creating the TFWs, the
variables were checked to see if they were used in any data dependencies. Two of the
added configurable elements had different settings which had relationships to other
configurable elements in the configuration. As a result, EFWs were created for each.
None of the configurable elements or settings changes were involved with any third party
components or blocking calls, so these firewalls were not needed.
The second group of changes studied contained two settings changes and the
addition of one new configurable element. Each of the changed settings was mapped to
the internal variables which represented them, and were marked as code changes. TFWs
were created, and each included checks for data dependencies. None were found, so no
EFWs were created. The newly added configurable element was marked as a code change, and all dependencies, both into and out of it, were marked as affected. These
included both dependencies in the code and dependencies based on the configuration.
TFWs were created first, and data dependencies were checked. One such dependency was
found, and an EFW was created for it. Finally, no blocking calls or third party
components were affected, so these firewalls were not created.
After all of the changes were identified and the firewalls created, the failures
reported from the customer were analyzed. The three added configurable elements, one
new and two previously used, each exposed failures. These failures were related to one
latent defect in a support function for the added elements. This defect involved
connecting configurable elements across different graphical pages of the configuration,
by way of a reference. These references act as helper functions for all configurable
element types, but contained a defect for the two types which were added by the
customer. When the customer compiled the project into configuration files, the compiler
generated an error saying that the compilation failed. The failure was due to no matching
reference being found for these three additions to the configuration.
The EFWs created for the two previously used configurable elements and the new
configurable element contained this defect inside their identified impact areas. The data
dependency itself involved the output of the newly added elements being connected and
used by other elements on other pages through a set of connected cross page references.
These references allow values and elements on one page of the configuration to be
connected to elements that exist on a different page. The defect existed in the processing
of these cross page references, and only occurred when the specific configurable elements were connected through them.
6.4.2 Second Customer Study – GUI System
The second customer study for this GUI system involved a customer upgrading
their Human Systems Interface software. In addition to upgrading the software, the
customer changed their configuration to take advantage of new features in the HSI. Many
of these new features require changes to the settings of existing configurable elements,
allowing for better information to be displayed on the new HSI. The changes that were
made to the configuration were determined by comparing two versions of the customer’s
project. Once the configuration changes were identified, they were broken down into
three groups. The settings changes were split up randomly, but the configuration changes
were grouped together, as they represented replacing one set of elements with a different
set. This change is known to be an atomic set, as the problem description describes it in
high detail. Each group of changes had Configuration and Settings Firewalls created for
it. The group that contained the defect was not known when the firewalls were created.
The first of the three groups contained nine settings changes. Each of the changed
settings was mapped to the underlying code variables which represent them and marked
as a code change. Once complete, TFWs were created around these variables and their
uses, checking for data dependencies as they are constructed. No data dependencies were
found, so no EFWs were created. In addition, no blocking calls or third party components
were impacted, and these firewalls were not needed.
The second group of changes contained the addition of three new configurable
elements and the removal of three others. A set of existing elements were removed and a
different set were added which allowed the customer to take advantage of functionality
provided in the new HSI system. Each removed element was replaced with a new
element that contained additional functionality specific to the new HSI. These changes,
taken together, constitute an atomic change which was done in response to the HSI
upgrade. Since this change was atomic, only one set of firewalls were created for this
change, instead of one for each removal and one for each addition.
The internal code of each newly added configurable element was
considered changed, and TFWs were created for it. These TFWs contain control flow
dependencies from both the code and the configuration itself. While creating the
TFWs, analysis for data dependencies was conducted. No dependencies were found, and
no EFWs were created. In addition, no blocking calls or third party components were
impacted, so these firewalls were not created.
The final group of changes contained four settings changes. These settings changes
affect the format of output data needed by the new HSI. These settings were mapped to
the internal variables and marked as code changes. After this, TFWs were created,
and each setting change was found to impact existing data dependencies. These
dependencies are between the output of the configurable elements whose settings
changed and the configurable elements which send data out to the HSI. They resulted in
EFWs being created for each changed setting. No blocking calls or
third party components were affected, so those firewalls were not created.
Once the various firewalls were created and the impact of the change identified,
the reported failures were studied. There were four failures detected, each resulting in
incorrect data being displayed on the new HSI. The failures caused values to be truncated
to 14 characters, instead of the 16 characters stated in the requirements. All of the other
types of configurable elements that send data to the HSI correctly send 16 characters.
These failures are caused by a single latent software defect contained inside the
configurable elements added in the second set of changes.
This defect exists in the impact identified by the TFWs created for the settings
changes inside the two added configurable elements. The defect can be observed by
checking the output value of the added elements, which requires specific tests on those outputs.
If the TFW impact is too difficult to test, because the affected function is called by system
elements outside the product itself, the EFWs created also contain the defect, as they
include testing the data dependency between the newly added elements and the HSI.
Since the quality of the tests was not the point of this study, the TFW is considered to
have found this defect.
6.4.3 Additional Customer Studies – GUI System
To avoid additional repetition, each additional customer change studied is
summarized in Table 12. Each of the reported customer failures, and the configuration or
settings changes which exposed it, is described separately in detail. The same steps
were followed for these additional studies as were used in Sections 6.4.1 and 6.4.2, but
these details are omitted for brevity. The overall data collected for all of the
studies on this GUI system are shown in Table 12.
The third customer change studied involved adding five configurable elements, all
of which were previously used in the configuration. These added elements represent
redundant controller modules which were added to increase the reliability and safety of
the process. These modules do not contain any new logic, as they represent redundant
modules in the system. When the customer next exported this project for use in their HSI
system, the operation did not export all of the data in the project to the HSI. The defect
underlying this failure involved the algorithm used to export projects to the consoles. This
algorithm exports controllers one at a time, in the order they appear in the project.
When the export processes a redundant module, it finds no logic and continues on.
This works for all cases except when certain configurable elements are configured as
redundant modules. In this case, the number of controllers to export, which is used as the
loop termination value, is based only on the number of primary controllers. As a result,
the data exported is incomplete. Both TFWs and EFWs were created, as the added
configurable elements had data dependencies to many other areas of the software. This
defect, which caused failures for each element added, is contained only in the EFWs
created for these added configurable elements, as the elements are accessed by the export
routine which contained the defect.
The fourth customer change studied involved the customer adding
ten previously used configurable elements to their configuration, eight of one type and
two of another. These elements were additional values needed by the plant operators, and
were added to the configuration loaded into the HSI. Once the configuration was loaded a
failure was observed. The new instances of these configurable elements only had a few
settings values different from previous usages, leading to Settings Firewalls being created
for each changed value. These settings differences did not affect any data dependencies,
so no EFWs were created. Each element of the first type, of which eight were added, required only one
TFW, while each of the two elements of the second type required five TFWs.
These added configurable elements are involved in a dependency with a third party
component. This component was a Microsoft database, which was used to store all of the
configurable elements in the configuration. Due to this dependency, a COTS Firewall
was created. This firewall was created in the reverse direction compared to the code change version.
Instead of finding a change in the module and propagating it out to the high level APIs
that use it, the affected high level API was propagated inward, and all other API
functions which use these changes were marked as affected. The failure occurred due to a
latent software defect involving the database, which used a user-passed parameter as the
index value instead of generating its own index as described in the documentation. When
negative values were passed into this database, as was the case for these newly added
configurable elements, it crashed with an unhandled exception. This latent defect was
contained in the COTS Firewall and was exposed by this change.
The fifth customer change involved a customer who added eight new configurable
elements to the system. These elements represented an analog input module and data
values which were connected to it. These new configurable elements were considered
code changes and TFWs were created for them. These TFWs include both static
relationships, based on the code, and dynamic relationships, based on the configuration
file in use. None of these newly added configurable elements were involved in a data
dependency with other parts of the code or configuration, so no EFWs were created.
Once these changes were made, the customer tried to export the configuration for use in
another software product in the system. The export completed successfully, but when the
configuration was loaded into the other product, eight failures were detected. The failures
involved a number of values being incorrect, specifically the values from the newly
added configurable elements. The underlying latent software defect was contained in the
TFWs created, as the configurable elements' internal export method was called by
the export routine. This required the export routine to be retested, and a failure
corresponding to this defect was detected.
The final customer change studied included 25 settings changes. These settings
changes affect the update rates of the data being displayed. Each of the settings changes
had TFWs created for them. While creating these TFWs, a number of blocking calls were
identified, necessitating the use of a Deadlock Firewall. This firewall, once completed,
identified the potential for deadlock. Since this firewall is an analysis firewall and not a
testing firewall, the likelihood of the deadlock occurring is not determined, just that the
potential exists. The defect itself was an occurrence of deadlock, where the changed
update rates for the displayed values caused the GUI configuration software to lock up
completely. The change in timing was just enough for the latent deadlock to be
observed by the customer when the configuration was changed. Since technical support
and the test labs used hardware of different speeds, they were not able to reproduce this
problem directly, and engineers had to travel to the site to study it. This deadlock was
the same deadlock detected by the Deadlock Firewall. By using the
Configuration and Settings Firewall, this deadlock, and the large expense of diagnosing it on site,
could have been avoided.
Table 12. Summary of Case Study 2

HSI System | # of Settings Changes | # of Defects | # of Added Used CEs | # of Defects | # of Added New CEs | # of Defects | Analysis Time (Hours) | # TFWs | # EFWs | # Deadlock FWs | # 3rd Pty
Cust 1: | 5 | 0 | 2 | 2 | 1 | 1 | 1.5 | 8 | 3 | 0 | 0
Cust 2: | 13 | 0 | 0 | 0 | 3 | 1 | 2.5 | 16 | 4 | 0 | 0
Cust 3: | 0 | 0 | 5 | 1 | 0 | 0 | 0.5 | 5 | 5 | 0 | 0
Cust 4: | 0 | 0 | 10 | 1 | 0 | 0 | 3 | 18 | 0 | 0 | 1
Cust 5: | 0 | 0 | 0 | 0 | 8 | 1 | 0.5 | 8 | 0 | 0 | 0
Cust 6: | 25 | 1 | 0 | 0 | 0 | 0 | 4 | 25 | 0 | 1 | 0
Total: | 43 | 1 | 17 | 4 | 12 | 3 | 12 | 80 | 12 | 1 | 1
6.5 Third Case Study
The goal of the third case study is to run the tests required for two of the
configuration changes to the GUI Configuration product analyzed in the second case
study. Running these tests allows the actual testing time to be measured, and determines
whether any additional existing defects around the change can be detected. Few detailed
tests currently exist to select from when testing the impact around a change. This lack of
detailed tests is due to the way current release testing is performed, where the testing
effort is spread very thinly across representative configurations. As a result, the needed
tests were created with exploratory testing [4], using the impact identified by the created
firewalls along with system knowledge.
While the tests were run on the system, several measures were recorded.
These are shown in Table 13 at the end of this section, with a separate line for each of the
two customer changes tested. The measures are broken into two categories: the time
required and the failures detected.
The measures selected for required time are presented first. The first measure is a
count of the tests run. Analysis time, the time needed in the second case study to
create the firewalls, is taken from that study. The time required to run the
tests is logged as test time and is calculated as the elapsed wall-clock time from the start
of the tests to the end of the tests. The total time is then calculated as the sum of the
analysis and test times. Next, the original time is calculated by summing the time spent
investigating and discussing the problem by technical support, development, and
management. By comparing the original time to the total time, a time savings is computed.
This savings represents the decrease in the time required to reproduce and diagnose the
problems, compared to handling the failures reported from the field.
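The arithmetic behind these time measures is simple; the following minimal Python sketch (the function and variable names are illustrative, not part of the study's tooling) shows how the total time and the time savings percentage reported in Table 13 could be computed from the recorded values.

```python
def time_measures(analysis_hours, test_hours, original_hours):
    """Combine the recorded times into the measures reported in Table 13.

    analysis_hours: time spent creating the firewalls (from the second case study)
    test_hours:     wall-clock time spent running the selected tests
    original_hours: time spent by support, development, and management on the
                    original field-reported problem
    """
    total_hours = analysis_hours + test_hours
    percent_saved = 100.0 * (original_hours - total_hours) / original_hours
    return total_hours, percent_saved

# Values recorded for the first tested change: 2.5 h of analysis,
# 2.5 h of testing, and 42 h spent on the original field report.
total, saved = time_measures(2.5, 2.5, 42)
print(total, round(saved, 2))   # 5.0 88.1
```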
The second category of measures in Table 13 involves failures observed during
the testing. First, the overall number of observed failures is counted. Each of the observed
failures is split into two categories, known and new. New defects do not currently exist in
the defect repository while known defects do. This determination is made using expert
knowledge to compare the observed failure to existing failures described in the defect
repository. Finally, a determination is made if the observed failures contain the failure
reported by the customer.
6.5.1 First Testing Study – GUI System
The first change tested was one of the changes studied in Section 6.4.2. This
change involved adding configurable elements which export values out of the GUI
system and into the HSI. When the change was performed, a failure was detected where
values were truncated to fourteen characters instead of the required sixteen. The TFWs
and EFWs that were created for that study were reused here.
In order to test these changes, the correct version of the product was installed.
Once installed, the configuration in which the failure had been detected was loaded.
Then, tests were created to cover the impact from the TFWs and EFWs generated. These
tests were created by using concepts from the Complete Interaction Sequences (CIS)
method [10]. This method will not be discussed here, but the key concept used involved
testing a required action by creating tests for all of the possible ways in which the GUI
allows that action to occur. For example, in a GUI system, copying a configurable
element can be done either through the Edit menu or by right clicking on the element and
selecting Copy. For this study, both of these actions would have been tested. Using this
method, along with exploratory testing, allowed more test cases to be created and run,
leading to higher overall coverage while still being completed in a short amount of
time.
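The key CIS concept applied here, exercising every GUI path that can trigger a required action, can be sketched as follows. The action and path names are illustrative placeholders; the actual tests were designed manually against the product's GUI.

```python
# Map each required action to every GUI path that can trigger it, then
# generate one test case per (action, path) pair, following the CIS idea of
# covering all the ways the GUI allows an action to occur.
ACTION_PATHS = {
    "copy configurable element": [
        "Edit menu -> Copy",
        "right-click element -> Copy",
    ],
    # further actions from the identified impact area would be listed here
}

def generate_test_cases(action_paths):
    for action, paths in action_paths.items():
        for path in paths:
            yield f"Verify '{action}' via {path}"

for case in generate_test_cases(ACTION_PATHS):
    print(case)
```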
When testing was performed for this impact, a few failures were observed. The
first failure occurs when a settings value is being updated. If the user tries to switch GUI
screens in the middle of updating the settings, they are prompted to save the changes. If
the user selects cancel, the changes are lost and a message appears which says the record
could not be locked. This failure was not found in the failures listed in the defect
repository for this product, and is considered a new failure.
A second failure was found that occurs when the user enters 16 characters into a
description field and tries to export the list of configurable elements. This export
operation fails, as the exported list contains only 14 characters of the text. This failure
matches the customer reported failure in Section 6.4.2 exactly, so it is counted as a
known failure.
A third failure was detected when the customer configuration was first imported
into the tool. The tool reported an error when this operation was first attempted,
displaying only “Non-recoverable Error”. This is a non-descriptive error, as it gives no
information about the issue or resolution, and was listed as a failure. This failure is a
known error as it was detected internally by ABB when performing testing for a service
pack release.
The final observed failure occurred when a user exports the list of configurable
elements. If the user selects an available option on the export dialog box, the resulting
output contains no data. This occurs regardless of what configurable elements are
contained in the list. This defect is not due to configuration and settings changes, but
represents a more traditional latent defect in the functionality of the system. This failure
was the same as one described in the defect repository that was originally reported by a
separate customer one year after the release of this version. Therefore, it is counted as a
known failure.
This first study tested the impact identified by the Configuration and Settings
Firewall for the defect studied in Section 6.4.2. These tests were created and executed in
1.5 hours and detected four separate failures in the released software. One of
these failures had not previously been reported by testing or customers. Two of the failures
were found by other customers, and the final failure was observed by ABB while testing
subsequent releases of the product. A summary of the data collected from this study is
shown in Table 13.
6.5.2 Second Testing Study – GUI System
The second change tested in this study is one from Section 6.4.1. The specific
change tested was the addition of three previously used configurable elements. These
elements were connected across graphical pages by references. When the customer made
these changes, a failure was observed. This failure involved the compiler failing the
compilation of the configuration and providing no error message documenting the issue.
The TFWs and EFWs created for the study in Section 6.4.1 were reused here. The correct
version of the product was installed and the customer configuration that detected the
failure was loaded. Tests were created using the CIS method.
When testing the impact of this change, a number of failures were observed. The
first test involved simply compiling the project. This basic operation exposed a defect,
where the compiler failed due to the customer adding three configurable elements. This
first failure matches the original customer reported failure for this configuration change.
A second failure was observed when testing alternate ways to change settings in
the added configurable elements. If a specific setting contains a value which comes from
a configurable element on a different page of the configuration, then the entire program
crashes when the configurable element is opened for change by the tool. When this
happens, the software must be restarted and all unsaved changes are lost. This failure
matches a failure found in the defect repository that was observed by a separate customer
in the field.
An additional failure was observed while testing the export functionality to verify
that the newly added configurable elements were exported properly. The system seemed to
export correctly, as a file was generated and no errors were detected. Once the file was
opened, however, it was observed that the system had failed to export all of the configurable elements
customer reported failure that was described in the defect repository. After further
analysis, this failure is due to the same defect as the one found by the third customer
change studied in Section 6.4.3. This defect was not found by this customer as export is
not a function they currently use. Since this failure was observed by another customer, it
is counted as a known failure.
One final failure was observed when testing the impact of this change. Any
textual changes to a setting value in the configurable element connected to the newly
added element are not saved. The failure is observed by changing the setting value,
saving the project, and then checking the value of that setting. This failure was found by
another customer when they were updating the settings values of those elements and
existed in the defect repository at the time of this study. The final results of this study are
shown in Table 13.
Table 13. Results from the Third Case Study

GUI System | # of Tests Run | Analysis Time (Hours) | Test Time (Hours) | Total Time (Hours) | Original Time (Hours) | % Time Saved | # of Failures | # of Known Failures Detected | # of New Failures Detected | Reported Failures Found?
Change 1: | 25 | 2.5 | 2.5 | 5 | 42 | 88.10% | 4 | 3 | 1 | Yes
Change 2: | 18 | 1.5 | 2 | 3.5 | 51 | 93.14% | 4 | 4 | 0 | Yes
Total: | 43 | 4 | 4.5 | 8.5 | 93 | 90.62% | 8 | 7 | 1 | 100%
6.5.3 Summary of Third Case Study
The third case study shows that performing the testing identified by the firewalls
can detect the original customer-found failures based on the configuration changes, as
well as additional failures in areas around the change. Seven of these failures were
reported by customers at a later point in time than the original configuration change,
representing defects that would have been found by ABB before customers observed
them. In addition, one new defect was found. This defect was likely observed in the field
by a customer but not considered important enough to report back to ABB.
6.6 Fourth Case Study
The fourth case study aims to better describe the latent defects which cause the
failures reported by customers, specifically latent defects related to configuration and
settings changes. In this study, 210 customer defects reported against the embedded
controller were studied, as well as 250 customer defects reported on the GUI
configuration system. These defects were reported by more than 150 different customers
from sites located around the world.
6.6.1 Taxonomy Overview
Each defect was classified into a slightly modified Beizer Defect Taxonomy [2].
This taxonomy splits defects into eight main categories, each describing a specific
grouping of defects around a specific characteristic. Each main category is
further refined three additional times, each refinement representing a different, more specific
sub-level. A defect is then assigned a four-digit number, with each digit representing a
category. The first digit is the main category, followed by progressively more detailed
subcategories for digits two through four. For example, processing bugs would be 32xx,
where the three designates a structural defect and the number two further refines this
defect into the subcategory of processing. The last two numbers, shown as x here, would
refine the defect to more levels of detail if they were used. It is not necessary to specify a
defect to all four levels of the taxonomy. The eight main categories are shown in Table
14.
Table 14. Beizer’s Taxonomy’s Major Categories
1xxx Functional Bugs: Requirements and Features
2xxx Functionality As Implemented
3xxx Structural Defect
4xxx Data Defect
5xxx Implementation Defect
6xxx Integration Defect
7xxx System and Software Architecture Defect
8xxx Test Definition or Execution Bugs
For this study, only the first two levels of the taxonomy are used, and the resulting
classification is displayed without the trailing x's. An overview of the subcategories is
shown in Tables 15-19 and the surrounding paragraphs. This overview is meant to clarify
the taxonomy and its use here, not to describe how the defects were classified.
The first subcategory used is Functional Bugs. These defects deal with errors in
the requirements themselves. This subcategory includes defects dealing with incomplete,
illogical, unverifiable, or incorrect requirements. The specific subcategories can be seen
in Table 15.
Table 15. Beizer’s Taxonomy’s Functional Bugs
11xx Requirements Incorrect
12xx Logic
13xx Completeness
14xx Verifiability
15xx Presentation
16xx Requirements Changes
The second subcategory, Functionality as Implemented, deals with defects where
the requirements are known to be correct, but the implementation of these requirements
in the software product was incorrect, incomplete, or missing completely. This includes
defects due to incorrect implementation, missing cases, incorrect handling of ranges of
values, and missing exceptions or error messages. These subcategories can be seen in
Table 16.
Table 16. Beizer’s Taxonomy’s Functionality as Implemented Bugs
21xx Correctness
22xx Completeness – Features
23xx Completeness – Cases
24xx Domains
25xx User Messages and Diagnostics
26xx Exception Conditions Mishandled
The next two subcategories deal with low level developer defects which exist in
the source code. Structural defects deal with control flow predicates, loop iteration and
termination, and control state defects. The Data defects subcategory deals with defects
such as initialization of variables, scope issues, incorrect types, and manipulation of data
structures. The subcategories for both of these defect types are shown in Table 17.
Table 17. Beizer’s Taxonomy’s Structural & Data Bugs
31xx Control Flow and Sequencing
32xx Processing
41xx Data Definition, Structure, Declaration
42xx Data Access and Handling
Two other categories, Implementation and Integration bugs, deal with errors such
as simple typos, code not meeting coding standards, or documentation problems. Specific
errors of these types include missing or incorrect code comments, mistyped or copy-paste
issues, and violations of department or company coding standards. Integration defects, on
the other hand, are errors in the interfaces, both internal and external, that make up the
software system. The subcategories for both Implementation and Integration are shown
in Table 18.
Table 18. Beizer’s Taxonomy’s Implementation & Integration Bugs
51xx Coding and Typographical
52xx Standards Violations
53xx Internal Documentation
54xx User Documentation
55xx GUI Defects
61xx Internal Interfaces
62xx External Interfaces
The final two categories of defects deal with System and Test defects. System
defects comprise errors in the architecture, OS, compiler, and failure recovery of the
system under test. Test defects represent errors found in the test descriptions,
configurations, and test programs used to validate the system. These last two
subcategories are shown in Table 19.
Table 19. Beizer’s Taxonomy’s System and Test Bugs
71xx OS
72xx Software Architecture
73xx Recovery and Accountability
74xx Performance
75xx Incorrect Diagnostic
76xx Partitions and Overlays
77xx Environment
78xx 3rd Party Software
81xx Test Design
82xx Test Execution
83xx Test Documentation
84xx Test Case Completeness
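As a concrete illustration of how a two-digit classification resolves against the tables above, the following minimal Python sketch (a hypothetical helper, not part of the study's tooling) encodes a small portion of the taxonomy and looks up a code such as 32.

```python
# Major categories (first digit) with a sample of their sub-levels (second
# digit), taken from Tables 14-17; the remaining entries would follow the
# same pattern.
TAXONOMY = {
    "1": ("Functional Bugs: Requirements and Features",
          {"1": "Requirements Incorrect", "2": "Logic", "3": "Completeness"}),
    "2": ("Functionality As Implemented",
          {"1": "Correctness", "5": "User Messages and Diagnostics"}),
    "3": ("Structural Defect",
          {"1": "Control Flow and Sequencing", "2": "Processing"}),
    "4": ("Data Defect",
          {"1": "Data Definition, Structure, Declaration",
           "2": "Data Access and Handling"}),
}

def classify(code):
    """Return (major category, subcategory) for a two-digit code such as '32'."""
    major, sublevels = TAXONOMY[code[0]]
    return major, sublevels.get(code[1], "unspecified")

print(classify("32"))   # ('Structural Defect', 'Processing')
```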
6.6.2 Embedded Controller Defect Classification
The taxonomy was used to classify all customer defects reported against the
embedded controller used in the first case study. Specifically, defects which were either
contained in configurable elements or exposed from changes to configurable elements
were selected. In total, 204 defects were found. Of these, 45 defects were still under
investigation at the time of the analysis and no fix had been made. The reports for these
defects did contain enough information on the failure to determine if they were
configuration related, so they were included in the study. Once all of the defects were
studied, 82 of the 204 defects existed in configurable elements or were related to
configuration changes. These 82 defects were classified using the taxonomy and the
results are shown in Figure 33.
Figure 33 shows that the majority of configuration related defects are low level
code problems, which are classified as 31, 32, 41, and 42. In addition, a large number of
user documentation problems were found. These defects mostly dealt with configuring
the system into a state that was not supported by ABB, but the documentation did not
explicitly prohibit those configurations. As a result, the documentation was updated to
better describe the illegal configurations.
Figure 33. Classification of Embedded Controller Defects (bar chart of Count of PRC ID by taxonomy classification)
It is possible to map these defect types to the firewalls that have the best chance of
finding them. Traditional Firewalls are ideal for detecting defects of type 31 and 32, since
these defects deal directly with control flow and internal function processing. In addition,
defects of type 21, 22, 23, and 24 are also found by TFWs, as they deal with incorrect or
missing implementation of the configurable element, as well as ranges of acceptable
values inside that element. These defects are exposed by either settings changes or by
adding the element to the configuration. In addition, defects of type 61 are detectable by
the TFW, as these defects represent interfaces which are internal to the configurable
element itself. Finally, many of the defects of type 55 are found by TFWs. These defects
usually just involved making a configuration settings change which leads to either a crash
or an obvious violation of the intended change, such as no change in the behavior of the
system. In total, 62% of the reported configuration defects can be detected by the TFW.
As a validity check, all of the defects found in the first and second case studies by the
TFW were classified. The results show that all of these defects were included in these
types.
Extended Firewalls are ideal for detecting defects of type 41 and 42, since defects
of these types involve data access, manipulation, and computation. Many of the defect
reports also show that the observed failure for these types of defects was not
contained inside the configurable element, but resided in a different element or the
system itself. In addition, defects of type 25 and 26 represent defects that EFWs are
effective at finding. These types deal with user messages and exception handling. These
defects are usually observed outside the element itself, and mostly are related to data
flows. Finally, EFWs are effective at finding defects of type 62 and 63, representing
external and configuration interfaces, respectively. In total, 29% of the reported defects in
configurable elements can be detected by the EFW. Just as with the TFW types, all of the
defects found by EFWs in the first two case studies were classified and their respective
classifications were all included in these types. Finally, Deadlock and COTS Firewalls
are able to detect 3% of the customer defects.
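The mapping from classification codes to the firewall expected to detect them can be tallied mechanically. The sketch below uses the code-to-firewall assignment described above for the embedded controller; the list of classified defect codes passed in would come from the repository data behind Figure 33 and is not reproduced here.

```python
from collections import Counter

# Two-digit classification codes mapped to the firewall type expected to
# detect them, as described above for the embedded controller study.
FIREWALL_FOR_CODE = {
    **{c: "TFW" for c in ("21", "22", "23", "24", "31", "32", "55", "61")},
    **{c: "EFW" for c in ("25", "26", "41", "42", "62", "63")},
    # remaining codes are either covered by Deadlock and COTS Firewalls or
    # need new code change-based firewall models
}

def firewall_coverage(defect_codes):
    """Given one classification code per defect, return the percentage of
    defects each firewall type is expected to detect."""
    tally = Counter(FIREWALL_FOR_CODE.get(code, "other") for code in defect_codes)
    total = len(defect_codes)
    return {firewall: 100.0 * count / total for firewall, count in tally.items()}
```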
The remaining defects, those of type 71 and 72, require new code change-based
firewall models to be created in order to detect them, both for traditional code changes
and configuration and settings changes. These types of defects are architectural and
performance defects, which include memory leaks, starvation, timing and race conditions,
and overall performance stability. Once code change-based firewalls are created for these
defect types, the firewalls can be used in the Configuration and Settings Firewall in the
same way that the currently existing firewalls are. They are not critical at this point, as
they represent only 6% of the total defects, but they do represent the largest group of
defects that are not detectable by firewalls as of now.
6.6.3 GUI System Defect Classification
The taxonomy was also used to classify recent customer defects reported against
the GUI Configuration system used in the second case study. For this product, the 250
most recent customer defects were studied. Of these, 77 were classified using the
taxonomy, as they were either contained in configurable elements or were related to
changes in the configuration and settings. The result of using the taxonomy to classify
these 77 defects is shown in Figure 34.
Figure 34. Classification of GUI System Defects (bar chart of Count of PRC ID, Composer Database, by taxonomy classification)
Figure 34 shows a few similarities to the classified defects from the embedded
controller. First, many of the customer-reported configuration defects are low level code
problems, classified as 31, 32, 41, and 42. Second, a large number of user documentation
problems were found. In this case, the defects dealt with prohibited values, ranges, and
combinations which were not explicitly prohibited by either the software or the
documentation. Management decided that the documentation would be updated to better
describe the illegal configurations, as opposed to prohibiting these modes of operation in
the software itself.
One difference found in the data between the embedded controller and the GUI
configuration system deals with defects in the category of Functionality as Implemented,
labeled as 21, 22, or 23. There were more of these defects found in the GUI system than
in the embedded controller. Further analysis was done on these defects, and it was
determined that the difference was related to the level of detail in the product
requirements. For the embedded controller, the detail was high as its functions were well
understood and fully documented. The GUI system, on the other hand, had a low level of
detail for many parts of the requirements. These low detail requirements led to defects
where the implementation was incorrect, type 21, or where the implementation was
correct for all existing cases, but certain other cases were missing.
Low detail requirements also led to a large number of defects having a resolution
type of Not a Problem, Works as Designed, or Will Not Fix. Defects of these types
involve customers submitting defects which represent behavior they believe to be
incorrect. Once the defect is received, product management, marketing, and development
together decide that the product was never meant to do this, resulting in no code change
to resolve the issue. These defects represent a misunderstanding of the requirements
between management, development, and the customers. In total, 240 defects of these
three types were reported from customers, out of 894 total customer reported defects.
This represents 27% of all customer reported defects, compared to only 19% in the
embedded controller study. A more formal study of these relationships is ongoing now
and is outside the scope of this research.
The mapping of defect type to firewall remains the same for the GUI system.
TFWs are effective at detecting defects of type 21, 22, 23, 24, 31, 32, 55 and 61. The
justification remains the same as for the embedded controller study in 6.6.2. In total, 61%
of the reported customer defects for the GUI system can be detected by the TFW. EFWs
are effective at detecting defects of type 25, 26, 41, 42, 62, and 63. Overall, defects of
these types make up 25% of the total reported customer defects for this GUI system, and
require an EFW to be created. Deadlock and COTS Firewalls are able to find some of the
defects of type 72 and all of the defects of type 75. These represented 5% of the total
defects. Finally, defects that require new code change-based firewall models to be created
represented 9% of the defects.
The fourth case study shows the common types of defects that are exposed by
configuration changes, as well as the relative frequency of occurrence of each type
compared to the other types. In particular, low level coding issues that are only observed
when certain configurations and settings are selected represented 47% of the total defects
found by configuration change. This may seem higher than expected, but analysis of these
defects shows that the code paths containing the defects were never executed before, due
to the prohibitively large number of configurations and settings possible in the system.
By mapping certain types of defects to firewalls that have the best chance of finding
them, it is shown that TFWs are able to detect 61% of the customer exposed defects.
EFWs, even with the larger effort required to create them, are still very beneficial, and
can detect 27% of the defects. Deadlock and COTS Firewalls are able to detect 4%, on
average, of the customer defects. These defects, while small in number, usually lead to long fix
times and greater customer dissatisfaction. Finally, on average, 7% of the defects reported
are not able to be detected reliably by any of the current firewall models. These defects
will require new code change-based firewalls to be created, and the defects were mostly
memory leaks and performance issues.
6.7 Fifth Case Study
The goal of the fifth and final case study is to show that configurable elements are
easier to analyze and test than the system as a whole. Section 4.5 discussed the time
complexity of analyzing and creating firewalls for user-configurable systems and
identified some key factors which affect the analysis time. These factors include the
number of control flow dependencies and the number and length of data flow
dependencies, each of which represents paths through the code.
In order to describe these important factors, a set of static metrics were selected
and collected. These metrics include:
1. Call depth – A measure of the maximum calling depth of a function which
is related to the maximum length of a dependency.
2. Fan in – A measure of the number of functions that call a specific
function, which is combined with fan out to represent the number of
control flow dependencies.
3. Fan out – A measure of the number of functions that a specific function
calls, which is combined with fan in.
4. Global Variables Used – A measure of the number of global variables
referenced in a function which is used to describe the number of data flow
dependencies.
5. ELOC / Method – A measure of the size of a function which is used with
cyclomatic complexity to quantify the number of tests needed.
6. Cyclomatic Complexity [50] – A measure of the number of independent
paths through the code which is used with ELOC / Method.
These metrics were calculated from the source code for the embedded controller
used in the first study. The directory layout of the source code for this product made it
very easy to determine a logical split between configurable elements and the rest of the
system. The GUI system used in the second study was not included in this study, as the
code for its configurable elements is mixed together with the code for the rest of the
system. While it is possible to find each configurable element in the code, splitting
configurable elements from the rest of the system for every function and class would take
a large effort.
Each of these metrics was run on the system with Understand for C++ [51], a
commercial tool to calculate source-based metrics with a customizable Perl engine. Each
metric was collected twice, once on the system as a whole and once on only the source
code representing the configurable elements in the system. Afterwards, the results were
imported into Excel and saved. Inside the Excel workbook two sheets were made, one
containing all of the functions, and one containing just those functions that exist inside
configurable elements. Excel was then used to summarize each metric into small
comparable values, including minimum, maximum, mean, standard deviation, and
median.
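The spreadsheet summarization described above is straightforward to reproduce; a minimal Python sketch is shown below, assuming the per-function metric values exported from the tool are already available as two lists (the file handling is omitted).

```python
import statistics

def summarize(values):
    """Reproduce the summary statistics reported in Table 20 for one metric."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "st.dev.": statistics.stdev(values),
        "median": statistics.median(values),
    }

# all_functions would hold the metric values for every function in the system,
# and ce_functions only those for functions inside configurable elements.
# summary_all = summarize(all_functions)
# summary_ces = summarize(ce_functions)
```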
These values are simple to calculate and allow the claims made in this study to be
validated. They are shown in Table 20, where each metric group has two sets of recorded
values: All, representing the value for the whole system, and CEs, representing the value
for the source code of the configurable elements. As the value increases for each of these
metrics, the analysis and test time increase.
Table 20. Summary of Source Metrics

Metric | Group | Min | Max | Average | St.Dev. | Median
Call Depth | All | 0 | 10 | 1.4398607 | 2.1212166 | 0
Call Depth | CEs | 0 | 9 | 1.407679 | 1.839869 | 1
Fan In | All | 0 | 598 | 7.395059 | 40.05258 | 1
Fan In | CEs | 1 | 48 | 1.937677 | 3.959914 | 1
Fan Out | All | 0 | 186 | 8.490187 | 15.6977 | 1
Fan Out | CEs | 1 | 110 | 5.011331 | 10.01778 | 1
Globals Used | All | 1 | 206 | 12.951104 | 15.65772 | 7
Globals Used | CEs | 1 | 206 | 14.64268 | 17.04165 | 7
LOC / Method | All | 1 | 578 | 32.74362 | 49.80465 | 15
LOC / Method | CEs | 1 | 405 | 26.54504 | 40.15872 | 10
Cyclomatic | All | 1 | 169 | 5.237573 | 8.819084 | 2
Cyclomatic | CEs | 1 | 70 | 4.21813 | 6.747848 | 2
When comparing the All column to the CEs column for each metric in Table 20, a
decrease of some magnitude is visible for all categories except globals used. After further
analysis, it was found that the embedded controller module’s proprietary OS contains no
memory manager. Due to this, all allocations of memory are made from a large, statically
allocated array which is treated by Understand as a global variable. In addition, all other
standard system functions offered by the operating system also access this global memory
array, leading to global variable access for all operations on semaphores, inter-task
messages and signals through message queues, and all buffer access. This causes too
many false positive values for globals used, so it was removed from further analysis. All
of the other values showed some amount of decline between the whole source code, in
the All column, and the configurable elements, in the CEs column.
In order to draw conclusions from these data, a statistical hypothesis test was
conducted on each metric type. A two-sample t-test for unequal variances was
used. The null hypothesis for each test was that there is no difference between
the means of the two samples. If this hypothesis cannot be rejected, the conclusion is that the metric
calculated for the whole system is not statistically different from the metric calculated for
just the configurable elements. If the hypothesis is rejected, the means are
statistically significantly different and, given the observed sample means, the measure for the
configurable elements is smaller than the measure for the program as a whole.
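The same test can be reproduced outside of a spreadsheet; the following minimal sketch uses SciPy's implementation of Welch's two-sample t-test, assuming the two per-function samples for a metric are available as lists.

```python
from scipy import stats

def compare_metric(all_values, ce_values, alpha=0.05):
    """Welch's two-sample t-test (unequal variances) comparing a metric for the
    whole system (all_values) against the configurable elements (ce_values)."""
    t_stat, p_two_tail = stats.ttest_ind(all_values, ce_values, equal_var=False)
    return t_stat, p_two_tail, p_two_tail < alpha

# Run on the exported samples, this should reproduce the t statistics and
# two-tail P values reported in Tables 21 through 25.
```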
Table 21. T-test Results for Call Depth
t-Test: Two-Sample Assuming Unequal Variances

 | All Depth | CE Depth
Mean | 1.439478261 | 1.407679277
Variance | 4.495814652 | 3.385116137
Observations | 5750 | 1771
Hypothesized Mean Difference | 0 |
df | 3342 |
t Stat | 0.612732905 |
P(T<=t) one-tail | 0.270047326 |
t Critical one-tail | 1.6453097 |
P(T<=t) two-tail | 0.540094652 |
t Critical two-tail | 1.960674021 |
The first metric set compared was call depth. This metric measures the length of
the longest sequence of function calls that contains the measured function. As call depth
increases, so does the length of control flow paths, and possibly of data flow paths if they
exist in the system, which tends to lead to a longer EFW creation time. Performing the
hypothesis test yielded the results shown in Table 21. These results show that the
hypothesis of equal means cannot be rejected, as the P value is 0.27, which is greater than
0.05. Therefore, the call depth for the program as a whole and for the configurable
elements is not statistically different.
The second metric set compared was fan in. This metric represents the number of
functions that call the specific function being measured. It is one of the usual measures of
coupling in a system. A higher fan in means more functions can be impacted by a change
in the function being measured. In addition, a higher fan in leads to a larger analysis time
for the TFW and EFW, as each direct caller and callee must be analyzed when creating
these firewalls. The results of performing this test show that the hypothesis that the
means are equal can be rejected, as the P value is ~0, which is less than 0.05.
Therefore, the fan in of the configurable elements is smaller than the fan in for the entire
system. The details are shown in Table 22.
Table 22. T-test Results for Fan In
t-Test: Two-Sample Assuming Unequal Variances

 | All Fanin | CE Fanin
Mean | 7.395058878 | 1.937677054
Variance | 1604.209481 | 15.6809209
Observations | 4331 | 1765
Hypothesized Mean Difference | 0 |
df | 4534 |
t Stat | 8.86137118 |
P(T<=t) one-tail | 5.57795E-19 |
t Critical one-tail | 1.645189773 |
P(T<=t) two-tail | 1.11559E-18 |
t Critical two-tail | 1.960487286 |
The third metric set compared was fan out. This metric represents the number of
functions that are called from the specific function being measured. This is also one of
the standard measures of coupling in a software system. A higher fan out means more
functions could be affected by a change in the function being measured. In addition, a
higher fan out also leads to a larger analysis time when creating both TFWs and EFWs,
as the algorithm examines each caller and callee. The results of this hypothesis test are
shown in Table 23. The results show that the hypothesis can be rejected, as the P value is
~0, which is less than 0.05. Therefore, the fan out of the configurable elements is smaller
than the fan out for the entire system.
Table 23. T-test Results for Fan Out
t-Test: Two-Sample Assuming Unequal Variances

 | All Fanout | CE Fanout
Mean | 8.490187024 | 5.011331445
Variance | 246.4176289 | 100.3558806
Observations | 4331 | 1765
Hypothesized Mean Difference | 0 |
df | 5015 |
t Stat | 10.3145661 |
P(T<=t) one-tail | 5.33694E-25 |
t Critical one-tail | 1.645157526 |
P(T<=t) two-tail | 1.06739E-24 |
t Critical two-tail | 1.960437078 |
The fourth metric set compared was executable lines of code per method (ELOC /
Method). This metric is computed for each method in the system by counting the lines of
executable code inside it. Comments, blank lines, and braces are not counted
unless they share a line with executable code. This measure is the standard
measure for the size of a function. A higher ELOC / Method count usually indicates a
function with less potential reuse and less cohesion, as the function performs
too many unrelated tasks. The results in Table 24 show that the hypothesis
can be rejected, as the P value is ~0, which is less than 0.05. Therefore, the ELOC /
Method of the configurable elements is smaller than that of the system as a whole.
Table 24. T-test Results for LOC / Method

 | All Count LOC / Method | CEs Count LOC / Method
Mean | 32.74361845 | 26.54504249
Variance | 2480.503347 | 1612.723168
Observations | 4466 | 1765
Hypothesized Mean Difference | 0 |
df | 3979 |
t Stat | 5.113989091 |
P(T<=t) one-tail | 1.65139E-07 |
t Critical one-tail | 1.64523667 |
P(T<=t) two-tail | 3.30279E-07 |
t Critical two-tail | 1.960560307 |
The fifth and final metric set compared was Cyclomatic Complexity [50]. This
metric is computed for each method in the system as v(G) = e − n + 2p, where
G is the method's control flow graph, e is the number of edges in the graph, n is the number of
nodes in the graph, and p is the number of connected components. As this number
increases, the number of paths that must be covered by tests increases. This leads to a
larger testing effort, as high code coverage is a common measure of thorough testing [2].
The results of the hypothesis test show that the hypothesis can be rejected, as the P value
is ~0, which is less than 0.05. Therefore, the cyclomatic complexity of the configurable
elements is smaller than that of the system as a whole. The details are shown in Table 25.
Table 25. T-test Results for Cyclomatic Complexity
t-Test: Two-Sample Assuming Unequal Variances

 | All Cyclomatic | CEs Cyclomatic
Mean | 5.237572772 | 4.218130312
Variance | 77.77624531 | 45.53345795
Observations | 4466 | 1765
Hypothesized Mean Difference | 0 |
df | 4194 |
t Stat | 4.904046521 |
P(T<=t) one-tail | 4.87183E-07 |
t Critical one-tail | 1.645217029 |
P(T<=t) two-tail | 9.74366E-07 |
t Critical two-tail | 1.960529726 |
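As a small worked example of the v(G) computation defined above (using a hypothetical flow graph, not one taken from the studied code), a method containing a single if-else statement has five nodes, five edges, and one connected component, giving two independent paths.

```python
def cyclomatic_complexity(edges, nodes, components=1):
    """v(G) = e - n + 2p for a control flow graph."""
    return edges - nodes + 2 * components

# Hypothetical method with a single if-else:
#   nodes: entry, condition, then-branch, else-branch, exit       -> n = 5
#   edges: entry->cond, cond->then, cond->else, then->exit,
#          else->exit                                              -> e = 5
print(cyclomatic_complexity(edges=5, nodes=5))   # 2 independent paths
```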
The fifth case study clearly shows that, for the system studied, configurable
elements are smaller, more encapsulated, and less coupled than the system as a whole.
This shows that, on average, the underlying code change-based firewalls needed for the
Configuration and Settings Firewall are easier to create on configurable elements than
they would be on the system as a whole. In addition, these data were used for the average
case time complexity analysis in Section 4.5.
This study presents a set of static measures of the source code used in the first
case study. The data collected show that configurable elements have smaller values for
each of the remaining key metrics. Since smaller values lead to shorter analysis and test
times, configurable elements themselves are easier to analyze and test than the entire
system. This was demonstrated statistically and supports the discussion of time complexity
in Section 4.5. Finally, the Configuration and Settings Firewall only creates code
change-based firewalls on configurable elements, which require less analysis time than
those created on the system as a whole. Combining this with the empirical data collected
on code change-based firewalls in Chapter 3, which show the efficiency of these firewalls
on user-configurable systems, the Configuration and Settings Firewall itself is efficient on
these systems.
7. Conclusions and Future Work
User-configurable systems present many difficult challenges to software testers.
Combinatorial problems prevent exhaustive testing before release, leaving many latent
defects in the software after release. Customers are then at risk of exposing these defects
at a later point in time, whenever they make changes to their running configuration.
Current methods for regression testing systems are based on code changes and rely on
differences in various software artifacts, such as source code, metadata, or executable
images, to determine impact on the system. Due to this, current RTS methods are not
directly applicable to the problem where configuration changes reveal latent software
defects.
The Configuration and Settings Firewall was created as a solution to this problem.
This method allows incremental testing of user-configurable systems by determining the
impact of each customer change on the system as a whole, and determining what
retesting is needed, if any. Impact analysis is determined by mapping configurable
elements and settings onto the code in the system that implements them and treating that
code as changed. Once the mapping is complete, the impact is propagated through the
system and tests are selected or created, using existing code change-based RTS methods.
In addition, these existing RTS methods have been validated for industrial use on user-configurable systems.
A set of five case studies was performed on the Configuration and Settings
Firewall, showing its efficiency and effectiveness at detecting customer-found defects in
real, deployed industrial systems. These studies analyzed 460 reported failures on two
very large user-configurable systems, each of which is used at thousands of locations
around the world. The results of the studies show that each of the reported customer
defects would have been detected by this method, as well as some additional defects
found later in the system by other customers and testers. In addition, the analysis time
required to create this new firewall is not substantial compared to the cost of diagnosing
and fixing the problems found at a customer site.
Future research on user-configurable systems is very important, as these systems
are becoming more widely used for critical applications. In the short term, a better
definition of user-configurable systems is an important contribution still needed. Any
definition should cover the degree of configuration allowed, as almost all programs today
allow some level of customization. In addition, an understanding of these systems needs
to be published. This should include a detailed analysis of the source code, defects, and
customer usage of these systems. An initial set of this data is included in this work, but
more information needs to be collected and published.
The first main area of future work is initial release testing of these user-configurable systems. Previous work in that area, such as [33, 34], shows some
techniques which may work on smaller systems with fewer configurable elements and
settings. These techniques will need to be studied on software with a larger number of
configurable elements and settings, such as ERP systems and industrial control systems.
Chapter 5 presented a proposal for a way to initially test the software before
release. This proposal needs to be further refined into a method which can be empirically
studied. This may involve adding elements from previous work on testing user-configurable
systems. A main focus of this proposed release testing process is testing relevant
customer configurations. This will allow the method to complement the use
of the Configuration and Settings Firewall.
Another area of future research involves creating the firewall models for the
additional forms of impact analysis identified in Chapter 3, such as performance and
memory leaks. Changes that impact these types of dependencies are not currently
supported for either code change-based RTS or configuration change-based RTS today.
These defects represent only 6-9% of the defects studied so far, but the gap is a large risk
to the method's mainstream adoption by industry.
Besides the research areas above, which aim to increase the effectiveness of the
testing performed, a reduction in the effort required to release these systems should be
researched. One possible way to reduce this effort involves applying research in
execution profiling to this area. Currently, no real understanding exists of how user-configurable
systems run in the field. It is common to compare the static configurations
between two different uses of the system, but execution information provides much more
data on how events and user interactions caused the system to run. Besides providing
a better understanding of the system, these methods may allow a further reduction in the
testing required when using the Configuration and Settings Firewall for two similar
customers making the same changes to their configuration. This reduction would
significantly reduce the overhead that the software vendor incurs for each customer
configuration change.
Another reduction is possible with automation. Each system using this method
should have access to a configuration differencing tool, either proprietary or third party.
These tools, combined with recent advances in semantic and static impact propagation,
allow many steps of the firewall creation to be automated. Also, some research into the
feasibility of a web system to determine the difference and impact of a configuration
change needs to be done. This system would enable customers to propose some changes
and get fast feedback on how large of an impact these changes may have on the software.
While this system must protect proprietary information, it should be possible to provide a
feature or requirement level impact, along with an overall measure of system impact.
8. References
[1]
IEEE, "IEEE Standard Glossary of Software Engineering Terminology," IEEE
Standard 610.12, 1990.
[2]
Beizer, B. Software Testing Techniques, Second Edition. International Thompson
Computer Press, Boston, 1990.
[3]
Bach, J. Satisfice, Inc. ALLPAIRS test generation tool, Version 1.2.1.
http://www.satisfice.com/tools.shtml, 2004.
[4]
Kaner, C, Bach, J., and B. Pettichord. “Lessons Learned in Software Testing: A
Context Driven Approach,” Wiley Publishing, New Jersey, 2001.
[5]
Sommerville, Ian, “Software construction by configuration: Challenges for
software engineering research”. ICSM 2005 Keynote presentation, Budapest,
September 2005.
[6]
J. Bible, G. Rothermel, and D. Rosenblum, "A Comparative Study of Coarse- and
Fine-Grained Safe Regression Test-Selection Techniques," ACM Transactions on
Software Engineering and Methodology, vol. 10(2), pp.
[7]
L. White and B. Robinson, "Industrial Real-Time Regression Testing and
Analysis Using Firewall," in International Conference on Software Maintenance,
Chicago, 2004, pp. 18-27.
[8]
K. Abdullah, J. Kimble, and L. White, "Correcting for Unreliable Regression
Integration Testing," in International Conference on Software Maintenance, Nice,
France, 1995, pp. 232-241.
[9]
L. White and K. Abdullah, "A Firewall Approach for the Regression Testing of
Object-Oriented Software," in Software Quality Week San Francisco, 1997.
[10]
L. White, H. Almezen, and S. Sastry, "Firewall Regression Testing of GUI
Sequences and Their Interactions," in International Conference on Software
Maintenance, Amsterdam, The Netherlands, 2003, pp. 398-409.
[11]
H. Leung and L. White, "A Study of Integration Testing and Software Regression
at the Integration Level," in International Conference on Software Maintenance,
San Diego, 1990, pp. 290-301.
[12]
H. Leung and L. White, "Insights into Testing and Regression Testing Global
Variables," Journal of Software Maintenance, vol. 2, pp. 209-222, December
1991.
[13]
L. White and H. Leung, "A Firewall Concept for both Control-Flow and Data
Flow in Regression Integration Testing," in International Conference on Software
Maintenance, Orlando, 1992, pp. 262-271.
[14]
J. Zheng, B. Robinson, L. Williams, and K. Smiley, "Applying Regression Test
Selection for COTS-based Applications," in 28th IEEE International Conference
on Software Engineering (ICSE'06), Shanghai, P. R. China, May 2006, pp. 512-521.
[15]
R. Arnold and S. Bohner, Software Change Impact Analysis: Wiley-IEEE
Computer Society Press, 1996.
[16]
T. Ball, "On the Limit of Control Flow Analysis for Regression Test Selection,"
in ACM SIGSOFT International Symposium on Software Testing and Analysis,
Clearwater Beach, FL, March 1998.
[17]
S. Bates and S. Horwitz, "Incremental Program Testing Using Program
Dependence Graphs," in 20th ACM Symposium on Principles of Programming
Languages, January 1993, pp. 384-396.
[18]
P. Benedusi, A. Cimitile, and U. D. Carlini, "Post-Maintenance Testing Based on
Path Change Analysis," in Conference on Software Maintenance, October 1988,
pp. 352-361.
[19]
D. Binkley, "Reducing the cost of Regression Testing by Semantics Guided Test
Case Selection," in International Conference on Software Maintenance, October
1995, pp. 251-260.
[20]
T. L. Graves, M. J. Harrold, Y. M. Kim, A. Porter, and G. Rothermel, "An
Empirical Study of Regression Test Selection Techniques," ACM Transactions on
Software Engineering and Methodology, vol. 10(2), pp. 184-208, 2001.
[21]
R. Gupta, M. J. Harrold, and M. L. Soffa, "An Approach to Regression Testing
Using Slicing," in Conference on Software Maintenance, November 1992, pp.
299-308.
[22]
M. J. Harrold and M. L. Soffa, "Interprocedural Data Flow Testing," in Third
Testing, Analysis, and Verification Symposium, December 1989, pp. 158-167.
[23]
M. J. Harrold and M. L. Soffa, "An Incremental Approach to Unit Testing During
Maintenance," in Conference on Software Maintenance, October 1988, pp. 362367.
[24]
D. Kung, J. Gao, P. Hsia, F. Wen, Y. Toyoshima, and C. Chen, "Change Impact
Identification in Object-Oriented Software Maintenance," in International
Conference on Software Maintenance, Victoria, B.C., Canada, 1994, pp. 202-211.
[25]
D. Kung, J. Gao, P. Hsia, F. Wen, Y. Toyoshima, and C. Chen, "Class Firewall,
Test Order and Regression Testing of Object-Oriented Programs," Journal of
Object-Oriented Programming, vol. 8(2), pp. 51-65, 1995.
[26]
Dunietz, I. S., Ehrlich, W. K., Szablak, B. D., Mallows, C. L., and Iannino, A.
“Applying design of experiments to software testing.” Proceedings of the
International. Conference on Software Engineering, 1997, pp. 205–215.
[27]
T. J. Ostrand and E. J. Weyuker, "Using Dataflow Analysis for Regression
Testing," in Sixth Annual Pacific Northwest Software Quality Conference,
September 1988, pp. 233-247.
[28]
G. Rothermel and M. J. Harrold, "Selecting Regression Tests for Object-Oriented
Software," in International Conference on Software Maintenance, September
1994, pp. 14-25.
[29]
A. B. Taha, S. M. Thebaut, and S. S. Liu, "An Approach to Software Fault
Localization and Revalidation Based on Incremental Data Flow Analysis," in 13th
Annual International Computer Software and Applications Conference,
September 1989, pp. 527-534.
[30]
Cohen, M. B., Dwyer, M. B., and Shi, J. “Interaction testing of highly-configurable
systems in the presence of constraints,” in Proceedings of the 2007 International
Symposium on Software Testing and Analysis, July 2007, pp. 129-139.
[31]
Cohen, D. M., Dalal, S. R., Fredman, M. L., and Patton, G. C. “The AETG
System: An Approach to Testing Based on Combinatorial Design,” IEEE
Transactions on Software Engineering. July 1997.
[32]
D. R. Kuhn, D. R. Wallace, and A. M. Gallo, Jr., “Software Fault Interactions and
Implications for Software Testing,” IEEE Transactions on Software Engineering,
30(6), June 2004, pp. 418-421.
[33]
X. Qu, M.B. Cohen and K.M. Woolf, “Combinatorial interaction regression
testing: a study of test case generation and prioritization,” IEEE International
Conference on Software Maintenance, Paris, October 2007, pp. 255-264.
[34]
Cohen, M. B., Snyder, J., and Rothermel, G. 2006. “Testing across
configurations: implications for combinatorial testing,” SIGSOFT Software
Engineering Notes November 2006, pp. 1-9.
[35]
Poshyvanyk, D., Marcus, A., "Combining Formal Concept Analysis with
Information Retrieval for Concept Location in Source Code", in the Proceedings
of the 15th IEEE International Conference on Program Comprehension, Banff,
Canada, June, 2007, pp. 37-48.
[36]
E. Hill, L. Pollock, and K. Vijay-Shanker. “Exploring the Neighborhood with
Dora to Expedite Software Maintenance.” International Conference on Automated
Software Engineering, November 2007.
[37]
Dorf, R., Bishop, R. “Modern Control Systems,” Eleventh Edition. Prentice Hall,
2008.
[38]
O’Leary, Daniel. “Enterprise Resource Planning Systems: Systems, Life Cycle,
Electronic Commerce, and Risk,” Cambridge University Press, 2000.
[39]
White, L., Jaber, K., and Robinson, B. “Utilization of Extended Firewall for
Object-Oriented Regression Testing.” Proceedings of the 21st IEEE International
Conference on Software Maintenance, Budapest, September 2005, pp. 695-698.
[40]
Kuhn, D., and Reilly, M. “An investigation of the applicability of design of
experiments to software testing.” Proc. 27th Annual NASA Goddard/IEEE
Software Engineering Workshop. 2002, pp. 91–95.
[41]
H. Do, S. G. Elbaum, and G. Rothermel. “Supporting controlled experimentation
with testing techniques: An infrastructure and its potential impact.” Empirical
Software Engineering: An International Journal, 10(4):405–435, 2005.
[42]
Araxis Inc. “Araxis Merge: A two and three way file and folder comparison tool.”
http://www.araxis.com/merge/index.html. January 11th, 2008.
[43]
Basili, V. R. and Boehm, B., "COTS-Based Systems Top 10 List," IEEE
Computer, 34(5), 2001, pp. 91-93.
[44]
S. Williams and C. Kindel, "The Component Object Model: A Technical
Overview," in MSDN Library, 1994.
[45]
C. Kaner, J. Zheng, L. Williams, B. Robinson, and K. Smiley, "Binary Code
Analysis of Purchased Software: What are the Legal Limits?" Submitted to the
Communications of the ACM, 2007.
[46]
Dickinson, W., Leon, D., and Podgurski, A. “Finding Failures by Cluster Analysis
of Execution Profiles.” Proceedings of the 2001 International Conference on
Software Engineering, Toronto, May 2001.
[47]
Dickinson, W., Leon, D., and Podgurski, A. 2001. “Pursuing failure: the
distribution of program failures in a profile space.” Proceedings of the 8th
European Software Engineering Conference Held Jointly with 9th ACM
SIGSOFT international Symposium on Foundations of Software Engineering.
Vienna, Austria, September, 2001. pp. 246-255.
[48]
Pressman, Roger S. “Software Engineering: A Practitioner's Approach,” Sixth
Edition. The McGraw-Hill Companies, 2005.
[49]
Li, P. L., Herbsleb, J., Shaw, M., and Robinson, B. 2006. “Experiences and results
from initiating field defect prediction and product test prioritization efforts at
ABB Inc.” In Proceeding of the 28th international Conference on Software
Engineering, Shanghai, China, May 2006. pp. 413-422.
[50]
McCabe, Thomas J. “A Complexity Measure.” IEEE Transactions on Software
Engineering, 2(4), 1976, pp. 308-320.
[51]
Scientific Toolworks Inc. “Understand for C++: A software metrics tool for
C/C++.” http://www.scitools.com/products/understand/cpp/product.php. May 2007.
[52]
Adam Porter, Atif Memon, Cemal Yilmaz, Douglas C. Schmidt, Bala Natarajan,
“Skoll: A Process and Infrastructure for Distributed Continuous Quality
Assurance.” IEEE Transactions on Software Engineering, August 2007, 33(8),
pp. 510-525.