Linköping University | Department of Computer and Information Science
Master thesis, 30 ECTS | Computer Science
2019 | LIU-IDA/LITH-EX-A--19/010--SE
Economics of Test Automation
–
Test case selection for automation
David Lindholm
Supervisor : Azeem Ahmad
Examiner : Kristian Sandahl
External supervisor : Christoffer Green
Linköpings universitet
SE–581 83 Linköping
+46 13 28 10 00, www.liu.se
Copyright
The publishers will keep this document online on the Internet - or its future replacement - for a period
of 25 years starting from the date of publication barring exceptional circumstances.
The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/hers own use and to use it unchanged for non-commercial
research and educational purposes. Subsequent transfers of copyright cannot revoke this permission.
All other uses of the document are conditional upon the consent of the copyright owner. The publisher
has taken technical and administrative measures to assure authenticity, security and accessibility.
According to intellectual property law the author has the right to be mentioned when his/her work is
accessed as described above and to be protected against infringement.
For additional information about Linköping University Electronic Press and its procedures for
publication and for assurance of document integrity, please refer to its www home page:
http://www.ep.liu.se/.
© David Lindholm
Abstract
In this thesis a method for selecting test cases for test automation is developed and evaluated. Existing methods from the literature have been reviewed and modified, with the result being the proposed method: a decision tree containing 23 factors grouped into 8 decision points. The decision tree has been used and evaluated in an industrial setting. The economic benefits were calculated with return on investment, and the organisational benefits were measured in a survey at a software-producing company. The result was that automated tests selected with the decision tree provided economic benefits after 0.5 to 4 years. These tests were also found to lead to three organisational benefits: less human effort when testing, reduction in cost and allowing for shorter release cycles.
Acknowledgments
First of all, I would like to thank my examiner Kristian Sandahl for the feedback during the
project. A special thanks to my supervisor Azeem Ahmad for his exceptional supervision
throughout my thesis.
Thanks to Sectra Imaging IT Solutions Ltd for providing me with the opportunity to realise this project. I would like to thank my project sponsor Magnus Ranlöf for the discussions about where to take the project. Thanks to my supervisor Christoffer Green for his continuous support and advice. Finally, I want to thank all of the people at Sectra who have participated in interviews and surveys for my thesis.
Contents

Abstract  iii
Acknowledgments  iv
Contents  v
List of Figures  vii
List of Tables  viii

1 Introduction  1
  1.1 Aim  2
  1.2 Research Question  3
  1.3 Research Objectives  3
  1.4 Project Context  3
  1.5 Delimitations  3

2 Theory  5
  2.1 Software Testing  5
  2.2 How to Test Software  5
  2.3 Manual Testing  8
  2.4 Automated Testing  9
  2.5 Benefits and Limitations of Test Automation  9
  2.6 What to Automate  14
  2.7 Return on Investment  17

3 Method  22
  3.1 Qualitative Methods  24
  3.2 Quantitative Methods  33

4 Results  35
  4.1 Qualitative Results  35
  4.2 Quantitative Results  40

5 Discussion  44
  5.1 Results  44
  5.2 Method  47
  5.3 Internal Validity  50
  5.4 External Validity  50
  5.5 Reliability  50
  5.6 Ethical and Societal Aspects  50

6 Conclusion  52

Bibliography  54

7 Appendices  60
  7.A Interview Benefits from Test Automation  60
  7.B Checklist Survey  62
  7.C Checklist 1  68
  7.D Checklist 2  70
  7.E Decision Tree  72
  7.F Benefits from Automation Survey  74

List of Figures

1.1 Thesis aims.  2
2.1 V-model  7
2.2 Decision Tree in viability analysis method  16
3.1 Overview of the research method.  23
3.2 Overview of the method used for modifying the checklist  26
3.3 Decision Tree usage example  31
4.1 Decision Tree  40
4.2 ROI of automation project  41
4.3 ROI for individual test cases  42

List of Tables

2.1 Software testing process  6
2.2 Questions in viability analysis method  16
2.3 Checklist for deciding what to automate  17
2.4 Fixed costs of test automation  19
2.5 Variable costs of test automation  19
2.6 Benefits of test automation  19
2.7 Variables in Hoffman's ROI formula  20
2.8 Variables in Münch et al. ROI formula  21
3.1 Scores assigned to Likert scale points  28
3.2 Example of excluded and included factors to modified checklist  29
4.1 Results from interview of benefits from test automation.  35
4.2 Results from modification interviews of checklist.  37
4.3 Data used in ROI calculations  43
4.4 Results from survey evaluating organisational benefits of automated tests.  43
5.1 Factors found in literature that are not included in decision tree.  45
1 Introduction
With the usage of agile project methodologies and rapid release cycles, software companies are releasing their products more frequently than ever before [1]. For instance, Firefox is released every 6 weeks, and some companies release even more often than that [2]. Mäntylä et al. argue that rapid releases implicate challenges in testing [3]. To illustrate how rapid releases can affect quality, Porter et al. bring up that quality can decrease due to practitioners not having time to test all platform configurations before releasing a product [4]. Testing the product to a sufficient degree is important, as phrased by Sawant, Bari, and Chawan: "Testing can be costly but not testing software can be even more costly" [5]. Charette shows several examples of how software failures can result in hundreds of millions of dollars in costs for companies, and in some cases have even led to bankruptcy [6]. Testing can be used to prevent this from happening by serving as a tool for validation and verification of whether the software meets the goals and requirements from the customer [7].
It is clear that testing is needed, but how can software testing catch up with the short release
cycles? One way of testing the product more efficiently can be with the use of test automation
[3], [8]. However, many testers agree on the fact that test automation in its current state cannot
fully replace manual testing, as both have different, albeit equally important roles [9]–[14].
Test automation can help improve quality and efficiency and shorten the time to market [8], [15], but the question remains of what should be automated and what is better left for manual testing.
Selecting test cases for automation is a challenge according to Amannejad et al., who state that there is a lack of research on how to select test cases for automation [16]. Kasurinen, Taipale, and Smolander define test automation strategy as "The observed method for selecting the test cases where automation is applied and the level of commitment to the application of test automation in the organizations" [11]. The authors conducted interviews with 55 industry specialists from 31 organizations in 2009 and found that many organizations need a clear test automation strategy [11].
In a literature review from 2016, Garousi and Mäntylä provide an extensive checklist, consisting of 43 factors divided into 5 categories, which can be used when deciding whether
to implement test automation and when selecting tests for automation [17]. Two factors that hinder companies from adopting test automation are high implementation cost and maintenance effort
[11]. These costs and the benefits of test automation are commonly estimated with return on
investment (ROI) [17], [18].
This thesis will attempt to find a method for selecting test cases for automation that will
result in economic and organisational benefits. Benefits of test automation were identified by
reviewing the research made on this subject and interviewing practitioners. Interviews with
software engineers led to modifying the checklist from Garousi and Mäntylä into a decision
tree. The decision tree was validated in an industrial case study, in which a set of test cases was selected using the decision tree and later evaluated with respect to return on investment and the possibility of achieving the identified organisational benefits.
1.1 Aim
The primary aim of this thesis is to establish a method that will facilitate the selection of test cases for test automation at software-producing companies. Before attempting to construct
this method, interviews were held to find out why test automation is needed and which
benefits software-producing companies want to achieve with test automation.
The secondary aim of this thesis is to identify what benefits have been presented in the scientific literature and to find out whether practitioners are in agreement with these. Once the need for test automation has been identified, the construction of a tool to simplify test case selection for automation can begin. In this thesis it was studied whether the checklist [17] provided by Garousi and Mäntylä could be used to select test cases for automation. First, it needs to be verified that the checklist is a useful tool for achieving test automation in an industrial setting. If practitioners agree that the tool can be used, it is necessary to find out which modifications the checklist requires to suit practitioners. To simplify the use of the checklist, the factors will be put into a decision tree. Finally, the decision tree needs to be evaluated.
The third and final aim of this thesis is to evaluate the created method for selecting test cases
for test automation. Can this method provide economic and organisational benefits in industry? The economic benefits are to be measured with return on investment, and the organisational benefits are to be evaluated against the benefits identified from the literature and from interviews with practitioners.
Figure 1.1: Thesis aims.
1.2 Research Question
Based on the above aims, the following research question has been formulated:
Can the checklist provided by Garousi and Mäntylä [17] be modified in such a way that it can be used
to select test cases for test automation that result in economic and organisational benefits?
1.3 Research Objectives
As an aid to answering the research question, a number of research objectives were formulated. The first two research objectives aim to establish a basis for understanding why test automation is needed and how the economic benefits from test automation can be measured. The third research objective aims to verify that the proposed checklist is suitable for an industrial setting. The fourth and last research objective will aid the process of adapting the checklist to practitioners. When an adapted version of the checklist has been established, the data collection phase of the case study can be started. Data are collected by using the checklist to select test cases that can be automated, automating a subset of these test cases and evaluating their outcome against the research question.
The research objectives were formulated as follows:
1. What do practitioners believe are the common benefits that software-producing companies relate to test automation?
2. How can economic benefits be measured for test automation?
3. Is the checklist provided by Garousi and Mäntylä [17] applicable in an industrial setting
to achieve test automation?
4. What modifications to the checklist provided by Garousi and Mäntylä [17] are required
to make it applicable for practitioners?
1.4 Project Context
This project is carried out at Sectra Imaging IT Solutions Ltd, which is a subsidiary of Sectra Ltd [19]. Sectra Ltd was founded in 1978; 40 years later, in 2018, Sectra Ltd had 645 employees and a turnover of 1 266 496 kSEK (about SEK 1.27 billion) [19]. Sectra Ltd's headquarters is located in Linköping, Sweden. Among other products, Sectra Imaging IT Solutions Ltd develops a picture archiving and communication system (PACS), a software system that aids in the storage, visualization and manipulation of images for departments such as radiology, pathology, cardiology and orthopaedics. The medical products produced by Sectra Imaging IT Solutions Ltd are used in more than 1800 hospitals all over the world [20].
1.5 Delimitations
This thesis studies test automation at Sectra Imaging IT Solutions Ltd; the methods that are reviewed are chosen on the basis that they have to be suitable for the software development process used there. In the scope of this thesis, test automation is considered to be the software development process that results in software-performed test execution and result analysis. Automation at other levels, such as requirement analysis and test implementation, is not included in the scope of this thesis. Furthermore, the tests that are considered for automation in this project are tests at the higher test levels: integration, system and acceptance tests are considered for automation, whereas unit tests are not.
Sectra Imaging IT Solutions Ltd expressed a wish for a simple method for selecting test cases
for automation. For this reason, a checklist-based approach was studied in this thesis, which
is also why systematic approaches (see section 2.6.1) are not considered.
Similar thoughts have been expressed in earlier studies. It has been shown that systematic approaches are not commonly used in industrial settings even though they occur frequently in research; this was shown by Engström and Runeson, and by Engström, Runeson, and Skoglund, for regression test selection techniques [21], [22]. In another study, Runeson conducted a focus-group meeting and a survey, with 17 and 15 participants respectively, which had a similar result: the participants stated that none of the companies used a systematic approach to select which unit tests to write; instead, tests were chosen based on the developer's intuition and experience [23].
2 Theory
The scientific theory used in this thesis is presented in this chapter. The chapter is divided into the following sections: Software Testing, How to Test Software, Manual Testing, Automated Testing, Benefits and Limitations of Test Automation, What to Automate and Return on Investment.
2.1 Software Testing
Software testing can be defined as “Evaluating software by observing its execution” [24].
However, this definition of software testing only covers the “what” of testing. The goals
of software testing are to verify and validate that software works in a certain way and to find errors in the software [25]. Beizer describes the "why" of testing in his five levels of testing maturity [26]:
Level 0: There is no difference between testing and debugging.
Level 1: The purpose of testing is to show that software works.
Level 2: The purpose of testing is to show that software doesn’t work.
Level 3: The purpose of testing is not to prove anything but to reduce the perceived risk of
not working to an acceptable value.
Level 4: Testing is a mental discipline to develop low-risk software without much testing
effort.
Testing can be used as a tool for validation and verification that software meets the goals and requirements from the customer [7]. Validation of software is used to ensure that the product works as expected and can help management in making decisions about when to release the product [7]. To achieve reliable testing, it needs to be done in such a way that
to release the product [7]. To achieve reliable testing, it needs to be done in such a way that
the process and the results from it are repeatable and independent of who performs the tests
[25].
2.2 How to Test Software
In this section the software testing process is defined, the different testing levels are described
and software testing techniques are presented.
2.2.1 Software Testing Process

The software testing process can be described through four main activities: requirements analysis and specification, test implementation, test execution and test evaluation. These are shown in Table 2.1 [16], [27], [28].
Table 2.1: Software testing process

Requirements analysis and specification: Defining the testing goals and exit criteria, i.e. what testing should accomplish [27], [28]. The exit criteria can be that a set of tests have been executed, or that a certain code coverage [28] or requirements coverage [29] has been reached. In this activity the test activities should also be clarified, for example by defining a set of tests that should be executed and specifying the test activities [16], [27], [28].

Test implementation: The tests are created. If manual testing is used, the implementation consists of writing manual test scripts [16], [28] or defining guidelines for exploratory testing. If test automation is used, the implementation activity is the production of automated test code [16], [28].

Test execution: The test cases are executed. In manual testing, testers carry out the steps defined in the test scripts [16] or perform exploratory testing. In automated testing, test code is executed, either by running the code manually or by using a test automation tool to run the code [16], [28].

Test evaluation: The result from the test execution is evaluated. In manual testing, the tester checks the outcome of the test and compares it with the expected result [16], [28]. For automated evaluation, a test tool is used to verify the outcome against the expected result [16], [28].
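To make the four activities concrete, the following minimal sketch shows how they map onto a small automated unit test in Python. The function under test, its expected value and the test itself are invented for this illustration and do not come from the thesis or the cited sources.

```python
# Hypothetical code under test: adds 25% VAT to a net price.
def price_with_vat(net: float, rate: float = 0.25) -> float:
    return round(net * (1 + rate), 2)

def test_price_with_vat() -> None:
    # Requirements analysis and specification: the expected outcome
    # (the exit criterion) is defined up front.
    expected = 125.0
    # Test implementation and execution: automated test code exercises
    # the code under test.
    actual = price_with_vat(100.0)
    # Test evaluation: the outcome is compared with the expected result.
    assert actual == expected

test_price_with_vat()
print("test passed")
```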
2.2.2 Testing Levels
Dalal and Chhillar state that testing should be started early and performed in all stages of
software development [7]. The V-model [30], Fig 2.1, is a common way to represent the development and testing stages of software development.
Unit testing is the lowest level of testing; its goal is to verify that a small piece of code does what it should. What constitutes a unit depends on the programming language used; it can for example be a function, procedure or method [24]. Unit tests are often performed by developers shortly after writing the code. A great advantage of unit tests is that they can help find bugs at an earlier stage of software development, which reduces the cost of the bug [25] and makes unit tests very cost effective [5].
Integration testing tests how a set of units function when combined through interfaces. Many bugs only occur when modules interact with each other: a module may work in isolation and pass a unit test, but when integrated with other modules it may be used in a way that was not intended or thought of by the developer.
System testing is used to verify that the software product as a whole works as expected. System testing is often performed by a testing team and has the goal of checking the software against its design and specifications [24].
Figure 2.1: V-model
Acceptance testing aims to make sure that the software meets the needs and requirements
from the customer.
2.2.3 Testing types

The V-model gives a good overview of the testing levels. In each testing level, tests of different types can be used to achieve the testing goals. In this section some testing types are presented; the selection has been made with consideration to what is relevant for this thesis. An overview of testing types can be found in "A comparative study of black box testing and white box testing techniques" by Kumar, Singh, and Dwivedi [31].
Build verification testing is performed on new builds: a smaller set of regression tests is run with the aim of verifying that no major defects have been introduced in the main functionality of the software [31].
Smoke testing is done at an early stage of the testing process; much like build verification testing, the idea is to verify that the software is performing well enough to spend further testing effort on it [31]. Kumar, Singh, and Dwivedi state that the tester quickly goes through different parts of the software to answer questions like "Can I launch the test item at all?" [31].
Sanity testing is another form of quick and broad testing [31]. In sanity testing the tester aims to verify that the logic in the software is functional and correct [31]. Much like smoke testing and build verification testing, sanity testing is a tool for checking whether further testing should be done [31].
Scenario testing assesses the product in terms of how it will be used by end-users. Kaner states that "The scenario is a story about someone trying to accomplish something with the product under test" [32]. Scenario testing can be used to learn how the product will be used both by new users and expert users; it is also useful for verifying that the software delivers the features and possibilities that users need in their work [32].
2.2.4 Software Testing Techniques

Khan classifies software testing techniques by their purpose into correctness, performance, reliability and security testing [33]. In this section correctness testing will be discussed briefly. Correctness testing verifies the behaviour of a software system and is divided into black box, white box and grey box testing [33]. Grey box testing is simply a combination of the black box and white box testing techniques and will not be discussed further here.
Black box
In black box testing the tester does not consider the internal parts of the software under test (SUT); instead the tester looks at the output provided by the SUT when given a certain input. Black box testing can be performed at all levels of testing defined above (unit, integration, system and acceptance) [34], although unit tests are commonly written with knowledge of the underlying code. Some black box testing methods are exploratory testing, smoke testing, stress testing, load testing, equivalence class testing, boundary value testing, model-based testing and use-case testing [5], [33], [34]. One advantage of black box testing is that test cases can be defined independently of the implementation of the software, and for this reason they can be written in parallel with the development of the software [35]. Black box testing is efficient at finding defects [5], [33], [34]. However, relying only on black box testing is likely to leave some parts of the software untested [5]. Black box testing needs a clear specification of what the software should do [33], and implemented behaviour that is not defined in such a specification might not be caught [35].
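As an illustration of one of the black box methods named above, the sketch below applies boundary value testing to a hypothetical input validator. The function and its valid range (1 to 100) are assumptions made for this example, not taken from the cited sources.

```python
# Hypothetical SUT: accepts order quantities in the range 1..100.
def accept_quantity(n: int) -> bool:
    return 1 <= n <= 100

def test_boundary_values() -> None:
    # Black box view: only inputs and outputs are considered, probing
    # at and around each boundary of the specified range.
    assert accept_quantity(0) is False    # just below lower boundary
    assert accept_quantity(1) is True     # lower boundary
    assert accept_quantity(2) is True     # just above lower boundary
    assert accept_quantity(99) is True    # just below upper boundary
    assert accept_quantity(100) is True   # upper boundary
    assert accept_quantity(101) is False  # just above upper boundary

test_boundary_values()
print("all boundary tests passed")
```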
White box
White box testing makes use of the underlying structure and paths to test the software under test [34]. While white box testing is often used in unit and integration testing, it can also be used in system testing [5], [34]. The following methods are used in white box testing: path testing, statement coverage, control structure testing and data flow testing [5], [33]. These techniques can help the tester find errors hidden in the code [33], and give the tester the possibility to test all logical decisions, loop boundaries and paths in a module [5], [34]. A disadvantage of white box testing techniques is that they require the tester to have developer skills and as such are often costly [5], [33], [34]. Another disadvantage is that required functionality missing from the implementation is not likely to be found using white box testing [35].
2.3 Manual Testing

In manual testing a human performs the tests by interacting with the software and evaluating the results. Manual testing can be carried out by following scripted instructions, or ad hoc, by testing the software in an exploratory manner.
Scripted tests rely on predefined test cases that describe what input should be given to the software and which output is expected [34], [36]. The result of the test is a comparison between the actual output and the expected output. Input can be created using test case design techniques from black box testing, such as boundary value testing, or it can be based on requirement documents, release notes or defect reports [34], [36].
Fewster and Graham classify scripted tests into vague manual scripts and detailed manual scripts, where both kinds define input and expected output; in vague scripts the test input and expected output are described in general terms, whereas in a detailed script they are defined precisely [29]. An advantage of scripted tests is that they can be carried out by any tester, are easily repeated and can therefore be used in regression testing [34]. However, if the scripted test is of the vague type it may have different outcomes depending on the tester's choice of test input and execution [29].
Scripted testing is inflexible, and in many cases it might be hard to define test cases beforehand; exploratory testing can assist in finding more test cases or be used as an alternative testing approach [34], [37].
Exploratory testing is a type of black box testing. Bach defines it as: "Exploratory testing is simultaneous learning, test design, and test execution" [37]. In exploratory testing the tester starts with a goal and defines new tests along the way while testing the software. Depending on where the tester wants to place the test on the spectrum between scripted and exploratory testing, the tester decides whether and how much it should be guided by written goals and tactics [37]. Exploratory testing can help testers diversify testing, evaluate or learn about new functionality, and is fast at finding the most important bugs [37].
Although manual testing has many benefits, a few of which are mentioned above, it can sometimes be more efficient to perform tests with the aid of a computer. The following section looks into the technique of using software to test software.
2.4 Automated Testing

Dustin, Garrett, and Gauf provide an inclusive, high-level definition of automated software testing: "The application and implementation of software technology throughout the entire software testing lifecycle (STL) with the goal of improving STL efficiencies and effectiveness" [38].
Dustin, Garrett, and Gauf's definition states that test automation can be applied in all stages of the software testing process; that is, requirement analysis, implementation, execution and evaluation can all be automated with certain methods [16] (see section 2.6.1). But as already mentioned in the delimitations, section 1.5, the term test automation in this thesis is focused on the software development process that results in automation of the test execution and test evaluation activities of the testing process (see section 2.2.1).
Software engineers in test automation have a varied range of tasks, such as planning and implementing test scenarios, developing test automation frameworks, preparing and configuring the infrastructure to run the tests, and presenting the test result reports [39]. Typically, a tool for Continuous Integration (CI) is used for running the tests and displaying test reports: the developer pushes (sends) a code change to the code repository, and the CI tool automatically builds the code, runs smoke tests and provides build and test results [40].
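A minimal sketch of this push-build-test loop is given below. The build and smoke-test commands are assumptions made for illustration; a real CI tool such as Jenkins or GitLab CI would be configured declaratively rather than scripted like this.

```python
import subprocess

def run_pipeline() -> bool:
    """Toy CI pipeline: build the code, then run a smoke-test subset."""
    steps = [
        ["make", "build"],          # hypothetical build command
        ["pytest", "-m", "smoke"],  # hypothetical smoke-test marker
    ]
    for cmd in steps:
        if subprocess.run(cmd).returncode != 0:
            print("pipeline failed at:", " ".join(cmd))
            return False
    print("pipeline passed: build and smoke tests are green")
    return True

if __name__ == "__main__":
    run_pipeline()
```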
Kasurinen, Taipale, and Smolander stress that test automation is commonly used for repetitive tasks; from their survey with 31 organization managers, they concluded that the respondents considered unit testing and regression testing to be the two most efficient application areas of test automation tools [11]. Dustin, Garrett, and Gauf state that test automation is typically used for the following testing types: unit tests, regression tests, functional tests, security tests, performance tests, stress tests, concurrency tests and code coverage verifications [38].
2.5 Benefits and Limitations of Test Automation

Rafi et al. conducted a systematic literature review and practitioner survey in 2012, in which benefits and limitations of test automation were identified from research and later verified by practitioners [9]. In the following two sections the benefits and drawbacks of test automation are explained, with a starting point taken from "Benefits and Limitations of Automated
Software Testing: Systematic Literature Review and Practitioner Survey” [9].
The references from "Benefits and Limitations of Automated Software Testing: Systematic Literature Review and Practitioner Survey" have been checked and are briefly summarized under each factor; when newer references were found, these have been added. References that support a factor are shown next to the factor's heading. Note that the reader is advised to read the whole section: several times a reference occurs in more than one factor but is only described once. The goal has not been to perform a systematic literature review; the aim of this chapter is to give the reader a solid introduction to the benefits and limitations that come with test automation.
2.5.1 Benefits of Test Automation

Rafi et al. presented 9 benefits from their literature study, and in their survey they show that practitioners are in agreement with research for 8 of the 9 benefits; in the following section, research related to these 8 factors is summarized. The ninth factor, left out here, was "increased fault detection" [9].
Improved product quality [11], [41]
Rafi et al. define quality as a low defect level in the SUT [9].
Malekzadeh and Ainon present a technique for automated test case generation which, according to the authors, can be used to reveal ambiguities in the specification of the SUT [41]. The method was only validated on a non-industrial example [41].
Through surveys and interviews in 30 organisational units, Kasurinen, Taipale, and Smolander found that test automation can provide quality improvements through increased test coverage and reduced testing time [11].
Test coverage [11], [18], [38], [42]–[49]
According to Hoffman and to Dustin, Garrett, and Gauf, test coverage can be increased with automated tests [18], [38]. They explain that an increase could be due to automated tests covering more combinations of data and paths compared to when testing is performed manually [18], [38].
Saglietti and Pinte created a multi-objective optimization model that optimizes test case generation in unit and integration tests [42], the objectives of the model being to maximize code coverage and minimize the number of tests [42]. From experimental verification on industrial software the authors concluded that coverage could be improved [42].
Using the programming language Sulu, Tan and Edwards performed unit test case generation on non-industrial software [43], the result being 90% statement coverage and high mutation coverage [43]. Alshraideh carried out a similar study, where partly automatic generation of unit tests was made for JavaScript code [44]. Non-industrial experiments in their research showed that coverage can be increased with the tool [44]. The authors argue, but do not provide data to support the statement, that this type of tool can lead to a reduction in testing cost [44].
In 2008, Burnim and Sen presented heuristic search strategies for generating test input with symbolic execution [45]. The authors validated their method on the two open source projects Grep 2.2 and Vim 5.7, which had 15K lines of code (loc) and 150K loc, respectively [45]. The authors were able to increase coverage with their method and argue that the method can be used on real-world software systems [45].
Geetha Devasena, Gopu, and Valarmathi proposed a hybrid optimization method for generating tests for conditional branches; the method used both a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) [46]. The aim of this method was to achieve a certain branch coverage, and the authors validated the method on benchmarking samples and compared the result to methods using only GA or only PSO [46]. The hybrid method could reduce the execution time by around 50% compared to when only GA or PSO was used [46]. The authors suggest that this type of method can minimize testing effort, time and cost [46].
Nagowah and Kora-Ramiah created a tool under the name Control Ripper and Test Case Player (CRaTCP) that can be used on web applications to generate and execute test cases [47]. The tool looks for fields and buttons where the user can give input to the web application, considers the given constraints for these input fields and generates test cases to cover the possible inputs [47]. A tester can later execute the test cases on the web application; the tool does not provide automatic test evaluation [47]. The authors have not provided any validation data for the tool, but state that the ambition with the tool is to achieve complete test coverage by executing the generated test cases [47].
Reduced testing time [11], [16], [46], [47], [50]–[54]
In 1999, du Bousquet and Zuanon presented their testing environment for synchronous reactive systems [50]. The tool automatically generates test data and, with a user-provided test oracle, it also provides automatic test execution and evaluation [50]. The authors have validated their tool and argue that the solution can result in cost reductions and reduced testing time [50].
Wissink and Amaro argue that a keyword-based approach to test automation can lead to reduced testing time [51]. In the keyword-based approach the test cases are defined as a set of actions, or keywords; a test automation engineer develops tests that can perform these actions and arranges for a tool where the tests can be executed [51]. The authors have not validated this method; instead they refer to a white paper¹ where promising results of the technique have been presented [51].
Haugset and Hanssen have used Robot Framework, a keyword-based testing tool, for implementing regression tests [52]. The authors report that automated testing can decrease the testing effort in regression tests and reduce costs due to finding bugs earlier in the development process [52].
Amannejad et al. showed in an industrial setting that test automation can save time in the test processes of test design, test execution and test evaluation [16]; see section 2.6.1.
Reliability [29], [48], [55]
Reliability here means that tests produce the same result when repeated [9].
Test automation can result in more reliable testing [29]. Automated tests will be executed
several times and the execution is always performed in the same way [29].
In 2018, Banerjee and Yu investigated how test automation, made possible with a robotic arm, could be used to test face recognition software [48]. Banerjee and Yu report that test automation resulted in reliable tests; the authors also argue that test coverage was increased with test automation [48].
Increase in confidence [52]
From interviews, the authors of "Automated Acceptance Testing: A Literature Review and an Industrial Case Study" found that automated testing can increase confidence in the perceived quality of the SUT [52].
1 http://www.sdtcorp.com/cs_gtnprogram.html
Reusability of tests [11], [14], [54], [56], [57]
In 2009 a tool was developed for generating test cases in Java [56]. The tool has not been validated in any industrial setting, but the authors argue that reusable test cases can be created with it [56]. Kasurinen, Taipale, and Smolander state that test automation requires an initial investment, but that the increased reusability that comes with automation can lead to a payoff in the long term [11].
According to Obele and Kim, a software test automation tool can improve test reusability [54]. The authors present their tool and state that, in their experience, automated software testing can free testers from mundane activities and minimize human effort, cost, time and human errors [54]. Flenström et al. have provided and validated an optimization model for prioritizing test cases based on the possibility to reuse code [57]; see section 2.6.1.
Less human effort [14], [46], [49], [52], [54]
Haugset and Hanssen and Berner, Weber, and Keller report that with automated regression testing, testers have more time for other test activities [14], [52]. The authors also state that test automation makes it easier to test complex interfaces and that it can enable a higher test frequency compared to manual testing [52].
Reduction in cost [46], [50], [52]–[56]
Test automation can find bugs earlier in the development process [52]. Because automated tests can be run frequently, they can find simple bugs early by being executed directly after the code has been produced. Bugs found earlier in the development process are cheaper to fix than bugs that are found late [38].
Shan and Zhu provide a solution for test case generation called data mutation [55]. The data mutation method, which is inspired by mutation testing methods, uses mutation operators on input data to generate test cases [55]. The authors validated their method on CAMLE, a modeling language and environment developed by the authors [55]. The authors state that the method can provide several benefits: reduced costs, good coverage and increased reliability [55].
2.5.2 Limitations of Test Automation

Rafi et al. identified 7 limitations from research, and in their survey they show that practitioners agree with research for 6 of the 7 limitations; in the section below, research related to these 6 factors is summarized. The seventh factor, left out here, was "failure to achieve expected goals" [9].
Automation cannot replace manual testing [11]–[14]
From empirical observations, Kasurinen, Taipale, and Smolander, Bach, and Pettichord report that some tasks are better suited for manual testing, while others are preferably automated [11]–[13].
Berner, Weber, and Keller state that manual testing is likely to detect new defects and argue that automated tests are suitable for revalidation of the SUT [14].
Difficulty in maintenance of test automation [11], [14], [58]
Kasurinen, Taipale, and Smolander state that test automation will lead to an increased maintenance effort due to changes in the SUT or in the product infrastructure [11]. Similar thoughts are expressed by Berner, Weber, and Keller, who argue that testware has to be maintained at each new release of the SUT [14].
Liu argues that test automation is sensitive to changes in the SUT; the author presents a testing language with the aim of simplifying maintenance of automated tests [58].
Process of test automation needs time to mature [14], [59]
Bashir and Banuri used a model-based technique to generate test data [59]. The authors argue that test automation can result in time and cost savings, but that it takes time to achieve these goals [59]. Similar thoughts, expressed by Berner, Weber, and Keller, are described in section 2.5.2 [14].
False expectations [13], [14]
From observations made by Berner, Weber, and Keller, the authors found that test automation failed to deliver on the expectation of exposing known defects [14]. The authors also note that automation does not deliver a short return on investment, as some practitioners had expected [14].
Pettichord reports that from his experience there are several false expectations of test automation, such as the expectation that test automation can be achieved as a side project, or that test automation can accomplish all benefits at the same time, for example combining a wish for increased test coverage with time savings [13].
Inappropriate test automation strategy [14], [60]
Berner, Weber, and Keller and Persson and Yilmaztürk argue that choosing the right test automation strategy is vital for success in test automation [14], [60]. The authors of "Observations and Lessons Learned from Automated Testing" explain four commonly occurring problems with test automation strategies: "misplaced or forgotten test types", "wrong expectations", "missing diversification" and "tool usage is restricted to test execution" [14]. Persson and Yilmaztürk argue that the test strategy has to consider the different needs of manual and automated testing [60].
Lack of skilled people [13], [49], [60], [61]
Rafi et al. describe this factor as automation requiring many different types of skills [9]. Pettichord says that from his experience it can be hard to maintain automated tests that have been developed by inexperienced developers [13].
In an observational study, Fecko and Lott report that test automation demands several skills: test tool knowledge, proficiency in development, software design, and expertise in the SUT [61]. The authors argue that testers commonly have good tool knowledge and expertise in the SUT, but that there is a lack of software development and design skills among testers [61].
In 2018, Gafurov, Hurum, and Markman proposed a test automation solution based on a keyword-driven testing language [49]. The aim is to decrease the cost of test implementation by having automation engineers (expensive personnel with development skills) implement test steps and letting test analysts (less expensive personnel, non-developers) organize and combine test steps with input data [49]. The authors validated their method in an industrial setting and argue that this approach can decrease the test implementation cost [49]. It was also found that automated testing resulted in a decrease in manual test effort and an improvement in test coverage [49].
Persson and Yilmaztürk explain that if test automation is implemented by personnel without the right competence, it can result in higher costs and even failure [60]. The authors recommend that the automation project should have expert knowledge available, although not everyone who automates has to be an expert [60]. This competence has to remain within the company after the automation project has been realized [60]. The automation project should consist of mixed knowledge spanning testing, development, project management and other skills in related areas, e.g. databases and hardware [60].
2.6 What to Automate

As mentioned in the introduction, there is a lack of research on which tests to automate [16]. In this section, systematic and checklist-based approaches for selecting which test cases to automate are described.
2.6.1 Systematic Approaches

In research, optimization and simulation methods have been used to help practitioners decide which tests to automate. This section describes a few examples; the aim is not to cover these methods in depth, but rather to provide a short overview of other solutions to the problem. The reason for not investigating these methods further is described in the delimitations, section 1.5.
It is also worth mentioning that, according to Amannejad et al., research on systematic approaches for deciding what to automate is still at an early stage [16].
Optimization-based approaches

In 2014, Amannejad et al. formulated an optimization problem of what to automate and verified their approach in an industrial setting [16]. The authors used a matrix that indicates which stages of a use case should be performed with manual testing methods and which with automated testing methods [16]. The optimization problem was solved by searching through solutions in the matrix with a genetic algorithm, and the goal function, i.e. the evaluation of the solutions, was based on the return on investment [16]. Data was collected from software tools, and when such tools were not available, interviews or estimation models were used [16]. For estimating manual test effort, the authors customized an existing test execution effort estimation model, and for estimating the maintenance cost of the automated test code, the maintenance estimation principle of COCOMO was used [16].
The optimal value of the goal function for the problem provided a ROI of 367% [16]. The authors found that the test artifacts had to be used more than 2, 3 and 8 times to provide a positive ROI in the activities of test design, test execution and test evaluation, respectively [16]. Amannejad et al. state that the highest ROI was gained from automation in test execution, thereafter automation in test design and lastly automation in test evaluation [16]. The result was also presented as time savings, where it was found that for test design, test execution and test evaluation, savings of 85, 275 and 21 working days (8 hours each) could be made [16].
Ramler and Wolfmaier constructed a constrained linear optimization problem of what to automate [62]. The problem was constrained by a fixed budget and a minimum number of automated and manual test executions [62]. The goal function was constructed by creating functions for risk mitigation; the objective was to maximize the combined risk mitigation of manual and automated testing [62]. The authors argue that manual and automated tests fulfil different purposes: automated tests are suitable for mitigating regression risks, while manual testing can be used to explore new functionality; this is the reason behind the choice of constraints and goal function [62]. Ramler and Wolfmaier state that a drawback of this model is that it is simplified and ignores important factors such as maintenance cost for automated tests and growing test effort over time in iterative development [62]. The article is widely cited, with Google Scholar reporting 113 citations [63]; however, the authors did not evaluate their model empirically, and no empirical evaluation was found among the citations.
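To show the general shape of such a model, the sketch below maximizes total risk mitigation under a fixed budget and minimum execution counts, in the spirit of Ramler and Wolfmaier's formulation [62]. All coefficients are invented for illustration, and the exact functions in [62] differ.

```python
from scipy.optimize import linprog

# Risk mitigated and cost per test execution (invented numbers).
risk = [3.0, 2.0]   # [automated, manual]
cost = [1.0, 4.0]   # [automated, manual]
budget = 100.0
min_runs = [(5, None), (10, None)]  # minimum automated/manual executions

# linprog minimizes, so negate the objective to maximize risk mitigation
# subject to cost_a*x_a + cost_m*x_m <= budget.
res = linprog(c=[-r for r in risk],
              A_ub=[cost], b_ub=[budget],
              bounds=min_runs)
print(res.x, -res.fun)  # optimal execution counts, total risk mitigated
```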
In 2018, Flenström et al. proposed a method for helping decision makers prioritize which test cases to automate first [57]. The aim of the study was to reduce test effort by prioritizing automation of test cases that have the possibility to reuse code from test cases that have been automated previously [57]. Reuse of test automation code is made possible by comparing the proposed set of manual test cases for automation with the manual test cases that have already been automated; if the steps are similar, it is likely that code can be reused [57].
In this optimization problem, the goal function is formulated to measure the manual test effort [57]. The manual test effort decreases when manual tests are replaced with automated tests, and the objective is to find the ordered set of test cases to be automated that minimizes the manual test effort [57].
The method was empirically validated in a case study consisting of four projects at a company in the vehicular embedded systems domain [57]. The result was that if reuse of test automation code is considered, the manual test effort can be decreased by up to 12% with the usage of an optimization model for similarity-based reuse of test steps [57].
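The core idea, comparing the steps of candidate test cases with those already automated, can be sketched as below. The Jaccard similarity, the step names and the ranking rule are illustrative assumptions; Flenström et al. [57] use their own similarity function and effort model.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two step sets, 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Steps already covered by automated test code (hypothetical).
automated_steps = {"login", "open_patient", "load_image", "logout"}

candidates = {
    "TC-1": {"login", "open_patient", "annotate_image", "logout"},
    "TC-2": {"start_server", "check_license"},
}

# Automate first the candidate whose steps overlap most with existing
# automation, since its implementation can reuse the most code.
ranked = sorted(candidates,
                key=lambda tc: jaccard(candidates[tc], automated_steps),
                reverse=True)
print(ranked)  # ['TC-1', 'TC-2']
```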
Simulation-based approach

Sahraf et al. constructed a System Dynamics (SD) simulation model with the aim of answering the question of what to automate in all stages of the software testing process [64]. The SD simulation model was created as a general model and later adapted and validated in an industrial case study [64]. The authors state that there is uncertainty in the concrete results from the simulations made in the study, due to uncertainties in the input data to the SD simulation model [64]. It is concluded that the usefulness of the proposed SD simulation model has been shown in the study, but that more research is needed to validate the proposed model [64].
2.6.2 Checklist-Based Approaches

In 2015, Garousi and Mäntylä published a multi-vocal literature review with the aim of supporting decision making on when and what to automate [17]. One of the results in the paper is a checklist (see the factors in Table 4.2), where the authors have used coding to identify 43 factors from 78 sources [17]. These factors can be evaluated to find out whether test automation is suitable for a company and, if so, which tests can be automated.
The factors are grouped into five categories: 1) Software Under Test-related factors, 2) test-related factors, 3) test-tool-related factors, 4) human and organizational factors and 5) cross-cutting and other factors [17]. The authors assigned an area weight to each factor, which is the frequency with which the factor appears in their sources [17]. In the paper the authors clearly state that the area weight cannot be viewed as a prioritization made for practitioners; rather, the checklist needs to be evaluated and prioritized in the context in which it will be used [17].
In 2006, Oliveira, Gouveia, and Filho proposed a viability analysis method, which uses 9 questions (Table 2.2) together with a decision tree (Fig 2.2) when deciding whether or not to automate a given manual test [65]. The questions are answered with "High", "Medium" or "Low", represented in the decision tree as "H", "M" and "L".
Oliveira, Gouveia, and Filho trained their model on 500 manual tests and validated it on 200 tests [65]. The model has been recommended in one paper, by Assad et al. [66], and used and evaluated with positive results in another, by Kadry [67]; however, no usage in an industrial setting has been found.
Table 2.2: Questions in viability analysis method [65]

1. Frequency: How many efforts is this test supposed to be executed?
2. Reuse: Can this test or parts of it be reused in other tests?
3. Relevance: How would you describe the importance of this test case?
4. Automation effort: Does this test take a lot of effort to be deployed?
5. Resources: How many members of your team should be allocated or how expensive is the equipment needed during this test's manual execution?
6. Manual complexity: Is this test difficult to be executed manually? Does it have any embedded confidential information?
7. Automation tool: How would you describe the reliability of the automation tool to be used?
8. Porting: How portable is this test?
9. Execution effort: Does this require a lot of effort to be executed manually?
Figure 2.2: Decision Tree in viability analysis method [65]
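Since Figure 2.2 is not reproduced in this text, the following sketch only illustrates how a decision tree over the H/M/L answers from Table 2.2 could be encoded; the branch order and thresholds are invented and do not match the actual tree in [65].

```python
# Illustrative only: branches and thresholds are assumptions, not the
# actual tree from Oliveira, Gouveia, and Filho [65].
def should_automate(a: dict) -> bool:
    if a["frequency"] == "L":            # rarely run tests seldom pay off
        return False
    if a["automation_effort"] == "H" and a["relevance"] == "L":
        return False                     # expensive and unimportant
    if a["manual_complexity"] == "H" or a["execution_effort"] == "H":
        return True                      # hard or costly to run by hand
    return a["reuse"] in ("H", "M")      # otherwise reuse decides

answers = {"frequency": "H", "reuse": "M", "relevance": "H",
           "automation_effort": "M", "manual_complexity": "L",
           "execution_effort": "H"}
print(should_automate(answers))  # True for this hypothetical test case
```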
In the book Implementing Automated Software Testing, Dustin, Garrett, and Gauf present a checklist, shown in Table 2.3, consisting of 12 factors which aim to answer whether a specific test case should be automated or not [38]. The authors argue that a test case is a good candidate for automation if all questions are answered with "yes" [38]. Dustin, Garrett, and Gauf also provide 6 guidelines for when and what to automate, where the authors among other things recommend practitioners to consider time and budget constraints when deciding what to automate, and specifically recommend automating repetitive tasks [38]. The authors state that the checklist has been used in various projects [38]; they refer to a previous book by one of the authors, "Effective Software Testing: 50 Specific Ways to Improve Your Testing" by E. Dustin, but unfortunately do not elaborate more on these results.
2.6.3 Other Approaches and Advice

Graham and Fewster provide a short text about which tests should be automated first in their book Experiences of Test Automation: Case Studies of Software Test Automation from 2012. Graham and Fewster recommend testers to consider the following factors when choosing what to automate [68]:
• most important tests,
• a set of breadth tests (sample each system area overall),
• tests for the most important functions,
• tests that are easiest to automate,
• tests that will give the quickest payback,
• tests that are run the most often.

Table 2.3: Checklist for deciding what to automate, from Dustin, Garrett, and Gauf [38]. Each criterion is answered with Yes or No:

• Is the test executed more than once?
• Is the test run on a regular basis, i.e., often reused, such as part of regression or build testing?
• Does the test cover most critical feature paths?
• Is the test impossible or prohibitively expensive to perform manually, such as concurrency, soak/endurance testing, performance and memory leak detection testing?
• Are there timing-critical components that are a must to automate?
• Does the test cover the most complex area (often the most error-prone area)?
• Does the test require many data combinations using the same test steps (i.e., multiple data inputs for the same feature)?
• Are the expected results constant, i.e., do they not change or vary with each test? Even if the results vary, is there a percentage tolerance that could be measured as expected results?
• Is the test very time-consuming, such as expected results analysis of hundreds of outputs?
• Is the test run on a stable application, i.e., the features of the application are not in constant flux?
• Does the test need to be verified on multiple software and hardware configurations?
• Does the ROI, as discussed in Chapter 3 of Implementing Automated Software Testing [38], look promising and meet any organizational ROI criteria?
The authors argue that a high return on investment can be achieved by selecting tests from
different product areas and automating the most important tests first [68]. Both Graham
and Fewster and Dustin, Garrett, and Gauf bring up the importance of not rushing into automation and attempting to achieve too much at an early stage, due to the learning curve of automation [68] and limited experience with the automation tool and other factors related to automation [38]. Another recommendation that both of the previously mentioned authors give is to automate tests with repetitive tasks, which frees up time for testers to do other work [38], [68].
2.7 Return on Investment
Return on investment (ROI) is the ratio between the benefits and the costs of a given investment. ROI shows how much profit is generated from each monetary unit spent on an investment.
2.7.1 Return on Investment for Test Automation
Before the test automation process is started it is advised to calculate the ROI, in order to
be sure that the savings and benefits are greater than the costs of automation [29], [38], [68].
Münch et al. define a ROI formula for test automation [69] as:
ROI = Benefit / Investment = (Gain - Costs) / Investment

2.1: ROI formula for test automation by Münch et al. [69]
The factors that can be included in the gains and costs of the formula can be divided into intangible and tangible factors.
2.7.2 Intangible Factors
Intangible factors are factors that are difficult to measure in a quantifiable way. In this section the following factors will be discussed briefly: time for manual tests, testers' motivation, test coverage and quality in testing. In addition, the following intangible factors have been identified in section 2.5: improved product quality, reduced testing time, reliability, increase in confidence, reusability of tests, less human effort and reduction in cost.
Automated testing can result in testers having more or less time for manual testing. If automated tests find simple bugs, such as platform or browser specific bugs, the developer is notified about this and has the possibility to fix these bugs before a tester performs a manual test. With fewer simple bugs to report, testers have more time to perform manual testing to find bugs that are harder to identify [8], [10], [14]. Test automation can also reduce testing time by providing efficient testing tools and by automating test activities in the software testing process (section 2.2.1) (for more information about reduced testing time, see section 2.5.1) [11], [16], [46], [47], [50]–[54]. The other possibility is that development, maintenance or analysis of data from automated testing becomes time consuming and leaves less time for manual testing [12].
Automated testing is well suited for performing the same test on multiple platforms or configurations. Time can be saved by verifying several platforms at once with a script instead of doing it manually [38]. Performing the same test on many platforms or configurations can be a monotonous task, which can lead to a tester getting tired and missing potential problems [38].
Motivation of testers might increase or decrease due to automation. Some testers will enjoy
test automation and embrace it, while others might be sceptical and see it as less time for
manual tests [18].
As shown in section 2.5.1, test coverage can be improved with automated tests [11], [18], [38], [42]–[49]. It is also possible that test coverage decreases with automation, if a simple automated test is implemented instead of an exploratory testing session that would cover more functionality [18]. Persson and Yilmaztürk report that weak knowledge of the existing test coverage can be a problem when implementing test automation [60]; they experienced trouble with establishing test automation due to having to specify automation coverage in manual test coverage measurements [60].
Quality in testing can also change [18]. Test automation allows for using different types of tests, such as stress testing, load testing, higher frequency of regression testing and testing multiple platforms at once. These tests might not be possible to perform with manual testing. Manual testing can be preferred in other cases; for example, GUI testing is often done manually, due to the ability to have a human verify that the application “looks good” and the fact that the GUI might change rapidly and cause a high cost of maintenance if the tests were automated [12].
Shorter release cycles are another factor [8], [14]. Because automated tests are run frequently, a more stable version of the software can be obtained in between release testing periods [8], [15]. This can contribute to allowing the release cycle to be shortened.
2.7.3 Tangible Factors
Hoffman divides the tangible factors of test automation into fixed costs, Table 2.4, and variable costs, Table 2.5 [18]. Some of the factors of cost and benefit, Table 2.6, from test automation will be described briefly in this section.
Test automation might result in the need to upgrade hardware, since it can be resource consuming to run a big test suite several times per day.
Table 2.4: Fixed costs of test automation

Factor | Definition | Reference(s)
Hardware | Hardware for running automated tests. | [18], [69]
Automation software | Testware software licenses and support. | [18], [69]
Software training and setup | Initial configuration of tool(s), training of staff, initial test suite implementation. | [18], [69]
Another cost factor is the software used for test automation, which includes software for creating automated tests, a continuous integration server, software for analysing test results, etc. In some cases companies choose to use an open source solution such as Selenium, which reduces the cost of software. But even in those cases, there is still a cost for configuring the tool and training the staff in the tool.
Table 2.5: Variable costs of test automation

Factor | Definition | Reference(s)
Test design & implementation | Designing and implementing tests for automation. | [18], [69]
Test maintenance | Maintenance of tests that are broken due to e.g. new functionality making the tests outdated. | [18], [69]
The largest cost of test automation is test design and implementation. Writing automated tests requires development skills and knowledge of the test tool used [12]. Test cases also need to be documented, maintained and tested themselves [12]. Product changes may result in broken tests, or in new tests needing to be developed to cover new features.
Table 2.6: Benefits of test automation

Factor | Definition | Reference(s)
Failure cost | Failures found by automated tests. | [69]
Greater regression test coverage | Being able to run regression tests more frequently increases the coverage from these tests. | [38]
Test execution savings | Automated tests run faster than tests performed by a human, resulting in the possibility to run more tests per time unit. | [18]
Some bugs are more likely to be found by computers and automated tests than by manual tests performed by humans [70]. Manual tests are good for analysing what appears on the screen, but automated tests can better examine the data that lie behind the screen. That enables, for example, testing for memory leaks and monitoring unexpected system calls [18]. The benefit of finding these bugs, which without automated tests would end up in production, can result in great savings for the company [69].
Being able to run an automated regression test suite frequently will increase the test coverage and might reduce the number of bugs that would otherwise only be found (or missed) at a later stage of the development process.
Another factor to consider is savings in test execution. Somewhat simplified, one could say that a manual tester can execute tests for 8 hours per day, while a test automation engineer would create and maintain tests for 8 hours per day and then let those tests run for an additional 16 hours per day [71].
Often the benefits of automated testing are determined by comparing the costs of automated tests with the costs of manual tests [18]. Graham states that the benefits of test automation can be measured by considering the equivalent manual test effort (EMTE), that is, the time it would take to execute the tests without automation [72].
2.7.4 ROI using Equivalent Manual Test Effort
Hoffman defines a ROI formula for test automation that can be used when manual testing has been used in the project and one is considering investing in test automation [18]. This ROI formula differs from the others described in this section in that it considers the incremental costs and benefits of automation. For this reason it can be used to calculate the return on investment of one small project or even isolated tests, whereas the other formulas described in this section consider the ROI of the whole automation investment. The advantage is that initial investments, which may have been made a long time ago, such as tool acquisition and staff training, can be disregarded in the ROI calculations.
ROI(in time t) = ∆Ba / ∆Ca = ∆(Benefits from automation over manual) / ∆(Costs of automation over manual)

2.2: ROI formula for test automation using EMTE by Hoffman [18]
Table 2.7: Variables in Hoffman's ROI formula.

Var | Definition
n1 | Number of automated-only test executions.
n2 | Number of manual test executions.
N | Average number of runs for automated tests before maintenance is needed.
Ba | The benefits from automated testing.
Ca | The costs of automated testing.
∆Ba | The incremental benefits from automated over manual testing.
∆Ba (in time t) | Σ (improvement in fixed costs of automated testing × (t / Useful Life)) + Σ (variable costs of running manual tests n2 times during time t) - Σ (variable costs of running automated tests n1 times during time t)
∆Ca | The incremental costs of automated over manual testing.
∆Ca (in time t) | Σ (increased fixed costs of automated testing × (t / Useful Life)) + Σ (variable costs of creating automated tests) - Σ (variable costs of creating manual tests) + Σ (variable costs of maintaining automated tests) × (n1 / N)
Note that many of the values used in Hoffman's equation need to be determined for both manual and automated testing. For example, in the equation for the incremental benefits from automated over manual testing (∆Ba), Table 2.7, the manual and automated tests used in the equation are assumed to cover the same test cases. That is, the benefits are defined as the equivalent manual test effort (EMTE) minus the automated test effort [72].
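To make the mechanics of the formula concrete, the sketch below evaluates ∆Ba, ∆Ca and the resulting ROI for a single test over a time period t. All numbers, and the assumption that the fixed-cost terms are zero, are invented for illustration; they are not data from this thesis.

    using System;

    // Minimal sketch of Hoffman's EMTE-based ROI [18] with hypothetical numbers.
    // Benefits: avoided manual execution cost minus automated execution cost.
    // Costs: creating the automated test (beyond the manual test) plus maintenance.
    double n1 = 120;                    // automated test executions during time t (assumed)
    double n2 = 120;                    // manual executions covering the same cases (EMTE)
    double N = 40;                      // avg. automated runs before maintenance is needed
    double runManualHours = 2.0;        // per manual execution (assumed)
    double runAutomatedHours = 0.1;     // per automated execution (assumed)
    double createAutomatedHours = 20;   // one-time implementation effort
    double createManualHours = 5;       // effort to write the manual test script
    double maintainAutomatedHours = 3;  // per maintenance occasion

    double deltaBa = n2 * runManualHours - n1 * runAutomatedHours;
    double deltaCa = createAutomatedHours - createManualHours
                   + maintainAutomatedHours * (n1 / N);
    double roi = deltaBa / deltaCa;     // above 1: benefits exceed incremental costs
    Console.WriteLine($"dBa = {deltaBa} h, dCa = {deltaCa} h, ROI = {roi:F2}");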
Schwaber and Gilpin use a definition of the cost of test automation that is similar to Hoffman's, namely the following one [73]:
Cost of test automation = Cost of tool(s) + Labor costs of script creation + Labor costs of script maintenance

2.3: Cost formula for test automation by Schwaber and Gilpin [73]
But both these definitions leave out the cost of training the employees to use the automation tool. The ROI formula by Münch et al. takes this factor into consideration [69]; the variables are defined in Table 2.8. Similar to the formula by Hoffman, the benefit in this formula is defined as the saved cost of executing automated tests compared to executing all tests manually.
ROI_n = Benefit / Investment = (Gain - Costs) / Investment

2.4: ROI formula for test automation that includes tool cost by Münch et al. [69]
Table 2.8: Variables in Münch et al. ROI formula.

Var | Definition
n | Number of testing cycles.
Gain | Cost of executing all tests purely manually (EMTE).
Costs | Cost of executing the manual and automated tests for all the cycles, plus the cost of the automation investment.
Investment | Cost of buying the automation tool, training the employees in the tool and an initial test automation suite implementation.
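A small sketch of how the formula and the variables in Table 2.8 fit together is given below; in contrast to the Schwaber and Gilpin definition, the investment term carries the tool, training and initial suite costs. All figures are invented placeholders.

    using System;

    // Hypothetical illustration of the Münch et al. ROI formula [69].
    double n = 10;                    // number of testing cycles
    double emtePerCycle = 80;         // hours to execute all tests purely manually
    double mixedCostPerCycle = 30;    // hours for the mixed manual/automated execution
    double toolCost = 100;            // buying the automation tool (cost in hours)
    double trainingCost = 60;         // training the employees in the tool
    double initialSuiteCost = 120;    // initial test automation suite implementation

    double gain = n * emtePerCycle;                     // saved pure-manual effort
    double investment = toolCost + trainingCost + initialSuiteCost;
    double costs = n * mixedCostPerCycle + investment;  // execution plus investment
    double roi = (gain - costs) / investment;
    Console.WriteLine($"ROI after {n} cycles: {roi:F2}");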
3 Method
The research method, shown in Fig 3.1, started with a literature review to select tools for the method and for the evaluation of the results in the thesis. The literature study aimed to cover the following areas: how test cases can be selected for test automation, what benefits exist for test automation, how interviews should be conducted, how a checklist can be evaluated, and information that could give the reader a short introduction to manual and automated testing practices. As for the evaluation of the results, it was studied how return on investment can be measured for test automation.
When a sufficient literature base had been established, the interview process was started. The aim of the interviews was divided into three parts, corresponding to the research objectives: to identify the benefits that Sectra Imaging IT Solutions Ltd wants to achieve with test automation, to verify that the proposed checklist is applicable to the current situation and, if so, to find out what modifications are needed to establish a checklist that is suitable for the company.
The data from the interviews were processed and the result was a decision tree. This tree was to be used when deciding which test cases to automate. When a set of test cases had been found, the automation process could start. During the automation process the effort spent was strictly noted; this provided data that were used when performing the return on investment calculations. The return on investment calculations were used to answer the research question: Can the checklist provided by Garousi and Mäntylä [17] be modified in such a way that it can be used to select test cases for test automation that result in economic and organisational benefits?
Figure 3.1: Overview of the research method.
Case Study
The method used in this thesis is best defined as a case study. Robson and McCartan define the term case study as: “Case study is a strategy for doing research which involves an empirical investigation of a particular contemporary phenomenon within its real life context using multiple sources of evidence” [74].
Runeson and Höst describe the stages of a case study as [75]:
1. Case study design:
Defining the objectives and planning the case study.
2. Preparation for data collection:
Evaluating which resources are available and scheduling data collection activities.
3. Collecting evidence
4. Analysis of collected data
5. Reporting
Runeson and Höst define four purposes of research: exploratory, descriptive, explanatory and improving [75]. The purpose of this case study was of the improving type, the aspect that this thesis aims to improve being the test case selection for test automation at Sectra Imaging IT Solutions Ltd.
Case studies can use both qualitative methods, such as interviews or focus groups, and quantitative methods, such as surveys [75]. Runeson and Höst state that it is preferable to perform case studies with mixed methods, that is, to use both qualitative and quantitative methods [75]. This thesis makes use of a mixed method; the qualitative methods used in this thesis are presented first, and at the end of the method chapter the quantitative methods are described.
3.1 Qualitative Methods
In this section the qualitative methods used in this thesis are explained. Interviews were used in several areas of this thesis: to identify benefits from test automation, to evaluate and modify the checklist, and to validate the checklist. This section starts by describing how interviews are conducted, before explaining the method for the actual interviews.
Interview structure
Interviews can be categorised as structured or unstructured and formalised or informalised, depending on how standardised the questions are [76]. In a structured interview, the interviewer reads out each question and notes the response on a standardised schedule [76]. A semi-structured interview is more flexible; the interviewer may change the questions and their order depending on the outcome of the ongoing interview [76]. When the interview will be used to gather data for a quantitative analysis, standardised questions are often preferred over more flexible forms of interviews; non-standardised questions, on the other hand, are useful for qualitative analysis [76].
Runeson and Höst state that interview questions can be structured in three manners, following the funnel, pyramid or time-glass model [75]. Using the funnel model, the initial questions are open and get more specific as the interview proceeds [75]. The pyramid model follows the opposite structure of the funnel model, moving from concrete to open questions, and the time-glass model starts with open questions, then moves to concrete questions and later uses open questions again [75].
Saunders, Lewis, and Thornhill suggest that it is commonly preferred to participate in an interview rather than to fill out a questionnaire [76]. This can be because the respondent does not have to write down the answers themselves, because the respondent can get feedback during the interview, and because on some occasions the respondent might not feel enough trust to give out information through a questionnaire [76]. Another situation where it is preferable to conduct an interview instead of a questionnaire is when the questions are complex and the respondent might need help to interpret them.
To get accurate answers from the respondent, the interviewer should ask open-ended questions, prepare follow-up questions and be as neutral as possible when asking questions [77]. Turner III recommends that the interviewer conduct a pilot test of the interview, to give the interviewer the possibility to refine the interview design before performing the actual interview [77]. After the interview has been conducted it should be transcribed before it can be analysed; it can also be helpful to ask the respondents to review the transcript, to give them the possibility of correcting the interpretation and changing or rephrasing their answers [75].
3.1.1 Benefits from Test Automation
In order to be able to answer the first research objective, What do practitioners believe are the
common benefits software producing companies relate to test automation?, two interviews were
held.
The interview questions were designed from Benefits and Limitations of Automated Software Testing: Systematic Literature Review and Practitioner Survey [9], discussed in section 2.5.1. The respondents were asked if they agree that test automation can have an impact on the benefits defined in [9]. The questions were formulated to be as neutral as possible and can be found in section 7.A.
After these initial questions the respondents were asked if they thought there were any other benefits of test automation. The interviews were semi-structured, as follow-up questions were asked throughout the interviews, and the questions were organised following the funnel model. The interviews were carried out with two respondents, one being Vice President of Product Development with 3.5 years of experience in that role, the other being a CI/CD Engineer with 10 years of experience.
3.1.2 Checklist Evaluation & Modification
Checklists can be used as a decision and memory aid when performing a task. A strong reason for using a checklist when making a decision or performing a task is that it can make the outcome more predictable and reliable, independently of who is performing the task [78]. Kramer and Drews define four types of checklists: 1) laundry list, 2) criteria of merit list, 3) sequential checklist and 4) flowchart/diagnostic checklist [78]. A laundry list is used to remember steps or items, a criteria of merit list is used to rate and rank items, a sequential checklist defines steps where the performance order is important, and a flowchart or diagnostic checklist is used to make decisions based on the current situation [78].
Stufflebeam describes how to create and evaluate checklists; the first five steps describe how to create a checklist. In this thesis the checklist used already exists, so the focus will be on the subsequent steps of how to review and validate a checklist. After the checklist has been created, an initial review should be made by asking potential users to judge and give feedback on the checklist [79]. When feedback has been received the checklist needs to be revised. After the initial review the checklist developer can ask potential users of the checklist to grade the categories of the checklist using a given scale, such as a Likert scale [79]. The last steps of evaluating a checklist consist of giving the checklist to users, asking them to use it in their work and provide feedback, and redesigning the checklist based on the outcome of the evaluation. Other methods that can be used for evaluating a checklist are the Delphi technique [80], interviewing experts about their opinion of the checklist [81] and evaluating the checklist through a survey [81]–[84].
In an industrial multi-case study, Usman et al. investigated how checklists for effort estimation in agile teams can be developed. This study resulted in a method that can be used to
develop and validate checklists. The method was implemented at three software companies
and the researchers used semi-structured interviews, workshops, questionnaires, metrics and
checklist usage data to carry out the steps in the proposed method [85]. The method consists
of the following five activities [85]:
1. Understand estimation context
In this step the creator of the checklist should study how the current work process is done and, while doing so, collect factors that can be used in the checklist.
2. Develop and present initial checklist
The identified factors from the previous step are put together into a checklist. The
checklist is to be presented to the agile teams and in an iterative manner it should be
modified until consensus is reached.
3. Validate checklist statically
At this stage the checklist is used for the first time by the teams. In the context of
the case study [85], that aimed to study effort estimation, the teams used the checklist
to estimate work performed in previous sprints. During this stage small changes from team members are allowed; if any large changes are suggested, they should be discussed with the whole team before being implemented.
4. Validate checklist dynamically
Now the checklist is ready to be used in everyday work. The checklist can still be modified when needed; it should not be seen as a static document.
5. Transfer and follow up
When the checklist has been used and validated dynamically, it should be reviewed and the results of its usage can be concluded. These results should be communicated to the management, to allow the checklist to become a standardised tool. After a reasonable time period a second follow-up should be made to study the usefulness of the checklist.
Figure 3.2: Overview of the method used for modifying the checklist
Checklist evaluation
In this thesis the checklist from Garousi and Mäntylä was used. It contains factors that can be used when deciding to automate and that can help with identifying what test cases to automate [17]. Garousi and Mäntylä recommend that this checklist be evaluated at the company that wishes to use it; some factors might not be applicable, and the level of importance can vary between companies [17].
The checklist was evaluated in several stages of the thesis. First it was used in an interview setting, the respondent being a test automation engineer with 6 years of experience. In this interview the respondent answered all factors in the checklist with plus and minus signs, as instructed by the authors of the checklist. Analysis of the answers could indicate whether the respondent thought that the checklist was useful at the company.
After modifications to the checklist had been made, the checklist was evaluated in two additional ways. First, the checklist was evaluated by studying the results from the interviews that aimed to modify the checklist, verifying that the respondents thought that a reasonable number of factors were important at the company. Secondly, the checklist was evaluated when it was used to select test cases, verifying that the checklist could be used.
Checklist modification
A survey was created to modify the checklist. To measure the level of agreement from the responders a Likert scale was used. The original Likert scale used the following responses (as cited in [86]): 1) strongly approve, 2) approve, 3) undecided, 4) disapprove and 5) strongly disapprove.
According to Li, Likert scales are a popular choice in research because they are easy to construct, provide numerical results and have good reliability [87]. Likert scales can be created with different numbers of scale points. A large number of scale points can confuse responders and increase the measurement error [87]. It is common for Likert scales to have 5 or 7 scale points [88]. In a 5-point scale, the points can be labelled, for example, as [87]: 1) strongly disagree, 2) disagree, 3) neither disagree nor agree, 4) agree and 5) strongly agree.
One way of evaluating the result and obtaining the central tendency from Likert scale data is by using the mean value [86]. The mean value is calculated by assigning scores to each scale point, summing the scores from all respondents and dividing the sum by the number of respondents.
In the survey, the responders looked at each factor in the checklist by Garousi and Mäntylä [17] and were asked to select a response to the statement “The following factor is important when deciding if the given situation favours test automation”. The response had to be selected from the following Likert scale points: “Disagree Strongly”, “Disagree Slightly”, “Agree Slightly”, “Agree Strongly” or “Do not know”. The responders also had the option to write down factors of their own that they thought were important when evaluating tests to automate. The last part of the questionnaire asked the responders to state some product areas that they thought were suitable for automation, to motivate their answer and to connect it to one or several of the checklist questions.
The checklist was reviewed in an interview setting, and the survey was used as an aid to carry out these semi-structured interviews, which were conducted following the pyramid model. The interviews were carried out with two respondents, one tester and one developer with 11 and 3 years of experience in their respective roles. One respondent was active in interface rich desktop and web products, whereas the other respondent was solely active in web related products.
Scores to Likert scale points
The Likert scale points were assigned scores as shown in Table 3.1. The survey was reviewed in a semi-structured interview with a total of two respondents; thus each factor could get a score between 0 and 8. For each factor in the survey the mean value of the score was calculated. The mean value for each factor could be a number between 0 and 4, where 0 meant that both respondents answered “Do not know” and 4 that both respondents answered “Agree strongly”.
Table 3.1: Scores assigned to Likert scale points

Likert scale point | Score
Do not know | 0
Disagree strongly | 1
Disagree slightly | 2
Agree slightly | 3
Agree strongly | 4
Inclusion criteria
For a factor to be included in the modified checklist, the mean value had to be greater than or equal to 3. If the mean value for a factor was below 3, it was excluded from the modified checklist.
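A minimal sketch of the scoring and inclusion rule, assuming the two responses for a factor have already been mapped to the scores in Table 3.1:

    using System;
    using System.Linq;

    // Scores follow Table 3.1: Do not know = 0, Disagree strongly = 1,
    // Disagree slightly = 2, Agree slightly = 3, Agree strongly = 4.
    int[] responseScores = { 3, 4 };    // e.g. "Agree slightly" and "Agree strongly"

    double mean = responseScores.Average();   // between 0 and 4 with two respondents
    bool included = mean >= 3.0;              // the inclusion criterion used here
    Console.WriteLine($"Mean {mean:F1}: factor is {(included ? "included" : "excluded")}");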
Example of exclusion and inclusion
In this section a few examples are shown of factors that were excluded from and added to the checklist. The examples are shown in Table 3.2. The first two rows of Table 3.2 show two factors that were excluded from the checklist, together with the reasoning that the respondents provided. The last two rows of Table 3.2 show the two factors that the respondents wanted to add to the checklist.
Table 3.2: Examples of factors from the checklist provided by Garousi and Mäntylä that were excluded from and included in the modified checklist. R1 stands for respondent one, R2 for respondent two and I for interviewer.

Id 13 | Tests require large amounts of data | Added to modified checklist: No
Interview: R1: "It depends on how the data is needed. I guess I would say disagree. A test doesn't need to have large amounts of data to be automated. If it's necessary to import much data, then only that part can be automated and then tested manually. I don't think the two are clearly related."

Id 42 | We make several releases of our product | Added to modified checklist: No
Interview: R2: "Several releases of our product, that's a yes." R2: "Perhaps it's worth considering, but it's not the most important." R1: "No, I don't think that releases are... Even if there only is one release per year, there can still be a lot of iterations that require many builds. Of course, it's a factor that influences, but I wouldn't say that it's the biggest factor."

Id 44 | The product being tested is highly customizable, i.e. has many configurations | Added to modified checklist: Yes
Interview (talking about factor 4, "SUT is a generic system, i.e. not tailor made or heavily customized system"): R1: "No, I don't think that affects automation. Because here it says 'customized'; if it had said 'customizable' I would have thought that it's a factor worth considering."

Id 45 | Developers have low knowledge in the product being tested, i.e. the product has not been developed on for a long period of time | Added to modified checklist: Yes
Interview (talking about a product): R2: "We don't sell the product anymore, but existing customers will continue using it. And people at the company have relatively low knowledge in this area. Because it has not been developed on for ages. That's also one aspect, how well developers know the product." I: "Yes, perhaps this is a factor that you would like to add?" R2: "Yes, exactly."
Decision Tree representation of Checklist
The factors from the second part of the checklist were reformulated and grouped together into a decision tree. The idea of using a decision tree was taken from Oliveira, Gouveia, and Filho [65], as described in section 2.6. In the decision tree (see appendix 7.E) similar factors were grouped together into decision points. The order of the factors in the decision tree was decided using the mean values of the factors, meaning that factors with a higher mean value were placed higher up, that is, closer to the starting point of the decision tree.
The tester was to use the decision tree by starting at a decision point, evaluating all the factors belonging to that point and deciding for each factor whether the given situation and test case are favourable for automation. The decision tree was used in this thesis to decide which tests were to be automated.
Example of how to use the decision tree
Figure 3.3 shows how the decision tree can be used on a test case. The steps shown in the figure are explained below. For example, say that a tester is considering automating a test case that tests user login functionality.

First the tester considers decision point 1, the factors in group F1. The test is deterministic, either the user gets logged in or he does not, so agree is chosen for this factor. The test result does not require human judgement, so again agree is chosen. The factors in F1 would be answered as shown in Fig 3.3; the total result from F1 is agree.

Since F1 resulted in agree, the tester moves in the right direction in the tree. The next factors to consider, in decision point 2, are the factors in the F2 group. The login functionality should be tested often, so 2.1 and 2.5 are answered with agree. But the test is not likely to reveal defects, since it can be assumed that the login functionality has been thoroughly tested in the past, so factor 2.2 is answered with disagree. The factors in F2 would be answered as shown in Fig 3.3 and the total result from F2 would be agree.

The tester again moves in the right direction of the tree and comes to an endpoint; this endpoint states that, according to the tree, the test case should be automated.
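The traversal described above can be sketched in code. The decision points, factor wording and the agree/disagree aggregation below are simplified assumptions based on this example, not the exact tree in appendix 7.E:

    using System;
    using System.Linq;

    // Simplified sketch: each decision point holds factors answered agree (true)
    // or disagree (false); if most factors agree, move right towards "automate".
    bool[] f1 = { true, true };          // e.g. deterministic result, no human judgement
    bool[] f2 = { true, false, true };   // e.g. answers to factors 2.1, 2.2 and 2.5

    static bool Majority(bool[] answers) => answers.Count(a => a) * 2 > answers.Length;

    bool automate = Majority(f1) && Majority(f2);  // both decision points result in agree
    Console.WriteLine(automate ? "Automate the test case" : "Keep the test case manual");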
Figure 3.3: Decision Tree with path marked out from the example.
3.1.3 Validation of Decision Tree by Usage on Regression Test Cases
A step in the verification of the decision tree was to study when and why the result from the tree was to not automate a test case. For this purpose a group interview was conducted. The group interview was held with two respondents, developers in a web related product with 3 and 7 years of experience, respectively. In the interview the author's usage of the decision tree for 11 regression test cases was verified and changed when needed. The interviewer started the discussions by presenting the test case; the two respondents then discussed the usage of the decision tree for the test case, without much involvement from the interviewer.
The test cases were randomly selected, in no specific order, from the regression suite. If the result from the decision tree was to not automate a test case, the respondents were asked to motivate why the test was not suitable for automation. The interview can be classified as unstructured, as the respondents had the opportunity to freely discuss the usage of the tree without rigid guidelines.
The decision tree was also used on one test case for an interface rich desktop product. The
result from the usage of the decision tree on this product was verified in an informal meeting
with one tester with 11 years of experience in his role. In total the usage of the decision tree
was verified on 12 test cases.
3.1.4 Automated Tests Implementation
As a part of gathering data for the ROI calculations, a set of manual test cases was automated. The reasoning was that it is hard to estimate how much time it takes to automate a test case, and such estimations are likely to be inaccurate. The decision tree was used on several regression test cases from areas identified in previous interviews (see section 3.1.2). The tests that were considered for automation by the tree were discussed in informal meetings with testers and developers. The tests were coded in C# using the test automation tool Gauge 1 and Selenium 2 for browser automation. After the tests had been implemented, they were reviewed in two steps, first by a developer/tester and later by a test automation engineer. This was done to make sure that the tests had been implemented correctly according to the code standards at the company, and to verify that the tests perform the necessary steps as described in the manual test cases.
1 https://gauge.org/
2 https://www.seleniumhq.org/
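As an illustration of what such a test can look like in principle, the fragment below pairs a Gauge step implementation with Selenium's C# bindings. The step texts, element ids and URL are invented for this sketch; it is not code from the company's test framework.

    using System;
    using Gauge.CSharp.Lib.Attribute;
    using OpenQA.Selenium;
    using OpenQA.Selenium.Chrome;

    public class LoginSteps
    {
        // One shared browser session for the steps in this sketch.
        private static readonly IWebDriver Driver = new ChromeDriver();

        [Step("Log in as <user> with password <password>")]
        public void LogIn(string user, string password)
        {
            Driver.Navigate().GoToUrl("https://example.test/login"); // hypothetical URL
            Driver.FindElement(By.Id("username")).SendKeys(user);    // hypothetical ids
            Driver.FindElement(By.Id("password")).SendKeys(password);
            Driver.FindElement(By.Id("login-button")).Click();
        }

        [Step("The start page is shown")]
        public void VerifyStartPage()
        {
            // Fail the step if the expected element is not present after login.
            if (Driver.FindElements(By.Id("start-page")).Count == 0)
                throw new InvalidOperationException("Start page was not shown");
        }
    }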
3.2 Quantitative Methods
In this section the quantitative methods used in this thesis are explained. The first quantitative method described is return on investment (ROI). The second is surveys; in this thesis a survey was used to validate the decision tree. The survey section first describes how surveys are conducted and afterwards how this survey was conducted.
3.2.1 ROI
The return on investment was calculated for the tests described in section 3.1.4. To calculate the return on the automation investment the formula by Hoffman was used; it is described in section 2.7.4. The formula is presented below:
ROI(in time t) = ∆Ba / ∆Ca = ∆(Benefits from automation over manual) / ∆(Costs of automation over manual)

3.1: ROI formula for test automation by Hoffman [18].
∆Ba being the incremental benefits from automated over manual testing and is defined as:
¸ (variable costs of maintaining manual tests) (n /N )
¸
+ (variable costs of running manual tests n times during time t)
¸
(variable costs of running automated tests n times during time t)
∆Ba (in time t) =
2
2
2
1
Note that the first variable, the variable costs of maintaining manual tests, and N2 are not included in the original formula from Hoffman. N2 is defined as the average number of runs for manual tests before maintenance is needed. This variable was added to give a more realistic estimation of the costs for manual testing; at Sectra Imaging IT Solutions Ltd the manual test scripts are frequently updated, and this cost needs to be accounted for in the return on investment. One could argue that not having to maintain manual tests is an improvement in the fixed costs of automated testing, but the calculations are easier to follow and understand if this is added as a new variable.
∆Ca being the incremental costs of automated over manual testing and is defined as:
¸ (variable costs of creating automated tests)
¸
(variable costs of creating manual tests)
¸
+ (variable costs of maintaining automated tests) (n /N )
∆Ca (in time t) =
1
Note that the improvement in fixed costs of automated testing in ∆Ba and the increased fixed costs of automated testing in ∆Ca, both included in the original formula and shown in section 2.7.4, were set to zero in the calculations, since none of these costs could be identified for this project.
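A sketch of the modified calculation is shown below, with the added manual-maintenance term and the fixed-cost terms set to zero. The numeric values are placeholders, not the logged data from this thesis:

    using System;

    // Modified Hoffman ROI as used in this thesis: fixed-cost terms are zero and a
    // manual maintenance term, weighted by (n2 / N2), is added to the benefits.
    double n1 = 60, n2 = 60;      // automated resp. manual executions in time t (assumed)
    double N = 30;                // automated runs before maintenance is needed
    double N2 = 15;               // manual runs before the manual script needs updating
    double maintainManual = 2.0;  // hours per manual maintenance occasion (assumed)
    double runManual = 3.0;       // hours per manual execution (assumed)
    double runAutomated = 0.2;    // hours per automated execution (assumed)
    double createAutomated = 20.0, createManual = 6.0, maintainAutomated = 4.0; // hours

    double deltaBa = maintainManual * (n2 / N2)
                   + runManual * n2
                   - runAutomated * n1;
    double deltaCa = createAutomated - createManual
                   + maintainAutomated * (n1 / N);
    Console.WriteLine($"ROI = {deltaBa / deltaCa:F2}");  // above 1: automation pays off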
The primary reason for choosing this formula is that it is suitable for calculating the ROI of small projects and isolated tests. In this thesis three test cases were automated in two product areas, and the ROI was calculated for each test case and for the whole project. Another reason for using this formula is that initial costs of test automation, such as tool costs and staff training, do not need to be considered. Although these costs are relevant when calculating an overall return on investment for test automation, they are not reasonable to consider when calculating the ROI of a smaller project.
3.2.2 Validation of Benefits from Test Automation
Surveys are commonly used to describe or explain a phenomenon, the main advantage being the possibility to analyse data from many participants [76], [89]. Dillman defines three data variables that can be captured in surveys: opinion, behavioural and attribute variables [90]. Opinion variables capture respondents' beliefs, behavioural variables allow the actions of the respondents to be studied, and attribute variables are used to study attributes of the respondents themselves [90]. It is often preferred to have survey questions that are closed and standardised, as this allows for easier analysis of the collected data [76], [89]. The guidelines for surveys are well summarised by Saunders, Lewis, and Thornhill, and similar thoughts are described by Kelley et al., these being [76], [89]:
• Careful design of individual questions,
• Clear and pleasing layout of questionnaire,
• Lucid explanation of the purpose of the questionnaire,
• Pilot testing,
• Carefully planned and executed administration.
When collecting data from surveys it is of special importance to record the reasons why some respondents choose not to participate, to ensure that the result has not been biased by nonresponders [89], [91]. Furthermore, information about how the survey was administered, how respondents were approached and the response rate should be recorded by the researcher [89].
The organisational benefits that could potentially be achieved with the implementation of automated tests were reviewed with a survey. The responders were asked if the automated tests that were implemented in this thesis could result in the organisational benefits that had been found (presented in Table 4.1). The manual test case that was the basis for the automation, and the implemented code, were sent to the responders so that they could review these documents when answering the survey. The survey can be found in section 7.F. The question asked for each benefit was “The implementation of the automated test does to some degree allow for:”. The formulation is relatively weak, since it can be difficult to see the benefits from only one automated test; the benefits are likely to be more prominent when a sizable automated test suite exists.
The evaluation of the survey was the same as used in the interviews for modifying the checklist (see section 3.1.2). That is, the Likert scale points were assigned values from 0 to 4; if the mean value was greater than or equal to 3, agreement was considered to be found.
4 Results

4.1 Qualitative Results
In this section the qualitative results are presented; qualitative data comes in words, pictures and diagrams [91]. The qualitative results presented in this chapter have been collected through interviews with the aim to address research objectives one, three and four: to find which benefits practitioners want to achieve with test automation, and to evaluate and modify the checklist provided by Garousi and Mäntylä.
4.1.1 Benefits of Test Automation
To answer the first research objective, What do practitioners believe are the common benefits software producing companies relate to test automation?, two interviews were held. Table 4.1 presents the benefits that were identified from the interviews; there was no disagreement between the respondents on which benefits test automation has.
Table 4.1: Results from interviews on benefits of test automation.

Benefit of test automation | Definition by Rafi et al. [9] | Agreement from responders
Improved product quality | Quality in terms of fewer defects present in the software product. | Yes
Increased test coverage | High coverage of code (e.g. statement, branch, path) is achieved through automation. | Yes
Reduced testing time | Time required for testing, i.e. the ability to run more tests within a timeframe. | No
Increased test reliability | AST is more reliable when repeating tests, as variance in outcomes can be due to the manual tester running the tests in a different way, but it cannot make use of the knowledge of the tester. | Yes
Increase in confidence | Increase of confidence in the quality of the system (e.g. as perceived by developers). | Yes
Reusability of tests | When tests are designed with maintenance in mind they can be repeated frequently; a high degree of repetition of test cases leads to benefits, not a single execution of an automated test case. | Yes
Less human effort | Automation reduces human effort that can be used for other activities (in particular ones that lead to defect prevention). | Yes
Reduction in cost | With a high degree of automation costs are saved. | No
Shorter release cycles | Test automation is a prerequisite for continuous integration and will allow for shorter release cycles. [3], [14], [92] | Yes
Excerpts from the interviews are shown for the following two factors; the excerpts were chosen to show how the responders reasoned when disregarding a benefit. The two benefits are shown in Table 4.1.
Reduced testing time
When asked if automated testing will reduce the time spent on testing, one responder answered: “I think that if automated tests cover a large part of the regression, integration tests and so on, then less time could be spent on manual testing. But the manual testing that remains will be more qualitative. If we consider testing to verify quality, I don't think the total test time will be changed, but there will be a shift in how much manual testing is carried out. More time will be spent on developing automated tests.” The other responder agreed and expressed similar thoughts.
Reduction in cost
The responders were asked: Does test automation affect the costs of testing? The first responder answered: “It will not lower the cost of testing, due to automation being an investment. Initially it might increase, but with time I think we will get to a similar level. But with an increase in value. If we achieve more value, the cost per unit of value will be decreased.” The second responder answered: “Yes, it does. Sometimes you might think that it is only an additional cost. I would say yes, it is an additional cost. But test automation will allow for faster development of the products and comes with increased quality and confidence in the product.”
4.1.2 Checklist Evaluation
The checklist was evaluated in three different phases of the thesis. First, before any modifications had been made to the checklist, an interview was carried out, see section 3.1.2. In this interview the checklist was used at the company, and from analysis of the answers the conclusion could be drawn that the respondent thought that most factors in the checklist were suitable at the company.
Secondly, from the results of the interviews that aimed to modify the checklist, conclusions can be drawn about whether the checklist is applicable to the company. As shown in Table 4.2, 15 out of 43 factors were removed from the checklist. This indicates that a clear majority (65%) of the factors in the checklist are considered important when making decisions regarding test automation at the company.
The last evaluation of the checklist, by then a decision tree, took place when it was used on regression test cases, see section 4.1.4. At this point the decision tree had been used on 12 test cases, with the result that 8 of the test cases should be automated, 3 test cases should not be automated and one test case should be partly automated. The factors in the decision tree could thus be used to select test cases for automation.
Hence the third research objective, Is the checklist provided by Garousi and Mäntylä [17] applicable in an industrial setting to achieve test automation?, can be answered positively: the checklist is applicable at the company.
4.1.3 Modifications in Checklist
In this section the modifications to the checklist are presented. The results shown in Table 4.2 answer the fourth research objective, What modifications are required to the checklist provided by Garousi and Mäntylä [17] to make it applicable to Sectra Imaging IT Solutions Ltd? The modifications consisted of 15 factors being removed from the checklist; some of the removed factors, and the reasoning why, are shown in Table 3.2. 28 factors were included in the modified checklist, and two factors, 44 and 45, were added to the modified checklist by the interview respondents. Note that factor 11 (“Tests are Unit tests”) got a mean value that should have included it in the modified checklist, but due to the delimitations, section 1.5, it was not included. Table 4.2 presents the modifications made to the checklist.
Table 4.2: Results from modification interviews of the checklist.

Id | Factor from checklist by Garousi and Mäntylä | Mean value score | Included in modified checklist
1 | SUT or the targeted components will experience major modifications in the future. | 2.5 | No
2 | The interface through which the tests are conducted is unlikely to change. | 2 | No
3 | SUT is an application with a long life cycle. | 3.5 | Yes
4 | SUT is a generic system, i.e. not tailor made or heavily customized system. | 1.5 | No
5 | SUT is tightly integrated into other products, i.e. not independent. | 2.5 | No
6 | SUT is complex. | 2.5 | No
7 | SUT is mission critical. | 4 | Yes
8 | Frequent regression testing is beneficial or essential. | 3.5 | Yes
9 | Tests are performance and load tests. | 3.5 | Yes
10 | Tests are smoke and build verification tests. | 3 | Yes
11 | Tests are Unit tests. | 3.5 | No
12 | There are large numbers of tests that are similar to each other. | 2.5 | No
13 | Tests require large amounts of data. | 2.5 | No
14 | Humans are likely to make errors when performing and evaluating these tests, e.g. tests require vigilance in execution. | 3.5 | Yes
15 | Computers are likely to make errors when performing and evaluating these tests, e.g. test execution is not deterministic. | 4 | Yes
16 | Tests can be reused as part of other tests. | 2.5 | No
17 | Tests need to be run in several hardware and software environments and configurations. | 2.5 | No
18 | The lifetime of the tests is high. | 3 | Yes
19 | The number of builds is high. | 3 | Yes
20 | Tests are likely to reveal defects, i.e. high risk areas. | 3.5 | Yes
21 | Tests cover the most important features, i.e. high importance areas. | 4 | Yes
22 | Test results are deterministic. | 4 | Yes
23 | Test results require human judgement. | 4 | Yes
24 | Automated comparison will be fragile leading to many false positives. | 4 | Yes
25 | Tests are instable, e.g. due to timing. We must perform the test repeatedly and if it passes above a threshold we consider that the test passes. | 4 | Yes
26 | Tests are instable, e.g. due to timing. The results cannot be trusted at all. | 4 | Yes
27 | We have experimented with the test automation tool we plan to use and the results are positive. | 3 | Yes
28 | A suitable test tool is available that fits our purpose. | 3 | Yes
29 | We have decided on which tool to use. | 2.5 | No
30 | We can afford the costs of the tool. | 3 | Yes
31 | Our test engineers have adequate skills for test automation. | 2.5 | No
32 | We can afford to train our test engineers for test automation. | 3 | Yes
33 | We have expertise in the test automation approach and tool we have chosen. | 2.5 | No
34 | We are currently under a tight schedule and/or budget pressure. | 2 | No
35 | We have organizational and top management support for test automation. | 3 | Yes
36 | There is a large change resistance against software test automation. | 2 | No
37 | We have the ability to influence or control the changes to SUT. | 3.5 | Yes
38 | There are economic benefits of test automation. | 3 | Yes
39 | Tests are easy and straightforward to automate. | 3 | Yes
40 | Test results are easy to analyze automatically. | 4 | Yes
41 | Test automation will require a lot of maintenance effort. | 3 | Yes
42 | We make several releases of our products. | 1.5 | No
43 | Our software development process requires test automation to function efficiently, for example agile methods. | 3 | Yes
44 | The product being tested is highly customizable, i.e. has many configurations. | N/A | Yes
45 | Developers have low knowledge in the product being tested, i.e. the product has not been developed on for a long period of time. | N/A | Yes
The factors that remained after the modifications were divided into two groups, one consisting of factors to consider before starting an automation process and the other consisting of factors to consider when deciding whether to automate a test. The factors in the second part of this checklist were reformulated, edited and put into a decision tree (see appendix 7.E). The decision tree, see Fig 4.1, was to be used when evaluating a test case for automation.
4.1.4 Usage of Decision Tree on Regression Test Cases
Of the 12 test cases described in section 3.1.3, the result from the decision tree was to fully automate eight test cases and to not automate three test cases. For one of the test cases the tree was used on several subsets of the steps in the test case, each subset testing a different area in the product. This test case contained 19 steps; with usage of the tree it was concluded that 7 steps could be automated while 12 steps were recommended to not be automated, that is, roughly 40% of the test case was recommended for automation.
The test cases were classified based on test type: of the twelve test cases, one was considered to be a smoke test, eight were sanity tests, two were scenario tests and one test case was mixed, containing steps related to smoke, sanity and scenario testing.
4.1.5 Automated Tests
To gather data for the ROI calculations three manual test cases were automated. The tests were selected by using the decision tree on several regression test cases from areas identified in the interviews for modifying the checklist (see section 3.1.2). The tests were then discussed in informal meetings with testers and developers. With this strategy three manual test cases were chosen for automation. Two of the test cases were considered to be sanity tests and the third was considered to be a scenario test. Two of the test cases were in a web product, one sanity and one scenario test; the third test case, a sanity test, was found in an interface rich desktop product. The test cases contained between 8 and 24 steps, which were to be automated. The tests were written in C#, and the test automation framework that the tests were coded in used Gauge and Selenium. The data used in the ROI calculations that come from the automated tests, such as the time to automate the tests, are presented in Table 4.3 in section 4.2.1.

Figure 4.1: Decision Tree, for full size version see appendix 7.E
4.2 Quantitative Results
In this section the quantitative results are presented; quantitative data are numbers and classes [91]. The quantitative results found in this thesis come from return on investment calculations and from a survey measuring the organisational benefits of test automation.
4.2.1 ROI
This section answers the second research objective, How can economic benefits be measured for test automation? In the literature study, see section 2.7, it was found that return on investment is the most common way to measure the economic benefits of test automation. In this thesis a formula created by Hoffman was used; for a description of the formula see section 3.2.1.
The return on the investment depends on how long the tests can run. For the whole project it was found that a positive ROI is achieved after 3.3 release cycles or 1.7 years, as a release cycle at Sectra Imaging IT Solutions Ltd is 6 months. The ROI of the automation project is shown in Figure 4.2.
Figure 4.2: ROI of automation project (ROI plotted against releases of 6 months each; the project total reaches a positive ROI after 1.7 years).
The return on investment differed greatly between the two products; the results are presented in Figure 4.3. For the interface rich desktop product a positive ROI was achieved from the first usage of the tests, whereas for the web product a positive ROI was found after around 4 years: the ROI for test A was positive at 3.7 years, and for test B a positive ROI was achieved after 3.85 years.
Figure 4.3: ROI for individual test cases (ROI plotted against releases of 6 months each; break-even is reached at 0.5 years for the interface rich desktop product, 1.7 years for the project total, 3.7 years for web product test A and 3.85 years for web product test B).
The reason the ROI differed so much is that the maintenance needs and costs were estimated to be much higher in the web product than in the interface rich desktop product; the maintenance cost was estimated to be more than 10 times higher in the web product. The variables used in the formula are presented in Table 4.3.
Table 4.3: Data used in ROI calculations

Data | Interface Rich Desktop Product | Web Product Test A | Web Product Test B | Unit | Estimated/Logged
How much time does it take to create the automated test | 16.69 | 25.36 | 19.38 | hours | logged
How much time does it take to create an equivalent manual test | 8 | 6 | 4 | hours | estimated
How much time does it take to run an equivalent manual test | 3 | 3.33 | 3 | hours | logged
How long can the automated test run without maintenance | 24 | 6 | 6 | months | estimated
How much time does it take to maintain the automated test | 2 | 6 | 6 | hours | estimated
How long can the manual test run without maintenance | 6 | 6 | 6 | months | logged
How much time does it take to maintain the manual test | 2 | 2 | 3 | hours | estimated
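To connect Table 4.3 to the curves in Figures 4.2 and 4.3, the sketch below evaluates the modified Hoffman formula release by release for the interface rich desktop test. The number of test executions per release is an assumption (it is not given in the table), the cost of an automated run is treated as negligible, and the maintenance terms are counted in calendar time since the table states maintenance intervals in months; the exact break-even point therefore differs from the reported figures.

    using System;

    // Release-by-release ROI for the interface rich desktop test (values from Table 4.3).
    double createAuto = 16.69, createManual = 8, runManualHours = 3;
    double autoMaintIntervalMonths = 24, maintainAutoHours = 2;
    double manualMaintIntervalMonths = 6, maintainManualHours = 2;
    double runsPerRelease = 10;       // assumed; not stated in Table 4.3
    double runAutomatedHours = 0.0;   // assumed negligible machine time

    for (int release = 1; release <= 8; release++)
    {
        double months = release * 6;          // a release cycle is 6 months
        double n = runsPerRelease * release;  // executions so far (manual = automated)
        double deltaBa = maintainManualHours * (months / manualMaintIntervalMonths)
                       + runManualHours * n
                       - runAutomatedHours * n;
        double deltaCa = createAuto - createManual
                       + maintainAutoHours * (months / autoMaintIntervalMonths);
        Console.WriteLine($"Release {release}: ROI = {deltaBa / deltaCa:F2}");
    }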
4.2.2 Validation of Benefits from Test Automation
As a part of answering the research question, Can the checklist provided by Garousi and Mäntylä [17] be modified in such a way that it can be used to select test cases for test automation that result in economic and organisational benefits?, a survey was sent out to three responders to study whether the automated tests could lead to organisational benefits. The results from the survey are presented in Table 4.4. The table shows that the test named “Web Product A” can lead to six benefits, “Web Product B” to seven benefits and “Interface Rich Desktop Product” to four benefits.
Table 4.4: Results from survey evaluating organisational benefits of automated tests.
Factor ("The implementation of the automated test does to some degree allow for:") | Web Product A (Agree / Mean value) | Web Product B (Agree / Mean value) | Interface Rich Desktop Product (Agree / Mean value)
Improved product quality | No / 2.7 | No / 2.7 | No / 1.7
Increased test coverage | No / 1.7 | No / 1.7 | Yes / 3
Reduced testing time | Yes / 4 | Yes / 4 | No / 2.3
Increased test reliability | No / 2.3 | Yes / 3.3 | No / 2
Increase in confidence | Yes / 3.3 | Yes / 3.7 | No / 2.3
Reusability of tests | Yes / 3.7 | Yes / 3.7 | No / 2.7
Less human effort | Yes / 4 | Yes / 3.7 | Yes / 3.7
Reduction in cost | Yes / 3.3 | Yes / 4 | Yes / 3.7
Shorter release cycles | Yes / 3 | Yes / 3 | Yes / 3
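The Agree/mean pattern in table 4.4 is consistent with a simple aggregation over the survey's four-point agreement scale (the same style of scale as in the checklist survey, appendix 7.B): a benefit counts as present when the mean response is at least 3. A small Python sketch of that aggregation follows; the cut-off is inferred from the reported numbers rather than stated explicitly, so it should be read as an assumption.

# Sketch of the aggregation that appears to underlie table 4.4. Responses
# use a 4-point scale (1 = disagree strongly ... 4 = agree strongly);
# the >= 3 cut-off for "Yes" is inferred from the reported data.
from statistics import mean

def summarize(responses):
    """Return (agree, mean value) for one benefit of one automated test."""
    m = mean(responses)
    return ("Yes" if m >= 3 else "No", round(m, 1))

# Hypothetical ratings from three responders:
print(summarize([4, 4, 4]))  # ('Yes', 4)
print(summarize([3, 2, 3]))  # ('No', 2.7)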
5 Discussion
The discussion is divided into six subsections: first the results and the method are discussed; afterwards internal and external validity, reliability, and ethical and societal aspects are considered.
5.1 Results
In this section the results found in this thesis are analysed and discussed. The section is organised under the same section names as used in the results chapter.
5.1.1 Benefits of Test Automation
From the literature study, 8 benefits of test automation were identified, see section 2.5. Out of these 8 benefits, the 2 responders in the interviews agreed with 6, and one benefit (shorter release cycles) was added by the responders. The two benefits where the responders did not agree with the literature were reduced testing time and reduction in cost. When discussing reduced testing time with the responders, it was clear that some ideas from the literature were brought up. The responders agreed that the time for test execution could be reduced with automation, as suggested by Amannejad et al. [16]. The responders also thought that test automation can find bugs earlier in the development process, which [53] and [52] argue will reduce testing cost. Nonetheless, the responders thought that when looking at the big picture, test automation will not result in reduced testing time and cost, but rather in a shift in cost and testing activities.
Benefits of test automation were also identified in other interviews throughout this thesis. Three benefits that were mentioned in other interviews, but not in the ones about benefits of test automation, were: "testing low risk areas", "testing product areas where the company has a low level of knowledge" and "not having to perform time consuming and difficult setup when regression testing". One reason why different benefits were mentioned can be the variety in roles that the responders had: the benefits have been expressed by testers, developers, CI/CD engineers and management, which provides a broad set of perspectives on the subject.
An unexpected result is that 2 of the 3 tests that were automated in this thesis were thought to lead to reduced testing time, and all 3 tests were thought to lead to a reduction in cost. This result comes from the survey that was sent out to 3 responders (see section 4.2.2).
Table 5.1: Factors found in literature that are not included in the decision tree.

Factor | Definition
Reuse | Can this test or parts of it be reused in other tests? [65]
Porting | How portable is this test, i.e. can the test run on several environments? [65]
Large input | Does the test require many data combinations using the same test steps (i.e., multiple data inputs for the same feature)? [38]
Large output | Is the test very time-consuming, such as expected results analysis of hundreds of outputs? [38]
Test Portability | Does the test need to be verified on multiple software and hardware configurations? [38]
Note that these are the two benefits that the responders in the "Benefits of test automation" interviews thought were not likely to come with test automation. One reason for the discrepancy between the survey and the interviews can be that different respondents were asked to participate in the survey and the interviews; the discrepancy may simply come from disagreement about which benefits test automation provides. It is also important to note that in the survey the responders were asked to identify benefits from specific automated tests, whereas in the interviews the responders talked about test automation in a general and overall scope, which could very well influence the answers.
5.1.2 Modifications in Checklist
After the inclusion process had been completed, the remaining factors were divided into two checklists. The first checklist (see appendix 7.C) contained factors to be considered before starting the automation process, answering the question "Are we in a position where test automation is possible?". The second checklist (see appendix 7.D) contained factors to be considered when deciding whether to automate a test or not, hence answering the question "Is this test suitable for automation?". The factors from the second checklist were used in the decision tree.
The reason for choosing a decision tree is that factors can be prioritized depending on how close to the top of the tree they are placed and on which previous decision points have been passed to reach the current factor. The factors were prioritized with the scores from the interviews in mind. The majority of the tree follows the scores from the interviews, but in some cases exceptions were made to obtain a valid grouping of factors and structure in the tree. Another advantage of using a decision tree is that the tester gets a clear result and it is easy to summarize the outcome of the factors considered.
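To make this idea concrete, the following minimal Python sketch walks hypothetical decision points in priority order, where a failed point near the root settles the question before lower-priority factors are considered. The decision-point names loosely mirror appendix 7.E, but the grouping and the pass criterion are simplified placeholders, not the thesis's actual tree.

# Sketch of how a decision tree yields a definitive automate/do-not-automate
# answer: decision points are checked in priority order and the first failing
# point short-circuits the walk. Grouping and pass criterion are simplified.

def evaluate(test_case, decision_points):
    for name, factors in decision_points:
        agreed = sum(1 for f in factors if test_case.get(f, False))
        if agreed <= len(factors) / 2:  # placeholder pass criterion
            return f"Do not automate (stopped at {name})"
    return "Automate"

tree = [
    ("F1: test oracle and stability", ["deterministic_results", "no_human_judgement"]),
    ("F2: importance and economy", ["economic_benefit", "covers_important_features"]),
    ("F3: automatability", ["easy_to_automate", "low_maintenance_effort"]),
]
candidate = {"deterministic_results": True, "no_human_judgement": True,
             "economic_benefit": True, "covers_important_features": True,
             "easy_to_automate": True, "low_maintenance_effort": False}
print(evaluate(candidate, tree))  # Do not automate (stopped at F3: automatability)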
When comparing the decision tree in this thesis to the checklists from the literature, many similarities are found. Some of the factors covered in the literature that are not present in the decision tree are shown in table 5.1.
The main reason that these factors were not added to the decision tree is that most of them are not included in the checklist by Garousi and Mäntylä [17] and hence were not considered when modifying the checklist. It is worth noting that the responders had the possibility to add factors if they thought it was needed. The factor "Large input" is included in the checklist by Garousi and Mäntylä, but in the interviews the responders thought that it was not of high importance when making decisions about test automation.
5.1.3 Usage of Decision Tree on Regression Test Cases
The decision tree was used on regression test cases: 11 tests of the web-related product and 1 test of the interface rich desktop product. The reason for not using the decision tree on more test cases is that verifying that the usage of the decision tree was correct was a time consuming process: first the author of this thesis used the decision tree on a test case, and the result then had to be verified by testers or developers at the company. The advantage of using the tree on regression tests is that these are likely to be the tests that are automated first. Ramler and Wolfmaier state that automated tests best address the regression risk, i.e. the risk that new defects are introduced after changes in the software [62]. In a survey it was found that 50% of the 36 industry respondents performed equal amounts of manual and automated regression testing and only 30% performed manual regression testing [21]. In the same study the authors mention that the respondents did not use any systematic approach to select test cases for regression testing; instead judgment and experience were used [21]. For this reason it can be argued that it was a good choice to evaluate the decision tree on regression tests, since there seems to be a need for a systematic approach to select which regression tests to run and possibly also which to automate.
However, a disadvantage of using the decision tree on regression tests is that many of these at Sectra Imaging IT Solutions Ltd are sanity tests, which are suitable for automation. One reason for evaluating the decision tree on these test cases was to find out why the tree would result in a recommendation not to automate a test case. It is possible that if the tree had been used on another set of test cases, more test cases would have been recommended against automation, which would have given more data for analysing this situation. In the evaluation only 3 test cases had the result that they should not be automated, which might be too small a data set for drawing justifiable conclusions.
5.1.4 ROI
First it is important to note that the ROI calculations assume that manual tests are executed once every six months. The test cases that were included in the ROI calculations are regression tests, and at Sectra Imaging IT Solutions Ltd these typically run once per release cycle. Automated tests are commonly run much more frequently; it is reasonable to assume that the automated tests would run once per day, but this benefit is not present in the ROI calculations. If the frequency of regression testing were increased, a positive ROI would have been reached in a shorter time period for the automated tests.
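A back-of-the-envelope sketch of this frequency argument: if one automated test replaces k manual executions per release instead of one, the savings per release grow and break-even arrives correspondingly sooner. The cost model and all numbers are hypothetical, matching the simplified sketch after table 4.3 rather than the thesis's actual calculation.

# Hypothetical illustration: break-even arrives sooner when the automated
# test replaces more manual executions per release. Not the thesis's model.

def break_even_releases(runs_per_release, auto_create=20.0,
                        auto_per_release=1.0, manual_per_run=6.0):
    saved_per_release = manual_per_run * runs_per_release - auto_per_release
    return auto_create / saved_per_release

for k in (1, 2, 4):
    print(f"{k} manual run(s) replaced per release -> "
          f"break-even after ~{break_even_releases(k):.1f} releases")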
The result of the ROI calculations (see section 4.2.1) is that the automation project would achieve a positive ROI after 1.7 years or 3.3 release cycles, which is equivalent to 3.3 test executions. This result is not far from what was achieved in Visual GUI Testing: Automating High-level Software Testing in Industrial Practice [93], in which the author performed Visual GUI Testing automation in three projects; the mean ROI value for the three projects was 2.3 executions [93]. Similar results are reported by Amannejad et al. in 2014, where the authors found that when only automating test execution a positive ROI was reached after 3 test executions [16].
The individual tests differ greatly in maintenance cost in the ROI calculations; however, this is not unexpected considering the differences between the products. One of the products being tested is an interface product, which is very stable, and changes are not frequently made to it, whereas the other product is a web product, where the tests are performed in several areas of the GUI and changes occur much more frequently.
5.1.5 Validation of Benefits from Test Automation
The result from the survey is shown in table 4.4. None of the tests is thought to increase product quality, where quality is defined as fewer defects present in the software. In the "Benefits of test automation" interviews the responders considered this to be a benefit of test automation; they thought that test automation can find smaller defects, typically found in regression testing. It is not unexpected that the automated tests were not considered to increase quality. The view that automated tests are useful for verification rather than for finding defects is present in the literature [14], and Kaner argues that bugs found from automated tests are found when developing the tests rather than when running them [94].
This was the only benefit that was not found for any of the tests. The other benefits were found in some tests but not in all. Overall, fewer benefits were found for the tests of the Interface Rich Desktop Product than for the Web Product. This result can be explained by the fact that testing the Interface Rich Desktop Product is easier, both manually and automatically. For instance, the reliability will not increase when testing this product automatically, since there already exists a high level of trust in the manual testers performing these tests. Three benefits were found to be present in all three automated tests: less human effort, reduction in cost and shorter release cycles.
5.2 Method
In this section the methods used in this thesis are discussed. The section is organised by the methods used, and within each subsection the different usages of the method are discussed.
5.2.1 Interview
The interview questions, which were handed out to responders, as well as the survey used in the thesis, were written in English. This could potentially have been a problem, since English was not the responders' native language. The reason for having the questions in English was to avoid translating technical terms found in the literature. Some technical terms that are not commonly used at Sectra Imaging IT Solutions Ltd were changed to simplify for the responders; these terms were identified in informal meetings with the supervisor where the questions and the layout of the survey and interviews were discussed. The responders had a good level of English, and having the questions in English did not seem to be a problem in the interviews.
The interviews were held in Swedish, which was the native language of both the responders and the interviewer. All interviews were transcribed, and the transcriptions were sent to the responders, as recommended by Runeson and Höst [75]. The texts from the interviews that have been used in the report were sent to the responders for their approval, partly to make sure that the translation was correct, but also to assure that the responders approved that the text could be published in the report. There were no nonresponders to the interviews or the survey in this thesis.
The organisational benefits of test automation were found mainly from one study, namely Rafi et al. [9]. There exists a risk that there are other benefits of test automation that have not been identified by Rafi et al. This risk has been mitigated by reviewing several other sources of benefits of automation, such as books and studies published after 2012 (the publishing year of the SLR by Rafi et al.).
Only two interviews were held to verify the benefits from the perspective of practitioners, and both interview responders were selected from the same company, Sectra Imaging IT Solutions Ltd. The conclusion from these interviews is that practitioners at Sectra Imaging IT Solutions Ltd believe there primarily exist 7 benefits of test automation (presented in table 4.1), but no general conclusions about the benefits of test automation can be drawn from the interviews. Nevertheless, it is reasonable to assume that similar results will be found at other companies, since the benefits have been identified in several published papers.
When answering the third research objective, one interview that was used had a different purpose than the research objective. In that interview (see section 3.1.2) the checklist was used to verify that test automation was possible at the company, while the research objective was to evaluate whether the checklist was applicable at the company. The answers from the interview were analysed and the conclusion could be drawn that most questions were applicable. There still exists a possibility that the result would have changed if the purpose of the interview had been to verify the usefulness of the checklist; perhaps some questions could have been removed directly from the checklist before conducting the second set of interviews. In the end the result would likely have been the same, since more interviews with the aim to review the checklist were carried out. Furthermore, several actions were taken to evaluate the applicability of the checklist, as described in section 4.1.2.
When conducting the interviews for modifying the checklist, it was clear that the questions were complex and not always easy to answer right away. For this reason, it can be argued that it was suitable to conduct interviews instead of handing out a questionnaire to the respondents: in an interview setting the respondents have the possibility to ask the interviewer what a question is aiming at, and it can be discussed in the interview.
Before the interviews were conducted, a small review had been made with one worker at the company. However, there was no complete pilot test of the interview, and when a factor could have several definitions it was decided that the interviewer and the respondent should come to an agreement on a definition in the interview. As a result, a few of the factors were given different definitions in the two interviews, which led to varied prioritizing of the factors by the respondents. In some cases this was noted in the coding of the interview, and actions could be taken to prevent a factor from being left out of the checklist for this reason. It is possible that some factors might have been dropped from the modified checklist due to differences in definitions. This does not have to be a disadvantage: if the respondents had difficulties understanding the purpose of a factor, it is likely that a tester using the checklist would have the same problem. If the checklist contains ambiguous factors it might give different results when used by different individuals; the goal is that the checklist always gives the same answer, independent of who is using it.
One example of a factor that was dropped due to differences in definitions is factor 12. Even though the factors were discussed with the respondents and the data was analysed, it would have been better if a pilot test of the interview had been carried out. Turner III states that a pilot test is necessary to refine the interview design before conducting the actual interviews [77]. If a pilot test had been used before carrying out the real interviews, ambiguous factors could have been removed or reformulated to be easier to understand.
A more serious problem with these interviews was that both respondents had trouble interpreting the context question. In the questionnaire the context question "The following factor is important when deciding if the given situation favours test automation" was asked for each factor in the checklist. But the factors in the checklist contained statements as well, and for the respondents it was difficult to keep track of which statement was being discussed. One respondent openly stated that it was hard to follow the interview design. The other respondent selected a Likert scale point for a factor in the questionnaire that showed the opposite opinion to the one he had stated when discussing the factor; the interviewer asked this respondent if he was sure his answer was correct, which it was not, and the answer could be changed. This problem could easily have been prevented with a pilot test. If the context question had been better formulated, the interviews would have been easier to participate in and would have taken less time to conduct.
5.2.2 Survey
In this thesis one survey was used to validate the decision tree. The aim was to investigate whether the automated tests implemented with the aid of the decision tree could provide organisational benefits for the company. The survey that was used can be found in appendix 7.F. It was sent out to three responders, as described in section 3.2.2. Two of the responders thought that it was difficult to evaluate whether tests from other product areas than their own could provide benefits. This was to some extent expected, since it can be troublesome to read the implementation code. As mitigation, the survey conductor explicitly stated that the responders could contact the conductor for guidance, and one of the responders did so. This could reduce the accuracy of the survey result, but it was not possible to conduct the survey with responders that have experience in both of the product areas. It would have been better to get actual data to measure the benefits; for example, increased test coverage and reduced testing time are benefits that can be measured, but that data could not be collected at the time of the thesis. Also, at this stage it is complicated to measure whether the tests result in an increase in confidence or shorter release cycles; these benefits, if they exist, are more likely to be found after more tests have been implemented and used for a longer time period.
5.2.3 Automated Tests Implementation
The automated tests for the interface rich desktop product were considerably easier to implement than the ones for the web product. This was expected before starting the implementation and was largely due to properties of the products. In the web product there was a large set of possible configurations, and the tests that were automated needed to be configured in many different ways. For a developer who is new to this product it can be troublesome to understand all configurations, and the steps needed to perform them are not always clear. Furthermore, the tests in the web product interact with the GUI, and it was difficult to understand and get an overview of all the functionality needed for the tests. These factors, a considerable number of configurations and features, meant that the developer had to ask for help many times throughout the implementation phase. For the interface rich desktop product, on the other hand, the functionality was clearly defined and the tests did not need any special configurations. For this product it was obvious what the tests should achieve and how; the implementation of these tests was uncomplicated and straightforward.
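For context, the tool footnotes in the results chapter point to Gauge and Selenium. A minimal Selenium WebDriver sketch of the kind of configuration-heavy GUI step described above might look as follows; the URL, element locators and flow are invented for illustration and do not come from the thesis.

# Illustrative Selenium WebDriver step; URL, locators and flow are invented.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
try:
    driver.get("https://example.test/app")  # hypothetical product URL
    # Configuration steps like this one are what made the web-product tests
    # hard to automate: each test needed several GUI-driven configurations.
    driver.find_element(By.ID, "settings").click()
    driver.find_element(By.NAME, "configuration").send_keys("regression-profile")
    driver.find_element(By.ID, "save").click()
    assert "Saved" in driver.page_source  # simple verification step
finally:
    driver.quit()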
5.2.4 ROI
A factor that needs to be considered for the ROI calculations is maturation effects [91]. As experience was gained throughout the project, it is reasonable to assume that the first test that was implemented, Web Product A, took longer to implement than the other tests due to a learning curve in the beginning; this is also seen in the data. However, an attempt was made to mitigate this risk: parts of the learning time were logged and could be shared among the test cases. The resulting time difference, whether due to a learning curve or not, is not that big and will not affect the ROI to any significant degree. The risk could have been avoided if each test had been implemented by a different developer, but then more tests would have needed to be implemented to assure that the implementation time is not dependent on individual experience and skills. A related factor is that the developer who implemented the tests is a junior developer, which might result in a higher cost of implementation than if the tests had been implemented by a senior developer.
5.3 Internal Validity
The data used for the return on investment calculations is to a large extent based on estimations from developers and testers at Sectra Imaging IT Solutions Ltd. The validity concern is that the data can be estimated with a hidden agenda: it is possible that the estimations were made to influence the ROI of automation in a positive or negative manner. To minimize this risk, existing data from issue tracking and project management tools was used when possible, and several estimations were collected from different sources and averaged when calculating the ROI.
Hidden agendas might also be found in the interview and survey answers. All interviews and the survey were conducted with at least two responders, but due to time constraints only a few responders were interviewed. The interviews and surveys were created and analysed with published literature in mind, which can increase the reliability of the results.
As mentioned in section 5.2.4, maturation effects can have influenced the result of the return on investment from test automation. It can also be argued that maturation effects play a role in the answers and estimations of the responders; for instance, the effort of manually testing a test case will be estimated lower by an experienced tester than by a junior tester.
5.4 External Validity
As for external validity, the data collected for modifying, using and evaluating the decision tree comes solely from Sectra Imaging IT Solutions Ltd. Practitioners from other companies have not been consulted to verify whether this tool can be useful in other settings. For this reason, there is no data to support that the tool can be useful for practitioners outside Sectra Imaging IT Solutions Ltd. The decision tree can reasonably be used in other companies, however, since it builds on general research and the factors included are not specific to Sectra Imaging IT Solutions Ltd.
5.5 Reliability
The tools used for collecting data in this thesis are shown as transparently as possible, with the integrity of the participants in mind. Interview and survey questions can be found in the appendices. The data used for calculating return on investment is shown in table 4.3. Parts of the method are complicated to replicate, such as the implementation of the automated tests, since only a limited amount of information about the products and the specific tests is published, for reasons of confidentiality.
5.6 Ethical and Societal Aspects
Runeson and Höst bring up a few key factors for ethical considerations, including informed consent, handling of sensitive results, feedback and confidentiality [75]. These factors have been considered with all responders and stakeholders of this thesis. Transcriptions of interviews, and the excerpts of interview text used in the report, have been sent to all responders asking for their consent. Responders, and also the products mentioned in the report, have been anonymised to ensure the integrity of the responders. The responders were informed about these actions and the purpose of the interview before deciding to participate.
Manual testers might have concerns about test automation; some may wonder if test automation is likely to replace their work. This does not seem likely in the near future: most testers agree that both manual and automated testing are necessary parts of software testing [9]–[14], and some tests, such as usability testing, are not suitable for automation. However,
test automation will without doubt change the work of testers. One of the benefits identified for test automation is "Less human effort" (see section 2.5.1). In the interviews about the benefits of test automation (see section 4.1.1), it was stated that test automation will lead to a shift in activity for manual testers: the responders thought that manual testers will have more time for qualitative testing instead of so-called must-do regression testing, an opinion also stated by Berner, Weber, and Keller [14]. Moreover, the responders stated that test automation will be implemented with help from developers, which will result in changes in work for both developers and testers. Test automation will seemingly change the work of manual testers for the better, allowing for more creative tests and qualitative testing.
6 Conclusion
The aim of this thesis has been to provide a method for selecting test cases for automation. The research question has been whether the checklist provided by Garousi and Mäntylä [17] can be modified in such a way that it can be used to select test cases for test automation that result in economic and organisational benefits. The checklist was modified into a decision tree, and the result from the evaluation suggests that it can result in economic and organisational benefits. Three test cases were selected by using the decision tree, and the return on investment of automating these test cases shows that economic benefits are found after 0.5 to 4 years. Three organisational benefits were found to be related to this automation: less human effort when testing, reduction in cost and allowing for shorter release cycles. These results have been shown to be present at one company, Sectra Imaging IT Solutions Ltd. Whether the result is replicable at other companies cannot be concluded from this thesis, but it is reasonable to assume that similar results can be found in other settings.
To aid the research process, four research objectives were defined. The first was: what do practitioners believe are the common benefits software producing companies relate to test automation? Practitioners at Sectra Imaging IT Solutions Ltd thought that test automation can lead to seven benefits: improved product quality, increased test coverage, increased test reliability, increase in confidence, reusability of tests, less human effort and shorter release cycles.
The second research objective was: how can economic benefits be measured for test automation? The answer was found in the literature study, and it was concluded that the most common way to measure economic benefits of test automation is with return on investment formulas. In this thesis a formula by Hoffman [18] was used.
The third research objective was whether the checklist provided by Garousi and Mäntylä [17] was applicable in an industrial setting to achieve test automation. The checklist was considered to be applicable at Sectra Imaging IT Solutions Ltd: practitioners there considered 65% of the questions in the checklist to be important when making decisions related to test automation.
The fourth and last research objective was: what modifications to the checklist provided by Garousi and Mäntylä [17] are required to make it applicable for practitioners? The checklist was modified in two steps. The first step was to identify which factors were necessary to consider when making decisions related to test automation; in this step, 15 factors were removed from the original checklist and two new factors were added. In the second step the remaining factors were grouped and sorted into a decision tree, the reason being that this prioritisation and organisation allows for easier use and gives the users of the decision tree a definitive answer to whether a test case should be automated or not.

More research is needed in the area of test case selection for automation, especially on methods that are simple to incorporate in test automation strategies. This thesis shows that practitioners can achieve economic and organisational benefits with checklist-based methods for test case selection for automation.
Bibliography
[1]
F. Khomh, T. Dhaliwal, Y. Zou, and B. Adams, “Do faster releases improve software
quality? an empirical case study of mozilla firefox”, Proceedings for 9th IEEE Working
Conference on Mining Software Repositories (MSR), Zurich, Switzerland, Jun. 2012,
pp. 179–188.
[2]
M. Mäntylä, F. Khomh, B. Adams, E. Engström, and K. Petersen, “On rapid releases
and software testing", In 29th International Conference on Software Maintenance (ICSM),
IEEE, Sep. 2013.
[3]
M. Mäntylä, B. Adams, F. Khomh, E. Engström, and K. Petersen, “On rapid releases
and software testing: A case study and a semi-systematic literature review”, Empirical
Software Engineering, vol. 20, no. 5, pp. 1384–1425, Oct. 2015.
[4]
A. Porter, C. Yilmaz, A. Memon, A. Krishna, D. Schmidt, and A. Gokhale, “Techniques
and processes for improving the quality and performance of open-source software”,
Software Process: Improvement and Practice banner, vol. 11, no. 2, pp. 163–176, 2006.
[5]
A. A. Sawant, P. H. Bari, and P. M. Chawan, “Software testing techniques and strategies”, International Journal of Engineering Research and Applications, vol. 54, no. 3, pp. 980–
986, May 2012.
[6]
R. Charette, “Why software fails”, IEEE Spectrum, vol. 42, no. 9, pp. 42–49, 2005.
[7]
S. Dalal and R. S. Chhillar, “Software testing-three p’s paradigm and limitations”, International Journal of Computer Applications, vol. 54, no. 12, pp. 49–54, Sep. 2012.
[8]
D. Kumar and K. Mishra, “The impacts of test automation on software’s cost, quality
and time to market”, Proceedings of the 7th International Conference on Communication, Computing and Virtualization (ICCCV), vol. 79, 2016, pp. 8–15.
[9]
D. Rafi, K. Moses, K. Petersen, and M. Mäntylä, “Benefits and limitations of automated
software testing: Systematic literature review and practitioner survey”, Proceedings of
the 7th International Workshop on Automation of Software Test, Jun. 2012, pp. 36–42.
[10] Test automation is still testing, but don’t go at it alone, Nov. 2018 (accessed November 23,
2018). [Online]. Available: https://blog.testproject.io/2018/11/13/test-automation-is-still-testing/.
[11]
J. Kasurinen, O. Taipale, and K. Smolander, “Software test automation in practice: Empirical observations”, Advances in Software Engineering, vol. 2010, p. 18, Nov. 2009, Article ID: 620836.
[12]
J. Bach, “Test automation snake oil”, Proceedings for the 14th International Conference
and Exposition on Testing Computer Software (TCS'99), 1999.
[13]
B. Pettichord, “Seven steps to test automation success”, Proceedings of STAR West Software Testing Conference, Nov. 1999.
[14]
S. Berner, R. Weber, and R. Keller, “Observations and lessons learned from automated
testing”, Proceedings of the 27th International Conference on Software Engineering
(ICSE), May 2005, pp. 571–579.
[15] Accelerating time to market through next-gen test automation, Apr. 2018 (accessed November 23, 2018). [Online]. Available: https://www.cigniti.com/blog/accelerating-time-to-market-through-next-generation-test-automation/.
[16]
Y. Amannejad, V. Garousi, R. Irving, and Z. Sahaf, "A search-based approach for cost-effective software test automation decision support and an industrial case study",
IEEE Seventh International Conference on Software Testing, Verification and Validation Workshops, vol. 3, 2014, pp. 302–311.
[17]
V. Garousi and M. V. Mäntylä, “When and what to automate in software testing? a
multi-vocal literature review”, Information and Software Technology, vol. 76, pp. 92–117,
Aug. 2016.
[18]
D. Hoffman, Cost benefits analysis of test automation, STAR West October 1999, 1999.
[19] Retriever business - a business database for swedish companies, (accessed February 24, 2019).
[Online]. Available: https://www.retriever-info.com/?e=3.
[20] Sectra’s history - the road to world-leading products, (accessed February 24, 2019). [Online].
Available: https://www.sectra.com/investor/about/history.html.
[21]
E. Engström and P. Runeson, "A qualitative survey of regression testing practices", Jun.
2010, pp. 3–16.
[22]
E. Engström, P. Runeson, and M. Skoglund, "A systematic review on regression test
selection techniques”, Information and Software Technology, vol. 52, no. 1, pp. 14–30, Jan.
2010.
[23]
P. Runeson, “A survey of unit testing practices”, IEEE Software, vol. 23, no. 4, pp. 22–29,
Jun. 2006.
[24]
P. Ammann and J. Offutt, Introduction to Software Testing. New York: Cambridge University Press, 2008.
[25]
S. Quadri and S. Farooq, “Software testing - goals, principles, and limitations”, International Journal of Computer Applications, vol. 6, no. 9, 2010.
[26]
B. Beizer, Software Testing Techniques, 2nd ed. United States of America: International
Thomson Computer Press, 1990, ISBN: 1850328803.
[27]
P. Ammann and J. Offutt, Introduction to Software testing. New York: Cambridge University Press, 2008, ISBN: 978-0-521-88038-1.
[28]
D. Graham, E. van Veenendaal, I. Evans, and R. Black, Foundations of Software Testing: ISTQB
Certification. London: Cengage Learning EMEA, 2012, ISBN: 978-1-408-04405-6.
[29]
M. Fewster and D. Graham, Software Test Automation: Effective use of test execution tools.
New York: Addison-Wesley, 1999, ISBN: 0-201-33140-3.
[30]
P. Rook, “Controlling software projects”, IEEE Software Engineering Journal, vol. 1, no. 1,
pp. 7–16, Jan. 1986.
[31]
M. Kumar, S. Singh, and R. Dwivedi, “A comparative study of black box testing and
white box testing techniques”, International Journal of Advance Research in Computer Science and Management Studies, vol. 3, no. 10, pp. 32–44, Oct. 2015.
[32]
C. Kaner, “Cem kaner on scenario testing: The power of ’what-if...’ and nine ways to
fuel your imagination”, Better Software, vol. 5, no. 5, pp. 16–22, Oct. 2003.
[33]
M. E. Khan, “Different forms of software testing techniques for finding errors”, International Journal of Computer Science Issues, vol. 7, no. 3, pp. 11–16, May 2010.
[34]
L. Copeland, A Practitioner’s Guide to Software Test Design. London: Artech House, 2003.
[35]
P. C. Jorgensen, Software Testing A Craftsman’s Approach, fourth. Auerbach Publications,
2014.
[36]
J. Itkonen, M. V. Mäntylä, and C. Lassenius, “How do testers do it? an exploratory
study on manual testing practices”, 3rd International Symposium on Empirical Software
Engineering and Measurement, pp. 494–497, Oct. 2009.
[37]
J. Bach, Exploratory testing explained, v.1.3, 2003 (accessed September 12, 2018). [Online].
Available: http://www.satisfice.com/articles/et-article.pdf.
[38]
E. Dustin, T. Garrett, and B. Gauf, Implementing Automated Software Testing. Massachusetts: Addison-Wesley, 2009.
[39] A software engineer in test must have the heart of a developer, Nov. 2018 (accessed November
23, 2018). [Online]. Available: https://blog.testproject.io/2018/11/06/
the-software-engineer-in-test/.
[40] Key guidelines to continuous integration and jenkins ci server, May 2017 (accessed November 23, 2018). [Online]. Available: https://blog.testproject.io/2017/05/11/
jenkins-ci/.
[41]
M. Malekzadeh and R. Ainon, “An automatic test case generator for testing safetycritical software systems”, Proceedings of the 2nd International Conference on Computer and Automation Engineering (ICCAE), Feb. 2010.
[42]
F. Saglietti and F. Pinte, “Automated unit and integration testing for component-based
software systems”, Proceedings of the International Workshop on Security and Dependability for Resource Constrained Embedded Systems, 2010.
[43]
R. Tan and S. Edwards, “Evaluating automated unit testing in sulu”, Proceedings of the
International Conference on Software Testing, Verification, and Validation, 2008.
[44]
M. Alshraideh, “A complete automation of unit testing for javascript programs”, Journal of computer Science, vol. 4, no. 12, pp. 1012–1019, 2008.
[45]
J. Burnim and K. Sen, “Heuristics for scalable dynamic test generation”, Proceedings
of the 23rd IEEE/ACM International Conference on Automated Software Engineering,
Sep. 2008, pp. 443–446.
[46]
M. Geetha Devasena, G. Gopu, and M. Valarmathi, “Automated and optimized software test suite generation technique for structural testing”, International Journal of Software Engineering, vol. 26, no. 1, pp. 1–13, 2016.
[47]
L. Nagowah and K. Kora-Ramiah, “Automated complete test case coverage for web
based applications”, Proceedings of the International Conference on Infocom Technologies and Unmanned Systems (ICTUS), 2017.
[48]
D. Banerjee and K. Yu, “Robotic arm-based face recognition software test automation”,
IEEE Access, vol. 6, pp. 37 858–37 868, Jul. 2018.
[49]
D. Gafurov, A. Hurum, and M. Markman, “Achieving test automation with testers
without coding skills: An industrial report”, Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018, pp. 749–756.
[50]
L. du Bousquet and N. Zuanon, “An overview of lutess a specification-based tool for
testing synchronous software”, Proceedings of the 14th IEEE International Conference
on Automated Software Engineering, Oct. 1999.
[51]
T. Wissink and C. Amaro, “Successful test automation for software maintenance”,
Proceedings of the 22nd IEEE International conference on Software Maintenance
(ICSM’06), 2006.
[52]
B. Haugset and G. Hanssen, “Automated acceptance testing: A literature review and
an industrial case study”, Proceedings of the Agile Conference, Aug. 2008.
[53]
S. Stresnjak and Z. Hocenski, “Usage of robot framework in automation of functional
test regression”, Proceedings of the 6th International Conference on Software Engineering Advances (ICSEA), Oct. 2011.
[54]
B. Obele and D. Kim, “On an embedded software design architecture for improving the
testability of in-vehicle multimedia software”, Proceedings of the IEEE International
Conference on Software Testing, Verification, and Validation Workshops, 2014, pp. 349–
352.
[55]
L. Shan and H. Zhu, “Generating structurally complex test cases by data mutation: A
case study of testing an automated modelling tool”, The Computer Journal, vol. 52, no. 5,
pp. 571–588, Aug. 2009.
[56]
J. Al Dallal, “Automation of object-oriented framework application testing”, Proceedings of the 5th IEEE GCC Conference & Exhibition, Mar. 2009.
[57]
D. Flemström, P. Potena, D. Sundmark, W. Afzal, and M. Bohlin, "Similarity-based prioritization of test case automation", Software Quality Journal, vol. 26, no. 4, pp. 1421–
1449, Dec. 2018.
[58]
C. Liu, “Platform-independent and tool-neutral test descriptions for automated software testing”, Proceedings of the 2000 International Conference on Software Engineering (ICSE), Jun. 2000.
[59]
M. Bashir and S. Banuri, “Automated model based software test data generation system”, Proceedings of the 4th International Conference on Emerging Technologies, Oct.
2008.
[60]
C. Persson and N. Yilmaztürk, “Establishment of automated regression testing at abb:
Industrial experience report on ‘avoiding the pitfalls’”, Proceedings of the 19th International Conference on Automated Software Engineering (ASE’04), Oct. 2004.
[61]
M. Fecko and C. Lott, “Lessons learned from automating tests for an operations support
system”, Software: Practice and Experience, vol. 32, no. 15, pp. 1485–1506, 2002.
[62]
R. Ramler and K. Wolfmaier, “Economic perspectives in test automation: Balancing automated and manual testing with opportunity cost”, Proceedings of the 2006 International Workshop on Automation of Software Test, vol. 3, Jan. 2006, pp. 85–91.
[63] Google scholar citations economic perspectives in test automation: Balancing automated and
manual testing with opportunity cost by r. ramler and k. wolfmaier 2006, (accessed November
28, 2018). [Online]. Available: https://scholar.google.at/scholar?oi=bibs&
hl=en&cites=3114206012469503695&as_sdt=5.
[64]
Z. Sahaf, V. Garousi, D. Pfahl, R. Irving, and Y. Amannejad, "When to automate software testing? decision support based on system dynamics: An industrial case study",
Journal of Software: Evolution and Process, vol. 28, no. 4, pp. 272–285, Apr. 2016.
[65]
J. Oliveira, C. Gouveia, and R. Filho, "A way of improving test automation cost-effectiveness", Proceedings for the 1st Annual Conference of the Association for Software Testing (CAST) 2006, Indianapolis, USA, 2006.
[66]
R. Assad, T. Katter, F. Ferraz, L. Ferreira, and S. Lemos Meira, "Security quality assurance on web-based application through security requirements tests: Elaboration, execution and automation", Proceedings of the Fifth International Conference on Software Engineering Advances (ICSEA), Aug. 2010.
[67]
S. Kadry, “A new proposed technique to improve software regression testing cost”,
International Journal of Security and its Applications, vol. 5, no. 3, Nov. 2011.
[68]
D. Graham and M. Fewster, Experiences of test automation: Case studies of software test
automation. Crawfordsville, Indiana: Addison-Wesley, 2012, ISBN: 0-321-75406-9.
[69]
S. Münch, P. Brandstetter, K. Clevermann, O. Kieckhoefel, and E. Schäfer, “The return
on investment (ROI) of test automation”, Pharmaceutical Engineering, vol. 32, 2012.
[70]
B. Marick, "When should a test be automated", 1999 (accessed November 5, 2018). [Online]. Available: https://www.stickyminds.com/sites/default/files/article/file/2014/When%20Should%20a%20Test%20Be%20Automated.pdf.
[71]
P. Grossman, Automated testing ROI: Fact or fiction? a customer’s perspective: What real QA
organizations have found, White paper, 2009.
[72]
D. Graham, ROI of test automation: Benefit and cost, Professionaltester.com, November
2010, 2010.
[73]
C. Schwaber and M. Gilpin, Evaluating automated functional testing tools, Forrester Research, February 2005, 2005.
[74]
C. Robson and K. McCartan, Real World Research: A resource for Users of Social Research
Methods in Applied Settings, 4th ed. United Kingdom: John Wiley and Sons Ltd, 2016,
ISBN: 9781118745236.
[75]
P. Runeson and M. Höst, “Guidelines for conducting and reporting case study research
in software engineering”, Empirical Software Engineering, vol. 14, no. 2, pp. 131–164, Dec.
2008.
[76]
M. Saunders, P. Lewis, and A. Thornhill, Research methods for business students, Fifth.
Italy: Pearson Education, 2009.
[77]
D. W. Turner III, “Qualitative interview design: A practical guide for novice investigators”, The Qualitative Report, vol. 15, no. 3, pp. 754–760, 2010.
[78]
H. S. Kramer and F. A. Drews, “Checking the lists: A systematic review of electronic
checklist use in health care”, Journal of Biomedical Informatics, vol. 71, pp. 6–12, 2017.
[79]
D. L. Stufflebeam, Guidelines for developing evaluation checklists: The checklists development
checklist (cdc), 2000 (accessed October 4, 2018). [Online]. Available: https://wmich.edu/sites/default/files/attachments/u350/2014/guidelines_cdc.pdf.
[80]
D. H. Goh, A. Chua, E. Khoo, E. Mak, and M. Ng, “A checklist for evaluating open
source digital library software”, Online Information Review, vol. 30, no. 4, pp. 360–379,
Jul. 2006.
[81]
B. M. Gillespie, E. Harbeck, J. Lavin, T. Gardiner, T. K. Wither, and A. P. Marshall, “Using normalisation process theory to evaluate the implementation of a complex intervention to embed the surgical safety checklist”, BMC Health Services Research, vol. 18,
no. 170, 2018.
[82]
W. Martz, “Validating an evaluation checklist using a mixed method design”, Evaluation and Program Planning, vol. 333, pp. 215–222, 2010.
[83]
S. M. Linares and A. C. D. Romero, “Developing a multidimensional checklist for evaluating language-learning websites coherent with the communicative approach: A path
for the knowing-how-to-do enhancement”, Interdisciplinary Journal of e-Skills and Lifelong Learning, vol. 12, pp. 57–93, 2016.
[84]
N. Aggarwal, N. Dhaliwal, and B. Joshi, "To evaluate the use of surgical safety checklist in a tertiary referral obstetrics center of northern india", Obstetrics and Gynecology
International Journal, vol. 9, no. 2, pp. 133–136, 2018.
[85]
M. Usman, K. Petersen, J. Börstler, and P. Neto, “Developing and using checklists to
improve software effort estimation: A multi-case study”, Journal of Systems and Software,
vol. 146, pp. 286–309, Dec. 2018.
[86]
H. Boone Jr. and D. Boone, “Analyzing likert data”, Journal of Extension, vol. 50, no. 2,
Apr. 2012.
[87]
Q. Li, “A novel likert scale based on fuzzy sets theory”, Expert Systems with Applications,
vol. 40, no. 5, pp. 1609–1618, Apr. 2013.
[88]
R. Cummins and E. Gullone, “Why we should not use 5-point likert scales: The case
for subjective quality of life measurement”, Proceedings of the second International
Conference on Quality of Life in Cities, 2000, pp. 74–93.
[89]
K. Kelley, B. Clark, V. Brown, and J. Sitzia, “Good practice in the conduct and reporting
of survey research”, International Journal for Quality in Health Care, vol. 15, no. 3, pp. 261–
266, 2003.
[90]
D. Dillman, Mail and Internet Surveys: The tailored Design Method, 2nd ed. Hoboken, New
Jersey: John Wiley and Sons Inc., 2007, ISBN: 9780470038567.
[91]
B. Kitchenham, S. Pfleeger, L. Pickard, P. Jones, D. Hoaglin, K. El Emam, and J. Rosenberg, "Preliminary guidelines for empirical research in software engineering", IEEE
Transactions on Software Engineering, vol. 28, no. 8, pp. 721–734, 2002.
[92]
J. Humble and D. Farley, Continuous Delivery: Reliable Software Releases Through Build,
Test, and Deployment Automation. 1st ed. Crawfordsville, Indiana: Pearson Education Inc.,
2010, ISBN: 0-321-60191-2.
[93]
E. Alégroth, “Visual gui testing: Automating high-level software testing in industrial
practice”, Ph.D. Dissertation. Chalmers University of Technology, Sweden., 2015.
[94]
C. Kaner, “Improving the maintainability of automated test suites”, Proceedings of the
10th International Conference Software Quality Week 1997, 1997.
7 Appendices

7.A Interview Benefits from Test Automation
The questions that were used in the interview for identifying which benefits industry representatives want to achieve with test automation. See section 3.1.1 for the method used in the interviews and section 4.1.1 for the result.
Interview: What changes does test automation come with?
Name:________________________________
Role:_________________________________
Experience in role (years):________________
Consider the following questions; motivate your answer as to why/how test automation affects the factor discussed.
1. Can test automation result in a change of product quality?
Quality is defined as a low defect level in the product.
2. Will test automation result in changes in test coverage?
3. Will test automation result in less/more testing time?
4. Does automated testing affect the reliability of the testing?
5. Can test automation result in a change of product confidence?
That is, will e.g. developers feel more/less confident in the product quality with test automation.
6. Does test automation affect the reusability of testing?
7. Will test automation make a difference in the human effort of testing?
8. Does test automation affect the costs of testing?
9. Can you identify any other benefits from test automation than the previously mentioned ones?
10. What benefits do you aim to achieve with test automation?
7.B Checklist Survey
Survey: Evaluate checklist for deciding what to automate
Name: ____________________________
Team: ____________________________
Role: ____________________________
Experience (years) in role: ____________________________
Product area: ____________________________
Date: ____________________________
Questions (each statement is rated: Disagree Strongly / Disagree Slightly / Agree Slightly / Agree Strongly / Do not know)

"The following factor is important when deciding if the given situation favors test automation."

Category: SUT-related factors

Area: Maturity of SUT
1. SUT or the targeted components will experience major modifications in the future.
2. The interface through which the tests are conducted is unlikely to change.

Area: Other SUT aspects
3. SUT is an application with a long life cycle.
4. SUT is a generic system, i.e. not a tailor made or heavily customized system.
5. SUT is tightly integrated into other products, i.e. not independent.
6. SUT is complex.
7. SUT is mission critical.

Category: Test-related factors

Area: Need for regression testing
8. Frequent regression testing is beneficial or essential.

Area: Test type
9. Tests are performance and load tests.
10. Tests are smoke and build verification tests.
11. Tests are unit tests.
12. There is a large number of tests that are similar to each other.
13. Tests require large amounts of data.
14. Humans are likely to make errors when performing and evaluating these tests, e.g. tests require vigilance in execution.
15. Computers are likely to make errors when performing and evaluating these tests, e.g. test execution is not deterministic.

Area: Test reuse/repeatability
16. Tests can be reused as part of other tests.
17. Tests need to be run in several hardware and software environments and configurations.
18. The lifetime of the tests is high.
19. The number of builds is high.

Area: Test importance
20. Tests are likely to reveal defects, i.e. high risk areas.
21. Tests cover the most important features, i.e. high importance areas.

Area: Test oracle
22. Test results are deterministic.
23. Test results require human judgement.
24. Automated comparison will be fragile, leading to many false positives.

Area: Test stability
25. Tests are instable, e.g. due to timing. We must perform the test repeatedly and if it passes above a threshold we consider that the test passes.
26. Tests are instable, e.g. due to timing. The results cannot be trusted at all.

Category: Test-tool-related factors

Area: Automation (test) tool
27. We have experimented with the test automation tool we plan to use and the results are positive.
28. A suitable test tool is available that fits our purpose.
29. We have decided on which tool to use.
30. We can afford the costs of the tool.

Category: Human and organizational factors

Area: Skills level of testers
31. Our test engineers have adequate skills for test automation.
32. We can afford to train our test engineers for test automation.
33. We have expertise in the test automation approach and tool we have chosen.

Area: Other human and organizational factors
34. We are currently under a tight schedule and/or budget pressure.
35. We have organizational and top management support for test automation.
36. There is a large change resistance against software test automation.
37. We have the ability to influence or control the changes to SUT.

Category: Cross-cutting and other factors

Area: Economic factors
38. There are economic benefits of test automation.
Area: Automatability of testing
39. Tests are easy and straightforward to automate.
40. Test results are easy to analyze automatically.
41. Test automation will require a lot of maintenance effort.

Area: Development process
42. Our software development process requires test automation to function efficiently, for example agile methods.
43. We make several releases of our products.

Other factors that you think are important when deciding if the given situation favors test automation (rate their importance on the same scale):
44. Factor: ____________________ Importance: ____________________
45. Factor: ____________________ Importance: ____________________
46. Factor: ____________________ Importance: ____________________
47. Factor: ____________________ Importance: ____________________
48. Factor: ____________________ Importance: ____________________

Based on the questions above, name a few areas in your product that could benefit from automation and relate them to one of the questions.
1. Product area: ____________ Q. No(s).: ____________ Short motivation: ____________
2. Product area: ____________ Q. No(s).: ____________ Short motivation: ____________
3. Product area: ____________ Q. No(s).: ____________ Short motivation: ____________
4. Product area: ____________ Q. No(s).: ____________ Short motivation: ____________
5. Product area: ____________ Q. No(s).: ____________ Short motivation: ____________
6. Product area: ____________ Q. No(s).: ____________ Short motivation: ____________
7.C Checklist 1

The question numbers relate to the When and What to automate checklist; to see the questions with their numbers, see table 4.2.
Consider these questions before the automation process is started:
27. We have experimented with the test automation tool we plan to use and the results are positive.
28. A suitable test tool is available that fits our purpose.
30. We can afford the costs of the tool.
32. We can afford to train our test engineers for test automation.
35. We have organizational and top management support for test automation.
37. We have the ability to influence or control the changes to SUT.
7.D Checklist 2

The question numbers relate to the When and What to automate checklist; to see the questions with their numbers, see table 4.2.
Consider these questions when deciding whether to automate a test:
3. SUT is an application with a long life cycle.
7. SUT is mission critical.
8. Frequent regression testing is beneficial or essential.
9. Tests are performance and load tests.
10. Tests are smoke and build verification tests.
14. Humans are likely to make errors when performing and evaluating these tests, e.g. tests require vigilance in execution.
15. Computers are likely to make errors when performing and evaluating these tests, e.g. test execution is not deterministic.
18. The lifetime of the tests is high.
19. The number of builds is high.
20. Tests are likely to reveal defects, i.e. high risk areas.
21. Tests cover the most important features, i.e. high importance areas.
22. Test results are deterministic.
23. Test results require human judgement.
24. Automated comparison will be fragile, leading to many false positives.
25. Tests are instable, e.g. due to timing. We must perform the test repeatedly and if it passes above a threshold we consider that the test passes.
26. Tests are instable, e.g. due to timing. The results cannot be trusted at all.
38. There are economic benefits of test automation.
39. Tests are easy and straightforward to automate.
40. Test results are easy to analyze automatically.
41. Test automation will require a lot of maintenance effort.
43. We make several releases of our products.
7.E Decision Tree

The numbers next to the factors show the mean value of the score for the factor from the interviews; 4 was the highest possible mean value and 3 was the lowest mean value a factor could have to be included in the checklist.
Decision Tree: Which Test to Automate?

Each factor is answered with Agree or Disagree.

Decision Point F1
    Factor 1.1    Test results are deterministic.
    Factor 1.2    Test results do not require human judgement.
    Factor 1.3    Automated comparison will not be fragile, leading to many false positives.
    Factor 1.4    Tests are not unstable, e.g. due to timing. Unstable meaning: we must perform the test repeatedly, and if it passes above a threshold we consider that the test passes.
    Factor 1.5    Tests are not unstable, e.g. due to timing. Unstable meaning: the results cannot be trusted at all.
    Factor 1.6    Computers are not likely to make errors when performing and evaluating these tests, e.g. test execution is deterministic.
    Factor 1.7    Test results are easy to analyze automatically.

Decision Point F2
    Factor 2.1    There are economic benefits of automating these tests.
    Factor 2.2    Tests are likely to reveal defects, i.e. high-risk areas.
    Factor 2.3    The product being tested is mission critical.
    Factor 2.4    Tests cover the most important features, i.e. high-importance areas.
    Factor 2.5    Frequent regression testing is beneficial or essential for this product.

Decision Point F3
    Factor 3.1    Tests are easy and straightforward to automate.
    Factor 3.2    Test automation will not require a lot of maintenance effort.

Decision Point F4
    Factor 4.1    Humans are likely to make errors when performing and evaluating these tests, e.g. tests require vigilance in execution.

Decision Point F5
    Factor 5.1    Developers have low knowledge of the product being tested, i.e. the product has not been developed on for a long period of time.

Decision Point F6
    Factor 6.1    Test type is favorable for automation, i.e. tests are performance or load tests.

Decision Point F7
    Factor 7.1    We make several releases of the product.
    Factor 7.2    The lifetime of the tests is high.
    Factor 7.3    The product being tested is highly customizable, i.e. it has many configurations.
    Factor 7.4    Tests are performed on a product with a long life cycle.
    Factor 7.5    The number of builds for this product is high.

Decision Point F8
    Factor 8.1    Test type is favorable for automation, i.e. tests are smoke or build verification tests.

TC Information (test case record template)
TC ID:               TC Title:
Steps:               Result:
Date:                Performed by:
Comments:
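The tree above can be read as a set of decision points, each gated on its Agree/Disagree factors. The Python sketch below shows one way to encode and evaluate that structure; the DecisionPoint class, the abbreviated factor names, and the all-factors gating rule are illustrative assumptions, since the figure defines the points and factors but its exact branching between points is not reproduced in this sketch.

    from dataclasses import dataclass, field

    @dataclass
    class DecisionPoint:
        """One decision point from the figure; every factor under it
        is answered with Agree (True) or Disagree (False)."""
        name: str
        factors: list[str] = field(default_factory=list)

    # Factor wordings abbreviated; the full text is in the figure above.
    # Only F1 and F3 are spelled out here: the remaining decision points
    # follow the same pattern.
    TREE = [
        DecisionPoint("F1", [
            "results deterministic",
            "no human judgement needed",
            "comparison not fragile",
            "not unstable (threshold kind)",
            "not unstable (untrusted kind)",
            "computers unlikely to err",
            "results easy to analyze automatically",
        ]),
        DecisionPoint("F3", [
            "easy and straightforward to automate",
            "low maintenance effort",
        ]),
    ]

    def recommend_automation(answers: dict[str, bool]) -> bool:
        """Recommend automating the test only if every factor at every
        decision point is agreed with. This all-factors gating rule is
        an illustrative assumption, not the figure's exact branching."""
        return all(answers.get(factor, False)
                   for point in TREE
                   for factor in point.factors)

    # Invented example: a candidate satisfying everything except one factor.
    answers = {f: True for p in TREE for f in p.factors}
    answers["low maintenance effort"] = False
    print(recommend_automation(answers))  # -> False

Encoding the tree as data rather than nested conditionals keeps the factor list easy to extend when checklist questions are added or dropped.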
7.F    Benefits from Automation Survey

The survey that was sent out to respondents for evaluating the organisational benefits of the automated tests implemented in this thesis (see section 3.2.2).
Survey: Evaluate benefits from test automation

Name:    ____________________________
Team:    ____________________________
Role:    ____________________________
Date:    ____________________________

First take a look at the benefits that have been found from test automation [1].

1. Improved product quality
   Quality in terms of fewer defects present in the software product.

2. Increased test coverage
   High coverage of code (e.g. statement, branch, path) is achieved through automation.

3. Reduced testing time
   Less time is required for testing, i.e. more tests can be run within a given timeframe.

4. Increased test reliability
   Automated testing is more reliable when repeating tests, since variance in outcomes can be caused by a manual tester running the tests in a different way; on the other hand, automation cannot make use of the tester's knowledge.

5. Increase in confidence
   Increase of confidence in the quality of the system (e.g. as perceived by developers).

6. Reusability of tests
   When tests are designed with maintenance in mind they can be repeated frequently; the benefits come from a high degree of repetition of test cases, not from a single execution of an automated test case.

7. Less human effort
   Automation reduces human effort, which can instead be used for other activities (in particular ones that lead to defect prevention).

8. Reduction in cost
   With a high degree of automation, costs are saved.

9. Shorter release cycles
   Test automation is a prerequisite for continuous integration and will allow for shorter release cycles.

[1] As defined in D. Rafi, K. Moses, K. Petersen, and M. Mäntylä, "Benefits and limitations of automated software testing: Systematic literature review and practitioner survey", Proceedings of the 7th International Workshop on Automation of Software Test, Jun. 2012, pp. 36–42.
First consider the test: Web Product A
Please see attached TC and code implementation.

The implementation of the automated test does to some degree allow for:

Questions                        Disagree    Disagree    Agree       Agree       Do not
                                 Strongly    Slightly    Slightly    Strongly    know
1. Improved product quality
2. Increased test coverage
3. Reduced testing time
4. Increased test reliability
5. Increase in confidence
6. Reusability of tests
7. Less human effort
8. Reduction in cost
9. Shorter release cycles
Now consider the test: Web Product B
Please see attached TC and code implementation.

The implementation of the automated test does to some degree allow for:

Questions                        Disagree    Disagree    Agree       Agree       Do not
                                 Strongly    Slightly    Slightly    Strongly    know
1. Improved product quality
2. Increased test coverage
3. Reduced testing time
4. Increased test reliability
5. Increase in confidence
6. Reusability of tests
7. Less human effort
8. Reduction in cost
9. Shorter release cycles
Now consider the test: Interface Rich Desktop Product
Please see attached code implementation; no TC exists for this test.

The implementation of the automated test does to some degree allow for:

Questions                        Disagree    Disagree    Agree       Agree       Do not
                                 Strongly    Slightly    Slightly    Strongly    know
1. Improved product quality
2. Increased test coverage
3. Reduced testing time
4. Increased test reliability
5. Increase in confidence
6. Reusability of tests
7. Less human effort
8. Reduction in cost
9. Shorter release cycles