Adaptive Automation: Leveraging Machine
Learning to Support Uninterrupted Automated
Testing of Software Applications
arXiv:1508.00671v1 [cs.SE] 4 Aug 2015
Rajesh Mathur, Scott Miles, Miao Du
Awesomation Testing
Melbourne, Australia
http://www.awesomationtesting.com
Email: {rmathur,smiles,mdu}@awesomationtesting.com
Abstract—Checking software application suitability using automated software tools has become a vital element for most
organisations irrespective of whether they produce in-house software or simply customise off-the-shelf software applications for
internal use. As software solutions become ever more complex, the
industry becomes increasingly dependent on software automation
tools, yet the brittle nature of the available software automation
tools limits their effectiveness. Companies invest significantly
in obtaining and implementing automation software but most
of the tools fail to deliver when the cost of maintaining an
effective automation test suite exceeds the cost and time that
would have otherwise been spent on manual testing. A failing in the current generation of software automation tools is that they do not adapt to unexpected modifications and obstructions without frequent (and time-expensive) manual intervention. Such issues
are commonly acknowledged amongst industry practitioners,
yet none of the current generation of tools have leveraged the
advances in machine learning and artificial intelligence to address
these problems.
This paper proposes a framework solution that utilises machine learning concepts, namely fuzzy matching and error
recovery. The suggested solution applies adaptive techniques
to recover from unexpected obstructions that would otherwise
have prevented the script from proceeding. Recovery details
are presented to the user in a report which can be analysed
to determine if the recovery procedure was acceptable and the
framework will adapt future runs based on the decisions of the
user. Using this framework, a practitioner can run the automated suites without human intervention while minimising the risk of schedule delays.
I. INTRODUCTION
In the last decade, the Information Technology industry
has grown multifold. What we could not imagine in the pre-internet era is now available to the masses. Laptop and desktop computers with gigabytes of memory and terabytes of hard disk space are available to school children. Smartphones with the capabilities of early supercomputers are carried by ordinary people. Wireless networking has become a necessity and is even considered by some to belong among the core components of Maslow's hierarchy of needs. Software applications that were unimaginable at the turn of the century are now available and are becoming increasingly complex.
Software testing that was once considered a repetitive and
less-intelligent activity is proving to be one of the most
important activities during the Software Development Life
Cycle. The software testing services industry is forecast to
grow at greater than 11% annually [1], and a recent industry report suggests that it will be worth US$46 billion within three years [2].
The increasing complexity of software applications, combined with the increasing demand for shorter release cycles, has led to the expectation of more testing within an ever-shortening time frame. To meet this expectation, organisations often turn to computers to check their own applications. The use of software automation tools to test an application has now become the norm in the software industry.
While testing by humans is considered a complex cognitive activity, its effectiveness can be greatly enhanced when coupled with automation to perform the tasks that are repetitive in nature and require frequent regression. When used for the
task of checking, automation tools can provide a fast and
consistent approach to support the conformance of a product
to business requirements.
One of the biggest challenges that the software industry
faces while using commercial automation tools is that most
of these tools are unintelligent and require significant maintenance. Almost all of these tools focus solely on conforming
to information that has been inputted by a human user and
are forced to abort a test when an obstruction occurs. Unfortunately human users are all too susceptible to error and
will commonly input the wrong thing (expecting something
that should not be expected), forget to input something (not
updating an existing script with an intended change) or will not
account for something (users on a different browser version
will get an additional popup dialog when accessing certain
webpages). The software industry has made little progress in developing more intelligent tools, or in developing equally capable users of these tools.
Many companies depend on record and replay features and
trust that the testing will be done completely with sufficient
coverage [3].
II. TEST AUTOMATION CHALLENGES
The promised benefits of test automation are undermined
by serious limitations in the current tools and methodology.
We describe four major challenges with the current tools,
particularly with respect to automated user interface testing.
Challenge 1: Test Case Generation is Manual and Labour
Intensive
There are many different approaches used for generation of
test scripts within the current generation of automation tools.
The majority of tools rely on a tester to script what actions
should occur and what checks to perform, or to record actions
to be replayed in the exact sequence on demand. The tools make no attempt to understand the intended function of a system or to generate a suite of possible test scripts automatically.
While a tool may not be able to determine all of the expected
behavior of a system, there are many aspects that are common
to all systems or can be learned based on previously scripted
tests.
Challenge 2: Test Scripts are Not Validated until Runtime
Testing tools commonly use an object repository to map a user-friendly object name to a list of object parameters
used to identify the object. When a script is written within
a test tool, the majority of tools offer no live validation that
items referred to in the test script exist in the repository. For
example if the command to click the object OK Button is
scripted, there is no validation of whether the OK Button exists
in the repository until the script is executed. Likewise, if the
test author enters the command to “Enter Text” on the button,
it again may not be detected as an invalid command until the
script is executed.
Some tools attempt to address this limitation by verifying
the objects used in a script when it is created and modified,
however, if the item is later removed from or modified in the object repository, there may be no indication that a script has been broken.
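To make the gap concrete, the following minimal sketch (using hypothetical object names and a deliberately simplified repository, not any particular tool's API) shows how a script step that refers to a missing object is only discovered when the step is executed:

# A minimal sketch of the problem described above: a reference to an object
# that is absent from the repository is only detected at execution time.
REPOSITORY = {
    "OK Button": {"type": "button", "id": "btnOk"},
}

def run_step(object_name, action):
    """Resolve an object from the repository at execution time and act on it."""
    params = REPOSITORY.get(object_name)
    if params is None:
        # Nothing warned the author when the script was written; the failure
        # only surfaces here, at runtime.
        raise RuntimeError(f"object {object_name!r} not found in repository")
    print(f"{action} on {object_name} using {params}")

if __name__ == "__main__":
    run_step("OK Button", "click")          # resolves at runtime
    try:
        run_step("Cancel Button", "click")  # only fails when executed
    except RuntimeError as error:
        print("runtime failure:", error)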
Challenge 3: Test Scripts are Brittle
When an automation script fails to locate an item on the
current screen the majority of frameworks will do one of two
things (depending on the framework and settings used):
• Skip remaining steps and fail the whole test.
• Fail the single step and continue to the next step, which will most likely fail as a result of the missed step.
Often these failures occur when an object has a code change
which affects the parameters being used to identify the object.
Examples of such changes include: changing of the name
parameter, changing the type of an object, or changing the
parent container. Failing a test because of such a change is generally not the desired outcome: only a GUI element has changed, and there is no effect on the functionality of the system. If the element has changed, the tool will no longer be able to identify it and therefore the test will not
perform the intended action, potentially causing a cascade of
failures. Maintenance of these changes is costly and involves
a significant amount of investigation, updating and repetition
of sequential tests in order to verify the corrected script.
Another cause of failure can be slight cosmetic changes,
such as: changing a button to rounded corners, changing the
text on a hyperlink, changing the location of a button by 1
pixel, or changing the background in a screenshot comparison.
While these changes appear trivial to the application developer
and end user, these changes impact the tester, as they can break
the automated test case.
A third cause of failure is an unexpected occurrence. These
often occur as popup windows, warning dialogs, or additional
steps. Some examples are: a webpage security certificate has expired and must be accepted, a payment was recently made with the same credit card, or changed terms and conditions must be accepted before proceeding. The combinations of unexpected occurrences that could arise at different points of a script are innumerable, making these scenarios difficult to predict and costly to accommodate.
The last frequent cause of automation scripts breaking is
failure of an action to load completely within a timely fashion.
Often a database query will hang without returning a result or
a webpage will not be served correctly on the first attempt.
While such a failure may be significant depending on the circumstances and the testing being performed, it is often simply the result of testing on restricted infrastructure within a test environment. The
functionality itself has not been affected and the test would
be able to continue if the framework could recover from such
an event.
All of these types of failures should be brought to the tester's
attention to determine if the change is acceptable, but should
not cause a test to stop or subsequent steps to fail where an
alternative is available.
Challenge 4: Maintaining a Test Suite is Expensive
Automation is most commonly applied as a regression check
for features that have reached a mature state in their life
cycle. When changes occur to a mature feature, they are
usually minor, such as adding a new field to a drop-down box, changing the text in a paragraph, or adding a link to a new page on a website.
The functionality of these changes can usually be assumed
based on the existing test scripts and the context of the change,
yet can require a lot of effort on the tester's part to update
every script that touches that page. For example, if an online
store adds a new size option to their Add to Cart functionality,
each test script that touches that function would need to
include checks for this option.
It is also possible for a change to be implemented and tested
manually with the parties involved forgetting to ensure that
the automation framework has been updated. This often causes a series of failures that could have been anticipated, followed by a rushed update of the existing test scripts. Tests must be expanded
by the tester when a new option or field is included.
III. ADAPTIVE TEST AUTOMATION ROADMAP
To address these shortcomings in the current generation of automation tools, we propose that test automation frameworks should leverage self-adaptive technologies and machine learning methods. A self-adaptive automated test framework is needed to support the design, execution and maintenance of test cases, as shown in Figure 1. The remainder of this section proposes some of the adaptive support mechanisms that could be provided across the various stages of the test automation lifecycle.

Fig. 1. The Adaptive Automation Approach
Goal 1: Test Generation - automatic test generation from
screens and user behaviour
Test scripts should be generated automatically where possible. There are two general strategies which may be employed for automatic user interface test generation:
• Automatically exploring the screens or pages and generating tests which exercise all of the discovered fields. Heuristics may be used to guide the values used in the test cases. One could employ a database of common field names and associated formats. For example, if a field contains “email” in its name then check that it only accepts email formats. If a field is called “age”, then an appropriate check would be that the field does not accept negative values. A sketch of such a heuristic generator is given after this list.
• Data mining the user interaction logs to create test cases. The values used in the test cases and the paths tested would be drawn directly from the user data. This approach has the benefit of focusing on the use case scenarios most important to the users rather than theoretical execution paths which might rarely be exercised in practice.
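As an illustration of the first strategy, the following minimal sketch (the field names, patterns and values are assumptions for illustration, not part of any existing tool) matches discovered field names against a small database of common formats and proposes candidate checks:

import re

# Illustrative heuristic database: name pattern -> (values that should be
# accepted, values that should be rejected).
HEURISTICS = {
    r"email": (["user@example.com"], ["not-an-email", ""]),
    r"age":   (["0", "42"], ["-1", "abc"]),
    r"phone": (["+61 3 9000 0000"], ["phone?", ""]),
}

def generate_checks(field_names):
    """Return a list of (field, value, expected_outcome) test inputs."""
    checks = []
    for field in field_names:
        for pattern, (valid, invalid) in HEURISTICS.items():
            if re.search(pattern, field, re.IGNORECASE):
                checks += [(field, value, "accept") for value in valid]
                checks += [(field, value, "reject") for value in invalid]
    return checks

if __name__ == "__main__":
    # Field names as they might be discovered by exploring a registration page.
    for check in generate_checks(["CustomerEmail", "Age", "PhoneNumber"]):
        print(check)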
Goal 2: Live Validation - dynamic checking of test scripts at
authoring time
Test case authoring tools should provide the same level of dynamic checking and assistance as is given by interactive development environments for other software (such as Eclipse).
Specifically:
• If the script makes reference to a deleted or misspelled object, or is missing a parameter, this should be detected
at authoring time and flagged to the script author. This
would prevent many runtime test failures.
• Script authoring tools should know what actions are
possible, and what parameters are valid for each action.
These could be offered to the script author through dropdown lists and predictive text.
• If an object is changed or removed in a way that would
cause an existing script to fail, a notification should occur
immediately.
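A minimal sketch of such authoring-time validation is given below (the repository contents and action lists are illustrative assumptions); every step is checked against the object repository and against the actions valid for each object type before the script is ever run:

# Check (object, action) steps at authoring time rather than at runtime.
REPOSITORY = {
    "OK Button":      {"type": "button"},
    "CitiesDropDown": {"type": "listbox"},
}

VALID_ACTIONS = {
    "button":  {"click"},
    "listbox": {"select", "click"},
    "textbox": {"enter_text"},
}

def validate_script(steps):
    """Return a list of authoring-time problems found in a script."""
    problems = []
    for obj_name, action in steps:
        obj = REPOSITORY.get(obj_name)
        if obj is None:
            problems.append(f"unknown object: {obj_name!r}")
        elif action not in VALID_ACTIONS.get(obj["type"], set()):
            problems.append(f"invalid action {action!r} for {obj['type']} {obj_name!r}")
    return problems

if __name__ == "__main__":
    script = [("OK Button", "enter_text"), ("CancelButton", "click")]
    for problem in validate_script(script):
        print("authoring-time warning:", problem)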
Goal 3: Adaptive Recovery - self adaptive recovery to test
script interruptions
To address the challenge of test script brittleness, we propose the following collection of techniques:
• Fuzzy matching: when an object required by the test script does not exist, rather than failing because there is no exact match, the script should adaptively recover by applying a fuzzy match to find the closest item to the expected object. The recovery technique would try to find another object whose attributes are the closest match to those of the missing object. To define closest, many alternative similarity measures could be used. If the attributes are numerical, the Euclidean distance or cosine distance might be used. For string attributes, the edit distance could be used. To calculate the overall similarity of two objects, the weighted sum of their attribute similarities could be applied, or a more sophisticated method, such as the Analytic Hierarchy Process. As a simple concrete example, consider a script which expects a listbox called “CitiesDropDown”. When the expected object does not exist, the script might then try to use another listbox called “CitiesList” instead. A sketch of such a matcher is given after this list.
• Check for unexpected popups: when a script is stalled, the proposed framework would automatically detect whether there are any unexpected popup windows and attempt to resolve the popup in the most suitable way. A popup definition would be frequently updated with known popups and the recommended action to resolve each dialog. As tests are automated and client-specific popups are encountered, these would be added to the definition and machine learning would be applied to determine the resolution.
• Perform recovery techniques: when the expected objects do not appear on the current page, the framework should determine possible recovery techniques and attempt these in order to continue the test. Some basic examples of recovery techniques would include: (a) if the current page appears to be a login page, attempt to log in with previous details, as the user session may have expired; (b) if a next or previous button is available, attempt to explore the adjacent pages for the object, as the author may have forgotten this step; (c) if the page has not loaded correctly, refresh the page and try again; and (d) other custom actions determined by the user or learnt from test scripts.
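The following minimal sketch illustrates the fuzzy matching idea (it is not the prototype's implementation; the attribute weights and the use of difflib's similarity ratio in place of a raw edit distance are assumptions for illustration):

from difflib import SequenceMatcher

# Attribute weights are an assumption for illustration, not tuned values.
WEIGHTS = {"name": 0.6, "type": 0.3, "parent": 0.1}

def attribute_similarity(a, b):
    # Similarity in [0, 1] between two string attribute values
    # (SequenceMatcher's ratio stands in for a normalised edit distance).
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def object_similarity(expected, candidate):
    # Weighted sum of the per-attribute similarities of two objects.
    return sum(
        weight * attribute_similarity(expected.get(attr, ""), candidate.get(attr, ""))
        for attr, weight in WEIGHTS.items()
    )

def closest_match(expected, candidates, threshold=0.7):
    # Return the best-matching candidate, or None if nothing is close enough.
    best = max(candidates, key=lambda c: object_similarity(expected, c), default=None)
    if best is not None and object_similarity(expected, best) >= threshold:
        return best
    return None

if __name__ == "__main__":
    expected = {"name": "CitiesDropDown", "type": "listbox", "parent": "SearchForm"}
    on_screen = [
        {"name": "CitiesList", "type": "listbox", "parent": "SearchForm"},
        {"name": "OKButton", "type": "button", "parent": "SearchForm"},
    ]
    print(closest_match(expected, on_screen))  # recovers using CitiesList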
When the test script adaptively recovers from an unexpected condition, this should be reported. The tester
may then assess whether the adaptation was valid. The
tester may also wish to approve the adaptation being
permanently added to the test script, or manually update
the script for the changed conditions.
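The recovery and reporting flow could be organised as a simple chain of techniques, as in the following sketch (the technique names and page representation are hypothetical); each successful recovery is recorded so the tester can later approve or reject it:

def try_recovery(page, recovery_techniques, report):
    """Attempt each recovery technique in order; record any success for review."""
    for technique in recovery_techniques:
        if technique(page):
            report.append({"technique": technique.__name__, "approved": None})
            return True
    return False

# Illustrative techniques; real implementations would drive the UI under test.
def relogin_if_session_expired(page):
    return page.get("is_login_page", False)

def refresh_page(page):
    return page.get("load_failed", False)

if __name__ == "__main__":
    report = []
    page = {"is_login_page": True}
    recovered = try_recovery(page, [relogin_if_session_expired, refresh_page], report)
    print(recovered, report)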
Goal 4: Automatic Maintenance - automatic updates of test
scripts for application changes
The final goal of adaptive automated testing is to link testing
scripts directly to changes made to the application under test.
When the version control system of the application under test (such as Git) is available, changes to the user interface could
be automatically detected. The types of changes which affect
the test script include: changes to the name of a referenced
object, a referenced object is deleted or a new user interface
object is added or otherwise changed. The test owner should be alerted to any such detected changes. The ultimate goal
of adaptive automated testing would be for such changes to
automatically update the associated test scripts, with the test
owner only asked to confirm the changes. Automated updates
to the script would include updating changed object names,
removing code referencing deleted elements and expanding
the test for the new elements.
An alternative approach, where the version control system of the application under test is not available, is to automatically
analyse the user interface for new versions of the application
binary or SaaS-delivered application. The same kinds of information can be inferred by discovering the elements in the user interface screens and pages and comparing them to the
previous version. The actionable steps, such as informing the
test owner or automatically updating the test scripts would be
the same as with a source control system.
When a new element is detected in a page that belongs to a specified script, the framework could use the information it already has about this field to modify the existing test scripts. For example, if an online store adds a new size option to their “Add to Cart” feature, the tool already knows that when Small, Medium or Large is selected the corresponding value is added to the cart; it would be a simple matter for it to recognise the new “Extra Large” option and apply the same logic.
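A minimal sketch of the element-inventory comparison (assumed data structures, not a committed design) is shown below; the inventories discovered for the previous and current versions of a screen are diffed to find added, removed and changed elements:

def diff_elements(previous, current):
    """Compare two {name: attributes} element inventories for one screen."""
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    changed = sorted(
        name for name in set(previous) & set(current)
        if previous[name] != current[name]
    )
    return {"added": added, "removed": removed, "changed": changed}

if __name__ == "__main__":
    previous = {"SizeSmall": {"type": "option"}, "SizeMedium": {"type": "option"}}
    current = {"SizeSmall": {"type": "option"},
               "SizeMedium": {"type": "option"},
               "SizeExtraLarge": {"type": "option"}}
    # The new "SizeExtraLarge" option would be flagged for the test owner.
    print(diff_elements(previous, current))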
IV. PROTOTYPE
We have a proof-of-concept prototype which implements
some of our goals of adaptive test recovery for automated
user interface testing. The prototype addresses three common
issues: missing user interface elements, disruptions to the user
interface flow caused by pop-up dialogs and recovery of tests
using pre-defined recovery techniques.
As an example of how the prototype works, suppose XYZ
Corp is a fictional organisation which uses Google Apps. XYZ
Corp has a test script to validate the onboarding of their new
employees. Now imagine a scenario where the developers at
Google have added an additional field to the Google account
creation process where a user must enter their gender. The test
script was never updated to include this additional step, and
if this test were to fail in the current generation of automation tools, the result would cascade to a large set of test cases that expect to use the newly created account. Figure 2 shows what would happen with the proposed Adaptive Automation solution.
The recovery feature of this framework has detected that
the test case failure was most likely caused by an element
that was not set. It then proceeds to combine a set of logical
heuristics along with what it can learn from similar elements
within similar test cases and determines the best process to
recover from this error. The remaining tests within the test
suite can then proceed as scheduled and the tester can review
the results to determine if the recovery process was acceptable.
Future recovery attempts would be adjusted to compensate for
the user's expectations based on their decision to accept or
reject the recovery attempt.
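As a sketch of the recovery step in this scenario (illustrative only; the field name and option values are hypothetical), the framework might choose a value for the unset mandatory field by combining a simple heuristic with values observed in similar test cases, and flag the choice for review:

def recover_unset_field(field_name, field_options, similar_test_values):
    """Pick a value for an unset required field and flag it for tester review."""
    # Prefer a value that similar test cases have already used successfully.
    for value in similar_test_values.get(field_name, []):
        if value in field_options:
            return {"value": value, "source": "learned", "needs_review": True}
    # Otherwise fall back to the first available option.
    return {"value": field_options[0], "source": "heuristic", "needs_review": True}

if __name__ == "__main__":
    # Hypothetical new mandatory field in the account-creation flow.
    choice = recover_unset_field(
        "gender",
        ["Female", "Male", "Other", "Prefer not to say"],
        {"gender": ["Prefer not to say"]},
    )
    print(choice)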
V. RELATED WORK
HP Unified Functional Testing (formerly HP QTP) is possibly the most well-known tool available in today's market. HP UFT offers a rich set of features in a polished graphical user interface, including a test flow viewer, keyword-driven tests, scripted tests, capture-replay and many more.
Fig. 2. A screenshot of the adaptive automation framework prototype showing the results of a test run with adaptive error recovery.
Selenium is a well known open source test automation
tool which focuses on testing web applications. It allows testers
to develop tests through recording tools or through manual
scripting, and to execute those tests on multiple browsers
across various platforms.
Cucumber is another well known open source framework
offering a behaviour-driven development approach to test automation. It is a Ruby framework that is both powerful and simple, allowing tests to be described through plain-language scenarios.
VI. DISCUSSION
Automated testing is an important enabler for the rapid
transfer of a software update from Development to Operations.
However the current state of automated testing is limited
because the time investment required at the test authoring and
maintenance stages often outweighs the time savings yielded
from automatic test execution. Automated test scripts are brittle in the face of small changes in the application under test or the test environment. This makes automated testing unreliable and high-maintenance.
We have proposed a framework for adaptive automated testing. Our framework supports automated testing at three stages
of the test automation lifecycle: test design, test execution
and test maintenance. Our framework automatically generates
tests directly from the application under test, reducing the
cost of implementation. Test case authoring is supported with
live validation of referenced objects in the test scripts. Test
execution is made robust by self adaptive recovery, using a
combination of techniques to recover from missing elements
or unexpected events. Test maintenance is automated or semi-automated by directly linking changes to the application to
updates in the test script.
We have developed a proof of concept prototype which
demonstrates self adaptive recovery. Our prototype applies
fuzzy matching to find the closest match in the event of
missing elements. It also automatically responds to unexpected
events such as popup warning windows and applies recovery
techniques to failing tests.
The focus of this paper has been on automated user interface
driven testing. Our techniques could also be adapted to work
with other forms of testing, such as API testing, performance
testing and accessibility testing.
ACKNOWLEDGMENT
The authors thank Professor John Grundy for the very
informative discussions and suggested directions.
REFERENCES
[1] TechNavio, “Global Software Testing Service Market 2014-2018,” 24
September 2014, SKU: IRTNTR4257. [Online]. Available: http://www.
technavio.com/report/global-software-testing-service-market-2014-2018
[2] Micro Focus, “Testing: IT’s Invisible Giant,” Micro Focus, Tech.
Rep., 2011. [Online]. Available: https://www.microfocus.com/assets/
testing-its-invisible-giant tcm6-200416.pdf
[3] C. Kaner, “The impossibility of complete testing,” Software QA, vol. 4,
no. 4, p. 28, 1997.