Hindawi Publishing Corporation
Advances in Software Engineering
Volume 2010, Article ID 671284, 8 pages
doi:10.1155/2010/671284
Research Article
A Proposal for Automatic Testing of GUIs
Based on Annotated Use Cases
Pedro Luis Mateo Navarro,1,2 Diego Sevilla Ruiz,1,2 and Gregorio Martínez Pérez1,2
1 Departamento de Ingeniería de la Información y las Comunicaciones, University of Murcia, 30071 Murcia, Spain
2 Departamento de Ingeniería y Tecnología de Computadores, University of Murcia, 30071 Murcia, Spain
Correspondence should be addressed to Pedro Luis Mateo Navarro, pedromateo@um.es
Received 16 June 2009; Accepted 14 August 2009
Academic Editor: Phillip Laplante
Copyright © 2010 Pedro Luis Mateo Navarro et al. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
This paper presents a new approach to automatically generate GUI test cases and validation points from a set of annotated use
cases. This technique helps to reduce the effort required in GUI modeling and test coverage analysis during the software testing
process. The test case generation process described in this paper is initially guided by use cases describing the GUI behavior,
recorded as a set of interactions with the GUI elements (e.g., widgets being clicked, data input, etc.). These use cases (modeled
as a set of initial test cases) are annotated by the tester to indicate interesting variations in widget values (ranges, valid or invalid
values) and validation rules with expected results. Once the use cases are annotated, this approach uses the new defined values
and validation rules to automatically generate new test cases and validation points, easily expanding the test coverage. Also, the
process allows narrowing the GUI model testing to precisely identify the set of GUI elements, interactions, and values the tester is
interested in.
1. Introduction
It is well known that testing the correctness of a Graphical
User Interface (GUI) is difficult for several reasons [1]. One
of those reasons is that the space of possible interactions with
a GUI is enormous, which leads to a large number of GUI
states that have to be properly tested (a related problem is
determining the coverage of a set of test cases); this large
number of possible GUI states results in a large number of
input permutations that have to be considered. Another reason
is that validating the GUI state is not straightforward, since
it is difficult to define which objects (and which properties of
these objects) have to be verified.
This paper describes a new approach that lies between
Model-less and Model-Based Testing. The approach
defines a GUI test case autogeneration process based on a
set of use cases (which describe the GUI behavior)
and the annotation (definition of values, validation rules,
etc.) of the relevant GUI elements. The process automatically
generates all the possible test cases depending on the values
defined during the annotation process and incorporates
new validation points where validation rules have been
defined. Then, in a later execution and validation process, the
test cases are automatically executed and all the validation
rules are verified in order to check whether they are met.
The rest of the paper is structured as follows. Related
work is presented in Section 2. In Section 3 we describe
the new testing approach. Annotation, autogeneration, and
execution/validation processes are described in Sections 4,
5, and 6, respectively. Section 7 presents a worked example, and,
finally, Section 8 provides conclusions and lines of future work.
This paper is an extended version of the submitted con-
tribution to the “Informatik 2009: Workshop MoTes09” [2].
2. Related Work
Model-Based GUI Testing approaches can be classified
depending on the amount of GUI details that are included
in the model. By GUI details we mean the elements which
are chosen by the Coverage Criteria to faithfully represent the
tested GUI (e.g., window properties, widget information and
properties, GUI metadata, etc.).
Many approaches usually choose all window and widget
properties in order to build a highly descriptive model of
the GUI. For example, in [1] (Xie and Memon) and in
[3, 4] (Memon et al.) a process based on GUI
Ripping is described, a method which traverses all the windows of the
GUI and analyses all the events and elements that may appear
in order to automatically build a model. That model is composed of
a set of graphs which represent all the GUI elements (a tree
called GUI Forest) and all the GUI events and their interactions
(Event-Flow Graphs (EFG) and Event Interaction Graphs
(EIG)). At the end of the model building process, the model has to be
verified, fixed, and completed manually by the developers.
Once the model is built, the process automatically explores
all the possible test cases. Of those, the developers select
the set of test cases identified as meaningful, and the Oracle
Generator creates the expected output (a Test Oracle [5] is a
mechanism which generates the outputs that a product should
produce, used to determine, after a comparison process, whether
the product has passed or failed a test, e.g., a previously stored
state that has to be met in future test executions. Test Oracles
may also be based on a set of rules, related to the product,
that have to be validated during test execution). Finally, test
cases are automatically executed and their output is compared
with the Oracle's expected results.
As noted in [6], the primary problem with these
approaches is that, as the number of GUI elements increases,
the number of event sequences grows exponentially. Another
problem is that the model has to be verified, fixed, and
completed manually by the testers, which is itself a tedious
and error-prone process. These problems lead to other
problems, such as scalability and tolerance to modifications. In
these techniques, adding a new GUI element (e.g., a new
widget or event) has two worrying side effects. First, it may
cause the set of generated test cases to grow exponentially
(all paths are explored); second, it forces a GUI Model
update (and a manual verification and completion) and the
regeneration of all affected test cases.
Other approaches use more restrictive coverage criteria
in order to focus the test case autogeneration efforts on only
a section of the GUI, one which usually includes all the relevant
elements to be tested. In [7] Vieira et al. describe a method
in which enriched UML Diagrams (UML Use Cases and
Activity Diagrams) are used to describe which functionalities
should be tested and how to test them. The diagrams are
enriched in two ways. First, the UML Activity Diagrams are
refined to improve the accuracy; second, these diagrams are
annotated by using custom UML Stereotypes representing
additional test requirements. Once the model is built, an
automated process generates test cases from these enriched
UML diagrams. In [8] Paiva et al. also describe a UML
Diagrams-based model. In this case, however, the model is
translated to a formal specification.
The scalability of this approach is better than that of the
previously mentioned ones because it focuses its efforts only on
a section of the model. The diagram refinement also helps
to reduce the number of generated test cases. On the other
hand, some important limitations make this approach not
so suitable for certain scenarios. The building, refining, and
annotation processes require a considerable effort since they
have to be performed manually, which does not suit some
methodologies such as, for instance, Extreme Programming;
these techniques also have a low tolerance to modifications;
finally, testers need to have knowledge of the design of the
tested application (or have the UML model), which makes
it impossible to test binary applications or applications with an
unknown design.
3. Overview of the Annotated Use Case
Guided Approach
In this paper we introduce a new GUI Testing approach
between Model-less and Model-Based testing. The new
approach is based on a Test Case Autogeneration process
that does not build a complete model of the GUI. Instead,
it models two main elements that are the basis of the test case
autogeneration process.
(i) A Set of Use Cases. These use cases describe the
behavior of the GUI to be tested and serve as the
base for the test cases that
are going to be generated automatically.
(ii) A Set of Annotated Elements. This set includes the
GUI elements whose values may vary and those
with interesting properties to validate. The values
define new variation points for the base use cases; the
validation rules define new validation points for the
widget properties.
With these elements, the approach addresses the needs
of GUI verification, since, as stated in [7], the testing
of a scenario can usually be accomplished in three steps:
launching the GUI, performing several use cases in sequence,
and exiting. The approach combines the benefits of both
“Smoke Testing” [4, 9] and “Sanity Testing” [10]: it is able
to assure that the system under test will not catastrophically
fail and to test the main functionality (in the first steps of the
development process), as well as to perform fine-tuned checking and property
validation (in the final steps of the development process) through
an automated script-based process.
The test case generation process described in this paper
takes as its starting point the set of use cases (a use case is a
sequence of events performed on the GUI; in other words,
a use case is a test case) that describe the GUI behavior.
From this set, it creates a new set of autogenerated test
cases, taking into account the variation points (according to
possible different values of widgets) and the validation rules
included in the annotations. The resulting set includes all the
new autogenerated test cases.
The test case autogeneration process can be seen, at the
test case level, as the construction of a tree (which initially
represents a test case composed of a sequence of test items)
to which a new branch is added for each new value defined in
the annotations. The validation rules are incorporated later
as validation points.
Therefore, in our approach, modeling the GUI and the
application behavior does not involve building a model
including all the GUI elements and generating a potentially
large number of test cases exploring all the possible event
sequences.

Figure 1: Schematic representation of the Widget Annotation Process.

On the contrary, the approach works by defining a set of
test cases and annotating the most important GUI elements
to include both interesting values (range of valid values,
out-of-range values) and a set of validation rules (expected
results and validation functions) in order to guide the test
case generation process. It is also not necessary to manually
verify, fix, or complete any model in this approach, which
removes this tedious and error-prone process from the
GUI Testing process and eases the work of the testers.
These characteristics help to improve the scalability and the
modifications tolerance of the approach.
Once the new set of test cases is generated, and the
validation rules are incorporated, the process ends with
the test case execution process (that includes the validation
process). The result of the execution is a report including
any information relevant to the tester (e.g., the number of tests
performed, errors during the execution, the values that caused
these errors, etc.). In the future, the generated test case set
can be re-executed in order to perform a regression testing
process that checks if the functionality that was previously
working correctly is still working.
4. Annotation Process
The annotation process is the process by which the tester
indicates what GUI elements are important in terms of the
following: First, which values can a GUI element hold (i.e.,
a new set of values or a range), and thus should be tested;
second, what constraints should be met by a GUI element
at a given time (i.e., validation rules), and thus should be
validated. The result of this process is a set of annotated
GUI elements which will be helpful during the test case
autogeneration process in order to identify the elements that
represent a variation point, and the constraints that have to
be met for a particular element or set of elements. From now
on, this set will be called Annotation Test Case.
This process could be implemented, for example,
using a capture and replay (C&R) tool (a Capture and
Replay Tool captures events from the tested application
and uses them to generate test cases that replay the
actions performed by the user; the authors of this paper
have worked on the design and implementation of such
a tool as part of a previous research work, accessible on-
line at http://sourceforge.net/projects/openhmitester/ and at
http://www.um.es/catedraSAES/). These tools provide the
developers with access to the widgets' information (and
also with the ability to store it), so they could use this
information, along with the new values and the validation
rules (provided by the tester in the annotation process), to
build the Annotation Test Case.
As we can see in Figure 1, the annotation process, which
starts with the tested application launched and its GUI ready
for use, can be performed as follows:
(1) For each widget the tester interacts with (e.g., to
perform a click action on a widget or enter some data
by using the keyboard), he or she can choose between
two options: annotate the widget (go to the next step)
or continue as usual (go to step 3).
(2) A widget can be annotated in two ways, depending on
the chosen Test Oracle method. It might be an “Assert
Oracle” (checks a set of validation rules related to the
widget state), or a “State Oracle” (checks if the state
of the widget during the execution process matches the
state stored during the annotation process).
(3) The annotations (if the tester has decided to annotate
the widget) are recorded by the C&R tool as part
of the Annotation Test Case. The GUI performs the
actions triggered by the user interaction as usual.
(4) The GUI is now ready to continue. The tester can
continue interacting with the widgets to annotate
them or just finish the process.
The annotated widgets should be chosen carefully as
too many annotated widgets in a test case may result in an
explosion of test cases. Choosing an accurate value set also
helps to get a reasonable test suite size, since during the
test case autogeneration process, all the possible combinations
of annotated widgets and defined values are explored in
order to generate a complete test suite which explores all the
paths that can be tested. These are therefore two important aspects
to consider, since the scalability of the generated test suite
depends directly on the number of annotated widgets and
the value sets defined for them.
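To make this dependency concrete (our own restatement, not a formula given in the paper): for a use case whose annotated widgets w1, ..., wn carry value sets V1, ..., Vn, the autogeneration process produces one test case per combination of values, that is,

    N_generated = |V1| × |V2| × ... × |Vn|

test cases; the example in Section 7, with value sets of sizes 11, 3, and 3, yields 11 × 3 × 3 = 99 test cases.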
Regarding the definition of the validation rules that are
going to be considered in a future validation process, the tester
has to select the type of test oracle depending on his or
her needs.
For the annotation process of this approach we consider
two different test oracles.
(i) Assert Oracles. These oracles are useful in two ways.
First, if the tester defines a new set of values or a
range, new test cases will be generated to test these
values in the test case autogeneration process; second,
if the tester also defines a set of validation rules,
these rules will be validated during the execution and
validation process.
(ii) State Oracles. These oracles are useful when the tester
has to check if a certain widget property or value
remains constant during the execution and validation
process (e.g., a widget that can not be disabled).
In order to define the new value sets and the validation
rules, it is necessary to incorporate into the process a specifi-
cation language which allows the tester to indicate the new
values to be tested and the constraints that
have to be met. This specification language might be a
constraint language such as, for instance, the Object Constraint
Language (OCL) [11], or a scripting language such as, for instance,
Ruby [12]. These kinds of languages can be used to allow the
tester to identify the annotated object and specify new values
and validation rules for it. It is also necessary to establish a
mapping between widgets and constructs of the specification
language; both languages provide mechanisms to implement
this feature.
Validation rules can also be set to specify whether the tester
wants the rules to be validated before (precondition) or after
(postcondition) an action is performed on the annotated
widget. For example, if the tester is annotating a button
(during the annotation process), it might be interesting to
check some values before the button is pressed, as that button
operates with those values; it also might be interesting to
check, after that button is pressed, whether the obtained result meets
some constraints. The possibility of deciding whether the validation
rules are going to be checked before or after an action
is performed (these are the well-known preconditions and
postconditions) allows the tester to perform a more powerful
validation process. This process could be completed with the
definition of an invariant, for example, together with the
state oracles, since an invariant is composed of a set of
constraints that have to be met throughout the process (an
invariant in this domain would be a condition that is always
met in the context of the current dialog).
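As an illustration only (the paper leaves the concrete specification language open, suggesting OCL or Ruby as candidates), an annotation carrying value sets and pre/postcondition rules could be sketched in Python as follows; the widget accessor names are assumptions, not part of the original proposal:

    from dataclasses import dataclass, field
    from typing import Callable, List

    @dataclass
    class ValidationRule:
        when: str                         # "pre": check before the action; "post": check after it
        check: Callable[[object], bool]   # receives the annotated widget, returns True if the rule holds
        message: str = ""

    @dataclass
    class Annotation:
        widget_id: str                                        # maps the annotation to a GUI widget
        values: List[object] = field(default_factory=list)    # variation points for the widget
        rules: List[ValidationRule] = field(default_factory=list)

    # Example: a button annotated with a postcondition stating that a result
    # label must not be empty after the button has been pressed.
    calc_button = Annotation(
        widget_id="calcButton",
        rules=[ValidationRule(when="post",
                              check=lambda w: w.result_label_text() != "",
                              message="the result label must hold a value")],
    )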
Figure 2: Schematic representation of the Test Case Autogeneration Process.
5. Test Case AutoGeneration Process
The test case autogeneration process is the process that
automatically generates a new set of test cases from two
elements:
(i) a test suite composed of an initial set of test cases
(those corresponding to the use cases that represent
the behavior of the GUI);
(ii) a special test case called Annotation Test Case which
contains all the annotations corresponding to the
widgets of a GUI.
As can be seen in Figure 2, the process follows these steps:
(1) As said above, the process is based on an initial test
suite and an Annotation Test Case. Both together
make up the initial Annotated Test Suite.
(2) The test case autogeneration process explores all the
base use cases. For each use case, it generates all
the possible variations depending on the values
previously defined in the annotations. It also adds
validators for ensuring that the defined rules are met.
(This process is properly explained at the end of this
section).
(3) The result is a new Annotated Test Suite which
includes all the auto-generated test cases (one for
each possible combination of values) and the Anno-
tation Test Case used to generate them.
The set of auto-generated test cases can be updated, for
example, if the tester has to add or remove new use cases
due to a critical modification in the GUI, or if new values
or validation rules have to be added or removed. The tester
will then update the initial test case set, the Annotation Test
Case, or both, and will rerun the generation process.
The algorithm corresponding to the test case autogeneration process is shown in Algorithm 1.
The process will take as its starting point the Annotation
Test Case and the initial set of test cases, from which it will
generate new test cases taking into account the variation
points (the new values) and the validation rules included in
the annotations.
For each test case in the initial set, the process inspects
every test item (a test case is composed of a set of steps called
test items) in order to detect whether the widget referred to by this
test item is included in the annotated widget list. If so, the
process generates all the possible variations of the test case
(one for each different value, if any exist), also adding a validation
point if some validation rules have been defined. Once the
process has generated all the variations of a test case, it adds
them to the result set. Finally, the process returns a set of
test cases which includes all the variations of the initial test
cases.
Figure 3 is a graphical representation of how the algo-
rithm works. The figure shows an initial test case which
includes two annotated test items (an Annotated Test Item
is a test item that includes a reference to an annotated
widget). The annotation for the first widget specifies only two
different values (15 and 25); the annotation for the second
one specifies two new values (1 and 2) and introduces two
validation rules (one related to the colour property of the
widget and another related to the text property). The result of
the test case autogeneration process will be four new test cases,
one for each possible path (15-1, 15-2, 25-1, and 25-2), and
a validation point in the second annotated test item which
will check if the validation rules mentioned before are met
or not.
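The following sketch is our own minimal illustration of the branching behavior just described (it is not the authors' implementation; test items are simplified to dictionaries and rules to opaque strings):

    from itertools import product

    def autogenerate(test_case, annotations):
        """test_case: list of test items; annotations: widget_id -> {"values": [...], "rules": [...]}."""
        # For every test item, use the annotated values if present, otherwise keep the recorded value.
        alternatives = []
        for item in test_case:
            ann = annotations.get(item["widget"])
            alternatives.append(ann["values"] if ann and ann["values"] else [item["value"]])
        # One generated test case per combination of values; annotated items also keep their rules,
        # which become validation points at execution time.
        generated = []
        for combination in product(*alternatives):
            variant = []
            for item, value in zip(test_case, combination):
                ann = annotations.get(item["widget"])
                variant.append({"widget": item["widget"], "value": value,
                                "rules": ann["rules"] if ann else []})
            generated.append(variant)
        return generated

    # The Figure 3 situation: two annotated widgets with two values each give four test cases.
    use_case = [{"widget": "w1", "value": 0}, {"widget": "w2", "value": 0}]
    annotations = {"w1": {"values": [15, 25], "rules": []},
                   "w2": {"values": [1, 2], "rules": ["assert widget.text == '2' if value == 2"]}}
    print(len(autogenerate(use_case, annotations)))  # prints 4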
6. Execution and Validation Process
The execution and validation process is the process by which
the test cases (auto-generated in the last step) are executed
over the target GUI and the validation rules are asserted to
check whether the constraints are met. The test case execution
process executes all the test cases in order. It is very important
that, before each test case is executed, the GUI
is reset to its initial state in order to ensure that all
the test cases are launched and executed under the same
conditions.
This feature allows the tester to implement different test
configurations, ranging from a set of a few test cases (e.g.,
to test a component, a single panel, a use case, etc.), to an
extensive battery of tests (e.g., for a nightly or regression
testing process [4]).
As for the validation process, in this paper we describe a
Test Oracle based validation process, which uses test oracles
[1, 5] to perform widget-level validations (since the valida-
tion rules refer to the widget properties). (A Test Oracle is a
mechanism that generates the expected output that a product
should produce, used to determine, after a comparison process,
whether the product has passed or failed a test.) The features
of the validation process vary depending on the oracle method
selected during the annotation process, as described below.
(i) Assert Oracles. These oracles check if a set of val-
idation rules related to a widget are met or not.
Therefore, the tester needs to somehow define a set of
validation rules. As said in Section 4 corresponding
to the annotation process, defining these rules is
not straightforward. Expressive and flexible (e.g.,
constraint or script) languages are needed to allow
the tester to define assert rules for the properties
of the annotated widget, and, possibly, to other
widgets. Another important pitfall is that if the GUI
encounters an error, it may reach an unexpected or
inconsistent state. Further executing the test case is then
useless; therefore some mechanism is necessary
to detect these “bad states” and stop the test case
execution (e.g., a special statement which indicates
that the execution and validation process has to finish
if an error is detected).
(ii) State Oracles. These oracles check if the state of the
widget during the execution process matches the state
stored during the annotation process. To implement
this functionality, the system needs to know how
to extract the state from the widgets, represent it
somehow, and be able to check it for validity. In
our approach, it could be implemented using widget
adapters which, for example, could represent the state
of a widget as a string; so, the validation would be as
simple as a string comparison.
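A possible shape for such a widget adapter, sketched in Python purely for illustration (the property names are assumptions; the paper does not prescribe any concrete API), is the following:

    class SpinBoxAdapter:
        """Hypothetical adapter that serializes the observable state of a spin box."""
        def __init__(self, spinbox):
            self.spinbox = spinbox

        def state_as_string(self):
            # Only the properties considered part of the widget "state" are serialized.
            return f"value={self.spinbox.value};enabled={self.spinbox.enabled}"

    def state_oracle_holds(adapter, recorded_state):
        # The oracle passes if the current state matches the state stored during annotation.
        return adapter.state_as_string() == recorded_state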
The validation process may be additionally completed
with Crash Oracles, which perform an application-level
validation (as opposed to widget-level) as they can detect
crashes during test case execution. These oracles are used to
signal and identify serious problems in the software; they are
very useful in the first steps of the development process.
Finally, it is important to remember that there are two
important limitations when using test oracles in GUI testing
[5]. First, GUI events have to be deterministic in order to be
able to predict their outcome (e.g., it would not make sense
if the process is validating a property which depends on a
random value); second, since the software back-end is not
modeled (e.g., data in a database), the GUI may return an
unexpected state which would be detected as an error (e.g.,
if the process is validating the output in a database query
application, and the content of this database changes during
the process).
7. Example
In order to show this process working on a real example, we
have chosen a fixed-term deposit calculator application. This
example application has a GUI (see Figure 4) composed of a
set of widgets: a menu bar, three number boxes (two integer
and one double), two buttons (one to validate the values and
another to operate with them), and a label to output the
obtained result. Obviously, there are other widgets in that
GUI (i.e., a background panel, text labels, a main window,
etc.), but these elements are not of interest for the example.
Figure 3: Test case branching.
initial_test_set ← ...              // initial test case set
auto_gen_test_set ← { }             // auto-generated test case set (empty)
annotated_elements ← ...            // user-provided annotations
for all TestCase tc ∈ initial_test_set do
  new_test_cases ← add_test_case(new_test_cases, tc)
  for all TestItem ti ∈ tc do
    if ti.widget ∈ annotated_elements then
      annotations ← annotations_for_widget(ti.widget)
      new_test_cases ← create_new_test_cases(new_test_cases, annotations.values)
      new_test_cases ← add_validation_rules(new_test_cases, annotations.rules)
    end if
  end for
  auto_gen_test_set ← auto_gen_test_set ∪ new_test_cases
end for
return auto_gen_test_set

Algorithm 1: Test case autogeneration algorithm.
Figure 4: Example dialog.
A common use case for this application is the following:
(1) start the application (the GUI is ready),
(2) insert the values in the three number boxes,
(3) validate the values and, if they are accepted, click the “Calc Interest” button and see the
result,
(4) exit by clicking the “Exit” option in the “File” menu.
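For illustration (our own continuation of the earlier sketch, not the paper's notation, and with hypothetical widget identifiers), this recorded use case could be represented as the following initial test case:

    deposit_use_case = [
        {"widget": "interestRate",   "value": 2.5},   # step (2): insert the three values
        {"widget": "depositAmount",  "value": 5000},
        {"widget": "duration",       "value": 6},
        {"widget": "validateButton", "value": None},  # step (3): validate, then calculate
        {"widget": "calcInterest",   "value": None},
        {"widget": "exitMenu",       "value": None},  # step (4): exit through the File menu
    ]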
The valid values for the number boxes are the following.
(i) Interest Rate. Assume that the interest rate imposed
by the bank is between 2 and 3 percent (both
included).
(ii) Deposit Amount. Assume that the initial deposit
amount has to be greater than or equal to 1000, and no
more than 10 000.
(iii) Duration. Assume that the duration in months has to
be greater than or equal to 3, and less than or equal to 12
months.
The behavior of the buttons is the following. If a number
box is out of range, the “Calc Interest” button changes its
background colour to red (otherwise, it has to stay white);
once it is pressed, it calculates the result using the values and
writes it in the corresponding label. If the values are out of
range, the label must read “Data error.” Otherwise, the
actual interest amount must be shown.
Figure 5: Auto-generated test case representation for the example dialog.
Therefore, the annotations for widgets are as follows.
(i) “Interest rate” spinbox: a set of values from 2 to 3 with
a 0.1 increase.
(ii) “Deposit amount” spinbox: a set of values composed
of the three values 500, 1000, and 8000. (Note that
the value of 500 will introduce a validation error in
the test cases.)
(iii) “Duration” spinbox: a set of three values, 6, 12, and
24. Again, the last value will not validate.
(iv) “Calc Interest” button: depending on the values of the
three mentioned text boxes, check the following.
(1) If the values are within the appropriate ranges, the
background color of this button must be white,
and as a postcondition, the value of the label must
hold the calculated interest value (a formula may be
supplied to actually verify the value).
(2) Else, if the values are out of range, the background
color of the button must be red, and as a post-
condition, the value of the label must be “Data error.”
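Purely as an illustration of how these two checks might be written down (the helper names and the interest formula below are our assumptions, not part of the paper):

    def in_range(interest, amount, duration):
        # Valid ranges taken from the example: interest 2-3%, amount 1000-10000, duration 3-12 months.
        return 2 <= interest <= 3 and 1000 <= amount <= 10_000 and 3 <= duration <= 12

    def expected_interest(interest, amount, duration):
        # Placeholder for the formula the tester would supply to verify the computed value.
        return f"{amount * (interest / 100.0) * (duration / 12.0):.2f}"

    def calc_interest_postcondition(gui):
        interest, amount, duration = gui.interest(), gui.amount(), gui.duration()
        if in_range(interest, amount, duration):
            return (gui.button_color("calcInterest") == "white"
                    and gui.label_text("result") == expected_interest(interest, amount, duration))
        return (gui.button_color("calcInterest") == "red"
                and gui.label_text("result") == "Data error")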
Once the initial use case is recorded and the widgets
are properly annotated (as said, both processes might be
performed with a capture/replay tool), they are used to
compose the initial Annotated Test Suite, which will be the
basis for the test case autogeneration process.
We can see the result of the test case autogeneration process in
Figure 5. The new Annotated Test Suite generated by the
process is composed of 99 test cases (11 values for the
“Interest rate,” 3 different “Deposit amounts,” and 3 different
“Durations”) and a validation point located at the “Calc
Interest” button click (to check whether the values are valid and
whether the background colour changes accordingly).
The process automatically generates one test case for each
possible path by taking into account all the values defined in
the annotation process; it also adds validation points where
the validation rules have been defined. The new set of auto-
generated test cases allows the tester to test all the possible
variations of the application use cases.
Finally, the execution and validation process will execute
all the test cases included in the generated Annotated Test
Suite and will return a report including all the information
related to the execution and validation process, showing the
number of test cases executed, the time spent, and the values
not equal to those expected.
8. Conclusions and Future Work
Automated GUI test case generation is an extremely resource-
intensive process as it is usually guided by a complex and
fairly difficult-to-build GUI model. In this context, this
paper presents a new approach for automatically generating
GUI test cases based on both GUI use cases (required
functionality), and annotations of possible and interesting
variations of graphical elements (which generate families of
test cases), as well as validation rules for their possible values.
This reduces the effort required in the test coverage and GUI
modeling processes. Thus, this method would help reduce
the time needed to develop a software product, since the
testing and validation processes require less effort.
As a statement of direction, we are currently working
on an architecture and the details of an open-source
implementation which will allow us to implement these ideas and
to address future challenges such as, for example, extending the GUI testing
process towards the application logic, or executing a battery
of tests in parallel in a distributed environment.
Acknowledgments
This paper has been partially funded by the “Cátedra SAES
of the University of Murcia” initiative, a joint effort between
Sociedad Anónima de Electrónica Submarina (SAES),
http://www.electronica-submarina.com/ and the University
of Murcia to work on open-source software, and real-time
and critical information systems.
References
[1] Q. Xie and A. M. Memon, “Model-based testing of
community-driven open-source GUI applications,” in Pro-
ceedings of the 22nd IEEE International Conference on Software
Maintenance (ICSM ’06), pp. 203–212, Los Alamitos, Calif,
USA, 2006.
[2] P. Mateo, D. Sevilla, and G. Martínez, “Automated GUI testing
validation guided by annotated use cases,” in Proceedings
of the 4th Workshop on Model-Based Testing (MoTes ’09) in
Conjunction with the Annual National Conference of the German
Association for Informatics (GI ’09), Lübeck, Germany,
September 2009.
[3] A. Memon, I. Banerjee, and A. Nagarajan, “GUI ripping:
reverse engineering of graphical user interfaces for testing,” in
Proceedings of the 10th IEEE Working Conference on Reverse
Engineering (WCRE ’03), pp. 260–269, Victoria, Canada,
November 2003.
[4] A. Memon, I. Banerjee, N. Hashmi, and A. Nagarajan, “Dart:
a framework for regression testing “nightly/daily builds” of
GUI applications,” in Proceedings of the IEEE International
Conference on Software Maintenance (ICSM ’03), pp. 410–419,
2003.
[5] Q. Xie and A. M. Memon, “Designing and comparing
automated test oracles for GUI based software applications,”
ACM Transactions on Software Engineering and Methodology,
vol. 16, no. 1, p. 4, 2007.
[6] X. Yuan and A. M. Memon, “Using GUI run-time state as
feedback to generate test cases,” in Proceedings of the 29th
International Conference on Software Engineering (ICSE ’07),
Minneapolis, Minn, USA, May 2007.
[7] M. Vieira, J. Leduc, B. Hasling, R. Subramanyan, and J.
Kazmeier, “Automation of GUI testing using a model-driven
approach,” in Proceedings of the International Workshop on
Automation of Software Test, pp. 9–14, Shanghai, China, 2006.
[8] A. Paiva, J. Faria, and R. Vidal, “Towards the integration of
visual and formal models for GUI testing,” Electronic Notes in
Theoretical Computer Science, vol. 190, pp. 99–111, 2007.
[9] A. Memon and Q. Xie, “Studying the fault-detection effective-
ness of GUI test cases for rapidly evolving software,” IEEE
Transactions on Software Engineering, vol. 31, no. 10, pp. 884–
896, 2005.
[10] R. S. Zybin, V. V. Kuliamin, A. V. Ponomarenko, V. V.
Rubanov, and E. S. Chernov, “Automation of broad sanity test
generation,” Programming and Computer Software, vol. 34, no.
6, pp. 351–363, 2008.
[11] Object Management Group, “Object constraint language
(OCL),” version 2.0, OMG document formal/2006-05-01,
2006, http://www.omg.org/spec/OCL/2.0/.
[12] Y. Matsumoto, “Ruby Scripting Language,” 2009, http://www.ruby-lang.org/en/.
... This causes many GUI testing tools to (1) rely on a human tester: for example, capture/replay to recreate manually pre-recorded (or programatically coded) event sequences; (2) perform very limited automated testing tasks: for example, tools such as Android's Monkey 6 and Eclipse-based GUIdancer 7 perform simple random walks of the user interface, executing events as they encounter them; and detect crashes. These approaches are insufficient with the result that GUI quality is often compromised [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [2], [19], [20], [21]. 6. http://developer.android.com/guide/developing/tools/ ...
... Examples include variable FSM [15], Complete Interaction Sequences [16], hierarchical state-machine models represented as UML state diagrams [2], off-normal FSM [17], multiple FSMs called Label Transition Systems [18], and semantic models [14], [19], [20]. AI planning has also been used for test case generation [4]. ...
Article
Full-text available
System testing of software applications with a graphical-user interface (GUI) front-end requires that sequences of GUI events, that sample the application’s input space, be generated and executed as test cases on the GUI. However, the context-sensitive behavior of the GUI of most of today’s non-trivial software applications makes it practically impossible to fully determine the software’s input space. Consequently, GUI testers—both automated and manual—working with undetermined input spaces are, in some sense, blindly navigating the GUI, unknowingly missing allowable event sequences, and failing to realize that the GUI implementation may allow the execution of some disallowed sequences. In this paper, we develop a new paradigm for GUI testing, one that we call Observe-Model-Exercise* (OME*) to tackle the challenges of testing context-sensitive GUIs with undetermined input spaces. Starting with an incomplete model of the GUI’s input space, a set of coverage elements to test, and test cases, OME* iteratively observes the existence of new events during execution of the test cases, expands the model of the GUI’s input space, computes new coverage elements, and obtains new test cases to exercise the new elements. Our experiment with 8 open-source software subjects, more than 500,000 test cases running for almost 1,100 machine-days, shows that OME* is able to expand the test space on average by 464.11 percent; it detected 34 faults that had never been detected before.
... This architecture is based on the interception of GUI events by using GUI introspection. They also presented a solution to automatically generate test cases for GUIs [42]. It uses widget annotations to denote the constraints related to the values a widget can hold. ...
Article
Runtime verification (RV) provides essential mechanisms to enhance software robustness and prevent malfunction. However, RV often entails complex and formal processes that could be avoided in scenarios in which only invariants or simple safety properties are verified, for example, when verifying input data in Graphical User Interfaces (GUIs). This paper describes S-DAVER, a lightweight framework aimed at supporting separate data verification in GUIs. All the verification processes are encapsulated in an independent layer and then transparently integrated into an application. The verification rules are specified in separate files and written in interpreted languages to be changed/reloaded at runtime without recompilation. Superimposed visual feedback is used to assist developers during the testing stage and to improve the experience of users during execution. S-DAVER provides a lightweight, easy-to-integrate and dynamic verification framework for GUI data. It is an integral part of the development, testing and execution stages. An implementation of S-DAVER was successfully integrated into existing open-source applications, with promising results. Copyright © 2015 John Wiley & Sons, Ltd.
... Na literatura existem vários trabalhos relacionados sobre as diferentes abordagens para otimizar os testes de GUI ( [5], [14], [12], [2]). No entanto, neste artigo apenas estamos focados nos vários estudos de comparação de ferramentas de captura e reprodução. ...
Article
Graphical user interface (GUI) is becoming more and more important due to its widespread use in software products and computers. However, testing GUI still remains a big challenge to ensure the quality of the system and therefore to improve the user satisfaction. Manual testing permits ad hoc testing, nevertheless it often results in large amounts of manual labor and thus increases cost. Using capture and replay tools to support the testing process helps to overcome the problems of the manual testing. In this paper we compare five open source capture and replay tools in terms of ease of use, and capture and replay features. We compare the tools Abbot, Jacareto, JFCUnit, Marathon and Pounder, assessing generic, and capture and replay characteristics. After evaluating each tool, we selected the one that showed the best results in almost all criteria. The findings of this study may serve as guidance for novice testers or companies that consider employing open source capture and replay tools to automate the GUI testing process.
... It might be dangerous as well if the whole validation process rests on the ability of human testers. To attempt to solve these problems, a semi-automated alternative for validating GUI output based on the idea described in [146] is proposed. ...
Thesis
Full-text available
1 Introduction 1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.1.1 Human-Computer Interaction . . . . . . . . . . . . . . . . . . . . 2 1.1.2 Software Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.1.3 Data Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.1.4 Software Usability . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.1.5 Quality of Experience . . . . . . . . . . . . . . . . . . . . . . . . . 10 Enhancing Software Quality . . . . . . . . . . . . . . . . . . . . . . . . . . 11 1.2.1 Block 1: Achieving Quality in Interaction Components Separately . 12 1.2.2 Block 2: Achieving Quality of User-System Interaction as a Whole . 14 1.3 Goals of this PhD Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 1.4 Publications Related to this PhD Thesis . . . . . . . . . . . . . . . . . . . . 19 1.5 Software Contributions of this PhD Thesis . . . . . . . . . . . . . . . . . . 22 1.5.1 OHT: Open HMI Tester . . . . . . . . . . . . . . . . . . . . . . . . 23 1.5.2 S-DAVER: Script-based Data Verification . . . . . . . . . . . . . . 24 1.5.3 PALADIN: Practice-oriented Analysis and Description of Multi-modal Interaction . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 CARIM: Context-Aware and Ratings Interaction Metamodel . . . . 25 1.6 Summary of Research Goals, Publications, and Software Contributions . . 25 1.7 Context of this PhD Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 1.8 Structure of this PhD Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . 27 2 Related Work 2.1 Group 1: Approaches Assuring Quality of a Particular Interaction Component 30 2.2 Validation of Software Output . . . . . . . . . . . . . . . . . . . . 30 2.1.1.1 Methods Using a Complete Model of the GUI . . . . . . 31 2.1.1.2 Methods Using a Partial Model of the GUI . . . . . . . . 32 2.1.1.3 Methods Based on GUI Interaction . . . . . . . . . . . . 32 Validation of User Input . . . . . . . . . . . . . . . . . . . . . . . . 33 2.1.2.1 Data Verification Using Formal Logic . . . . . . . . . . . 34 2.1.2.2 Data Verification Using Formal Property Monitors . . . . 35 2.1.2.3 Data Verification in GUIs and in the Web . . . . . . . . . 36 Group 2: Approaches Describing and Analyzing User-System Interaction as a Whole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 2.2.1 Analysis of User-System Interaction . . . . . . . . . . . . . . . . . 37 2.2.1.1 Analysis for the Development of Multimodal Systems . . 37 2.2.1.2 Evaluation of Multimodal Interaction . . . . . . . . . . . 41 2.2.1.3 Evaluation of User Experiences . . . . . . . . . . . . . . 44 Analysis of Subjective Data of Users . . . . . . . . . . . . . . . . . 45 2.2.2.1 User Ratings Collection . . . . . . . . . . . . . . . . . . 45 2.2.2.2 Users Mood and Attitude Measurement . . . . . . . . . . 47 Analysis of Interaction Context . . . . . . . . . . . . . . . . . . . . 49 2.2.3.1 Interaction Context Factors Analysis . . . . . . . . . . . 49 2.2.3.2 Interaction Context Modeling . . . . . . . . . . . . . . . 50 3 Evaluating Quality of System Output 3.1 Introduction and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . 54 3.2 GUI Testing Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 3.3 Preliminary Considerations for the Design of a GUI Testing Architecture . 57 3.3.1 Architecture Actors . . . . . . . . . . . . . . . . . . . . . . . . . . 57 3.3.2 Organization of the Test Cases . . . . 
. . . . . . . . . . . . . . . . 57 3.3.3 Interaction and Control Events . . . . . . . . . . . . . . . . . . . . 58 The OHT Architecture Design . . . . . . . . . . . . . . . . . . . . . . . . . 58 3.4.1 The HMI Tester Module Architecture . . . . . . . . . . . . . . . . 60 3.4.2 The Preload Module Architecture . . . . . . . . . . . . . . . . . . . 61 3.4.3 The Event Capture Process . . . . . . . . . . . . . . . . . . . . . . 63 3.4.4 The Event Playback Process . . . . . . . . . . . . . . . . . . . . . . 64 The OHT Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 3.5.1 Implementation of Generic and Final Functionality . . . . . . . . . 66 3.5.1.1 Generic Data Model . . . . . . . . . . . . . . . . . . . . 66 3.5.1.2 Generic Recording and Playback Processes . . . . . . . . 66 Implementation of Specific and Adaptable Functionality . . . . . . 67 3.5.2.1 Using the DataModelAdapter . . . . . . . . . . . . . . . 68 3.5.2.2 The Preloading Process . . . . . . . . . . . . . . . . . . . 68 3.5.2.3 Adapting the GUI Event Recording and Playback Processes 69 3.7 Technical Details About the OHT Implementation . . . . . . . . . 70 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 3.6.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 3.6.2 The Test Case Generation Process . . . . . . . . . . . . . . . . . . 73 3.6.3 Validation of Software Response . . . . . . . . . . . . . . . . . . . 74 3.6.4 Tolerance to Modifications, Robustness, and Scalability . . . . . . . 75 3.6.5 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 76 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 4 Evaluating Quality of Users Input 4.1 Introduction and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . 80 4.2 Practical Analysis of Common GUI Data Verification Approaches . . . . . 82 4.3 Monitoring GUI Data at Runtime . . . . . . . . . . . . . . . . . . . . . . . 83 4.4 Verification Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 4.4.1 Rule Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 4.4.2 Using the Rules to Apply Correction . . . . . . . . . . . . . . . . . 87 4.4.3 Rule Arrangement . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 4.4.4 Rule Management . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 4.4.4.1 88 Loading the Rules . . . . . . . . . . . . . . . . . . . . . xviiContents 4.4.4.2 Evolution of the Rules and the GUI . . . . . . . . . . . . 89 Correctness and Consistency of the Rules . . . . . . . . . . . . . . 90 4.5 The Verification Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 4.6 S-DAVER Architecture Design . . . . . . . . . . . . . . . . . . . . . . . . . 92 4.6.1 Architecture Details . . . . . . . . . . . . . . . . . . . . . . . . . . 92 4.6.2 Architecture Adaptation . . . . . . . . . . . . . . . . . . . . . . . . 94 4.7 S-DAVER Implementation and Integration Considerations . . . . . . . . . 95 4.8 Practical Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 4.8.1 Integration, Configuration, and Deployment of S-DAVER . . . . . 99 4.8.2 Defining the Rules in Qt Bitcoin Trader . . . . . . . . . . . . . . . 100 4.8.3 Defining the Rules in Transmission . . . . . . . . . . . . . . . . . . 103 4.8.4 Development and Verification Experience with S-DAVER . . . . . 106 4.9 Performance Analysis of S-DAVER . . . . . . . . . . . . . . . . . . . . . . 
106 4.10 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 4.10.1 A Lightweight Data Verification Approach . . . . . . . . . . . . . 108 4.10.2 The S-DAVER Open-Source Implementation . . . . . . . . . . . . . 110 4.10.3 S-DAVER Compared with Other Verification Approaches . . . . . . 111 4.11 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 5 Modeling and Evaluating Quality of Multimodal User-System Interaction 115 5.1 Introduction and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . 116 5.2 A Model-based Framework to Evaluate Multimodal Interaction . . . . . . . 118 5.2.1 Classification of Dialog Models by Level of Abstraction . . . . . . 119 5.2.2 The Dialog Structure . . . . . . . . . . . . . . . . . . . . . . . . . 120 5.2.3 Using Parameters to Describe Multimodal Interaction . . . . . . . 121 5.2.3.1 Adaptation of Base Parameters . . . . . . . . . . . . . . 121 5.2.3.2 Defining new Modality and Meta-communication Param- eters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 5.2.3.3 Defining new Parameters for GUI and Gesture Interaction 123 5.2.3.4 Classification of the Multimodal Interaction Parameters . 124 5.3 Design of PALADIN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 5.4 Implementation, Integration, and Usage of PALADIN . . . . . . . . . . . . 129 5.5 Application Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 5.6 Assessment of PALADIN as an Evaluation Tool . . . . . . . . . . . 132 5.5.1.1 Participants and Material . . . . . . . . . . . . . . . . . 134 5.5.1.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . 136 5.5.1.3 Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . 137 Usage of PALADIN in a User Study . . . . . . . . . . . . . . . . . 140 5.5.2.1 Participants and Material . . . . . . . . . . . . . . . . . 140 5.5.2.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . 144 5.5.2.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 5.6.1 Research Questions . . . . . . . . . . . . . . . . . . . . . . . . . . 146 5.6.2 Practical Application of PALADIN . . . . . . . . . . . . . . . . . . 147 5.6.3 Completeness of PALADIN According to Evaluation Guidelines . . 148 5.6.4 Limitations in Automatic Logging of Interactions Parameters . . . 151 5.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 5.8 Parameters Used in PALADIN . . . . . . . . . . . . . . . . . . . . . . . . . 152 6 Modeling and Evaluating Mobile Quality of Experience 163 6.1 Introduction and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . 164 6.2 Context- and QoE-aware Interaction Analysis . . . . . . . . . . . . . . . . 166 6.2.1 Incorporating Context Information and User Ratings into Interaction Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166 6.2.2 Arranging the Parameters for the Analysis of Mobile Experiences . 168 6.2.3 Using CARIM for QoE Assessment . . . . . . . . . . . . . . . . . . 169 Context Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 6.3.1 Quantifying the Surrounding Context . . . . . . . . . . . . . . . . 170 6.3.2 Arranging Context Parameters into CARIM . . . . . . . . . . . . . 173 User Perceived Quality Parameters . . . . . . . . . . . . . . . . . . . . . . 173 6.4.1 Measuring the Attractiveness of Interaction . . . . . . . 
. . . . . . 173 6.4.2 Measuring Users Emotional State and Attitude toward Technology Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 6.5 Arranging User Parameters into CARIM . . . . . . . . . . . . . . . 177 CARIM Model Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 6.5.1 The Base Design: PALADIN . . . . . . . . . . . . . . . . . . . . . 177 6.5.2 The New Proposed Design: CARIM . . . . . . . . . . . . . . . . . 178 6.6 CARIM Model Implementation . . . . . . . . . . . . . . . . . . . . . . . . 181 6.7 Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183 6.7.1 Participants and Material . . . . . . . . . . . . . . . . . . . . . . . 183 6.7.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 6.7.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 6.9 Comparing the Two Interaction Designs for UMU Lander 185 Validating the User Behavior Hypotheses . . . . . . . . . 186 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 6.8.1 Modeling Mobile Interaction and QoE . . . . . . . . . . . . . . . . 188 6.8.3 CARIM Implementation and Experimental Validation . . . . . . . 190 CARIM Compared with Other Representative Approaches . . . . . 191 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 7 Conclusions and Further Work 7.2 Conclusions of this PhD Thesis . . . . . . . . . . . . . . . . . . . . . . . . 196 7.1.2 Driving Forces of this PhD Thesis . . . . . . . . . . . . . . . . . . 196 Work and Research in User-System Interaction Assessment . . . . 197 7.1.3 Goals Achieved in this PhD Thesis . . . . . . . . . . . . . . . . . . 200 Future Lines of Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 Bibliography 205 A List of Acronyms 231
... Na literatura existem vários trabalhos relacionados sobre as diferentes abordagens para otimizar os testes de GUI ( [5], [14], [12], [2]). No entanto, neste artigo apenas estamos focados nos vários estudos de comparação de ferramentas de captura e reprodução. ...
Conference Paper
Testing the graphical user interface (GUI) of a software product is important to ensure the quality of the system and therefore to improve the user satisfaction of using the software. Using tools to support the testing process solves the problems of the manual testing which is tedious and time consuming. Capture and replay tools are commonly used in GUI testing. In this paper we compare five open source capture and replay tools, namely Abbot, Jacareto, JFCUnit, Marathon and Pounder, in terms of ease of use and capture and replay capabilities. In order to compare the tools, we defined comparison characteristics and after evaluating each tool, we selected the one that showed the best results in almost all criteria. The results of our study may serve as guidance for any novice tester or company that pretends to automate the GUI testing process using capture and replay tools.
... SUTs such as commercially available software-Microsoft WordPad [72], physical hardware such as vending machines [73], mobile phones [74] and open source systems [25] have been classified as large-scale systems. SUTs such as a set of GUI windows [75], a set of web pages [60], small applications developed specifically for demonstration [76,77] have been classified as small-scale SUTs. Of the 118 articles which used one or more SUTs, 89 articles (75.42%) used a large-scale SUT. ...
Article
Software testing is an activity conducted to test the software under test. It has two approaches: manual testing and automation testing. Automation testing is an approach of software testing in which programming scripts are written to automate the process of testing. There are some software development projects under development phase for which automated testing is suitable to use and other requires manual testing. It depends on factors like project requirements nature, team which is working on the project, technology on which software is developing and intended audience that may influence the suitability of automated testing for certain software development project. In this paper we have developed machine learning model for prediction of automated testing adoption. We have used chi-square test for finding factors’ correlation and PART classifier for model development. Accuracy of our proposed model is 93.1624%.
Thesis
Full-text available
The object of this master thesis is to address the practical implementation of visual quality assurance framework for the layout of the software being tested and to improve the finding of style bugs. In this thesis, a custom visual GUI testing (VGT) framework and a test suite was developed for the target company. The general idea of the framework is to track visual changes and assure that unwanted visual differences are covered before changes end up in production. The main goal of the thesis work was to enable rewriting of the visual layout and decrease the technical debt. The pros and cons of visual regression testing were evaluated with open source tools and frameworks. The evaluation was done with practical implementation and piloting. The outcome of this thesis is a visual testing system and a framework. iii TIIVISTELMÄ LUT University School of Engineering Science Ohjelmistotuotannon laitos Master's Programme in Software Engineering and Digital Transformation Joonas Heinonen Automatisoidun visuaalisen regressiotestauksen suunnittelu ja implementointi suureen ohjelmistotuotannon tuotteeseen Diplomityö 2020 70 sivua, 17 kuvaa, 3 taulukkoa Työn tarkastajat: Professori Jari Porras Professori Ari Happonen Hakusanat: CSS, puppeteer, jest, nodejs, chromium, docker, testaus, testiautomaatio, visuaalinen testaus, regressiotestaus, visuaalinen regressiotestaus, laadunvarmistus, avoin lähdekoodi Tässä diplomityössä on tavoitteena avata visuaalisen laadunvarmistuksen käytännön toteutusta ja sen merkitystä. Tässä työssä kehitettiin räätälöity visuaaliseen testaamiseen tarkoitettu ohjelmistokehys yritykselle. Kehyksen avulla ohjelmiston visuaalista laatua ja tyylibugien löytämistä voidaan parantaa. Työn tarkoituksena oli mahdollistaa visuaalisen ulkoasun uudelleen kirjoittaminen ja teknisen velan vähentäminen koodikannasta. Visuaalisen regressiotestauksen hyötyjä ja haittoja tutkittiin vapaan lähdekoodin työkaluilla käytännönläheisellä toteutuksella. Työn lopputuloksena oli päivittäiseen kehitystyöhön integroitu visuaalisten aspektien testiautomaatioratkaisu ja testiautomaatiokehys. iv ACKNOWLEDGEMENTS I am eternally grateful for having such a supporting wife and a life partner to share my achievements and the hardest moments with.
Article
Testing is one of the most important phases in the development of any product or software, and various types of testing must be performed to meet the software's needs. Regression testing is a crucial phase in which a program is tested against the original test build together with its modifications. In this article, studies focusing on test case generation for web applications are analysed. A detailed study was conducted on regression test case generation and its approaches for web applications. The study found that very few approaches and methodologies provide a real tool for test case generation. There is a need for an automated regression testing tool that generates regression test cases directly from user requirements; if such test cases are generated and executed by the tool, the overall effort and cost can be reduced. The study also found that regression testing for web applications has not been investigated much, even though web applications are now an integral part of daily life and therefore need regression testing.
Conference Paper
The Test-Duo framework for generating and executing acceptance tests from use cases is presented. In Test-Duo, annotations are added to use cases to make system behaviors explicit. Robot Framework-compatible test cases are then generated and applied to test the system under a search regime. Tool support is discussed.
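Since this entry is closest to the present paper's idea of annotated use cases, a minimal sketch of the generation step may help. It assumes an invented step and annotation structure and emits Robot Framework-style text; the keyword names and annotation format are illustrative, not Test-Duo's actual notation.

```python
# Hedged sketch: use-case steps carry annotations, and Robot Framework-style
# test cases are generated from them. The structure below is invented.
USE_CASE = [
    {"step": "Open login dialog",  "keyword": "Click Button", "args": ["login"]},
    {"step": "Enter user name",    "keyword": "Input Text",   "args": ["user_field", "alice"]},
    {"step": "Submit credentials", "keyword": "Click Button", "args": ["ok"],
     "expect": ("Page Should Contain", "Welcome")},
]

def to_robot(name: str, steps) -> str:
    lines = ["*** Test Cases ***", name]
    for s in steps:
        lines.append(f"    {s['keyword']}    " + "    ".join(s["args"]))
        if "expect" in s:                       # annotation becomes a validation step
            kw, arg = s["expect"]
            lines.append(f"    {kw}    {arg}")
    return "\n".join(lines) + "\n"

print(to_robot("Successful Login", USE_CASE))
```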
Article
Full-text available
A technology for the broad generation of sanity tests for complex software, developed at the Institute for System Programming (Russian Academy of Sciences), is presented. This technology, called Azov, is based on a database containing structured information about the interface operations of the system under test, and on a procedure for enriching this information by refining the constraints imposed on parameter types and operation results. Results of a practical application of this technology demonstrate its high efficiency in generating sanity tests for systems with a large number of functions.
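The mechanism described (a database of interface operations enriched with refined parameter constraints that drives sanity-test generation) can be sketched as follows. The operation schema, sample values, and constraint representation are assumptions for illustration only, not Azov's actual data model.

```python
# Hedged sketch of a database-driven sanity-test generator: each interface
# operation is stored with its parameter types plus optional refined
# constraints, and one trivial call per operation is emitted.
OPERATIONS = {
    "set_volume": {"params": {"level": "int"}, "constraints": {"level": range(0, 101)}},
    "open_file":  {"params": {"path": "str"},  "constraints": {}},
}

SAMPLE_VALUES = {"int": 1, "str": "example.txt"}   # defaults per parameter type

def generate_sanity_tests(ops):
    tests = []
    for name, info in ops.items():
        args = {}
        for param, ptype in info["params"].items():
            allowed = info["constraints"].get(param)
            # Prefer a value from the refined constraint, else a type default.
            args[param] = next(iter(allowed)) if allowed else SAMPLE_VALUES[ptype]
        tests.append((name, args))
    return tests

for test in generate_sanity_tests(OPERATIONS):
    print(test)    # e.g. ('set_volume', {'level': 0})
```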
Article
Full-text available
This paper presents an approach to diminish the effort required in GUI modelling and test coverage analysis within a model-based GUI testing process. A familiar visual notation, a subset of UML with minor extensions, is used to model the structure, behaviour, and usage of GUIs at a high level of abstraction and to describe test adequacy criteria. The GUI visual model is translated automatically into a model-based formal specification language (e.g., Spec#), hiding formal details from the testers. Additional behaviour may then be added to the formal model to serve as a test oracle. The adequacy of the test cases generated automatically from the formal model is assessed based on the structural coverage of the UML behavioural diagrams.
Conference Paper
Full-text available
This paper describes ongoing research on test case generation based on the Unified Modeling Language (UML). The described approach builds on and combines existing techniques for data and graph coverage. It first uses the Category-Partition method to introduce data into the UML model. UML use cases and activity diagrams are used to describe, respectively, which functionalities should be tested and how to test them. This combination can create a very large number of test cases, so the approach offers two ways to manage the number of tests. First, custom annotations and guards use the Category-Partition data, which gives the designer tight control over possible, or impossible, paths. Second, automation allows different configurations for both the data and the graph coverage. The process of modeling UML activity diagrams, annotating them with test data requirements, and generating test scripts from the models is described. The goal of this paper is to illustrate the benefits of the model-based approach for improving automation in software testing. The approach is demonstrated and evaluated on use cases developed for testing a graphical user interface (GUI).
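The Category-Partition step mentioned above, introducing data choices and pruning combinations with guards, can be shown with a small sketch. The categories, choices, and guard predicate below are invented for a hypothetical login form.

```python
# Hedged sketch of the Category-Partition idea: each input is split into
# categories of choices, the cross product gives candidate test frames, and
# simple guard predicates prune impossible combinations.
from itertools import product

CATEGORIES = {
    "username": ["valid", "empty", "too_long"],
    "password": ["valid", "empty"],
    "remember_me": [True, False],
}

def guard(frame: dict) -> bool:
    # Example constraint: 'remember_me' only makes sense with a valid username.
    return not (frame["username"] != "valid" and frame["remember_me"])

names = list(CATEGORIES)
frames = [dict(zip(names, combo)) for combo in product(*CATEGORIES.values())]
test_frames = [f for f in frames if guard(f)]
print(len(frames), "candidate frames ->", len(test_frames), "after guards")
```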
Conference Paper
Although the World-Wide-Web (WWW) has significantly enhanced open-source software (OSS) development, it has also created new challenges for quality assurance (QA), especially for OSS with a graphical user interface (GUI) front-end. Distributed communities of developers, connected by the WWW, work concurrently on loosely coupled parts of the OSS and the corresponding GUI code. Due to the unprecedented code churn rates enabled by the WWW, developers may not have time to determine whether their recent modifications have caused integration problems with the overall OSS; these problems can often be detected via GUI integration testing. However, the resource-intensive nature of GUI testing prevents the application of existing automated QA techniques used during conventional OSS evolution. In this paper we develop new process support for three nested techniques that leverage developer communities interconnected by the WWW to automate model-based testing of evolving GUI-based OSS. The "innermost" technique (crash testing) operates on each code check-in of the GUI software and performs a quick and fully automatic integration test. The second technique (smoke testing) operates on each day's GUI build and performs functional "reference testing" of the newly integrated version of the GUI. The third (outermost) technique (comprehensive GUI testing) conducts detailed integration testing of a major GUI release. An empirical study involving four popular OSS shows that (1) the overall approach is useful for detecting severe faults in GUI-based OSS and (2) the nesting paradigm helps to target feedback and makes effective use of the WWW by implicitly distributing QA.
Article
Test designers widely believe that the overall effectiveness and cost of software testing depend largely on the type and number of test cases executed on the software. This article shows that the test oracle, a mechanism that determines whether the software executed correctly for a test case, also significantly impacts the fault detection effectiveness and cost of a test case. Graphical user interfaces (GUIs), which have become ubiquitous for interacting with today's software, have created new challenges for test oracle development. Test designers manually "assert" the expected values of specific properties of certain GUI widgets in each test case; during test execution, these assertions are used as test oracles to determine whether the GUI executed correctly. Since a test case for a GUI is a sequence of events, a test designer must decide: (1) what to assert; and (2) how frequently to check an assertion, for example, after each event in the test case or after the entire test case has completed execution. Variations of these two factors significantly impact the fault-detection ability and cost of a GUI test case. A technique to declaratively specify different types of automated GUI test oracles is described. Six instances of test oracles are developed and compared in an experiment on four software systems. The results show that test oracles do affect the fault detection ability of test cases in different and interesting ways: (1) test cases significantly lose their fault detection ability when using "weak" test oracles; (2) in many cases, invoking a "thorough" oracle at the end of test case execution yields the best cost-benefit ratio; (3) certain test cases detect faults only if the oracle is invoked during a small "window of opportunity" during test execution; and (4) using thorough and frequently executing test oracles can compensate for not having long test cases.
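The key design choice highlighted above (what to assert, and how often to invoke the oracle) can be illustrated with a toy event-driven test. Everything in the sketch below, the GUI state, the events, and the assertion, is invented for illustration; it only contrasts a per-event oracle with a single end-of-test oracle.

```python
# Hedged sketch contrasting two oracle-frequency choices: check assertions
# after every event versus once at the end of the test case.
def run_test(events, oracle, check_every_event: bool):
    state = {"text": "", "saved": False}
    for event in events:
        event(state)                      # simulate one GUI event
        if check_every_event:
            oracle(state)                 # frequent oracle invocation
    if not check_every_event:
        oracle(state)                     # single end-of-test oracle invocation

def type_hello(state): state["text"] += "hello"
def press_save(state): state["saved"] = True

def oracle(state):
    assert state["text"] == "hello", f"unexpected text: {state['text']!r}"

run_test([type_hello, press_save], oracle, check_every_event=True)
run_test([type_hello, press_save], oracle, check_every_event=False)
print("both oracle modes passed")
```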
Conference Paper
This paper presents a new automated model-driven technique to generate test cases by using feedback from the execution of a "seed test suite" on an application under test (AUT). The test cases in the seed suite are designed to be generated automatically and executed very quickly. During their execution, feedback obtained from the AUT's run-time state is used to generate new, "improved" test cases. The new test cases subsequently become part of the seed suite. This "anytime technique" continues iteratively, generating and executing additional test cases until resources are exhausted or testing goals have been met. The feedback-based technique is demonstrated for automated testing of graphical user interfaces (GUIs). An existing abstract model of the GUI is used to automatically generate the seed test suite. It is executed; during its execution, state changes in the GUI pinpoint important relationships between GUI events, which evolve the model and help to generate new test cases. Together with a reverse-engineering algorithm used to obtain the initial model and seed suite, the feedback-based technique yields a fully automatic, end-to-end GUI testing process. A feasibility study on four large fielded open-source software (OSS) applications demonstrates that this process is able to significantly improve existing techniques and help identify/report serious problems in the OSS. In response, these problems have been fixed by the developers of the OSS in subsequent versions.
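A rough sketch of the feedback loop described above is given below: run the suite, observe which GUI events the execution reveals, and extend the suite with the observed follow-up events. The event model and the notion of "enabled events" are invented stand-ins for the runtime-state feedback the paper uses.

```python
# Hedged sketch of feedback-driven test case generation over an assumed event model.
def execute(test, app_model):
    """Run a sequence of events and return the events enabled afterwards."""
    enabled = set()
    for event in test:
        enabled |= app_model.get(event, set())   # events this event enables
    return enabled

APP_MODEL = {"open_dialog": {"type_name", "press_ok"}, "type_name": {"press_ok"}}

suite = [["open_dialog"]]                        # automatically generated seed suite
for _ in range(2):                               # iterate until the budget is exhausted
    new_tests = []
    for test in suite:
        for follow_up in execute(test, APP_MODEL):
            candidate = test + [follow_up]
            if candidate not in suite and candidate not in new_tests:
                new_tests.append(candidate)
    suite.extend(new_tests)

print(len(suite), "test cases after feedback-driven expansion")
```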
Conference Paper
Graphical user interfaces (GUIs) are important parts of today's software, and their correct execution is required to ensure the correctness of the overall software. A popular technique to detect defects in GUIs is to test them by executing test cases and checking the execution results. Test cases may either be created manually or generated automatically from a model of the GUI. While manual testing is unacceptably slow for many applications, our experience with GUI testing has shown that creating a model that can be used for automated test case generation is difficult. We describe a new approach to reverse engineer a model, represented as structures called a GUI forest, event-flow graphs, and an integration tree, directly from the executable GUI. We describe "GUI Ripping", a dynamic process in which the software's GUI is automatically "traversed" by opening all its windows and extracting all their widgets (GUI objects), properties, and values. The extracted information is then verified by the test designer and used to automatically generate test cases. We present algorithms for the ripping process and describe their implementation in a tool suite that operates on Java and Microsoft Windows GUIs. We present results of case studies which show that our approach requires very little human intervention and is especially useful for regression testing of software that is modified frequently. We have successfully used the "GUI Ripper" in several large experiments and have made it available as a downloadable tool.
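The traversal at the heart of GUI Ripping (open every reachable window and record its widgets, properties, and values) can be sketched over a hand-written widget tree. The tree, the "opens" attribute, and the forest structure below are illustrative assumptions, not the actual GUI Ripper data structures.

```python
# Hedged sketch of a ripping-style traversal: visit each window, record its
# widgets into a forest-like structure, and follow widgets that open windows.
WIDGET_TREE = {
    "MainWindow": {"children": {
        "FileMenu": {"type": "menu", "children": {
            "Open...": {"type": "menu_item", "opens": "OpenDialog"}}}}},
    "OpenDialog": {"children": {
        "filename": {"type": "text_field", "value": ""}}},
}

def collect_opens(widgets):
    """Yield names of windows that any widget in this subtree can open."""
    for widget in widgets.values():
        if "opens" in widget:
            yield widget["opens"]
        yield from collect_opens(widget.get("children", {}))

def rip(window_name, tree, forest=None, visited=None):
    forest = {} if forest is None else forest
    visited = set() if visited is None else visited
    if window_name in visited:
        return forest
    visited.add(window_name)
    children = tree[window_name]["children"]
    forest[window_name] = children            # record widgets, properties, values
    for opened in collect_opens(children):    # follow widgets that open other windows
        rip(opened, tree, forest, visited)
    return forest

print(list(rip("MainWindow", WIDGET_TREE)))   # ['MainWindow', 'OpenDialog']
```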
Conference Paper
"Nightly/daily building and smoke testing" have become widespread since they often reveal bugs early in the software development process. During these builds, software is compiled, linked, and (re)tested with the goal of validating its basic functionality. Although successful for conventional software, smoke tests are difficult to develop and automatically rerun for software that has a graphical user interface (GUI). In this paper, we describe a framework called DART (daily automated regression tester) that addresses the needs of frequent and automated re-testing of GUI software. The key to our success is automation: DART automates everything from structural GUI analysis; test case generation; test oracle creation; to code instrumentation; test execution; coverage evaluation; regeneration of test cases; and their re-execution. Together with the operating system's task scheduler, DART can execute frequently with little input from the developer/tester to retest the GUI software. We provide results of experiments showing the time taken and memory required for GUI analysis, test case and test oracle generation, and test execution. We also empirically compare the relative costs of employing different levels of detail in the GUI test cases.
Article
Software is increasingly being developed/maintained by multiple, often geographically distributed developers working concurrently. Consequently, rapid-feedback-based quality assurance mechanisms such as daily builds and smoke regression tests, which help to detect and eliminate defects early during software development and maintenance, have become important. This paper addresses a major weakness of current smoke regression testing techniques, i.e., their inability to automatically (re)test graphical user interfaces (GUIs). Several contributions are made to the area of GUI smoke testing. First, the requirements for GUI smoke testing are identified and a GUI smoke test is formally defined as a specialized sequence of events. Second, a GUI smoke regression testing process called daily automated regression tester (DART) that automates GUI smoke testing is presented. Third, the interplay between several characteristics of GUI smoke test suites including their size, fault detection ability, and test oracles is empirically studied. The results show that: 1) the entire smoke testing process is feasible in terms of execution time, storage space, and manual effort, 2) smoke tests cannot cover certain parts of the application code, 3) having comprehensive test oracles may make up for not having long smoke test cases, and 4) using certain oracles can make up for not having large smoke test suites.