Testing of Future Internet Applications
Running in the Cloud
Tanja Vos
Universidad Politécnica de Valencia, Spain
Paolo Tonella
Fondazione Bruno Kessler, Trento, Italy
Joachim Wegener
Berner & Mattner
Mark Harman
University College London
Wishnu Prasetya
Utrecht University
Yarden Nir-Buchbinder
IBM Israel
Shmuel Ur
Bristol University
ABSTRACT The cloud will be populated by Future Internet (FI) software, comprising advanced,
dynamic and largely autonomic interactions among services, end-user applications, content and media.
The complexity of the technologies involved in the cloud makes testing extremely challenging and
demands novel approaches and major advancements in the field.
This chapter describes the main challenges associated with testing FI applications running in the cloud.
We present a research agenda defined to address these challenges. The goal of the agenda is to
investigate the technologies needed to develop an automated FI testing environment,
which can monitor the FI applications under test and can react dynamically to the observed changes.
Realization of this environment involves substantial research in areas such as search based testing, model
inference, oracle learning and anomaly detection.
INTRODUCTION
The FI will be a complex interconnection of services, applications, content and media running in the
cloud. It will offer a rich user experience, extending and improving current hyperlink-based navigation.
Key technologies contributing to the development of FI services and applications include a rich, complex,
dynamic and stateful client. This client interacts asynchronously with the server, where applications are
organized as services and run in the cloud, taking advantage of dynamic service discovery, replacement
and composition. Adaptivity and autonomy improve the user experience, by dynamically changing both
the client and the server side, through capabilities such as self-configuration and self-healing. As a
consequence, FI applications will exhibit emergent behavior which makes them hard to predict.
Our society will become increasingly dependent on services built on top of this complex and emerging
Future Internet. Critical activities such as public utilities, social services, government, learning, finance,
and business, but also entertainment, will depend on the underlying software and services. As a consequence,
the applications running on top of the Future Internet will have to meet high quality and dependability
demands. Not only is functional quality important; non-functional aspects like performance,
security, and privacy will become increasingly important as well. All this makes verification and validation
for quality assurance of FI applications extremely important. In this chapter, we discuss how to address
the FI testing challenges, by describing the features of an integrated environment for continuous
evolutionary automated testing, which can monitor the FI application under test and adapt to the dynamic
changes observed. FI testing will require continuous post-release testing since the application under test
does not remain fixed after its release. Services and components could be dynamically added by
customers and the intended use could change significantly. Therefore, testing is done continuously after
deployment to the customer, either in vitro or in vivo. The testing environment we describe integrates,
adapts and automates techniques for continuous FI testing (e.g. dynamic model inference, log-based
diagnosis, oracle learning, classification trees and combinatorial testing, concurrent testing, regression
testing). To make it possible for the above mentioned techniques to deal with the huge search space
associated with FI testing, evolutionary search based testing will be used. Search-based algorithms will
guide the solution of the identified problems so as to optimize properly defined objective functions. In
this way, we can address the ultimate challenge of FI applications: testing unexpected behavior that may
originate from the dynamism, autonomy and self-adaptation involved.
BACKGROUND
FI testing demands major advancements in several areas of software testing. We discuss the state of
the art in each of these areas separately, in the following.
Beyond the state of the art of search based techniques
Search based techniques have attracted wide interest (Harman, 2007), and testing is the most prominent
software engineering domain in which they are applied. Search based testing has been applied to various
real world complex systems (e.g., embedded systems) (Vos et al., 2010; Baars et al., 2010) to automate
test case generation
for structural (white-box) as well as functional (black-box) testing. While these testing targets remain
relevant for FI applications as well, the continuous, autonomous testing framework that we envision
introduces new opportunities for search based exploration of the solution space. Correspondingly, novel
fitness function definitions and search algorithms will be required.
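To make the search based approach concrete, here is a minimal, self-contained sketch (in Python, with an invented function under test): a small genetic algorithm evolves numeric inputs under a branch-distance style fitness function until a hard-to-reach branch is covered. The population size, mutation rate and target condition are arbitrary illustrative choices, not a prescription.

```python
import random

def program_under_test(x, y):
    # Hypothetical branch that is hard to hit by pure random testing.
    if x * 2 == y + 100:
        return "target branch"
    return "other branch"

def fitness(individual):
    # Branch-distance style fitness: how far is the condition
    # (x * 2 == y + 100) from being true? Lower is better.
    x, y = individual
    return abs(x * 2 - (y + 100))

def evolve(pop_size=50, generations=200):
    population = [(random.randint(-1000, 1000), random.randint(-1000, 1000))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        if fitness(population[0]) == 0:
            return population[0]           # branch covered
        parents = population[:pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            (x1, y1), (x2, y2) = random.sample(parents, 2)
            child = (x1, y2)               # one-point crossover
            if random.random() < 0.3:      # small mutation
                child = (child[0] + random.randint(-10, 10),
                         child[1] + random.randint(-10, 10))
            children.append(child)
        population = parents + children
    return min(population, key=fitness)

best = evolve()
print(best, "->", program_under_test(*best))
```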
Innovative approaches to genetic programming applied to testing may also contribute to FI testing. So far,
genetic programming has received limited attention in testing. It has been successfully used to conduct
unit testing of object oriented code, by providing a simple and effective mechanism to bring the object
under test to a proper internal state (Tonella, 2004). We think genetic programming can be pushed beyond
such simple applications, by considering it as a powerful technique to co-evolve the testing engine
together with the self-modifying, adaptive FI application under test.
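As a hint of what genetic programming for testing looks like, the following sketch evolves method-call sequences rather than plain input vectors, rewarding sequences that drive a hypothetical Counter object into a rarely reached internal state. This is a simplified illustration in the spirit of Tonella (2004), not a reproduction of that technique.

```python
import random

class Counter:
    # Hypothetical object under test whose "locked" state is only
    # reachable through a specific call sequence.
    def __init__(self):
        self.value = 0
        self.locked = False
    def inc(self):
        if not self.locked:
            self.value += 1
    def lock(self):
        if self.value >= 5:
            self.locked = True

ACTIONS = ["inc", "lock"]

def run(sequence):
    obj = Counter()
    for action in sequence:
        getattr(obj, action)()
    return obj

def fitness(sequence):
    # Reward sequences that reach the deep (locked) state; prefer short ones.
    return (10 if run(sequence).locked else 0) - len(sequence) * 0.1

def evolve(pop=30, gens=100):
    population = [[random.choice(ACTIONS) for _ in range(random.randint(1, 10))]
                  for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        survivors = population[:pop // 2]
        offspring = []
        while len(offspring) < pop - len(survivors):
            parent = random.choice(survivors)[:]
            op = random.random()
            if op < 0.4:                      # mutate one call
                parent[random.randrange(len(parent))] = random.choice(ACTIONS)
            elif op < 0.7:                    # grow the program
                parent.append(random.choice(ACTIONS))
            elif len(parent) > 1:             # shrink the program
                parent.pop(random.randrange(len(parent)))
            offspring.append(parent)
        population = survivors + offspring
    return max(population, key=fitness)

print(evolve())  # e.g. ['inc', 'inc', 'inc', 'inc', 'inc', 'lock']
```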
Beyond the state of the art of Web testing
The vast, existing literature (Ricca & Tonella, 2001; Elbaum et al., 2005; Sampath et al., 2007) on Web
testing is focused on client-server applications which implement a strictly serialized model of interaction,
based on form submission–server response sequences. Testing of Ajax and rich client Web applications
has been considered only recently (Mesbah & van Deursen, 2009; Marchetto et al., 2008). To address FI
testing, such results should be extended in the direction of increasing the level of automation and of
supporting continuous, unattended test generation and execution. These extensions require research in the
area of automated model and invariant inference, anomaly detection and input data generation. If such
techniques involve monitoring or logging production runs, the overhead they introduce may be
significant, so the issue of performance ultimately needs to be addressed as well.
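As one plausible building block for unattended test execution on rich clients, the sketch below drives a browser through an Ajax application, repeatedly firing click events and recording an abstraction of each DOM state. Selenium is used only as an example driver; the URL, the element selection policy and the state abstraction are illustrative assumptions.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

def abstract_state(driver):
    # Hypothetical state abstraction: the set of visible link/button texts.
    elems = driver.find_elements(By.CSS_SELECTOR, "a, button")
    return frozenset(e.text for e in elems if e.is_displayed())

def explore(url, max_steps=50):
    driver = webdriver.Chrome()
    driver.get(url)
    seen = {abstract_state(driver)}
    transitions = []
    for _ in range(max_steps):
        clickables = [e for e in driver.find_elements(By.CSS_SELECTOR, "a, button")
                      if e.is_displayed()]
        if not clickables:
            break
        source = abstract_state(driver)
        target_elem = clickables[0]   # a real tool would prefer unexplored events
        label = target_elem.text
        target_elem.click()
        state = abstract_state(driver)
        transitions.append((source, label, state))
        seen.add(state)
    driver.quit()
    return seen, transitions

# states, model = explore("http://example.com/ajax-app")  # hypothetical URL
```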
Beyond the state of the art of model inference
FI testing requires major advancements of the state of the art for model inference. Existing techniques
(Lorenzoli et al., 2008; Dallmeier et al., 2006) rely either on algorithms for regular language learning or
on predefined, hardcoded abstraction functions. These techniques produce a state model which is hardly
interpretable by humans. In fact, while event sequences are meaningful, states do not necessarily
correspond to an internal state of the FI application. Since we want to provide meaningful feedback to
testers, this approach is not attractive. The second option, predefined abstraction functions, is hardly
viable either in the dynamic, adaptive context of the FI: no abstraction defined in advance can be
adequate for its dynamic nature.
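For illustration, the following sketch infers a finite state model from logged event sequences using a deliberately simple abstraction, where the state is the last k events (in the spirit of k-tails). The traces are invented, and the abstraction is a stand-in for the richer, human-interpretable abstractions this research direction calls for.

```python
from collections import defaultdict

def infer_model(traces, k=1):
    # State = tuple of the last k events observed (k-tails style abstraction).
    transitions = defaultdict(set)
    for trace in traces:
        state = ("<start>",)
        for event in trace:
            next_state = (state + (event,))[-k:]
            transitions[state].add((event, next_state))
            state = next_state
    return transitions

# Hypothetical logged sessions of an FI application client.
traces = [
    ["login", "search", "add_to_cart", "checkout"],
    ["login", "search", "search", "logout"],
    ["login", "add_to_cart", "checkout"],
]

model = infer_model(traces, k=1)
for state, outgoing in model.items():
    for event, target in sorted(outgoing):
        print(f"{state} --{event}--> {target}")
```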
Beyond the state of the art of log based diagnosis and oracle learning
There has been substantial research in the use of logs for testing (Andrews & Zhang, 2000; Rozinat & van
der Aalst, 2007). This research focuses on observing errors from logs. FI testing needs to extend this with
the capability to infer oracles, likely oracles, and atypical executions from logs. State of the art tools for
log based diagnosis (Lorenzoli et al., 2008; Hangal & Lam, 2002) rely on invariant inference algorithms
such as the one implemented in Daikon (Ernst et al., 2007). These algorithms should be extended to
support richer temporal logics in which the inferred invariants are expressed. Temporal invariants are
more appropriate than Daikon's static invariants for finite state models, since the correctness of their
behavior depends strictly on the sequence of states traversed over time. On one hand, this demands richer
information in the logs. On the other hand, logs may in practice be acquired by unauthorized third
parties, so the issue of privacy must also be addressed.
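A minimal sketch of mining one family of temporal invariants from logs, the response pattern "every A is eventually followed by B". The traces are invented, and a real tool would check many more temporal templates and handle noisy or incomplete logs.

```python
from itertools import permutations

def holds_eventually_followed(trace, a, b):
    # True if every occurrence of a is eventually followed by b in the trace.
    expecting_b = False
    for event in trace:
        if event == b:
            expecting_b = False
        if event == a:
            expecting_b = True
    return not expecting_b

def mine_response_invariants(traces):
    events = {e for t in traces for e in t}
    return {(a, b) for a, b in permutations(events, 2)
            if all(holds_eventually_followed(t, a, b) for t in traces)}

# Hypothetical log of an FI service: lock acquisitions should be released.
traces = [
    ["acquire", "write", "release", "acquire", "release"],
    ["acquire", "read", "release"],
]
for a, b in sorted(mine_response_invariants(traces)):
    print(f"likely invariant: '{a}' is always eventually followed by '{b}'")
```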
Beyond the state of the art of coverage testing
FI applications are (ultra-)large systems. For such systems, current coverage criteria are not meaningful
and require substantial extensions (Adler et al., 2009). One difficulty is that while the application specific
code is expected to be fully utilized, not all the capabilities of the off-the-shelf components are used, or
can be used, in the application. Consequently, a coverage measure of, for example, 50% does not make
clear whether we have performed a really good test that covered all the relevant functionality, or whether
a lot of relevant functionality is still left to be tested. For coverage to be useful we need a
methodology for severely limiting the false reporting, and for filtering out the code which is not relevant
to the solution.
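The filtering idea can be sketched as follows: coverage is recomputed only over code deemed relevant to the application, so that the largely unused capabilities of off-the-shelf components do not deflate the measure. The per-file coverage data and the relevance predicate are hypothetical.

```python
# Hypothetical per-file coverage data: file -> (covered lines, total lines).
raw_coverage = {
    "app/checkout.py":        (180, 200),
    "app/search.py":          (90, 120),
    "vendor/widgets/full.py": (40, 4000),   # off-the-shelf, mostly unused
    "vendor/http/client.py":  (55, 900),
}

def is_relevant(path):
    # Hypothetical relevance predicate: only application-specific code counts.
    return path.startswith("app/")

def coverage_ratio(data, keep=lambda p: True):
    covered = sum(c for p, (c, t) in data.items() if keep(p))
    total = sum(t for p, (c, t) in data.items() if keep(p))
    return covered / total

print(f"naive coverage:    {coverage_ratio(raw_coverage):.1%}")    # about 7%
print(f"filtered coverage: {coverage_ratio(raw_coverage, is_relevant):.1%}")
```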
Beyond the state of the art of concurrency testing
State of the art techniques for concurrency testing are focused on single applications that exhibit some
degree of internal parallelism and distribution (Křena et al., 2009). We need solutions for the FI context,
where applications are not self-contained and act more as service composers and integrators than as
simple functionality providers. Such integration is characterized by a class of concurrency which is harder
to treat and control than in traditional concurrency testing. Approaches based on explicit, direct control
of timings and scheduling require substantial enhancements: since the integration itself comprises
multiple concurrently executing parts, such approaches must be augmented with capabilities for altering
the ongoing communications, as well as the distribution of the different components over multiple
machines. Novel mechanisms for debugging and novel record-replay functionalities are also needed to
support testers trying to identify the causes of bugs. Artificial load creation is another way to exercise an
FI integrator application in a way that increases the chances of revealing faults.
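A toy sketch of the noise-injection idea used by concurrency testing tools: random delays are inserted around a racy shared update to perturb thread interleavings, with the random seed recorded so that a failing run can be retried under the same noise distribution. The racy counter stands in for the service interactions discussed above.

```python
import random
import threading
import time

SEED = 12345                  # recorded so a failing run can be retried
rng = random.Random(SEED)

def noise():
    # Randomly sleep a little to perturb the thread interleaving.
    if rng.random() < 0.5:
        time.sleep(rng.random() * 0.001)

counter = 0

def increment(n):
    global counter
    for _ in range(n):
        noise()
        tmp = counter          # racy read-modify-write
        noise()
        counter = tmp + 1

threads = [threading.Thread(target=increment, args=(100,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()

# With noise, the lost-update race is far more likely to surface.
print("expected 400, got", counter)
```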
Beyond the state of the art of combinatorial testing
Combinatorial testing using classification trees is a test case generation technique widely used in industry
(Kruse & Luniak, 2010). For FI technologies, the number of classes in a classification tree could become
very large, resulting in a tremendous number of possible test cases. To address FI testing, more research is
needed to increase the level of automation and support for continuous, dynamic generation of
classification trees and fault-sensitive test case generation. These extensions require research in the area
of new combinatorial techniques for test case generation, e.g. the inclusion of statistical information like
operational profiles of internet applications to generate a representative set of test cases, the application of
evolutionary search techniques to search for the optimal test suites, and the combination of classification
trees with oracle learning to automate expected values prediction.
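To illustrate why combinatorial techniques help, the sketch below greedily builds a pairwise-covering suite over a small, invented classification of an internet application's configuration: exhaustive testing would need 3 x 3 x 2 x 2 = 36 cases, while pairwise coverage needs roughly ten.

```python
from itertools import combinations, product

# Hypothetical classification tree leaves for an FI application.
classes = {
    "browser": ["chrome", "firefox", "mobile"],
    "locale":  ["en", "de", "zh"],
    "login":   ["guest", "registered"],
    "network": ["fast", "slow"],
}

def uncovered_pairs(suite, keys):
    pairs = set()
    for k1, k2 in combinations(keys, 2):
        for v1 in classes[k1]:
            for v2 in classes[k2]:
                pairs.add(((k1, v1), (k2, v2)))
    for case in suite:
        for k1, k2 in combinations(keys, 2):
            pairs.discard(((k1, case[k1]), (k2, case[k2])))
    return pairs

def greedy_pairwise():
    keys = list(classes)
    suite = []
    while uncovered_pairs(suite, keys):
        remaining = uncovered_pairs(suite, keys)
        # Pick the candidate test case covering the most uncovered pairs.
        best = max((dict(zip(keys, values)) for values in product(*classes.values())),
                   key=lambda c: sum(1 for (k1, v1), (k2, v2) in remaining
                                     if c[k1] == v1 and c[k2] == v2))
        suite.append(best)
    return suite

suite = greedy_pairwise()
print(f"{len(suite)} test cases cover all pairs (vs 36 exhaustive)")
for case in suite:
    print(case)
```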
Beyond the state of the art of empirical evaluation
In order to assess the testing techniques developed, evaluative research must involve realistic FI systems
and realistic subjects. It should be done with thoroughness to ensure that any benefits identified during the
evaluation study are clearly derived from the testing technique studied. It should be done in such a way
that different studies can be compared. This type of research is time-consuming, expensive and difficult.
However, such studies are fundamental, since claims based on analytical advocacy alone are insupportable (Fenton et
al., 1994). What is needed (Hesari et al., 2010) is a general methodology evaluation framework that will
simplify the evaluation procedure and make the results more accurate and reliable.
CHALLENGES
FI applications will be characterized by an extremely high level of dynamism. Most decisions, normally
made at design time, are deferred to execution time, when the application can take advantage of
monitoring (self-observation, as well as data collection from the environment and logging of the
interactions) to adapt itself to a changed usage context. The realization of this vision involves a number of
technologies, including: observational reflection and monitoring; dynamic discovery and composition of
services; hot component loading and update; structural reflection, to support self-adaptation and
modification; asynchronous communication; high configurability and context awareness; composability
into large-scale systems of systems.
While offering major improvements over the currently available Web experience, such features pose
several challenges to testing, summarized in Table 1.
CH1 (Self modification): Rich clients have increased capability to dynamically adapt the structure of the
Web pages; server-side services are replaced and recomposed dynamically based on Service Level
Agreements (SLA), taking advantage of services newly discovered in the cloud; components are
dynamically loaded.

CH2 (Autonomic behavior): FI applications are highly autonomous; their correct behavior cannot be
specified precisely at design-time.

CH3 (Low observability): FI applications are composed of an increasing number of 3rd-party components
and services running in the cloud, accessed as black boxes, which are hard to test.

CH4 (Asynchronous interactions): FI applications are highly asynchronous and hence hard to test. Each
client submits multiple requests asynchronously; multiple clients run in parallel; server-side computation
is distributed over the cloud and concurrent.

CH5 (Time and load dependent behavior): For FI applications, factors like timing and load conditions
make it hard to reproduce errors during debugging.

CH6 (Huge feature configuration space): FI applications are highly customizable and self-configuring,
and contain a huge number of configurable features, such as user-, context-, and environment-dependent
parameters.

CH7 (Ultra-large scale): FI applications are often systems of systems running in the cloud; traditional
testing adequacy criteria cannot be applied, since even in good testing situations low coverage will be
achieved.

Table 1. Main testing challenges for FI applications
RESEARCH AGENDA
The testing challenges enumerated in Table 1 can be addressed by developing tools and techniques for
continuous evolutionary automated testing, which can monitor the FI application and adapt themselves to
the dynamic changes observed. FI testing will be continuous post-release testing since the application
under test does not remain fixed after its release. Services and components could be dynamically added
by customers and the intended use could change. Therefore, testing has to be performed continuously
after deployment to the customer.
The underlying technology we devise for FI testing, which will enable the above mentioned techniques to
cope with FI testing challenges like dynamism, self-adaptation and partial observability, will be based on
evolutionary search based testing. The impossibility of anticipating all possible behaviors of FI
applications suggests a prominent role for evolutionary testing techniques, because these rely on very few
assumptions about the underlying problem they attempt to solve. In addition,
stochastic optimization and search techniques are adaptive and, therefore, able to modify their behavior
when faced with new unforeseen situations. These two properties – their freedom from limiting
assumptions and their inherent adaptiveness – make evolutionary testing approaches ideal for handling FI
applications testing, with their dynamic, self-adapting, autonomous and unpredictable behavior.
To achieve this overall aim, a number of research objectives that directly map to the identified challenges
should be investigated in the future. A summary of these research objectives is given in Table 2. This book
chapter describes the approaches and techniques that can be followed to conduct research and achieve
practical results in each of these areas.
OBJ1 (Evolutionary, search based testing approach): To cope with the dynamism, self-adaptation and
partial observability that characterize FI applications, search based software testing will be used.
Evolutionary algorithms themselves exhibit dynamic and adaptive behavior and, as such, are ideally
suited to the nature of the problem. Moreover, evolutionary algorithms have proved very effective on
hard optimization problems and provide a robust framework.

OBJ2 (Continuous, automated testing approach): Since the range of behaviors is not known in advance,
testing will be done continuously. Feedback from post-release executions will be used to co-evolve the
test cases for the self-adaptive FI application. Humans alone cannot achieve the desired levels of
dependability, so automation is required.

OBJ3 (Dynamic model inference): Self-adapting applications with low observability demand dynamic
analysis; models will be inferred continuously rather than being fixed upfront.

OBJ4 (Model based test case derivation): Behavioral models inferred from monitored executions will be
the basis for automated test case generation. Paths in the model associated with semantic interactions
will be regarded as interesting execution sequences. Test case generation will proceed fully unattended,
including the generation of input data and the verification of feasibility for the test adequacy criteria of
choice.

OBJ5 (Log-based diagnosis and oracle learning): Since correct behaviour cannot be fixed upfront,
executions will be analysed to identify atypical ones, indicating likely faults or potential vulnerabilities.

OBJ6 (Dynamic classification tree generation): The huge configuration space will be dealt with by
testing combinatorially, using dynamically and continuously generated classification trees.

OBJ7 (Test for concurrency bugs): A mechanism to control and record factors like communication noise,
delays, message timings, load conditions, etc., in a concurrent, cloud centric setup will need to be
developed.

OBJ8 (Testing the unexpected): Due to the high dynamism, it is impossible to define the expected
interactions upfront. Genetic programming can be used to simulate unpredicted, odd, or even malicious
interactions.

OBJ9 (Coverage and regression testing): Novel coverage and regression testing criteria and analytical
methods will be defined for ultra-large scale FI applications running in the cloud, for which the standard
criteria and analysis techniques are not applicable since they simply do not scale.

OBJ10 (General evaluation framework for FI testing): Large scale case studies will be performed using
realistic systems and software testing practitioners. The studies will be executed using an instantiation
and/or refinement of the general evaluation framework to fit specific software testing techniques and
tools and evaluation situations.

Table 2. Research objectives in FI testing
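To make OBJ4 concrete, the sketch below derives candidate test sequences by enumerating bounded-length paths through a state model of the kind built in the earlier model-inference sketch; the model format and the length bound are illustrative assumptions.

```python
def derive_test_paths(transitions, start=("<start>",), max_len=4):
    # Enumerate event sequences (candidate test cases) of bounded length
    # by walking the inferred state model depth-first.
    paths = []
    def walk(state, path):
        if path:
            paths.append(path)
        if len(path) == max_len:
            return
        for event, target in sorted(transitions.get(state, ())):
            walk(target, path + [event])
    walk(start, [])
    return paths

# Reusing the hypothetical model inferred from logs in the earlier sketch:
# for path in derive_test_paths(model):
#     print(" -> ".join(path))
```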
CONCLUSION
The challenges involved in testing FI applications running in the cloud can be addressed by resorting to a
combination of advanced testing technologies (i.e. dynamic model inference, model-based test case
derivation, combinatorial testing, concurrent testing, regression testing, etc.) adapted to ensure a level of
automation that enables testing in a continuous mode. As a consequence, testing time can be extended
without additional demands on the human resources involved, substantially improving the affordability
of the proposed FI testing techniques. Moreover, extended testing time results in improved
reliability and helps to cope with adaptive and dynamic changes of the FI software, which can be
observed only when the software is operated for long periods of time. The observed behaviors of FI
applications are automatically classified as normal or anomalous, so as to automatically and
autonomously trigger additional testing when needed.
The oracle component of the FI testing environment is key to ensure high quality and reliability of the
software running in the cloud. Hence, methods that support automated learning of oracles from the
observations have a big potential to reduce the cost of testing in the cloud scenario.
The flexibility of the search based approach to testing with respect to the testing goal makes it a suitable
unifying technology to address the FI testing challenges. The fitness function that guides test case
generation can be adapted to the features of FI applications and can track their continuous evolution and
autonomous modification over time. Genetic programming is an appealing option to allow the creation of
test programs that co-evolve with the FI application under test.
REFERENCES
Adler, Y., Farchi, E., Klausner, M., Pelleg, D., Raz, O., Shochat, M., Ur, S., & Zlotnick, A. (2009).
Advanced code coverage analysis using substring holes. In Proceedings of the International Symposium
on Software Testing and Analysis (ISSTA '09) (pp. 37-46). ACM.
Andrews, J. H., & Zhang, Y. (2000). Broad-spectrum studies of log file analysis. In Proceedings of the
International Conference on Software Engineering (pp. 105-114).
Baars, A., Vos, T., & Dimitrov, D. (2010). Using Evolutionary Testing to Find Test Scenarios for Hard to
Reproduce Faults. In Proceedings of the Third International Conference on Software Testing,
Verification, and Validation Workshops (pp. 173-181).
Dallmeier, V., Lindig, C., Wasylkowski, A., & Zeller, A. (2006). Mining Object Behavior with ADABU.
In Proceedings of the ICSE Workshop on Dynamic Analysis.
Edelstein, O., Farchi, E., Goldin, E., Nir, Y., Ratsaby, G., & Ur, S. (2003). Framework for testing multi-
threaded Java programs. Concurrency and Computation: Practice & Experience, 15(3-5), 485-499.
Elbaum, S. G., Rothermel, G., Karre, S., & Fisher II, M. (2005). Leveraging User-Session Data to
Support Web Application Testing. IEEE Transactions on Software Engineering, 31(3), 187-202.
Ernst, M., Perkins, J., Guo, P., McCamant, S., Pacheco, C., Tschantz, M. & Xiao, C. (2007). The Daikon
system for dynamic detection of likely invariants. Science of Computer Programing, 69 (1-3), 35-45.
Fenton, N., Pfleeger, S. L., & Glass, R. L. (1994). Science and Substance: A Challenge to Software
Engineers. IEEE Software, 11(4), 86-95.
Hangal S., & Lam, M. S. (2002). Tracking down software bugs using automatic anomaly detection. In
Proceedings of the International Conference on Software Engineering.
Harman, M. (2007). The Current State and Future of Search Based Software Engineering. In Future of
Software Engineering (FOSE '07) (pp. 342-357).
Hesari, S., Mashayekhi, H., & Ramsin, R. (2010). Towards a General Framework for Evaluating
Software Development Methodologies. In Proceedings of COMPSAC 2010 (pp. 208-217).
Křena, B., Letko, Z., Nir-Buchbinder, Y., Tzoref-Brill, R., Ur, S., & Vojnar, T. (2009). A concurrency
testing tool and its plug-ins for dynamic analysis and runtime healing. In Runtime Verification, Lecture
Notes in Computer Science, vol. 5779 (pp. 101-114).
Kruse, P., & Luniak, M. (2010). Automated Test Case Generation Using Classification Trees. In
Proceedings of STAREAST (Software Testing Analysis & Review).
Lorenzoli, D., Mariani, L., & Pezzè, M. (2008). Automatic generation of software behavioral models. In
Proceedings of the International Conference on Software Testing (pp. 501-510).
Marchetto, A., Tonella, P., & Ricca F. (2008). State-Based Testing of Ajax Web Applications. In
Proceedings of the International Conference on Software Testing (pp. 121-130).
Mesbah, A., & van Deursen, A. (2009). Invariant-based automatic testing of AJAX user interfaces. In
Proceedings of the International Conference on Software Engineering (pp. 210-220).
Rozinat A., & van der Aalst, W. M. P. (2007). Conformance checking of processes based on monitoring
real behavior. Information Systems, 33(1), 64-95.
Ricca, F., & Tonella, P. (2001). Analysis and Testing of Web Applications. In Proceedings of the
International Conference on Software Engineering (pp. 25-34).
Sampath, S., Sprenkle, S., Gibson, E., Pollock, L. L., & Greenwald, A. S. (2007). Applying Concept
Analysis to User-Session-Based Testing of Web Applications. IEEE Transactions on Software
Engineering, 33(10), 643-658.
Tonella, P. (2004). Evolutionary Testing of Classes. In Proceedings of the International Symposium on
Software Testing and Analysis (pp. 119-128).
Vos, T., Baars, A., Lindlar, F., Kruse, P., Windisch, A., & Wegener, J., (2010). Industrial Scaled
Automated Structural Testing with the Evolutionary Testing Tool. In Proceedings of the 2010 Third
International Conference on Software Testing, Verification and Validation (ICST '10). IEEE Computer
Society, Washington, DC, USA, 175-184.