Conference Paper

Combinatorial Testing in an Industrial Environment--Analyzing the Applicability of a Tool

Authors:
  • Assystem Germany GmbH

Abstract

Numerous combinatorial testing tools are available for generating test cases. However, many of them are never used in practice. One of the reasons is the lack of empirical studies that involve human subjects applying testing techniques. This paper investigates the applicability of a combinatorial testing tool at the company SOFTEAM. A case study is designed and conducted within the development team responsible for a new product, with three practitioners from the company as participants. The applicability of the tool is examined in terms of efficiency, effectiveness, and learning effort.


... Similar to the other studies [1, 12-14, 17], we use the methodological framework from the existing literature [16], which is based on well-known general case study design guidelines, e.g., by Runeson and Höst [15]. The following sections describe the design of our study. ...
... We measured the learning process using the level-based strategy method [1], as well as the time required to learn TESTAR [17]. In this way, we can investigate learnability step by step at each level. ...
... Efficiency was measured by monitoring the time required for the test activities during the preparation, implementation, and evaluation phases. During the execution, all observations were also recorded in a working diary [1, 8, 12, 13, 17-19, 21-24]. ...
Chapter
TESTAR is a traversal-based and scriptless tool for test automation at the Graphical User Interface (GUI) level. It differs from existing test approaches in that no test cases need to be defined before testing. Instead, the tests are generated on the fly, during execution. This paper presents an empirical case study in a realistic industrial context in which we compare TESTAR to a manual test approach for a web-based application in the rail sector. Both qualitative and quantitative research methods are used to investigate learnability, effectiveness, efficiency, and satisfaction. The results show that TESTAR was able to detect more faults and achieve higher functional test coverage than the manual test approach used. As far as efficiency is concerned, the preparation time of both test approaches is identical, but TESTAR can carry out test execution without the use of human resources. Finally, TESTAR turns out to be a learnable test approach. As a result of the study described in this paper, TESTAR technology was successfully transferred, and the company will use both test approaches in a complementary way in the future.
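To make the on-the-fly idea above concrete, the following is a minimal, hypothetical sketch of scriptless GUI-level testing; the FakeGui adapter and its methods (get_state, get_actions, execute, has_crashed) are invented placeholders and not part of TESTAR's actual API.

    import random

    class FakeGui:
        """Stand-in for a real GUI adapter (hypothetical interface, not TESTAR's API)."""
        def __init__(self):
            self.screen = "main"
        def get_state(self):
            return self.screen
        def get_actions(self, state):
            return {"main": ["open_settings", "type_text"],
                    "settings": ["toggle_option", "back"]}.get(state, [])
        def execute(self, action):
            self.screen = {"open_settings": "settings", "back": "main"}.get(action, self.screen)
        def has_crashed(self):
            return False

    def scriptless_test(sut, max_actions=50):
        executed = []                                    # doubles as a reproducible test sequence
        for _ in range(max_actions):
            actions = sut.get_actions(sut.get_state())   # derive available actions on the fly
            if not actions:
                break
            action = random.choice(actions)              # random selection; could be model-based
            sut.execute(action)
            executed.append(action)
            if sut.has_crashed():                        # implicit oracle: crashes, freezes, errors
                return "FAIL", executed
        return "PASS", executed

    print(scriptless_test(FakeGui()))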
... Several frameworks, protocols and checklists have been proposed, some for primary studies such as [6, 7]; some in the software industry [6-9]; some in academic environments [1, 10]; and others mentioned in secondary studies. In all cases, the aim is to support the selection of testing techniques and tools based on the characterization of the software to be tested, as well as of the technique and/or test tool to be used. ...
Chapter
Software testing is a key factor in any software project; testing costs are significant in relation to development costs. It is therefore essential to select the most suitable testing techniques for a given project to find defects at the lowest possible cost across the different testing levels. However, in several projects, testing practitioners do not have a deep understanding of the full array of techniques available, and they adopt the same techniques that were used in prior projects, or any available technique, without considering the attributes of each testing technique. Current research aims to support the selection of software testing techniques; nevertheless, existing approaches are based on static catalogues, whose adaptation to any niche software application may be slow and expensive. In this work, we introduce a content-based recommender system that offers a ranking of software testing techniques based on a characterization of the target project and evaluations of testing techniques in similar projects. The repository of projects and techniques was populated through the collaborative effort of a community of practitioners. It has been found that the difference between the recommendations of SoTesTeR and the recommendations of a human expert is similar to the difference between the recommendations of two different human experts.
... We mention separately the indicators used in [3], since its evaluation factors differ from those of the other frameworks, at least at a high level: the ability to detect defects is determined from the percentage of defects detected regardless of their type; the cost of defect detection is determined from the defects detected in a given time relative to the effort (person-hours); and the attitude towards defect types is determined from the percentage of defects detected per type and the propensity to detect certain types of defects. Instantiations: multiple instantiations have been found, although, with the exception of the framework proposed in [17], which has been instantiated in industry on real projects mentioned in [18]:2011, [17]:2012, [1]:2014 and [2], [9]:2014, the rest have been instantiated in academic environments as indicated in [7]:2006 and [3]:2012. ...
Article
Full-text available
Software testing is a vital activity and a determining factor for the success of a given project, because testing lets the stakeholders know whether the product meets their expectations and requirements. In addition, the cost of testing is very significant in relation to the development cost, so it is important to select the most suitable testing techniques and tools for a given project in order to find defects at the lowest possible cost. However, in many projects the practitioners do not know the available techniques, and they adopt the ones used in other projects, or whichever are at hand, without knowing the attributes of the testing techniques. In this article we present a state of the art on the evaluation and selection of testing techniques. It contains a comparison between approaches for conducting primary studies that lead to knowledge bases or repositories with information about suitability, efficiency and efficacy, and it also presents the existing methods and approaches for the informed selection of testing techniques for a given project based on repository information. • Software and its engineering ➝ Software verification and validation ➝ Software defect analysis ➝ Software testing and debugging • Information systems ➝ Information retrieval ➝ Retrieval tasks and goals ➝ Recommender systems.
Article
Full-text available
Studies have shown that combinatorial testing (CT) can be effective for detecting faults in software systems. By focusing on the interactions between different factors of a system, CT shows its potential for detecting faults, especially those that can be revealed only by the specific combinations of values of multiple factors (multi-factor faults). However, is CT practical enough to be applied in the industry? Can it be more effective than other industry-favored techniques? Are there any challenges when applying CT in practice? These research questions remain in the context of industrial settings. In this paper, we present an empirical study of CT on five industrial systems with real faults. The details of the input space model (ISM) construction, such as factor identification and value assignment, are included. We compared the faults detected by CT with those detected by the in-house testing teams using other methods, and the results suggest that despite some challenges, CT is an effective technique to detect real faults, especially multi-factor faults, of software systems in industrial settings. Observations and lessons learned are provided to further improve the fault detection effectiveness and overcome various challenges.
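The input space modelling step described above can be illustrated with a small, invented example; the factors, values, and constraint below are hypothetical and are not taken from the five industrial systems in the study.

    # Hypothetical input space model (ISM): identify factors, assign representative
    # values, and record constraints between them.
    from itertools import product

    factors = {
        "payment":  ["credit card", "voucher", "invoice"],
        "customer": ["new", "returning"],
        "shipping": ["standard", "express"],
    }
    # Constraint: vouchers are only available to returning customers.
    valid = [dict(zip(factors, row)) for row in product(*factors.values())
             if not (row[0] == "voucher" and row[1] == "new")]
    print(len(valid))  # 10 of the 12 exhaustive combinations remain valid;
                       # a CT tool would then sample these, e.g. pairwise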
Chapter
Full-text available
The cloud will be populated by Future Internet (FI) software, comprising advanced, dynamic and largely autonomic interactions among services, end-user applications, content and media. The complexity of the technologies involved in the cloud makes testing extremely challenging and demands novel approaches and major advancements in the field. This chapter describes the main challenges associated with the test of FI applications running in the cloud. We present a research agenda that was defined in order to address the FI testing challenges. The goal of the agenda is investigating the technologies for the development of a FI automated testing environment, which can monitor the FI applications under test and can react dynamically to the observed changes. Realization of this environment involves substantial research in areas such as search based testing, model inference, oracle learning and anomaly detection.
Conference Paper
Full-text available
Since our society is becoming increasingly dependent on applications emerging on the Future Internet, quality of these applications becomes a matter that cannot be neglected. However, the complexity of the technologies involved in Future Internet applications makes testing extremely challenging. The EU FP7 FITTEST project has addressed some of these challenges by developing and evaluating a Continuous and Integrated Testing Environment that monitors a Future Internet application when it runs such that it can automatically adapt the testware to the dynamically changing behaviour of the application.
Article
Full-text available
Combinatorial testing can detect hard-to-find software faults more efficiently than manual test case selection methods. While the most basic form of combinatorial testing, pairwise testing, is well established and adoption by software testing practitioners continues to increase, industry usage of these methods remains patchy at best. However, the additional training required is well worth the effort.
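As a rough illustration of what a pairwise generator does, here is a minimal greedy sketch; the parameter names and values are invented, and real tools (e.g. ACTS, PICT, AETG, CTE) use considerably more sophisticated and scalable algorithms.

    # Greedy pairwise test suite generation: repeatedly pick the candidate test
    # that covers the most still-uncovered value pairs.
    from itertools import combinations, product

    def pairwise_suite(params):
        names = list(params)
        uncovered = {((a, va), (b, vb))
                     for a, b in combinations(names, 2)
                     for va in params[a] for vb in params[b]}
        candidates = [dict(zip(names, row)) for row in product(*params.values())]
        suite = []
        while uncovered:
            best = max(candidates, key=lambda t: sum(
                ((a, t[a]), (b, t[b])) in uncovered
                for a, b in combinations(names, 2)))
            suite.append(best)
            uncovered -= {((a, best[a]), (b, best[b]))
                          for a, b in combinations(names, 2)}
        return suite

    tests = pairwise_suite({"os": ["win", "linux"],
                            "db": ["pg", "mysql", "sqlite"],
                            "tls": ["on", "off"]})
    print(len(tests))  # typically 6-7 tests instead of 12 exhaustive ones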
Conference Paper
Full-text available
Software testing is expensive for the industry, and always constrained by time and effort. Although there is a multitude of test techniques, there are currently no scientifically based guidelines for the selection of appropriate techniques for different domains and contexts. For large complex systems, some techniques are more efficient in finding failures than others, and some are easier to apply than others. From an industrial perspective, it is important to find the most effective and efficient test design technique that can be automated and applied. In this paper, we propose an experimental framework for comparison of test techniques with respect to efficiency, effectiveness and applicability. We also plan to evaluate ease of automation, which has not been addressed by previous studies. We highlight some of the problems of evaluating or comparing test techniques in an objective manner. We describe our planned process for this multi-phase experimental study. This includes presentation of some of the important measurements to be collected, with the dual goals of analyzing the properties of the test technique as well as validating our experimental framework.
Article
Full-text available
This position paper aims at discussing a number of issues that typically arise when performing empirical studies with software testing techniques. Though some problems are general to all empirical disciplines, software testing studies face a number of specific challenges. Some of the main ones are discussed in sequence below.
Article
Full-text available
Case study is a suitable research methodology for software engineering research since it studies contemporary phenomena in its natural context. However, the understanding of what constitutes a case study varies, and hence the quality of the resulting studies. This paper aims at providing an introduction to case study methodology and guidelines for researchers conducting case studies and readers studying reports of such studies. The content is based on the authors' own experience from conducting and reading case studies. The terminology and guidelines are compiled from different methodology handbooks in other research domains, in particular social science and information systems, and adapted to the needs in software engineering. We present recommended practices for software engineering case studies as well as empirically derived and evaluated checklists for researchers and readers of case study research. Keywords: Case study, Research methodology, Checklists, Guidelines
Article
Full-text available
Techniques for detecting defects in source code are fundamental to the success of any software development approach. A software development organization therefore needs to understand the utility of techniques such as reading or testing in its own environment. Controlled experiments have proven to be an effective means for evaluating software engineering techniques and gaining the necessary understanding about their utility. This paper presents a characterization scheme for controlled experiments that evaluate defect-detection techniques. The characterization scheme permits the comparison of results from similar experiments and establishes a context for cross-experiment analysis of those results. The characterization scheme is used to structure a detailed survey of four experiments that compared reading and testing techniques for detecting defects in source code. We encourage educators, researchers, and practitioners to use the scheme.
Conference Paper
Testing a web application is typically very complicated. Imposing simple coverage criteria such as function or line coverage is often not sufficient to uncover bugs caused by incorrect component integration. Combinatorial testing can enforce a stronger criterion, while still giving us the ability to prioritize the test cases to keep the overall effort feasible. Doing so requires the whole testing domain to be classified and formalized, e.g. in terms of classification trees. At the system testing level, these trees are quite large. This short paper presents our preliminary work to automatically construct classification trees by logging the system and to subsequently calculate the coverage of our test runs against various combinatorial criteria. We use the tool CTE, which allows such criteria to be custom-specified, e.g. to take semantic constraints of the system into account. Furthermore, it comes with a graphical interface to let users specify new test sequences as needed.
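The coverage calculation mentioned above can be sketched as follows; the classifications and the logged runs are invented for illustration and do not reflect the CTE tool or the studied system.

    # Fraction of all value pairs from the classification that are exercised by
    # the classified (logged) test runs, i.e. the achieved pairwise coverage.
    from itertools import combinations

    def pairwise_coverage(classifications, runs):
        names = list(classifications)
        required = {((a, va), (b, vb))
                    for a, b in combinations(names, 2)
                    for va in classifications[a] for vb in classifications[b]}
        covered = {((a, run[a]), (b, run[b]))
                   for run in runs for a, b in combinations(names, 2)}
        return len(required & covered) / len(required)

    classes = {"payment": ["card", "invoice"],
               "user":    ["guest", "registered"],
               "cart":    ["empty", "filled"]}
    logged_runs = [  # classified observations recovered from logs (invented data)
        {"payment": "card", "user": "guest", "cart": "filled"},
        {"payment": "invoice", "user": "registered", "cart": "empty"},
    ]
    print(f"{pairwise_coverage(classes, logged_runs):.0%}")  # 50%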
Conference Paper
The EU-funded FITTEST FP7 project aims to address the Future Internet (FI) testing challenges. FITTEST will be integrated in three pilot applications provided by three industrial partners: IBM, Sulake and Softeam. This paper presents the Modelio SaaS product and the case study context selected by Softeam as a FITTEST project industrial application, together with the usage of the Object Management Group (OMG) UML Testing Profile module. In the paper, the researchers present the advanced software engineering methods proposed by FITTEST and the usage of the OMG UML Testing Profile (UTP) in a real industrial environment, within Softeam's Modelio SaaS.
Conference Paper
The classification tree method allows for automated test suite generation for given test objects. The important step of test sequence generation, however, can only be done manually. In this paper, we present an approach to use classification trees for test sequence generation. We build on existing approaches, e.g. from research on state machines. First results from our work are presented in this paper, including new dependency and generation rules.
Conference Paper
This paper aims at evaluating a set of automated tools of the FITTEST EU project within an industrial case study. The case study was conducted at the IBM Research lab in Haifa, by a team responsible for building the testing environment for future development versions of an IBM system management product. The main function of that product is resource management in a networked environment. This case study has investigated whether current IBM Research testing practices could be improved or complemented by using some of the automated testing tools that were developed within the FITTEST EU project. Although the existing Test Suite from IBM Research (TSibm) that was selected for comparison is substantially smaller than the Test Suite generated by FITTEST (TSfittest), the effectiveness of TSfittest, measured by injected-fault coverage, is significantly higher (50% for TSibm vs. 70% for TSfittest). With respect to efficiency, by normalizing the execution times, we found that TSfittest runs faster (9.18 vs. 6.99). This is due to the fact that TSfittest includes shorter tests. Within IBM Research, and for the testing of the target product in the simulated environment, the FITTEST tools can increase the effectiveness of the current practice, and the test cases automatically generated by the FITTEST tools can help in more efficient identification of the source of the identified faults. Moreover, the FITTEST tools have shown the ability to automate testing within a real industry case.
Chapter
Case studies for evaluating tools in software engineering are powerful. Although they cannot achieve the scientific rigour of formal experiments, the results can provide sufficient information to help companies judge whether a specific technology being evaluated will benefit their organization. This paper reports on a case study done for evaluating a combinatorial testing tool in a realistic industrial environment with real objects and subjects. The case study has been executed at Sulake, a company that develops social entertainment games and whose main product is Habbo Hotel, a social network community in the shape of an online hotel that is visited by millions of teenagers every week all around the world. This paper describes the experimental design of the case study together with the results and the decisions that Sulake has taken about the further study, adoption and implantation of this type of tool.
Article
The basic problem in software testing is selecting a set of test cases. This article presents a test case design using the Classification Tree Editor, a test case generation software tool. It shows how to integrate weighting factors into classification trees and generate prioritized test suites. The classification tree editor includes classical approaches such as minimal combination and pair-wise as well as weighted counterparts, statistical testing and new generating rules. The software allows for prioritization by occurrence probability, error probability or risk.
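An illustrative sketch of the weighting and prioritization idea described above (this is not the CTE implementation; the classifications and probabilities are invented): each class carries an occurrence probability, a test case combines one class per classification, and its weight is the product of the chosen class probabilities. Error probability or risk could be plugged in the same way.

    # Prioritize generated test cases by occurrence probability (product of weights).
    from itertools import product
    from math import prod

    weights = {
        "connection": {"wifi": 0.7, "cellular": 0.3},
        "account":    {"free": 0.8, "premium": 0.2},
        "action":     {"browse": 0.6, "purchase": 0.4},
    }

    test_cases = [dict(zip(weights, combo))
                  for combo in product(*[w.keys() for w in weights.values()])]
    prioritized = sorted(test_cases,
                         key=lambda tc: prod(weights[c][tc[c]] for c in tc),
                         reverse=True)
    for tc in prioritized[:3]:
        print(tc, round(prod(weights[c][tc[c]] for c in tc), 3))
    # top entry: {'connection': 'wifi', 'account': 'free', 'action': 'browse'} 0.336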
Article
Evolutionary structural testing has been researched and promising results have been presented. However, it has hardly been applied to real-world complex systems, and as such, little is known about its scalability, applicability and acceptability in an industrial setting. The European project EvoTest (IST-33472) team has been working from 2006 till 2009 to improve this situation, and this paper reports on the results. We start with an overview of the tools and techniques we have developed for automated evolutionary structural testing. Subsequently, we describe the empirical setup used to study the applicability of evolutionary structural testing in industry through two case studies. The test objects used for the studies are selected functions (handwritten and generated) from production systems at Daimler and Berner & Mattner Systemtechnik (BMS), such as a Rear Window Defroster, a Global Powertrain Engine Controller and a Window Lift Control System. The results of the case studies are described, and the research questions are assessed based on the obtained results. In summary, the results indicate that evolutionary structural testing in an industrial setting is worthwhile and profitable. Hardly any detailed knowledge of evolutionary computation is required to search for interesting test data. The case studies also examine the benefits of using techniques like automated parameter tuning and search space smoothing.
Conference Paper
[Context] Numerous combinatorial testing techniques are available for generating test cases. However, many of them are never used in practice. [Objective] Considering that learnability plays a vital role in the initial adoption or rejection of a technology, in this paper we aim to investigate the learnability of a combinatorial testing tool in an industrial environment. [Method] A case study was designed and conducted, including i) the definition of learnability measures for test case models built using a combinatorial testing tool, ii) a training program, and iii) a qualitative and quantitative evaluation based on a three-level strategy (Reaction, Learning, and Performance). [Results] At the first level, the tool was perceived as easy to learn by the trainees (on a five-point ordinal scale). However, at the second level, during hands-on learning, this changed slightly: according to the working diaries, there were major difficulties. At the third level, analyzing the learning curve of each trainee, we observe that the semantic errors made by each subject decreased slightly over time.
Article
Combinatorial test design and combinatorial interaction testing are well-studied topics. For the generation of dynamic test sequences from a formal specification of combinatorial problems, however, there has not been much work yet. The classification tree method implements aspects from the field of combinatorial testing. This paper extends the classification tree with additional information to allow the interpretation of the classification tree as a hierarchical concurrent state machine. Using this state machine, our new approach then uses a Multi-agent System to generate test sequences by finding and rating valid paths through the state machine.
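The underlying idea of deriving test sequences as valid paths through a state machine can be illustrated with a minimal sketch; the states and transitions below are invented, and the multi-agent path rating of the approach is not reproduced here.

    # Enumerate bounded test sequences as paths through an (invented) state machine.
    transitions = {
        "logged_out":   ["login_ok", "login_failed"],
        "login_failed": ["logged_out"],
        "login_ok":     ["browsing"],
        "browsing":     ["checkout", "logged_out"],
        "checkout":     ["logged_out"],
    }

    def sequences(state, max_len, path=()):
        path = path + (state,)
        if len(path) == max_len or state not in transitions:
            yield path
            return
        for nxt in transitions[state]:
            yield from sequences(nxt, max_len, path)

    for seq in sequences("logged_out", max_len=4):
        print(" -> ".join(seq))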
Article
During the past years, evolutionary testing research has reported encouraging results for automated functional (i.e. black-box) testing. However, despite promising results, these techniques have hardly been applied to complex, real-world systems, and as such, little is known about their scalability, applicability, and acceptability in industry. In this paper, we describe the empirical setup used to study the use of evolutionary functional testing in industry through two case studies, drawn from serial production development environments at Daimler and Berner & Mattner Systemtechnik, respectively. Results of the case studies are presented, and research questions are assessed based on them. In summary, the results indicate that evolutionary functional testing in an industrial setting is both scalable and applicable. However, the creation of fitness functions is time-consuming. Although in some cases this is compensated for by the results, it is still a significant factor preventing evolutionary functional testing from more widespread use in industry.
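To illustrate what such a fitness function looks like, here is a minimal, generic sketch of search-based functional testing; the toy controller, the requirement and the search loop are all invented and are not the EvoTest tooling or any Daimler/BMS system.

    # Evolve inputs towards violating an (invented) functional requirement;
    # a fitness of 0 means a violating input was found.
    import random

    def sut(speed, distance):
        # toy controller with a seeded defect: it brakes too late
        return "brake" if distance / max(speed, 1) < 0.5 else "cruise"

    def fitness(ind):
        speed, distance = ind
        # requirement (invented): the controller must brake whenever distance < speed
        if distance < speed and sut(speed, distance) != "brake":
            return 0.0                      # requirement violated: search goal reached
        return abs(distance - speed) + 1.0  # otherwise guide the search to the boundary

    population = [(random.uniform(0, 250), random.uniform(0, 500)) for _ in range(20)]
    for generation in range(100):            # simple elitist evolutionary loop
        population.sort(key=fitness)
        if fitness(population[0]) == 0.0:
            print("violating input found:", population[0])
            break
        parents = population[:10]
        children = [(s + random.gauss(0, 10), d + random.gauss(0, 10)) for s, d in parents]
        population = parents + children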
Article
Combinatorial Testing (CT) can detect failures triggered by interactions of parameters in the Software Under Test (SUT) with a covering array test suite generated by some sampling mechanisms. It has been an active field of research in the last twenty years. This article aims to review previous work on CT, highlights the evolution of CT, and identifies important issues, methods, and applications of CT, with the goal of supporting and directing future practice and research in this area. First, we present the basic concepts and notations of CT. Second, we classify the research on CT into the following categories: modeling for CT, test suite generation, constraints, failure diagnosis, prioritization, metric, evaluation, testing procedure and the application of CT. For each of the categories, we survey the motivation, key issues, solutions, and the current state of research. Then, we review the contribution from different research groups, and present the growing trend of CT research. Finally, we recommend directions for future CT research, including: (1) modeling for CT, (2) improving the existing test suite generation algorithm, (3) improving analysis of testing result, (4) exploring the application of CT to different levels of testing and additional types of systems, (5) conducting more empirical studies to fully understand limitations and strengths of CT, and (6) combining CT with other testing techniques.
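For reference, the covering-array notation that underlies the test suites discussed in this survey can be stated as follows (this is the standard definition from the CT literature, not a construction specific to this article):

    A covering array $CA(N; t, k, v)$ is an $N \times k$ array over a $v$-valued
    alphabet such that every $N \times t$ sub-array contains each of the $v^{t}$
    possible $t$-tuples at least once. A strength-$t$ CT test suite for $k$
    parameters is the set of $N$ rows of such an array; $t = 2$ (pairwise) is the
    most common choice in practice.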
Article
The most important prerequisite for a thorough software test is the determination of relevant test cases. The classification-tree method suggested in this paper supports the systematic determination and description of test cases. It is based on the idea of partition testing. By means of the classification-tree method, the input domain of a test object is regarded under various aspects assessed as relevant by the tester. For each aspect, disjoint and complete classifications are formed. Classes resulting from these classifications may be further classified—even recursively. Test cases are formed by combining classes of different classifications. The stepwise partition of the input domain by means of classifications is represented graphically in the form of a tree. This tree is subsequently used to form a combination table in which the test cases are marked. Extensions of the graphical notation and tool support allow the use of the method even for extensive test problems.
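A toy example may help picture the method: each aspect (classification) of the input domain is partitioned into disjoint, complete classes, and a test case combines one class from each classification. The aspects and classes below are invented, and a real application would typically apply combination rules (e.g. pairwise) rather than printing the full table.

    # Build the combination table of a small, invented classification tree.
    from itertools import product

    tree = {
        "file size":   ["empty", "small", "large"],
        "file type":   ["text", "binary"],
        "permissions": ["readable", "read-only", "no access"],
    }

    for row in product(*tree.values()):       # one row per test case
        print(dict(zip(tree, row)))           # 3 * 2 * 3 = 18 combinations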
Conference Paper
In the foreseeable future, software testing will remain one of the best tools we have at our disposal to ensure software dependability. Empirical studies are crucial to software testing research in order to compare and improve software testing techniques and practices. In fact, there is no other way to assess the cost-effectiveness of testing techniques, since all of them are, to various extents, based on heuristics and simplifying assumptions. However, when empirically studying the cost and fault-detection rates of a testing technique, a number of validity issues arise. Further, there are many ways in which empirical studies can be performed, ranging from simulations to controlled experiments with human subjects. What are the strengths and drawbacks of the various approaches? What is the best option under which circumstances? This paper presents a critical analysis of empirical research in software testing and will attempt to highlight and clarify the issues above in a structured and practical manner.
Article
The number of ways a system must be tested can often be overwhelming. There are a number of automatic testcase generation tools available, but these can suffer from combinatorial explosions in the number of possibilities to test. We have combined table-based testing and code coverage with Bellcore's Automatic Efficient Testcase Generator (AETG) to generate small, efficient sets of testcases. AETG uses a pair-wise testcase generation technique to generate tables of test vectors, which a test driver can then execute immediately. Code coverage is then used to indicate missing functionality from AETG's model. Our initial trial of this was on a subset of Nortel's internal e-mail system, where we were able to cover 97% of branches with fewer than 100 valid and invalid testcases, as opposed to 27 trillion exhaustive testcases. Since then we have used it to implement low-level white-box testing of individual procedures up to high-level black-box testing of interactions between different call processing feature...
A framework for comparing efficiency, effectiveness and applicability of software testing techniques
  • S Eldh
  • H Hansson
  • S Punnekkat
  • A Pettersson
  • D Sundmark
S. Eldh, H. Hansson, S. Punnekkat, A. Pettersson, and D. Sundmark, "A framework for comparing efficiency, effectiveness and applicability of software testing techniques," in Proceedings of the Testing: Academic and Industrial Conference - Practice And Research Techniques (TAIC PART 2006). IEEE, 2006, pp. 159-170.
Evaluating applicability of combinatorial testing in an industrial environment: a case study
  • E Puoskari
  • T E Vos
  • N Condori-Fernandez
  • P M Kruse
E. Puoskari, T. E. Vos, N. Condori-Fernandez, and P. M. Kruse, "Evaluating applicability of combinatorial testing in an industrial environment: a case study," in Proceedings of the 2013 International Workshop on Joining AcadeMiA and Industry Contributions to testing Automation. ACM, 2013, pp. 7-12.
Evaluating the fittest automated testing tools: an industrial case study
  • C D Nguyen
  • B Mendelson
  • D Citron
  • O Shehory
  • T E Vos
  • N Condori-Fernandez
C. D. Nguyen, B. Mendelson, D. Citron, O. Shehory, T. E. Vos, and N. Condori-Fernandez, "Evaluating the fittest automated testing tools: an industrial case study," in 2013 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. IEEE, 2013, pp. 332-339.
Combinatorial testing tool learnability in an industrial environment
  • P M Kruse
  • N Condori-Fernández
  • T E Vos
  • A Bagnato
  • E Brosse
P. M. Kruse, N. Condori-Fernández, T. E. Vos, A. Bagnato, and E. Brosse, "Combinatorial testing tool learnability in an industrial environment," in 2013 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. IEEE, 2013, pp. 304-312.