Conference Paper

To Mock or Not to Mock? An Empirical Study on Mocking Practices


Abstract

When writing automated unit tests, developers often deal with software artifacts that have several dependencies. In these cases, one has the possibility of either instantiating the dependencies or using mock objects to simulate the dependencies' expected behavior. Even though recent quantitative studies showed that mock objects are widely used in OSS projects, scientific knowledge is still lacking on how and why practitioners use mocks. Such knowledge is fundamental to guide further research on this widespread practice and to inform the design of tools and processes to improve it. The objective of this paper is to increase our understanding of which test dependencies developers (do not) mock and why, as well as what challenges developers face with this practice. To this end, we create MockExtractor, a tool to mine the usage of mock objects in testing code, and employ it to collect data from three OSS projects and one industrial system. Sampling from this data, we manually analyze how more than 2,000 test dependencies are treated. Subsequently, we discuss our findings with developers from these systems, identifying practices, rationales, and challenges. These results are supported by a structured survey with more than 100 professionals. The study reveals that the usage of mocks is highly dependent on the responsibility and the architectural concern of the class. Developers report frequently mocking dependencies that make testing difficult and preferring not to mock classes that encapsulate domain concepts/rules of the system. Among the key challenges, developers report that keeping the behavior of the mock compatible with the behavior of the original class is hard and that mocking increases the coupling between the test and the production code.
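
To make the trade-off concrete, the following minimal Java/JUnit sketch (all class names invented, not taken from the studied systems) contrasts instantiating a dependency with replacing it by a Mockito mock.

```java
// Hypothetical example: OrderService depends on PaymentGateway. A test can either
// instantiate a real gateway (exercising its logic and any infrastructure it touches)
// or substitute a Mockito mock with stubbed behavior.
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.jupiter.api.Test;

class OrderServiceTest {

    interface PaymentGateway {
        boolean charge(String item, int quantity);
    }

    static class OrderService {
        private final PaymentGateway gateway;
        OrderService(PaymentGateway gateway) { this.gateway = gateway; }
        boolean placeOrder(String item, int quantity) { return gateway.charge(item, quantity); }
    }

    @Test
    void placeOrder_withMockedDependency() {
        // Mocking keeps the test fast, isolated, and deterministic, at the cost of having
        // to keep the stubbed behavior in sync with the real gateway (a challenge the
        // abstract mentions). The alternative would be instantiating a real gateway,
        // which could hit a remote payment service.
        PaymentGateway gateway = mock(PaymentGateway.class);
        when(gateway.charge("book", 1)).thenReturn(true);

        assertTrue(new OrderService(gateway).placeOrder("book", 1));
    }
}
```
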


... Despite their advantages, using mocks within tests requires considerable engineering effort. Developers must first identify the components that may be replaced with mocks [3]. They must also determine how these mocks behave when triggered with a certain input, i.e., how they are stubbed [4], [5]. ...
... Mocking allows individual functionalities to be tested independently. The process of unit testing with mocks is faster and more focused [3]. Since the test intention is to verify the behavior of one individual unit, mocking can facilitate fault localization. ...
... It is not possible to generate test cases with mocks for all methods with nested method calls. For example, a static method invoked within another method is typically not mocked [3]. Also, it is not feasible to replace an object created within the body of a method, and subsequently mock the interactions made with it. ...
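
As a hedged illustration of these limitations (all names invented), the sketch below shows a class where only the injected collaborator can be replaced by a plain Mockito mock; the static call and the object created inside the method body cannot.

```java
// Hedged sketch: the injected Repository can be mocked and stubbed in a test, whereas
// the statically invoked method and the object created inside the method body cannot be
// replaced by a standard Mockito mock without extra tooling.
class ReportService {

    interface Repository {           // injected -> can be mocked in a test
        String loadStats();
    }

    static class HttpFetcher {       // created inside the method -> not mockable here
        String fetch(String path) {
            return "remote:" + path;
        }
    }

    private final Repository repository;

    ReportService(Repository repository) {
        this.repository = repository;
    }

    String buildReport() {
        HttpFetcher fetcher = new HttpFetcher();              // internal instantiation
        String today = java.time.LocalDate.now().toString(); // static call, typically not mocked
        return today + ": " + repository.loadStats() + " / " + fetcher.fetch("/stats");
    }
}
```
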
Article
Full-text available
Mocking allows testing program units in isolation. A developer who writes tests with mocks faces two challenges: design realistic interactions between a unit and its environment; and understand the expected impact of these interactions on the behavior of the unit. In this paper, we propose to monitor an application in production to generate tests that mimic realistic execution scenarios through mocks. Our approach operates in three phases. First, we instrument a set of target methods for which we want to generate tests, as well as the methods that they invoke, which we refer to as mockable method calls. Second, in production, we collect data about the context in which target methods are invoked, as well as the parameters and the returned value for each mockable method call. Third, offline, we analyze the production data to generate test cases with realistic inputs and mock interactions. The approach is automated and implemented in an open-source tool called RICK. We evaluate our approach with three real-world, open-source Java applications. RICK monitors the invocation of 128 methods in production across the three applications and captures their behavior. Based on this captured data, RICK generates test cases that include realistic initial states and test inputs, as well as mocks and stubs. All the generated test cases are executable, and 52.4% of them successfully mimic the complete execution context of the target methods observed in production. The mock-based oracles are also effective at detecting regressions within the target methods, complementing each other in their fault-finding ability. We interview 5 developers from the industry who confirm the relevance of using production observations to design mocks and stubs. Our experimental findings clearly demonstrate the feasibility and added value of generating mocks from production interactions.
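
The following is a hedged sketch, with invented names rather than RICK's actual output, of the kind of test described above: the stub replays a value observed in production, and a mock-based oracle verifies the interaction.

```java
// Hedged sketch (invented names): the mock recreates an interaction observed in
// production, and a mock-based oracle checks that the target method triggered it.
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import org.junit.jupiter.api.Test;

class PriceCalculatorTest {

    interface TaxService {                       // target of the "mockable method call"
        double rateFor(String countryCode);
    }

    static class PriceCalculator {               // class owning the target method
        private final TaxService taxService;
        PriceCalculator(TaxService taxService) { this.taxService = taxService; }
        double computeTotal(double net, String countryCode) {
            return net * (1.0 + taxService.rateFor(countryCode));
        }
    }

    @Test
    void computeTotal_mimicsProductionInteraction() {
        TaxService taxService = mock(TaxService.class);
        when(taxService.rateFor("SE")).thenReturn(0.25);   // value captured in production

        double total = new PriceCalculator(taxService).computeTotal(100.0, "SE");

        assertEquals(125.0, total, 0.001);                 // assertion on the returned value
        verify(taxService).rateFor("SE");                  // mock-based oracle on the interaction
    }
}
```
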
... In order to overcome these challenges, practitioners devised a mechanism called mocking, which replaces test dependencies of the FUT by creating mock objects (Spadini et al. 2017, 2019). That is, developers create a faked object and control its behavior to mimic the behavior of a dependency for the testing purpose. ...
... Note that despite the advantages of mocking frameworks, not all developers fully support their use. For instance, Spadini's study (Spadini et al. 2017, 2019) points out that using mocks can bring several challenges such as maintaining the compatibility of the mock's behavior with the original class, the relationship between the amount of mocking required for a test class and its code quality, and the (unfavorable) excessive use of mock objects to test legacy systems. In communities like StackOverflow (Why is it so bad to mock classes? ...
... Studies that are closest to ours include (Mostafa and Wang 2014; Spadini et al. 2017, 2019), which all presented empirical studies regarding how mocking is practiced. Of particular note, our study is most similar to Mostafa and Wang (2014) in that both investigated how mocking frameworks are used in open-source communities. ...
Article
Full-text available
Mocking frameworks provide convenient APIs, which create mock objects, manipulate their behavior, and verify their execution, for the purpose of isolating test dependencies in unit testing. This study contributes an in-depth empirical study of whether and how mocking frameworks are used in Apache projects. The key findings and insights of this study include: First, mocking frameworks are widely used in 66% of Apache Java projects, with Mockito, EasyMock, and PowerMock being the top three most popular frameworks. Larger-scale and more recent projects tend to observe a stronger need to use mocking frameworks. This underscores the importance of mocking in practice and related future research. Second, mocking is overall practiced quite selectively in software projects—not all test files use mocking, nor are all dependencies of a test target mocked. It calls for more future research to gain a more systematic understanding of when and what to mock to provide formal guidance to practitioners. On top of this, the intensity of mocking in different projects shows different trends in the projects' evolution history—implying the compound effects of various factors, such as the pace of a project's growth, the available resources, time pressure, and priority, etc. This points to an important future research direction in facilitating best mocking practices in software evolution. Furthermore, we revealed the most frequently used APIs in the three most popular frameworks, organized based on the function types. The top five APIs in each functional type of the three mocking frameworks usually take the majority (78% to 100%) of usage in Apache projects. This indicates that developers can focus on these APIs to quickly learn the common usage of these mocking frameworks. We further investigated informal methods of mocking, which do not rely on any mocking framework. These informal mocking methods point to potential sub-optimal mocking practices that could be improved, as well as limitations of existing mocking frameworks. Finally, we conducted a developer survey to collect additional insights regarding the above analysis based on their experience, which complements our analysis based on repository mining. Overall, this study offers practitioners profound empirical knowledge of how mocking frameworks are used in practice and sheds light on future research directions for enhancing mocking in practice.
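
As a hedged illustration of the three functional API types mentioned above (mock creation, behavior manipulation, and execution verification), the sketch below shows the same workflow in Mockito and EasyMock; the Repository interface is invented.

```java
// Hedged comparison of the creation / behavior-manipulation / verification API types
// in Mockito and EasyMock. Repository and findName are invented for illustration.
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import org.easymock.EasyMock;

class MockingApiStyles {

    interface Repository {
        String findName(int id);
    }

    void mockitoStyle() {
        Repository repo = mock(Repository.class);           // creation
        when(repo.findName(1)).thenReturn("alice");          // behavior manipulation
        repo.findName(1);
        verify(repo).findName(1);                            // verification
    }

    void easyMockStyle() {
        Repository repo = EasyMock.createMock(Repository.class);   // creation
        EasyMock.expect(repo.findName(1)).andReturn("alice");      // behavior manipulation
        EasyMock.replay(repo);                                      // switch to replay mode
        repo.findName(1);
        EasyMock.verify(repo);                                      // verification
    }
}
```
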
... The primary goal of this research is to explore the practices of employing mock objects in software testing. Unlike other studies [2,7,13,14,16,20] that focused solely on a specific programming language or were limited in scope, this research aims to encompass several languages without restricting the domain of the analyzed projects. ...
... This information complements the findings of [16], which reported that 77.2% of test classes feature up to three simulated dependencies. ...
... As observed in other studies [16], most of the observed tests made use of mock objects. However, this practice may require adjustments when the test becomes tightly coupled to the implementation, as it can lead to a flaky test. ...
... Despite their advantages, using mocks within tests requires considerable engineering effort. While designing the tests for one component, developers must first decide which interactions may be replaced with mocks [3]. They must also determine how these mocks behave when triggered with a certain input, i.e., how they are stubbed [4]. ...
... Mocking allows individual functionalities to be tested independently. The process of unit testing with mocks is faster and more focused [3]. Since the test intention is to verify the behaviour of one individual unit, mocking can facilitate fault localization. ...
... It is not possible to generate test cases with mocks for all methods with nested method calls. For example, a static method invoked within another method is typically not mocked [3]. Also, it is not feasible to replace an object created within the body of a method, and subsequently mock the interactions made with it. ...
Preprint
Mocking in the context of automated software tests allows testing program units in isolation. Designing realistic interactions between a unit and its environment, and understanding the expected impact of these interactions on the behavior of the unit, are two key challenges that software testers face when developing tests with mocks. In this paper, we propose to monitor an application in production to generate tests that mimic realistic execution scenarios through mocks. Our approach operates in three phases. First, we instrument a set of target methods for which we want to generate tests, as well as the methods that they invoke, which we refer to as mockable method calls. Second, in production, we collect data about the context in which target methods are invoked, as well as the parameters and the returned value for each mockable method call. Third, offline, we analyze the production data to generate test cases with realistic inputs and mock interactions. The approach is automated and implemented in an open-source tool called RICK. We evaluate our approach with three real-world, open-source Java applications. RICK monitors the invocation of 128 methods in production across the three applications and captures their behavior. Next, RICK analyzes the production observations in order to generate test cases that include rich initial states and test inputs, mocks and stubs that recreate actual interactions between the method and its environment, as well as mock-based oracles. All the test cases are executable, and 52.4% of them successfully mimic the complete execution context of the target methods observed in production. We interview 5 developers from the industry who confirm the relevance of using production observations to design mocks and stubs.
... By means of these attributes, we propose to classify the components into 4 categories, "Mock", "Testable", "Testable in Isolation", and "Code Review", to help developers draw up their test plan. To select relevant quality attributes, we first studied the papers dealing with the use of mocks for testing, e.g., [10,11,21,3,16,12,2,23,22]. In particular, we drew on the conclusions of the recent surveys of Spadini et al. [22,21], which report that developers often use mocks to replace the components that are difficult to interpret, complex, not testable, or those that are called by others (e.g., external components like Web services). ...
... To select relevant quality attributes, we first studied the papers dealing with the use of mocks for testing, e.g., [10,11,21,3,16,12,2,23,22]. In particular, we drew on the conclusions of the recent surveys of Spadini et al. [22,21], which report that developers often use mocks to replace the components that are difficult to interpret, complex, not testable, or those that are called by others (e.g., external components like Web services). Then, we studied some papers related to Testability [7,19,8,9] and Dependability [20,5]. ...
... In reference to [14,22], we recall that developing a mock comes down to creating a component that mimics the behaviours of another real component (H1). A mock should be easy to create, easy to set up, and directly queryable (H2). ...
... Often, during testing activities, developers are faced with dependencies (e.g., web services, databases) that make the test harder to implement [1]. In this scenario, developers can either instantiate these dependencies inside the test or use mock objects to emulate the dependencies' behavior [2], [3]. The use of mock objects can help make the test fast, isolated, repeatable, and deterministic [1]. ...
... Past research showed that mocking frameworks are largely adopted by software projects [8] and that they may indeed support the creation of unit tests [2], [3], [9], [10]. Moreover, recent research showed how and why practitioners use mocks and the challenges faced by developers [2], [3]. ...
... Past research showed that mocking frameworks are largely adopted by software projects [8] and that they may indeed support the creation of unit tests [2], [3], [9], [10]. Moreover, recent research showed how and why practitioners use mocks and the challenges faced by developers [2], [3]. However, those studies are restricted to the context of mocking frameworks. ...
Conference Paper
Full-text available
During testing activities, developers frequently rely on dependencies (e.g., web services) that make the test harder to implement. In this scenario, they can use mock objects to emulate the dependencies' behavior, which helps make the test fast and isolated. In practice, the emulated dependency can be dynamically created with the support of mocking frameworks or manually hand-coded in mock classes. While the former is well-explored by the research literature, the latter has not yet been studied. Assessing mock classes would provide the basis to better understand how those mocks are created and consumed by developers and to detect novel practices and challenges. In this paper, we provide the first empirical study to assess mock classes. We analyze 12 popular software projects, detect 604 mock classes, and assess their content, design, and usage. We find that mock classes: often emulate domain objects, external dependencies, and web services; are typically part of a hierarchy; are mostly public, but 1/3 are private; and are largely consumed by client projects, particularly to support web testing. Finally, based on our results, we provide implications and insights to researchers and practitioners working with mock classes.
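
A minimal, hypothetical sketch of such a hand-coded mock class (names invented): the fake behavior is written out by hand rather than generated by a mocking framework, and the class exposes its own verification hook to consuming tests.

```java
// Hypothetical hand-coded mock class of the kind assessed above.
interface MailService {
    boolean send(String to, String body);
}

class MockMailService implements MailService {

    private int sentCount = 0;

    @Override
    public boolean send(String to, String body) {
        sentCount++;          // canned behavior: pretend the mail was sent, remember it
        return true;
    }

    public int sentCount() {  // hand-rolled verification hook used by consuming tests
        return sentCount;
    }
}
```
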
... Generating effective mock assertions requires understanding their usage in practice. Although several studies emphasize the importance of mock assertions [10,28,29,35,39], none provides such insights. To bridge this gap, we conducted the first empirical study on the usage of mock assertions. ...
... Mostafa and Wang [19] analyzed the usage of mocking frameworks in a vast number of open-source Java projects, revealing that while mock objects are widely used, only a subset of test dependencies are mocked. Spadini et al. [28] explored developers' mocking decisions and found that classes contributing to testing difficulties are often mocked. They further investigated the evolution of mocking framework usage [29], highlighting the frequent evolution of API usage related to mock assertions. ...
Preprint
Full-text available
Mock assertions provide developers with a powerful means to validate program behaviors that are unobservable to test assertions. Despite their significance, they are rarely considered by automated test generation techniques. Effective generation of mock assertions requires understanding how they are used in practice. Although previous studies highlighted the importance of mock assertions, none provide insight into their usages. To bridge this gap, we conducted the first empirical study on mock assertions, examining their adoption, the characteristics of the verified method invocations, and their effectiveness in fault detection. Our analysis of 4,652 test cases from 11 popular Java projects reveals that mock assertions are mostly applied to validating specific kinds of method calls, such as those interacting with external resources and those reflecting whether a certain code path was traversed in systems under test. Additionally, we find that mock assertions complement traditional test assertions by ensuring the desired side effects have been produced, validating control flow logic, and checking internal computation results. Our findings contribute to a better understanding of mock assertion usages and provide a foundation for future related research such as automated test generation that supports mock assertions.
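
The sketch below is a hedged illustration (invented names) of the mock-assertion usages described above: verifying that a side effect on an external collaborator was produced, that a code path was taken exactly once, and that the error path was not traversed.

```java
// Hedged illustration of mock assertions checking side effects and control flow.
import static org.mockito.ArgumentMatchers.anyString;
import static org.mockito.ArgumentMatchers.eq;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.never;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;

import org.junit.jupiter.api.Test;

class RegistrationTest {

    interface UserStore { void save(String name); }
    interface AuditLog  { void record(String event, String detail); }

    static class Registration {
        private final UserStore store;
        private final AuditLog audit;
        Registration(UserStore store, AuditLog audit) { this.store = store; this.audit = audit; }
        void register(String name) {
            store.save(name);
            audit.record("user-created", name);
        }
    }

    @Test
    void register_persistsUserAndSkipsErrorPath() {
        UserStore store = mock(UserStore.class);
        AuditLog audit = mock(AuditLog.class);

        new Registration(store, audit).register("alice");

        verify(store).save("alice");                                      // desired side effect produced
        verify(audit, times(1)).record(eq("user-created"), anyString());  // code path taken exactly once
        verify(audit, never()).record(eq("error"), anyString());          // error path not traversed
    }
}
```
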
... (2) Often, they are applicable only at the component level to simulate a component fully, while for debugging purposes, developers may need to simulate only parts of a component. (3) More importantly, while in the code-base development context, there are several mocking frameworks (e.g., Mockito, EasyMock, JMock, Opmock, etc.) that can be used to simulate components of a system [30], there is a lack of facilities, guidelines, and frameworks in the context of MDE to help to create mockers [31]. ...
... fix it compared to states that already have incoming transitions. 3) If the rule handles a broken chain, the algorithm tries items c) and d), in the same way as the previous heuristic (lines 30-31). ...
Preprint
Full-text available
The iterative and incremental nature of software development using models typically makes a model of a system incomplete (i.e., partial) until a more advanced and complete stage of development is reached. Existing model execution approaches (interpretation of models or code generation) do not support the execution of partial models. Supporting the execution of partial models at the early stages of software development allows early detection of defects, which can be fixed more easily and at a lower cost. This paper proposes a conceptual framework for the execution of partial models, which consists of three steps: static analysis, automatic refinement, and input-driven execution. First, a static analysis that respects the execution semantics of models is applied to detect problematic elements of models that cause problems for the execution. Second, using model transformation techniques, the models are refined automatically, mainly by adding decision points where missing information can be supplied. Third, refined models are executed, and when the execution reaches the decision points, it uses inputs obtained either interactively or by a script that captures how to deal with partial elements. We created an execution engine called PMExec for the execution of partial models of UML-RT (i.e., a modeling language for the development of soft real-time systems) that embodies our proposed framework. We evaluated PMExec based on several use-cases that show that the static analysis, refinement, and application of user input can be carried out with reasonable performance and that the overhead of the approach, which is mostly due to the refinement and the increase in model complexity it causes, is manageable. We also discuss the properties of the refinement formally and show how the refinement preserves the original behaviors of the model.
... While mocking is commonly used among developers [34,35], we argue that having developers consider whether to use mocks might divert them from our main goal, which is to observe how they reflect on deriving test cases. After manually exploring the Apache Commons Lang, a library known for its utility methods for string manipulation, we end up with four programs. ...
... Mocking is a common technique used by developers when they face more complicated pieces of code to test. In particular, Spadini et al. [34,35] explored how different Java systems make use of mocking. The authors observe that developers tend to mock infrastructure classes (e.g., classes that access databases or webservices) and/or classes that are too complex to be instantiated in the test code. ...
Preprint
One of the main challenges that developers face when testing their systems lies in engineering test cases that are good enough to reveal bugs. And while our body of knowledge on software testing and automated test case generation is already quite significant, in practice, developers are still the ones responsible for engineering test cases manually. Therefore, understanding the developers' thought- and decision-making processes while engineering test cases is a fundamental step in making developers better at testing software. In this paper, we observe 13 developers thinking-aloud while testing different real-world open-source methods, and use these observations to explain how developers engineer test cases. We then challenge and augment our main findings by surveying 72 software developers on their testing practices. We discuss our results from three different angles. First, we propose a general framework that explains how developers reason about testing. Second, we propose and describe in detail the three different overarching strategies that developers apply when testing. Third, we compare and relate our observations with the existing body of knowledge and propose future studies that would advance our knowledge on the topic.
... To select relevant quality metrics, we first studied the papers dealing with the use of mocks for testing, e.g., [10, 11, 21, 3, 16, 12, 2, ?, 22]. In particular, we drew on the conclusions of the recent surveys of Spadini et al. [22,21], which report that developers often use mocks to replace the components that are difficult to interpret, complex, not testable, or those that are called by others (e.g., external components like Web services). Then, we studied some papers related to Testability [7,19,8,9] and Dependability [20, ?]. ...
... In reference to [14,22], we recall that developing a mock comes down to creating a component that mimics the behaviours of another real component (H1). A mock should be easy to create, easy to set up, and directly queryable (H2). ...
Conference Paper
Full-text available
Mocking objects is a common technique that substitutes parts of a program to simplify the test case development, to increase test coverage or to speed up performance. Today, mocks are almost exclusively used with object oriented programs. But mocks could offer the same benefits with communicating systems to make them more reliable. This paper proposes a model-based approach to help developers generate mocks for this kind of system, i.e. systems made up of components interacting with each other by data networks and whose communications can be monitored. The approach combines model learning to infer models from event logs, quality metric measurements to help choose the components that may be replaced by mocks, and mock generation and execution algorithms to reduce the mock development time. The approach has been implemented as a tool chain with which we performed experiments to evaluate its benefits in terms of usability and efficiency.
... Firstly, it records the network interaction between a component under development and its requisite services in numerous test scenarios via tools such as Wireshark [10]. Second, it uses data mining techniques to generate executable models for requisite services [4,9,11-13]. The SV solutions are also known as record-and-replay techniques because they find the most similar request from the recorded network interactions to the incoming request and replace some of the response fields to generate the new response. ...
... Multiple methods have been proposed to ensure the availability of the requisite services and the environment to test each service. To imitate the server-side systems' interactive behaviour, mock objects and stubs [3,4] are commonly used. However, this method requires coding in a specific language; as a result, each service modification leads to the manual change of the code. ...
Article
Full-text available
Software services communicate with different requisite services over the computer network to accomplish their tasks. The requisite services may not be readily available to test a specific service. Thus, service virtualisation has been proposed as an industry solution to ensure the availability of the interactive behaviour of the requisite services. However, the existing techniques of virtualisation cannot satisfy the required accuracy or time constraints to keep up with the competitive business world. These constraints sacrifice quality and testing coverage, thereby delaying the delivery of software. We proposed a novel technique to improve the accuracy of the existing service virtualisation solutions without sacrificing time. This method generates the service response and predicts categorical fields in virtualised responses, extending existing research with lower complexity and higher accuracy. The proposed service virtualisation approach uses conditional entropy to identify the fields that can be used to drive the value of each categorical field based on the historical messages. Then, it uses a joint probability distribution to find the best values for the categorical fields. The experimental evaluation illustrates that the proposed approach can generate responses with the required fields and accurate values for categorical fields over four data sets with stateful nature.
... A number of techniques have been proposed for providing 'replicated' services in development and testing environments in order to ensure that they can be readily accessed by software engineers. Mock objects and stubs [5,6] are widely used to mimic the interactive behaviour of actual server-side systems. However, this approach is language-specific, and any modification in the systems leads to a change in the code. ...
... This task requires significant knowledge regarding the functionality of a requisite service, which is something that may not always be readily available (e.g., in case of legacy software) and practical when a requisite service's behaviour is quite complex. Some shortcomings of service emulation are addressed in SV by (i) recording the interactions between a component under development and its requisite services (e.g., using tools, such as Wireshark [12]), from a number of test scenarios and (ii) generating executable models of the requisite services by applying data mining techniques [6,11,13,14]. Consequently, some of these techniques are referred to as "record-and-replay" techniques. For example, the technique that was presented by Du et al. [14] "matches" an incoming request with those found in the aforementioned recordings and substitutes fields to generate a response for the incoming request. ...
Article
Full-text available
Continuous delivery has gained increased popularity in industry as a development approach to develop, test, and deploy enhancements to software components in short development cycles. In order for continuous delivery to be effectively adopted, the services that a component depends upon must be readily available to software engineers in order to systematically apply quality assurance techniques. However, this may not always be possible as (i) these requisite services may have limited access and (ii) defects that are introduced in a component under development may cause ripple effects in real deployment environments. Service virtualisation (SV) has been introduced as an approach to address these challenges, but existing approaches to SV still fall short of delivering the required accuracy and/or ease-of-use to virtualise services for adoption in continuous delivery. In this work, we propose a novel machine learning based approach to predict numeric fields in virtualised responses, extending existing research that has provided a way to produce values for categorical fields. The SV approach introduced here uses machine learning techniques to derive values of numeric fields that are based on a variable number of pertinent historic messages. Our empirical evaluation demonstrates that the Cognitive SV approach can produce responses with the appropriate fields and accurately predict values of numeric fields across three data sets, some of them based on stateful protocols.
... Originating from PHP, it combines a JIT-based runtime [46] with a gradual, object-oriented type system (providing static correctness guarantees similar to TypeScript for JavaScript). Hack developers can depend on commonly shared frameworks, including mocking for testing [50]. While CI forces test execution for every diff, developers can additionally run them locally if they so wish. ...
Preprint
Full-text available
This paper introduces Diff Authoring Time (DAT), a powerful, yet conceptually simple approach to measuring software development productivity that enables rigorous experimentation. DAT is a time-based metric, which assesses how long engineers take to develop changes, using a privacy-aware telemetry system integrated with version control, the IDE, and the OS. We validate DAT through observational studies, surveys, visualizations, and descriptive statistics. At Meta, DAT has powered experiments and case studies on more than 20 projects. Here, we highlight (1) an experiment on introducing mock types (a 14% DAT improvement), (2) the development of automatic memoization in the React compiler (33% improvement), and (3) an estimate of thousands of DAT hours saved annually through code sharing (> 50% improvement). DAT offers a precise, yet high-coverage measure for development productivity, aiding business decisions. It enhances development efficiency by aligning the internal development workflow with the experiment-driven culture of external product development. On the research front, DAT has enabled us to perform rigorous experimentation on long-standing software engineering questions such as "do types make development more efficient?"
... In such cases, developers typically create virtual objects using mocking techniques [29,39] to simulate the behavior of real external dependencies. However, to effectively generate such complex test code, LLMs must grasp the principles and applicable scenarios related to mocking [35,36]. ...
Preprint
Full-text available
Unit testing plays a pivotal role in the software development lifecycle, as it ensures code quality. However, writing high-quality unit tests remains a time-consuming task for developers in practice. More recently, the application of large language models (LLMs) in automated unit test generation has demonstrated promising results. Existing approaches primarily focus on interpreted programming languages (e.g., Java), while mature solutions tailored to compiled programming languages like C++ are yet to be explored. The intricate language features of C++, such as pointers, templates, and virtual functions, pose particular challenges for LLMs in generating both executable and high-coverage unit tests. To tackle the aforementioned problems, this paper introduces CITYWALK, a novel LLM-based framework for C++ unit test generation. CITYWALK enhances LLMs by providing a comprehensive understanding of the dependency relationships within the project under test via program analysis. Furthermore, CITYWALK incorporates language-specific knowledge about C++ derived from project documentation and empirical observations, significantly improving the correctness of the LLM-generated unit tests. We implement CITYWALK by employing the widely popular LLM GPT-4o. The experimental results show that CITYWALK outperforms current state-of-the-art approaches on a collection of eight popular C++ projects. Our findings demonstrate the effectiveness of CITYWALK in generating high-quality C++ unit tests.
... This operation occurs when the test verifies the current OS and performs mock-related operations to emulate missing or hard-to-test objects (Pereira and Hora 2020; Meszaros 2007; Spadini et al. 2017, 2019). For example, Fig. ...
Article
Full-text available
Context: Real-world software systems are often tested in multiple operating systems (OSs). Consequently, developers may need to handle specific OS requirements in tests. For example, different OSs have distinct file path name conventions (e.g., between Windows and Unix), thus, the tests should be adapted to run differently depending on whether the OS is Windows or Unix. In this context, an OS-specific test is a test that identifies the OS on which it will be executed. OS-specific tests may execute different lines of code of the application depending on the OS they are running on. Objective: In this paper, we provide the first empirical study to assess OS-specific tests, exploring how and why developers implement this kind of test. This knowledge can help us understand OS-specific tests and the challenges faced by developers when testing for multiple operating systems. Method: We mine 100 popular Python systems and assess their OS-specific tests both quantitatively and qualitatively. We propose five research questions to assess the frequency, location, target, operations, and reasons. Results: (1) We find that OS-specific tests are common: 56% of the analyzed Python projects have OS-specific tests and Windows is the most targeted OS. (2) We detect that OS verification happens more frequently in test decorators (65%) than in test code (35%). (3) OS-specific tests target a diversity of code, including file/directory, network, and permission/privilege. (4) Developers may perform multiple operations in OS-specific tests, including calling OS-specific APIs, mocking OS-specific objects, and suspending execution. (5) We find that OS-specific tests are implemented mostly to overcome unavailable external resources, unsupported standard libraries, and flaky tests. Conclusions: Finally, based on our findings, we discuss practical implications for practitioners and researchers, including the relation of OS-specific tests with test smells, CI/CD, technical debt, and flaky tests. We also discuss the efforts to test on Windows properly and propose a novel refactoring to improve some instances of OS-specific tests.
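
The study targets Python projects; purely for consistency with the other Java sketches on this page, the following is a rough JUnit 5 analogue of an OS-specific test, with the OS verification placed in a test "decorator" (annotation) rather than in the test body.

```java
// Rough Java/JUnit 5 analogue of an OS-specific test: the annotation decides on which OS
// the test runs, and each variant checks the platform-specific path separator.
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.io.File;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.condition.EnabledOnOs;
import org.junit.jupiter.api.condition.OS;

class PathSeparatorTest {

    @Test
    @EnabledOnOs(OS.WINDOWS)            // runs only when the OS is Windows
    void usesBackslashOnWindows() {
        assertEquals("\\", File.separator);
    }

    @Test
    @EnabledOnOs({OS.LINUX, OS.MAC})    // Unix-style counterpart of the same check
    void usesSlashOnUnix() {
        assertEquals("/", File.separator);
    }
}
```
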
... • These results show that APT generates maintainable and readable tests, with significantly fewer code style violations, especially in key features like code duplication and naming conventions. Mock Density evaluates the use of mock objects, as excessive mocking can lead to lower test readability, higher maintenance difficulty, and overly fine-grained test cases [41]. To further analyze the usability of the generated unit tests, we use PMD [36], as used by Tang et al. [43], for mock detection. ...
Preprint
Automated unit test generation has been widely studied, with Large Language Models (LLMs) recently showing significant potential. Moreover, in the context of unit test generation, these tools prioritize high code coverage, often at the expense of practical usability, correctness, and maintainability. In response, we propose Property-Based Retrieval Augmentation, a novel mechanism that extends LLM-based Retrieval-Augmented Generation (RAG) beyond basic vector, text similarity, and graph-based methods. Our approach considers task-specific context and introduces a tailored property retrieval mechanism. Specifically, in the unit test generation task, we account for the unique structure of unit tests by dividing the test generation process into Given, When, and Then phases. When generating tests for a focal method, we not only retrieve general context for the code under test but also consider task-specific context such as pre-existing tests of other methods, which can provide valuable insights for any of the Given, When, and Then phases. This forms property relationships between focal method and other methods, thereby expanding the scope of retrieval beyond traditional RAG. We implement this approach in a tool called APT, which sequentially performs preprocessing, property retrieval, and unit test generation, using an iterative strategy where newly generated tests guide the creation of subsequent ones. We evaluated APT on 12 open-source projects with 1515 methods, and the results demonstrate that APT consistently outperforms existing tools in terms of correctness, completeness, and maintainability of the generated tests. Moreover, we introduce a novel code-context-aware retrieval mechanism for LLMs beyond general context, offering valuable insights and potential applications for other code-related tasks.
... Therefore, testing microservices is a hard task due to the variety of microservices and the dependencies between them when constituting a whole functional service. When writing automated unit tests, developers often deal with software artifacts that have several dependencies [19]. In these cases, one has the possibility of either instantiating the dependencies or using mock objects to simulate the dependencies' expected behavior. ...
Conference Paper
Full-text available
In a large Smart Grid, smart meters produce a tremendous amount of data that is hard to process, analyze and store. Fog computing is an environment that offers a place for collecting, computing and storing smart meter data before transmitting them to the cloud. Due to the distributed, heterogeneous and resource constrained nature of the fog computing nodes, fog applications need to be developed as a collection of interdependent, lightweight modules. Since this concept aligns with the goals of microservices architecture (MSA), efficient placement of microservices-based Smart Grid applications within fog environments has the potential to fully leverage capabilities of fog devices. Microservice architecture is an emerging software architectural style. It is based on microservices to provide several advantages over a monolithic solution, such as autonomy, composability, scalability, and fault-tolerance. However, optimizing the migration of microservices from one fog environment to another while assuring certain quality is still a big issue that needs to be addressed. In this paper, we propose an approach for assisting the migration of microservices in MSA-based Smart Grid systems, based on the analysis of their performance within the possible candidate destinations. Developers create microservices that will be eventually deployed at a given infrastructure. Either the developer, considering the design, or the entity deploying the service has good knowledge of the quality required by the microservice. Due to that, they can create tests that determine if a destination meets the requirements of a given microservice and embed these tests as part of the microservice. Our goal is to automate the execution of performance tests by attaching a specification that contains the test parameters to each microservice.
... We present DockerMock, a pre-build fault detector for Dockerfiles, via mocking instruction execution, as shown in Fig. 6. Mocking is a unit testing practice, replacing dependencies with mock objects [16]. DockerMock simulates the build process of the Dockerfile and warns about violations between the mock context and the requirements of any mocked instruction. ...
Preprint
Continuous Integration (CI) and Continuous Deployment (CD) are widely adopted in software engineering practice. In reality, the CI/CD pipeline execution is not yet reliably continuous because it is often interrupted by Docker build failures. However, the existing trial-and-error practice to detect faults is time-consuming. To timely detect Dockerfile faults, we propose a context-based pre-build analysis approach, named DockerMock, through mocking the execution of common Dockerfile instructions. A Dockerfile fault is declared when an instruction conflicts with the approximated and accumulated running context. By explicitly keeping track of whether the context is fuzzy, DockerMock strikes a good balance of detection precision and recall. We evaluated DockerMock with 53 faults in 41 Dockerfiles from open source projects on GitHub and 130 faults in 105 Dockerfiles from student course projects. On average, DockerMock detected 68.0% of Dockerfile faults in these two datasets, while the baseline hadolint detected 6.5% and the baseline BuildKit detected 60.5% without instruction execution. In the GitHub dataset, DockerMock reduces the number of builds to 47, outperforming that of hadolint (73) and BuildKit (74).
... Several approaches aimed to provide the required components and environments ready for testing each component. The first approach is the commonly used mock objects such as stubs [2,3]. The server-side interactive behaviour of each requisite component is ... The referenced SV solutions cover some simple stateless protocols that do not require more than the current request to generate an accurate response message. ...
Article
Full-text available
Continuous delivery is an industry software development approach that aims to reduce the delivery time of software and increase the quality assurance within a short development cycle. The fast delivery and improved quality require continuous testing of the developed software service. Testing services is complicated and costly, and it is often postponed to the end of development due to the unavailability of the requisite services. Therefore, an empirical approach that has been utilised to overcome these challenges is to automate software testing by virtualising the requisite services' behaviour for the system being tested. Service virtualisation involves analysing the behaviour of software services to uncover their external behaviour in order to generate a light-weight executable model of the requisite services. There are different research areas which can be used to create such a virtual model of services from network interactions or service execution logs, including message format extraction, inferring the control model, data model and multi-service dependencies. This paper reviews the state of the art of how these areas have been used in automating service virtualisation to make the required environment for testing software available. This paper provides a review of the relevant research within these four fields by carrying out a structured study on about 80 research works. These studies were then categorised according to their functional context as extracting the message format, control model, data model and multi-service dependencies that can be employed to automate the service virtualisation activity. Based on our knowledge, this is the first structural review paper in service virtualisation fields.
Preprint
Full-text available
GitHub, renowned for facilitating collaborative code version control and software production in software teams, expanded its services in 2017 by introducing GitHub Marketplace. This online platform hosts automation tools to assist developers with the production of their GitHub-hosted projects, and it has become a valuable source of information on the tools used in the Open Source Software (OSS) community. In this exploratory study, we introduce GitHub Marketplace as a software marketplace by comprehensively exploring the platform's characteristics, features, and policies and identifying common themes in production automation. Further, we explore popular tools among practitioners and researchers and highlight disparities in the approach to these tools between industry and academia. We adopted the conceptual framework of software app stores from previous studies to examine 8,318 automated production tools (440 Apps and 7,878 Actions) across 32 categories on GitHub Marketplace. We explored and described the policies of this marketplace as a unique platform where developers share production tools for the use of other developers. Furthermore, we systematically mapped 515 research papers published from 2000 to 2021 and compared open-source academic production tools with those available in the marketplace. We found that although some of the automation topics in literature are widely used in practice, they have yet to align with the state of practice for automated production. We discovered that practitioners often use automation tools for tasks like "Continuous Integration" and "Utilities," while researchers tend to focus more on "Code Quality" and "Testing". Our study illuminates the landscape of open-source tools for automation production in industry and research.
Article
Infrastructure as Code (IaC) enables efficient deployment and operation, which are crucial to releasing software quickly. As setups can be complex, developers implement IaC programs in general-purpose programming languages like TypeScript and Python, using PL-IaC solutions like Pulumi and AWS CDK. The reliability of such IaC programs is even more relevant than in traditional software because a bug in IaC impacts the whole system. Yet, even though testing is a standard development practice, it is rarely used for IaC programs. For instance, in August 2022, less than 1% of the public Pulumi IaC programs on GitHub implemented tests. Available IaC program testing techniques severely limit the development velocity or require much development effort. To solve these issues, we propose Automated Configuration Testing (ACT), a methodology to test IaC programs in many configurations quickly and with low effort. ACT automatically mocks all resource definitions in the IaC program and uses generator and oracle plugins for test generation and validation. We implement ACT in ProTI, a testing tool for Pulumi TypeScript with a type-based generator and oracle, and support for application specifications. Our evaluation with 6,081 programs from GitHub and artificial benchmarks shows that ProTI can directly be applied to existing IaC programs, quickly finds bugs where current techniques are infeasible, and enables reusing existing generators and oracles thanks to its pluggable architecture.
Chapter
Testing large and complex enterprise software systems can be a challenging task. This is especially the case when the functionality of the system depends on interactions with other external services over a network (e.g., external REST APIs). Although several techniques in the research literature have been shown to be effective at generating test cases in many different software testing contexts, dealing with external services is still a major research challenge. In industry, a common approach is to mock external web services for testing purposes. However, generating and configuring mock web services can be a very time-consuming task. Furthermore, external services may not be under the control of the same developers of the tested application. In this paper, we present a novel search-based approach aimed at fully automated mocking external web services as part of white-box, search-based fuzzing. We rely on code instrumentation to detect all interactions with external services, and how their response data is parsed. We then use such information to enhance a search-based approach for fuzzing. The tested application is automatically modified (by manipulating DNS lookups) to rather interact with instances of mock web servers. The search process not only generates inputs to the tested applications, but also it automatically setups responses in those mock web server instances, aiming at maximizing code coverage and fault-finding. An empirical study on 3 open-source REST APIs from EMB, and one industrial API from an industry partner, shows the effectiveness of our novel techniques, i.e., significantly improves code coverage and fault detection.
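
For contrast with the automated approach above, the sketch below shows the manual industry practice the chapter refers to: configuring a mock web server by hand, here with the WireMock library (the endpoint and payload are invented).

```java
// Hedged sketch of a hand-configured mock web server (WireMock). Every response the
// tested application may need has to be stubbed explicitly, which is what makes this
// manual approach time-consuming compared to the chapter's automated technique.
import static com.github.tomakehurst.wiremock.client.WireMock.aResponse;
import static com.github.tomakehurst.wiremock.client.WireMock.get;
import static com.github.tomakehurst.wiremock.client.WireMock.urlEqualTo;

import com.github.tomakehurst.wiremock.WireMockServer;

class ExternalServiceMock {

    public static void main(String[] args) {
        WireMockServer server = new WireMockServer(8089);   // local stand-in for the external API
        server.start();

        server.stubFor(get(urlEqualTo("/users/42"))
                .willReturn(aResponse()
                        .withStatus(200)
                        .withHeader("Content-Type", "application/json")
                        .withBody("{\"id\": 42, \"name\": \"alice\"}")));

        // ... point the application under test at http://localhost:8089 and run its tests ...

        server.stop();
    }
}
```
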
Chapter
Accurate matching of test and production files plays a critical role in the analysis and evaluation of Test-Driven Development (TDD). However, current approaches often yield unsatisfactory results due to their reliance on filename-based matching. The purpose of this study is to compare the performance of a statement-based matching algorithm with the traditional filename-based approach. A comprehensive evaluation was conducted using 500 tests from 16 open-source Java projects, wherein the weighted F1-scores of both methods were assessed. Subsequently, the 95% confidence intervals were determined using a pseudosample size of 500. The statement-based approach achieved a 95% confidence interval of [0.6815, 0.7347], while the filename-based method had a notably lower interval of [0.1931, 0.2459]. These results demonstrate the superior performance of the statement-based matching algorithm, providing a more accurate and reliable solution for matching test and production files in TDD research. In conclusion, the statement-based matching algorithm significantly outperforms the filename-based method, which will benefit TDD research by offering a more accurate method of matching production files to test files. Keywords: Traceability Links, Assertion Analysis, F1-Score
Article
Unit testing focuses on verifying the functions of individual units of a software system. It is challenging due to the high interdependencies among software units. Developers address this by mocking—replacing the dependency with a "fake" object. Despite the existence of powerful, dedicated mocking frameworks, developers often turn to a "hand-rolled" approach—inheritance. That is, they create a subclass of the dependent class and mock its behavior through method overriding. However, this requires tedious implementation and compromises the design quality of unit tests. This work contributes a fully automated refactoring framework to identify and replace the usage of inheritance by using Mockito—a well received mocking framework. Our approach is built upon the empirical experience from five open source projects that use inheritance for mocking. We evaluate our approach on nine other projects. Results show that our framework is efficient, generally applicable to new datasets, mostly preserves test case behaviors in detecting defects (in the form of mutants), and decouples test code from production code. The qualitative evaluation by experienced developers suggests that the auto-refactoring solutions generated by our framework improve the quality of the unit test cases in various aspects, such as making test conditions more explicit, as well as improving cohesion, readability, understandability, and maintainability of the test cases. Finally, we submit 23 pull requests containing our refactoring solutions to the open source projects. It turns out that 9 requests are accepted/merged, 6 requests are rejected, and the remaining requests are pending (5 requests), met with unexpected exceptions (2 requests), or undecided (1 request). In particular, among the 21 open source developers that are involved in the reviewing process, 81% gave positive votes. This indicates that our refactoring solutions are quite well received by the open source projects and developers.
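
The following before/after sketch (invented names, not taken from the studied projects) illustrates the refactoring described above: a "hand-rolled" mock built by subclassing and overriding, and its Mockito-based replacement.

```java
// Hedged before/after sketch of replacing inheritance-based mocking with Mockito.
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

class InheritanceVsMockito {

    static class RateProvider {                       // the dependency to be replaced
        public double currentRate(String currency) {
            throw new UnsupportedOperationException("would call a remote service");
        }
    }

    static class Converter {
        private final RateProvider provider;
        Converter(RateProvider provider) { this.provider = provider; }
        public double convert(double amount, String currency) {
            return amount * provider.currentRate(currency);
        }
    }

    // Before: mocking through inheritance, with canned behavior hard-coded in a subclass.
    static class FakeRateProvider extends RateProvider {
        @Override
        public double currentRate(String currency) { return 1.5; }
    }

    double before() {
        return new Converter(new FakeRateProvider()).convert(10, "EUR");
    }

    // After: the same behavior expressed with a framework mock, which keeps the test
    // decoupled from RateProvider's implementation and makes the expectation explicit.
    double after() {
        RateProvider provider = mock(RateProvider.class);
        when(provider.currentRate("EUR")).thenReturn(1.5);

        double result = new Converter(provider).convert(10, "EUR");
        verify(provider).currentRate("EUR");
        return result;
    }
}
```
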
Article
Tests that fail inconsistently, without changes to the code under test, are described as flaky . Flaky tests do not give a clear indication of the presence of software bugs and thus limit the reliability of the test suites that contain them. A recent survey of software developers found that 59% claimed to deal with flaky tests on a monthly, weekly, or daily basis. As well as being detrimental to developers, flaky tests have also been shown to limit the applicability of useful techniques in software testing research. In general, one can think of flaky tests as being a threat to the validity of any methodology that assumes the outcome of a test only depends on the source code it covers. In this article, we systematically survey the body of literature relevant to flaky test research, amounting to 76 papers. We split our analysis into four parts: addressing the causes of flaky tests, their costs and consequences, detection strategies, and approaches for their mitigation and repair. Our findings and their implications have consequences for how the software-testing community deals with test flakiness, pertinent to practitioners and of interest to those wanting to familiarize themselves with the research area.
Article
One of the main challenges that developers face when testing their systems lies in engineering test cases that are good enough to reveal bugs. And while our body of knowledge on software testing and automated test case generation is already quite significant, in practice, developers are still the ones responsible for engineering test cases manually. Therefore, understanding the developers' thought- and decision-making processes while engineering test cases is a fundamental step in making developers better at testing software. In this paper, we observe 13 developers thinking-aloud while testing different real-world open-source methods, and use these observations to explain how developers engineer test cases. We then challenge and augment our main findings by surveying 72 software developers on their testing practices. We discuss our results from three different angles. First, we propose a general framework that explains how developers reason about testing. Second, we propose and describe in detail the three different overarching strategies that developers apply when testing. Third, we compare and relate our observations with the existing body of knowledge and propose future studies that would advance our knowledge on the topic.
Conference Paper
Background: Automated unit and integration tests allow software development teams to continuously evaluate their application's behavior and ensure requirements are satisfied. Interest in explicitly testing security at the unit and integration levels has risen as more teams begin to shift security left in their workflows, but there is little insight into any potential pain points developers may experience as they learn to adapt their existing skills to write these tests. Aims: Identify security unit and integration testing pain points that could negatively impact efforts to shift security (testing) left to this level. Method: A mixed-method empirical study was conducted on 525 Stack Overflow and Security Stack Exchange posts related to security unit and integration testing. Latent Dirichlet Allocation (LDA) was applied to identify commonly discussed topics, pain points were learned through qualitative analysis, and links were analyzed to study commonly-shared resources. Results: Nine topics representing security controls, components, and scenarios were identified; Authentication was the most commonly tested control. Developers experienced seven pain points unique to security unit and integration testing, which were all influenced by the complexity of security control designs and implementations. Most linked resources were other Q&A posts, but repositories and documentation for security tools and libraries were also common. Conclusions: Developers may experience several unique pain points when writing tests at this level involving security controls. Additional resources are needed to guide developers through these challenges, which should also influence the creation of strategies and tools to help shift security testing to this level. To accelerate this, actionable recommendations for practitioners and future research directions based on these findings are highlighted.
Article
In continuous testing, developers execute automated test cases once or even several times per day to ensure the quality of the integrated code. Although continuous testing helps ensure the quality of the code and reduces maintenance effort, it also significantly increases test execution overhead. In this paper, we empirically evaluate the effectiveness of test impact analysis from the perspective of code dependencies in the continuous testing setting. We first applied test impact analysis to one year of software development history in 11 large-scale open-source systems. We found that even though the number of changed files is small in daily commits (median ranges from 3 to 28 files), around 50% or more of the test cases are still impacted and need to be executed. Motivated by our finding, we further studied the code dependencies between source code files and test cases, and among test cases. We found that 1) test cases often focus on testing the integrated behaviour of the systems and 15% of the test cases have dependencies with more than 20 source code files; 2) 18% of the test cases have dependencies with other test cases, and test case inheritance is the most common cause of test case dependencies; and 3) we documented four dependency-related test smells that we uncovered in our manual study. Our study provides the first step towards studying and understanding the effectiveness of test impact analysis in the continuous testing setting and provides insights on improving test design and execution.
Article
Full-text available
Components designed for reuse are expected to lower costs and shorten the development life cycle, but this may not prove so simple. The author emphasizes the need to closely examine a problematic aspect of component reuse: the necessity and potential expense of validating components in their new environments.
Article
Full-text available
Despite a prevalent industry perception to the contrary, the agile practices of Test-Driven Development and Continuous Integration can be successfully applied to embedded software. We present here a holistic set of practices, platform-independent tools, and a new design pattern (Model Conductor Hardware - MCH) that together produce: good design from tests programmed first, logic decoupled from hardware, and systems testable under automation. Ultimately, this approach yields an order of magnitude or more reduction in software flaws, predictable progress, and measurable velocity for data-driven project management. We use the approach discussed herein for real-world production systems and have included a full C-based sample project (using an Atmel AT91SAM7X ARM7) to illustrate it. This example demonstrates transforming requirements into test code, system, integration, and unit tests driving development, daily "micro design" fleshing out a system's architecture, the use of the MCH itself, and the use of mock functions in tests.
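The paper's sample project is C-based; purely as an illustration of the same decoupling idea in the Java/Mockito setting used elsewhere in this list, the sketch below keeps the logic (the "conductor") behind a hardware interface that the test replaces with a mock. All names are hypothetical and this is not the MCH code from the paper.

    import org.junit.Test;
    import org.mockito.Mockito;

    interface Heater { void on(); void off(); }        // hardware boundary

    class ThermostatConductor {                         // pure logic, no I/O
        private final Heater heater;
        ThermostatConductor(Heater heater) { this.heater = heater; }
        void update(double temperature, double target) {
            if (temperature < target) heater.on(); else heater.off();
        }
    }

    public class ThermostatConductorTest {
        @Test
        public void turnsHeaterOnBelowTarget() {
            Heater heater = Mockito.mock(Heater.class);
            new ThermostatConductor(heater).update(18.0, 21.0);
            Mockito.verify(heater).on();                // verified without real hardware
        }
    }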
Conference Paper
Full-text available
Unit testing is a technique of testing a single unit of a program in isolation. The testability of the unit under test can be reduced when the unit interacts with its environment. The construction of high-covering unit tests and their execution require appropriate interactions with the environment such as a file system or database. To help set up the required environment, developers can use mock objects to simulate the behavior of the environment. In this paper, we present an empirical study to analyze the use of mock objects to test file-system-dependent software. We use a mock object of the FileSystem API provided with the Pex automatic testing tool in our study. We share our insights gained on the benefits of using mock objects in unit testing and discuss the faced challenges.
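The study relies on the FileSystem mock shipped with Pex (.NET); a rough Java analogue of the same idea is sketched below, where file access sits behind an interface that the test mocks so no real file system is touched. The interface and class names are illustrative.

    import org.junit.Assert;
    import org.junit.Test;
    import org.mockito.Mockito;

    interface FileSystem { boolean exists(String path); String read(String path); }

    class ConfigLoader {
        private final FileSystem fs;
        ConfigLoader(FileSystem fs) { this.fs = fs; }
        String load(String path) {
            // Falls back to a default when the configuration file is absent.
            return fs.exists(path) ? fs.read(path) : "default";
        }
    }

    public class ConfigLoaderTest {
        @Test
        public void fallsBackToDefaultWhenFileMissing() {
            FileSystem fs = Mockito.mock(FileSystem.class);
            Mockito.when(fs.exists("/etc/app.conf")).thenReturn(false);
            Assert.assertEquals("default", new ConfigLoader(fs).load("/etc/app.conf"));
        }
    }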
Conference Paper
Full-text available
Software testing has been commonly used in assuring the quality of database applications. It is often prohibitively expensive to manually write quality tests for complex database applications. Automated test generation techniques, such as Dynamic Symbolic Execution (DSE), have been proposed to reduce human efforts in testing database applications. However, such techniques have two major limitations: (1) they assume that the database that the application under test interacts with is accessible, which may not always be true; and (2) they usually cannot create necessary database states as a part of the generated tests. To address the preceding limitations, we propose an approach that applies DSE to generate tests for a database application. Instead of using the actual database that the application interacts with, our approach produces and uses a mock database in test generation. A mock database mimics the behavior of an actual database by performing identical database operations on itself. We conducted two empirical evaluations on both a medical device and an open source software system to demonstrate that our approach can generate, without producing false warnings, tests with higher code coverage than conventional DSE-based techniques.
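A minimal sketch of the idea behind a mock database, i.e., a stand-in that performs the same logical operations on its own in-memory state instead of contacting a real DBMS; the interface and classes below are illustrative and are not the authors' implementation.

    import java.util.*;

    interface Database {
        void insert(String table, Map<String, Object> row);
        List<Map<String, Object>> query(String table, String column, Object value);
    }

    // Mimics the observable behavior of a database by applying identical
    // operations to in-memory tables.
    class InMemoryDatabase implements Database {
        private final Map<String, List<Map<String, Object>>> tables = new HashMap<>();

        public void insert(String table, Map<String, Object> row) {
            tables.computeIfAbsent(table, t -> new ArrayList<>()).add(row);
        }

        public List<Map<String, Object>> query(String table, String column, Object value) {
            List<Map<String, Object>> result = new ArrayList<>();
            for (Map<String, Object> row : tables.getOrDefault(table, Collections.emptyList())) {
                if (Objects.equals(row.get(column), value)) result.add(row);
            }
            return result;
        }
    }

    public class MockDatabaseSketch {
        public static void main(String[] args) {
            Database db = new InMemoryDatabase();
            db.insert("patients", Map.of("id", 1, "name", "Ada"));
            System.out.println(db.query("patients", "name", "Ada")); // no real database needed
        }
    }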
Article
Full-text available
Most companies and testing books use the term unit testing, but its semantics varies widely in different organizations. Lund University researchers launched a survey in a company network to explore this variation. The survey, which involved companies in focus-group discussions and a questionnaire, focused on defining unit testing, discussing its practices, and evaluating a company's strengths and weaknesses. The results indicate a common understanding of unit testing's scope, although the aim and practices vary across companies. Practitioners can use the survey instruments as a baseline for measuring their unit testing practices and to start improvement initiatives. This article is part of a special issue on Software Testing.
Conference Paper
Full-text available
Engineering software systems is a multidisciplinary activity, whereby a number of artifacts must be created - and maintained - synchronously. In this paper we investigate whether production code and the accompanying tests co-evolve by exploring a project's versioning system, code coverage reports and size metrics. Our main aim for studying this co-evolution is to create awareness with developers and managers alike about the testing process that is followed. We explore the possibilities of our technique through two open source case studies and observe a number of different co-evolution scenarios. We evaluate our results both with the help of log messages and the original developers of the software system.
Article
Full-text available
Unit testing is a fundamental practice in Extreme Programming, but most non-trivial code is difficult to test in isolation. It is hard to avoid writing test suites that are complex, incomplete, and difficult to maintain and interpret. Using Mock Objects for unit testing improves both domain code and test suites. They allow unit tests to be written for everything, simplify test structure, and avoid polluting domain code with testing infrastructure. Keywords: Extreme Programming, Unit Testing, Mock Objects, Stubs
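A minimal sketch of the pattern this article describes, written with JUnit and Mockito as used elsewhere in this document; the ExchangeRateService and PriceConverter names are made up for illustration. The mock keeps the slow, external rate lookup out of the unit test and out of the domain code.

    import org.junit.Assert;
    import org.junit.Test;
    import org.mockito.Mockito;

    interface ExchangeRateService { double rate(String from, String to); }

    class PriceConverter {
        private final ExchangeRateService rates;
        PriceConverter(ExchangeRateService rates) { this.rates = rates; }
        double convert(double amount, String from, String to) {
            return amount * rates.rate(from, to);
        }
    }

    public class PriceConverterTest {
        @Test
        public void convertsUsingCurrentRate() {
            ExchangeRateService rates = Mockito.mock(ExchangeRateService.class);
            Mockito.when(rates.rate("EUR", "USD")).thenReturn(1.10);
            Assert.assertEquals(11.0, new PriceConverter(rates).convert(10.0, "EUR", "USD"), 1e-9);
        }
    }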
Article
We catalog and describe Google's key software engineering practices.
Article
In software testing, especially unit testing, it is very common that software testers need to test a class or a component without integrating some of its dependencies. Typical reasons for excluding dependencies in testing include the unavailability of some dependency due to concurrent software development and callbacks in frameworks, the high cost of invoking some dependencies (e.g., slow network or database operations, commercial third-party web services), and the potential interference of bugs in the dependencies. In practice, mock objects have been used in software testing to simulate such missing dependencies, and a number of popular mocking frameworks (e.g., Mockito, EasyMock) have been developed to let software testers generate mock objects more conveniently. However, despite the wide usage of mocking frameworks in software practice, there have been very few academic studies that observe and seek to understand how mocking frameworks are used and the major issues software testers face when using them. In this paper, we report on an empirical study on the usage of the four most popular mocking frameworks (Mockito, EasyMock, JMock, and JMockit) in 5,000 open source software projects from GitHub. The results of our study show that these mocking frameworks are used in a large portion (about 23%) of software projects that have test code. We also find that software testers typically create mocks for only part of the software dependencies, and that source code classes are mocked more often than library classes.
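For readers unfamiliar with these frameworks, the sketch below shows the typical Mockito workflow (create the mock, exercise the unit, verify the interaction); the Notifier and OrderService types are hypothetical.

    import org.junit.Test;
    import org.mockito.Mockito;

    interface Notifier { void send(String recipient, String message); }

    class OrderService {
        private final Notifier notifier;
        OrderService(Notifier notifier) { this.notifier = notifier; }
        void placeOrder(String customer) { notifier.send(customer, "order confirmed"); }
    }

    public class OrderServiceTest {
        @Test
        public void notifiesCustomerOnOrder() {
            Notifier notifier = Mockito.mock(Notifier.class);
            new OrderService(notifier).placeOrder("bob@example.com");
            // Verifies the interaction instead of asserting on state.
            Mockito.verify(notifier)
                   .send(Mockito.eq("bob@example.com"), Mockito.anyString());
        }
    }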
Book
Unit test frameworks are a key element of popular development methodologies such as eXtreme Programming (XP) and Agile Development. But unit testing has moved far beyond eXtreme Programming; it is now common in many different types of application development. Unit tests help ensure low-level code correctness, reduce software development cycle time, improve developer productivity, and produce more robust software. Until now, there was little documentation available on unit testing, and most sources addressed specific frameworks and specific languages, rather than explaining the use of unit testing as a language-independent, standalone development methodology. This invaluable new book covers the theory and background of unit test frameworks, offers step-by-step instruction in basic unit test development, provides useful code examples in both Java and C++, and includes details on some of the most commonly used frameworks today from the XUnit family, including JUnit for Java, CppUnit for C++, and NUnit for .NET. Unit Test Frameworks includes clear, concise, and detailed descriptions of: The theory and design of unit test frameworks Examples of unit tests and frameworks Different types of unit tests Popular unit test frameworks And more It also includes the complete source code for CppUnit for C++, and NUnit for .NET.
Conference Paper
Test-driven methodologies encourage testing early and often. "Mock objects" support this approach by allowing a component to be tested before all depended-upon components are available. Today mock objects typically reflect little to none of an object's intended functionality, which makes it difficult and error-prone for developers to test rich properties of their code. This paper presents "declarative mocking", which enables the creation of expressive and reliable mock objects with relatively little effort. In our approach, developers write method specifications in a high-level logical language for the API being mocked, and a constraint solver dynamically executes these specifications when the methods are invoked. In addition to mocking functionality, this approach seamlessly allows data and other aspects of the environment to be easily mocked. We have implemented the approach as an extension to an existing tool for executable specifications in Java called PBnJ. We have performed an exploratory study of declarative mocking on several existing Java applications, in order to understand the power of the approach and to categorize its potential benefits and limitations. We also performed an experiment to port the unit tests of several open-source applications from a widely used mocking library to PBnJ. We found that more than half of these unit tests can be enhanced, in terms of the strength of properties and coverage, by exploiting executable specifications, with relatively little additional developer effort.
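PBnJ itself is not shown here; as a rough illustration of the gap declarative mocking targets, the sketch below approximates a specification-driven stub with plain Mockito by computing the answer from a property of the input ("the result is a sorted copy of the argument") rather than enumerating one canned return value per expected call.

    import static org.mockito.Mockito.*;

    import java.util.Arrays;
    import org.mockito.invocation.InvocationOnMock;

    interface Sorter { int[] sort(int[] values); }

    public class DeclarativeStyleStubExample {
        public static void main(String[] args) {
            Sorter sorter = mock(Sorter.class);
            when(sorter.sort(any())).thenAnswer((InvocationOnMock inv) -> {
                int[] copy = ((int[]) inv.getArgument(0)).clone();
                Arrays.sort(copy);   // the "specification": output is the sorted input
                return copy;
            });
            System.out.println(Arrays.toString(sorter.sort(new int[]{3, 1, 2})));  // [1, 2, 3]
        }
    }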
Article
From the Book: In the spring of 1999 I flew to Chicago to consult on a project being done by ThoughtWorks, a small but rapidly growing application development company. The project was one of those ambitious enterprise application projects: a back-end leasing system. Essentially what this system does is to deal with everything that happens to a lease after you've signed on the dotted line. It has to deal with sending out bills, handling someone upgrading one of the assets on the lease, chasing people who don't pay their bills on time, and figuring out what happens when someone returns the assets early. That doesn't sound too bad until you realize that leasing agreements are infinitely varied and horrendously complicated. The business "logic" rarely fits any logical pattern, because after all it's written by business people to capture business, where odd small variations can make all the difference in winning a deal. Each of those little victories is yet more complexity to the system. That's the kind of thing that gets me excited: how to take all that complexity and come up with a system of objects that can make it more tractable. Indeed I believe that the primary benefit of objects is in making complex logic tractable. Developing a good Domain Model (133) for a complex business problem is difficult, but wonderfully satisfying. Yet that's not the end of the problem. Such a domain model has to be persisted to a database, and like many projects we were using a relational database. We also had to connect this model to a user interface, provide support to allow remote applications to use our software, and integrate our software with third-party packages. All of this on a new technology called J2EE, which nobody in the world had any real experience in using. Even though this technology was new, we did have the benefit of experience. I'd been doing this kind of thing for ages now with C++, Smalltalk, and CORBA. Many of the ThoughtWorkers had a lot of experience with Forte. We already had the key architectural ideas in our heads, we just had to figure out how to apply them to J2EE. Looking back on it three years later the design is not perfect, but it's stood the test of time pretty damn well. That's the kind of situation where this book comes in. Over the years I've seen many enterprise application projects. These projects often contain similar design ideas which have proven to be effective ways to deal with the inevitable complexity that enterprise applications possess. This book is a starting point to capture these design ideas as patterns. The book is organized in two parts. The first part is a set of narrative chapters on a number of important topics in the design of enterprise applications. They introduce various problems in the architecture of enterprise applications and their solutions. However, the narrative chapters don't go into much detail on these solutions. The details of the solutions are in the second part, organized as patterns. These patterns are a reference and I don't expect you to read them cover to cover. My intention is that you can read the narrative chapters in part one from start to finish to get a broad picture of what the book covers, then you can dip into the patterns chapters of part two as your interest and needs drive you. So this book is a short narrative book and a longer reference book combined into one. This is a book on enterprise application design.
Enterprise applications are about the display, manipulation and storage of large amounts of often complex data and the support or automation of business processes with that data. Examples include reservation systems, financial systems, supply chain systems, and many of the systems that run modern business. Enterprise applications have their own particular challenges and solutions. They are a different animal to embedded systems, control systems, telecoms, or desktop productivity software. So if you work in one of these other fields, there's nothing really in this book for you (unless you want to get a feel for what enterprise applications are like). For a general book on software architecture I'd recommend POSA. There are many architectural issues in building enterprise applications. I'm afraid this book can't be a comprehensive guide to them. In building software I'm a great believer in iterative development. At the heart of iterative development is the notion that you should deliver software as soon as you have something useful to the user, even if it's not complete. Although there are many differences between writing a book and writing software, this notion is one that I think the two share. So this book is an incomplete but (I trust) useful compendium of advice on enterprise application architecture. The primary topics I talk about are: layering of enterprise applications; how to structure domain (business) logic; the structure of a web user interface; how to link in-memory modules (particularly objects) to a relational database; how to handle session state in stateless environments; and some principles of distribution. The list of things I don't talk about is rather longer. I really fancied writing about organizing validation, incorporating messaging and asynchronous communication, security, error handling, clustering, application integration, architectural refactoring, structuring rich-client user interfaces, amongst others. I can only hope to see some patterns appear for this work in the near future. However due to space, time, and lack of cogitation you won't find them in this book. Perhaps I'll do a second volume someday and get into these topics, or maybe someone else will fill these, and other, gaps. Of these, dealing with message-based communication is a particularly big issue. Increasingly people who are integrating multiple applications are making use of asynchronous message-based communication approaches. There's much to be said for using them within an application as well. This book is not intended to be specific for any particular software platform. I first came across these patterns while working with Smalltalk, C++, and CORBA in the late 80's and early 90's. In the late 90's I started to do extensive work in Java and found these patterns applied well both to early Java/CORBA systems and later J2EE based work. More recently I've been doing some initial work with Microsoft's .NET platform and find the patterns apply again. My ThoughtWorks colleagues have also introduced their experiences, particularly with Forte. I can't claim generality across all platforms that ever have been or will be used for enterprise applications, but so far these patterns have shown enough recurrence to be useful. I have provided code examples for most of these patterns. My choice of language for the code examples is based on what I think most readers are likely to be able to read and understand. Java's a good choice here. Anyone who can read C or C++ can read Java, yet Java is much less complex than C++.
Essentially most C++ programmers can read Java but not vice versa. I'm an object bigot, so I inevitably lean to an OO language. As a result most of the code examples are in Java. As I was working on the book Microsoft started stabilizing their .NET environment, and their C# language has most of the same properties as Java for an author. So I did some of the code examples in C# as well, although that does introduce some risk since developers don't have much experience with .NET yet and so the idioms for using it well are less mature. Both are C-based languages so if you can read one you should be able to read both, even if you aren't deeply into that language or platform. My aim was to use a language that the largest number of software developers can read, even if it's not their primary or preferred language. (My apologies to those who like Smalltalk, Delphi, Visual Basic, Perl, Python, Ruby, COBOL or any other language. I know you think you know a better language than Java or C#, all I can say is I do too!) The examples are there for inspiration and explanation of the ideas in the patterns. They are not canned solutions; in all cases you'll need to do a fair bit of work to fit them into your application. Patterns are useful starting points, but they are not destinations. Who This Book Is For: I've written this book for programmers, designers, and architects who are building enterprise applications and who want to either improve their understanding of these architectural issues or improve their communication about them. I'm assuming that most of my readers will fall into two groups: either those with modest needs who are looking to build their own software to handle these issues, or readers with more demanding needs who will be using a tool. For those of modest needs, my intention is that these patterns should get you started. In many areas you'll need more than the patterns will give you, but my intention is to provide more of a head start in this field than I got. For tool users I hope this book will be useful to give you some idea of what's happening under the hood, but also help you in making choices between which of the tool-supported patterns to use. Using, say, an object-relational mapping tool still means you have to make decisions about how to map certain situations. Reading the patterns should give you some guidance in making the choices. There is a third category, those with demanding needs who want to build their own software for these problems. The first thing I'd say here is look carefully at using tools. I've seen more than one project get sucked into a long exercise at building frameworks which weren't what the project was really about. If you're still convinced, go ahead. Remember in this case that many of the code examples in this book are deliberately simplified to help understanding, and you'll find you'll need to do a lot of tweaking to handle the greater demands that you'll face. Since patterns are common solutions to recurring problems, there's a good chance that you'll have already come across some of them. If you've been working in enterprise applications for a while, you may well know most of them. I'm not claiming to have anything new in this book. Indeed I claim the opposite—this is a book of (for our industry) old ideas. If you're new to this field I hope you'll like this book to help you learn about these techniques. If you're more familiar with the techniques I hope you'll like this book because it helps you communicate and teach these ideas to others.
An important part of patterns is trying to build a common vocabulary, so you can say that this class is a Remote Facade (413) and other designers know what you mean. Martin Fowler, Melrose MA, May 2002 http://martinfowler.com
Article
Automated testing is a cornerstone of agile development. An effective testing strategy will deliver new functionality more aggressively, accelerate user feedback, and improve quality. However, for many developers, creating effective automated tests is a unique and unfamiliar challenge.xUnit Test Patterns is the definitive guide to writing automated tests using xUnit, the most popular unit testing framework in use today. Agile coach and test automation expert Gerard Meszaros describes 68 proven patterns for making tests easier to write, understand, and maintain. He then shows you how to make them more robust and repeatable--and far more cost-effective.Loaded with information, this book feels like three books in one. The first part is a detailed tutorial on test automation that covers everything from test strategy to in-depth test coding. The second part, a catalog of 18 frequently encountered "test smells," provides trouble-shooting guidelines to help you determine the root cause of problems and the most applicable patterns. The third part contains detailed descriptions of each pattern, including refactoring instructions illustrated by extensive code samples in multiple programming languages.Topics covered include Writing better tests--and writing them faster The four phases of automated tests: fixture setup, exercising the system under test, result verification, and fixture teardown Improving test coverage by isolating software from its environment using Test Stubs and Mock Objects Designing software for greater testability Using test "smells" (including code smells, behavior smells, and project smells) to spot problems and know when and how to eliminate them Refactoring tests for greater simplicity, robustness, and execution speedThis book will benefit developers, managers, and testers working with any agile or conventional development process, whether doing test-driven development or writing the tests last. 
While the patterns and smells are especially applicable to all members of the xUnit family, they also apply to next-generation behavior-driven development frameworks such as RSpec and JBehave and to other kinds of test automation tools, including recorded test tools and data-driven test tools such as Fit and FitNesse. Contents: Visual Summary of the Pattern Language; Foreword; Preface; Acknowledgments; Introduction; Refactoring a Test. PART I: The Narratives - Chapter 1 A Brief Tour; Chapter 2 Test Smells; Chapter 3 Goals of Test Automation; Chapter 4 Philosophy of Test Automation; Chapter 5 Principles of Test Automation; Chapter 6 Test Automation Strategy; Chapter 7 xUnit Basics; Chapter 8 Transient Fixture Management; Chapter 9 Persistent Fixture Management; Chapter 10 Result Verification; Chapter 11 Using Test Doubles; Chapter 12 Organizing Our Tests; Chapter 13 Testing with Databases; Chapter 14 A Roadmap to Effective Test Automation. PART II: The Test Smells - Chapter 15 Code Smells; Chapter 16 Behavior Smells; Chapter 17 Project Smells. PART III: The Patterns - Chapter 18 Test Strategy Patterns; Chapter 19 xUnit Basics Patterns; Chapter 20 Fixture Setup Patterns; Chapter 21 Result Verification Patterns; Chapter 22 Fixture Teardown Patterns; Chapter 23 Test Double Patterns; Chapter 24 Test Organization Patterns; Chapter 25 Database Patterns; Chapter 26 Design-for-Testability Patterns; Chapter 27 Value Patterns. PART IV: Appendixes - Appendix A Test Refactorings; Appendix B xUnit Terminology; Appendix C xUnit Family Members; Appendix D Tools; Appendix E Goals and Principles; Appendix F Smells, Aliases, and Causes; Appendix G Patterns, Aliases, and Variations; Glossary; References; Index.
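The four-phase test structure the book is organized around (fixture setup, exercise the system under test, result verification, fixture teardown) looks like this in a deliberately trivial JUnit example:

    import java.util.ArrayList;
    import java.util.List;
    import org.junit.Assert;
    import org.junit.Test;

    public class FourPhaseTestExample {
        @Test
        public void addedElementIsStored() {
            // 1. Fixture setup
            List<String> list = new ArrayList<>();
            // 2. Exercise the system under test
            list.add("element");
            // 3. Result verification
            Assert.assertTrue(list.contains("element"));
            // 4. Fixture teardown (implicit here; the transient fixture is simply discarded)
        }
    }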
Article
Social desirability is one of the most common sources of bias affecting the validity of experimental and survey research findings. From a self-presentational perspective, social desirability can be regarded as the resultant of two separate factors: self-deception and other-deception. Two main modes of coping with social desirability bias are distinguished. The first mode comprises two methods aimed at the detection and measurement of social desirability bias: the use of social desirability scales, and the rating of item desirability. A second category comprises seven methods to prevent or reduce social desirability bias, including the use of forced-choice items, the randomized response technique, the bogus pipeline, self-administration of the questionnaire, the selection of interviewers, and the use of proxy subjects. Not one method was found to excel completely and under all conditions in coping with both other-deceptive and self-deceptive social desirability bias. A combination of prevention and detection methods offers the best choice available.
Article
This paper describes a graph-theoretic complexity measure and illustrates how it can be used to manage and control program complexity. The paper first explains how the graph-theory concepts apply and gives an intuitive explanation of the graph concepts in programming terms. The control graphs of several actual Fortran programs are then presented to illustrate the correlation between intuitive complexity and the graph-theoretic complexity. Several properties of the graph-theoretic complexity are then proved which show, for example, that complexity is independent of physical size (adding or subtracting functional statements leaves complexity unchanged) and complexity depends only on the decision structure of a program.
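For reference, the measure for a control-flow graph with E edges, N nodes, and P connected components is v(G) = E - N + 2P; for a single-entry, single-exit routine this equals the number of binary decision points plus one. For example, a method whose graph has 9 edges, 8 nodes, and 1 component has v(G) = 9 - 8 + 2 = 3, i.e., three linearly independent paths.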
Article
Table-of-contents excerpt: Deep Models; Refactoring and Distillation; 17. Large-Scale Structure; Evolving Order; System Metaphor [Beck 2000]; Pluggable Components; Abstract Domain Framework; Responsibility Layers; Large-Scale Structure, Unification Contexts, and Distillation; Refactoring Toward a Fitting Structure; Architecture, Architecture Teams, and Large-Scale Structure; 18. Game Plans; Looking Forward.
Practical Unit Testing with TestNG and Mockito
  • T Kaczanowski
T. Kaczanowski. Practical Unit Testing with TestNG and Mockito. Tomasz Kaczanowski, 2012.
Mocking embedded hardware for software validation
  • S S Kim
S. S. Kim. Mocking embedded hardware for software validation. PhD thesis, 2016.
Article picture sorts and item sorts
  • G Rugg
G. Rugg. Article picture sorts and item sorts. Computing, 22(3), 2005.
Card sorting: a definitive guide
  • D Spencer
D. Spencer. Card sorting: a definitive guide.
Declarative Mocking
  • H Samimi
  • R Hicks
  • A Fogel
  • T Millstein
To Mock or Not To Mock? Online Appendix
To Mock or Not To Mock? Online Appendix. https://doi.org/10.4121/uuid:fce8653c-344c-4dcb-97ab-c9c1407ad2f0.
Endo-testing: unit testing with mock objects. Extreme programming examined
  • T Mackinnon
  • S Freeman
  • P Craig
T. Mackinnon, S. Freeman, and P. Craig. Endo-testing: unit testing with mock objects. Extreme programming examined, pages 287-301, 2001.