Michael D. Ernst

University of Washington Seattle, Seattle, Washington, United States

Publications (161), 11.79 Total impact

  • René Just, Michael D. Ernst, Gordon Fraser
    ABSTRACT: Mutation analysis evaluates a testing technique by measuring how well it detects seeded faults (mutants). Mutation analysis is hampered by inherent scalability problems: a test suite is executed for each of a large number of mutants. Despite numerous optimizations presented in the literature, this scalability issue remains, and this is one of the reasons why mutation analysis is hardly used in practice. Whereas most previous optimizations attempted to statically reduce the number of executions or their computational overhead, this paper exploits information available only at run time to further reduce the number of executions. First, state infection conditions can reveal, with a single test execution of the unmutated program, which mutants would lead to a different state, thus avoiding unnecessary test executions. Second, determining whether an infected execution state propagates can further reduce the number of executions. Mutants that are embedded in compound expressions may infect the state locally without affecting the outcome of the compound expression. Third, those mutants that do infect the state can be partitioned based on the resulting infected state: if two mutants lead to the same infected state, only one needs to be executed as the result of the other can be inferred. We have implemented these optimizations in the Major mutation framework and empirically evaluated them on 14 open source programs. The optimizations reduced the mutation analysis time by 40% on average.
    07/2014;
  • ABSTRACT: In a test suite, all the test cases should be independent: no test should affect any other test’s result, and running the tests in any order should produce the same test results. Techniques such as test prioritization generally assume that the tests in a suite are independent. Test dependence is a little-studied phenomenon. This paper presents five results related to test dependence. First, we characterize the test dependence that arises in practice. We studied 96 real-world dependent tests from 5 issue tracking systems. Our study shows that test dependence can be hard for programmers to identify. It also shows that test dependence can cause non-trivial consequences, such as masking program faults and leading to spurious bug reports. Second, we formally define test dependence in terms of test suites as ordered sequences of tests along with explicit environments in which these tests are executed. We formulate the problem of detecting dependent tests and prove that a useful special case is NP-complete. Third, guided by the study of real-world dependent tests, we propose and compare four algorithms to detect dependent tests in a test suite. Fourth, we applied our dependent test detection algorithms to 4 real-world programs and found dependent tests in each human-written and automatically-generated test suite. Fifth, we empirically assessed the impact of dependent tests on five test prioritization techniques. Dependent tests affect the output of all five techniques; that is, the reordered suite fails even though the original suite did not.
    07/2014;
  • René Just, Darioush Jalali, Michael D. Ernst
    07/2014;
  • ABSTRACT: Concurrent systems are notoriously difficult to debug and understand. A common way of gaining insight into system behavior is to inspect execution logs and documentation. Unfortunately, manual inspection of logs is an arduous process, and documentation is often incomplete and out of sync with the implementation. To provide developers with more insight into concurrent systems, we developed CSight. CSight mines logs of a system's executions to infer a concise and accurate model of that system's behavior, in the form of a communicating finite state machine (CFSM). Engineers can use the inferred CFSM model to understand complex behavior, detect anomalies, debug, and increase confidence in the correctness of their implementations. CSight's only requirement is that the logged events have vector timestamps. We provide a tool that automatically adds vector timestamps to system logs. Our tool prototypes are available at http://synoptic.googlecode.com/. This paper presents algorithms for inferring CFSM models from traces of concurrent systems, proves them correct, provides an implementation, and evaluates the implementation in two ways: by running it on logs from three different networked systems and via a user study that focused on bug finding. Our evaluation finds that CSight infers accurate models that can help developers find bugs.
    05/2014;
  • ABSTRACT: Contracts are a popular tool for specifying the functional behavior of software. This paper characterizes the contracts that developers write, the contracts that developers could write, and how a developer reacts when shown the difference. This paper makes three research contributions based on an investigation of open-source projects' use of Code Contracts. First, we characterize Code Contract usage in practice. For example, approximately three-fourths of the Code Contracts are basic checks for the presence of data. We discuss similarities and differences in usage across the projects, and we identify annotation burden, tool support, and training as possible explanations based on developer interviews. Second, based on contracts automatically inferred for four of the projects, we find that developers underutilize contracts for expressing state updates, object state indicators, and conditional properties. Third, we performed user studies to learn how developers decide which contracts to enforce. The developers used contract suggestions to support their existing use cases with more expressive contracts. However, the suggestions did not lead them to experiment with other use cases for which contracts are better-suited. In support of the research contributions, the paper presents two engineering contributions: (1) Celeriac, a tool for generating traces of .NET programs compatible with the Daikon invariant detection tool, and (2) Contract Inserter, a Visual Studio add-in for discovering and inserting likely invariants as Code Contracts.
    05/2014;
  • ABSTRACT: In a distributed system, the hosts execute concurrently, generating asynchronous logs that are challenging to comprehend. We present two tools: ShiVector to transparently add vector timestamps to distributed system logs, and ShiViz to help developers understand distributed system logs by visualizing them as space-time diagrams. ShiVector is the first tool to offer automated vector timestamp instrumentation without modifying source code. The vector-timestamped logs capture partial ordering information, useful for analysis and comprehension. ShiViz space-time diagrams are simple to understand and interactive: the user can explore the log through the visualization to understand complex system behavior. We applied ShiVector and ShiViz to two systems and found that they aid developers in understanding and debugging.
    05/2014;
  • Sai Zhang, Michael D. Ernst
    ABSTRACT: Modern software often exposes configuration options that enable users to customize its behavior. During software evolution, developers may change how the configuration options behave. When upgrading to a new software version, users may need to re-configure the software by changing the values of certain configuration options. This paper addresses the following question during the evolution of a configurable software system: which configuration options should a user change to maintain the software's desired behavior? This paper presents a technique (and its tool implementation, called ConfSuggester) to troubleshoot configuration errors caused by software evolution. ConfSuggester uses dynamic profiling, execution trace comparison, and static analysis to link the undesired behavior to its root cause: a configuration option whose value can be changed to produce desired behavior from the new software version. We evaluated ConfSuggester on 8 configuration errors from 6 configurable software systems written in Java. For 6 errors, the root-cause configuration option was ConfSuggester's first suggestion. For 1 error, the root cause was ConfSuggester's third suggestion. The root cause of the remaining error was ConfSuggester's sixth suggestion. Overall, ConfSuggester produced significantly better results than two existing techniques. ConfSuggester runs in just a few minutes, making it an attractive alternative to manual debugging.
    05/2014;
  • ABSTRACT: Students and faculty alike at all education levels are clearly spending much more of their time interacting with computing and communication tools than with each other. Is this good? Are all uses of computational technology in education helpful, and ...
    Proceedings of the 45th ACM technical symposium on Computer science education; 03/2014
  • Brian Burg, Richard Bailey, Andrew J. Ko, Michael D. Ernst
    ABSTRACT: During debugging, a developer must repeatedly and manually reproduce faulty behavior in order to inspect different facets of the program's execution. Existing tools for reproducing such behaviors prevent the use of debugging aids such as breakpoints and logging, and are not designed for interactive, random-access exploration of recorded behavior. This paper presents Timelapse, a tool for quickly recording, reproducing, and debugging interactive behaviors in web applications. Developers can use Timelapse to browse, visualize, and seek within recorded program executions while simultaneously using familiar debugging tools such as breakpoints and logging. Testers and end-users can use Timelapse to demonstrate failures in situ and share recorded behaviors with developers, improving bug report quality by obviating the need for detailed reproduction steps. Timelapse is built on Dolos, a novel record/replay infrastructure that ensures deterministic execution by capturing and reusing program inputs both from the user and from external sources such as the network. Dolos introduces negligible overhead and does not interfere with breakpoints and logging. In a small user evaluation, participants used Timelapse to accelerate existing reproduction activities, but were not significantly faster or more successful in completing the larger tasks at hand. Together, the Dolos infrastructure and Timelapse developer tool support systematic bug reporting and debugging practices.
    Proceedings of the 26th annual ACM symposium on User interface software and technology; 10/2013
  • ABSTRACT: Developers use analysis tools to help write, debug, and understand software systems under development. A developer's change to the system source code may affect analysis results. Typically, to learn those effects, the developer must explicitly initiate the analysis. This may interrupt the developer's workflow and/or lengthen the delay until the developer learns the implications of the change. The situation is even worse for impure analyses (ones that modify the code on which they run) because such analyses block the developer from working on the code. This paper presents Codebase Replication, a novel approach to easily convert an offline analysis, even an impure one, into a continuous analysis that informs the developer of the implications of recent changes as quickly as possible after the change is made. Codebase Replication copies the developer's codebase, incrementally keeps this copy codebase in sync with the developer's codebase, makes that copy codebase available for offline analyses to run without disturbing the developer and without the developer's changes disturbing the analyses, and makes analysis results available to be presented to the developer. We have implemented Codebase Replication in Solstice, an open-source, publicly-available Eclipse plug-in. We have used Solstice to convert three offline analyses (FindBugs, PMD, and unit testing) into continuous ones. Each conversion required on average 436 NCSL and took, on average, 18 hours. Solstice-based analyses experience no more than 2.5 milliseconds of runtime overhead per developer action.
    Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering; 08/2013
  • Sai Zhang, Hao Lü, Michael D. Ernst
    ABSTRACT: A workflow is a sequence of UI actions to complete a specific task. In the course of a GUI application's evolution, changes ranging from a simple GUI refactoring to a complete rearchitecture can break an end-user's well-established workflow. It can be challenging to find a replacement workflow. To address this problem, we present a technique (and its tool implementation, called FlowFixer) that repairs a broken workflow. FlowFixer uses dynamic profiling, static analysis, and random testing to suggest a replacement UI action that fixes a broken workflow. We evaluated FlowFixer on 16 broken workflows from 5 real-world GUI applications written in Java. In 13 workflows, the correct replacement action was FlowFixer's first suggestion. In 2 workflows, the correct replacement action was FlowFixer's second suggestion. The remaining workflow was un-repairable. Overall, FlowFixer produced significantly better results than two alternative approaches.
    Proceedings of the 2013 International Symposium on Software Testing and Analysis; 07/2013
  • ABSTRACT: Most graphical user interface (GUI) libraries forbid accessing UI elements from threads other than the UI event loop thread. Violating this requirement leads to a program crash or an inconsistent UI. Unfortunately, such errors are all too common in GUI programs. We present a polymorphic type and effect system that prevents non-UI threads from accessing UI objects or invoking UI-thread-only methods. The type system still permits non-UI threads to hold and pass references to UI objects. We implemented this type system for Java and annotated 8 Java programs (over 140KLOC) for the type system, including several of the most popular Eclipse plugins. We confirmed bugs found by unsound prior work, found an additional bug and code smells, and demonstrated that the annotation burden is low. We also describe code patterns our effect system handles less gracefully or not at all, which we believe offers lessons for those applying other effect systems to existing code.
    Proceedings of the 27th European Conference on Object-Oriented Programming (ECOOP'13); 07/2013
  • Colin S Gordon, Michael D Ernst, Dan Grossman
    ABSTRACT: Reasoning about side effects and aliasing is the heart of verifying imperative programs. Unrestricted side effects through one reference can invalidate assumptions about an alias. We present a new type system approach to reasoning about safe assumptions in the presence of aliasing and side effects, unifying ideas from reference immutability type systems and rely-guarantee program logics. Our approach, rely-guarantee references, treats multiple references to shared objects similarly to multiple threads in rely-guarantee program logics. We propose statically associating rely and guarantee conditions with individual references to shared objects. Multiple aliases to a given object may coexist only if the guarantee condition of each alias implies the rely condition for all other aliases. We demonstrate that existing reference immutability type systems are special cases of rely-guarantee references. In addition to allowing precise control over state modification, rely-guarantee references allow types to depend on mutable data while still permitting flexible aliasing. Dependent types whose denotation is stable over the actions of the rely and guarantee conditions for a reference and its data will not be invalidated by any action through any alias. We demonstrate this with refinement (subset) types that may depend on mutable data. As a special case, we derive the first reference immutability type system with dependent types over immutable data. We show soundness for our approach and describe experience using rely-guarantee references in a dependently-typed monadic DSL in Coq.
    Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'13); 06/2013
  • 04/2013;
  • René Just, Michael D. Ernst, Gordon Fraser
    ABSTRACT: Mutation analysis evaluates test suites and testing techniques by measuring how well they detect seeded defects (mutants). Even though well established in research, mutation analysis is rarely used in practice due to scalability problems: there are multiple mutations per code statement leading to a large number of mutants, and hence executions of the test suite. In addition, the use of mutation to improve test suites is futile for mutants that are equivalent, which means that there exists no test case that distinguishes them from the original program. This paper introduces two optimizations based on state infection conditions, i.e., conditions that determine for a test execution whether the same execution on a mutant would lead to a different state. First, redundant test execution can be avoided by monitoring state infection conditions, leading to an overall performance improvement. Second, state infection conditions can aid in identifying equivalent mutants, thus guiding efforts to improve test suites.
    03/2013;
  • Colin S Gordon, Michael D Ernst, Dan Grossman
    03/2013;
  • Conference Paper: Immutability
    Alex Potanin, Johan Östlund, Yoav Zibin, Michael D. Ernst
    ABSTRACT: One of the main reasons aliasing has to be controlled, as highlighted in another chapter [1] of this book [2], is the possibility that a variable can unexpectedly change its value without the referrer's knowledge. This book will not be complete without a discussion of the impact of immutability on reference-abundant imperative object-oriented languages. In this chapter we briefly survey possible definitions of immutability and present recent work by the authors on adding immutability to object-oriented languages and how it impacts aliasing.
    Aliasing in Object-Oriented Programming; 01/2013
  • Todd W. Schiller, Michael D. Ernst
    ABSTRACT: Formally verifying a program requires significant skill not only because of complex interactions between program subcomponents, but also because of deficiencies in current verification interfaces. These skill barriers make verification economically unattractive by preventing the use of less-skilled (less-expensive) workers and distributed workflows (i.e., crowdsourcing). This paper presents VeriWeb, a web-based IDE for verification that decomposes the task of writing verifiable specifications into manageable subproblems. To overcome the information loss caused by task decomposition, and to reduce the skill required to verify a program, VeriWeb incorporates several innovative user interface features: drag and drop condition construction, concrete counterexamples, and specification inlining. To evaluate VeriWeb, we performed three experiments. First, we show that VeriWeb lowers the time and monetary cost of verification by performing a comparative study of VeriWeb and a traditional tool using 14 paid subjects contracted hourly from Exhedra Solution's vWorker online marketplace. Second, we demonstrate the dearth and insufficiency of current ad-hoc labor marketplaces for verification by recruiting workers from Amazon's Mechanical Turk to perform verification with VeriWeb. Finally, we characterize the minimal communication overhead incurred when VeriWeb is used collaboratively by observing two pairs of developers each use the tool simultaneously to verify a single program.
    Proceedings of the ACM international conference on Object oriented programming systems languages and applications; 11/2012
  • ABSTRACT: Modern integrated development environments make recommendations and automate common tasks, such as refactorings, auto-completions, and error corrections. However, these tools present little or no information about the consequences of the recommended changes. For example, a rename refactoring may: modify the source code without changing program semantics; modify the source code and (incorrectly) change program semantics; modify the source code and (incorrectly) create compilation errors; show a name collision warning and require developer input; or show an error and not change the source code. Having to compute the consequences of a recommendation (either mentally or by making source code changes) puts an extra burden on the developers. This paper aims to reduce this burden with a technique that informs developers of the consequences of code transformations. Using Eclipse Quick Fix as a domain, we describe a plug-in, Quick Fix Scout, that computes the consequences of Quick Fix recommendations. In our experiments, developers completed compilation-error removal tasks 10% faster when using Quick Fix Scout than Quick Fix, although the sample size was not large enough to show statistical significance.
    Proceedings of the ACM international conference on Object oriented programming systems languages and applications; 11/2012
  • Wei Huang, Ana Milanova, Werner Dietl, Michael D. Ernst
    ABSTRACT: Reference immutability ensures that a reference is not used to modify the referenced object, and enables the safe sharing of object structures. A pure method does not cause side-effects on the objects that existed in the pre-state of the method execution. Checking and inference of reference immutability and method purity enables a variety of program analyses and optimizations. We present ReIm, a type system for reference immutability, and ReImInfer, a corresponding type inference analysis. The type system is concise and context-sensitive. The type inference analysis is precise and scalable, and requires no manual annotations. In addition, we present a novel application of the reference immutability type system: method purity inference. To support our theoretical results, we implemented the type system and the type inference analysis for Java. We include a type checker to verify the correctness of the inference result. Empirical results on Java applications and libraries of up to 348kLOC show that our approach achieves both scalability and precision.
    Proceedings of the ACM international conference on Object oriented programming systems languages and applications; 10/2012

Publication Stats

4k Citations
11.79 Total Impact Points

Institutions

  • 1997–2014
    • University of Washington Seattle
      • Department of Computer Science and Engineering
      Seattle, Washington, United States
  • 2012
    • Rensselaer Polytechnic Institute
      Troy, New York, United States
  • 2003–2009
    • Massachusetts Institute of Technology
      • Computer Science and Artificial Intelligence Laboratory
      Cambridge, MA, United States
    • Distributed Artificial Intelligence Laboratory
      Berlin, Berlin, Germany