Conference Paper

Golden implementation driven software debugging.

DOI: 10.1145/1882291.1882319
Conference: Proceedings of the 18th ACM SIGSOFT International Symposium on Foundations of Software Engineering, Santa Fe, NM, USA, November 7-11, 2010
Source: DBLP

ABSTRACT The presence of a functionally correct golden implementation offers a significant advantage in the software development life cycle. Such a golden implementation is exploited for software development in several domains, including embedded software, where the program under development is a less resource-consuming version of the golden implementation. The golden implementation specifies the functionality that the program is supposed to implement, and is used as a guide during the software development process. In this paper, we investigate the possibility of using the golden implementation as a reference model in software debugging. We perform a substantial case study involving the Busybox embedded Linux utilities while treating the GNU Core Utilities as the golden or reference implementation. Our debugging method consists of dynamic slicing with respect to the observable error in both implementations (the golden implementation as well as the buggy software). During dynamic slicing we also perform a step-by-step weakest precondition computation of the observable error with respect to the statements in the dynamic slice. The formulae computed as weakest preconditions in the two implementations are then compared to accurately locate the root cause of a given observable error. Experimental results obtained from Busybox suggest that our method performs well in practice and is able to pinpoint all the bugs recently published in [8] that could be reproduced on Busybox version 1.4.2. The bug report produced by our approach is concise and pinpoints the program locations inside the Busybox source that contribute to the difference in behavior.
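
The weakest precondition comparison described in the abstract can be illustrated with a toy sketch. This is not the authors' tool: it assumes each dynamic slice has already been reduced to a straight-line list of assignments, that both implementations use the same variable names, and that formula equivalence can be checked by brute force over a small integer domain. All function names and the two example traces are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast
from itertools import product


class _Subst(ast.NodeTransformer):
    """Replace every occurrence of one variable with an expression."""

    def __init__(self, var, expr):
        self.var = var
        self.expr = expr

    def visit_Name(self, node):
        if node.id == self.var:
            # The substituted expression is not visited again, so
            # self-referencing assignments such as x = x + 1 are safe.
            return ast.parse(self.expr, mode="eval").body
        return node


def wp_assign(var, expr, post):
    """wp(var := expr, post) is post with expr substituted for var."""
    tree = _Subst(var, expr).visit(ast.parse(post, mode="eval"))
    return ast.unparse(ast.fix_missing_locations(tree))


def wp_chain(trace, post):
    """Walk a trace of (var, expr) pairs backwards, recording each formula."""
    formulas = [post]
    for var, expr in reversed(trace):
        formulas.append(wp_assign(var, expr, formulas[-1]))
    return formulas


def free_vars(formula):
    tree = ast.parse(formula, mode="eval")
    return sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})


def equivalent(f, g, variables, domain=range(-3, 4)):
    """Brute-force equivalence over a small domain (toy assumption).

    Uses eval, so formulas must be trusted input.
    """
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if eval(f, {}, env) != eval(g, {}, env):
            return False
    return True


def first_divergence(golden, buggy, post):
    """Return the buggy statement after which the two formulas first differ."""
    g, b = wp_chain(golden, post), wp_chain(buggy, post)
    for step in range(1, min(len(g), len(b))):
        names = sorted(set(free_vars(g[step])) | set(free_vars(b[step])))
        if not equivalent(g[step], b[step], names):
            return buggy[len(buggy) - step]
    return None


# Hypothetical sliced traces: the buggy version drops the "+ 1".
golden_trace = [("w", "n + 1"), ("out", "w * 2")]
buggy_trace = [("w", "n"), ("out", "w * 2")]
suspect = first_divergence(golden_trace, buggy_trace, "out > 4")
```

Here `suspect` is the assignment to `w` in the buggy trace, since that is the first statement (walking backwards from the observable condition) after which the two formulae stop agreeing. A realistic implementation would need a constraint solver for the equivalence check and a way to align statements and variables across the two programs.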


Available from: Abhik Roychoudhury, May 31, 2015
  • ABSTRACT: Multiple tools can assist developers when debugging programs, but only a few solutions specifically target the common case of regression failures and thus provide more focused and effective debugging support. In this paper we present RADAR, a tool that combines change identification and dynamic analysis to automatically explain regression problems with a list of suspicious differences in the behavior of the base and upgraded versions of a program. The output produced by the tool is particularly helpful for understanding why an application failed. A demo video is available at
    2013 35th International Conference on Software Engineering (ICSE); 01/2013
  • ABSTRACT: Delta debugging has been proposed to isolate failure-inducing changes when regressions occur. In this work, we focus on evaluating delta debugging in practical settings from the developers' perspective. A collection of real regressions taken from medium-sized open source programs is used in our evaluation. Towards automated debugging in software evolution, a tool based on delta debugging is created, and both its limitations and costs are discussed. We have evaluated two variants of delta debugging. Unlike the successful isolation in Zeller's initial studies, the results in our experiments vary widely. Two-thirds of the isolated changes in the studied programs provide direct or indirect clues for locating regression bugs. The remaining results are superfluous changes or even wrong isolations. In the case of wrong isolations, the isolated changes reproduce the behaviour of the regression but are irrelevant to the failure. Moreover, the hierarchical variant does not yield definite improvements in efficiency or accuracy.
    Journal of Systems and Software 10/2012; 85(10):2305-2317. DOI:10.1016/j.jss.2011.10.016
  • ABSTRACT: Despite indisputable progress, automated debugging methods still face difficulties in terms of scalability and runtime efficiency. To reach large-scale projects, we propose an approach that reports small sets of suspicious code changes. Its essential strength is that the size of these reports is proportional to the amount of change between code commits, and not to the total project size. Our method combines version comparison and information on failed tests with static and dynamic analysis. We evaluate our method on real bugs from Apache Hadoop, an open source project with over 2 million LOC. In 2 out of 4 cases, the set of suspects produced by our approach contains exactly the location of the defective code (and no false positives). Another defect could be pinpointed with small extensions to the approach. Moreover, the time overhead of our approach is moderate, namely 3-4 times the duration of a failed software test.
    2013 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW); 01/2013