Predicting Bugs' Components via Mining Bug Reports

Computing Research Repository - CORR 10/2010; 7(5). DOI: 10.4304/jsw.7.5.1149-1154
Source: arXiv

ABSTRACT The number of bug reports in complex software is increasing
dramatically. Because bugs are currently triaged manually, bug triage (or
assignment) is a labor-intensive and time-consuming task. Without knowledge of
the structure of the software, testers often specify the wrong component for a
new bug. Meanwhile, it is difficult for triagers to determine the component of
a bug from its description alone. We find that the components of 28,829 bugs in
the Eclipse bug project were specified wrongly and modified at least once. As a
result, these bugs had to be reassigned, which delayed the bug-fixing process.
The average time to fix wrongly specified bugs is longer than that for
correctly specified ones. To address the problem automatically, we use
historical fixed bug reports as a training corpus and build classifiers based
on support vector machines and Naïve Bayes to predict the component of a new
bug. The best prediction accuracy reaches 81.21% on our validation corpus from
the Eclipse project. On average, our predictive model can save triagers and
developers about 54.3 days in repairing a bug. Keywords: bug reports; bug
triage; text classification; predictive model
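
The classification step named in the abstract (training on historical bug reports, then predicting the component of a new report) can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain multinomial Naïve Bayes with add-one smoothing, written against the Python standard library only. The class name, the tokenizer, the toy reports, and the component labels "UI" and "Debug" are all hypothetical and are not taken from the Eclipse corpus.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a bug description and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesComponentClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, reports, components):
        # Count reports per component (for priors) and words per component.
        self.doc_counts = Counter(components)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, component in zip(reports, components):
            tokens = tokenize(text)
            self.word_counts[component].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(components)
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence, so drop them.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        vocab_size = len(self.vocabulary)
        best_component, best_score = None, -math.inf
        for component, n_docs in self.doc_counts.items():
            counts = self.word_counts[component]
            total = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total + vocab_size))
            if score > best_score:
                best_component, best_score = component, score
        return best_component


# Hypothetical toy training data; not taken from the Eclipse corpus.
train_reports = [
    "Editor crashes when opening large Java file",
    "Syntax highlighting wrong in Java editor",
    "Debugger does not stop at breakpoint",
    "Breakpoint ignored while stepping in debugger",
]
train_components = ["UI", "UI", "Debug", "Debug"]

model = NaiveBayesComponentClassifier().fit(train_reports, train_components)
predicted = model.predict("breakpoint not hit in debugger session")
```

A real pipeline would likely add stop-word removal, term weighting, and a held-out validation split; the abstract does not specify these details, so they are left out here.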

  • Source
    ABSTRACT: Software maintenance starts as soon as the first artifacts are delivered and is essential for the success of the software. However, keeping maintenance activities and their related artifacts on track comes at a high cost. In this respect, change request (CR) repositories are fundamental in software maintenance. They facilitate the management of CRs and are also the central point to coordinate activities and communication among stakeholders. However, the benefits of CR repositories do not come without issues, and commonly occurring ones should be dealt with, such as the following: duplicate CRs, the large number of CRs to assign, or poorly described CRs. Such issues have led researchers to an increased interest in investigating CR repositories, by considering different aspects of software development and CR management. In this paper, we performed a systematic mapping study to characterize this research field. We analyzed 142 studies, which we classified in two ways. First, we classified the studies into different topics and grouped them into two dimensions: challenges and opportunities. Second, the challenge topics were classified in accordance with an existing taxonomy for information retrieval models. In addition, we investigated tools and services for CR management, to understand whether and how they addressed the topics identified. Copyright © 2013 John Wiley & Sons, Ltd.
    Journal of Software: Evolution and Process 12/2013; 1.27 Impact Factor
