Prest: An Intelligent Software Metrics Extraction, Analysis and Defect
Prediction Tool
Ekrem Kocagüneli1, Ayşe Tosun1, Ayşe Bener1, Burak Turhan2, Bora Çağlayan1
1Software Research Laboratory (Softlab), Computer Engineering Department, Boğaziçi
University, Turkey
2National Research Council (NRC), Canada
ekrem.kocaguneli@boun.edu.tr, ayse.tosun@boun.edu.tr, bener@boun.edu.tr,
Burak.Turhan@nrc-cnrc.gc.ca, bora.caglayan@boun.edu.tr
Abstract
Test managers use intelligent predictors to increase testing efficiency and to decide when to stop testing. However, such predictors are impractical to use in an industrial setting unless the measurement and prediction processes are automated. Prest, an open source tool, aims to address this problem. Compared to other open source prediction and analysis tools, Prest is unique in that it collects source code metrics and call graphs in five different programming languages and performs learning-based defect prediction and analysis. So far, Prest has helped companies in real-life industry projects achieve an average 32% increase in testing efficiency.
1. Introduction
The role of software measurement has become increasingly important for understanding and controlling mature software development practices and products [1]. Software measurement helps to evaluate software quality by measuring the error-proneness of software modules, since residual defects in the software affect the final quality. However, measurement programs cannot be easily employed in software companies [2]. Tool support is needed to analyze the quality of software using various source code metrics [3, 7]. Many researchers have also been working on building predictive models, such as defect prediction and cost/effort estimation models, which need raw data, i.e. regular measurements of software attributes [4, 24, 25, 26, 27, 28]. This research has significant implications in practice as well: predictive models support practitioners in taking critical decisions under uncertainty, and automated tools help researchers and practitioners measure software artifacts transparently to coders [4, 24, 25, 26, 27, 28]. The current software development environment, on the other hand, is complex in that multiple platforms (i.e. hardware and software) as well as multiple programming languages have to co-exist. Therefore, any automated code measurement and analysis tool should address the issue of heterogeneity in software systems.
Several measurement and analysis tools exist, provided either as commercial off-the-shelf (COTS) products [5, 22, 23] or as open source tools [6, 15, 19, 20, 21]. COTS tools provide an extensive set of metrics and functionalities; however, they are not always affordable, and their output formats cannot be easily integrated with other measurement and analysis tools. Open source tools [6, 15], on the other hand, are easily accessible and their functionalities may be tailored to meet specific needs. However, open source tools have certain deficiencies: a) they can extract only a limited number of static code attributes from a limited number of programming languages, b) they do not include learning-based prediction support, and c) they lack multiple output formats [8].
In this paper, we introduce an intelligent open source software metrics extraction, analysis and defect prediction tool called Prest [16]. The need for Prest emerged during our collaborative research with industry partners from various domains (i.e. telecommunications [13], embedded systems [11] and healthcare [12]) over the past four years. Our aim in developing Prest was to extract static code attributes from software programs and build a learning-based defect predictor that highlights defect-prone parts of new projects, using code attributes and defect data from past projects. Prest is capable of extracting 28 static code attributes and generating call graphs using five different language parsers. It also provides output in various formats that are compatible with popular toolkits such as Weka [10]. Our industry partners have been using Prest for two years; their project managers have been able to detect problems in coding practices and testing, and to take corrective actions in a timely manner.
2. Functionality
Prest is developed as a one-stop-shop tool that is capable of:
- extracting common static code metrics from C, C++, Java, JSP and PL/SQL code;
- presenting output via GUI components and in *.xml, *.csv, *.xls and *.arff file formats;
- generating call graphs at the class and method levels;
- defining new metrics or thresholds on extracted metrics;
- applying machine learning methods for analysis and defect prediction.
Each of these functionalities is described below using a sample code (Figure 1). We have also placed the sample code of Figure 1 and an executable jar of Prest in the Prest repository [16] for users to try out. A more complex analysis, including defect prediction, is provided as a demo in Section 4.
Figure 1. Sample code
2.1. Parsing and saving a project
Prest can parse files written in different programming languages by using different parsers at the same time, whereas similar tools can parse only one language at a time and ignore the remaining files. Once a project, such as the sample code in Figure 1, is parsed, the metrics are presented via GUI components in a structured manner and the outputs are placed under the related project folder within the repository. The outputs are provided in several formats: *.csv, *.xls, *.arff and *.xml. Due to page limitations, Figure 2 shows only one of the static code attributes extracted by Prest, namely cyclomatic density; Table 1 provides the full set of attributes.
Figure 2. GUI Overview of Prest
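To illustrate the Weka-compatible output, the following minimal sketch (not Prest's internal code; the file name, attribute names and values are assumptions) shows how module-level metrics could be written in the ARFF format that Prest produces alongside *.csv, *.xls and *.xml:

import java.io.FileWriter;
import java.io.IOException;

// Hypothetical example of exporting extracted metrics in Weka's ARFF format.
public class ArffExportSketch {
    public static void main(String[] args) throws IOException {
        try (FileWriter out = new FileWriter("sample_metrics.arff")) {
            out.write("@relation sample_project\n\n");
            out.write("@attribute total_loc numeric\n");
            out.write("@attribute cyclomatic_complexity numeric\n");
            out.write("@attribute cyclomatic_density numeric\n");
            out.write("@attribute defective {false,true}\n\n");
            out.write("@data\n");
            out.write("40,7,0.175,true\n");   // one hypothetical module per row
            out.write("12,1,0.083,false\n");
        }
    }
}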
2.2. Call graph generation
Prest introduces a new and simple call graph feature for all supported languages. It extracts this information to better illustrate the dependencies between functions/classes and the complexity of software systems. Basically, a function call graph encodes the caller-callee relations between functions in a structured manner (Figure 3). In Prest, each function in Figure 3 is treated as a potential caller and is assigned a unique ID. All functions are therefore listed under the CALLER_NAME column and their IDs under the CALLER_ID column in an Excel file, while the CALLEE_ID column keeps the IDs of the functions called by each caller. The call graph matrix generated by Prest can be seen in Figure 4.
Figure 3. Function calls of sample code
Figure 4. Call graph matrix of sample code
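As an illustration of this representation (using hypothetical function names and IDs, not the actual content of Figures 3 and 4), the caller-callee pairs can be emitted as rows of a CALLER_NAME/CALLER_ID/CALLEE_ID table:

import java.util.*;

// Minimal sketch (not Prest's internal code) of recording caller-callee relations
// once each function has been assigned a unique ID.
public class CallGraphSketch {
    public static void main(String[] args) {
        // Hypothetical functions of a sample program, each mapped to a unique ID
        Map<String, Integer> callerIds = new LinkedHashMap<>();
        callerIds.put("main", 1);
        callerIds.put("readInput", 2);
        callerIds.put("computeSum", 3);

        // Which functions each caller invokes (assumed relations)
        Map<String, List<Integer>> callees = new LinkedHashMap<>();
        callees.put("main", Arrays.asList(2, 3));            // main calls readInput, computeSum
        callees.put("readInput", Collections.<Integer>emptyList());
        callees.put("computeSum", Collections.<Integer>emptyList());

        System.out.println("CALLER_NAME,CALLER_ID,CALLEE_ID");
        for (String caller : callerIds.keySet()) {
            for (Integer callee : callees.get(caller)) {
                System.out.println(caller + "," + callerIds.get(caller) + "," + callee);
            }
        }
    }
}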
2.3. Data Analysis and Prediction
Data analysis and prediction are distinguishing features of Prest, provided via the Naïve Bayes (NB) and Decision Tree (DT) algorithms. Given actual defect data of a project, in which bugs are matched with functions, Prest can analyze the data with these algorithms and make predictions for a future release that has not yet been tested, pinpointing defect-prone modules to increase testing efficiency considerably. This feature has helped our industry partners reduce their testing effort by 32% [11]. The architecture of this component and a detailed tutorial are given in Sections 3 and 4.
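The following is a minimal sketch of how this prediction step can be driven from the Weka libraries [10] that Prest builds on; the ARFF file names and the class-label convention are assumptions, not Prest's actual API:

import java.io.BufferedReader;
import java.io.FileReader;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;

public class NaiveBayesPredictionSketch {
    public static void main(String[] args) throws Exception {
        // Past release with known defect labels and the new, untested release (hypothetical files)
        Instances train = new Instances(new BufferedReader(new FileReader("past_release.arff")));
        Instances test = new Instances(new BufferedReader(new FileReader("new_release.arff")));
        train.setClassIndex(train.numAttributes() - 1);   // last attribute: defective {false,true}
        test.setClassIndex(test.numAttributes() - 1);

        NaiveBayes nb = new NaiveBayes();
        nb.buildClassifier(train);                        // learn from the labelled past release

        // Flag modules of the new release that the model labels defect-prone
        for (int i = 0; i < test.numInstances(); i++) {
            double label = nb.classifyInstance(test.instance(i));
            if ("true".equals(test.classAttribute().value((int) label))) {
                System.out.println("Defect-prone module: " + (i + 1));
            }
        }
    }
}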
2.4. Threshold and New Metric Definition
Certain values of metrics, or combinations of them, may be indicators of error proneness. Prest provides users with the ability to define conditions (thresholds) on the extracted metrics and applies color coding according to these user-defined thresholds: metrics of defect-prone modules are colored red on the GUI, whereas those of defect-free modules are colored green. Furthermore, Prest lets users define new metrics by combining existing metrics via mathematical operators. Figure 5 illustrates the definition of a new metric from the cyclomatic_complexity and lines_of_code metrics using the “/ DIVIDE” operator.
Figure 5. Defining a new metric
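The sketch below (assumed metric values and threshold, not Prest's implementation) illustrates the idea behind such a user-defined metric and its threshold-based color coding:

// Hypothetical evaluation of a user-defined metric built with the DIVIDE operator.
public class DerivedMetricSketch {
    static double divide(double left, double right) {
        return right == 0 ? 0 : left / right;            // guard against empty modules
    }

    public static void main(String[] args) {
        double cyclomaticComplexity = 7;                  // hypothetical extracted values
        double linesOfCode = 40;

        // new_metric = cyclomatic_complexity / lines_of_code
        double newMetric = divide(cyclomaticComplexity, linesOfCode);

        double threshold = 0.1;                           // hypothetical user-defined threshold
        String color = newMetric > threshold ? "RED (defect-prone)" : "GREEN (defect-free)";
        System.out.println("new_metric = " + newMetric + " -> " + color);
    }
}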
3. System Architecture
The Prest architecture has four main components: the language parser, the metrics extractor, the analysis and prediction component, and the GUI component.
3.1. Language Parser
A parser is responsible for splitting code into tokens and classifying each token by its type, such as operand or operator. Currently, Prest includes C, C++, Java, JSP and PL/SQL parsers.
3.2. Metrics Extractor
Once the language parser has tokenized the code, the metrics extraction component processes the parser output and uses these intermediate results to calculate the static code metrics listed in Table 1. Prest collects 28 static code attributes (Table 1); none of the other open source metrics extraction tools [18] is able to extract all of them.
Table 1. Static code metrics extracted by Prest
Total LOC                          Blank LOC
Comment LOC                        Code Comment LOC
Executable LOC                     Unique Operands
Total Operands                     Total Operators
Halstead Vocabulary                Halstead Length
Halstead Volume                    Halstead Level
Halstead Difficulty                Halstead Effort
Halstead Error                     Halstead Time
Branch Count                       Decision Count
Call Pairs                         Condition Count
Multiple Condition Count           Cyclomatic Density
Cyclomatic Complexity              Decision Density
Design Complexity                  Design Density
Normalized Cyclomatic Complexity   Formal Parameters
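Several of the attributes in Table 1 are derived quantities. As an illustration, the sketch below computes the Halstead family from operator and operand counts using the standard formulas; the counting conventions and the error/time estimates actually used by Prest's extractor may differ, and the counts shown are hypothetical:

// Standard Halstead formulas computed from hypothetical token counts.
public class HalsteadSketch {
    public static void main(String[] args) {
        int distinctOperators = 10, distinctOperands = 8;              // n1, n2 (assumed)
        int totalOperators = 40, totalOperands = 30;                   // N1, N2 (assumed)

        int vocabulary = distinctOperators + distinctOperands;         // n = n1 + n2
        int length = totalOperators + totalOperands;                   // N = N1 + N2
        double volume = length * (Math.log(vocabulary) / Math.log(2)); // V = N * log2(n)
        double difficulty = (distinctOperators / 2.0)
                * ((double) totalOperands / distinctOperands);         // D = (n1/2) * (N2/n2)
        double effort = difficulty * volume;                           // E = D * V
        double time = effort / 18.0;                                   // estimated seconds
        double estimatedErrors = volume / 3000.0;                      // delivered-bug estimate

        System.out.printf("n=%d N=%d V=%.1f D=%.1f E=%.1f T=%.1fs B=%.3f%n",
                vocabulary, length, volume, difficulty, effort, time, estimatedErrors);
    }
}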
3.3. Analysis and Prediction Component
The analysis and prediction component significantly differentiates Prest from similar open source tools, since none of them provides a learning-based defect prediction component [18]. Unlike other open source metric extraction tools, Prest can perform analysis and predictions regarding the defect-proneness of software by utilizing machine learning methods. We have used the Weka libraries [10] to implement two classifiers, Naïve Bayes and Decision Tree, for this component. However, new methods may be included either by implementing them from scratch or by calling the Weka libraries, as sketched below.
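As a sketch of this extensibility (a hypothetical wrapper, not Prest's actual class structure), any Weka learner, such as Naïve Bayes or the J48 decision tree, could back the component through the common Classifier type:

import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.trees.J48;
import weka.core.Instance;
import weka.core.Instances;

// Hypothetical wrapper illustrating how different Weka learners can be plugged in.
public class PredictorSketch {
    private final Classifier model;

    public PredictorSketch(Classifier model) {
        this.model = model;
    }

    public void train(Instances labelledPastRelease) throws Exception {
        labelledPastRelease.setClassIndex(labelledPastRelease.numAttributes() - 1);
        model.buildClassifier(labelledPastRelease);
    }

    public double predict(Instance module) throws Exception {
        return model.classifyInstance(module);  // index of the predicted class value
    }
}

// Usage: new PredictorSketch(new NaiveBayes()) or new PredictorSketch(new J48())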
3.4. GUI Component
The GUI component is responsible for interacting with the user and presenting the results. We paid particular attention to the GUI component and analyzed various tools, such as Eclipse [6], Predictive [5] and Weka [10], before designing it. We aimed to keep the usage simple while providing a full range of features, such as a project repository, easy switching between the metric extraction and analysis tabs, defining thresholds on static code metrics, filtering results according to the defined thresholds, applying color codes, and defining new metrics.
4. Demo
In Section 2, we analyzed a sample Java code to illustrate the functionalities of Prest. In this section, we analyze a large software system from one of our industry partners, implemented in the Java and JSP languages. We took two versions of the same system (version 11 for training and version 12 for testing) and extracted static code attributes from both the Java and the JSP files with Prest. Then, we matched the actual defect data with the files whose static code attributes were extracted by Prest and fed them to the analysis and prediction component. We used the Naïve Bayes classifier to predict defect-prone files in version 12. Finally, we measured the performance of the prediction component of Prest when only Java files, only JSP files, and both are used. As Figure 6 shows, the probability of detection (pd) and balance (bal) rates increase when both Java and JSP files are used, while the probability of false alarm (pf) rate decreases.
These results are encouraging in the sense that extracting static code attributes from all the languages of a software project can increase prediction performance. In addition, Prest, as a single tool, is able to conduct a thorough analysis of large software systems, thereby reducing the need for separate tools for different languages and for machine learning.
Figure 6. Improvements in the prediction
performance of Prest
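For reference, the performance measures reported in Figure 6 can be computed from a confusion matrix as in the sketch below, following the definitions commonly used in defect prediction studies (e.g. [17]); the counts are hypothetical:

// pd, pf and balance computed from hypothetical confusion-matrix counts.
public class PredictionPerformanceSketch {
    public static void main(String[] args) {
        int tp = 80, fn = 20;   // defective files: correctly / incorrectly predicted
        int fp = 30, tn = 170;  // defect-free files: incorrectly / correctly predicted

        double pd = (double) tp / (tp + fn);   // probability of detection
        double pf = (double) fp / (fp + tn);   // probability of false alarm
        // balance: distance from the ideal point (pd = 1, pf = 0), normalized to [0, 1]
        double bal = 1 - Math.sqrt(Math.pow(0 - pf, 2) + Math.pow(1 - pd, 2)) / Math.sqrt(2);

        System.out.printf("pd=%.2f pf=%.2f bal=%.2f%n", pd, pf, bal);
    }
}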
5. Development Methodology
Prest has been developed by MS and PhD students in SoftLab over the last three years. With various lengths of involvement (from six months to three years), a total of 12 students and a faculty member have worked as the developers and designers of Prest. We used a formal waterfall approach: we took the requirements from our industry partners, reviewed them and examined existing tools, then designed the architecture, coded Prest, and conducted alpha and beta tests with our industry partners. A senior architect has also been guiding us on the current and future architecture of the tool. All development stages are well documented, and we have used a versioning system as well as an automated bug tracking system. Current members of SoftLab implement new parsers and maintain Prest.
6. Current Usage & Benefits
Early versions of Prest were used by a local white-goods manufacturer who wanted to measure code quality in order to reduce defect rates and manage their testing resources effectively [11]. Using Prest, we collected static code metrics from C code at the function level. We then analyzed the defect-prone parts of the software using the data analysis component of Prest and found that testing effort could be reduced by 32% while catching 76% of the defective modules [11].
Recently, we have conducted a metrics program in a large telecommunication software system [13]. In this project, we collected static code metrics with Prest at the Java source file level. We then matched those files with actual defect data and used the Naïve Bayes classifier to predict the defective files of the software. We also used method-level call graph information together with the Naïve Bayes classifier to predict defect-prone files in the system. Results show that the prediction model in Prest has been capable of detecting 84% of the defective files by inspecting only 31% of the code [13].
In addition to our local industry partners, Prest is currently in use at a multi-national company in the UK. Since Prest is designed as an open source tool, it is available via Google Code [16] to review, download, or further develop and integrate.
7. Support
The development team of Prest provides support to users [14]. Once a development activity is completed and a stable version is obtained, the new code is committed to the Prest repository on Google Code [16]. Therefore, the code that users can access is always the latest stable version of Prest. Any failure or problem in the system can be entered directly into the issue management system of Google Code, so that the status of each problem can be tracked on the web.
8. Related Work
A considerable number of software metrics tools are available either as open source [6, 15, 19, 20, 21] or as commercial [5, 22, 23] products. Since Prest is developed as an open source tool, we focus on non-commercial tools for comparison. We acknowledge that there is no ultimate criterion for comparing different tools and concluding that one is certainly better than another. However, a set of criteria may be used when assessing metric tools: the number of languages supported, the number and nature of the metrics extracted, the output formats, and the analysis and prediction components. These functionalities have also been examined by other researchers [7, 18], and they are also critical for our future extensions of Prest. Table 2 presents this comparison between Prest and the CCCC [19], Chidamber-Kemerer Java Metrics [20], Dependency Finder [21], Eclipse Metrics Plug-in version 1.3.6 [6] and CyVis [15] tools. From Table 2, we can see that Prest is more extensive than the other open source tools with respect to the languages it parses, the number of extracted metrics, its output formats, and its analysis and prediction component. However, this does not make Prest the finest and ultimate tool: there has been significant effort behind each tool, and we still lack some properties, such as the simple and precise graphical representation of dependencies in the Eclipse plug-in or saving extracted metrics in an HTML file. Nevertheless, we have managed to provide an all-in-one tool for software practitioners, saving them the time and effort of searching for multiple tools for various needs and dealing with various output formats. Moreover, we have benefited from Prest in our own research studies by extracting static code attributes and making predictions for all experiment settings.
9. Conclusion and Future Work
Prest has been in use in three large software systems (locally and internationally). It has also been used in various SoftLab empirical research studies at different companies [11, 12, 13]. In practice, Prest's prediction capability has so far successfully guided project managers in taking decisions under uncertainty and has considerably increased testing efficiency.
Table 2. Comparison of Prest and other open source tools

                            Prest  CCCC  CK Java Metrics  Dependency Finder  Eclipse Plug-in  CyVis
Supported languages
  C                         +      +     -                -                  -                -
  C++                       +      +     -                -                  -                -
  Java                      +      +     +                +                  +                +
  JSP                       +      -     -                -                  -                -
  PL/SQL                    +      -     -                -                  -                -
Output formats
  csv                       +      -     -                -                  -                -
  xls                       +      -     +                -                  -                -
  arff                      +      -     -                -                  -                -
  xml                       +      +     +                +                  +                +
  html                      -      +     -                +                  +                -
Data analysis component     +      -     -                -                  -                -
Call graph generation       +      -     -                +                  +                -
# Metrics collected         28     9     6                13                 23               2
Going forward, we will keep adding new parsers as well as more learning algorithms to Prest. Currently, we are in the process of migrating Prest to a cloud computing environment in order to serve larger communities better, to share data, and to foster the reproduction of empirical experiments.
10. Acknowledgment
This research is funded in part by Tubitak EEEAG108E014. We would also like to extend our gratitude to Mr. Turgay Aytaç, senior architect, for his guidance, as well as to A.D. Oral, E.G. Isık, C. Gebi, H. Izmirlioglu, O. Bozcan and S. Karagulle for their efforts in the development of this tool.
11. References
[1] B. Kitchenham, S. L. Pfleeger, and N. Fenton. Towards a
framework for software measurement validation. IEEE
Transactions on Software Engineering, 21(12):929–944,
1995.
[2] N. Fenton. Software measurement: a necessary scientific
basis. Software Engineering, IEEE Transactions on,
20(3):199–206, Mar 1994.
[3] M. J. Harrold. Testing: a roadmap. In ICSE ’00:
Proceedings of the Conference on The Future of Software
Engineering, pages 61–72, New York, NY, USA, 2000.
ACM.
[4] N. Nagappan and T. Ball. Static analysis tools as early
indicators of pre-release defect density. Software
Engineering, 2005. ICSE 2005. Proceedings. 27th
International Conference on, pages 580–586, May 2005.
[5] Predictive, Integrated Software Metrics, available at http://freedownloads.rbytes.net/cat/development/other4/predictive-lite/
[6] Eclipse metrics plug-in 1.3.6, available at
http://sourceforge.net/projects/metrics.
[7] P. Kulik and C. Weber. Software metrics best practices –
2001. In Software Metrics Best Practices 2001, March 2001.
[8] M. Auer, B. Graser, and S. Biffl. A survey on the fitness
of commercial software metric tools for service in
heterogeneous environments: common pitfalls. Software
Metrics Symposium, 2003.
[9] B. Turhan, G. Kocak and A. Bener. Software Defect
Prediction Using Call Graph Based Ranking (CGBR)
Framework, Proceedings of the 34th EUROMICRO Software
Engineering and Advanced Applications (EUROMICRO
SEAA'08), 2008.
[10] I. H. Witten and E. Frank. Data Mining: Practical
Machine Learning Tools and Techniques. Morgan Kaufmann
Series in Data Management Systems. Morgan Kaufmann,
second edition, June 2005.
[11] A. Tosun, B. Turhan and A. Bener. Ensemble of Software Defect Predictors: A Case Study. Proceedings of the 2nd International Symposium on Empirical Software Engineering and Measurement (ESEM'08 Short Paper), pp. 318-320, 2008.
[12] A. Tosun, B. Turhan and A. Bener. The Benefits of a
Software Quality Improvement Project in a Medical Software
Company: A Before and After Comparison, International
Symposium on Health Informatics and Bioinformatics
(HIBIT'08 Invited Paper), 2008.
[13] A. Tosun, B. Turhan and A. Bener. Direct and Indirect Effects of Software Defect Predictors on Development Lifecycle: An Industrial Case Study, to appear in Proceedings of the 19th International Symposium on Software Reliability Engineering (Industry Track), 2008.
[14] Software Research Laboratory (Softlab), available at
www.softlab.boun.edu.tr
[15] Cyvis Software Complexity Visualizer,
http://cyvis.sourceforge.net/
[16] Prest Metrics Extraction and Analysis Tool, available at
http://code.google.com/p/prest/.
[17] T. Menzies, J. Greenwald and A. Frank. Data Mining
Static Code Attributes to Learn Defect Predictors, IEEE
Transactions on Software Engineering, January 2007,
Vol.33, No. 1, pp. 2-13.
[18] R. Lincke, J. Lundberg, W. Löwe. Comparing Software
Metrics Tools, ISSTA '08: Proceedings of the 2008
international symposium on Software testing and analysis,
2008
[19] C and C++ Code Counter, available at
sourceforge.net/projects/cccc, 2006.
[20] D. Spinellis. Chidamber and Kemerer Java Metrics,
available at www.spinellis.gr/sw/ckjm, 2006.
[21] Dependency Finder, available at
depfind.sourceforge.net, 2008.
[22] Analyst4j Find Using Metrics, available at
www.codeswat.com.
[23] SciTools Source Code Analysis and Metrics,
Understand for Java, available at www.scitools.com
[24] Victor R. Basili , Lionel C. Briand , Walcélio L. Melo,
A Validation of Object-Oriented Design Metrics as Quality
Indicators, IEEE Transactions on Software Engineering, v.22
n.10, p.751-761, October 1996
[25] S. R. Chidamber , C. F. Kemerer, A Metrics Suite for
Object Oriented Design, IEEE Transactions on Software
Engineering, v.20 n.6, p.476-493, June 1994
[26] Nachiappan Nagappan , Laurie Williams , John
Hudepohl , Will Snipes , Mladen Vouk, Preliminary Results
On Using Static Analysis Tools For Software Inspection,
Proceedings of the 15th International Symposium on
Software Reliability Engineering, p.429-439, November 02-
05, 2004
[27] Yue Jiang , Bojan Cuki , Tim Menzies , Nick Bartlow,
Comparing design and code metrics for software quality
prediction, Proceedings of the 4th international workshop on
Predictor models in software engineering, May 12-13, 2008,
Leipzig, Germany
[28] A. Gunes Koru , Hongfang Liu, Building Defect
Prediction Models in Practice, IEEE Software, v.22 n.6,
p.23-29, November 2005