Evaluation of Feature Extraction Methods on Software Cost Estimation
Burak Turhan, Onur Kutlubay, Ayse Bener
Department of Computer Engineering
Bogazici University
34342 Bebek, Istanbul, Turkey
turhanb@boun.edu.tr, kutlubay@cmpe.boun.edu.tr, bener@boun.edu.tr
Abstract
This research investigates the effects of linear and non-linear feature extraction methods on cost estimation performance. We use Principal Component Analysis (PCA) and Isomap for extracting new features from observed ones and evaluate these methods with support vector regression (SVR) on publicly available datasets. Our results for these datasets indicate that there is no significant difference between the performances of these linear and non-linear feature extraction methods.
1. Introduction
In this research we carry out an empirical study featuring both machine learning based regression analysis and feature extraction algorithms. We employ PCA, which has been used in previous research, and a non-linear method, Isomap [2, 3]. We present an empirical evaluation of both methods combined with a standard machine learning algorithm, support vector regression.
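As a concrete illustration of this setup, the following minimal sketch wires a feature extractor and a linear-kernel SVR into a single pipeline. It assumes a scikit-learn environment; the function name build_estimator and the component counts are illustrative choices, not details taken from the paper.

from sklearn.decomposition import PCA
from sklearn.manifold import Isomap
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

def build_estimator(extractor_name, n_components):
    # Choose the linear (PCA) or non-linear (Isomap) feature extractor.
    if extractor_name == "pca":
        extractor = PCA(n_components=n_components)
    else:
        extractor = Isomap(n_components=n_components)
    # Standardize, extract features, then regress effort with a linear-kernel SVR.
    return make_pipeline(StandardScaler(), extractor, SVR(kernel="linear"))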
2. Experiments and Results
We have used two public datasets, 'Cocomo NASA' and 'SDR', where the latter was compiled by the authors from several software houses in Turkey. All experiments are performed in a 10x10-fold cross-validation framework. The results are reported as the mean and standard deviation of PRED(30) values, which provide a more understandable perspective, especially for business users [1]. We use SVR to estimate the cost in both datasets. A linear kernel function is used because of its simplicity, accuracy, and speed.
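A hedged sketch of this evaluation protocol is given below, assuming scikit-learn; here X and y stand in for the project features and the actual effort values, and pred_at is an illustrative helper implementing PRED(N) as the fraction of projects whose relative error does not exceed N%.

import numpy as np
from sklearn.metrics import make_scorer
from sklearn.model_selection import RepeatedKFold, cross_val_score

def pred_at(y_true, y_pred, threshold=0.30):
    # PRED(30): fraction of projects with magnitude of relative error <= 30%.
    mre = np.abs(y_true - y_pred) / np.abs(y_true)
    return np.mean(mre <= threshold)

pred30_scorer = make_scorer(pred_at, greater_is_better=True)

def evaluate(model, X, y):
    # 10 repetitions of 10-fold cross-validation, as in the experimental setup.
    cv = RepeatedKFold(n_splits=10, n_repeats=10, random_state=0)
    scores = cross_val_score(model, X, y, cv=cv, scoring=pred30_scorer)
    return scores.mean(), scores.std()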
In the NASA dataset, the best mean value of PRED(30) is 59.9%; in the SDR dataset, it is 63.3%. Both results are obtained with only one extracted feature, whether Isomap or PCA is used. The reason becomes clear when the eigenvalues are examined: in both cases the first eigenvalue is significantly greater than the others, which means that most of the information in the data is explained by the first principal component. Performances on both datasets are given in Figure 1.
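The eigenvalue check described above could be reproduced along the following lines, as a sketch assuming scikit-learn, where X denotes a project feature matrix: if the first explained-variance ratio dominates, a single extracted feature carries most of the information.

from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def explained_variance_profile(X):
    # Fit PCA on standardized data and report the eigenvalues of the
    # covariance matrix together with the ratio of variance each explains.
    X_std = StandardScaler().fit_transform(X)
    pca = PCA().fit(X_std)
    return pca.explained_variance_, pca.explained_variance_ratio_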
3. Conclusions
We evaluated a linear and a non-linear feature extraction method, PCA and Isomap respectively, in terms of software cost estimation performance. We observe no evidence in favor of either the linear or the non-linear approach for these datasets.
References
[1] T. Menzies, D. Port, Z. Chen, J. Hihn, and S. Stukes, "Validation Methods for Calibrating Software Effort Models", Proceedings of ICSE 2005, 2005.
[2] N. Nagappan, L. Williams, J. Osborne, M. Vouk, and P. Abrahamsson, "Providing Test Quality Feedback Using Static Source Code and Automatic Test Suite Metrics", Proceedings of ISSRE 2005, 2005.
[3] J. B. Tenenbaum, V. de Silva, and J. C. Langford, "A Global Geometric Framework for Nonlinear Dimensionality Reduction", Science, Vol. 290, 2000, pp. 2319-2323.
Figure 1. Number of features vs. PRED(30)
First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007), © 2007 IEEE. DOI 10.1109/ESEM.2007.57, p. 497.
... Because of the small set of different DRMs found in the investigated articles, fundamental research was carried out on this topic. It turns out that DRMs are applied more often with ICEMs for software products (Chen et al., 2005; Turhan et al., 2007). In the following, five DRMs coming from the investigated literature and from the literature of the fundamental research are compared based on their advantages and disadvantages for ICEMs. ...
... Additionally, as PCA belongs to the linear DRMs, it does not consider nonlinear interrelations between features (Shlens, 2014). However, according to the case study of Turhan et al. (2007), in the context of ICEMs for software products, nonlinear feature extraction methods do not have advantages over linear feature extraction methods. Kernel PCA, a special type of PCA, is one of these nonlinear feature extraction methods. ...
... In the literature of LSS, no article deals with the Isomap algorithm. Only Turhan et al. (2007) use the Isomap algorithm in the context of ICEMs but apply it to software products. ...
Article
In the automotive industry, cost estimation of components to be purchased plays an important role for price negotiations with suppliers and, therefore, for cost control within the supply chain. While traditional bottom-up cost estimation is a very time-consuming and know-how intensive process, intelligent machine learning methods have the potential to significantly reduce the effort in the cost estimation process. In this paper, a literature review on intelligent cost estimation methods for parts to be procured in the manufacturing industry is done by text mining. Following the results of this literature review, building blocks for an intelligent cost estimation system are outlined that comprise cost estimation methods, dimensionality reduction methods, methods for multi-level cost estimation, and methods for interpretation of cost analytics results. Regarding cost estimation methods in the literature, Artificial Neural Networks and Support Vector Machines outperform established linear regression algorithms. Dimensionality reduction methods like Correlation Analysis or Principal Component Analysis are rarely studied in literature. Nevertheless, they contribute a lot to the reduction of expensively provided input parameters for cost estimation. Methods for multi-level cost estimation, that support cost prediction of parts and assemblies following the construction plan of a vehicle, and methods for interpretation of intelligent cost analytics cannot be found at all in literature. Consequently, in this paper corresponding approaches are derived from the areas of Multitask Learning and Explainable Machine Learning. Finally, a combination of methods considered most suitable for predictive analytics to estimate procurement costs is presented.
... In contrast, Turhan et al. employ two attribute extraction techniques, namely principal component analysis and Isomap, for extracting new attributes from existing ones and evaluate these methods with support vector regression on the SE task of software cost estimation [36]. ...
Preprint
Recent years have witnessed the growing demands for resolving numerous bug reports in software maintenance. Aiming to reduce the time testers/developers take in perusing bug reports, the task of bug report summarization has attracted a lot of research efforts in the literature. However, no systematic analysis has been conducted on attribute construction which heavily impacts the performance of supervised algorithms for bug report summarization. In this study, we first conduct a survey to reveal the existing methods for attribute construction in mining software repositories. Then, we propose a new method named Crowd-Attribute to infer new effective attributes from the crowd-generated data in crowdsourcing and develop a new tool named Crowdsourcing Software Engineering Platform to facilitate this method. With Crowd-Attribute, we successfully construct 11 new attributes and propose a new supervised algorithm named Logistic Regression with Crowdsourced Attributes (LRCA). To evaluate the effectiveness of LRCA, we build a series of large-scale data sets with 105,177 bug reports. Experiments over both the public data set SDS with 36 manually annotated bug reports and new large-scale data sets demonstrate that LRCA can consistently outperform the state-of-the-art algorithms for bug report summarization.
... Finding the right training data for their particular project had been a serious problem for us. For this goal, we evaluated sampling strategies to pick the right amount of data in our experiments [64]. ...
Chapter
In this chapter, we share our experience and views on software data analytics in practice with a retrospect to our previous work. Over ten years of joint research projects with the industry, we have encountered similar data analytics patterns in diverse organizations and in different problem cases. We discuss these patterns following a 'software analytics' framework: problem identification, data collection, descriptive statistics and decision making. We motivate the discussion by building our arguments and concepts around our experiences of the research process in six different industry research projects in four different organizations.
... Finding the right training data for their particular project had been a serious problem for us. For this goal, we evaluated sampling strategies to pick the right amount of data in our experiments [64]. ...
Article
Full-text available
In this chapter, we share our experience and views on software data analytics in practice with a review of our previous work. In more than 10 years of joint research projects with industry, we have encountered similar data analytics patterns in diverse organizations and in different problem cases. We discuss these patterns following a "software analytics" framework: problem identification, data collection, descriptive statistics, and decision making. In the discussion, our arguments and concepts are built around our experiences of the research process in six different industry research projects in four different organizations.Methods: Spearman rank correlation, Pearson correlation, Kolmogorov-Smirnov test, chi-square goodness-of-fit test, t test, Mann-Whitney U test, Kruskal-Wallis analysis of variance, k-nearest neighbor, linear regression, logistic regression, naïve Bayes, neural networks, decision trees, ensembles, nearest-neighbor sampling, feature selection, normalization.
... Thus, weighting features is our major concern in order to reflect the impact of features on the estimation performance. Various approaches propose feature selection methods (Bener, Turhan, & Kutlubay, 2007) and weight assignment methods for the features, using Euclidean distance (Auer et al., 2006; Mendes, Watson, Triggs, Mosley, & Counsell, 2003; Shepperd & Schofield, 1997), fuzzy logic (Azzeh, Neagu, & Cowling, 2008), rough set analysis (Li & Ruhe, 2006), and genetic algorithms (Huang & Chiu, 2006). ...
Article
Software cost estimation is one of the critical tasks in project management. In a highly demanding and competitive market environment, software project managers need robust models and methodologies to accurately predict the cost of a new project. Analogy-based cost estimation is one of the widely used models that rely on historical project data. It checks the similarity of features between past and current projects, and it approximates current project cost from past ones. One shortcoming of analogy-based cost estimation is that it assumes all project features as equal. However, these features may have different impacts on project cost based on their relevance. In this research, we present two feature weight assignment heuristics for cost estimation. We assign weights to the project features by benefiting from a statistical technique, namely principal components analysis (PCA) that is used for extracting optimal linear patterns of high dimensional data. We test our proposed heuristics on public datasets and conclude that the prediction performance in terms of MMRE and Pred(25) increases with a statistical-based assignment technique rather than random assignment approach.
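One way to realize the idea sketched in this abstract, shown below as a hedged illustration rather than the authors' exact heuristic, is to take the absolute loadings of the first principal component as feature weights and use them in a weighted-distance analogy search. It assumes scikit-learn and NumPy; all function names are illustrative.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def pca_feature_weights(X):
    # Use the absolute loadings of the first principal component as weights,
    # normalized to sum to one.
    X_std = StandardScaler().fit_transform(X)
    pca = PCA(n_components=1).fit(X_std)
    w = np.abs(pca.components_[0])
    return w / w.sum()

def analogy_estimate(X_past, y_past, x_new, weights, k=3):
    # Weighted Euclidean distance to past projects; average the efforts of
    # the k most similar ones.
    d = np.sqrt((((X_past - x_new) ** 2) * weights).sum(axis=1))
    nearest = np.argsort(d)[:k]
    return y_past[nearest].mean()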
Article
Software development effort estimation is an effective factor in the success or failure of software projects. There are several methods to estimate the effort of software projects, the most common of which is analogy‐based estimation (ABE). In this article, a polynomial version of ABE (named PABE) is presented, in which, the project effort is calculated based on a polynomial ensemble of different ABE models. To optimize the controllable parameters of the PABE model, a combined global–local search metaheuristic algorithm based on particle swarm optimization and simulated annealing is utilized in two steps. At the first step, for each similarity and adaptation function, the optimized ABE model is determined by exploiting the optimal value of feature weights, the number of similar projects, and other parameters of the ABE model. Then, at the second step, the amount of effort attained by the optimized models is used for estimating the final effort by the proposed polynomial equation. The proposed PABE method has been successfully executed on five well‐known software effort estimation datasets: Maxwell, Albrecht, Cocomo81, Desharnais, and Kemerer. Obtained results show the superiority of the proposed PABE model in terms of accuracy and efficiency compared to other techniques.
Article
Recent years have witnessed the growing demands for resolving numerous bug reports in software maintenance. Aiming to reduce the time testers/developers take in perusing bug reports, the task of bug report summarization has attracted a lot of research efforts in the literature. However, no systematic analysis has been conducted on attribute construction, which heavily impacts the performance of supervised algorithms for bug report summarization. In this study, we first conduct a survey to reveal the existing methods for attribute construction in mining software repositories. Then, we propose a new method named Crowd-Attribute to infer new effective attributes from the crowd-generated data in crowdsourcing and develop a new tool named Crowdsourcing Software Engineering Platform to facilitate this method. With Crowd-Attribute, we successfully construct 11 new attributes and propose a new supervised algorithm named Logistic Regression with Crowdsourced Attributes (LRCA). To evaluate the effectiveness of LRCA, we build a series of large scale datasets with 105 177 bug reports. Experiments over both the public dataset SDS with 36 manually annotated bug reports and new large-scale datasets demonstrate that LRCA can consistently outperform the state-of-the-art algorithms for bug report summarization.
Conference Paper
Full-text available
COCONUT calibrates effort estimation models using an exhaustive search over the space of calibration parameters in a Cocomo I model. This technique is much simpler than other effort estimation methods, yet yields PRED levels comparable to those methods. Also, it does so with less project data and fewer attributes (no scale factors). However, a comparison between COCONUT and other methods is complicated by differences in the experimental methods used for effort estimation. A review of those experimental methods concludes that software effort estimation models should be calibrated to local data using incremental holdout (not jack knife) studies, combined with randomization and hypothesis testing, repeated a statistically significant number of times.
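The calibration idea described above can be sketched as an exhaustive grid search over COCOMO-style parameters scored by PRED(30); the effort form a * kloc**b * eaf and the grid ranges below are assumptions for illustration, not values from the cited study.

import numpy as np

def pred30(actual, predicted):
    return np.mean(np.abs(actual - predicted) / actual <= 0.30)

def calibrate(kloc, eaf, actual_effort):
    # Exhaustive grid search over the model parameters a and b, scored by PRED(30)
    # on the calibration data; the ranges below are illustrative assumptions.
    best_a, best_b, best_score = None, None, -1.0
    for a in np.linspace(1.0, 10.0, 91):
        for b in np.linspace(0.9, 1.3, 41):
            predicted = a * kloc ** b * eaf
            score = pred30(actual_effort, predicted)
            if score > best_score:
                best_a, best_b, best_score = a, b, score
    return best_a, best_b, best_score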
Article
Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs—30,000 auditory nerve fibers or 10^6 optic nerve fibers—a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution, and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.
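The contrast drawn in this abstract can be illustrated with a small synthetic example, sketched below under the assumption of a scikit-learn environment; the swiss-roll dataset and the neighbor and component counts are illustrative choices.

from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap

# Generate a synthetic non-linear manifold ("swiss roll") embedded in 3-D.
X, color = make_swiss_roll(n_samples=1000, random_state=0)

X_pca = PCA(n_components=2).fit_transform(X)                     # linear projection
X_iso = Isomap(n_neighbors=10, n_components=2).fit_transform(X)  # geodesic embedding

# Plotting X_pca and X_iso colored by `color` shows that only the Isomap
# embedding preserves the ordering along the unrolled manifold.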
Conference Paper
A classic question in software development is "How much testing is enough?" Aside from dynamic coverage-based metrics, there are few measures that can be used to provide guidance on the quality of an automatic test suite as development proceeds. This paper utilizes the software testing and reliability early warning (STREW) static metric suite to provide a developer with indications of changes and additions to their automated unit test suite and code for added confidence that product quality will be high. Retrospective case studies to assess the utility of using the STREW metrics as a feedback mechanism were performed in academic, open source and industrial environments. The results indicate at statistically significant levels the ability of the STREW metrics to provide feedback on important attributes of an automatic test suite and corresponding code.