Optimising Project Feature Weights for Analogy-Based Software Cost Estimation using the Mantel Correlation
ABSTRACT Software cost estimation using analogy is an important area in software engineering research. Previous research has demonstrated that analogy is a viable alternative to conventional estimation methods in terms of predictive accuracy. One important research question for analogy is how to determine suitable project feature weights. This can be achieved by an extensive search over project feature weights in which a quality measure is optimised. However, this approach suffers from issues similar to those of brute-force feature selection in analogy. We propose a novel method to deal with this issue, based on the Mantel randomisation test. Specifically, we determine project feature weights from the strength of correlation between the distance matrix of project features and the distance matrix of known effort values in the dataset. We demonstrate the procedure on a specific dataset, showing how the Mantel correlation can be used to identify whether analogy is appropriate and whether the project feature weights can be determined by statistical inference. Our results also show improved prediction accuracy when multiple project features are used with the determined weights. Our method thus provides a sound statistical basis for analogy.
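The core idea of the abstract, correlating a project-feature distance matrix with an effort distance matrix and judging significance by randomisation, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the project data, the absolute-difference distance, the one-sided permutation test, and the rule that turns correlations into normalised weights are all assumptions made for the example.

```python
import math
import random


def distance_matrix(values):
    """Pairwise absolute differences for one numeric feature (assumed metric)."""
    n = len(values)
    return [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Mantel correlation and a one-sided permutation p-value.

    Rows and columns of dist_a are permuted together; dist_b stays fixed.
    """
    rng = random.Random(seed)
    n = len(dist_a)
    b_flat = upper_triangle(dist_b)
    observed = pearson(upper_triangle(dist_a), b_flat)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[dist_a[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), b_flat) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical projects: two features and the known effort of each project.
size = [10, 25, 40, 60, 85, 120]
team = [5, 3, 8, 2, 7, 4]
effort = [12, 30, 45, 70, 95, 140]

effort_dist = distance_matrix(effort)
r_size, p_size = mantel(distance_matrix(size), effort_dist)
r_team, p_team = mantel(distance_matrix(team), effort_dist)

# Assumed weighting rule: clip negative correlations to zero, then normalise.
raw = {"size": max(r_size, 0.0), "team": max(r_team, 0.0)}
total = sum(raw.values())
weights = {name: (value / total if total > 0 else 0.0)
           for name, value in raw.items()}

print(f"size: r={r_size:.3f}, p={p_size:.3f}")
print(f"team: r={r_team:.3f}, p={p_team:.3f}")
print("weights:", weights)
```

In this toy data, effort grows almost linearly with size, so the size distance matrix correlates strongly with the effort distance matrix and receives most of the weight. A weak or non-significant Mantel correlation for every feature would suggest that analogy is a poor fit for the dataset.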
- ABSTRACT: Accurate and reliable software cost estimation is a vital task in software project portfolio decisions such as resource scheduling or bidding. A prominent and transparent method of supporting estimators is analogy-based cost estimation, which is based on finding similar projects in historical portfolio data. However, the project feature dimensions used to determine project analogy represent project aspects that differ widely in relevance. They are known to have varying impact on the analogies, and in turn on overall estimation accuracy and reliability, which traditional approaches do not address. This paper (i) proposes an improved analogy-based approach based on extensive dimension weighting, and (ii) empirically evaluates the accuracy and reliability improvements in the context of five real-world portfolio data sets. The main results are accuracy and reliability improvements for all analyzed portfolios and quality measures. Furthermore, the approach indicates a quality barrier for analogy-based estimation approaches that use the same basic assumptions and quality measures. Empirical Software Engineering, 2004. ISESE '04. Proceedings. 2004 International Symposium on; 09/2004
- 01/1994; Chapman and Hall.