Article

Abstract

Measuring software functional size via standard Function Points Analysis (FPA) requires fully specified requirements and specific competencies. Most of the time, the need to measure software functional size arises well before these ideal conditions are met, when complete information or skilled experts are lacking. To work around the constraints of the official measurement process, several estimation methods for FPA have been proposed and are commonly used. Among these, the International Function Point User Group (IFPUG) has adopted the ‘High-level FPA’ method (also known as the NESMA method), which avoids weighting each data and transaction function by using fixed weights instead. Applying High-level FPA, or similar estimation methods, is faster and easier than carrying out the official measurement process, but inevitably introduces approximation into the measures. In this paper, we contribute to the problem of estimating software functional size measures by using machine learning. To the best of our knowledge, machine learning methods had never before been applied to the early estimation of software functional size. Our goal is to understand whether machine learning techniques yield estimates of FPA measures that are more accurate than those obtained with High-level FPA or similar methods. An empirical study on a large dataset of functional size predictors was carried out to train and test three of the most popular and robust machine learning methods, namely Random Forests, Support Vector Regression, and Neural Networks. A systematic experimental phase, with cycles of dataset filtering and splitting, parameter tuning, and model training and validation, is presented. The estimation accuracy of the obtained models was then evaluated and compared to that of fixed-weight models (e.g., High-level FPA) and linear regression models, also using a second dataset as the test set. We found that Support Vector Regression yields quite accurate estimation models. However, the obtained level of accuracy does not appear significantly better than that of High-level FPA or of models built via ordinary least squares regression. Notably, fairly good accuracy levels were obtained by models that do not even require discerning among different types of transactions and data.
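To make the comparison described in the abstract concrete, the following is a minimal sketch (not the authors' actual pipeline) that contrasts a fixed-weight High-level FPA estimate with a Support Vector Regression model trained on counts of the five IFPUG function types. The data are synthetic, and the fixed weights are those commonly reported for the NESMA Estimated / High-level FPA method; treat both as illustrative assumptions.

```python
# Minimal sketch (not the paper's actual pipeline): contrast a fixed-weight
# High-level FPA estimate with an SVR model trained on BFC counts.
# Counts, weights and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# X: per-project counts of [ILF, EIF, EI, EO, EQ]; y: "actual" UFP (synthetic).
rng = np.random.default_rng(42)
X = rng.integers(1, 40, size=(60, 5)).astype(float)
y = X @ np.array([7.4, 5.3, 4.3, 5.4, 3.8]) + rng.normal(0, 15, 60)

# High-level FPA style estimate: data functions rated Low, transactions Average
# (fixed weights commonly reported for the NESMA Estimated method).
y_hl = X @ np.array([7, 5, 4, 5, 4])

# SVR trained on the same counts, evaluated out-of-sample via cross-validation.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100, epsilon=5))
y_svr = cross_val_predict(svr, X, y, cv=5)

for name, pred in [("High-level FPA", y_hl), ("SVR", y_svr)]:
    mare = np.mean(np.abs(pred - y) / y)  # mean absolute relative error
    print(f"{name}: MARE = {mare:.2f}")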

... Simplified measures consider a smaller amount of information than traditional Function Points, hence they may be used to approximate traditional measures of functional size. Previous research has shown that this approximation is relatively accurate [4]. However, in principle, even small inaccuracies in the measurement of functional size could negatively affect the accuracy of effort estimation. ...
... In many cases, project managers need a rough estimate of development effort even before functional requirements have been described in full detail. This has led to the proposal of many simplified measurement methods (also known as approximate measurement methods) [3], [4], [24]–[30]. It is worth mentioning that simplified measures not only anticipate the availability of functional measures; they also make the measurement process faster and less expensive, which is clearly appreciated by software project managers, as long as the inherent approximation of simplified measures does not degrade the accuracy of effort estimates too much. ...
... Specifically, the datasets we analyzed account for projects developed in 16 different industrial sectors over a 25-year range. The projects have various sizes: e.g., our enhancement projects involve functionality in the [4, 7134] UFP range. Accordingly, the considered projects required quite different effort (from 21 person-hours to over 300 person-months) and were carried out by teams of various sizes (1 to 49 members). ...
Article
Full-text available
Functional size measures are often used as the basis for estimating development effort, because they are available in the early stages of software development. Several simplified measurement methods have also been proposed, both to decrease the cost of measurement and to make functional size measurement applicable when functional user requirements are not yet known in full detail. It has been shown that simplified functional measures are suitable to support effort estimation using traditional statistical effort models. Lately, machine learning techniques have been successfully used for software development effort estimation. However, the usage of machine learning techniques in combination with simplified functional size measures has not yet been empirically evaluated. This paper aims to fill this gap. It reports to what extent functional size measures can be simplified, without decreasing the accuracy of effort estimates they can support, when machine learning is used to build effort prediction models. In performing this evaluation, we also took into account that different effort models can be required when (i) new software is developed from scratch, (ii) it is extended by adding new functionality, or (iii) functionalities are added, changed and possibly removed. We carried out an empirical study, in which we used measures collected from several industrial projects. Effort estimation models were built via multiple Machine Learning techniques, using both traditional full-fledged functional size measures and simplified measures, for each of the three aforementioned types of development. According to our empirical study, it appears that using simplified functional size measures in place of traditional functional size measures for effort estimation does not yield practically relevant differences in accuracy; this result holds for all the project types considered, i.e., new developments, extensions and enhancements. Therefore, software project managers can consider analyzing only a small and specific part of functional user requirements to get measures that effectively support effort estimation.
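As a rough illustration of the comparison performed in the study above, the sketch below builds two effort models on synthetic data, one from a full functional size measure and one from a coarser simplified measure, and compares their cross-validated accuracy. The dataset, model choice, and numbers are assumptions for illustration only, not the study's industrial data.

```python
# Illustrative only (synthetic data, not the study's industrial dataset):
# compare effort models built on a full functional size measure vs. a coarser
# simplified measure, using cross-validated mean absolute error.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 80
full_size = rng.uniform(50, 2000, n)                       # e.g., UFP
simplified = full_size / 5 + rng.normal(0, 20, n)          # coarser proxy measure
effort = 8 * full_size**0.95 * rng.lognormal(0, 0.3, n)    # person-hours (synthetic)

model = RandomForestRegressor(n_estimators=200, random_state=0)
for name, feature in [("full size", full_size), ("simplified size", simplified)]:
    mae = -cross_val_score(model, feature.reshape(-1, 1), effort,
                           cv=5, scoring="neg_mean_absolute_error").mean()
    print(f"{name}: cross-validated MAE = {mae:.0f} person-hours")
```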
... Models addressed (Table 3, "AI models addressed and the number of covered papers"):
[7] Support Vector Machine, Genetic Algorithm (GA), Ant Colony Optimization (ACO), Evolutionary Strategy (ES), Local Search (LS), Differential Evolution (DE), and Particle Swarm Optimization (PSO)
[8] Neural Network, Random Forest, and Support Vector Regression
[9] Random Forest
[10] Meta-heuristic algorithms: GWO, ZOA, MFO, PDO, and WSO
[11] Decision Tree, K-Nearest Neighbor, Gradient Boosting, Neural Network, Naive Bayes, Support Vector Machine, and Bayesian Network
[12] Genetic Algorithm (GA)
[13] Support Vector Machine, K-Nearest Neighbor, Artificial Neural Network, and Random Forest
[14] SVM, MLP, Decision Trees, and Random Forest
[15] Classification model
[16] Smart AI assistant, conversational AI platform (LLMs)
[17] Gradient Boosting, Neural Network
[4] word2vec, paragraph2vec, Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNNs)
[18] Decision Tree, K-Nearest Neighbor
[3] Naive Bayes
[19] ChatGPT
[5] Support Vector Machine, and Bayesian Network
[20] GPT-2 language models and Transformer architecture ...
... In their investigation into employing AI to estimate the functional size of software, ref. [8] drew attention to the inherent "black-box nature" of many machine learning algorithms. This characteristic complicates the documentation, tracing, and elucidation of the processes, results, and logic underpinning machine learning algorithms, rendering them less transparent and difficult to interpret. ...
... Ref. [8] pointed out that their findings and the model's efficiency are confined to the npm ecosystem. This suggests that its applicability might not extend seamlessly across various software ecosystems, each characterized by its practices, cultural norms, and dependency management techniques. ...
Article
Full-text available
Artificial intelligence (AI) has helped enhance the management of software development projects through automation, improving efficiency and enabling project professionals to focus on strategic aspects. Despite its advantages, applying AI in software development project management still faces several challenges. Thus, this study investigates key obstacles to applying artificial intelligence in project management, specifically in the project planning phase. This research systematically reviews the existing literature. The review comprises scientific articles published from 2019 to 2024 and, from the inspected records, 17 papers were analyzed in full-text form. In this review, 10 key barriers were reported and categorized based on the Technology–Organization–Environment (TOE) framework. This review showed that eleven articles reported technological challenges, twelve articles identified organizational challenges, and six articles reported environmental challenges. In addition, this review found that there was relatively little interest in the literature on environmental challenges, compared to organizational and technological barriers.
... Some do not emphasize formal measurement methods [22], while others reduce the problem to size-based classification [23], which is a relatively easier task. Some use only the high-level requirements or use-case names to make rough estimations [24], [25], while others require additional quantitative information about the projects [26] to make predictions. ...
... However, the proposed method requires both context-related and quantitative information about the projects for size prediction. Lavazza et al. [25] reported 0.1 MMRE for a rough estimation of FPA from high-level requirements. ...
Preprint
Software Size Measurement (SSM) plays an essential role in software project management as it enables the acquisition of software size, which is the main input for development effort and schedule estimation. However, many small and medium-sized companies cannot perform objective SSM and Software Effort Estimation (SEE) due to insufficient resources and an expert workforce. This results in inadequate estimates and projects exceeding the planned time and budget. Therefore, organizations need to perform objective SSM and SEE using minimal resources without an expert workforce. In this research, we conducted an exploratory case study to predict the functional size of software project requirements using state-of-the-art large language models (LLMs). For this aim, we fine-tuned BERT and BERT_SE with a set of functional processes (FPs) and their respective functional size in COSMIC Function Points (CFP). We derived the FPs from use-case scenarios included in different project requirement documents. Although we used a relatively small dataset to train the models, we obtained promising size prediction results with a 0.23 Mean Magnitude of Relative Error (MMRE) and 0.75 PRED(30).
... Another notable obstacle, discussed in [49], is the lack of explainability in GenAI models. These models often function as black boxes, making it difficult to adopt them in environments where transparency is crucial, particularly in regulated industries or projects requiring decision traceability. ...
Chapter
The integration of Generative Artificial Intelligence (GenAI) into Agile Software Development (ASD) is reshaping the way development teams automate tasks, optimize processes, and enhance user experience (UX). This study presents a systematic literature review to analyze the impact of GenAI on ASD, focusing on three research questions: (1) How can GenAI tools optimize user experience in agile software development projects? (2) What are agile teams’ main challenges when integrating GenAI tools into software development projects? and (3) What stages of the agile software development cycle benefit from implementing GenAI tools? A total of 21 relevant studies published between 2020 and 2024 were selected and analyzed. Findings indicate that GenAI improves UX by facilitating automated test generation, personalized user interfaces, and enhanced documentation processes. However, challenges such as data quality, model transparency, security vulnerabilities, and team resistance hinder its adoption. Moreover, the research highlights that GenAI contributes across multiple ASD phases, including planning (requirement analysis and user story generation), implementation (automated code generation and debugging), testing (self-generated test cases), maintenance (documentation and refactoring), and retrospectives (data-driven team performance analysis). Despite its growing adoption, the study reveals a gap in empirical evaluations of GenAI’s long-term impact on Agile methodologies. Future research should explore hybrid frameworks that balance automation and human oversight, longitudinal studies on GenAI’s adoption trends, and strategies to ensure ethical and bias-free AI implementation in Agile environments. The findings contribute to a deeper understanding of GenAI’s transformative role in software development and provide practical insights for industry professionals and researchers.
... It is applied in different application domains including healthcare [14], surveillance and security [14], weather forecasting [14], banking [15], software project management and estimation [16], and so on. A number of ML techniques (Support Vector Machine, Random Forest, and Neural Network) have been applied to size software functionality using Function Point Analysis (FPA), a first-generation FSM method [17]. However, FPA has been criticized because it was mostly designed to size MIS applications and is inadequate for sizing real-time and embedded software [18]. ...
Conference Paper
Chatbots are becoming more popular due to the number of functionalities they provide and the time savings and rapid responses they offer in real time. Developing a chatbot requires defining a list of functional requirements upfront. Some of these requirements can be derived from other chatbots to discover and provide the required functionality, while the acquisition of other requirements is time-consuming and costly. Applying a standardized functional size measurement method, such as COSMIC Function Points – ISO 19761, to chatbot requirements is helpful in estimating the related project development effort and duration. This paper proposes an automated tool named BotCFP for generating chatbots' sizes using the use-cases.csv dataset. Three machine learning techniques (Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Machine (GBM)) were used to determine chatbot sizes from their functional process (FP) names, using three different text vectorization methods: TF-IDF, Word2Vec, and Bag of Words. The best measurement results were provided by Random Forest with the TF-IDF text vectorization method; this model was deployed and used as an API in the BotCFP tool. The proposed tool allows users (project managers and developers) to determine a chatbot's size from its FP names before starting the development process.
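A minimal sketch of the kind of text-based sizing pipeline described above: TF-IDF vectorization of functional process names feeding a Random Forest regressor. The process names, sizes, and hyperparameters are invented for illustration and are not taken from the BotCFP study.

```python
# Hedged sketch of a text-based sizing pipeline: predicting a functional size
# (in CFP) from functional-process names via TF-IDF and Random Forest.
# The example texts and sizes are invented; the paper's dataset and tuning differ.
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

fp_names = [
    "create user account", "update user profile", "delete chat history",
    "send notification message", "retrieve order status", "list available products",
]
cfp_sizes = [4, 5, 3, 4, 5, 3]  # illustrative COSMIC sizes per functional process

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      RandomForestRegressor(n_estimators=100, random_state=0))
model.fit(fp_names, cfp_sizes)
print(model.predict(["update order status"]))  # predicted CFP for a new process name
```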
... Since the introduction of Function Point analysis, many researchers and practitioners have strived to develop simplified versions of the FP measurement process, both to reduce the cost and duration of the measurement process and to make it applicable when full-fledged requirement specifications are not yet available [3], [24]–[31]. ...
Article
Full-text available
Functional size measures are widely used for estimating software development effort. After the introduction of Function Points, a few “simplified” measures have been proposed, aiming to make measurement simpler and applicable when fully detailed software specifications are not yet available. However, some practitioners believe that, when considering “complex” projects, traditional Function Point measures support more accurate estimates than simpler functional size measures, which do not account for greater-than-average complexity. In this paper, we aim to produce evidence that confirms or disproves such a belief via an empirical study that separately analyzes projects that involved developments from scratch and extensions and modifications of existing software. Our analysis shows that there is no evidence that traditional Function Points are generally better at estimating more complex projects than simpler measures, although some differences appear in specific conditions. Another result of this study is that functional size metrics—both traditional and simplified—do not seem to effectively account for software complexity, as estimation accuracy decreases with increasing complexity, regardless of the functional size metric used. To improve effort estimation, researchers should look for a way of measuring software complexity that can be used in effort models together with (traditional or simplified) functional size measures.
Article
A critical factor in successful project management and effective planning is the analysis of software development complexity and effort. The use of industry-standard methods, such as Function Point Analysis, serves as an effective means of improving estimation accuracy while simultaneously reducing the cost of the estimation process itself. However, standard methods, most of which were developed several decades ago, are not sufficiently adapted to modern realities, including agile software development and the use of pre-built or standardized solutions. This study proposes a knowledge representation model that combines production and frame-based approaches to address the challenge of assessing the complexity and effort of software development involving SaaS (Software as a Service) and PaaS (Platform as a Service) solutions. A knowledge base model has been developed that integrates frame-based and production models while ensuring compatibility with Function Point Analysis. The developed models and their interactions can serve as the foundation for a decision-making model within an information technology framework for assessing software development complexity, considering environmental factors and evaluation criteria. The application of the proposed models will enable the creation of automated algorithms for estimating software development complexity under conditions of incomplete functional requirements. The proposed approach improves effort estimation by considering real-world implementation contexts, which is particularly relevant for modern IT projects. Additionally, the combination of frame-based and production models lays the groundwork for further integration with artificial intelligence and machine learning to automate effort estimation processes. Future research should focus on refining the decision-making model, conducting experimental validation against traditional methods, and expanding its capabilities using fuzzy logic and neural networks for dynamic complexity assessment. The integration of this approach into decision support systems (DSS) for software project and resource management also remains a promising direction.
Article
Full-text available
In the last two decades, computing and storage technologies have experienced enormous advances. Leveraging these recent advances, Artificial Intelligence (AI) is making the leap from traditional classification use cases to the automation of complex systems through advanced machine learning and reasoning algorithms. While the literature on AI algorithms and their applications in automation is mature, there is a lack of research on trustworthy AI, i.e., how different industries can trust the developed AI modules. AI algorithms are data-driven, i.e., they learn from the received data and also act based on the received status data. An initial step in addressing trustworthy AI is therefore investigating the plausibility of the data that is fed to the system. In this work, we study the state-of-the-art data plausibility check approaches. Then, we propose a novel approach that leverages machine learning for an automated data plausibility check. This novel approach is context-aware, i.e., it leverages potential contextual data related to the dataset under investigation for the plausibility check. We investigate three machine learning solutions that leverage auto-correlation in each feature of the dataset, correlation between features, and hidden statistics of each feature for generating the checkpoints. Performance evaluation results indicate the outstanding performance of the proposed scheme in detecting noisy data for the data plausibility check.
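The sketch below is only an assumption of what a correlation-based plausibility check could look like (it is not the scheme proposed in the paper): a value of one feature is flagged when it deviates strongly from what a strongly correlated feature predicts.

```python
# Minimal illustration of a correlation-based plausibility check (an assumed
# example, not the paper's algorithm): flag values of one feature that deviate
# strongly from what a correlated feature predicts.
import numpy as np

rng = np.random.default_rng(1)
temp_sensor_a = rng.normal(20, 3, 200)
temp_sensor_b = temp_sensor_a + rng.normal(0, 0.5, 200)  # strongly correlated feature
temp_sensor_b[17] += 12                                  # inject an implausible value

# Fit a simple linear relation b ≈ w*a + c and flag large standardized residuals.
w, c = np.polyfit(temp_sensor_a, temp_sensor_b, 1)
residuals = temp_sensor_b - (w * temp_sensor_a + c)
flags = np.abs(residuals) > 3 * residuals.std()
print("implausible indices:", np.where(flags)[0])
```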
Conference Paper
Full-text available
The IFPUG Function Points method was originally developed almost 35 years ago. The need was for a way to capture in numbers the functional value of a certain software application for its users. At that time the development process was largely "handmade" and "Lines of Code" was the main measurement method available. Detailed statements of functional user requirements (in terms of elementary fields, logical files, references to files, etc.) are still used to produce a measurement of the functional value of an application. Unfortunately, producing such a measure is quite costly and time consuming and requires very high professionalism in counters. In addition, there are often endless discussions between customers and suppliers about the complexity of Base Functional Components (BFC), due to the extreme detail of the elements to be used and the ambiguity of many counting rules when applied to actual systems. Production people are often forced to accept measurement as a necessary step but are unsatisfied with the subjectivity and cost of the measurement process. Essentially, analysts and programmers consider measurement an "unavoidable waste of time". The need for a simpler, faster and cheaper functional measurement method is there. On the other hand, there are a lot of studies, contracts and asset measures built using the IFPUG method, so it would be a pity to lose those resources. Simple Function Point (SiFP) is a new measurement method based on only two BFCs, which is totally compliant with the IFPUG one. All the resources and contractual frameworks developed for IFPUG are valid for Simple FP as well, starting from the ISBSG productivity database. The usage of the new method reduces cost, time and disputes, and the translation of an entire measured application portfolio is immediate.
Article
Full-text available
Regression analysis makes up a large part of supervised machine learning, and consists of the prediction of a continuous target from a set of other predictor variables. The difference between binary classification and regression is in the target range: in binary classification, the target can have only two values (usually encoded as 0 and 1), while in regression the target can have multiple values. Even though regression analysis has been employed in a huge number of machine learning studies, no consensus has been reached on a single, unified, standard metric to assess the results of the regression itself. Many studies employ the mean square error (MSE) and its rooted variant (RMSE), or the mean absolute error (MAE) and its percentage variant (MAPE). Although useful, these rates share a common drawback: since their values can range between zero and +infinity, a single value does not say much about the performance of the regression with respect to the distribution of the ground truth elements. In this study, we focus on two rates that actually generate a high score only if the majority of the elements of a ground truth group has been correctly predicted: the coefficient of determination (also known as R-squared or R²) and the symmetric mean absolute percentage error (SMAPE). After showing their mathematical properties, we report a comparison between R² and SMAPE in several use cases and in two real medical scenarios. Our results demonstrate that the coefficient of determination (R-squared) is more informative and truthful than SMAPE, and does not have the interpretability limitations of MSE, RMSE, MAE and MAPE. We therefore suggest the usage of R-squared as the standard metric to evaluate regression analyses in any scientific domain.
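For reference, the standard definitions of the two metrics discussed above are given below; note that several SMAPE variants exist, and the specific form used in the paper may differ from the one shown.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2},
\qquad
\mathrm{SMAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\frac{\left|\hat{y}_i - y_i\right|}{\left(\left|y_i\right| + \left|\hat{y}_i\right|\right)/2}
```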
Article
Full-text available
Project management planning and assessment are of great significance in project performance activities. Without a realistic and logical plan, it is not easy to handle project management efficiently. This paper presents a wide-ranging, comprehensive review of papers on the application of machine learning in software project management. In addition, it presents an extensive literature analysis of (1) machine learning, (2) software project management, and (3) related techniques, drawing on three main libraries: Web of Science, ScienceDirect, and IEEE Xplore. One hundred and eleven papers from these three repositories are divided into four categories. The first category contains research and survey papers on software project management. The second category includes papers based on machine-learning methods and strategies utilized in projects; the third category encompasses studies on the phases and tests that serve as parameters in machine-learning management; and the final category covers the results of the studies, their contribution to production, and the promotion of machine-learning project prediction. Our contribution also offers a more comprehensive perspective and a context that would be important for potential work in project risk management. In conclusion, we show that project risk assessment by machine learning is more successful in minimizing project losses, thereby increasing the likelihood of project success; it provides an alternative way to efficiently reduce project failure probabilities and increase the output ratio for growth, and it also facilitates analysis of software fault prediction based on accuracy.
Article
Full-text available
In this paper, two different architectures of Artificial Neural Networks (ANN) are proposed as an efficient tool for predicting and estimating software effort. Artificial Neural Networks, as a branch of machine learning, are used in estimation because they tend towards fast learning and give better and more accurate results. The search/optimization embraced here is motivated by the Taguchi method based on Orthogonal Arrays (an extraordinary set of Latin Squares), which has proved to be an effective tool in robust design. This paper aims to minimize the magnitude of relative error (MRE) in effort estimation by using Taguchi's Orthogonal Arrays, as well as to find the simplest possible architecture of an artificial neural network for optimized learning. A gradient-descent criterion has also been introduced to determine when to stop performing iterations. Given the importance of estimating software projects, our work aims to cover as many different values of actual project efficiency as possible across a wide range of projects, by means of division into clusters and a specific coding method, in addition to the aforementioned tools. In this way, the risk of estimation error can be reduced, increasing the rate of completed software projects.
Conference Paper
Full-text available
The “Early & Quick Function Points Approach (E&QFPA)” is a means of approximating the results of standard Functional Size Measurement methods such as IFPUG, SiFP or COSMIC. The E&QFPA is a set of concepts and procedures that, even when applied to non-detailed functional specifications of a software system, maintains the overall structure and the essential principles of standard functional size measurement methods. The E&QFPA combines different estimation approaches in order to provide better approximations of a software system's functional size: it makes use of both analogical and analytical classification of function types (transactions and data). Moreover, it allows the use of different levels of detail for different branches of the system (multilevel approach). This paper illustrates the basic concepts of the method, which is mature and well established in the Italian market, as well as the results of an empirical validation experiment conducted on a real business dataset of IFPUG function point measures. The usage of such a method may contribute to the rapid quantification of user requirements very early in the production life cycle.
Article
Full-text available
Estimating the work effort and the schedule required to develop and/or maintain a software system is one of the most critical activities in managing software projects. Software cost estimation is a challenging and onerous task. Estimation by analogy is one of the most convenient techniques in the software effort estimation field. However, the methodology commonly used for estimating software effort by analogy is not able to handle categorical data in an explicit and accurate manner. Different techniques have so far been used, such as regression analysis, mathematical derivation, simulation, neural networks, genetic algorithms, soft computing, fuzzy logic modelling, etc. This paper aims to utilize soft computing techniques to improve the accuracy of software effort estimation. In this approach, fuzzy logic is used with particle swarm optimization to estimate software development effort. The model has been calibrated on 30 projects taken from the NASA dataset. The results of this model are compared with COCOMO II and the Alaa Sheta model. The proposed model yields better results in terms of MMRE.
Conference Paper
Full-text available
Functional Size Measurement (FSM) has been a relevant approach to software management and estimation for decades. Despite the efforts to refine the measurement definitions and practices of FSM methods, real-world practitioners' needs have led to a variety of proposals for approximation techniques aimed at providing (Function Point) figures in the early phases of software projects and lifecycles, based on 'fuzzy' requirements (e.g. 'Size Classes', Estimated and Indicative, Quick & Early, Light and Early & Quick, Fast or Simple Function Points). This paper summarizes the most common approximate sizing approaches, provides an up-to-date comparison of their generic features, confidence levels and applicability, shows how most approaches are basically variations or instances of a single generic scheme (which can be derived from the so-called Smart Function Points), and introduces the latest evolution of such approximation techniques, for both the IFPUG and COSMIC methods, named 'EASY (Early & Speedy) Function Points'.
Article
Full-text available
Functional Size Measurement (FSM) methods like NESMA FPA (1) and COSMIC (2) can be applied in the standard way, in which the detailed counting rules and guidelines are followed, but measurements can also be carried out in an approximate way. The authors, who work for the department of Sizing, Estimating & Control of Sogeti Nederland B.V., have analyzed a large number of projects, measuring them with both the detailed methods and the approximate methods. This paper describes the accuracy percentages found for the different approximate methods as well as the difference in time spent for the various analysis methods. The study gives a detailed overview of the differences in accuracy between the methods and the differences in effort needed to perform such measurements. This results in an accuracy/cost trade-off that can be used by organizations to assess the measurement method that is required for a specific project.
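As a point of reference for the approximate methods mentioned above, the following sketch encodes the NESMA Indicative and NESMA Estimated (High-level FPA) approximations with the weights commonly reported in the literature; the weights should be treated as illustrative here and verified against the official NESMA/IFPUG documentation.

```python
# Sketch of the two NESMA approximations (weights as commonly reported; verify
# against the official NESMA documentation before relying on them).
def nesma_indicative(n_ilf: int, n_eif: int) -> int:
    """Indicative FPA: only data functions are counted, with fixed contributions."""
    return 35 * n_ilf + 15 * n_eif

def nesma_estimated(n_ilf: int, n_eif: int, n_ei: int, n_eo: int, n_eq: int) -> int:
    """Estimated FPA (High-level FPA): data functions rated Low, transactions Average."""
    return 7 * n_ilf + 5 * n_eif + 4 * n_ei + 5 * n_eo + 4 * n_eq

print(nesma_indicative(n_ilf=6, n_eif=2))             # 240 UFP (rough, very early)
print(nesma_estimated(6, 2, n_ei=12, n_eo=8, n_eq=5)) # 160 UFP (finer approximation)
```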
Article
Full-text available
Early & Quick Function Points analysis (E&QFP) was invented in 1997 by Roberto Meli in order to facilitate IFPUG Function Points estimation, and was first presented at the ESCOM 97 conference. Since then, the usage of the method has spread all over the world and E&QFP has probably become the most used functional size estimation method for Italian Public Administration software contracts. In 2004 DPO presented version 2.0 of the rules and the method was extended, experimentally, to the COSMIC Functional Size Measurement Method. After three years of experience with version 2.0 and ten years from the initial formulation, we now present the latest evolution of the method for IFPUG FPA, identified as version 3.0. In 2006 E&QFP became a registered trademark, but the method is still available for free in the public domain, since it is managed as a "Publicly Available Method". The DPO development team has opened the doors to external contributions and the method will evolve in the future, considering feedback from actual users in the market. A certification program was created in 2007 to guarantee that the method is used consistently among different practitioners. The present paper illustrates the basic concepts of functional size estimation and introduces the new version 3.0 of the Early & Quick FP method for the IFPUG context.
Conference Paper
Full-text available
Software effort prediction is an important task within software engineering. In particular, machine learning algorithms have been widely employed for this task, given their capability to provide accurate predictive models for the analysis of project stakeholders. Nevertheless, none of these algorithms has become the de facto standard for metrics prediction, given the particularities of different software projects. Among these intelligent strategies, decision trees and evolutionary algorithms have been continuously employed for software metrics prediction, though mostly independently of each other. A recent work has proposed evolving decision trees through an evolutionary algorithm, and applying the resulting tree in the context of software maintenance effort prediction. In this paper, we raise the search-space level of an evolutionary algorithm by proposing the evolution of a decision-tree algorithm instead of the decision tree itself, an approach known as a hyper-heuristic. Our findings show that the decision-tree algorithm automatically generated by a hyper-heuristic is capable of statistically outperforming state-of-the-art top-down and evolution-based decision-tree algorithms, as well as traditional logistic regression. The ability to generate a highly accurate, comprehensible predictive model is crucial in software projects, considering that it allows the stakeholder to properly manage the team's resources with improved confidence in the model predictions.
Article
Full-text available
Even though a number of software size and effort measurement methods have been proposed in the literature, they are not widely adopted in practice. According to the literature, only 30% of software companies use measurement, mostly as a method for additional validation. In order to determine whether the objective metric approach can give results of the same or better quality than estimates relying on work breakdown and expert judgment, we validated several standard functional measurement and analysis methods (IFPUG, NESMA, Mark II, COSMIC, and use case points) on a selected set of small and medium-sized real-world web-based projects at CMMI level 2. The evaluation performed in this paper provides objective justification and guidance for the use of measurement-based estimation in these kinds of projects.
Article
Full-text available
This paper provides the software estimation research community with a better understanding of the meaning of, and relationship between, two statistics that are often used to assess the accuracy of predictive models: the mean magnitude of relative error (MMRE) and the number of predictions within 25% of the actual, pred(25). It is demonstrated that MMRE and pred(25) are, respectively, measures of the spread and the kurtosis of the variable z, where z = estimate/actual. Thus, z is considered to be a measure of accuracy, and statistics such as MMRE and pred(25) to be measures of properties of the distribution of z. It is suggested that measures of the central location and skewness of z, as well as measures of spread and kurtosis, are necessary. Furthermore, since the distribution of z is non-normal, non-parametric measures of these properties may be needed. For this reason, box-plots of z are useful alternatives to simple summary metrics. It is also noted that simple residuals are better behaved than the z variable, and could also be used as the basis for comparing prediction systems.
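The quantities discussed above, written out for a set of n projects with actual values y_i and estimates ŷ_i (standard textbook definitions, not quoted verbatim from the paper):

```latex
z_i = \frac{\hat{y}_i}{y_i}, \qquad
\mathrm{MRE}_i = \frac{\lvert \hat{y}_i - y_i \rvert}{y_i}, \qquad
\mathrm{MMRE} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{MRE}_i, \qquad
\mathrm{pred}(25) = \frac{\bigl\lvert \{\, i : \mathrm{MRE}_i \le 0.25 \,\} \bigr\rvert}{n}
```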
Article
Full-text available
The appearance of the Function Point technique has allowed the ICT community to increase significantly the practice of software measurement, with respect to the use of the traditional "Lines of Code approach". A FP count, however, requires a complete and detailed level of descriptive documentation, like the Functional Specifications of the software system under measurement, to be performed. There are at least two situations in which having an estimation method, compatible but alternative to the standard rules for FP, could be decisive. The first case occurs when the development or enhancement project is in such an early phase that it is simply not possible to perform a FP count according to the IFPUG standards (i.e. in the Feasibility Study). The second case occurs when an evaluation of the existing software asset is needed, but the necessary documentation or the required time and resources to perform a detailed FP calculation are not available. Based on these and other analogous situ...
Article
Measuring Function Points following the standard process is sometimes long and expensive. To solve this problem, several early estimation methods have been proposed. Among these, the “NESMA Estimated” method is one of the most widely used; it has also been selected by the International Function Point User Group as the official early function point analysis method, under the name of ‘High-level FPA’ method. A large-scale empirical study has shown that the High-level FPA method—although sufficiently accurate—tends to underestimate the size of software. Underestimating the size of the software to be developed can easily lead to wrong decisions, which can even result in project failure. In this paper we investigate the reasons why the High-level FPA method tends to underestimate. We also explore how to improve the method to make it more accurate. Finally, we propose size estimation models built using different criteria and we evaluate the estimation accuracy of these new models. Our results show that it is possible to derive size estimation models from historical data using simple regression techniques: these models are slightly less accurate than those delivered by the High-level FPA method in terms of absolute estimation errors, but can be used earlier than the High-level FPA method, are cheaper, and do not underestimate software size.
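The abstract above mentions deriving size estimation models from historical data using simple regression techniques. A minimal sketch of that idea is shown below, regressing measured UFP on the counts of data and transactional functions; the feature layout and the toy numbers are assumptions, not the paper's data.

```python
# Minimal sketch of deriving a size estimation model from historical data via
# ordinary least squares; the feature layout and numbers are assumed.
import numpy as np
from sklearn.linear_model import LinearRegression

# Historical projects: [number of data functions, number of transactional functions]
# together with the UFP obtained by full-fledged FPA measurement (toy values).
X_hist = np.array([[5, 20], [8, 35], [3, 12], [12, 50], [6, 25], [10, 44]])
ufp_hist = np.array([150, 260, 90, 380, 190, 330])

model = LinearRegression().fit(X_hist, ufp_hist)
print("weights:", model.coef_, "intercept:", model.intercept_)
print("estimated UFP for a new project:", model.predict([[7, 30]])[0])
```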
Article
Context: Functional Size Measurement (FSM) methods, like Function Point Analysis (FPA) or COSMIC, are well-established approaches to estimating software size. Several approximations of these methods have recently been proposed, as they require less time/information to be applied; however, their effectiveness for effort prediction is not known. Objective: The effectiveness of approximated functional size measures for estimating development effort is a key open question, since an approximate sizing approach may fail to capture factors affecting the effort. Therefore, we empirically investigated the use of approximate FPA and COSMIC sizing approaches, also compared with their standard versions, for effort estimation. Method: We measured 25 industrial software projects realised by a single company by using FPA, COSMIC, two approximate sizing approaches proposed by IFPUG for FPA (i.e. High Level and Indicative FPA), and three approximate sizing approaches proposed by the COSMIC organisation for COSMIC (i.e. Average Functional Process, Fixed Size Classification, and Equal Size Band). Then we investigated the quality of the regression models built using the obtained measures to estimate development effort. Results: Models based on High Level FPA are effective, providing a prediction accuracy comparable to that of the original FPA, while those based on the Indicative FPA method show poor estimation accuracy. Models based on COSMIC approximate sizing methods are also quite effective; in particular, those based on the Equal Size Band approximation provided an accuracy similar to that of standard COSMIC. Conclusion: Project managers should be aware that predictions based on High Level FPA and standard FPA can be similar, making this approximation very interesting and effective, while Indicative FPA should be avoided. COSMIC approximations can also provide accurate effort estimates; nevertheless, the Fixed Size Classification and Equal Size Band approaches introduce subjectivity in the measurement.
Book
Software is one of the most important products in human history and is widely used by all industries and all countries. It is also one of the most expensive and labor-intensive products in human history. Software also has very poor quality that has caused many major disasters and wasted many millions of dollars. Software is also the target of frequent and increasingly serious cyber-attacks. Among the reasons for these software problems is a chronic lack of reliable quantified data. This reference provides quantified data from many countries and many industries based on about 26,000 projects developed using a variety of methodologies and team experience levels. The data has been gathered between 1970 and 2017, so interesting historical trends are available. Since current average software productivity and quality results are suboptimal, this book focuses on “best in class” results and shows not only quantified quality and productivity data from best-in-class organizations, but also the technology stacks used to achieve best-in-class results. The overall goal of this book is to encourage the adoption of best-in-class software metrics and best-in-class technology stacks. It does so by providing current data on average software schedules, effort, costs, and quality for several industries and countries. Because productivity and quality vary by technology and size, the book presents quantitative results for applications between 100 function points and 100,000 function points. It shows quality results using defect potential and DRE metrics because the number one cost driver for software is finding and fixing bugs. The book presents data on cost of quality for software projects and discusses technical debt, but that metric is not standardized. Finally, the book includes some data on three years of software maintenance and enhancements as well as some data on total cost of ownership.
Article
During the last two decades, there has been substantial research in the field of software estimation using machine learning algorithms, aimed at tackling the deficiencies of traditional and parametric estimation techniques, increasing project success rates, and aligning with modern development and project management approaches. Nevertheless, mostly due to inconclusive results and vague model-building approaches, there are few or no deployments in practice. The purpose of this article is to narrow the gap between up-to-date research results and implementations within organisations by proposing effective and practical machine learning deployment and maintenance approaches that draw on research findings and industry best practices. This was achieved by applying the ISBSG dataset, smart data preparation, an ensemble averaging of three machine learning algorithms (Support Vector Machines, Neural Networks and Generalized Linear Models), and cross-validation. The obtained models for effort and duration estimation are intended to provide a decision support tool for organisations that develop or implement software systems.
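A hedged sketch of an ensemble-averaging approach in the spirit of the abstract above (SVR, a small neural network, and a GLM averaged together, evaluated with cross-validation); the data here are synthetic, not the ISBSG dataset, and the hyperparameters are placeholders.

```python
# Ensemble averaging of SVR + neural network + GLM with cross-validation
# (synthetic data; an illustrative assumption, not the article's exact setup).
import numpy as np
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import TweedieRegressor
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(7)
size = rng.uniform(50, 1500, 120)
team = rng.integers(2, 15, 120)
X = np.column_stack([size, team])
effort = 10 * size**0.9 * (1 + 0.05 * team) * rng.lognormal(0, 0.25, 120)

ensemble = VotingRegressor([
    ("svr", make_pipeline(StandardScaler(), SVR(C=1000, epsilon=50))),
    ("nn", make_pipeline(StandardScaler(),
                         MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000,
                                      random_state=7))),
    ("glm", TweedieRegressor(power=0, alpha=0.0, max_iter=1000)),
])
mae = -cross_val_score(ensemble, X, effort, cv=5,
                       scoring="neg_mean_absolute_error").mean()
print(f"ensemble cross-validated MAE: {mae:.0f}")
```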
Article
Background: Several functional size measurement methods have been proposed. A few, like the IFPUG and COSMIC methods, are widely used, while others, like the Simple Function Points method, are interesting new proposals, which promise to deliver functional size measures via a faster and cheaper measurement process. Objectives: Since all functional size measurement methods address the measurement of the same property of software (namely, the size of functional specifications), it is expected that measures provided in a given measurement unit can be converted into a different measurement unit. In this paper, the convertibility of IFPUG Function Points, COSMIC Function Points, and Simple Function Points is studied. Method: Convertibility is analyzed statistically via regression techniques. Seven datasets, each containing measures of a set of software applications expressed in IFPUG Function Points, COSMIC Function Points and Simple Function Points, were analyzed. The components of functional size measures (usually known as Base Functional Components) were also involved in the analysis. Results: All the analyzed measures appear well correlated to each other. Statistically significant quantitative models were found for all the combinations of measures, for all the analyzed datasets. Several models involving Base Functional Components were found as well. Conclusions: From a practical point of view, the paper shows that converting measures from a given functional size unit into another one is viable. The magnitude of the conversion errors is reported, so that practitioners can evaluate whether the expected conversion error is acceptable for their specific purposes. From a conceptual point of view, the paper shows that the Base Functional Components of a given method can be used to estimate measures expressed in a different measurement unit: this seems to imply that different functional size measurement methods are 'structurally' strongly correlated.
Conference Paper
Simple Function Points were proposed as a lightweight alternative to standard Function Points. The idea at the base of the definition of Simple Function Points is that it is possible to get equally effective size measures even without considering fine-grained details of functional specifications. Skipping the analysis and measurement of such details should allow for saving time and effort when measuring the functional size of software. In a previous study, the Simple Function Point method was employed to size software applications included in the ISBSG repository, thus employing data from different companies. In this paper, we aim at getting further evidence of the qualities of Simple Function Points via the empirical study of 25 Web applications developed by a single software company. The study highlights the correlation between Simple Function Points and standard Function Points measures and the existence of a significant model that guarantees the convertibility between Simple Function Points and standard Function Points. Furthermore, our analysis shows that the two measures provide predictions of the effort to develop Web applications that are equally accurate. Thus, our results confirm the empirical evidence provided by the original investigation, highlighting that Simple Function Points appear essentially equivalent to Function Points, except that they require a much simpler measurement process.
Article
The features of the System Evaluation and Estimation of Resources - Software Engineering Model (SEER-SEM), a commercially available software project estimation model, are discussed. SEER-SEM is composed of a group of models working together to provide estimates of effort, duration, staffing, and defects. These models can be described by the questions they answer, such as sizing, technology, effort and schedule calculation, constrained effort/ schedule calculation, cost calculation and maintenance effort calculation. Supported sizing metrics for the model includes source lines of code (SLOC), function based sizing (FBS), and a range of other measures.
Article
Function point analysis (FPA) is a standardized method to systematically measure the functional size of software. This method is proposed by an international organization and it is currently recommended by governments and organizations as a standard method to be adopted for this type of measurement. This paper presents a compilation of improvements, focused on increasing the accuracy of the FPA method, which have been proposed over the past 13 years. The methodology used was a systematic literature review (SLR), which was conducted with four research questions aligned with the objectives of this study. As a result of the SLR, of the 1600 results returned by the search engines, 454 primary studies were preselected according to the criteria established for the SLR. Among these studies, only 18 specifically referred to accuracy improvements for FPA, which was the goal of this study. The low number of studies that propose FPA improvements might demonstrate the maturity of the method in the current scenario of software metrics. Specifically in terms of found issues, it was found that the step for calculating the functional size exhibited the highest number of problems, indicating the need to revise FPA in order to encompass the possible improvements suggested by the researchers.
Technical Report
[URL: http://cran.r-project.org/web/packages/effsize/effsize.pdf ]
Conference Paper
Simple Function Point is a functional size measurement method that can be used in place of IFPUG Function Point, but requires a much simpler –hence less time and effort consuming– measurement process. Simple Function Point was designed to be equivalent to IFPUG Function Point in terms of numerical results. This paper reports an empirical study aiming at verifying the effectiveness of Simple Function Point as a functional size measurement method, especially suitable to support estimation of software development effort. The data from a large popular public dataset were analyzed to verify the correlation of Simple Function Point with IFPUG Function Point, and the correlation of both size measures to development effort. The results obtained confirm, at a reasonable level of confidence, the hypothesis that Simple Function Point can be effectively used in place of IFPUG Function Point.
Article
Supervised machine learning is the search for algorithms that reason from externally supplied instances to produce general hypotheses, which then make predictions about future instances. In other words, the goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class labels to testing instances where the values of the predictor features are known, but the value of the class label is unknown. This paper describes various supervised machine learning classification techniques. Of course, a single article cannot be a complete review of all supervised machine learning classification algorithms (also known as induction classification algorithms), yet we hope that the references cited will cover the major theoretical issues, guiding the researcher in interesting research directions and suggesting possible bias combinations that have yet to be explored.
Article
The traditional function point approach has been well documented. Despite increasing popularity, investigations have shown a number of weaknesses. There is evidence that simple metrics may be as good as function points for early lifecycle estimates. This paper considers the use of a simplified approach to system size estimation that utilises function point elements. From this, the construction of a new model for producing function point estimates earlier in the development lifecycle is presented, together with results from application of the model to real project data.
Article
This paper examines the trade-off between the utility of the outputs of simplified functional sizing approaches and the effort required by these sizing approaches, through a pilot study. The goal of this pilot study was to evaluate the quality of the sizing output provided by NESMA's simplified size estimation methods, adapt their general principles to enhance their accuracy and extent of relevance, and empirically validate such an adapted approach using commercial software projects. A dataset of 11 projects was sized using this adapted approach, and the results were compared with those of the established Indicative, Estimated and Full NESMA approaches. The performance of these adaptations was evaluated against the NESMA approaches in three ways: (1) the effort to perform the sizing; (2) the accuracy of the total function counts produced; and (3) the accuracy of the profiles of the function counts for each of the base functional component types. The adapted approach outperformed the Indicative NESMA in terms of sizing accuracy, generally performed as well as the Estimated NESMA across both datasets, and required only ~50% of the effort incurred by the Estimated NESMA. This adapted approach, applied to varying levels of information presented in commercial requirements documentation, overcame some of the limitations of simplified functional sizing methods by providing more than simply a simplified 'indication' of overall functional size. The provision and refinement of the more detailed function profile enable a greater degree of validation and utility for the size estimate.
Article
Background: The measurement of Function Points is based on Base Functional Components. The process of identifying and weighting Base Functional Components is hardly automatable, due to the informality of both the Function Point method and the requirements documents being measured. So, Function Point measurement generally requires a lengthy and costly process. Objectives: We investigate whether it is possible to take into account only subsets of Base Functional Components so as to obtain functional size measures that simplify Function Points while providing the same effort estimation accuracy as the original Function Points measure. Simplifying the definition of Function Points would imply a reduction of measurement costs and may help spread the adoption of this type of measurement practice. Specifically, we empirically investigate the following issues: whether available data provide evidence that simplified software functionality measures can be defined in a way that is consistent with Function Point Analysis; whether simplified functional size measures by themselves can be used without any appreciable loss in software development effort prediction accuracy; and whether simplified functional size measures can be used as software development effort predictors in models that also use other software requirements measures. Method: We analyze the relationships between Function Points and their Base Functional Components. We also analyze the relationships between Base Functional Components and development effort. Finally, we build effort prediction models that contain both the simplified functional measures and additional requirements measures. Results: Significant statistical models correlate Function Points with Base Functional Components. Base Functional Components can be used to build models of effort that are equivalent, in terms of accuracy, to those based on Function Points. Finally, simplified Function Points measures can be used as software development effort predictors in models that also use other requirements measures. Conclusion: The definition and measurement processes of Function Points can be dramatically simplified by taking into account a subset of the Base Functional Components used in the original definition of the measure, thus allowing for substantial savings in measurement effort, without sacrificing the accuracy of software development effort estimates.
Article
This article presents a simplified method for counting function points. The method is a modification of Albrecht's well-known detailed function point counting method. One inhibitor to the widespread use of function points as a measure of information systems work output is the time required to complete the function point counting task. The simplified method described here was developed in the McDonnell Douglas Corporation and has the potential to reduce the time required for the counting task with no significant reduction in accuracy.
Article
Context: One of the difficulties faced by software development Project Managers is estimating the cost and schedule for new projects. Previous industry surveys have concluded that software size and cost estimation is a significant technical area of concern. In order to estimate cost and schedule, it is important to have a good understanding of the size of the software product to be developed. There are a number of techniques used to derive software size, with function points being amongst the most documented.
Article
Overfitting in model fitting for quantitative measurements is discussed. Two types of overfitting can be distinguished: using a model that is more flexible than it needs to be, and using a model that includes irrelevant components or predictors. Adding predictors that perform no useful function means that, whenever the regression is later used to make predictions, those predictors must still be measured and recorded so that their values can be substituted into the model. Adding irrelevant predictors can also make predictions worse, because the coefficients fitted to them add random variation to the subsequent predictions.
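A toy example (not taken from the cited work) of the second kind of overfitting: appending purely random, irrelevant predictors to a small regression typically degrades out-of-sample accuracy, as the sketch below illustrates.

```python
# Toy illustration of overfitting by irrelevant predictors: a regression with
# only the genuine predictor usually generalizes better than one that also
# includes 20 columns of pure noise fitted on a small training set.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
n_train, n_test = 30, 200

# One genuine predictor drives the response; everything else is noise.
x_train = rng.uniform(0, 10, size=(n_train, 1))
y_train = 3.0 * x_train[:, 0] + rng.normal(0, 1, n_train)
x_test = rng.uniform(0, 10, size=(n_test, 1))
y_test = 3.0 * x_test[:, 0] + rng.normal(0, 1, n_test)

# Irrelevant predictors: random noise appended to the genuine one.
noise_train = rng.normal(size=(n_train, 20))
noise_test = rng.normal(size=(n_test, 20))

lean = LinearRegression().fit(x_train, y_train)
bloated = LinearRegression().fit(np.hstack([x_train, noise_train]), y_train)

print("MAE, relevant predictor only  :", mean_absolute_error(y_test, lean.predict(x_test)))
print("MAE, with irrelevant predictors:",
      mean_absolute_error(y_test, bloated.predict(np.hstack([x_test, noise_test]))))
```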
Book
A benchmark text on software development and quantitative software engineering "We all trust software. All too frequently, this trust is misplaced. Larry Bernstein has created and applied quantitative techniques to develop trustworthy software systems. He and C. M. Yuhas have organized this quantitative experience into a book of great value to make software trustworthy for all of us." —Barry Boehm Trustworthy Systems Through Quantitative Software Engineering proposes a novel, reliability-driven software engineering approach, and discusses human factors in software engineering and how these affect team dynamics. This practical approach gives software engineering students and professionals a solid foundation in problem analysis, allowing them to meet customers' changing needs by tailoring their projects to meet specific challenges, and complete projects on schedule and within budget. Specifically, it helps developers identify customer requirements, develop software designs, manage a software development team, and evaluate software products to customer specifications. Students learn "magic numbers of software engineering," rules of thumb that show how to simplify architecture, design, and implementation. Case histories and exercises clearly present successful software engineers' experiences and illustrate potential problems, results, and trade-offs. Also featuring an accompanying Web site with additional and related material, Trustworthy Systems Through Quantitative Software Engineering is a hands-on, project-oriented resource for upper-level software and computer science students, engineers, professional developers, managers, and professionals involved in software engineering projects.
Conference Paper
The Early & Quick technique was originally proposed in 1997 for IFPUG Function Points, to size software in the early stages of the development process, when functional requirements have yet to be established in detail, and/or when a rapid, high-level measure of existing software is needed within limited time. The E&Q approach overcomes the typical lack of measurement detail and the requirements volatility of early project stages, providing a size estimate that contributes significantly to early project planning. The fundamental principles of the technique are classification by analogy, structured aggregation of functionality, and a multilevel approach, with statistical validation of the numerical ranges. Recently, the technique has evolved to fully comply with any functional size measurement method (ISO/IEC 14143:1998), so as to cover new-generation methods (e.g., COSMIC Full FP 2.2) and updated releases of existing methods (e.g., IFPUG FP 4.1 and 4.2). This paper describes the current technique release 2.0, application cases, validation results, supporting tools, and directions for further improvement.
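To make the classification-by-analogy and multilevel aggregation ideas concrete, the sketch below sums (min, likely, max) size ranges over requirements elements classified at different granularity levels. The level names and range values are invented placeholders, not the calibrated E&Q tables.

```python
# Rough sketch of the Early & Quick idea: elements are classified by analogy at
# different levels of granularity, each contributing a (min, likely, max) size
# range that is then aggregated. Level names and ranges are hypothetical
# placeholders, NOT the calibrated E&Q tables.
EQ_RANGES = {
    "generic_process":  (4, 6, 9),     # a single elementary process, type unknown
    "typical_process":  (10, 14, 19),  # a small cluster of related processes
    "general_process":  (25, 40, 60),  # a whole functional area
}

def early_and_quick_estimate(classified_items):
    """classified_items: list of level names; returns (min, likely, max) UFP totals."""
    totals = [0, 0, 0]
    for level in classified_items:
        for i, value in enumerate(EQ_RANGES[level]):
            totals[i] += value
    return tuple(totals)

print(early_and_quick_estimate(
    ["general_process", "typical_process", "generic_process", "generic_process"]))  # (43, 66, 97)
```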
Article
In order to plan, control, and evaluate the software development process, one needs to collect and analyze data in a meaningful way. Classical techniques for such analysis are not always well suited to software engineering data. Optimized set reduction (OSR), a pattern recognition approach for analyzing software engineering data that addresses many of the problems associated with the usual approaches, is described. Methods for using the technique for prediction, risk management, and quality evaluation are discussed. Experimental results demonstrate the effectiveness of the technique for the particular application of software cost estimation.
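As a crude illustration of the subset-extraction idea behind OSR (and not a faithful reproduction of the published algorithm), the sketch below predicts effort from the subset of past projects that match a new project on selected categorical predictors; all field names and data are hypothetical.

```python
# Heavily simplified illustration of subset extraction for effort prediction:
# keep only past projects that match the new project on selected categorical
# predictors, then predict from that subset. All data are hypothetical.
from statistics import median

history = [
    {"language": "COBOL", "team": "small", "effort": 2400},
    {"language": "COBOL", "team": "large", "effort": 5200},
    {"language": "Java",  "team": "small", "effort": 1500},
    {"language": "Java",  "team": "small", "effort": 1800},
    {"language": "Java",  "team": "large", "effort": 4100},
]

def predict_effort(new_project, predictors):
    subset = [p for p in history
              if all(p[k] == new_project[k] for k in predictors)]
    if not subset:  # fall back to the whole dataset when no past project matches
        subset = history
    return median(p["effort"] for p in subset)

print(predict_effort({"language": "Java", "team": "small"}, ["language", "team"]))  # 1650
```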
Sergio Di Martino, Filomena Ferrucci, Carmine Gravino, and Federica Sarro. 2020. Assessing the effectiveness of approximate functional sizing approaches for effort estimation.
IFPUG. 2021. Simple Function Point (SFP) Counting Practices Manual Release 2.1.
Ana Maria Bautista, Angel Castellanos, and Tomas San Feliu. 1993. Software Effort Estimation using Radial Basis Function Neural Networks. Information Theories & Applications 319 (1993).
DPO. 2012. Early & Quick Function Points Reference Manual - IFPUG version. Technical Report EQ&FP-IFPUG-31-RM-11-EN-P. DPO, Roma, Italy.
Luigi Lavazza. 2017. On the Effort Required by Function Point Measurement Phases. International Journal on Advances in Software 10, 1 & 2 (2017).
Luigi Lavazza and Geng Liu. 2019. An Empirical Evaluation of the Accuracy of NESMA Function Points Estimates.
Roberto Meli and Luca Santillo. 1999. Function Point Estimation Methods: A Comparative Overview. Citeseer, 6-8.
C. Tichenor. 1997. The IRS Development and Application of the Internal Logical File Model to Estimate Function Point Counts. In IFPUG Fall Conference.
Marco Torchiano et al. 2017. effsize: Efficient effect size computation. R package version 0
Lee Fischman, Karen McRitchie, and Donald D. Galorath. 2005. Inside SEER-SEM. CrossTalk 18, 4 (2005).