Article

Software Engineering Economics

Authors: Barry W. Boehm

Abstract

A summary is presented of the current state of the art and recent trends in software engineering economics. It provides an overview of economic analysis techniques and their applicability to software engineering and management. It surveys the field of software cost estimation, including the major estimation techniques available, the state of the art in algorithmic cost models, and the outstanding research issues in software cost estimation.

... Within the extensive array of estimation models, algorithmic constructs such as the Constructive Cost Model (COCOMO) and its progeny, COCOMO II, continue to be favoured for their simplicity and versatile applicability across diverse project stages [3], [4]. Nevertheless, the accuracy of these models is heavily reliant on the precision of input parameters, a dependency that can engender significant errors in estimation. ...
... The COnstructive COst MOdel (COCOMO) was developed from 63 sample projects and published in 1981 by Barry W. Boehm. According to [3], Boehm proposed three forms of COCOMO: basic, intermediate, and detailed. The cost is estimated from various input parameters, depending on the form of the model. ...
... These parameters include project size measured in thousands of lines of source code (KLOC), 15 cost driver attributes, and calibration constants (a, b, c, d), whose values depend on the project's mode (organic, semidetached, or embedded) and are listed in Table 1. Organic mode assumes extensive experience with similar projects and a thorough understanding of the project goals, and it tolerates changes in requirements and other technical specifications [3]. In contrast, embedded mode assumes only moderate experience and a general understanding of the project objectives, and it demands strict compliance with the requirements and other specifications. ...
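As a concrete companion to the snippet above, the following minimal Python sketch computes Basic COCOMO effort and schedule from KLOC using the commonly published mode-dependent constants (a, b, c, d) from Boehm (1981); the 32 KLOC example project is invented.

```python
# Basic COCOMO (Boehm, 1981): Effort = a * KLOC**b (person-months),
# Schedule = c * Effort**d (months). Constants depend on the project mode.
COCOMO_MODES = {
    # mode: (a, b, c, d) -- standard Basic COCOMO constants
    "organic":      (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded":     (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(kloc: float, mode: str = "organic"):
    """Return (effort in person-months, schedule in months) for a project."""
    a, b, c, d = COCOMO_MODES[mode]
    effort = a * kloc ** b
    time = c * effort ** d
    return effort, time

if __name__ == "__main__":
    # Hypothetical 32 KLOC project, estimated in each mode.
    for mode in COCOMO_MODES:
        effort, time = basic_cocomo(32, mode)
        print(f"{mode:13s} effort = {effort:6.1f} PM, schedule = {time:5.1f} months")
```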
Article
Full-text available
This study presents a comprehensive analysis of enhancing software effort estimation accuracy using a Self-Organizing Migration Algorithm (SOMA)-optimized Constructive Cost Model (COCOMO). By conducting a comparative analysis of traditional COCOMO models and SOMA-optimized variants across preprocessed datasets (NASA93, NASA63, NASA18, Kemerer, Miyazaki94, and Turkish), our research focuses on crucial evaluation metrics including Mean Magnitude of Relative Error (MMRE), Prediction at 0.25 (PRED(0.25)), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE). The analysis encompasses various configurations of COCOMO models—basic, intermediate, and post-architecture COCOMO II, supplemented with additional statistical testing and residual analysis for in-depth insights. The results demonstrate that the SOMA-optimized COCOMO models generally surpass traditional models in predictive accuracy, especially notable in metrics such as MMRE where an improvement of up to 12%, PRED(0.25) with an enhancement of 15%, MAE reduction by 18%, and a decrease in RMSE by 20% were observed. However, performance variances were identified in specific scenarios, highlighting areas for further refinement, particularly in large-scale estimations where residual plots suggested the potential for underestimation or overestimation. The study concludes that integrating the SOMA optimization algorithm into COCOMO models significantly enhances the accuracy of software effort estimations, providing valuable insights for future research to optimise estimations for larger projects and advance prediction models. This advancement addresses the technical challenge of parameter accuracy and offers a methodological improvement in model selection and application, underscoring the potential of metaheuristic optimization in software effort estimation.
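Since the abstract above evaluates models with MMRE, PRED(0.25), MAE, and RMSE, a short sketch of these standard accuracy measures may help; the formulas are the conventional ones, and the sample effort vectors are invented for illustration.

```python
import numpy as np

def evaluation_metrics(actual, predicted, level=0.25):
    """Standard SDEE accuracy measures for actual vs. predicted effort."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    mre = np.abs(actual - predicted) / actual        # magnitude of relative error
    return {
        "MMRE": mre.mean(),                          # mean MRE
        f"PRED({level})": np.mean(mre <= level),     # share of estimates within 25%
        "MAE": np.abs(actual - predicted).mean(),    # mean absolute error
        "RMSE": np.sqrt(np.mean((actual - predicted) ** 2)),
    }

# Invented example: actual vs. estimated effort in person-months.
print(evaluation_metrics([120, 45, 300, 60], [100, 50, 350, 58]))
```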
... Over the years, various SCE techniques have been examined and are broadly classified by Boehm et al. [4] into six categories: expert judgment methods like the Delphi technique [5]; parametric models such as constructive cost model (COCOMO) [6]; regression-based methods [7]; machine learning techniques [8]; analogy-based estimation [9]; dynamics-based models [10]; and composite approaches [11,12]. Expert judgment methods involve the Delphi technique, a structured communication technique that relies on a panel of experts to derive a consensus. ...
... Contemporary SCE techniques are broadly categorized into three main types [6,15,17,23]: ...
... Single-variable models, represented by equations of the form $\mathrm{Effort} = \alpha \cdot (\mathrm{size})^{\beta}$, rely on a single key parameter such as software size [25,26]. Multivariable models employ multiple independent variables and are represented as $\mathrm{Effort} = \alpha_1 c_1^{\beta_1} + \alpha_2 c_2^{\beta_2} + \cdots + \alpha_n c_n^{\beta_n}$; these models are used to estimate software cost or effort [4,6]. However, algorithmic models have limitations, including the need for early-stage calibration, the inability to handle exceptions, and susceptibility to imprecise estimates caused by inaccurate project characteristics [27][28][29][30][31][32]. ...
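To make the single-variable form Effort = α·(size)^β concrete, a model of this shape can be calibrated on historical projects by ordinary least squares in log space; the sketch below uses invented (size, effort) pairs and is not tied to any particular dataset mentioned above.

```python
import numpy as np

# Invented historical projects: (size in KLOC, effort in person-months).
size = np.array([10, 23, 46, 70, 120], dtype=float)
effort = np.array([39, 98, 215, 350, 640], dtype=float)

# Effort = alpha * size**beta  <=>  log(Effort) = log(alpha) + beta*log(size),
# so a least-squares line fit in log space recovers the two parameters.
beta, log_alpha = np.polyfit(np.log(size), np.log(effort), 1)
alpha = np.exp(log_alpha)

print(f"calibrated model: Effort = {alpha:.2f} * size^{beta:.2f}")
print(f"prediction for a 60 KLOC project: {alpha * 60**beta:.0f} person-months")
```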
Article
Full-text available
Software development effort estimation (SDEE) is critical for predicting the required resource investment. Since the late 2000s, numerous studies have advocated using computational intelligence (CI) to enhance the precision of SDEE models (CI-SDEE). However, a systematic examination of empirical evidence surrounding CI-SDEE is lacking. Therefore, we conducted a meticulous and systematic literature review across four dimensions: CI technique classification, estimation fidelity, model comparative analysis, and contextual applicability. Surveying empirical studies published between 2008 and 2023, we identified 38 seminal works germane to our research objectives. Our investigation revealed five distinct CI technique families utilized in SDEE, exhibiting an overall estimation accuracy commensurate with the acceptable standards and superior to non-CI counterparts. In addition, we determined that specific CI models exhibit unique advantages and disadvantages, which make them better suited to specific estimation contexts. While CI techniques have been proven promising in advancing the field of SDEE, their industrial applications are limited, necessitating additional efforts to foster their adoption. This review provides actionable academic recommendations and operational guidelines for practitioners.
... The second analysis was a common session between the raters conducted to reach a consensus on their rating, using the well-established Wideband Delphi method [5,16]. In cases of diverging assessments, each rater presented arguments for their estimate, leading to a discussion, a repeated inspection, and a reconsideration of the information sources, until a final assessment was mutually agreed upon. ...
... [4] Significant: Like moderate, but the higher severity is evident through additional data- or observer-triangulation. [5] Serious: Agreement on the (recurrent) severe manifestation of a symptom by observer triangulation (often all raters) and/or data triangulation. ...
... The ground truth vector κ can, therefore, be scaled into a weight vector by normalizing it through the division of its sum (3). A mixture for some same activity ACT across all projects (4) is then created as a weighted sum (5). ...
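The normalization and weighted-sum steps referred to as equations (3)–(5) in the snippet are not reproduced here, but the basic operation — scaling a ground-truth vector into weights and mixing per-project activity curves with them — can be sketched as follows; the vectors are placeholders.

```python
import numpy as np

# Placeholder ground-truth severity scores for n projects (cf. the vector kappa).
kappa = np.array([3.0, 1.0, 4.0, 2.0])
weights = kappa / kappa.sum()            # normalize so the weights sum to 1

# Placeholder per-project activity curves (rows: projects, columns: time bins).
activity = np.array([
    [0.1, 0.4, 0.5],
    [0.3, 0.3, 0.4],
    [0.0, 0.2, 0.8],
    [0.2, 0.5, 0.3],
])

# Weighted mixture of the same activity across all projects.
mixture = weights @ activity
print(weights, mixture)
```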
Article
Background: Nowadays, expensive, error-prone, expert-based evaluations are needed to identify and assess software process anti-patterns. Process artifacts cannot be automatically used to quantitatively analyze and train prediction models without exact ground truth. Aim: Develop a replicable methodology for organizational learning from process (anti-)patterns, demonstrating the mining of reliable ground truth and exploitation of process artifacts. Method: We conduct an embedded case study to find manifestations of the Fire Drill anti-pattern in n=15 projects. To ensure quality, three human experts agree. Their evaluation and the process’ artifacts are utilized to establish a quantitative understanding and train a prediction model. Results: Qualitative review shows many project issues. (i) Expert assessments consistently provide credible ground truth. (ii) Fire Drill phenomenological descriptions match project activity time (for example, development). (iii) Regression models trained on approx. 12–25 examples are sufficiently stable. Conclusion: The approach is data source-independent (source code or issue-tracking). It allows leveraging process artifacts for establishing additional phenomenon knowledge and training robust predictive models. The results indicate the aptness of the methodology for the identification of the Fire Drill and similar anti-pattern instances modeled using activities. Such identification could be used in post mortem process analysis supporting organizational learning for improving processes.
... closest fit formula to actual experience. Some of the well-known algorithmic models are Boehm's COCOMO '81 and COCOMO II [2], Albrecht's Function Points [3], and Putnam's SLIM [1]. All of them require as inputs accurate estimates of specific attributes, such as lines of code (LOC), number of user screens, interfaces, complexity, etc., which are not easy to acquire during the early stages of software development. ...
... Non-parametric (non-algorithmic) models are based on fuzzy logic (FL), artificial neural networks (ANN), and evolutionary computation (EC). The term groups together a set of techniques that model some facets of the human mind [3], for example regression trees, rule induction, fuzzy systems, genetic algorithms, artificial neural networks, Bayesian networks, and evolutionary computation. Tools based on ANNs have gained increasing popularity due to their inherent capability to approximate any nonlinear function to a high degree of accuracy. ...
... The reason for using FLANN is that the multilayer feedforward network (MLP), although the ANN most commonly used in software cost estimation models, is trained with a supervised learning technique called backpropagation. Due to its multi-layered structure, training is typically much slower than for single-layer feedforward networks [3]. Problems such as trapping in local minima, overfitting, and weight interference also make MLP training challenging [8]. ...
Article
Full-text available
Software cost estimation predicts the amount of effort and development time required to build any software system. There are a number of cost estimation models, and each has its own pros and cons in estimating development cost and effort. Recently, the use of meta-heuristic techniques for software cost estimation has been growing. A prerequisite of relatively accurate estimation is work experience; therefore, the risk associated with software projects is based on preliminary estimates, and as the complexity and size of projects increase, uncertainty about the initial plan grows and the risk becomes higher. In this paper, we propose an approach that consists of a Functional Link ANN (FLANN) with the Harmony Search algorithm as its training algorithm. FLANN reduces the computational complexity of a multilayer neural network: it has no hidden layer and has a fast learning ability. In our work, MRE, MMRE, and MdMRE were selected as the performance indices to gauge the quality of prediction of the proposed model. Extensive evaluation of the results shows that training a FLANN with the Harmony Search algorithm for the software cost prediction problem yields a highly improved set of results. In addition, the proposed model is structurally simple and requires less computation during training, as it contains no hidden layer.

Introduction. Software cost estimation is the prediction of the amount of effort and time required to develop a software project. Software development effort covers the hours of work and the number of workers, in terms of staffing levels, needed to develop a software project. Every project manager is in a quest for better estimates of software cost so that they can evaluate project progress, maintain cost control and delivery accuracy, give the organization better insight into resource utilization, and consequently schedule future projects better. The software effort prediction facet of the software development project life cycle is exercised at a very early stage of the life cycle. However, estimating a software development project at this time is most difficult because, in addition to the sparse information available about the project and the product at this stage, the information that is available is likely to be too vague for estimation purposes. Estimating software development effort therefore remains an intricate problem and continues to attract researchers' interest. A range of cost estimation techniques has been proposed so far and classified into various broad categories, which will be discussed in later sections. Variations of a range of artificial neural network models have also been developed for this purpose in addition to the previously used algorithmic models. In this paper, our focus is on a special variant of ANN called the Functional Link Artificial Neural Network (FLANN), which is in fact a high-order artificial neural network model, and in particular on its training algorithm, the Harmony Search algorithm, for the purpose of achieving a better software cost estimation model.

The rest of the paper is organised as follows. In Section 2, the evolution of software cost estimation using both algorithmic and non-algorithmic models is discussed, and the characteristics of the COCOMO II model are presented. The FLANN architecture and the training algorithm implemented are presented in Section 3. In Section 4, the datasets and evaluation criteria are discussed. Section 5 reports the experimental results and their analysis. Finally, Section 6 concludes the paper along with future work.
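The FLANN described above has no hidden layer: the inputs are expanded by fixed (e.g., trigonometric) basis functions and a single linear output layer is trained. The sketch below shows that structure, with an ordinary least-squares fit standing in for the Harmony Search training used in the paper; the expansion order and data are illustrative assumptions.

```python
import numpy as np

def functional_expansion(x, order=2):
    """Trigonometric functional expansion of each input feature (FLANN-style)."""
    feats = [x]
    for k in range(1, order + 1):
        feats.append(np.sin(k * np.pi * x))
        feats.append(np.cos(k * np.pi * x))
    return np.hstack(feats)

# Invented, normalized project features (e.g., size, complexity) and effort.
X = np.random.default_rng(0).random((30, 2))
y = 2.0 * X[:, 0] + 0.5 * np.sin(np.pi * X[:, 1]) + 0.1

Phi = functional_expansion(X)                       # expanded feature matrix
Phi = np.hstack([np.ones((len(Phi), 1)), Phi])      # bias term
# Single-layer output weights; least squares stands in for Harmony Search here.
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
pred = Phi @ w
print("training MMRE:", np.mean(np.abs(y - pred) / np.abs(y)))
```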
... The problem was that in the second decade of PROMISE, many researchers still continue that kind of first-decade research. For example, all too often, I must review papers from authors who think it is valid to publish results based on (e.g.) the COC81 data set first published in 1981 [6]; the DESHARNIS data set, first published in 1989 [16]; the JM1 data, first published in 2004 [38]; or the XALAN data set, first published in 2010 [27] 2 . ...
... Figure 1 comes from Cruz et al. [14]. That figure shows the effects of 10,000 different hyperparameter options applied to five machine learning algorithms (random forest; LinReg; boosted trees; decision trees; feed-forward NN) 6 . Adjusting tunings can change learners from low to high accuracies and fairness (measured here as the ratio of false positives between different social groups, such as men and women). ...
... with: 5 Specifically, after incremental active learning, the SVM had under 300 support vectors. 6 The hyperparameters of Random Forest learners include (a) how many trees to build (e.g., ∈ {10, 20, 40, 80, 160}); (b) how many features to use in each tree (e.g., ∈ {2, 4, 10, 20, sqrt, log2, all}); (c) how to poll the whole forest (e.g., majority or weighted majority); (d) which impurity measure to use (e.g., gini, entropy, or log-loss); (e) the minimum number of examples needed to branch a sub-tree (e.g., ∈ {2, 5, 10, 20, 50, 100}); and (f) whether branches should be binary or n-ary. ...
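The footnote's list of Random Forest hyperparameters maps directly onto scikit-learn parameters; a minimal grid-search sketch over a subset of those options is shown below (the dataset is synthetic, and the grid is truncated for brevity).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# A subset of the hyperparameters listed in the footnote above.
param_grid = {
    "n_estimators": [10, 20, 40, 80, 160],        # (a) how many trees to build
    "max_features": ["sqrt", "log2", None],       # (b) features per tree
    "criterion": ["gini", "entropy"],             # (d) impurity measure
    "min_samples_split": [2, 5, 10, 20],          # (e) min examples to branch
}

search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3, n_jobs=-1)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```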
Preprint
Full-text available
In the past, humans have had difficulty accurately assessing complex models, leading to unreliable and sometimes dangerous results. To reduce the cognitive load on humans, large models must be simplified and summarized into smaller ones. Data mining has proven to be an effective tool for finding useful, concise models. Therefore, the PROMISE community has the necessary skills and experience to redefine, simplify, and improve the relationship between humans and AI.
... For example, a vague requirements specification may lead to incorrect or missing features and reduced customer acceptance [7]. These quality defects are more expensive to fix the later they are addressed [8]: Revising a vague requirements specification is less expensive than redeveloping a faulty system built on it. Therefore, organizations aim to detect and remove requirements quality defects as early as possible [9]. ...
... We selected the book "Software Process Definition and Management" by Münch et al. [33] as a reliable summary of software process literature. The first author reviewed the descriptions of all seven lifecycle models, which cover the waterfall model [34], iterative enhancement [8], prototyping, the spiral model [35], the incremental commitment spiral model [36], Unified Process [37], and Cleanroom Development [38]. The first author extracted all textual mentions of requirements-affected activities and their attributes as prescribed by the lifecycle model. ...
Preprint
Requirements engineering aims to fulfill a purpose, i.e., inform subsequent software development activities about stakeholders' needs and constraints that must be met by the system under development. The quality of requirements artifacts and processes is determined by how fit for this purpose they are, i.e., how they impact activities affected by them. However, research on requirements quality lacks a comprehensive overview of these activities and how to measure them. In this paper, we specify the research endeavor addressing this gap and propose an initial model of requirements-affected activities and their attributes. We construct a model from three distinct data sources, including both literature and empirical data. The results yield an initial model containing 24 activities and 16 attributes quantifying these activities. Our long-term goal is to develop evidence-based decision support on how to optimize the fitness for purpose of the RE phase to best support the subsequent, affected software development process. We do so by measuring the effect that requirements artifacts and processes have on the attributes of these activities. With the contribution at hand, we invite the research community to critically discuss our research roadmap and support the further evolution of the model.
... According to Boehm et al. [9], software effort estimation techniques fall into six distinct categories; later, these were grouped into three categories: expert judgment, algorithmic models, and machine learning approaches. Conventional algorithmic models, such as the constructive cost model (COCOMO) [10], software life cycle management (SLIM) [11], and Function Point Analysis (FPA) [12], perform estimations based on a statistical analysis of project input data, which implies that the effort is computed by a mathematical model from the numerical inputs of one or more projects. In the case of expert judgment, the Delphi [13,14] and work breakdown structure [15] methods rely on the experience of a domain expert to estimate the required effort. ...
... Several estimation techniques have been developed to enhance the accuracy of SDEE due to its importance in the development process. Traditional models like COCOMO [10], FPA [12], the System Evaluation and Estimation of Resources-Software Estimation Model (SEER-SEM) [42], and SLIM [11] rely on statistical analysis of project input data, meaning that effort is calculated through mathematical models based on numerical inputs from one or more projects. In contrast, the Delphi expert estimation [13,14] and work breakdown structure [15] models rely on expert judgment for effort estimation. ...
Article
Full-text available
Accurate software development effort estimation (SDEE) is vital for effective project planning. Due to the complex nature of software projects, estimating development effort has become a challenging task that requires careful consideration, especially in the early project phases, to prevent overestimation and underestimation. Accurate effort estimation helps to estimate the cost of a developing project through effective resource management and project budgeting for manpower. Despite numerous effort estimation models introduced over the past two decades, achieving a satisfactory level of accuracy remains elusive. The adaptive neuro-fuzzy inference system (ANFIS) model gains more popularity for estimation tasks due to its rapid learning capacity, ability to represent complex nonlinear structures, and adaptability to improperly specified data. This study presents a model called the Two-Stage optimization technique for Software Development Effort Estimation (TSoptEE). Initially, it performs feature selection through a multi-objective improved binary social network search (SNS) algorithm and then optimizes the ANFIS tunable parameters through an improved SNS algorithm to enhance the accuracy of SDEE. The proposed TSoptEE model is compared against existing estimation models and evaluated using seven performance measures over nine software datasets. The obtained results are promising in terms of accuracy and statistical significance tests. This implies that the proposed model can significantly enhance the accuracy of effort estimation.
... Validation is focused on the external view of the product: "are we building the right product?"[4] 2 Verification is focused on meeting the product specifications: "are we building the product right?"[4] ...
Thesis
TOPdesk is a service management software provider in a wide variety of domains and industries. TOPdesk also offers consultancy to their customers that aims to continuously assess and improve the customer’s experience and service efficiency. TOPdesk offers a Mini Health Check (MHC) to their customers in which a consultant analyzes how efficiently the customer uses their software based on six Key Performance Indicators (KPI). However, the process of creating an MHC report is very time-consuming as it requires performing a lot of manual steps. Also, the norms used for the KPIs provide little meaning as they are arbitrarily chosen and not specific to the customer’s industry. This report aims to improve the current process of performing an MHC. Research has been done on how the MHC is performed, identifying the suitable technologies and learning the currently existing infrastructure that helped us pave the way to create our product. During our project we managed to create a product that automates the MHC. Through user testing we found that this process now takes about two minutes, where the manual process took about two hours. To create more meaningful norms for the KPIs, we also implemented a benchmarking feature. This allows a company to compare the results of their MHC to other TOPdesk customers in the same sector, country or of similar size. We have some recommendations for TOPdesk for the further development of our product. The MHC process could be streamlined in a few ways, most importantly with respect to the process for getting access to customer data. Benchmarking could become even more useful if data can be more easily gathered from more TOPdesk customers.
... "Data-oriented" models are subclassified into "proprietary" and "non-proprietary"; the latter are in turn subdivided into those that are "model-based" and those that are "analogy-based". In all cases, "hybrid" categories combining the ones already mentioned can be defined.⁴ Regarding the main methods currently in use, there are several syntheses that account for the "state of the art", for example [39,52,28].⁵ A trait inherent to all engineering disciplines is the use of scientifically grounded (that is, objective) methods and models for analysing their problems; in the case of software effort estimation, it is natural to ask why "expert opinion" is mostly preferred, opening the door to subjectivity. ...
... A pointed critique can be found in [23], where the author asks with irony: in how many ways can 'machine learning' be done on datasets containing fewer than 100 rows? ³ For a discussion of the taxonomies used in the literature, see [50]. ⁴ "Proprietary" methods are marketed without disclosing the details of their implementation to the public (except, perhaps, the general ideas of how they work). "Analogy-based" methods essentially consist of consulting previous development projects that are similar (analogous) to what is to be built: the estimate corresponds to an extrapolation of the effort required by those earlier developments. ...
Article
Full-text available
A critical synthesis of the most representative models for software development project effort estimation is provided. This work is a basis for a discussion of the methodological and practical challenges that the effort estimation field entails, especially in the mathematical/statistical modelling fundamentals and their empirical verification in the software industry.
... Hence, the construction of accurate methods for Software Development Effort Estimation (SDEE) represents a continuous activity of researchers and software designers. In the literature, several SDEE techniques exist, which can be partitioned into three categories [58], [38], [42]: (1) expert judgment, where the process of estimating a new software project's effort is conducted by a project estimator based on his or her domain knowledge; (2) algorithmic models, the most popular category of SDEE techniques [11], which include COCOMO [9], SLIM [53], and SEER-SEM [26]; and (3) machine learning, which is increasingly used instead of algorithmic models and includes Artificial Neural Networks (ANNs) [24], Decision Trees (DT) [12], Support Vector Machines (SVM) [51], etc. Machine learning (ML) models are very effective in the SDEE field [8], [59]. ...
Article
Full-text available
Software Development Effort Estimation (SDEE) can be interpreted as a set of efforts to produce a new software system. To increase estimation accuracy, researchers have provided various machine learning regressors for SDEE. Kernel Ridge Regression (KRR) has demonstrated good potential for solving regression problems as a powerful machine learning technique. The Gravitational Search Algorithm (GSA) is a metaheuristic method that seeks the optimal solution to complex optimization problems among a population of solutions. In this article, a hybrid GSA algorithm is presented that combines binary-valued GSA (BGSA) and real-valued GSA (RGSA) in order to optimize the KRR parameters and select an appropriate subset of features to enhance the estimation accuracy of SDEE. Two benchmark datasets in the software projects domain are considered for assessing the performance of the proposed method and similar methods in the literature. The experimental results on the Desharnais and Albrecht datasets confirm that the proposed method significantly increases the accuracy of estimation compared with some recently published methods in the SDEE literature.
... Such units as object points [2], use-case points [17], story points [5][6][7][8], etc., are derived from the concept of function points. Another well-known development effort measurement unit is logical lines of code (the so-called SLOC metric) used by COCOMO [18] and COCOMO II [2]. The third category is working-time-based (e.g., used in PERT [3,4]), expressing development effort in man-hours, man-days, man-months, etc. ...
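For the working-time-based units mentioned last, the classical PERT three-point formula gives an expected effort and a standard deviation from optimistic, most likely, and pessimistic estimates; the sketch below uses the standard weights, and the task numbers are invented.

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Classical PERT expected value and standard deviation (same unit as inputs)."""
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

# Invented task estimates in man-days: (optimistic, most likely, pessimistic).
tasks = [(3, 5, 10), (8, 12, 20), (1, 2, 4)]
totals = [pert_estimate(o, m, p) for o, m, p in tasks]
total_effort = sum(e for e, _ in totals)
print(f"expected total effort: {total_effort:.1f} man-days")
```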
Article
Full-text available
In the early software development stages, the aim of estimation is to obtain a rough understanding of the timeline and resources required to implement a potential project. The current study is devoted to a method of preliminary estimation applicable at the beginning of the software development life cycle when the level of uncertainty is high. The authors’ concepts of the estimation life cycle, the estimable items breakdown structure, and a system of working-time balance equations in conjunction with an agile-fashioned sizing approach are used. To minimize the experts’ working time spent on preliminary estimation, the authors applied a decision support procedure based on integer programming and the analytic hierarchy process. The method’s outcomes are not definitive enough to make commitments; instead, they are supposed to be used for communication with project stakeholders or as inputs for the subsequent estimation stages. For practical usage of the preliminary estimation method, a semistructured business process is proposed.
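The analytic hierarchy process mentioned above derives priority weights from a pairwise comparison matrix, typically via its principal eigenvector; a minimal sketch of that step, with an invented 3x3 comparison matrix, follows.

```python
import numpy as np

# Invented pairwise comparison matrix on Saaty's 1-9 scale:
# A[i, j] says how much more important item i is than item j.
A = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
])

# AHP priorities: principal eigenvector of A, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(A)
principal = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
weights = principal / principal.sum()

# Consistency check uses the principal eigenvalue (random index RI = 0.58 for n = 3).
lambda_max = np.max(np.real(eigvals))
ci = (lambda_max - len(A)) / (len(A) - 1)
print("weights:", np.round(weights, 3), " CR:", round(ci / 0.58, 3))
```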
... to improve cost control [8]. Compared to timely corrective activities conducted during the requirements phase, delays in correcting requirements may result in up to 200-times-higher costs [9]. Therefore, it is particularly urgent to achieve the accurate and speedy automatic extraction of requirement dependency relationships. ...
Article
Full-text available
To address the cost and efficiency issues of manually analysing requirement dependency in requirements engineering, a requirement dependency extraction method based on part-of-speech features and an improved stacking ensemble learning model (P-Stacking) is proposed. Firstly, to overcome the problem of singularity in the feature extraction process, this paper integrates part-of-speech features, TF-IDF features, and Word2Vec features during the feature selection stage. The particle swarm optimization algorithm is used to allocate weights to part-of-speech tags, which enhances the significance of crucial information in requirement texts. Secondly, to overcome the performance limitations of standalone machine learning models, an improved stacking model is proposed. The Low Correlation Algorithm and Grid Search Algorithms are utilized in P-stacking to automatically select the optimal combination of the base models, which reduces manual intervention and improves prediction performance. The experimental results show that compared with the method based on TF-IDF features, the highest F1 scores of a standalone machine learning model in the three datasets were improved by 3.89%, 10.68%, and 21.4%, respectively, after integrating part-of-speech features and Word2Vec features. Compared with the method based on a standalone machine learning model, the improved stacking ensemble machine learning model improved F1 scores by 2.29%, 5.18%, and 7.47% in the testing and evaluation of three datasets, respectively.
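The improved stacking model described above combines several base learners under a meta-learner; a bare-bones scikit-learn stacking sketch over TF-IDF features of requirement-like sentences is given below. The tiny corpus and labels are invented, and the part-of-speech/Word2Vec features and particle swarm weighting of the paper are not reproduced.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Invented requirement pairs flattened to text, with dependency labels (1/0).
texts = [
    "the login service requires the user database schema",
    "report export depends on the PDF rendering module",
    "the colour theme is unrelated to the payment gateway",
    "audit logging is independent of the search index",
]
labels = [1, 1, 0, 0]

base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("svm", LinearSVC()),
]
model = make_pipeline(
    TfidfVectorizer(),
    StackingClassifier(estimators=base_learners,
                       final_estimator=LogisticRegression(), cv=2),
)
model.fit(texts, labels)
print(model.predict(["billing depends on the user database"]))
```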
... Software Engineering Artifacts Can Really Assist Future Tasks (SEACRAFT) is a publicly available online data repository (formerly known as PROMISE) [64]. The datasets selected for this study are: Albrecht [65], Desharnais [66], Miyazaki [67], China [68], Cocomo81 [69], Finnish [70], Kitchenham [71], and Maxwell [72]. ...
Article
Full-text available
Software development effort estimation (SDEE) is recognized as a vital activity for effective project management, since under- or over-estimating can lead to unsuccessful utilization of project resources. Machine learning (ML) algorithms contribute substantially to the SDEE domain; in particular, ensemble effort estimation (EEE) works well in rectifying the bias and subjectivity of solo ML learners. The performance of EEE depends significantly on the hyperparameter composition as well as the weight assignment mechanism of the solo learners. However, in the EEE domain, the impact of optimization in terms of hyperparameter tuning and weight assignment has been explored by few researchers. This study aims to improve SDEE performance by incorporating metaheuristic hyperparameter and weight optimization into EEE, which brings accuracy and diversity to the ensemble model. The study proposes the Metaheuristic-optimized Multi-dimensional bagging scheme and Weighted Ensemble (MoMdbWE) approach. This is achieved through a proposed search space division and hyperparameter optimization method named Multi-dimensional bagging (Mdb). The metaheuristic algorithm considered for this work is the Firefly algorithm (FFA), used to obtain the best hyperparameters of three base ML algorithms (Random Forest, Support Vector Machine, and Deep Neural Network), since FFA has shown promising fitness results in terms of MAE. Further performance enhancement is achieved by incorporating FFA-based weight optimization to construct a Metaheuristic-optimized Weighted Ensemble (MoWE) of the individual multi-dimensional bagging schemes. The proposed scheme is implemented on eight frequently used effort estimation datasets, and the results are evaluated with five error metrics (MAE, RMSE, MMRE, MdMRE, Pred), standard accuracy, and effect size, along with the Wilcoxon statistical test. The findings confirm that the use of FFA optimization for hyperparameters (with search space sub-division) and for ensemble weights significantly enhances performance in comparison with the individual base algorithms as well as other homogeneous and heterogeneous EEE techniques.
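A weighted ensemble like MoWE averages base-model predictions with weights chosen to minimize an error measure; the sketch below optimizes the weights with SciPy's SLSQP solver as a stand-in for the Firefly algorithm used in the paper, on invented validation-set predictions.

```python
import numpy as np
from scipy.optimize import minimize

# Invented effort predictions (person-months) from three tuned base models
# (e.g., RF, SVM, DNN) on a validation set, plus the actual efforts.
preds = np.array([
    [110, 52, 330, 61],   # base model 1
    [ 95, 48, 365, 55],   # base model 2
    [130, 40, 310, 70],   # base model 3
], dtype=float)
actual = np.array([120, 45, 300, 60], dtype=float)

def mae_of_weights(w):
    """MAE of the weighted ensemble prediction."""
    return np.mean(np.abs(actual - w @ preds))

# Weights constrained to be non-negative and sum to 1;
# SLSQP stands in for the metaheuristic (Firefly) optimizer here.
n = preds.shape[0]
res = minimize(mae_of_weights, x0=np.full(n, 1 / n),
               bounds=[(0, 1)] * n,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}],
               method="SLSQP")
print("optimized weights:", np.round(res.x, 3), " MAE:", round(res.fun, 2))
```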
... CCSD's success depends heavily on the engagement of the software community, since the quality and amount of resources contributed are determined by it [7][8][9][10]. To motivate the community to help with tasks, contributors are given rewards such as money or social perks [8]. ...
Article
Full-text available
Competitive Crowdsourcing Software Development (CCSD) is popular among academics and industries because of its cost-effectiveness, reliability, and quality. However, CCSD is still in its early stages and does not resolve major issues, including a low solution submission rate and a high risk of project failure. Software development wastes stakeholders' time and effort when they cannot find a suitable solution in a highly dynamic and competitive marketplace. It is, therefore, crucial to automatically predict the success of an upcoming software project before crowdsourcing it. This will save stakeholders' and co-pilots' time and effort. To this end, this paper proposes a well-known deep learning model called Bidirectional Encoder Representations from Transformers (BERT) for the success prediction of Crowdsourced Software Projects (CSPs). The proposed model is trained and tested using historical data on CSPs collected from TopCoder using its REST API. The outcomes of hold-out validation indicate a notable enhancement in the proposed approach compared to existing methods, with increases of 13.46%, 8.83%, and 11.13% in precision, recall, and F1 score, respectively.
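A minimal fine-tuning sketch for BERT-based binary success prediction is given below using the Hugging Face transformers and PyTorch APIs; the two example task descriptions and labels are invented, and this is not the paper's actual TopCoder pipeline.

```python
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Invented crowdsourced task descriptions with success labels (1 = successful).
texts = ["Implement a REST API for contest registration within 7 days",
         "Redesign the whole legacy platform in 48 hours"]
labels = torch.tensor([1, 0])

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):                      # a few toy training steps
    out = model(**enc, labels=labels)   # cross-entropy loss is computed internally
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.eval()
with torch.no_grad():
    probs = torch.softmax(model(**enc).logits, dim=-1)
print(probs[:, 1])                      # predicted probability of success
```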
... As mentioned earlier, researchers have contributed many SDEE techniques over the last four decades. Overall, these techniques are grouped into three main categories (de Barcelos et al., 2008): expert judgment, which uses the opinions of one or more experts to determine the effort required by a project (Hughes, 1996); parametric techniques, which use statistical and/or numerical analysis of historical project data (Boehm, 1984); and machine learning (ML) techniques, based on AI algorithms including artificial neural networks (ANN), genetic algorithms (GA), analogy-based or case-based reasoning (CBR), decision trees, and genetic programming (Idri et al., 2002; Wen et al., 2012). ...
Article
Full-text available
Crowd-sourced software development (CSSD) is getting a good deal of attention from the software and research communities in recent times. One of the key challenges faced by CSSD platforms is the task selection mechanism, which in practice contains no intelligent scheme; rather, rule-of-thumb or intuition strategies are employed, leading to bias and subjectivity. Effort considerations on crowdsourced tasks can offer a good foundation for task selection criteria but have not been much investigated. Software development effort estimation (SDEE) is a well-established domain in software engineering but has mostly been investigated for in-house development; for open-sourced or crowdsourced platforms, it is rarely explored. Moreover, machine learning (ML) techniques are increasingly dominating SDEE, with claims of more accurate estimation results. This work aims to combine ML-based SDEE with an analysis of development effort measures on a CSSD platform. The purpose is to discover development-oriented features for crowdsourced tasks and to analyse the performance of ML techniques in order to find the best estimation model for a CSSD dataset. TopCoder is selected as the target CSSD platform for the study. TopCoder's development task data with development-centric features are extracted, followed by statistical, regression, and correlation analyses to justify the features' significance. For effort estimation, 10 ML families with two techniques each are applied to obtain a broader view of estimation. Five performance metrics (MSE, RMSE, MMRE, MdMRE, Pred(25)) and Welch's statistical test are used to judge the worth of the effort estimation models' performance. The data analysis results show that the selected TopCoder features exhibit reasonable model significance, regression, and correlation measures. The findings of the ML effort estimation indicate that the best results for the TopCoder dataset can be acquired with linear regression, non-linear regression, and SVM family models. To conclude, the study identified the most relevant development features for a CSSD platform, confirmed by in-depth data analysis. This reflects a careful selection of effort estimation features to offer a good basis for accurate ML estimates.
... The Constructive Cost Model (COCOMO) of Barry Boehm's Software Engineering Economics, based on lines-of-code (LOC) metrics, which capture an accidental aspect of the process, prevailed in the 1980s as the standard project management model for estimating effort, cost, and productivity, with productivity expressed in thousands of lines of code per week of work (KLOC/week) and costs expressed in USD/KLOC [3]. This model has serious weaknesses, because each programming language yields different measures for the solution of the same problem, and each line of code is given equal logical weight in the cognitive effort of development. ...
Preprint
Full-text available
Abstract— Software production is highly knowledge-intensive at all stages of the process, unlike the production of tangible products. With the current methods of measuring and predicting complexity and effort in the software industry, it is difficult to manage uncertainty, risk, quality, and resources in software projects, which results in extraordinary failures, delays, and underestimated costs. It is estimated that the fundamental reason lies in a low ability to dimension the effort and the cognitive resources required over the life cycle. I propose that this originates in two fundamental aspects: a) models that are deficient in estimating the complexity and size of software artifacts; and b) poor methods for estimating the cognitive ability of software professionals to solve problems at each stage of the software process. These aspects demand a rethinking of the concepts and models for measuring software complexity and for estimating the effort of the professionals involved in the software life cycle. In this paper, I present advances in the development of alternative measurement models and in the understanding and application of cognitive science, machine learning, and deep learning in the software process. This new approach aims to define a new software economic engineering to solve this old problem, which will positively affect the quality and value of the products and the productivity of this industry.
... As the development process progresses, correcting a fault becomes more expensive the longer it takes to discover it (Boehm, 1984). A technical product system is in permanent interaction with its environment during the usage phase. ...
Conference Paper
Full-text available
During the usage phase, a technical product system is in permanent interaction with its environment. This interaction can lead to failures that significantly endanger the safety of the user and negatively affect the quality and reliability of the product. Conventional methods of failure analysis focus on the technical product system. The interaction of the product with its environment in the usage phase is not sufficiently considered, resulting in undetected potential failures of the product that lead to complaints. To address this, a methodology for failure identification is developed, which is continuously improved through product usage scenarios. The use cases are modelled according to a systems engineering approach with four views. The linking of the product system, physical effects, events, and environmental factors enables the analysis of fault chains. These four parameters are subject to great complexity and must be systematically analysed using databases and expert knowledge. The scenarios are continuously updated by field data and complaints. The new approach can identify potential failures in a more systematic and holistic way. Complaints provide direct input on the scenarios. Unknown, previously unrecognized events can be systematically identified through continuous improvement. The complexity of the relationship between the product system and its environmental factors can thus be adequately taken into account in product development.
... In contrast, studies do report the development costs of information goods, depending on their quality. For instance, the software development lifecycle comprises demand analysis, design, coding, testing, implementation, training, etc. Development costs depend on the development complexity and functionality requirements [5,6]. Notably, the convexity of development costs has become a cornerstone of software engineering [1,7]. ...
Article
Full-text available
This study examines information-good versioning strategies for a monopoly firm and is novel in that it endogenizes quality choices with general consumer utility and firm cost. Specifically, consumer utility for good quality is monotonic and concave. The firm first develops the highest quality version at a certain cost and then disables some features to make low-quality versions at no cost. The results reveal that in the case of two user types, if the number of low-type users is small or the additional value they obtain from being served compared with not being served is small, the optimal strategy for the firm is to give them up. If this situation is reversed, it is optimal to serve them. Meanwhile, if their taste for quality is very close to that of high-type users, they should be offered the same quality as the latter. Otherwise, the firm should implement a versioning strategy. Second, with n user types, whether a unique quality is offered to a user type depends on the corresponding benefit–cost analysis. Finally, versioning can never be socially optimal, as the social optimum requires low-type users to purchase the same quality as high-type users. Further, versioning prohibition lowers social welfare if the firm is compelled to stop serving low-type users.
... A firm's total construction cost for its plant is a function of time-to-build under a time-cost tradeoff. In the current study, the time-cost tradeoff is captured as follows: a one percent decrease in a firm's time-to-build its plant typically requires more than a one percent increase in the firm's total construction cost for its plant (Boehm, 1981; Graves, 1989; Mansfield, 1971; Scherer, 1967, 1984). As will be shown later, this time-cost tradeoff is observed in our data from 1996 to 2000. ...
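One simple way to capture the quoted tradeoff is a constant-elasticity cost function C(T) = C0 · (T/T0)^(−ε) with ε > 1, so that shortening time-to-build by one percent raises construction cost by more than one percent; the numbers below are illustrative only and not taken from the study.

```python
def construction_cost(t, c0=100.0, t0=36.0, elasticity=1.5):
    """Constant-elasticity time-cost tradeoff: cost rises as time-to-build falls."""
    return c0 * (t / t0) ** (-elasticity)

# Illustration: a 1% reduction in time-to-build (36 -> 35.64 months).
base, crashed = construction_cost(36.0), construction_cost(36.0 * 0.99)
print(f"cost increase: {100 * (crashed / base - 1):.2f}%")  # > 1% since elasticity > 1
```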
Article
This paper examines the effect of market-entry timing on a firm’s speed and cost of entry in a setting where a firm needs to build a plant for market entry. Based on our developed analytical model, we provide seven scenarios of the market-entry timing effect on a firm’s entry speed and cost. We test hypotheses in the liquefied natural gas (LNG) industry. We use Wooldridge’s three-step instrumental variable (IV) approach to account for endogeneity bias. We find that a late entrant has (1) a shorter time-to-build and (2) a higher cost-to-build relative to an early entrant. Further, (3) the late entrant positively moderates the negative relationship of time-to-build and cost-to-build (i.e., the negative relationship of time-to build and cost-to-build becomes less negative for the late entrant). These empirical results are consistent with the prediction of when both revenue effect (i.e., revenue curve shift) and cost effect (i.e., cost curve leftward shift) exist.
... • Waterfall model [Boehm, 1981]: This is one of the first models to emerge in response to industrial needs for software productivity and quality.
Book
Full-text available
... Probably the best known estimation model is Boehm's COCOMO model (Boehm, 1981). "The first one is a basic model which is a single-value model that computes software development effort and cost as a function of program size expressed as estimated lines of code (LOC). ...
Article
Full-text available
Machine learning is concerned purely with building programs that improve their task performance through experience. Machine learning algorithms have proven to be of great practical value in a variety of application domains, and the field of software engineering turns out to be fertile ground where many software development tasks, such as analysis, design, and testing, can be formulated as learning problems and approached in terms of learning algorithms. We discuss several metrics in each of five types of software quality metrics: product quality, in-process quality, testing quality, maintenance quality, and customer satisfaction quality.
... 1. Quantum software cost estimation: Software cost estimation has been extensively investigated in the classical computing community [12]- [17]. In quantum applications, stakeholders or software teams are also required to accurately predict the cost of these applications to ensure the success of their project. ...
Preprint
Full-text available
Quantum computing systems depend on the principles of quantum mechanics to perform multiple challenging tasks more efficiently than their classical counterparts. In classical software engineering, the software life cycle is used to document and structure the processes of design, implementation, and maintenance of software applications. It helps stakeholders understand how to build an application. In this paper, we summarize a set of software analytics topics and techniques in the development life cycle that can be leveraged and integrated into quantum software application development. The results of this work can assist researchers and practitioners in better understanding the quantum-specific emerging development activities, challenges, and opportunities in the next generation of quantum software.
Conference Paper
Effort estimation (scope - cost - time) plays a significant role in software project management. Reducing the chances of software project failures has been a challenge faced by the scientific community for over 30 years. In this regard, research has focused on proposing methods to enhance the accuracy of effort estimations. Other authors have identified factors impacting effort estimations, while some studies define frameworks, analogies, comparison lists, ontologies, or comparisons among estimation methods, among other strategies, aimed at addressing the issues of overestimation or underestimation of software projects. The DevOps approach is a field that is being explored to develop alternative methods and strategies to tackle this problem. This work conducted a systematic literature review, allowing for the identification of methods, approaches, practices, factors, metrics, and methodologies to enhance software development project estimations within the DevOps context. This study enriches the perspective of an approach that extends the scope of estimations from the development phase to the operational phase, aiming to reduce the disparity between initial estimations and the actual (cost and time) estimations of a software project in DevOps environments.
Book
Full-text available
RAND researchers worked to understand the costs and benefits of digital engineering in the U.S. Department of Defense (DoD) and develop a decision support framework for digital engineering activities in weapon system programs. To prepare, the authors reviewed the literature and interviewed stakeholders to understand the current state of digital engineering practice and prior efforts to assess the costs and benefits of digital engineering and model-based systems engineering. They then developed decision support frameworks incorporating (1) established DoD cost-benefit analysis approaches and (2) established systems engineering decision methodologies. Along the way, the authors noted critical issues with rigor and risks in the practice of DoD digital engineering and added that aspect to the study. This research suggests that cost-benefit decision support for digital engineering is possible at any stage of a weapon system program life cycle if program data have been collected accordingly or if goal-based systems engineering principles are leveraged. Calculating definitive costs and benefits of digital engineering is imperfect because no analyst will have access to an identical weapon system program developed without digital engineering — the counterfactual scenario. https://www.rand.org/content/dam/rand/pubs/research_reports/RRA2400/RRA2418-1/RAND_RRA2418-1.pdf
Chapter
Many scientific high performance codes that simulate e.g. black holes, coastal waves, climate and weather, etc. rely on block-structured meshes and use finite differencing methods to solve the appropriate systems of differential equations iteratively. This paper investigates implementations of a straightforward simulation of this type using various programming systems and languages. We focus on a shared memory, parallelized algorithm that simulates a 1D heat diffusion using asynchronous queues for the ghost zone exchange. We discuss the advantages of the various platforms and explore the performance of this model code on different computing architectures: Intel, AMD, and ARM64FX. As a result, Python was the slowest of the set we compared. Java, Go, Swift, and Julia were the intermediate performers. The higher performing platforms were C++, Rust, Chapel, Charm++, and HPX.
Research
Full-text available
The aim and objective of the education system is to provide a path toward sustainable human development and to form responsible citizens of tomorrow's society. To satisfy this objective, the education system provides a platform for students to acquire knowledge about society, understand its sustainable development, and develop skills for potentially beneficial attitudes. The National Service Scheme (NSS) plays a vital role in accomplishing the aim and objective of the education system: "to utilize knowledge and useful skills in practice based on innovation". A continuous research process is therefore carried out within the education system. In this context, there is a need for a system that supports the analysis of knowledge and, more specifically, studies outcome measurement and prediction. This study introduces a "Learn to Serve to Learn" model based on a data science and analytical process framework that combines the waterfall model with NSS in the education process.
Article
The IT industry wants a simple and accurate method of effort estimation. Estimating effort before the work starts is a prediction, and predictions are not always accurate. Intermediate COCOMO considers 17 factors that affect effort, while UCP considers 13 technical complexity factors and 5 experience factors. Many factors can affect effort estimation. Most of these parameters are covered by COCOMO and UCP, but some parameters included in COCOMO are left out by UCP. UCP is one of the popular approaches to effort estimation. This paper extends the technical complexity and experience factors used in the traditional UCP approach.
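For reference, the traditional use case points computation combines unadjusted use case points with a technical complexity factor and an environmental/experience factor; the sketch below uses the commonly cited constants (0.6 + 0.01·TF, 1.4 − 0.03·EF, and roughly 20 person-hours per UCP), and the factor ratings and uniform weights are made up (real weights vary per factor).

```python
def use_case_points(uucp, technical_ratings, experience_ratings,
                    technical_weights, experience_weights,
                    hours_per_ucp=20.0):
    """Traditional UCP estimate with the commonly cited adjustment constants."""
    tf = sum(r * w for r, w in zip(technical_ratings, technical_weights))
    ef = sum(r * w for r, w in zip(experience_ratings, experience_weights))
    tcf = 0.6 + 0.01 * tf          # technical complexity factor
    ecf = 1.4 - 0.03 * ef          # environmental / experience factor
    ucp = uucp * tcf * ecf
    return ucp, ucp * hours_per_ucp

# Made-up example: 13 technical and 5 experience factors rated 0-5,
# with uniform weights of 1.0 (actual UCP weights range roughly 0.5-2.0).
tech_ratings, tech_weights = [3] * 13, [1.0] * 13
exp_ratings, exp_weights = [4] * 5, [1.0] * 5
ucp, effort_hours = use_case_points(100, tech_ratings, exp_ratings,
                                    tech_weights, exp_weights)
print(f"UCP = {ucp:.1f}, effort ≈ {effort_hours:.0f} person-hours")
```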
Chapter
Modern automotive software is highly complex and consists of millions of lines of code. For safety-relevant automotive software, it is recommended to use sound static program analysis to prove the absence of runtime errors. However, the analysis is often perceived as burdensome by developers because it runs for a long time and produces many false alarms. If the analysis is performed on the integrated software system, there is a scalability problem, and the analysis is only possible at a late stage of development. If the analysis is performed on individual modules instead, this is possible at an early stage of development, but the usage context of modules is missing, which leads to too many false alarms. In this case study, we present how automatically inferred contracts add context to module-level analysis. Leveraging these contracts with an off-the-shelf tool for abstract interpretation makes module-level analysis more precise and more scalable. We evaluate this framework quantitatively on industrial case studies from different automotive domains. Additionally, we report on our qualitative experience with the verification of large-scale embedded software projects.
Article
Effort estimation in the primary steps of software development is one of the most essential and pivotal tasks. It significantly impacts the success of the overall development of software projects. Inaccurately estimating software projects has been a persistent problem for software development organizations. The Constructive Cost Model (COCOMO) has been widely used for software effort estimation, but its existing parameters often fail to provide realistic results in the present development context. In recent years, researchers have focused on the utilization of Nature-Inspired Algorithms (NIAs) to optimize the parameters of COCOMO and improve its performance. The necessity of increasing estimation precision urged the authors to propose a novel approach, the Memetic Improved Anti-Predatory Nature-Inspired Algorithm (MI-APNIA). The proposed MI-APNIA integrates the concept of memeplexes and a frog's defence strategy when it senses danger from a predator. The MI-APNIA algorithm improves the exploitation phase by integrating information from the global best solution into the solution search space. Leveraging the capabilities of the MI-APNIA algorithm, a coherent and reliable parametric model is established for precise software effort estimation based on COCOMO variants, showcasing improved parameter tuning compared to other existing NIAs. The results demonstrate a significant improvement in terms of Mean Magnitude of Relative Error (MMRE) compared to the basic COCOMO model, with a remarkable 78.3% enhancement. Additionally, the proposed approach achieves notable improvements over other NIAs, performing 10.80% better than I-APNIA, 10.96% better than APNIA, 16.5% better than SFLA, 10.8% better than PSO, and 82.19% better than GA on the COCOMO V3 variant.
Article
Full-text available
This paper presents an innovative approach to cycle time reduction in internal quality management through the implementation of an automated tool. The focus is on optimizing resource planning for product-based organizations by analyzing variables such as resource numbers, request types, and testing methods. Statistical modeling is utilized to enhance work allocation and provide precise status updates to development leads, ultimately leading to significant cost savings and efficiency improvements. The developed automation tool, as an alternative to manual test-case testing, utilizes statistical modeling outputs to assist in allocating work to members of the testing team and to collect the actual time spent on each of the activities they perform for further analysis. This also helps provide a correct status to development leads, so that they can be told in advance whether their request can be met or not. Resource planning is also required for random testing when there are few requests or when product quality is critical, and it helps identify how much testing has been done. The next task after resource planning is the server upload, which takes a lot of the testing group's bandwidth to upload the documents and source code. The server is optimized based on an understanding of upload sequencing and server load to meet the current demand, which will save several million dollars, as the server cost is high.
Chapter
Reuse in system development is a prevalent phenomenon. However, how reuse is applied varies widely. The Generalized Reuse Framework is a strategic reuse model for systems engineering management in product development that addresses both investment and leverage of reuse through two interrelated and interacting processes: Development with Reuse (DWR) and Development for Reuse (DFR). This chapter summarizes the latest development of this framework by providing the taxonomic definition of DWR and DFR and analyzing the decision processes for reuse as applied to incremental development and product line engineering. It also describes how the framework is applied to the revision of the Constructive Systems Engineering Cost Model (COSYSMO), a parametric cost estimating model for systems engineering. With use case scenarios, it illustrates the approach to apply the framework and to quantify the economic impact of reuse vis-à-vis investment strategies.
Conference Paper
div class="section abstract"> Shafaat and Kenley in 2015 identified the opportunity to improve System Engineering Standards by incorporating the design principle of learning. The System Level Assessment (SLA) Methodology is an approach that fulfills this need by efficiently capturing the learnings of a team of subject matter experts in the early stages of product system design. By gathering expertise, design considerations are identified that when used with market and business requirements improve the overall quality of the product system. To evaluate the effectiveness of this approach, the methodology has been successfully applied over 400 times within each realm of the New Product Introduction process, including most recently to a Technology Development program (in the earliest stages of the design process) to assess the viability of various electrification technologies under consideration by an automotive Tier 1 supplier. The SLA-derived approach taken on this program showed the potential to reduce the level of redesign work, migrating the design ownership to a system-level, while significantly increasing the number of invention disclosures compared to an equivalent technology developed using traditional methods. Using the SLA method yielded 170% more invention disclosures, of which more than 40% were converted to patent filings. </div
Conference Paper
Currently, requirements for control software written in natural language are often formulated ambiguously and incompletely. Controlled natural languages (CNLs) can solve this problem while maintaining flexibility for writing and conveying requirements in an intuitive and common way. The creation of domain-specific controlled natural languages is currently under active development. In this paper, we propose a CNL based on event-driven semantics. This language is intended for describing the temporal properties of cyber-physical systems. We develop our CNL using a number of natural language patterns built for classes of requirements expressed in the Event-Driven Temporal Logic (EDTL) formalism. Due to the formal semantics of EDTL, the suggested CNL is also unambiguous and can be translated into logic formulas. As a result, the proposed CNL provides an auxiliary tool to improve communication quality between the different participants in the industrial system development process: customers, requirements engineers, developers, and others. The solution therefore helps to reduce the number of errors in the formulation of requirements at earlier stages of development.
Article
Full-text available
The problem of quantifying the factors influencing the amount of effort needed to produce high-quality software is addressed.
Article
Full-text available
A parametric software cost estimation model prepared for Jet Propulsion Laboratory (JPL) Deep Space Network (DSN) Data System implementation tasks is described. The resource estimation model modifies and combines a number of existing models. The model calibrates task magnitude and difficulty, development environment, and software technology effects through prompted responses to a set of approximately 50 questions. Parameters in the model are adjusted to fit JPL software life-cycle statistics.
Article
In all studies of human performance, the experimenter must be certain that the subject is performing the task that the experimenter believes he has set; otherwise results become uninterpretable. Early studies of computer programming have shown such wide variations in individual performance that one might suspect that subjects differed in their interpretation of the task. Experiments are reported which show how programming performance can be strongly influenced by slight differences in performance objectives. Conclusions are drawn from these results regarding both future experimentation and management practices in computer programming.
Article
This report contains the results of work accomplished by Boeing Computer Services for AF RADC. The purpose of this study was to assess the impact of modern software development techniques on the cost of developing computer software. The five in-house projects selected for study varied in size, type of application, and computing environment. The practices found to have the most beneficial impact on software development are, in order of impact: Project Organization and Management Procedures, Testing Methodology, Configuration Management and Change Control, and Design Methodology. Existing military standards and specifications are sufficiently comprehensive to encourage the use of beneficial practices; however, certain standards and specifications may require modification to make their applicability to software procurements more pertinent.
Article
Nine software cost estimating models are evaluated to determine whether they satisfy Air Force needs. The evaluation considers both the qualitative and quantitative aspects of the models' outputs. Air Force needs for cost estimates are established by the Major Weapon System Acquisition Process. Associated with the different development phases are five cost estimating situations. Decisions made early in the Acquisition Process require software cost information that covers the entire life cycle of complete software systems; subsequent decisions require more detailed cost information. Comparison of the outputs of the nine test models with the requirements established by the five cost estimating situations indicates that the models are able to satisfy only the needs of the earliest phase of the Acquisition Process. The models perform satisfactorily for the purpose of allocating funds for software acquisition, but they fail to support such needs as assessment of alternative designs, proposal evaluation, or project management.
Article
Guidelines are presented to help managers estimate the costs of computer programming. The guidelines summarize a statistical analysis of 169 computer programming efforts, with equations to estimate man-months, computer hours, and elapsed months, as well as planning factors such as man-months per thousand instructions. Opinions, rules of thumb, and experience data based upon a literature search and experience supplement the statistical results. The guidelines and accompanying forms are organized into six sections corresponding to a six-step division of the computer programming process. Advice is given on the integration of cost estimates into a cost analysis to justify and plan ADP projects.
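As a hedged, back-of-the-envelope illustration of the planning-factor style of estimate described above; the factor values below are assumptions, not figures from the study.

```python
# Planning-factor estimate (all factor values are assumed for illustration).
size_kdsi = 20              # estimated program size, thousands of delivered instructions
mm_per_kdsi = 2.0           # assumed planning factor: man-months per thousand instructions
computer_hours_per_mm = 15  # assumed machine time consumed per man-month

effort_mm = size_kdsi * mm_per_kdsi                 # 40 man-months
computer_hours = effort_mm * computer_hours_per_mm  # 600 computer hours
print(effort_mm, computer_hours)
```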
Article
The present approach to productivity estimation, although useful, is far from being optimized. Based on the results of the variable analysis described in this paper, and supplemented by the results of the continued investigation of additional variables related to productivity, an experimental regression model has been developed. Preliminary results indicate that the model reduces the scatter. Further work is being done to determine the potential of regression as an estimating tool, as well as to extend the analyses of the areas of computer usage, documentation volume, duration, and staffing.
Conference Paper
Despite large volumes of data and many types of metrics, software projects continue to be difficult to predict and risky to conduct. In this paper we propose software analytics which holds out the promise of helping the managers of software projects turn their plentiful information resources, produced readily by current tools, into insights they can act on. We discuss how analytics works, why it's a good fit for software engineering, and the research problems that must be overcome in order to realize its promise.
Conference Paper
This paper presents an overview of the TRW Software Productivity System (SPS), an integrated software support environment based on the Unix operating system, a wide range of TRW software tools, and a wideband local network. Section 2 summarizes the quantitative and qualitative requirements analysis upon which the system is based. Section 3 describes the key architectural features and system components. Finally, section 4 discusses our conclusions and experience to date.
Conference Paper
The program complexity measure currently seems to be the most capable measure for both quantitative and objective control of the software project. Five program complexity measures (step count, McCabe's V(G), Halstead's E, Weighted Statement Count and Process V(G)) were assessed from such a viewpoint. This empirical study was done with the data collected through a practical software project. All of these measures have highly significant correlations with the management data. Application of complexity measures to software development management is discussed and a method for the detection of anomalous modules in a program is proposed.
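For orientation, one of the measures compared above can be computed from simple counts; this is a minimal sketch using the standard published Halstead formulas with illustrative operator and operand counts (the study's own data are not reproduced here).

```python
import math

def halstead_metrics(n1, n2, N1, N2):
    """Standard Halstead measures from distinct (n1, n2) and total (N1, N2)
    operator and operand counts."""
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)   # V
    difficulty = (n1 / 2) * (N2 / n2)         # D
    effort = difficulty * volume              # E, one of the measures assessed above
    return volume, difficulty, effort

# Illustrative counts for a small module (assumed values).
V, D, E = halstead_metrics(n1=10, n2=7, N1=25, N2=18)
print(f"V = {V:.1f}, D = {D:.1f}, E = {E:.1f}")
```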
Conference Paper
Karl Popper has described the scientific method as “the method of bold conjectures and ingenious and severe attempts to refute them”. Software Science has made “bold conjectures” in postulating specific relationships between various 'metrics' of software code and in ascribing psychological interpretations to some of these metrics. This paper describes tests made on the validity of the relationships and interpretations which form the foundations of Software Science. The results indicate that the majority of them represent neither natural laws nor useful engineering approximations.
Conference Paper
Estimation of the size and time required for software development is probably the most difficult aspect of any project. Until now, most estimates have been made subjectively by experts, and these estimates are often inaccurate. In the midst of development, faulty estimates may contribute to delays and/or excess expenses. In the last several years, several estimation models have been proposed, most of which estimate software development cost (manpower). These models use program size as a variable; however, at the beginning of development, when estimates are made, program sizes are usually uncertain and costs (manpower) are equally uncertain. The authors developed a program-size estimation model for batch programs in a banking system and used the model in an actual project. Using the adapted model, estimation errors amounted to only 7 percent. This is much better than the accuracy of estimations made by experts in the field (usually about 10 percent accuracy) and indicates that objective estimation methods can be derived for program size. In this paper, we introduce our estimation model and discuss its adaptation for a specific project.
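The paper's model itself is not reproduced here, but the accuracy figure it quotes corresponds to a mean relative estimation error; below is a minimal sketch of that measurement over assumed estimated-versus-actual program sizes.

```python
# Mean relative size-estimation error over completed programs.
# (estimated, actual) sizes in lines of code are assumed example values.
pairs = [(1200, 1120), (800, 860), (2500, 2330), (640, 600)]

errors = [abs(est - act) / act for est, act in pairs]
mean_error = sum(errors) / len(errors)
print(f"mean relative error: {mean_error:.1%}")   # about 7% with these example values
```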
Article
This paper presents a model of the programming process. The model has four parts. A resources model specifies how many useful man-months of design effort are available from project team members after subtracting the time required for learning and team communications. A system design model specifies how many man-months of effort are required to derive program module specifications, as a function of the number of team members, program size, and number of modules. A coding model specifies how many man-months of effort are required for coding, as a function of team, module, and program size. Finally, a checkout model specifies how many man-months are required for checkout as a function of program size, error detection and correction rates, and a design-complete factor. The model as a whole predicts that programmer productivity will decrease as project team size is increased and that project duration will first decrease and then increase as team size is increased. It also shows that productivity and project duration vary enormously as a function of project management factors, even when project complexity and programming staff competence are held constant.
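The qualitative predictions in the last two sentences can be reproduced with a toy stand-in for the resources model; the overhead coefficients below are assumptions chosen for illustration, not the paper's calibrated values.

```python
def toy_resources_model(team_size, base_work_mm=60.0,
                        learning_mm_per_person=1.0,
                        comm_overhead_per_pair=0.06):
    """Toy stand-in: each member spends a fixed learning effort, and every pair
    of members consumes a fixed slice of productive capacity each month."""
    pairs = team_size * (team_size - 1) / 2
    capacity_per_month = team_size - comm_overhead_per_pair * pairs
    total_work = base_work_mm + learning_mm_per_person * team_size
    if capacity_per_month <= 0:
        return float("inf"), 0.0
    duration = total_work / capacity_per_month
    productivity = base_work_mm / (duration * team_size)  # useful output per person-month
    return duration, productivity

for n in (2, 4, 8, 16, 24):
    d, p = toy_resources_model(n)
    print(f"team = {n:2d}  duration = {d:5.1f} months  productivity = {p:.2f}")
# Productivity falls steadily as team size grows, while duration first falls
# and then rises again, which is the behaviour the model above predicts.
```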
Article
Programming data involving 278 commercial-type programs were collected from 23 medium- to large-scale organizations in order to explore the relationships among variables measuring program type, the testing interface, programming technique, programmer experience, and productivity. Programming technique and programmer experience after 1 year were found to have no impact on productivity, whereas on-line testing was found to reduce productivity. A number of analyses of the data are presented, and their relationship to other studies is discussed.
Article
A macromethodology to support management needs has now been developed that will produce accurate estimates of manpower, costs, and times to reach critical milestones of software projects. There are four parameters in the basic system and these are in terms managers are comfortable working with - effort, development time, elapsed time, and a state-of-technology parameter. The system provides managers sufficient information to assess the financial risk and investment value of a new software development project before it is undertaken and provides techniques to update estimates from the actual data stream once the project is underway. Using the technique, adequate analysis for decision can be made in an hour or two using only a few quick reference tables and a scientific pocket calculator. 31 refs.
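One widely published macro model of this kind is the Putnam software equation, which relates size, total life-cycle effort, development time, and a state-of-technology constant; the sketch below rearranges it to estimate effort, with purely illustrative parameter values, and is not claimed to be the exact model of the paper.

```python
def putnam_effort(size_sloc, tech_constant, dev_time_years):
    """Putnam software equation, size = Ck * K**(1/3) * td**(4/3),
    rearranged to give total life-cycle effort K in person-years."""
    return (size_sloc / (tech_constant * dev_time_years ** (4 / 3))) ** 3

# Illustrative inputs: 100 KSLOC, a mid-range technology constant, a 2-year schedule.
effort_py = putnam_effort(size_sloc=100_000, tech_constant=10_000, dev_time_years=2.0)
print(f"estimated effort: {effort_py:.1f} person-years")   # roughly 62 with these inputs
```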
Article
This paper describes a graph-theoretic complexity measure and illustrates how it can be used to manage and control program complexity. The paper first explains how the graph-theory concepts apply and gives an intuitive explanation of the graph concepts in programming terms. The issue of using nonstructured control flow is also discussed. A characterization of nonstructured control graphs is given and a method of measuring the "structuredness" of a program is developed. The last section of this paper deals with a testing methodology used in conjunction with the complexity measure; a testing strategy is defined that requires a program either to admit of a certain minimal testing level or to be structurally reduced.
Article
"A Wiley-Interscience publication." Incluye bibliografía v. 1. Management perspectives.-- v. 2. Operations, programming, and software models
Article
This paper describes a graph-theoretic complexity measure and illustrates how it can be used to manage and control program complexity. The paper first explains how the graph-theory concepts apply and gives an intuitive explanation of the graph concepts in programming terms. The control graphs of several actual Fortran programs are then presented to illustrate the correlation between intuitive complexity and the graph-theoretic complexity. Several properties of the graph-theoretic complexity are then proved which show, for example, that complexity is independent of physical size (adding or subtracting functional statements leaves complexity unchanged) and complexity depends only on the decision structure of a program.
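The size-independence property described above follows directly from the defining formula V(G) = E - N + 2P; here is a minimal sketch computing it for an example control-flow graph.

```python
def cyclomatic_complexity(edges, num_nodes, num_components=1):
    """McCabe's measure V(G) = E - N + 2P for a control-flow graph."""
    return len(edges) - num_nodes + 2 * num_components

# Control-flow graph of a routine with one if/else inside one loop (nodes 0..5).
cfg_edges = [(0, 1), (1, 2), (1, 3), (2, 4), (3, 4), (4, 1), (4, 5)]
print(cyclomatic_complexity(cfg_edges, num_nodes=6))   # 7 - 6 + 2 = 3

# Inserting straight-line statements adds one node and one edge apiece, leaving
# E - N unchanged: complexity depends only on the decision structure, not size.
```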
Article
The work of software cost forecasting falls into two parts. First we make what we call structural forecasts, and then we calculate the absolute dollar-volume forecasts. Structural forecasts describe the technology and function of a software project, but not its size. We allocate resources (costs) over the project's life cycle from the structural forecasts. Judgment, technical knowledge, and econometric research should combine in making the structural forecasts. A methodology based on a 25 × 7 structural forecast matrix that has been used by TRW with good results over the past few years is presented in this paper. With the structural forecast in hand, we go on to calculate the absolute dollar-volume forecasts. The general logic followed in "absolute" cost estimating can be based on either a mental process or an explicit algorithm. A cost estimating algorithm is presented and five traditional methods of software cost forecasting are described: top-down estimating, similarities-and-differences estimating, ratio estimating, standards estimating, and bottom-up estimating. All forecasting methods suffer from the need for a valid cost data base in many estimating situations. Software information elements that experience has shown to be useful in establishing such a data base are given in the body of the paper. Major pricing pitfalls are identified. Two case studies are presented that illustrate the software cost forecasting methodology and historical results. Topics for further work and study are suggested.
Article
By classifying programs according to their relationship to the environment in which they are executed, the paper identifies the sources of evolutionary pressure on computer applications and programs and shows why this results in a process of never ending maintenance activity. The resultant life cycle processes are then briefly discussed. The paper then introduces laws of Program Evolution that have been formulated following quantitative studies of the evolution of a number of different systems. Finally an example is provided of the application of Evolution Dynamics models to program release planning.