Figure 4 - uploaded by Zaineb Sakhrawi
Similar publications
Deep Learning (DL) is a branch of Machine Learning in which predictive models are built from neural networks with several layers. DL models have been developed for effort estimation in software development. This paper reviews work on the use of DL models for effort estimation in Scrum. The various textual i...
Citations
... Using PRED and the mean magnitude of relative error (MMRE), they observed an improved error score. Sakhrawi et al. [16] used projects based on the Scrum framework; their first model uses three ML techniques, namely Random Forest Regressor (RFR), LinearSVR, and Decision Tree Regressor (DTRegr), and their second model uses a StackingRegressor. This study achieved the following results: Mean Squared Error (MSE) 0.406, Mean Absolute Error (MAE) 0.206, and Root Mean Squared Error (RMSE) 0.595. ...
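For reference, the error measures cited throughout these excerpts (MAE, MSE, RMSE, MMRE, and PRED) can be computed as sketched below. The actual/predicted values are illustrative toy numbers, not results from the cited studies.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

actual = np.array([10.0, 20.0, 30.0, 40.0])      # hypothetical actual efforts
predicted = np.array([12.0, 18.0, 33.0, 38.0])   # hypothetical model output

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = np.sqrt(mse)

# Magnitude of relative error per project, then its mean (MMRE)
mre = np.abs(actual - predicted) / actual
mmre = mre.mean()

# PRED(25): fraction of projects whose MRE is at most 0.25
pred_25 = (mre <= 0.25).mean()

print(mae, mse, rmse, mmre, pred_25)  # 2.25 5.25 2.29... 0.1125 1.0
```

Lower MAE/MSE/RMSE/MMRE and higher PRED values indicate more accurate estimation.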
... The second model uses all the algorithms in the first model except M5P. They used the same evaluation metrics as in their previous study [16], but this time the results show that estimation with Correlation-based feature selection is improved, and the second model is more accurate than the first one, which includes M5P. Chukhray et al. [18] selected and trained weak predictors in the first stage: a support vector machine, a K-nearest neighbour classifier, and multivariate linear regression models. ...
The volatile factors involved in software cost estimation have long been an obstacle in the software development life cycle. The inaccuracy they introduce into the estimation process has a detrimental effect on the stakeholders concerned. This can be mitigated by using machine learning algorithms to estimate cost, which significantly reduces the volatility of the process and yields more reliable results. Thus, implementing stacking on various datasets with SVR, LightGBM, K-nearest neighbours, and Random Forest at level 0 and Ridge Regression at level 1 has given highly accurate results. The SENSE (Software Effort Estimation using Novel Stacking and Ensemble learning) model proposed in this study is validated on six datasets (China, Kemerer, Albrecht, Nasa93, ISBSG, and Maxwell) and evaluated using MAE, RMSE, R², PRED, and MMRE. We find that the proposed model performs competitively in experimental evaluation and statistical analysis compared with the other studies considered in this work.
... According to [12], ensemble learning models outperform their single-model counterparts. Sakhrawi et al. [16,17] concluded that the SEM shows promise in improving the accuracy of single models used to estimate effort in both conventional and Scrum software development. The use of ensemble learning models has increased in both conventional and agile contexts [18]. ...
... The use of ensemble learning models has increased in both conventional and agile contexts [18]. Several studies have employed the SEM to estimate effort during the development and maintenance phases exclusively [16,17]. This study is unique in using it to enhance the accuracy of cost prediction during the software development phase, with FS as the primary independent variable. ...
Estimating software costs is a vital step in guaranteeing the successful completion of a software project. Given the significant impact of Functional Size (FS) measurement on obtaining accurate estimates of enhancement and development project efforts, this study investigates the use of FS as the key independent variable for predicting software project development costs. This is accomplished using ensemble models. The dataset used in this study came from the International Software Benchmarking Standards Group (ISBSG) repository. This research compares various single Machine Learning (ML) models and ensemble learning models, tuned using the Grid Search (GS) technique, to demonstrate the efficacy of our approach. Using the FS of a new development request, the following observations were made: (i) the proposed Stacking Software Development Cost Estimation (StackSDCE) model outperformed the three distinct ML algorithms applied independently; (ii) applying GS to fine-tune and configure the individual ML methods improved the precision of the StackSDCE outcomes; and (iii) StackSDCE with GS tuning produced more precise estimations.
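The pattern of grid-searching each individual ML method before stacking it can be sketched as follows. This is an assumed scikit-learn rendering of the idea, not the StackSDCE code: the learners, parameter grids, and synthetic data (standing in for the ISBSG records with FS as the independent variable) are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for ISBSG development-project data
X, y = make_regression(n_samples=150, n_features=5, noise=5.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Grid-search each individual learner before it joins the ensemble
svr_gs = GridSearchCV(SVR(), {"C": [1.0, 10.0, 100.0]}, cv=3)
tree_gs = GridSearchCV(DecisionTreeRegressor(random_state=1),
                       {"max_depth": [3, 5, None]}, cv=3)

# Stack the tuned learners; each GridSearchCV refits its best estimator
stack_sdce = StackingRegressor(
    estimators=[("svr", svr_gs), ("tree", tree_gs)],
    final_estimator=Ridge(),
)
stack_sdce.fit(X_train, y_train)
print("Held-out R^2:", stack_sdce.score(X_test, y_test))
```

Because `GridSearchCV` is itself an estimator, it can be dropped into the stack directly, so every base learner enters the ensemble already tuned.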
... According to Idri et al. [17], ensemble learning models perform better than their single-model counterparts. Sakhrawi et al. [21,22] concluded that the SEM is a promising method of enhancing the precision of single models used to estimate the effort required to enhance conventional and Scrum software. In both the conventional and agile contexts, the use of ensemble learning models has increased [23]. ...
... In both the conventional and agile contexts, the use of ensemble learning models has increased [23]. Multiple studies have used the SEM to estimate the amount of effort required during the development and maintenance phases only [17,21,22]. Ensemble learning models are used in the testing phase to estimate software defects. ...
Regression testing, a type of software testing, is often costly and labour-intensive. As such, many corporations have intensified efforts to estimate the amount of effort required. However, frequent alterations in software projects impact the precision of software regression test effort estimation (SRTEE), which increases the difficulty of managing software projects. Machine learning (ML) has therefore increasingly been used to develop more accurate SRTEE models. The estimation process of a software project comprises inputs, the model, and outputs. The present study examines the quality of the estimation inputs and the model required to deliver accurate estimation outputs. An SRTEE model that uses the stacking ensemble model (StackSRTEE) was developed to increase the precision of SRTEE. It consists of three of the most common ML methods, namely neural networks, support vector regression, and decision tree regression. The grid search (GS) technique was then used to tune the hyperparameters of the StackSRTEE before it was trained and tested using a dataset from the International Software Benchmarking Standards Group (ISBSG) repository. The size of the functional change, specifically an enhancement, was used as the primary independent variable to improve the inputs of the StackSRTEE model. With the appropriate features, such as the functional change size of an enhancement: (1) the proposed StackSRTEE model yielded higher accuracy than the three individual ML methods on their own; (2) using GS to tune and set the individual ML methods increased the precision of the SRTEE outputs; and (3) StackSRTEE-based GS tuning yielded estimations that were more precise.
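A complementary way to apply grid search to such a stack is to tune the hyperparameters of the base learners through the ensemble itself, using scikit-learn's nested parameter names. This is a hedged sketch, not the StackSRTEE implementation: `MLPRegressor` stands in for the neural network, the parameter grids are illustrative, and synthetic data replaces the ISBSG enhancement records.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for ISBSG enhancement data
X, y = make_regression(n_samples=100, n_features=4, noise=5.0, random_state=2)

# The three learner families named in the abstract: a neural network,
# support vector regression, and decision tree regression
stack = StackingRegressor(
    estimators=[
        ("nn", make_pipeline(StandardScaler(),
                             MLPRegressor(hidden_layer_sizes=(8,),
                                          max_iter=200, random_state=2))),
        ("svr", SVR()),
        ("tree", DecisionTreeRegressor(random_state=2)),
    ],
    final_estimator=Ridge(),
)

# Grid search over base-learner hyperparameters inside the stack,
# addressed via scikit-learn's "<name>__<param>" convention
grid = GridSearchCV(
    stack,
    param_grid={"svr__C": [1.0, 10.0], "tree__max_depth": [3, None]},
    cv=3,
)
grid.fit(X, y)
print("Best params:", grid.best_params_)
```

Tuning through the stack optimizes the base learners for the ensemble's cross-validated score rather than for each learner's individual score, at the cost of a larger search space.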
Background: Accurate effort estimation is crucial for planning in Agile iterative development. Agile estimation generally relies on consensus-based methods like planning poker, which require less time and information than more formal methods (e.g., COSMIC) but are prone to inaccuracy. Understanding the common reasons for inaccurate estimations and how proposed approaches can assist practitioners is essential. However, prior systematic literature reviews (SLRs) focus only on estimation practices (e.g., [26, 127]) and effort estimation approaches (e.g., [6]). Aim: We aim to identify themes among the reasons for inaccurate estimations and to classify approaches for improving effort estimation. Method: We conducted an SLR and identified the key themes and a taxonomy. Results: The reasons for inaccurate estimation relate to information quality, the team, estimation practice, project management, and business influences. Effort estimation approaches were the most investigated in the literature, while only a few aim to support the effort estimation process. Moreover, a few automated approaches are at risk of data leakage and indirect validation scenarios. Recommendations: Practitioners should enhance the quality of information for effort estimation, potentially by adopting an automated approach. Future research should aim to improve information quality while avoiding data leakage and indirect validation scenarios.