Figure 2 - uploaded by Narimane Zighed
The general process of using ML in maintainability prediction

Source publication
Article
Software maintainability is one of the most important aspects when evaluating the quality of a software product. It is defined as the ease with which the existing software can be modified. In the literature, several researchers have proposed a large number of models to measure and predict maintainability throughout different phases of the Software...

Similar publications

Research
Continuous integration and development pipelines are commonly used techniques in traditional software development. Data management and data analysis have been revolutionized by big data and machine learning tools. Our vision is to merge these technologies and techniques to bring together professionals like software engineers, data engineers and platform...

Citations

... Despite these concerns, the Index has reached general acceptance and is frequently used in maintainability research for object-oriented software. Studies conducting literature reviews on software maintainability [6,7,43,44,45] report that the Index is one of the most commonly used measures of maintainability. In addition, the Index is widely adopted in practice, as its calculation is available within development environments, e.g., Visual Studio [12], as well as in popular metrics collection tools, e.g., JHawk [46], Radon [47], CMT++, and CMTJava [48]. ...
Article
During maintenance, software systems undergo continuous correction and enhancement activities due to emerging faults, changing environments, and evolving requirements, making this phase expensive and time-consuming, often exceeding the initial development costs. To better understand and manage software under development and maintenance, several maintainability measures have been proposed. The Maintainability Index is commonly used as a quantitative measure of the relative ease of software maintenance. There are several Index variants that differ in the factors affecting maintainability (e.g., code complexity, software size, documentation) and in the importance given to each. To explore the variants and understand how they compare when evaluating software maintainability, an experiment was conducted with 45 Java-based object-oriented software systems. The results showed that the choice of variant could influence the perception of maintainability. Although different variants produced different values for the same software, their values were strongly positively correlated and generally gave a similar indication of how maintainability evolved between releases and over the long term. However, when focusing on the fine-grained results of the Index, the variant selection had a larger impact. Based on their characteristics, behavior, and interrelationships, the variants were divided into two distinct clusters, i.e., variants that do not consider code comments in their calculation and those that do.
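For orientation, the sketch below computes two widely cited Index variants that ignore code comments: the original three-metric formula and the rescaled 0 to 100 form documented for Visual Studio. The abstract does not list the formulas used in the experiment, so the coefficients are taken from the general literature, and the function and parameter names are illustrative assumptions. Comment-aware variants add a further term based on the share of comment lines, which is not shown here.

```python
import math


def mi_classic(halstead_volume: float, cyclomatic_complexity: float, loc: float) -> float:
    """Three-metric Maintainability Index (no comment term).

    Assumed inputs: Halstead volume, cyclomatic complexity, lines of code.
    Higher values indicate code that is easier to maintain.
    """
    return (
        171.0
        - 5.2 * math.log(halstead_volume)
        - 0.23 * cyclomatic_complexity
        - 16.2 * math.log(loc)
    )


def mi_normalized(halstead_volume: float, cyclomatic_complexity: float, loc: float) -> float:
    """Same formula rescaled to the range 0..100 and clamped at 0."""
    raw = mi_classic(halstead_volume, cyclomatic_complexity, loc)
    return max(0.0, raw * 100.0 / 171.0)


# Hypothetical module measurements, chosen only for illustration.
classic = mi_classic(1000.0, 10.0, 100.0)
normalized = mi_normalized(1000.0, 10.0, 100.0)
print(f"classic={classic:.2f} normalized={normalized:.2f}")
```

Because the second variant is a rescaling of the first (apart from the clamp at zero), the two give different absolute values but rank modules identically, which illustrates how variants can differ in value while remaining strongly correlated.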
... Generally, almost all of the publications divided the Software Development Life Cycle into the phases Development, Testing, Deployment, and sometimes Maintenance. The distribution of publications among these phases is given in Fig. 3.

Ref.  Year  Type              Index
[35]  2017  Conference paper  Scopus
[24]  2014  Conference paper  Scopus
[2]   2019  Journal paper     Scopus
[19]  2010  Conference paper  WoS
[32]  2018  Journal paper     Scopus
[14]  2019  Conference paper  Scopus
[16]  2016  Conference paper  Scopus
[37]  2008  Conference paper  WoS
[38]  2013  Conference paper  Scopus
[3]   2003  Conference paper  WoS
[20]  2016  Journal paper     Scopus
[33]  2019  Journal paper     Scopus
[40]  2018  Journal paper     Scopus
[25]  2019  Journal paper     Scopus
[21]  2017  Journal paper     WoS
[23]  2015  Journal paper     Scopus
[18]  2013  Journal paper     Scopus
[5]   2018  Conference paper  Scopus
[27]  2014  Conference paper  Scopus
[22]  2017  Conference paper  Scopus
[31]  2015  Book chapter      WoS
[26]  2018  Conference paper  Scopus
[39]  2008  Conference paper  Scopus
[34]  2008  Conference paper  Scopus
[28]  2016  Book chapter      Scopus
[30]  2007  Conference paper  Scopus
...
Chapter
The density of software systems worldwide increases daily. Success in today's competitive market requires sustainable, high-quality software products. In contrast to the quantity of software products, the quality and cost of software tend to depend on several aspects. However, these aspects are not yet fully recognized as fundamentally essential. Full control over software quality requires software metrics to be introduced. By using software quality metrics effectively, one can monitor the software development process, minimize cost, track resource usage, and maintain the expected results. This paper reviews the late phases and the existing software quality models for tracking software process quality metrics in these late phases. Based on the summarized studies, we describe our system architecture for evaluating software quality with embedded external systems. The paper also identifies additional metrics that can be measured with the help of our framework.
... However, such surveys involve high costs, are very time-consuming, and may produce biased opinions due to the subjectiveness involved in the external quality attributes. In contrast, measurement of internal quality attributes using Object-Oriented (OO) metric suites has been validated by many researchers for predicting maintainability, in view of the relationship that exists between OO metrics and maintainability [4]-[9]. Hence, the current study also uses these OO metrics for Software Maintainability Prediction (SMP). ...
... However, there exist different software metrics based on whether the paradigm is procedural or OO. As per the existing literature, software systems have been analyzed from three perspectives, i.e., the architecture of the system, its design, and the code for SMP [4]. However, out of these, code-level analysis for SMP is the most widely used perspective. ...
Article
Software maintainability is an indispensable factor in the quality of a piece of software. It describes the ease of performing maintenance activities to adapt software to a modified environment. The availability and growing popularity of a wide range of Machine Learning (ML) algorithms for data analysis further motivates predicting this maintainability. However, an extensive analysis and comparison of various ML-based Boosting Algorithms (BAs) for Software Maintainability Prediction (SMP) has not yet been made. Therefore, the current study analyzes and compares five different BAs, i.e., AdaBoost, GBM, XGB, LightGBM, and CatBoost, for SMP using open-source datasets. Performance of the proposed prediction models has been evaluated using Root Mean Square Error (RMSE), Mean Magnitude of Relative Error (MMRE), Pred(0.25), Pred(0.30), and Pred(0.75) as prediction accuracy measures, followed by a non-parametric statistical test and a post hoc analysis to account for the differences in the performance of the various BAs. Based on the residual errors obtained, it was observed that GBM is the best performer for RMSE, followed by LightGBM, whereas, in the case of MMRE, XGB performed best for six out of the seven datasets, i.e., for 85.71% of the total datasets, by providing minimum values for MMRE ranging from 0.90 to 3.82. Further, on applying the statistical test and performing the post hoc analysis, it was found that significant differences exist in the performance of different BAs, and that XGB and CatBoost outperformed all other BAs for MMRE. Lastly, BAs have also been compared with four other ML algorithms to bring out the BAs' superiority over those algorithms. This study would open new doors for software developers to make comparatively more precise predictions well in time and hence reduce overall maintenance costs.
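The accuracy measures named in this abstract can be sketched as follows, using their usual definitions: RMSE as the root of the mean squared residual, MMRE as the mean of |actual - predicted| / |actual|, and Pred(q) as the share of observations whose relative error is at most q. The function names and sample values are illustrative assumptions, not data from the study.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Square Error over paired observations."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_errors(actual: Sequence[float], predicted: Sequence[float]) -> list:
    """Magnitude of Relative Error (MRE) per observation; actual values must be non-zero."""
    return [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]


def mmre(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Magnitude of Relative Error."""
    errors = relative_errors(actual, predicted)
    return sum(errors) / len(errors)


def pred(actual: Sequence[float], predicted: Sequence[float], q: float) -> float:
    """Pred(q): fraction of observations with MRE less than or equal to q."""
    errors = relative_errors(actual, predicted)
    return sum(1 for e in errors if e <= q) / len(errors)


# Hypothetical maintenance-effort values, for illustration only.
actual = [10.0, 20.0, 40.0, 50.0]
predicted = [12.0, 18.0, 30.0, 50.0]
print(rmse(actual, predicted), mmre(actual, predicted), pred(actual, predicted, 0.30))
```

Lower RMSE and MMRE indicate better predictions, whereas a higher Pred(q) is better, so the measures should not be ranked in the same direction.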
... The researchers behind the Software Maintainability Prediction (SMP) Framework apply several types of mathematical, machine learning, and evolving models to historical data in order to train various complex models for keeping track of all kinds of software updates [22,24,25,26,27]. ...
Article
The software industry's competitive nature makes it natural that software managers and developers face several crucial decisions in managing a software project. These decisions are taken to enhance process maturity and product quality through improved planning accuracy and monitoring control. In this study, the factors determining the growth of software project management were analyzed. This study used an online survey to collect the necessary data relating to the development, classification, consideration, priority setting, and preparation in software projects. It was observed that team incapability, time constraints, limited testing criteria, the customer's inability to understand quality specifications, budget limitations, limited ability to handle quality requirements, and lack of customer involvement are the major constraints in software project development. The analysis indicates that quality criteria, performance, security, usability, team capability, and customer involvement gained more consideration in the context of software development. Finally, it was recommended that project managers and developers learn how essential it is to delegate specific roles, to avoid difficulties resulting from a lack of clear accountability for the required specifications in the production of software.
... Most of these models were proposed at the code level, while a few were proposed at the design and architecture levels [8]. In this paper, we consider the most recently proposed techniques for predicting Object-Oriented software maintainability using artificial intelligence techniques. ...
... Object-Oriented metrics are an estimation procedure of product metrics in which computation is done on real-world entities to depict them as indicated by plainly characterized rules. These metrics encourage programming specialists to discover the profitability of the product application (6). ...
... Consequently, one of the essential objectives of software engineering is to develop techniques and tools for high-quality software solutions that are stable and maintainable [5]. Software maintainability is one of the most important aspects when evaluating the quality of a software product [6] and is one of the key stages in the software development lifecycle [7]. ...
... The relationship between software design metrics and their maintainability has been proposed and validated by many researchers [6,9]. Based on the empirical study by Malhotra and Chug, it has been established that the quality of the software design, as well as code, is very important to enhance software maintainability [9]. ...
... The indirect maintainability measures, combined with a variety of software metrics that capture the software's internal quality, represent efficient input for either statistical or machine learning algorithms to build useful prediction models. To establish a relationship between software design metrics as the independent variables and maintainability as the dependent variable, various techniques have been practised in the last two and a half decades [9], including statistical algorithms, machine learning algorithms, nature-inspired techniques, expert judgment, and hybrid techniques [6,19]. ...
Article
Software maintenance is one of the key stages in the software lifecycle, and it includes a variety of activities that consume a significant portion of the costs of a software project. Previous research suggests that future software maintainability can be predicted based on various source code aspects, but most of the research focuses on prediction based on the present state of the code and ignores its history. While taking the history into account in software maintainability prediction seems intuitive, research empirically testing this has not been done, and doing so is the main goal of this paper. This paper empirically evaluates the contribution of historical measurements of the Chidamber & Kemerer (C&K) software metrics to software maintainability prediction models. The main contribution of the paper is the building of prediction models with classification and regression trees and random forest learners in iterations, gradually adding historical measurement data extracted from previous releases. The maintainability prediction models were built from software metric measurements obtained from real-world open-source software projects. The analysis of the results shows that additional historical metric measurements contribute to maintainability prediction. Additionally, the study evaluates the contribution of individual C&K software metrics to the performance of maintainability prediction models.
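One way to picture the iterative addition of history is as a feature vector that grows with each extra release. The sketch below is an assumption about how such input could be laid out, not the paper's actual pipeline: it concatenates the six C&K metrics of the current release with those of a chosen number of previous releases. The function name, the `depth` parameter, and the sample values are hypothetical.

```python
from typing import Dict, List, Sequence

# The six Chidamber & Kemerer metrics.
CK_METRICS = ("WMC", "DIT", "NOC", "CBO", "RFC", "LCOM")


def build_features(history: Sequence[Dict[str, float]], depth: int) -> List[float]:
    """Build one feature vector for a class.

    history: per-release metric measurements, oldest first; the last
             entry is the current release.
    depth:   number of previous releases to include (0 = current only).
    """
    if depth < 0 or depth >= len(history):
        raise ValueError("depth must be between 0 and len(history) - 1")
    window = history[len(history) - 1 - depth:]
    features: List[float] = []
    # Current release first, then progressively older releases.
    for release in reversed(window):
        features.extend(release[name] for name in CK_METRICS)
    return features


# Hypothetical measurements of one class over three releases.
history = [
    {"WMC": 8, "DIT": 1, "NOC": 0, "CBO": 3, "RFC": 15, "LCOM": 10},
    {"WMC": 11, "DIT": 1, "NOC": 0, "CBO": 4, "RFC": 19, "LCOM": 14},
    {"WMC": 12, "DIT": 2, "NOC": 1, "CBO": 6, "RFC": 24, "LCOM": 21},
]

for depth in range(len(history)):
    print(depth, len(build_features(history, depth)))
```

Training one model per `depth` value and comparing their errors would then show whether each added release improves prediction.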
... The results showed that bagging ensemble model significantly improved the accuracy of prediction. Recently, Zighed et al. [14] conducted a comparative analysis of different OO SMP models from three perspectives i.e. the architectural level, design level and the code level. It was revealed that a number of statistical and ML techniques have been employed at the code level. ...
Article
Software maintainability refers to the ease with which software maintenance activities, such as correction of faults, deletion of obsolete code, and addition of new code, can be carried out to adapt to a modified environment. Predicting maintainability in the early stages of development helps reduce the cost of maintenance and ensures optimum utilization of resources. Sometimes it is difficult to train prediction models on historical data from the same project for which the model is being developed, because sufficient training data is unavailable; this opens the way for the Cross-Project technique for Software Maintainability Prediction (CPSMP). To evaluate the proposed CPSMP technique, this study uses the QUES dataset as the training set and the UIMS dataset as the test set, with 19 different regression modelling methods. Performance of the CPSMP model is evaluated using Root Mean Square Error (RMSE) as an accuracy measure. Results show that the cross-project technique can successfully be applied to maintainability prediction. The average RMSE value calculated over all the modelling methods is 82.310 without CPSMP, whereas an average RMSE value of 71.532 is obtained with CPSMP, an overall improvement in prediction performance of 13.09%. Also, 84.21% of the techniques used in this study performed better with CPSMP.
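The reported percentages can be checked with a short calculation. The two RMSE averages and the count of 19 methods come from the abstract; the figure of 16 improved methods is inferred here from 84.21% of 19 and is not stated in the source. The function name is illustrative.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Percentage reduction in error relative to the baseline."""
    return (baseline_error - new_error) / baseline_error * 100.0


# Average RMSE without and with CPSMP, as reported in the abstract.
improvement = relative_improvement(82.310, 71.532)

# Inferred: 16 of the 19 regression methods improved (84.21% of 19).
share_improved = 16 / 19 * 100.0

print(f"RMSE improvement: {improvement:.2f}%")
print(f"Methods improved: {share_improved:.2f}%")
```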
Thesis
Today, when a company designs, develops, and manufactures goods or services, it must not only target a high level of product quality to satisfy customers but also comply with many standards and regulations. This is particularly true of transportation systems, where a few well-known standards and guidelines can be named: ISO 26262 [1] addresses software functional safety in automotive, ARP4754 [2] provides guidelines for the development of civil aircraft, and DO-178C [3] addresses software safety in aeronautics. Furthermore, these safety guidelines require the company to be at the state of the art in processes and methods when designing and developing a new vehicle. In the context of automotive systems development, our research aims to strengthen and unify quality definition, assessment, control, and prediction activities for automotive embedded software. To address this problem, we first explore the concept of quality, qualimetry (the science of quality quantification [4]), and the state of the art in quality modeling for embedded software. The result not only popularizes and synthesizes the knowledge behind these complex concepts but also confirms the choice of qualimetry as the right approach to our problem, for which no proper solution exists yet. We then continue our study, considering biology as a key factor in our research. We create a classified collection of clades of more than 450 quality models for software. We select the most appropriate quality model from this pool and, after introducing the concept of polymorphism in quality modeling, demonstrate how to adapt and operationalize this model for automotive embedded software. This last achievement answers our original problem. As a further conclusion of our research, we finally investigate whether a unique quality model for software products, as Zouheyr Tamrabet et al. [5] aim to propose, is more appropriate than a meta-model acting as a quality model aggregator for software products, giving a first glimpse of the resulting model, whose qualifier is the genome of the software quality model.
[1] “ISO 26262-6:2011 - Road vehicles - Functional safety - Part 6: Product development at the software level,” International Organization for Standardization, 2011.
[2] “ARP4754A - Guidelines for Development of Civil Aircraft and Systems,” SAE International, Dec. 2010, [Online]. Available: https://www.sae.org/standards/content/arp4754a/.
[3] “DO-178C - Software Considerations in Airborne Systems and Equipment Certification,” Radio Technical Commission for Aeronautics, Dec. 2011, [Online]. Available: https://my.rtca.org/NC__Product?id=a1B36000001IcmqEAC.
[4] G. G. Azgaldov et al., “Qualimetry: the Science of Product Quality Assessment,” Standarty i kachestvo, no. 1, 1968.
[5] Zouheyr Tamrabet, Toufik Marir, and Farid Mokhati, “A Survey on Quality Attributes and Quality Models for Embedded Software,” International Journal of Embedded and Real-Time Communication Systems (IJERTCS), vol. 9, no. 2, pp. 1–17, 2018, doi: 10.4018/IJERTCS.2018070101.
Article
Software engineering is a discipline of Computer Science in which new sub-areas are constantly added, especially in the areas of quality, data management, and architectural design. Nowadays, software development languages and processes are changing rapidly to deliver high-quality software products, i.e., systems that are usable, hybrid, and fulfill users’ needs. This paper aims to identify and classify different process models proposed by researchers based on characteristics of software quality, data management, and software integration and redesign. From a study of several models through a systematic mapping study, we identify different parameters and present them in a traceability matrix. The parameters are classified into six areas. This paper provides in-depth theoretical insight into the models and their characteristics. The systematic mapping study was conducted through a literature review. The methodology used in this paper is both qualitative and quantitative. Initially, through the systematic mapping study, we examine different models working on different parameters. We then propose a model that can cover all aspects of software implementation and management, selecting ERP systems as the application area. Later, we perform a gap analysis and a statistical evaluation of the model. It has been observed that all of the existing models are area-specific, focused on quality parameters, management issues, or architecture. The proposed model covers all of these aspects. The primary research shows that industry practitioners also need a better model for quality implementation. Our statistical analysis can serve as a decision-making tool for them to add to their decision-making processes. Others could use it to further enhance the framework for quality management. This model will be enhanced further in the future for better implementation.