Article

Towards a Metrics Suite for Object Oriented Design

Authors: Shyam R. Chidamber, Chris F. Kemerer

Abstract

While software metrics are a generally desirable feature in the software management functions of project planning and project evaluation, they are of especial importance with a new technology such as the object-oriented approach. This is due to the significant need to train software engineers in generally accepted object-oriented principles. This paper presents theoretical work that builds a suite of metrics for object-oriented design. In particular, these metrics are based upon measurement theory and are informed by the insights of experienced object-oriented software developers. The proposed metrics are formally evaluated against a widely accepted list of software metric evaluation criteria.


... Table 3 presents the selected metrics, the definition used, and the aspect of modularity they represent. We highlight that the selected metrics had their correlation to the modularity previously explored by the literature [18,59,60]. We used the CK tool [61] to calculate the metrics in Table 3. ...
... The lower the value, the higher the cohesion of the class [63,59]. These metrics capture different views of modularity. High coupling means that the elements in the source code have a strong dependency on each other. ...
... Notice that the same is valid for the selection of the modularity metrics. We also selected the used metrics, influenced by their use in the literature [21,59] and evidence of its correlation with the modularity aspects [13,60]. ...
Preprint
Full-text available
Context. Code smell is a symptom of decisions about the system design or code that may degrade its modularity. For example, they may indicate inheritance misuse, excessive coupling and size. When two or more code smells occur in the same snippet of code, they form a code smell agglomeration. Objective. Few studies evaluate how agglomerations may impact code modularity. In this work, we evaluate which aspects of modularity are being hindered by agglomerations. This way, we can support practitioners in improving their code, by refactoring the code involved with code smell agglomeration that was found as harmful to the system modularity. Method. We analyze agglomerations composed of four types of code smells: Large Class, Long Method, Feature Envy, and Refused Bequest. We then conduct a comparison study between 20 systems mined from the Qualita Corpus dataset with 10 systems mined from GitHub. In total, we analyzed 1,789 agglomerations in 30 software projects, from both repositories: Qualita Corpus and GitHub. We rely on frequent itemset mining and non-parametric hypothesis testing for our analysis. Results. Agglomerations formed by two or more Feature Envy smells have a significant frequency in the source code for both repositories. Agglomerations formed by different smell types impact the modularity more than classes with only one smell type and classes without smells. For some metrics, when Large Class appears alone, it has a significant and large impact when compared to classes that have two or more method-level smells of the same type. Conclusion. We have identified which agglomerations are more frequent in the source code, and how they may impact the code modularity. Consequently, we provide supporting evidence of which agglomerations developers should refactor to improve the code modularity.
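The study above relies on frequent itemset mining to identify which smell types co-occur as agglomerations. A minimal sketch of that step is shown below; it assumes the mlxtend library and uses a fabricated smell-occurrence matrix, not the Qualita Corpus or GitHub data.

```python
# Sketch: find frequently co-occurring code smells ("agglomerations") with
# frequent itemset mining. The smell occurrence matrix is invented for
# illustration; a real study would mine it from analysed projects.
import pandas as pd
from mlxtend.frequent_patterns import apriori

# One row per class, one boolean column per smell type.
smells = pd.DataFrame({
    "LargeClass":     [1, 0, 1, 1, 0, 1],
    "LongMethod":     [1, 1, 1, 0, 0, 1],
    "FeatureEnvy":    [0, 1, 1, 1, 0, 1],
    "RefusedBequest": [0, 0, 0, 1, 0, 0],
}, dtype=bool)

# Itemsets present in at least a third of the classes.
frequent = apriori(smells, min_support=1 / 3, use_colnames=True)
print(frequent.sort_values("support", ascending=False))
```

Itemsets whose support clears the chosen threshold correspond to candidate agglomerations worth inspecting further.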
... • RQ2: Are Error-type metrics more effective predictors of faults compared to CK metrics when used to train DNN models? -This research question aims to empirically compare the performance differences between DNN models constructed using Error-type metrics and those constructed using CK metrics [13]. The rationale for using CK metrics as a benchmark for comparison is detailed in subsection 3.2. ...
... Over the past three decades, a wide range of software metrics has been proposed. These include object-oriented metrics like CK metrics suite [13], MOODS metrics suite [17], Bansiya metrics suite [16], and traditional metrics like size metrics (e.g., Function Points -FP, SLOC, KSLOC), quality metrics (e.g., Defects per FP after delivery, Defects per SLOC or KSLOC after delivery), system complex metrics [18], and Halstead metrics [19]. As highlighted by Rathore and Kumar [2], the performance of metrics can vary based on the environment (e.g., open-source vs. commercial). ...
... To validate the effectiveness of Error-type metrics, we also train DNN models on the wellestablished Chidamber & Kemerer (CK) metrics suite [13]. The CK metrics suite comprises six metrics, namely Coupling Between Objects (CBO), Weighted Methods per Class (WMC), Response for a Class (RFC), Lack of Cohesion in Methods (LCOM), Depth of Inheritance Tree (DIT), and Number of Children (NOC). ...
Article
Full-text available
In the context of software quality assurance, Software Fault Prediction (SFP) serves as a critical technique to optimise costs and efforts by classifying software modules as faulty or not, using pertinent project characteristics. Despite considerable progress, SFP techniques seem to have hit a "performance ceiling", mainly due to the limitations of small-scale datasets from public repositories and the challenge of selecting the most appropriate software metrics for each unique application domain. Additionally, traditional machine learning techniques have been the mainstay for fault-proneness prediction, leaving the potential of more advanced methodologies, such as Deep Neural Networks (DNNs), largely unexplored. This paper addresses these gaps by introducing an innovative approach for fault-proneness prediction through the application of DNNs trained with Error-type metrics on industrial open-source software projects. Error-type metrics, with their application-agnostic nature and proven capabilities in improving prediction performances, are leveraged to facilitate broader informational content in the training data and overcome the "performance ceiling". The empirical results reveal that DNN models, trained with Error-type metrics, have shown significant performance improvements of up to 40% in terms of AUC and ROC when compared to models built using the conventional CK metrics. Notably, our proposed methodology also demonstrates superior performance when compared to state-of-the-art DNN models, even those that leverage the sophisticated self-attention mechanism, with our approach surpassing them by up to 17.86%.
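The excerpts above enumerate the six metrics of the CK suite (CBO, WMC, RFC, LCOM, DIT, NOC). The sketch below is only an illustrative approximation of four of them on a hand-built toy class model; the data structures and counting rules are simplifications, not the formal definitions from the paper.

```python
# Minimal sketch: approximate four CK-style metrics on a hand-built class model.
# The data model and the exact counting rules here are simplified assumptions.
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ToyClass:
    name: str
    parent: Optional["ToyClass"] = None          # single inheritance only
    methods: List[str] = field(default_factory=list)
    called_methods: Set[str] = field(default_factory=set)   # methods it invokes elsewhere
    coupled_classes: Set[str] = field(default_factory=set)  # classes it uses


def wmc(cls: ToyClass) -> int:
    """Weighted Methods per Class with every method weighted 1."""
    return len(cls.methods)


def dit(cls: ToyClass) -> int:
    """Depth of Inheritance Tree: number of ancestors above the class."""
    depth, current = 0, cls.parent
    while current is not None:
        depth, current = depth + 1, current.parent
    return depth


def noc(cls: ToyClass, all_classes: List[ToyClass]) -> int:
    """Number of Children: immediate subclasses."""
    return sum(1 for c in all_classes if c.parent is cls)


def rfc(cls: ToyClass) -> int:
    """Response For a Class: own methods plus methods it calls."""
    return len(set(cls.methods) | cls.called_methods)


if __name__ == "__main__":
    base = ToyClass("Shape", methods=["area", "draw"])
    circle = ToyClass("Circle", parent=base, methods=["area", "radius"],
                      called_methods={"draw"}, coupled_classes={"Canvas"})
    classes = [base, circle]
    for c in classes:
        print(c.name, "WMC:", wmc(c), "DIT:", dit(c),
              "NOC:", noc(c, classes), "RFC:", rfc(c), "CBO:", len(c.coupled_classes))
```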
... According to metrics definitions, improving means increasing the value in some and decreasing the value in others. Different groups of these metrics such as CK metric (Chidamber and Kemerer, 1994), MOOD metrics (e Abreu, ...
... On the other hand, a high value indicates reduced encapsulation, increased complexity, and increased error probability (Chidamber and Kemerer, 1991). There are several LCOM criteria such as LCOM1 (Chidamber and Kemerer, 1991), LCOM2 (Chidamber and Kemerer, 1994), LCOM3 (Li and Henry, 1993), LCOM4 (Hitz and Montazeri, 1995) and LCOM5 (Henderson-Sellers, 1995). These criteria are similar but with different formulas. ...
Preprint
Full-text available
Anti-patterns occur when software design principles are not followed in the software design and coding process. Although there may be no errors or bugs in the software implementation, the quality of the software is reduced. To improve the quality of the software, its anti-patterns should be identified and corrected. Various software engineering criteria are defined and they are used to detect anti-patterns. Choosing the correct and limited combination of criteria as a detection rule is the main purpose of new automatic anti-pattern detection methods. Selection of base examples in these methods for their training is a challenge, which is highly dependent on software experts. In our method, the anti-patterns identified in open-source programs presented as a standard are used for the base examples to reduce the influence of experts’ tastes in detection. The second challenge in all metric-based methods is the complexity of choosing correct and limited criteria in detection rules which in our method by separating the rule tree for every anti-pattern, considered criteria are limited as much as possible. Therefore, we present the SADSE method, which is a new approach to create anti-pattern detection rules using standard base examples and reduce the complexity of searching for suitable software metrics. to identify the detection rules, a new fitness function is defined in the genetic programming, and the rule tree is created to identify each anti-pattern separately to reduce the complexity of selecting software criteria. The analysis of our method by SERC Benchmark shows that its precision and recall on average for all considered anti-patterns are 97% and 96.9%, respectively.
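One excerpt above lists several LCOM variants. The sketch below illustrates the general idea behind the commonly cited pair-counting formulation (method pairs that do or do not share instance variables); the attribute-usage map is an invented example, and the other LCOM variants use different formulas.

```python
# Sketch of an LCOM-style cohesion measure: count method pairs that share no
# instance variables (P) versus pairs that share at least one (Q), and report
# max(P - Q, 0). This follows the commonly cited pair-counting formulation.
from itertools import combinations
from typing import Dict, Set


def lcom(attribute_usage: Dict[str, Set[str]]) -> int:
    p = q = 0
    for m1, m2 in combinations(attribute_usage, 2):
        if attribute_usage[m1] & attribute_usage[m2]:
            q += 1
        else:
            p += 1
    return max(p - q, 0)


if __name__ == "__main__":
    # Hypothetical class: each method mapped to the instance variables it touches.
    usage = {
        "deposit": {"balance"},
        "withdraw": {"balance"},
        "print_statement": {"history"},
        "audit": {"history", "owner"},
    }
    print("LCOM:", lcom(usage))  # 4 pairs share nothing, 2 pairs share -> 2
```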
... Chidamber and Kemerer (C&K) Metrics C&K's proposed metrics are [28,43,44]: ...
... Therefore, the complexity of each method can either be considered equal to 1, or the McCabe cyclomatic complexity can be calculated for each method. C&K suggested that the McCabe cyclomatic complexity be used [43], but later, in their final publication, they simply state that any "traditional" metric can be applied [44]. In general, the number of methods and their complexity constitute an indicator of the effort required to develop and maintain the class. ...
Article
Full-text available
Over the years, various software quality measurement models have been proposed and used in academia and the software industry to assess the quality of produced code and to obtain guidelines for its improvement. In this article, we describe the design and functionality of SQMetrics, a tool for calculating object-oriented quality metrics for projects written in Java. SQMetrics provides the convenience of measuring small code, mainly covering academic or research needs. In this context, the application can be used by students of software engineering courses to make measurements and comparisons in their projects and gradually increase their quality by improving the calculated metrics. Teachers, on the other hand, can use SQMetrics to evaluate students’ Java projects and grade them in proportion to their quality. The contribution of the proposed tool is three-fold, as it has been: (a) tested for its completeness and functionality by comparing it with widely known similar tools, (b) evaluated for its usability and value as a learning aid by students, and (c) statistically tested for its value as a teachers’ aid assisting in the evaluation of student projects. Our findings verify SQMetrics’ effectiveness in helping software engineering students learn critical concepts and improve the quality of their code, as well as in helping teachers assess the quality of students’ Java projects and make more informed grading decisions.
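An excerpt above notes that WMC can weight each method as 1 or use McCabe's cyclomatic complexity per method. The sketch below contrasts the two weightings, approximating cyclomatic complexity with a crude keyword count rather than a real control-flow analysis; the method sources are invented.

```python
# Rough sketch: WMC with per-method cyclomatic complexity approximated as
# 1 + number of decision keywords in the method body. The keyword count is a
# crude stand-in for a proper control-flow analysis.
import re
from typing import Dict

DECISION_KEYWORDS = re.compile(r"\b(if|elif|for|while|case|catch|and|or)\b")


def approx_cyclomatic(method_source: str) -> int:
    return 1 + len(DECISION_KEYWORDS.findall(method_source))


def wmc_weighted(methods: Dict[str, str]) -> int:
    """Sum of approximate cyclomatic complexities; with unit weights,
    WMC would simply be len(methods)."""
    return sum(approx_cyclomatic(src) for src in methods.values())


if __name__ == "__main__":
    methods = {
        "area": "return self.w * self.h",
        "resize": "if w > 0 and h > 0: self.w, self.h = w, h",
    }
    print("WMC (unit weights):", len(methods))                 # 2
    print("WMC (cyclomatic weights):", wmc_weighted(methods))  # 1 + 3 = 4
```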
... Most research effort focuses on object-oriented programs, by using metrics that capture information about the static structure of the code at the class-level or method-level (as for example the Chidamber and Kemerer's metrics [25]). The software metrics that have the potential of being good testability predictors are derived by investigating the correlation between the metrics and the amount, the complexity and the thoroughness of the associated test cases. ...
... Alshawan et al. proposed a set of static metrics specific to web applications [22]. A large body of papers refers to the so-called CK metrics for object-oriented software [25]. Gupta et al. propose a fuzzy approach to integrate the CK metrics into a single metric that should represent testability [10]. ...
Preprint
Estimating software testability can crucially assist software managers to optimize test budgets and software quality. In this paper, we propose a new approach that radically differs from the traditional approach of pursuing testability measurements based on software metrics, e.g., the size of the code or the complexity of the designs. Our approach exploits automatic test generation and mutation analysis to quantify the evidence about the relative hardness of developing effective test cases. In the paper, we elaborate on the intuitions and the methodological choices that underlie our proposal for estimating testability, introduce a technique and a prototype that allows for concretely estimating testability accordingly, and discuss our findings out of a set of experiments in which we compare the performance of our estimations both against and in combination with traditional software metrics. The results show that our testability estimates capture a complementary dimension of testability that can be synergistically combined with approaches based on software metrics to improve the accuracy of predictions.
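The preprint above estimates testability from automatic test generation and mutation analysis rather than from static metrics. A heavily simplified sketch of the underlying signal, the fraction of mutants a test suite kills, is shown below; the mutants and tests are hand-written stand-ins for what a real mutation tool would generate.

```python
# Simplified sketch: a mutation-score style signal. Each "mutant" is a variant
# of a function; a mutant is "killed" if any test input exposes a behavioural
# change. Real mutation tools derive mutants from source code automatically.
from typing import Callable, List, Tuple


def mutation_score(original: Callable[[int, int], int],
                   mutants: List[Callable[[int, int], int]],
                   test_inputs: List[Tuple[int, int]]) -> float:
    killed = 0
    for mutant in mutants:
        if any(mutant(*args) != original(*args) for args in test_inputs):
            killed += 1
    return killed / len(mutants) if mutants else 0.0


if __name__ == "__main__":
    original = lambda a, b: a + b
    mutants = [lambda a, b: a - b,         # killed by most inputs
               lambda a, b: a + b + 0]     # equivalent mutant: never killed
    tests = [(1, 2), (0, 0), (-3, 3)]
    # A lower score suggests it is harder to write tests that expose changes.
    print("mutation score:", mutation_score(original, mutants, tests))  # 0.5
```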
... whereas McCabe's Cyclomatic Complexity [29] utilises graph theory to evaluate the program's complexity based on the control-flow graph. Beyond complexity, metrics can be used to evaluate object-oriented design principles, such as class coupling and depth of inheritance [7]. ...
... Some metrics that evaluate these areas may not be suitable for all types of novice programming assignments, especially short-form assignments typically used with unit testing. This could be due to these assignments providing the overall code design, limiting the ability to use maintainability metrics such as Depth of Inheritance Tree or Coupling Between Object Classes [7]. ...
Preprint
We conducted a systematic literature review on automated grading and feedback tools for programming education. We analysed 121 research papers from 2017 to 2021 inclusive and categorised them based on skills assessed, grading approach, language paradigm, degree of automation and evaluation techniques. Most papers grade the correctness of object-oriented assignments. Typically, these tools use a dynamic technique, primarily unit testing, to provide grades and feedback to the students. However, compared to correctness grading, few tools assess readability, maintainability, or documentation, focusing solely on the presence of documentation, not documentation quality.
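An excerpt above mentions that McCabe's cyclomatic complexity is computed over the control-flow graph. A standard worked instance of the formula follows; the graph sizes are assumed for illustration.

```latex
% Cyclomatic complexity of a control-flow graph G with E edges, N nodes,
% and P connected components (one per analysed routine):
\[
  V(G) = E - N + 2P
\]
% Assumed example: a single routine (P = 1) containing one if-statement and
% one while-loop gives roughly N = 6 nodes and E = 7 edges, so
\[
  V(G) = 7 - 6 + 2 \cdot 1 = 3,
\]
% which agrees with the familiar ``number of decisions + 1'' rule of thumb.
```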
... We acknowledge that using the validation status from other sources will reflect and change the results presented in Table 11, 12, 16, and 17. Metric suites such as CK metrics [68], Li & Henry metrics [69], MOOD metrics [70] and QMOOD metrics [71] have been empirically validated [41,[72][73][74][75]. However, all the good indicators metrics pointed out in the results (Table 13) remain the same as we have included both "high" and "moderate" strength of evidence. ...
Article
Context: Several secondary studies have investigated the relationship between internal quality attributes, source code metrics and external quality attributes. Sometimes they have contradictory results. Objective: We synthesize evidence of the link between internal quality attributes, source code metrics and external quality attributes along with the efficacy of the prediction models used. Method: We conducted a tertiary review to identify, evaluate and synthesize secondary studies. We used several characteristics of secondary studies as indicators for the strength of evidence and considered them when synthesizing the results. Results: From 711 secondary studies, we identified 15 secondary studies that have investigated the link between source code and external quality. Our results show : (1) primarily, the focus has been on object-oriented systems, (2) maintainability and reliability are most often linked to internal quality attributes and source code metrics, with only one secondary study reporting evidence for security, (3) only a small set of complexity, coupling, and size-related source code metrics report a consistent positive link with maintainability and reliability, and (4) group method of data handling (GMDH) based prediction models have performed better than other prediction models for maintainability prediction. Conclusions: Based on our results, lines of code, coupling, complexity and the cohesion metrics from Chidamber & Kemerer (CK) metrics are good indicators of maintainability with consistent evidence from high and moderate-quality secondary studies. Similarly, four CK metrics related to coupling, complexity and cohesion are good indicators of reliability, while inheritance and certain cohesion metrics show no consistent evidence of links to maintainability and reliability. Further empirical studies are needed to explore the link between internal quality attributes, source code metrics and other external quality attributes, including functionality, portability, and usability. The results will help researchers and practitioners understand the body of knowledge on the subject and identify future research directions.
... These functionalities help to design SW with high quality such as maintainability, reliability, portability, and reusability. Estimating the SW quality and finding its correlation with static code metrics can help testers, architects, and requirement analysts to analyze the source code concerning SW quality before deploying [1] [2]. This point is our primary motivation for present work, with an aim to find the correlation between Software Aging (SA) related bugs and static code metrics as both cost and effort to fix run-time failures or Aging-related bugs increase exponentially if the reason for these failures is not identified prior to SW deployment [3] [4]. ...
... We acknowledge that using the validation status from other sources will reflect and change the results presented in Table 11, 12, 16, and 17. Metric suites such as CK metrics [68], Li & Henry metrics [69], MOOD metrics [70] and QMOOD metrics [71] have been empirically validated [41,[72][73][74][75]. However, all the good indicators metrics pointed out in the results (Table 13) remain the same as we have included both "high" and "moderate" strength of evidence. ...
Article
Context: Several secondary studies have investigated the relationship between internal quality attributes, source code metrics and external quality attributes. Sometimes they have contradictory results. Objective: We synthesize evidence of the link between internal quality attributes, source code metrics and external quality attributes along with the efficacy of the prediction models used. Method: We conducted a tertiary review to identify, evaluate and synthesize secondary studies. We used several characteristics of secondary studies as indicators for the strength of evidence and considered them when synthesizing the results. Results: From 711 secondary studies, we identified 15 secondary studies that have investigated the link between source code and external quality. Our results show : (1) primarily, the focus has been on object-oriented systems, (2) maintainability and reliability are most often linked to internal quality attributes and source code metrics, with only one secondary study reporting evidence for security, (3) only a small set of complexity, coupling, and size-related source code metrics report a consistent positive link with maintainability and reliability, and (4) group method of data handling (GMDH) based prediction models have performed better than other prediction models for maintainability prediction. Conclusions: Based on our results, lines of code, coupling, complexity and the cohesion metrics from Chidamber & Kemerer (CK) metrics are good indicators of maintainability with consistent evidence from high and moderate-quality secondary studies. Similarly, four CK metrics related to coupling, complexity and cohesion are good indicators of reliability, while inheritance and certain cohesion metrics show no consistent evidence of links to maintainability and reliability. Further empirical studies are needed to explore the link between internal quality attributes, source code metrics and other external quality attributes, including functionality, portability, and usability. The results will help researchers and practitioners understand the body of knowledge on the subject and identify future research directions.
... Tamburri and Bersani et al. [48] designated an MVC architecture pattern that used Chidamber and Kemerer (CK) metrics [49]. The LRS finds 17 metrics for microservices, the most important of which is performance. ...
... Researchers in the literature have proposed many code metrics-based defect prediction models. The Chidamber and Kemerer (CK) metrics suite is widely used for software defect prediction models in object-oriented systems [2][3][4][5]. These fault prediction models are either based on machine learning (ML) algorithms or on the code metrics' threshold values [6][7][8][9][10][11][12][13]. ...
Article
Full-text available
Software fault prediction models are very important to prioritize software classes for effective testing and efficient use of resources so that the testing process's time, effort, and cost can be reduced. Fault prediction models can be based on either metrics' threshold values or machine learning. Code metrics' threshold-based models are easy to automate and faster than machine learning-based models, which can save significant time in the testing process. ROC, Alves ranking, and VARL are well-known threshold value calculation techniques, of which ROC is the best. This research article proposes a new threshold values calculation technique based on metaheuristics. A genetic algorithm and particle swarm optimizer are used to calculate the threshold values, and the proposed technique is tested on ten open-source object-oriented software datasets and four open-source procedural software datasets. Results show that the metaheuristic-based thresholds give better results than ROC-based thresholds.
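The article above derives metric thresholds with a genetic algorithm and PSO. As a deliberately simple stand-in for those metaheuristics, the sketch below random-searches a single metric threshold that maximises the G-measure on a small fabricated dataset; the data and the choice of objective are assumptions.

```python
# Simplified stand-in for metaheuristic threshold calculation: random search
# over a single metric threshold, scored with the G-measure (harmonic mean of
# recall and specificity). A GA or PSO would explore the same search space
# more systematically.
import random
from typing import List, Tuple


def g_measure(threshold: float, data: List[Tuple[float, int]]) -> float:
    tp = fn = tn = fp = 0
    for metric_value, faulty in data:
        predicted_faulty = metric_value > threshold
        if faulty and predicted_faulty:
            tp += 1
        elif faulty:
            fn += 1
        elif predicted_faulty:
            fp += 1
        else:
            tn += 1
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = recall + specificity
    return 2 * recall * specificity / denom if denom else 0.0


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical (WMC value, faulty?) pairs.
    data = [(3, 0), (5, 0), (8, 0), (12, 1), (15, 1), (20, 1), (7, 1), (4, 0)]
    candidates = [random.uniform(1, 25) for _ in range(200)]
    best = max(candidates, key=lambda t: g_measure(t, data))
    print(f"best threshold = {best:.1f}, G-measure = {g_measure(best, data):.2f}")
```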
... Although standard LOC measures correlate with many code metrics [9], most of the time, one has to collect multiple measures to evaluate the code from the perspective of a single quality characteristic. As a result, it is a common practice to rely on code-metric suites, e.g., REBOOT [24], QMOOD [25], CK metrics [26], or even multiple variants of metrics measuring the same aspects, e.g., code cohesion [27]. Unfortunately, collecting multiple metrics increases the costs of running a measurement program. ...
Article
Full-text available
Context: Lines of code (LOC) is a fundamental software code measure that is widely used as a proxy for software development effort or as a normalization factor in many other software-related measures (e.g., defect density). Unfortunately, the problem is that it is not clear which lines of code should be counted: all of them or some specific ones depending on the project context and task in mind? Objective: To design a generator of task-specific LOC measures and their counters mined directly from data that optimize the correlation between the LOC measures and variables they proxy for (e.g., code-review duration). Method: We use Design Science Research as our research methodology to build and validate a generator of task-specific LOC measures and their counters. The generated LOC counters have a form of binary decision trees inferred from historical data using Genetic Programming. The proposed tool was validated based on three tasks, i.e., mining LOC measures to proxy for code readability, number of assertions in unit tests, and code-review duration. Results: Task-specific LOC measures showed a “strong” to “very strong” negative correlation with code-readability score (Kendall’s τ ranging from -0.83 to -0.76) compared to “weak” to “strong” negative correlation for the best among the standard LOC measures (τ ranging from -0.36 to -0.13). For the problem of proxying for the number of assertions in unit tests, correlation coefficients were also higher for task-specific LOC measures by ca. 11% to 21% (τ ranged from 0.31 to 0.34). Finally, task-specific LOC measures showed a stronger correlation with code-review duration than the best among the standard LOC measures (τ = 0.31, 0.36, and 0.37 compared to 0.11, 0.08, 0.16, respectively). Conclusions: Our study shows that it is possible to mine task-specific LOC counters from historical datasets using Genetic Programming. Task-specific LOC measures obtained that way show stronger correlations with the variables they proxy for than the standard LOC measures.
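The study above selects LOC counters by their Kendall correlation with the variable they proxy for. A tiny sketch of that selection step, assuming SciPy is available and using invented measurements in place of the mined counters:

```python
# Sketch: pick the LOC counter whose values correlate best (Kendall's tau)
# with the quantity it should proxy for. The two counters and the proxy
# values below are fabricated for illustration.
from scipy.stats import kendalltau

# Hypothetical measurements for eight files.
loc_all_lines  = [120, 80, 200, 150, 60, 300, 90, 40]   # counts every line
loc_no_blank   = [100, 70, 160, 120, 50, 240, 75, 30]   # ignores blanks/comments
review_minutes = [35, 20, 55, 40, 15, 90, 22, 10]       # proxy target

for name, counter in [("all lines", loc_all_lines), ("no blanks", loc_no_blank)]:
    tau, p_value = kendalltau(counter, review_minutes)
    print(f"{name}: tau = {tau:.2f} (p = {p_value:.3f})")
# The counter with the stronger |tau| would be the preferred task-specific measure.
```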
... Software metrics are essential tools for assessing the quality and performance of software products and development processes. These metrics can significantly influence the accuracy of SDP models, including the widely used lines of code (LOC) and C&K metrics [8]. In early research, some studies used software metrics and traditional machine learning algorithms, such as Logistic Regression (LR) and Support Vector Machines (SVM), to predict whether a software module contains defects [9]. ...
... Two proposed parameters are calculated, the amount of generosity contained in the project, and then systematically calculated alongside the set of 9 axioms by Weyuker. Chidamber and Kemerer [6,7] address these ...
Article
Sensors are used in image acquisition, and sensor technology keeps being upgraded according to user needs and the needs of an application. Multiple sensors collect information in their respective wavelength bands, but one sensor is not sufficient to acquire the complete information of a scene. To obtain the overall data for one scene, it becomes essential to combine the images from multiple sources. This is achieved through fusion: the method of merging data from dissimilar input sources to create an image that is more informative than an image from a single input source. These are multisensor images, e.g., panchromatic and multispectral images. The former offers spatial information whereas the latter offers spectral data. Through visual inspection, the panchromatic image is clearer than the multispectral image, but it is a grey-scale image: objects are clearer but not easily recognized, whereas the multispectral image displays different colours but appears distorted. Comparing the characteristics of these two images, the resultant fused image is more explanatory than the input images. Fusion is performed using different transform methods as well as the Genetic Algorithm (GA). Comparing the results obtained by these methods, the output image produced by the GA is clearer. The quality of the resultant image is verified through parameters such as Root Mean Square Error (RMSE), peak signal-to-noise ratio, Mutual Information (MI), and Spatial Frequency (SF). In the subjective analysis, some transform techniques also give exact fused images. A hybrid approach combining a transform technique with a GA is also used for image fusion; this is again compared with the GA results using the same performance parameters. It is observed that the Hybrid Genetic Algorithm (HGA) is superior to the GA. Here only the RMSE parameter is considered in the fitness function of the GA, so only this parameter is far better than the remaining ones. If all parameters are considered in the fitness function of the GA, then all parameters obtained using an HGA will give better performance. This method is called a Hybrid Multiobjective Genetic Algorithm (HMOGA) [14].
... Kitchenham concludes that papers presenting empirical validations of metrics have the highest impact on metrics research although she has also identified several issues with this type of studies. For example, 5 out of 7 papers, which empirically validated the object oriented metrics proposed by Chidamber and Kemerer [50], included Lack of Cohesion (LCOM) in the validation. Kitchenham [49] pointed out that LCOM has been demonstrated theoretically invalid [51] and that continuous attempts to validate LCOM empirically seem therefore futile. ...
Preprint
Full-text available
BACKGROUND: Software Process Improvement (SPI) is a systematic approach to increase the efficiency and effectiveness of a software development organization and to enhance software products. OBJECTIVE: This paper aims to identify and characterize evaluation strategies and measurements used to assess the impact of different SPI initiatives. METHOD: The systematic literature review includes 148 papers published between 1991 and 2008. The selected papers were classified according to SPI initiative, applied evaluation strategies, and measurement perspectives. Potential confounding factors interfering with the evaluation of the improvement effort were assessed. RESULTS: Seven distinct evaluation strategies were identified, wherein the most common one, "Pre-Post Comparison" was applied in 49 percent of the inspected papers. Quality was the most measured attribute (62 percent), followed by Cost (41 percent), and Schedule (18 percent). Looking at measurement perspectives, "Project" represents the majority with 66 percent. CONCLUSION: The evaluation validity of SPI initiatives is challenged by the scarce consideration of potential confounding factors, particularly given that "Pre-Post Comparison" was identified as the most common evaluation strategy, and the inaccurate descriptions of the evaluation context. Measurements to assess the short and mid-term impact of SPI initiatives prevail, whereas long-term measurements in terms of customer satisfaction and return on investment tend to be less used.
... Two proposed parameters are calculated, the amount of generosity contained in the project, and then systematically calculated alongside the set of 9 axioms by Weyuker. Chidamber and Kemerer [6,7] address these ...
Article
Full-text available
Various studies use numerous probabilistic methods to establish a cause-effect relationship between a drug and a disease. However, only a limited number of machine learning studies on establishing cause-effect relationships can be found on the internet. In this study, we explore machine learning approaches for interpreting large quantities of multivariate patient-based laboratory data for establishing cause-effect relationships for critically ill patients. We adopt principal component analysis as a primary method to capture daily patient changes after a medical intervention so that the causal relationship between the medical treatments and the outcomes can be established. Model validity and stability are evaluated using bootstrap testing. The model exhibits an acceptable significance level with a two-tailed test. Moreover, results show that the approach provides promising results in interpreting large quantities of patient data and establishing cause-effect relationships for making informed decisions for critically ill patients. If fused with other machine learning and probabilistic models, the proposed approach can provide the healthcare industry with an added tool for daily routine clinical practices. Furthermore, the approach will be able to support clinical decision-making and enable effective patient-tailored care for better health outcomes.
... Two proposed parameters are calculated, the amount of generosity contained in the project, and then systematically calculated alongside the set of 9 axioms by Weyuker. Chidamber and Kemerer [6,7] address these ...
Article
The existence of a large number of zombie enterprises will affect the economic development and hinder the transformation and upgrading of economic industries. To improve the accuracy of zombie enterprise identification, this paper takes multidimensional enterprise data as the original data set, divides it into training set and validation set, and gives the corresponding data pre-processing methods. Combined with 14 standardized features, an integrated learning model for zombie enterprise classification and recognition is constructed and studied based on three pattern recognition algorithms. By using the idea of integration and the cross-validation method to determine the optimal parameters, the Gradient Boosting Decision Tree (GBDT), linear kernel Support Vector Machine (SVM) and Deep Neural Network (DNN) algorithms with classification accuracies of 95%, 96% and 96%, respectively, are used as sub-models, and a more comprehensive strong supervision model with a classification accuracy of 98% is obtained by the stacking method in combination with the advantages of multiple sub-models to analyze the fundamental information of 30885 enterprises. The study improves the accuracy of zombie enterprise identification to 98%, builds enterprise portraits based on this, and finally visualizes the classification results through the platform, which provides an auxiliary means for zombie enterprise classification and identification.
... The independent variables used in this study are various object oriented metrics. Among a number of metric suites proposed in literature, we have used a famous metric suite proposed by Chidamber and Kemerer [31]. It consists of 6 metrics (WMC, NOC, DIT, LCOM, CBO, RFC) which measure different concepts of object oriented paradigm such as coupling, cohesion, inheritance. ...
Article
Full-text available
Change prediction is essential for producing good-quality software, as it saves substantial resources in terms of money, manpower, and time. Predicting change-prone classes during early phases can be done by constructing models using machine learning techniques. Every technique requires an approximately equal distribution of classes (balanced data) for efficient prediction. In this study, we have used a sampling approach to balance the data. We observed an improvement in accuracy after the models were trained on the balanced data. To further improve the accuracy of the models, the default parameters of the sampling approach were adjusted/tuned. The results show an improvement in accuracy after sampling and parameter tuning.
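The study above balances the training data with a sampling approach before building change-prediction models. A minimal sketch of such a pipeline, assuming scikit-learn and imbalanced-learn and substituting synthetic data for the real OO-metric features:

```python
# Minimal sketch: oversample the minority class with SMOTE, then train and
# evaluate a classifier. Synthetic data stands in for OO-metric features;
# adjusting SMOTE's parameters (e.g. k_neighbors) mirrors the "parameter
# tuning" step described in the abstract.
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=6, weights=[0.9, 0.1],
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=42)

X_balanced, y_balanced = SMOTE(k_neighbors=5, random_state=42).fit_resample(
    X_train, y_train)

model = RandomForestClassifier(random_state=42).fit(X_balanced, y_balanced)
print("balanced accuracy:",
      round(balanced_accuracy_score(y_test, model.predict(X_test)), 3))
```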
... Most existing software defect metrics fall into two main categories: software code metrics and software process metrics [11]. Software code metrics (such as LOC [12], Halstead [13], and McCabe [14]) represent the code program' complexity, and software process metrics (such as CK [15], Martin [16], and MOOD [17]) represent the development process' complexity. ...
Article
Full-text available
Software defect prediction is critical to ensuring software quality. Researchers have worked on building various defect prediction models to improve the performance of defect prediction. Existing defect prediction models are mainly divided into two categories: models constructed based on artificial statistical features and models constructed based on semantic features. DP-CNN [Li J, He P, Zhu J, et al. Software defect prediction via convolutional neural network. In: 2017 IEEE international conference on software quality, reliability and security (QRS). IEEE, 2017; 318–328.] is one of the best defect prediction models; because it combines both artificial statistical features and semantic features, its performance is greatly improved compared to traditional defect prediction models. This paper is based on the DP-CNN model and makes the following two improvements. First, it uses the Struc2vec network representation technique, which specializes in learning node representations from structural identity, to mine information between software modules and further extract structural features associated with defects, letting the DP-CNN model incorporate the newly mined structural features. Then, this paper proposes a feature selection method based on counterfactual explanations, which can determine the importance score of each feature by the feature change rate of counterfactual samples. The origin of these feature importance scores is interpretable. Under the guidance of these interpretable feature importance scores, better feature subsets can be obtained and used to optimize artificial statistical features within the DP-CNN model. Based on the above methods, this paper proposes a new hybrid defect prediction model, DPS-CNN-STR. We evaluate our model on six open-source projects in terms of F1 score for defect prediction. Experimental results show that DPS-CNN-STR improves on the state-of-the-art method by an average of 3.3%.
... Two proposed parameters are calculated, the amount of generosity contained in the project, and then systematically calculated alongside the set of 9 axioms by Weyuker. Chidamber and Kemerer [6,7] address these ...
Article
In this research, Particle Swarm Optimization (PSO) based image equalization is proposed to enhance the contrast of different breast cancer images. Breast cancer is among the most common and important causes of tumor disease in females worldwide. Mass and microcalcification clusters are significant early signs of breast cancer. The mortality rate can effectively be decreased by early diagnosis and treatment. The most practical approach for the early detection and identification of breast cancer is mammography. Mammographic images contaminated by noise usually require image enhancement techniques to aid interpretation. Contrast enhancement is divided into two categories: direct contrast enhancement and indirect contrast enhancement. Indirect contrast enhancement works by modifying the image histogram. Histogram Equalization (HE) is the simplest indirect contrast enhancement approach and is usually used for contrast enhancement. The proposed method's average entropy is 5.3251 with the highest structural similarity index of 0.99725. The best contrast improvement of this method is 1.0404 and the PSNR is 46.3803. The MSE value is 2157.08. This paper recommends an innovative method of enhancing digital mammogram image contrast based on different histogram equalization approaches. The performance of the proposed method has been compared with other prevailing techniques using the parameters discrete entropy, contrast improvement index, structural similarity index measure, mean square error, and peak signal-to-noise ratio. Experimental findings indicate that the proposed strategy is efficient and robust and shows better results than others.
... These metrics can be extracted by using tools such as IntelliJ Idea [34], Rational Software Analyser (RSA) [28], and Aniche [62]. It is notable, three studies used the well-known Chidamber and Kemerer (CK) indices [63]. This feature type has been used for estimating the test efforts, predicting the defects in codes, and predicting the branch coverage. ...
Preprint
Full-text available
This research conducted a systematic review of the literature on machine learning (ML)-based methods in the context of Continuous Integration (CI) over the past 22 years. The study aimed to identify and describe the techniques used in ML-based solutions for CI and analyzed various aspects such as data engineering, feature engineering, hyper-parameter tuning, ML models, evaluation methods, and metrics. In this paper, we have depicted the phases of CI testing, the connection between them, and the employed techniques in training the ML method phases. We presented nine types of data sources and four taken steps in the selected studies for preparing the data. Also, we identified four feature types and nine subsets of data features through thematic analysis of the selected studies. Besides, five methods for selecting and tuning the hyper-parameters are shown. In addition, we summarised the evaluation methods used in the literature and identified fifteen different metrics. The most commonly used evaluation methods were found to be precision, recall, and F1-score, and we have also identified five methods for evaluating the performance of trained ML models. Finally, we have presented the relationship between ML model types, performance measurements, and CI phases. The study provides valuable insights for researchers and practitioners interested in ML-based methods in CI and emphasizes the need for further research in this area.
... Two proposed parameters are calculated, the amount of generosity contained in the project, and then systematically calculated alongside the set of 9 axioms by Weyuker. Chidamber and Kemerer [6,7] address these ...
Article
Facial expression is an intuitive reflection of a person's emotional state and one of the most important forms of interpersonal communication. Due to the complexity and variability of human facial expressions, traditional methods based on handcrafted feature extraction have shown insufficient performance. For this purpose, we propose a new facial expression recognition system based on the MobileNet model with the addition of skip connections to prevent performance degradation in deeper architectures. Moreover, a multi-head attention mechanism is applied to concentrate processing on the most relevant parts of the image. The experiments were conducted on the FER2013 database, which is imbalanced and includes ambiguities in some images containing synthetic faces. We applied a pre-processing step of face detection to eliminate wrong images, and we implemented both the SMOTE and Near-Miss algorithms to obtain a balanced dataset and prevent the model from being biased. The experimental results showed the effectiveness of the proposed framework, which achieved a recognition rate of 96.02% when applying the multi-head attention mechanism.
Article
Background: The novelty of the work lies in the formulation of these frequency-based generators, which reflects the lowest level of information loss in the intermediate calculations. The core idea behind the approach presented in this work is that a module with complex logic involved may have more probability of bugs. Software defect prediction is the area of research that enables the development and operations team to have the probability of bug proneness of the software. Many researchers have deployed multiple variations of machine learning and deep learning algorithms to achieve better accuracy and more insights into predictions. Objective: To prevent this fractional data loss from different derived metrics generations, a few optimal transformational engines capable of carrying forward formulations based on lossless computations have been deployed. Methods: A model Sodprhym has been developed to model refined metrics. Then, using some classical machine learning algorithms, accuracy measures have been observed and compared with the recently published results, which used the same datasets and prediction techniques. Results: The teams could establish watchdogs thanks to the automated detection, but it also gave them time to reflect on any potentially troublesome modules. For quality assurance teams, it has therefore become a crucial step. Software defect prediction looks forward to evaluating error-prone modules likely to contain bugs. Conclusion: Prior information can definitely align the teams with deploying more and more quality assurance checks on predicted modules. Software metrics are the most important component for defect prediction if we consider the different underlying aspects that define the defective module. Later we deployed our refined approach in which we targeted the metrics to be considered.
Article
Full-text available
Assigning an access specifier is not an easy task, as it decides the overall security of any software. Though there are many metrics tools available on the market to measure security at an early stage, the assignment of access specifiers is totally based on human judgment and understanding. The objective of the proposed tool is to generate all possible solutions by applying a Genetic Algorithm (GA). Our Secure Coupling Measurement Tool (SCMT) uses coupling, a feature of OO design, to determine security at the design level. It takes as input a UML class diagram with basic constraints and generates alternate solutions, i.e., combinations. The tool also provides metrics at the code level to compute security at the code level. The results of both metrics give proof of a secure design with the help of a spider chart as well as scope to change the design.
Article
The cost of software testing could be reduced if faulty entities were identified prior to the testing phase, which is possible with software fault prediction (SFP). In most SFP models, machine learning (ML) methods are used, and one aspect of improving prediction accuracy with these methods is tuning their control parameters. However, parameter tuning has not been addressed properly in the field of software analytics, and the conventional methods (such as basic Differential Evolution, Random Search, and Grid Search) are either not up-to-date, or suffer from shortcomings, such as the inability to benefit from prior experience, or are overly expensive. This study aims to examine and propose parameter tuners, called DEPTs, based on different variants of Differential Evolution for SFP with the Swift-Finalize strategy (to reduce runtime), which in addition to being up-to-date, have overcome many of the challenges associated with common methods. An experimental framework was developed to compare DEPTs with three widely used parameter tuners, applied to four common data miners, on 10 open-source projects, and to evaluate the performance of DEPTs, we used eight performance measures. According to our results, the three tuners out of five DEPTs improved prediction accuracy in more than 70% of tuned cases, and occasionally, they exceeded benchmark methods by over 10% in case of G-measure. The DEPTs took reasonable amounts of time to tune parameters for SFP as well.
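The article above tunes data-miner parameters with Differential Evolution variants. The sketch below shows the general shape of that idea using SciPy's differential_evolution to tune two SVM parameters against cross-validated balanced accuracy; the bounds, objective, and synthetic data are assumptions rather than the paper's DEPT configuration.

```python
# Sketch: tune an SVM's C and gamma with Differential Evolution, maximising
# cross-validated balanced accuracy (SciPy's optimiser minimises, so the
# objective returns the negated score). Synthetic data replaces real
# fault-prediction datasets; bounds and settings are illustrative guesses.
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=8, weights=[0.8, 0.2],
                           random_state=1)


def objective(params: np.ndarray) -> float:
    c, gamma = params
    model = SVC(C=c, gamma=gamma)
    score = cross_val_score(model, X, y, cv=3,
                            scoring="balanced_accuracy").mean()
    return -score


result = differential_evolution(objective, bounds=[(0.1, 100.0), (1e-4, 1.0)],
                                maxiter=10, seed=1, tol=1e-3)
print("best C, gamma:", np.round(result.x, 3),
      "balanced accuracy:", round(-result.fun, 3))
```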
Article
Measuring the processes involved in knowledge engineering for designing and building an intelligent system has taken on a significant role. Of the four basic processes involved in knowledge engineering, this paper deals with the knowledge acquisition process and the metrics necessary for measuring the process itself. Three metrics are proposed for the knowledge acquisition process based on the entailment procedures, their length and complexity, and the cohesion and coupling attributes of the collection of knowledge units. These three metrics are formalized based on Briand's mathematical properties for validating software metrics. These metrics are indicative in the way they give insight into the design and development of a knowledge base. In addition to these metrics, newer metrics can also be proposed for the knowledge representation and knowledge sharing processes.
Conference Paper
Design patterns are reusable solutions to common design problems. They speed up the development process and help to document software systems. This paper presents a framework to assess the impact of design patterns on software metrics. Our proposed framework applies refactoring techniques to generate non-pattern version of a subject system. Then, software metrics have been calculated and compared for both versions; the pattern version and non-pattern version. The proposed framework has been applied to two subject systems and one design pattern. The evaluation results show that there is no consistent behavior of the software metrics between the non-pattern version and the pattern version.
Article
Full-text available
Open source software (OSS) has been developing for more than two decades. It originated as a movement with the introduction of the first free/libre OSS operating system, became a popular trend among the developer community, led to enterprise solutions widely embraced by the global market, and began garnering attention from significant players in the software industry (such as IBM's acquisition of RedHat). Throughout the years, numerous software assessment models have been suggested, some of which were created specifically for OSS projects. Most of these assessment models focus on software quality and maintainability. Some models are taking under consideration health aspects of OSS projects. Despite the multitude of these models, there is yet to be a universally accepted model for assessing OSS projects. In this work, we aim to adapt the City Resilience Framework (CRF) for use in OSS projects to establish a strong theoretical foundation for OSS evaluation focusing on the project's resilience as it evolves over time. We would like to highlight that our goal with the proposed assessment model is not to compare two OSS solutions with each other, in terms of resilience, or even do a resilience ranking between the available OSS tools. We are aiming to investigate resilience of an OSS project as it evolves and identify possible opportunities of improvements in the four dimensions we are defining. These dimensions are as follows: source code, business and legal, integration and reuse, and social (community). The CRF is a framework that was introduced to measure urban resilience and most specifically how cities' resilience is changing as they evolve. We believe that a software evaluation model that focuses on resilience can complement the pre-existing models based on software quality and software health. Although concepts that are related to resilience, like sustainability or viability, already appear in literature, to our best knowledge, there is no OSS assessment model that evaluates the resilience of an OSS project. We argue that cities and OSS projects are both dynamically evolving systems with similar characteristics. The proposed framework utilizes both quantitative and qualitative indicators, which is viewed as an advantage. Lastly, we would like to emphasize that the framework has been tested on the enterprise software domain as part of this study, evaluating five major versions of six OSS projects, Laravel, Composer, PHPMyAdmin, OKApi, PatternalPHP, and PHPExcel, the first three of which are intuitively considered resilient and the three latter nonresilient, to provide a preliminary validation of the models' ability to distinguish between resilient and not resilient projects.
Chapter
Software reliability is one of the most important software quality attributes. It is generally predicted using different software metrics that measure internal quality attributes like cohesion and complexity. Therefore, a continuous focus on software metrics proposed to predict software reliability is still required. In this context, an entropy-based suite of four metrics is proposed to monitor this attribute. The different metrics composing this suite are manually computed and only theoretically validated. Hence, we aim to propose an empirical approach to validate them as useful indicators of software reliability. Therefore, we start by assessing these metrics using a set of programs retrieved from real software projects. The obtained dataset is used to empirically validate them as reliability indicators. Given that software reliability, as an external attribute, cannot be directly evaluated, we use two main experiments to perform the empirical validation of these metrics. In the first experiment, we study the relationship between the redundancy metrics and measurable attributes of reliability like fault-proneness. In the second one, we study whether combining the redundancy metrics with existing complexity and size metrics that have been validated as significant reliability indicators can improve the performance of the developed fault-proneness prediction model. The validation is carried out using appropriate machine learning techniques. The experimental outcomes showed that redundancy metrics provide promising results as indicators of software reliability. Keywords: Software reliability, Software redundancy metrics, Software metrics validation, Fault-proneness, Complexity metrics
Preprint
Full-text available
Software quality is the capability of a software process to produce software product satisfying the end user. The quality of process or product entities is described through a set of attributes that may be internal or external. For the product entity, especially, the source code, different internal attributes are defined to evaluate its quality like complexity and cohesion. Concerning external attributes related to the product environment like reliability, their assessment is more difficult. Thus, they are usually predicted by the development of prediction models based on software metrics as independent variables and other measurable attributes as dependent variables. For instance, reliability like other external attributes is generally measured and predicted based on other quality attributes like defect density, defect count and fault-proneness. The success of machine learning (ML) and deep learning (DL) approaches for software defect and faulty modules classification as crucial attributes for software reliability improvement is remarkable. In recent years, there has been growing interest in exploring the use of deep learning autoencoders, a type of neural network architecture, for software defect prediction. Therefore, we aim in this paper to explore the semi-supervised denoising DL autoencoder in order to capture relevant features. Then, we evaluate its performance in comparison to traditional ML supervised SVM technique for fault-prone modules classification. The performed experiments based on a set of software metrics extracted from NASA projects achieve promising results in terms of accuracy and show that denoising DL autoencoder outperforms traditional SVM technique.
Article
Full-text available
An abstract is not available.
Article
Full-text available
It is argued that inappropriate use of software complexity measures can have large, damaging effects by rewarding poor programming practices and demoralizing good programmers. Software complexity measures must be critically evaluated to determine the ways in which they can best be used. A definition of complexity is followed by a discussion of metric development and the need for a theory of programming. The properties of measures are examined in some detail, and the testing of measures is discussed.
Article
Full-text available
A set of properties of syntactic software complexity measures is proposed to serve as a basis for the evaluation of such measures. Four known complexity measures are evaluated and compared using these criteria. This formalized evaluation clarifies the strengths and weaknesses of the examined complexity measures, which include the statement count, cyclomatic number, effort measure, and data flow complexity measures. None of these measures possesses all nine properties, and several are found to fail to possess particularly fundamental properties; this failure calls into question their usefulness in measuring syntactic complexity.
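The nine properties referred to above are usually attributed to Weyuker. Two of them are often paraphrased in the literature roughly as follows (a paraphrase, not a quotation from this paper), where $|P|$ denotes the complexity assigned to program $P$ and $P;Q$ denotes composition:

```latex
% Non-coarseness: the measure must distinguish at least some programs,
\[
  (\exists P)(\exists Q)\; |P| \neq |Q| ,
\]
% and monotonicity with respect to composition:
\[
  (\forall P)(\forall Q)\; |P| \le |P;Q| \quad\text{and}\quad |Q| \le |P;Q| .
\]
```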
Article
In this Introduction we shall sketch the business of ontology, or metaphysics, and shall locate it on the map of learning. This has to be done because there are many ways of construing the word 'ontology' and because of the bad reputation metaphysics has suffered until recently - a well deserved one in most cases. 1. ONTOLOGICAL PROBLEMS Ontological (or metaphysical) views are answers to ontological questions. And ontological (or metaphysical) questions are questions with an extremely wide scope, such as 'Is the world material or ideal - or perhaps neutral?', 'Is there radical novelty, and if so how does it come about?', 'Is there objective chance or just an appearance of such due to human ignorance?', 'How is the mental related to the physical?', 'Is a community anything but the set of its members?', and 'Are there laws of history?'. Just as religion was born from helplessness, ideology from conflict, and technology from the need to master the environment, so metaphysics - just like theoretical science - was probably begotten by the awe and bewilderment at the boundless variety and apparent chaos of the phenomenal world, i.e. the sum total of human experience. Like the scientist, the metaphysician looked and looks for unity in diversity, for pattern in disorder, for structure in the amorphous heap of phenomena - and in some cases even for some sense, direction or finality in reality as a whole.
Article
Over the last decade many software metrics have been introduced by researchers and many software tools have been developed using software metrics to measure the "quality" of programs. These metrics for measuring productivity, reliability, maintainability, and complexity, for example, are vital to software development planning and management. In this paper a new approach is presented to describe the properties of the software metrics and their scales using measurement theory. Methods are shown to describe a software complexity metric as an ordinal, an interval or a ratio scale. The use of this concept is shown by application to the Metric of McCabe. These results are very important for selecting appropriate software metrics for software measurement and for developing tools which use software metrics to evaluate the "quality" of software.
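The scale types mentioned above are characterised in measurement theory by their admissible transformations; a standard summary of that background (general measurement theory, not a result specific to this paper) is:

```latex
% Admissible transformations g of a measure, by scale type:
\[
  \text{ordinal: } g \text{ any strictly increasing function}, \qquad
  \text{interval: } g(x) = a x + b \;\; (a > 0), \qquad
  \text{ratio: } g(x) = a x \;\; (a > 0).
\]
```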
Article
An abstract is not available.
Article
The object-oriented design strategy, as both a problem decomposition and a system-implementation methodology, holds significant potential benefits for software design and implementation in scientific computing environments. Additionally, certain characteristics of object-oriented systems are particularly suited to supporting graphics-application development. This paper presents the initial direction and organization of research and development activities in the area of object-oriented graphical information systems at the University of Southwestern Louisiana. These activities, structured as a four-phase research plan, are intended to provide significant and extensible results in the evaluation of the impact of object-oriented methodologies on graphical-information system design and implementation. The long-term goals of this research include the formulation of complexity models for object-oriented development of software and the formulation and validation of software-development metrics appropriate to object-oriented systems.
Conference Paper
The Smalltalk-80 programming language includes dynamic storage allocation, full upward funargs, and universally polymorphic procedures; the Smalltalk-80 programming system features interactive execution with incremental compilation, and implementation portability. These features of modern programming systems are among the most difficult to implement efficiently, even individually. A new implementation of the Smalltalk-80 system, hosted on a small microprocessor-based computer, achieves high performance while retaining complete (object code) compatibility with existing implementations. This paper discusses the most significant optimization techniques developed over the course of the project, many of which are applicable to other languages. The key idea is to represent certain runtime state (both code and data) in more than one form, and to convert between forms when needed.
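The "more than one form, converted when needed" idea can be sketched loosely in Python; this is an analogy only, not the Smalltalk-80 implementation, and the class and method names are invented for illustration. A method is kept both as compact source text and, once it has actually been executed, as a compiled code object.

```python
# A loose analogy, not the Smalltalk-80 implementation: keep each method in two
# forms (source text and compiled code) and convert lazily on first execution.
class LazyMethod:
    def __init__(self, name: str, source: str):
        self.name = name
        self.source = source          # compact, portable form
        self._code = None             # fast, executable form (built on demand)

    def __call__(self, **env):
        if self._code is None:        # convert between representations when needed
            self._code = compile(self.source, f"<method {self.name}>", "eval")
        return eval(self._code, {}, env)

area = LazyMethod("area", "width * height")
print(area(width=3, height=4))        # triggers compilation, prints 12
print(area(width=5, height=6))        # reuses the compiled form, prints 30
```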
Conference Paper
"Object-oriented" is a very hot topic and buzzword both in academia and industry. There are object-oriented analysis and design techniques, object-oriented languages and databases, and so on. Many people see the letters "OO", attach a "G" to the front and a "D" to the back, and deem it to be "GOOD", without much consideration for what it means in the software life cycle. This paper discusses the on-going (3+ years) object-oriented re-design and re-implementation in C++ of a commercial CASE tool: specifically, why an object-oriented approach was chosen, and the implications and collective experiences of this approach. In addition to the anticipated benefits, much of what we experienced was unforeseen and unexpected.
Article
Presented in this paper is the data model for ORION, a prototype database system that adds persistence and sharability to objects created and manipulated in object-oriented applications. The ORION data model consolidates and modifies a number of major concepts found in many object-oriented systems, such as objects, classes, class lattice, methods, and inheritance. These concepts are reviewed and three major enhancements to the conventional object-oriented data model, namely, schema evolution, composite objects, and versions, are elaborated upon. Schema evolution is the ability to dynamically make changes to the class definitions and the structure of the class lattice. Composite objects are recursive collections of exclusive components that are treated as units of storage, retrieval, and integrity enforcement. Versions are variations of the same object that are related by the history of their derivation. These enhancements are strongly motivated by the data management requirements of the ORION applications from the domains of artificial intelligence, computer-aided design and manufacturing, and office information systems with multimedia documents.
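Two of the enhancements described above, composite objects and versions, can be illustrated with a tiny sketch. The class names and fields below are invented for illustration and are not ORION's schema; the point is only that exclusive components are handled as one unit and that versions carry their derivation history.

```python
# Illustrative sketch of two ORION-style notions; names are not ORION's schema.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Component:
    name: str

@dataclass
class CompositeObject:
    """Exclusive components: the composite is the unit of storage and integrity."""
    name: str
    components: List[Component] = field(default_factory=list)

    def delete(self):
        self.components.clear()   # components go away with the composite as a whole

@dataclass
class Version:
    """Variations of one object, related by the history of their derivation."""
    payload: CompositeObject
    parent: Optional["Version"] = None

    def derive(self, payload: CompositeObject) -> "Version":
        return Version(payload=payload, parent=self)

    def history(self):
        v, chain = self, []
        while v is not None:
            chain.append(v.payload.name)
            v = v.parent
        return list(reversed(chain))

v1 = Version(CompositeObject("design-v1", [Component("wheel"), Component("axle")]))
v2 = v1.derive(CompositeObject("design-v2",
                               [Component("wheel"), Component("axle"), Component("brake")]))
print(v2.history())   # ['design-v1', 'design-v2']
```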
Article
One of the primary effects of software abstraction has been to further the notion of computer programs as objects rather than moving programming closer to the problem being solved. Knowledge abstraction, however, allows software to take a significant step toward the problem domain.
Article
Many software engineering methods place internal structural constraints on the documents (including specifications, designs, and code) that are produced. Examples of such structural constraints are low coupling, high cohesion, reuse in designs and code, and control structuredness and data-abstraction in code. The use of these methods is supposed to increase the likelihood that the resulting software will have desirable external attributes, like reliability and maintainability. For this reason, we believe that the software engineering community needs to know how to measure internal attributes and needs to understand the relationships between internal and external software attributes. This can only be done if we have rigorous measures of the supposedly key internal attributes. We believe that measurement theory provides an appropriate basis for defining such measures. By way of example, we show how it is used to define a measure of coupling.
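The abstract does not reproduce the coupling measure itself, so the sketch below uses a deliberately simple stand-in (the number of distinct external modules a Python source file imports) just to show what a rigorous, operational measure of an internal attribute looks like in practice.

```python
# Not the paper's measure: a deliberately simple stand-in that counts how many
# distinct external modules a Python source file depends on via imports.
import ast

def import_coupling(source: str) -> int:
    tree = ast.parse(source)
    deps = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            deps.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            deps.add(node.module.split(".")[0])
    return len(deps)

example = """
import os
import json
from collections import defaultdict
from os import path
"""
print(import_coupling(example))   # 3 distinct modules: os, json, collections
```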
Article
In spite of the widespread acceptance by academics and practitioners of structured programming precepts, relatively few formal empirical studies have been conducted to obtain evidence that either supports or refutes the theory. This paper reviews the empirical studies that have been undertaken and critiques them from the viewpoints of the soundness of their methodology and their ability to contribute to scientific understanding. In general, the evidence supporting programming precepts is weak. A framework for an ongoing research program is outlined.
Article
Object-oriented development is a partial-lifecycle software development method in which the decomposition of a system is based upon the concept of an object. This method is fundamentally different from traditional functional approaches to design and serves to help manage the complexity of massive software-intensive systems. The author examines the process of object-oriented development as well as the influences upon this approach from advances in abstraction mechanisms, programming languages, and hardware. The concept of an object is central to object-oriented development and so the properties of an object are discussed. The mapping of object-oriented techniques to Ada using a design case study is considered.
Article
M.S. thesis by Kenneth L. Morris, Sloan School of Management, Massachusetts Institute of Technology, 1989. Includes bibliographical references (leaves 128-135).
Article
This paper describes a graph-theoretic complexity measure and illustrates how it can be used to manage and control program complexity. The paper first explains how the graph-theory concepts apply and gives an intuitive explanation of the graph concepts in programming terms. The control graphs of several actual Fortran programs are then presented to illustrate the correlation between intuitive complexity and the graph-theoretic complexity. Several properties of the graph-theoretic complexity are then proved which show, for example, that complexity is independent of physical size (adding or subtracting functional statements leaves complexity unchanged) and complexity depends only on the decision structure of a program.
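The measure described above is the classic cyclomatic complexity, V(G) = E - N + 2P for a control-flow graph with E edges, N nodes and P connected components. The toy graph below is made up for illustration; the second example echoes the abstract's point that inserting purely sequential statements leaves the value unchanged.

```python
# Cyclomatic complexity V(G) = E - N + 2P for a control-flow graph given as an
# adjacency list; P is the number of connected components (1 for one routine).
def cyclomatic_complexity(cfg: dict, components: int = 1) -> int:
    nodes = len(cfg)
    edges = sum(len(succs) for succs in cfg.values())
    return edges - nodes + 2 * components

# Toy graph for: entry -> if -> (then | else) -> join -> exit
cfg = {
    "entry": ["if"],
    "if": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["exit"],
    "exit": [],
}
print(cyclomatic_complexity(cfg))          # 6 edges - 6 nodes + 2 = 2

# Adding a purely sequential node between "join" and "exit" adds one node and
# one edge, so V(G) stays at 2 -- complexity tracks decision structure only.
cfg_longer = dict(cfg, join=["log"], log=["exit"])
print(cyclomatic_complexity(cfg_longer))   # 7 - 7 + 2 = 2
```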
Article
An ontological model of an information system that provides precise definitions of fundamental concepts like system, subsystem, and coupling is proposed. This model is used to analyze some static and dynamic properties of an information system and to examine the question of what constitutes a good decomposition of an information system. Some of the major types of information system formalisms that bear on the authors' goals, and their respective strengths and weaknesses relative to the model, are briefly reviewed. Also articulated are some of the fundamental notions that underlie the model. Those basic notions are then used to examine the nature and some dynamics of system decomposition. The model's predictive power is discussed.
Computer Language, 7, November. Miller Freeman Publications, San Francisco, CA.
  • A Hecht
  • D Taylor
On the Problem of Information System Evaluation
  • V Cherniavsky
Metrics for Object Oriented Software Development. Unpublished Master's thesis, M.I.T.
  • K Morris
An Object Oriented Modelling Environment
  • T Page
Constructing Abstractions for Object Oriented Applications
  • W Cunningham
  • K Beck
Using CASE for Object Oriented Design with C++
  • A Hecht
  • D Taylor
Fall International Function Point Users Group Conference
  • S L Pfleeger
  • J D Palmer