Conference Paper

The Impact of Complexity on Software Design Quality and Costs: An Exploratory Empirical Analysis of Open Source Applications.


Abstract

It is well known that complexity affects software development and maintenance costs. In the Open Source context, the sharing of development and maintenance effort among developers is a fundamental tenet, which can be thought of as a driver to reduce the impact of complexity on maintenance costs. However, complexity is a structural property of code that is not quantitatively accounted for in traditional cost models. This paper introduces the concept of functional complexity, which weights McCabe's well-established cyclomatic complexity metric by the number of interactive functional elements that an application provides to users. This metric is used to analyze how Open Source development costs are affected by complexity. Traditional cost models, such as COCOMO, do not account for the impact of complexity on estimated costs by means of accurate indicators. In contrast, results show that higher complexity is associated with lower design quality of code and, hence, higher maintenance costs. Consequently, results suggest that reliable effort estimation should be based on a precise evaluation of software complexity. Analyses are based on quality, complexity, and maintenance effort data collected for 59 Open Source applications (corresponding to 906 versions) selected from the SourceForge.net repository.
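
The paper does not publish a closed-form definition of functional complexity, so the following is only a minimal sketch of the idea conveyed by the abstract: McCabe's cyclomatic complexity aggregated over an application's routines and weighted by the number of interactive functional elements offered to users. The function names and the weighting choice (a simple ratio) are assumptions made purely for illustration.

```python
# Hypothetical sketch: the paper gives no exact formula, so this assumes
# "functional complexity" normalizes an application's total McCabe cyclomatic
# complexity by the number of interactive functional elements (menus, dialogs,
# commands) it exposes to users.

def cyclomatic_complexity(edges: int, nodes: int, components: int = 1) -> int:
    """McCabe's metric for one routine's control-flow graph: V(G) = E - N + 2P."""
    return edges - nodes + 2 * components

def functional_complexity(routine_cfgs, interactive_elements: int) -> float:
    """Total cyclomatic complexity weighted by user-visible functional size.

    routine_cfgs: iterable of (edges, nodes) tuples, one per routine.
    interactive_elements: count of user-facing functional elements, assumed
    here to act as a proxy for the functionality the application delivers.
    """
    total = sum(cyclomatic_complexity(e, n) for e, n in routine_cfgs)
    return total / max(interactive_elements, 1)

# Example: an application with three routines and 12 interactive elements.
if __name__ == "__main__":
    cfgs = [(9, 7), (14, 10), (6, 5)]
    print(functional_complexity(cfgs, interactive_elements=12))
```
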


... Complexity has been recognized as being an essential property and intrinsic characteristic of software products [3,21,22], while product complexity has been viewed as the main source of the complexity of corresponding software projects [23,24] and as a significant determinant of software development effort [4,24,25]. Moreover, software effort prediction methods and models typically assume positive correlation between product complexity and development effort [7,18]. ...
... In the software economics field, complexity is also viewed as an inherent property of the functional requirements of a software product, which cannot be reduced or simplified beyond a certain threshold [3,21,22]. Similarly, product complexity has been viewed as the main source of the complexity of the corresponding software projects [23,24], and also been claimed to be a significant and nonnegligible factor that influences the effort of software development and maintenance [4,24,25]. As such, a positive correlation between software complexity and development effort exists in many estimation models: "a more complex piece of software will generally require greater effort in development than a less complex counterpart" [18]. ...
Article
Full-text available
[Background:] Software effort prediction methods and models typically assume positive correlation between software product complexity and development effort. However, conflicting observations, i.e. negative correlation between product complexity and actual effort, have been witnessed from our experience with the COCOMO81 dataset. [Aim:] Given our doubt about whether the observed phenomenon is a coincidence, this study tries to investigate if an increase in product complexity can result in the abovementioned counter-intuitive trend in software development projects. [Method:] A modified association rule mining approach is applied to the transformed COCOMO81 dataset. To reduce noise of analysis, this approach uses a constant antecedent (Complexity increases while Effort decreases) to mine potential consequents with pruning. [Results:] The experiment has respectively mined four, five, and seven association rules from the general, embedded, and organic projects data. The consequents of the mined rules suggested two main aspects, namely human capability and product scale, to be particularly concerned in this study. [Conclusions:] The negative correlation between complexity and effort is not a coincidence under particular conditions. In a software project, interactions between product complexity and other factors, such as Programmer Capability and Analyst Capability, can inevitably play a "friction" role in weakening the practical influences of product complexity on actual development effort.
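
The cited study's exact mining procedure is not reproduced here; the sketch below only illustrates the general technique it describes, assuming project records have already been discretised into boolean "attribute increases/decreases" items and using the mlxtend library to mine rules before filtering for the fixed antecedent. The column names, toy data, and thresholds are invented.

```python
# Illustrative sketch of association rule mining with a constant antecedent:
# mine rules from one-hot project data, then keep only those whose antecedent
# is the fixed pair {"COMPLEXITY_up", "EFFORT_down"}.
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Toy one-hot dataset: each row is a project, each column a discretised item.
data = pd.DataFrame(
    [
        {"COMPLEXITY_up": 1, "EFFORT_down": 1, "PCAP_high": 1, "KSLOC_small": 1},
        {"COMPLEXITY_up": 1, "EFFORT_down": 1, "PCAP_high": 1, "KSLOC_small": 0},
        {"COMPLEXITY_up": 1, "EFFORT_down": 0, "PCAP_high": 0, "KSLOC_small": 0},
        {"COMPLEXITY_up": 0, "EFFORT_down": 0, "PCAP_high": 0, "KSLOC_small": 1},
    ],
    dtype=bool,
)

itemsets = apriori(data, min_support=0.25, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.6)

# Constant antecedent: rules explaining when complexity rises but effort falls.
antecedent = frozenset({"COMPLEXITY_up", "EFFORT_down"})
mask = rules["antecedents"].apply(lambda items: items == antecedent)
print(rules[mask][["consequents", "support", "confidence"]])
```
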
... Similarly, in mission- [22,32] and safety-critical [29] systems, the maintenance of clean software design has been shown to be vital. Furthermore, widely adopted open source systems also benefit significantly from meticulous software maintenance, which impacts their overall sustainability and security [17,21,24]. ...
Conference Paper
Full-text available
Background: Due to the lack of standards at the time they were written, many codebases suffer from reduced code quality. Some of these systems cannot be replaced and need to be maintained to guarantee the healthy development of the software. In the context of this paper, we are presenting a case study performed on the legacy codebase of ASML, a leading Dutch company specializing in the design and manufacture of advanced semiconductor lithography equipment. As new policies were enforced within the company, many specific interfaces were deprecated. The presence of such obsolete interfaces in the codebase adds to the overall complexity of the legacy system, while also permitting a future violation of the Interface Segregation Principle, one of the five SOLID design principles. Aims: The objective of this paper is to present our experience with refactoring the specified interfaces. Besides performing the clean-up, we were tasked to find an optimal, iterative, and repeatable approach to performing the clean-up of these interfaces while documenting our findings in a reproducible manner for future employees. Method: To discover an efficient clean-up strategy, we used the Action Research methodology. Going through the "Planning", "Acting", "Observing" and "Reflecting" phases of this methodology repeatedly, we revised the strategy and executed our refactoring work accordingly. Results: After multiple refactoring operations, our process stabilized and we introduced the "Cleaning-Up Cycle". We spent less time and had fewer failures per interface as our cycle evolved, despite attempting more complex types of interfaces later. Conclusions: We found the Clean-Up Cycle to represent a valuable strategy when executing this type of refactoring. This was fed back to the ASML team and considered the foundation for future employees continuing our work.
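
ASML's code is not public, so the following is a generic, hypothetical illustration of the Interface Segregation Principle violation the abstract refers to: a deprecated catch-all interface is split so each client depends only on the operations it actually uses. All class and method names are invented.

```python
# Generic illustration (not ASML code): a "fat" interface forces clients to
# depend on operations they never use, violating the Interface Segregation
# Principle; splitting it lets each client implement only what it needs.
from abc import ABC, abstractmethod

# Before: one deprecated, catch-all interface.
class MachineInterface(ABC):
    @abstractmethod
    def calibrate(self) -> None: ...
    @abstractmethod
    def expose_wafer(self) -> None: ...
    @abstractmethod
    def print_report(self) -> None: ...   # only diagnostics clients need this

# After: segregated interfaces, so a diagnostics client no longer depends on
# exposure operations it never calls.
class Calibratable(ABC):
    @abstractmethod
    def calibrate(self) -> None: ...

class Exposing(ABC):
    @abstractmethod
    def expose_wafer(self) -> None: ...

class Reporting(ABC):
    @abstractmethod
    def print_report(self) -> None: ...

class ExposureTool(Calibratable, Exposing):
    def calibrate(self) -> None:
        print("calibrating")
    def expose_wafer(self) -> None:
        print("exposing")
```
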
... Software maintenance continues to be the larger and costlier part of a software system's lifetime: maintaining a clean software design allows software engineers to implement new features faster and reduce the number of bugs in the program [9,8]. These aspects have been reported by researchers in various scenarios, from large legacy mainframe systems [59,10,11,48], to mission- [18,27] and safety-critical [26] systems, to widely adopted open source systems [21,13,17]. ...
Article
Full-text available
The presence of technical debt in legacy systems is an inevitable consequence of years of development. Metrics play a significant role in informing the prioritisation process of maintenance activities to reduce this debt. However, it is important to note that not all metrics are equally important or readily available in real industrial settings. This paper summarises an experience report of refactoring activities performed at a Dutch partnering company, aimed at identifying, prioritising and repaying parts of the architectural technical debt accumulated in two decades of development. Given the size of the refactoring task, a data-driven prioritisation was necessary, and based on the impact that the maintenance activity would have on the base system. However, the metrics available from the monitoring of the system formed a limited set, and were not always focused on architectural aspects. Even so, the impact analysis was performed and resulted in the selection of a subset of components that needed urgent maintenance. The refactoring of the identified components was centred around the well-known SOLID design principles, particularly the Dependency Inversion (DI) principle. Additionally, a set of recurring actions was established into 'refactoring patterns' and systematically applied to more than 5,000 source, header and custom domain language files. This work, albeit limited to the period where the activity was planned for, was well received by the industrial collaborator. The patterns have proven very valuable in the process of maintaining such a large project scope. The data-driven approach and the identified patterns have helped the team navigate this large space and consistently refactor similar architectural issues that fall under the same category.
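
As a rough illustration of the Dependency Inversion principle around which the reported refactoring was centred (the actual codebase and its components are not shown here), the sketch below makes a high-level module depend on an abstraction and injects the concrete implementation; every name in it is hypothetical.

```python
# Illustrative sketch of Dependency Inversion: the high-level policy depends
# only on an abstraction, and the concrete low-level module is injected, so it
# can be replaced without touching the policy code. All names are hypothetical.
from abc import ABC, abstractmethod

class MetricStore(ABC):
    @abstractmethod
    def save(self, name: str, value: float) -> None: ...

class FileMetricStore(MetricStore):
    def __init__(self, path: str) -> None:
        self.path = path
    def save(self, name: str, value: float) -> None:
        with open(self.path, "a") as f:
            f.write(f"{name},{value}\n")

class MaintenanceMonitor:
    """High-level module: depends only on the MetricStore abstraction."""
    def __init__(self, store: MetricStore) -> None:
        self.store = store
    def record_debt(self, component: str, score: float) -> None:
        self.store.save(component, score)

if __name__ == "__main__":
    monitor = MaintenanceMonitor(FileMetricStore("debt.csv"))
    monitor.record_debt("scheduler", 42.0)
```
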
... Moreover, complexity has been proved to be a significant and non-negligible factor that influences software development and maintenance [42]. Meanwhile, the more complexity involved in a system, the more difficulty the designers or engineers have to understand the implementation process and thus the system itself [40], and hence the greater mental effort people have to exert to solve the complexity [39]. ...
Article
Full-text available
Within the service-oriented computing domain, Web service composition is an effective realization to satisfy the rapidly changing requirements of business. Although the research into Web service composition has unfolded broadly, little work has been published towards composition effort estimation. Since examining all of the related work in this area becomes a mission next to impossible, the classification of composition approaches can be used to facilitate multiple research tasks. However, the current attempts to classify Web service composition are not suitable for the research into effort estimation. For example, the contexts and technologies of composition approaches are confused in the existing classifications. This paper firstly proposes an effort-oriented classification matrix for Web service composition, which distinguishes between the context and technology dimension. The context dimension is aimed at analyzing the environmental influence on the effort of Web service composition, while the technology dimension focuses on the technical influence on the effort. Therefore, different context types and technology categories can be treated as different effort factors. Based on the classification matrix, this paper also builds an effort-estimation-checklist table by applying a set of qualitative effort estimation hypotheses to those effort factors. The table can then be used to facilitate comparing the qualitatively estimated effort between different composition approaches.
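
One possible way to picture the proposed classification matrix and checklist as data structures is sketched below; the factor levels and effort ratings are invented for illustration and are not the ones defined in the cited paper.

```python
# Hypothetical sketch: model the effort-oriented classification as two
# dimensions (context, technology) and apply qualitative hypotheses that map
# each factor level to a relative effort rating, yielding a checklist row per
# composition approach.
CONTEXT_EFFORT = {"static composition": "lower", "dynamic composition": "higher"}
TECHNOLOGY_EFFORT = {"workflow language": "moderate", "ad-hoc programming": "higher"}

def effort_checklist(approaches):
    """approaches: list of (name, context, technology) tuples."""
    rows = []
    for name, context, technology in approaches:
        rows.append({
            "approach": name,
            "context effort": CONTEXT_EFFORT.get(context, "unknown"),
            "technology effort": TECHNOLOGY_EFFORT.get(technology, "unknown"),
        })
    return rows

for row in effort_checklist([
    ("A", "static composition", "workflow language"),
    ("B", "dynamic composition", "ad-hoc programming"),
]):
    print(row)
```
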
... Moreover, complexity has been proved to be a significant and non-negligible factor that influences software development and maintenance [10]. Meanwhile, the more complexity involved in a system, the more difficulty the designers or engineers have to understand the implementation process and thus the system itself [8], and hence the greater mental effort people have to exert to solve the complexity [7]. ...
... In the software economics field, complexity has been viewed as an inherent property of the functional requirements of a software product, which cannot be reduced or simplified beyond a certain threshold [19]. Moreover, complexity has been proved to be a significant and non-negligible factor that influences software development and maintenance [11]. In fact, the more complexity involved in a system, the more difficulty the designers or engineers have to understand the implementation process and thus the system itself [9], and hence the greater mental effort people have to exert to solve the complexity [8]. ...
Article
Full-text available
Considering web service composition (WSC) has increasingly become a significant practice in SOA implementations, WSC-based SOA projects would be particularly worth more attention. However, effort estimation for WSC-based SOA implementations may suffer from challenges because of the numerous and various approaches to WSC. This paper proposes circumstantial-evidence-based effort judgement as a supplementary to expert judgement to achieve qualitative effort comparison between different implementation proposals of a WSC-based SOA project. In detail: 1) Through viewing WSC-based SOA system from a perspective of mechanistic organisation, we borrow divide-and-conquer (D&C) as the generic strategy to narrow down the problem of effort judgement for the entire SOA implementation to that for individual WSCs. 2) Benefiting from an effort-oriented classification matrix, we identify a set of effort factors for individual WSC approaches. 3) By analogy, we apply the evidence concept and judgement method in forensic science to the context of effort judgement, and finally realise the qualitative effort judgement for WSC-based SOA implementations.
... Results of a study based on statistical techniques show that higher complexity is associated with lower design quality of code and hence higher maintenance costs [3]. ...
Conference Paper
Full-text available
SOA is a recent architecture for software, and many tools are available to implement it. Critics of cyclomatic complexity argue that complexity changes with the modularization of code. If the technology shifts from linear programming to OOP and SOA, the complexity of code will also change. Cyclomatic complexity was first introduced by T.J. McCabe as a metric for measuring the complexity of a piece of code. McCabe calculated the complexity of sample code written in the Fortran language. Fortran is a linear programming language, and there are no functions or classes in this language. Hence, at the time cyclomatic complexity was introduced, there was no concept of structured or object-oriented languages. This is not reflected in McCabe's cyclomatic complexity, which is therefore not sufficient to measure complexity for advanced programming architectures like OOP and SOA. Further work has been done on the complexity of structured, OOP and SOA systems, but work is still required on SOA. This study proposes a new metric for measuring the complexity of WCF, an SOA technology. The significance of this new metric is that it can help estimate the cost of a new project, the maintenance cost of existing projects, basis path testing, the comparison of two projects and many other factors.
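
For reference, a minimal sketch of how McCabe's metric is commonly computed from source code (decision points plus one) is shown below; it analyses Python code with the standard-library ast module and does not attempt to model the WCF/SOA-specific metric the paper proposes.

```python
# Minimal sketch of McCabe's cyclomatic complexity as commonly computed from
# source code: V(G) = number of decision points + 1. Branch constructs are
# counted with Python's standard-library ast module.
import ast

DECISIONS = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler, ast.BoolOp)

def cyclomatic_complexity(source: str) -> int:
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, DECISIONS) for node in ast.walk(tree))

sample = """
def classify(x):
    if x < 0:
        return "negative"
    for i in range(x):
        if i % 2 == 0 and i > 2:
            return "even-ish"
    return "other"
"""
print(cyclomatic_complexity(sample))  # decision points + 1
```
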
... Moreover, complexity has been proved to be a significant and non-negligible factor that influences software development and maintenance [42]. Meanwhile, the more complexity involved in a system, the more difficulty the designers or engineers have to understand the implementation process and thus the system itself [40], and hence the greater mental effort people have to exert to solve the complexity [39]. ...
... In the software economics field, complexity has been viewed as an inherent property of the functional requirements of a software product, which cannot be reduced or simplified beyond a certain threshold [19]. Moreover, complexity has been proved to be a significant and non-negligible factor that influences software development and maintenance [11]. In fact, the more complexity involved in a system, the more difficulty the designers or engineers have to understand the implementation process and thus the system itself [9], and hence the greater mental effort people have to exert to solve the complexity [8]. ...
Conference Paper
Full-text available
Expert judgment for software effort estimation is oriented toward direct evidences that refer to actual effort of similar projects or activities through experts' experiences. However, the availability of direct evidences implies the requirement of suitable experts together with past data. The circumstantial-evidence-based judgment proposed in this paper focuses on the development experiences deposited in human knowledge, and can then be used to qualitatively estimate implementation effort of different proposals of a new project by rational inference. To demonstrate the process of circumstantial-evidence-based judgment, this paper adopts propositional learning theory based diagnostic reasoning to infer and compare different effort estimates when implementing a Web service composition project with some different techniques and contexts. The exemplar shows our proposed work can help determine effort tradeoff before project implementation. Overall, circumstantial-evidence-based judgment is not an alternative but complementary to expert judgment so as to facilitate and improve software effort estimation.
... Moreover, complexity has been proved to be a significant and non-negligible factor that influences software development and maintenance [42]. Meanwhile, the more complexity involved in a system, the more difficulty the designers or engineers have to understand the implementation process and thus the system itself [40], and hence the greater mental effort people have to exert to solve the complexity [39]. ...
Article
Full-text available
In 1995, DARPA initiated a work on a programmable concept of computer networking that would overcome shortcomings of the Internet Protocol. In this concept, each packet is associated with a program code that defines packet’s behavior. The code defines available network services and protocols. The concept has been called Active Networks. The research of Active Networks nearly stopped as DARPA ceased funding of research projects. Because we are interested in research of possible successors to the Internet Protocol, we continued the research. In this paper, we present an active network node called Smart Active Node (SAN). Particularly, this paper focuses on SAN’s ability to translate data flow transparently between IP network and active network to further improve performance of IP applications.
... Additionally, complexity has been proved to be a significant and non-negligible factor that influences software development and maintenance (Francalanci and Merlo 2008). ...
Conference Paper
Full-text available
With more and more availability of services, Web service composition (WSC) based SOA implementations have increasingly become a significant type of SOA projects in practice. However, effort estimation for such a type of SOA project can still be limited because of the numerous and various approaches to WSC. Through viewing WSC based SOA system from a perspective of mechanistic organization, this paper borrows Divide-and-Conquer (D&C) as the generic strategy to narrow down the problem of effort judgment for the entire SOA implementation to that for individual WSCs. Moreover, benefiting from an effort-oriented classification matrix and a set of effort-related hypotheses, we assign scores to effort factors of WSC assisted by a set of rules. These effort scores are used to facilitate qualitatively judging different effort between different types of WSC approaches, and eventually construct an effort checklist for WSC approaches. Finally, this effort checklist can be used together with D&C algorithm to realize the qualitative effort judgment for WSC based SOA implementations.
Article
A wide range of public goods, such as open source software, possess two often-ignored features: (i) excludable and potentially rivalrous contribution benefits (e.g. status seeking) and (ii) nonexcludable and nonrival consumption costs (e.g. adoption costs). I develop a model of the voluntary provision of public goods that incorporates these features. I find that these additional features mitigate the well-known incentive problems, but introduce new ones. Costly consumption lessens the free-rider problem, leading to more efficient provision. Private benefits similarly reduce the free-rider problem, but can lead to over-provision via a negative congestion externality on the supply side. Status-seeking induces an increase in contributions to the benefit of each contributor but imposes a cost on all other consumers and contributors. Efforts to maximize welfare by a community leader or social planner often involve transferring surpluses from consumers to producers.
Article
Full-text available
This paper reports the results obtained from the use of project complexity parameters in modeling effort estimates. It highlights the attention that complexity has recently received in the project management area. After considering that traditional knowledge has consistently proved to be prone to failure when put into practice on actual projects, the paper endorses the belief that there is a need for more open-minded and novel approaches to project management. With a view to providing some insight into the opportunities that integrating complexity concepts into model building offers, we extend the work previously undertaken on the complexity dimension in project management. We do so by analyzing the results obtained with classical linear models and artificial neural networks when complexity is considered as another managerial parameter. For that purpose, we have used the International Software Benchmarking Standards Group data set. The results obtained prove the benefits of integrating the complexity of the projects at hand into the models. They also highlight the need for a complex system, such as artificial neural networks, to capture the fine nuances of the complex systems being modeled, namely the projects.
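
A sketch of the kind of comparison described above follows, using synthetic stand-in data because the ISBSG data set is licensed and not reproduced here: a classical linear model and a small artificial neural network are fitted to predict effort from size plus a complexity rating, then compared on hold-out error. The variables and relationship are invented.

```python
# Compare a linear model and a small ANN for effort estimation when project
# complexity is included as an extra input feature (synthetic data).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
size = rng.uniform(10, 500, 300)               # functional size of each project
complexity = rng.uniform(1, 5, 300)            # managerial complexity rating
effort = 2 * size * (1 + 0.1 * complexity**2) + rng.normal(0, 50, 300)

X = np.column_stack([size, complexity])
X_train, X_test, y_train, y_test = train_test_split(X, effort, random_state=0)

linear = LinearRegression().fit(X_train, y_train)
ann = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=5000, random_state=0),
).fit(X_train, y_train)

print("linear MAE:", mean_absolute_error(y_test, linear.predict(X_test)))
print("ANN MAE:   ", mean_absolute_error(y_test, ann.predict(X_test)))
```
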
Article
Full-text available
This paper presents Spoon and its AOP extension. Spoon is a pure Java 5 framework for implementing source-level and annotation-driven program transformations. It aims to be a powerful tool to build and integrate middleware. Spoon allows for the definition of program processors and annotation processors that use Compile-Time reflection, which is achieved with an extension of Sun's APT. In particular, Spoon provides an AOP extension under the form of a set of annotation processors. With Spoon, it is possible to do comprehensive and efficient AOP in pure Java, without relying on any specific language or IDE support.
Article
The initial stages of a study of program complexity in the context of highlevel languages are reported. The goal of the research has been the development of techniques to measure program attributes and the formulation of a model of program complexity against which the complexity of a program for a task may be measured or, at least, may be compared with another program for the same task. We analyze, in particular, how the concepts of control flow and data flow can be used to define objective criteria to compare alternative implementations of an algorithm, and to develop useful programming guidelines.
Article
Numeric measurement of programs, where measurements can be logically related to optimum approaches, has appeal from an engineering standpoint. Although software engineering has come a long way in the sense of establishing disciplines and orderly processes, the use of numbers to aid in understanding the reasons for those disciplines has not made the same progress.
Article
Review of the theory called "software science" and the evidence supporting it. A brief description of a related theory, called "software physics", is included.
Article
The purpose of this study is to investigate the possibility of providing some useful measures to aid in the evaluation of software designs. Such measurements should allow some degree of predictability in estimating the quality of a coded software product based upon its design and should allow identification and correction of deficient designs prior to the coding phase, thus providing lower software development costs. The study involves the identification of a set of hypothesized measures of design quality and the collection of these measures from a set of designs for a software system developed in industry. In addition, the number of modifications made to the coded software that resulted from these designs was collected. A data analysis was performed to identify relationships between the measures of design quality and the number of modifications made to the coded programs. The results indicated that module coupling was an important factor in determining the quality of the resulting product. The design metrics accounted for roughly 50–60% of the variability in the modification data, which supports the findings of previous studies. Finally, the weaknesses of the study are identified and proposed improvements are suggested.
Article
Metaphors, such as the Cathedral and Bazaar, used to describe the organization of FLOSS projects typically place them in sharp contrast to proprietary development by emphasizing FLOSS's distinctive social and communications structures. But what do we really know about the communication patterns of FLOSS projects? How generalizable are the projects that have been studied? Is there consistency across FLOSS projects? Questioning the assumption of distinctiveness is important because practitioner-advocates from within the FLOSS community rely on features of social structure to describe and account for some of the advantages of FLOSS production. To address this question, we examined 120 project teams from SourceForge, representing a wide range of FLOSS project types, for their communications centralization as revealed in the interactions in the bug tracking system. We found that FLOSS development teams vary widely in their communications centralization, from projects completely centered on one developer to projects that are highly decentralized and exhibit a distributed pattern of conversation between developers and active users. We suggest, therefore, that it is wrong to assume that FLOSS projects are distinguished by a particular social structure merely because they are FLOSS. Our findings suggest that FLOSS projects might have to work hard to achieve the expected development advantages which have been assumed to flow from "going open." In addition, the variation in communications structure across projects means that communications centralization is useful for comparisons between FLOSS teams. We found that larger FLOSS teams tend to have more decentralized communication patterns, a finding that suggests interesting avenues for further research examining, for example, the relationship between communications structure and code modularity.
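
As an illustration of the centralization measure discussed above, the sketch below computes Freeman degree centralization for a communication graph with networkx; the example graphs are synthetic, whereas the study derives its edges from bug-tracker interactions.

```python
# Freeman degree centralization of a project's communication network:
# 1.0 = all communication flows through one person, 0.0 = perfectly even.
import networkx as nx

def degree_centralization(g: nx.Graph) -> float:
    n = g.number_of_nodes()
    if n < 3:
        return 0.0
    c = nx.degree_centrality(g)          # normalized degree, in [0, 1]
    c_max = max(c.values())
    return sum(c_max - v for v in c.values()) / (n - 2)

star = nx.star_graph(9)                   # one developer talks to everyone
ring = nx.cycle_graph(10)                 # conversation spread evenly
print(degree_centralization(star))        # -> 1.0
print(degree_centralization(ring))        # -> 0.0
```
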
Article
Discussed is the unit-of-measure situation in programming. An analysis of common units of measure for assessing program quality and programmer productivity reveals that some standard measures are intrinsically paradoxical. Lines of code per programmer-month and cost per defect are in this category. Presented here are attempts to go beyond such paradoxical units as these. Also discussed is the usefulness of separating quality measurements into measures of defect removal efficiency and defect prevention, and the usefulness of separating productivity measurements into work units and cost units.
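
A small worked example of the cost-per-defect paradox described above (all figures invented): because a large share of testing cost is fixed, dividing it by the defect count makes the higher-quality release look worse per defect even though its total cost is lower.

```python
# Invented numbers illustrating why "cost per defect" penalizes quality:
# fixed test cost dominates, so fewer defects means a higher per-defect cost.
fixed_test_cost = 50_000          # test planning, execution, environments
variable_cost_per_fix = 400

for defects in (500, 50):         # low-quality vs high-quality release
    total = fixed_test_cost + variable_cost_per_fix * defects
    print(f"{defects:>3} defects: total ${total:,}, cost per defect ${total / defects:,.0f}")
```
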
Article
A family of syntactic complexity metrics is defined that generates several metrics commonly occurring in the literature. The paper uses the family to answer some questions about the relationship of these metrics to error-proneness and to each other. Two derived metrics are applied; slope which measures the relative skills of programmers at handling a given level of complexity and r square which is indirectly related to the consistency of performance of the programmer or team. The study suggests that individual differences have a large effect on the significance of results where many individuals are used. When an individual is isolated, better results are obtainable. The metrics can also be used to differentiate between projects on which a methodology was used and those on which it was not.
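
To make the two derived metrics concrete, the sketch below regresses a programmer's error counts on module complexity: the fitted slope reflects how error-proneness grows with complexity for that programmer, and r-squared how consistent the relationship is. The data points are invented.

```python
# Slope and r-squared of errors vs. syntactic complexity, per programmer,
# using a simple linear regression (illustrative data only).
import numpy as np
from scipy.stats import linregress

def slope_and_r2(complexity, errors):
    fit = linregress(complexity, errors)
    return fit.slope, fit.rvalue ** 2

complexity = np.array([4, 7, 9, 12, 15, 21])
errors_dev_a = np.array([1, 1, 2, 2, 3, 4])      # handles complexity well
errors_dev_b = np.array([1, 3, 2, 6, 9, 8])      # degrades faster, less consistently

print(slope_and_r2(complexity, errors_dev_a))
print(slope_and_r2(complexity, errors_dev_b))
```
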
Article
The effort required to service maintenance requests on a software system increases as the software system ages and deteriorates. Thus, it may be economical to replace an aged software system with a freshly written one to contain the escalating cost of maintenance. We develop a normative model of software maintenance and replacement effort that enables us to study the optimal policies for software replacement. Based on both analytical and simulation solutions, we determine the timings of software rewriting and replacement, and hence the schedule of rewriting, as well as the size of the rewriting team, as functions of the user environment, effectiveness of rewriting, technology platform, development quality, software familiarity, and maintenance quality of the existing and the new software systems. Among other things, we show that a volatile user environment often leads to a delayed rewriting and an early replacement (i.e., a compressed development schedule). On the other hand, a greater familiarity with either the existing or the new software system allows for a less-compressed development schedule. In addition, we also show that potential savings from rewriting will be higher if the new software system is developed with a superior technology platform, if programmers' familiarity with the new software system is greater, and if the software system is rewritten with a higher initial quality.
Article
In recent years, open source software - more properly, free and open source software - has emerged as one popular solution to the so-called "software crisis". Advocates regard F/OSS as an agile, practice-led initiative that addresses three key issues namely cost, time scale and quality. F/OSS products are usually freely available for public download. The collaborative, parallel efforts of globally distributed developers allow many F/OSS products to be developed much more quickly than conventional software. Many F/OSS offerings are recognized for their high standards of reliability, efficiency, and robustness; products such as GNU/Linux, Apache, and Bind have become "category killers" stifling the incentive to develop any competing products.
Article
SourceForge provides abundant accessible data from Open Source Software development projects, making it an attractive data source for software engineering research. However it is not without theoretical peril and practical pitfalls. In this paper, we outline practical lessons gained from our spidering, parsing and analysis of SourceForge data.