A software complexity model of object-oriented systems
ABSTRACT A model for the emerging area of software complexity measurement of OO systems is required for the integration of measures defined by various researchers and to provide a framework for continued investigation. We present a model based on the literature on OO systems and on software complexity for structured systems. The model defines the software complexity of OO systems at the variable, method, object, and system levels. At each level, measures are identified that account for the cohesion and coupling aspects of the system. The perceptions of complexity held by users of OO techniques provide support for the levels and measures.
January 4, 1992
A Software Complexity Model
David P. Tegarden
Information and Decision Sciences
School of Business and Public Administration
California State University
San Bernardino, California 92407-2397
Steven D. Sheetz
Graduate School of Business
University of Colorado
Boulder, Colorado 80309-0419
David E. Monarchi
Graduate School of Business
University of Colorado
Boulder, Colorado 80309-0419
A model for the emerging area of software complexity measurement of OO systems is required
for the integration of measures defined by various researchers and to provide a framework for
continued investigation. We present a model based on the literature on OO systems and on software
complexity for structured systems. The model defines the software complexity of OO systems
at the variable, method, object, and system levels. At each level, measures are identified that
account for the cohesion and coupling aspects of the system. The perceptions of complexity held
by users of OO techniques provide support for the levels and measures.
Keywords: Object-Oriented Systems, Software Metrics, Software Measurement, Software Quality
Send Comments to Steven D. Sheetz at the above address.
Decision Support Systems: The International Journal (1/93)
Among the many claimed benefits of Object-Oriented (OO) systems are faster
development, higher quality, easier maintenance, reduced costs, increased scalability, better
information structures, and increased adaptability . One of the primary reasons for these
claims is that OO approaches control complexity of a system by supporting hierarchical
decomposition through both data and procedural abstraction . However, as Brooks points out,
"The complexity of software is an essential property, not an accidental one" . The OO
decomposition process merely helps control the inherent complexity of the problem; it does not
reduce or eliminate the complexity. Measurement of the software complexity of OO systems has
the potential to aid in the realization of these expected benefits.
Measurement of software complexity has been of great interest to researchers in software
engineering for some time [6, 22, 35]. Software complexity has been shown to be one of the
major contributing factors to the cost of developing and maintaining software . According
to Coad and Yourdon , a good OO design is one that allows trade-offs of analysis, design,
implementation and maintenance costs throughout the lifetime of the system so that the total
lifetime costs of the system are minimized. Software complexity measurement can contribute to
making these cost trade-offs in two ways:
1) To provide a quantitative method for predicting how difficult it will be to
design, implement, and maintain the system.
2) To provide a basis for making the cost trade-offs necessary to reduce costs
over the lifetime of the system.
We propose a model of the software complexity of OO systems described at four levels:
variable, method, object, and system. Figure 1 shows the levels and the relationships between
the levels. At each level, measures are identified to account for the cohesion (intra) and coupling
(inter) aspects of the system at that level. The measures account for both procedural and data
characteristics of an OO system, and are applicable throughout the lifetime of an OO system.
Together the measures provide analysts, designers, and programmers using OO techniques the
ability to identify which components of an OO system are the most complex, and may therefore
require additional analysis, design, or testing.
1.1. Related Work
Traditional systems have been the focus for most of the past work on software complexity
measurement [16, 22, 35, 54]. Recently there has been increasing interest in software complexity
measurement of OO systems [11, 13, 31, 37, 38, 39, 40, 47, 52]. These researchers have applied
traditional complexity metrics to OO systems , identified new metrics for OO systems [13,
31, 40, 44, 47], or considered both approaches [37, 38, 39]. The proposed model and measures
fall into the last category. Measures that correspond to those identified by these researchers are
included in the set of proposed measures. What has been lacking in the previous research is a
framework for the organization and investigation of the identified measures. The levels defined
in the proposed model specifically address this issue.
Work in other areas of OO system measurement includes measures of object reuse [5, 23,
24, 34], measures of OO CASE effectiveness [1, 2], and planning and estimation models [28, 32].
The next section of the paper provides background for the model and measures. The
following section presents the model and the set of measures identified for each level. The final
section contains a summary and identifies future research directions.
2.1. Definition of Software Complexity
Software complexity measurement is an area of software engineering concerned with the
measurement of factors that affect the cost of developing and maintaining software. Zuse 
states that the term "software complexity" is poorly defined and that software complexity
measurement is a misnomer. He offers a definition of software complexity that is consistent with
The true meaning of software complexity is the difficulty to maintain, change and
understand software. It deals with the psychological complexity of programs. 
Three specific types of psychological complexity that affect a programmer's ability to comprehend
software have been identified [12, 16]: problem complexity, system design complexity, and
procedural complexity.
Problem complexity is a function of the problem domain. Simply stated, it is assumed
that complex problems are more difficult for a programmer to comprehend than simple problems.
Since this type of complexity is impossible to control, it generally is ignored in software
complexity measurement.
System design complexity addresses the mapping of a problem space into a given
representation. Structural complexity and data complexity are the two types of system design
complexity defined for structured systems . Structural complexity addresses the concept of
coupling. Coupling measures the interdependence of modules of source code, e.g., C functions
calling other C functions. It is assumed that the more coupling between modules, the more
difficult it is for a programmer to comprehend a given module.
Data complexity addresses the concept of cohesion. Cohesion measures the
intradependence of a module. In this case, it is assumed that the more cohesive a module, the
easier it is for a programmer to comprehend the module . Structural and data complexity
measures are based on the module's fan-in, fan-out, and number of input/output variables [3, 11,
The complexity of a system is based on the sum of the structural and data complexity for
all modules in the system . These measures address information system complexity at the
system and module levels. This multiple level approach with an emphasis on cohesion and
coupling provides a basis in traditional software measurement for the proposed software
complexity model of OO systems.
Procedural complexity is associated with the logical structure of a program. This
approach to complexity measurement assumes that the length of the program (number of tokens
or the lines of code)  or the number of logical constructs (sequences, decisions, or loops) 
that a program contains determines the complexity of the program .
In this paper, we address system design complexity for OO systems.
2.2. Desirable Properties of Software Measures
Many authors propose desirable properties of measures for effective evaluation of software
complexity [18, 35, 53]. We believe two essential properties should be used to create a set of
measures of software and designs. First, the measures should be applicable throughout the
system development process. Many procedural complexity measures, such as lines of source
code, cannot be applied until significant portions of the development effort are complete
and therefore miss opportunities to control software
complexity in the early phases of the development process. Second, the measures should be
intuitive in nature. By intuitive, we mean that analysts, designers, and programmers can agree
on the reasonableness of both aggregate and component measures.
2.3. Characteristics of Good OO Systems Design
Criteria for a good OO design have been identified using the concepts of coupling,
cohesion, reuse, clarity, depth of the generalization-specialization hierarchy, simplicity, and size
. Other researchers have identified the need for measures for the evaluation of an object.
The measures identified include the concepts of coupling, cohesion, sufficiency, completeness,
and primitiveness . Clarity, simplicity, sufficiency, completeness, and primitiveness are
associated with the comparison of a design or software to the problem domain requirements and
therefore, problem complexity. It is not clear how they can be measured from available static
representations (source code or designs); therefore, we do not address them further.
Two types of coupling (interaction and inheritance) and three types of cohesion (service,
class, and generalization-specialization) have been identified . In this paper, we propose
measures of interaction coupling, inheritance coupling, service cohesion, and class cohesion.
Generalization-specialization cohesion requires domain specific knowledge and, therefore, is
associated with problem complexity, not system design complexity. In addition, measures of the
depth and size of the generalization-specialization hierarchy for an OO system are proposed.
Each of these measures can be calculated from the available static representations of an OO
system.
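As an illustration of how depth and size of a generalization-specialization hierarchy might be read from a static representation, the following Python sketch walks a small class hierarchy. The class names and the two function names are hypothetical, and the specific definitions used here (depth as the longest path to a root class, size as the number of classes under a root) are assumptions made for illustration rather than the definitions of the proposed measures.

```python
# Hypothetical hierarchy used only for illustration.
class Employee:
    pass

class HourlyEmployee(Employee):
    pass

class SalariedEmployee(Employee):
    pass

class StudentEmployee(HourlyEmployee):
    pass


def hierarchy_depth(cls):
    """Longest path from cls up to a root class (the built-in object is ignored)."""
    parents = [base for base in cls.__bases__ if base is not object]
    if not parents:
        return 0
    return 1 + max(hierarchy_depth(parent) for parent in parents)


def hierarchy_size(root):
    """Number of classes in the hierarchy rooted at root, counting root itself."""
    seen = set()
    stack = [root]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(current.__subclasses__())
    return len(seen)


depth_of_student = hierarchy_depth(StudentEmployee)  # two levels below Employee
size_of_employee = hierarchy_size(Employee)          # Employee plus three subclasses
```

Both values depend only on class definitions, so a comparable calculation could be made from a design diagram before any method bodies exist.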
2.4. Software Design Fundamentals Emphasized by OO Systems
OO systems emphasize three software design fundamentals that are useful in controlling
software complexity: polymorphism, encapsulation, and inheritance. Each of these is described
below.
2.4.1. Polymorphism
Polymorphism means having the ability to take several forms. In OO systems,
polymorphism allows the implementation of a given operation to be dependent on the object that
"contains" the operation. For example, a "compute-pay" operation can be implemented differently
based on the employee's object type, e.g., part-time, hourly, or salaried. When a new type of
employee is created, e.g., student, the programmer simply creates a new type of employee object
and a new "compute-pay" operation in the new object. The "compute-pay" operations of the
other types of employee are not affected by the payment operation implementation required for
the new type of employee. This reduces complexity by isolating the effect of changes and
providing highly consistent semantics across the interfaces to all employee objects. Thus,
polymorphism is a mechanism that can be used to control the complexity of an OO system. In
contrast, structured systems often have all compute pay operations contained in one program.
The program must be capable of differentiating between the different types of employees and
applying the appropriate operation. Adding new types of employees may require existing code
to be changed.
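The "compute-pay" example can be sketched in Python as follows. The class names, attributes, and pay rules are hypothetical and chosen only to make the example runnable; the point is that adding StudentEmployee requires no change to the existing classes or to the payroll function.

```python
class Employee:
    """Generic employee; subclasses supply their own compute_pay."""

    def __init__(self, name):
        self.name = name

    def compute_pay(self):
        raise NotImplementedError("each employee type defines its own pay rule")


class HourlyEmployee(Employee):
    def __init__(self, name, hours, rate):
        super().__init__(name)
        self.hours = hours
        self.rate = rate

    def compute_pay(self):
        return self.hours * self.rate


class SalariedEmployee(Employee):
    def __init__(self, name, annual_salary):
        super().__init__(name)
        self.annual_salary = annual_salary

    def compute_pay(self):
        return self.annual_salary / 12


class StudentEmployee(Employee):
    """Added later; no existing class or caller has to change."""

    def __init__(self, name, stipend):
        super().__init__(name)
        self.stipend = stipend

    def compute_pay(self):
        return self.stipend


def payroll(employees):
    # The caller never tests the employee type; dispatch is by object.
    return [employee.compute_pay() for employee in employees]
```

A structured equivalent would typically hold one compute-pay routine with a branch per employee type, so each new type means editing that routine.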
The use of polymorphism also can increase the complexity of an OO system . For
example, if the "compute-pay" operation in one type of employee object is implemented to print
employee descriptive information or some other function, i.e., not compute the pay of the
employee, then semantic consistency across the interfaces of the employee objects no longer
exists. The programmer may no longer assume that all operations with the same name perform
the same generic function. The semantics of each individual implementation of an operation must
be determined. This increases the difficulty of understanding the employee objects. When used
this way, polymorphism can lead to the same type of software engineering problems created by
the unconstrained use of goto statements . The key to controlling an OO system's complexity
through the use of polymorphism is to ensure that all operations with the same name are
semantically consistent.
2.4.2. Encapsulation (Information Hiding)
OO systems integrate both the structural (data) and behavioral (procedural) aspects of a
program, while structured systems force an artificial separation of the structure from behavior.
According to Brodie and Ridjanovic ,
The separate treatment of structure and behavior complicates design,
specification, modification, and semantic integrity analysis.
Encapsulation allows programmers to modify the implementation of an object while avoiding the
creation of unwanted side effects in other objects by hiding the implementation detail behind a
public interface (protocol). This reduces complexity by ensuring that changes to the internal
operations of an object, i.e., those that do not modify the public interface, are contained within
the object.
2.4.3. Inheritance
An inheritance mechanism is considered important due to the potential for reuse.
Inheritance allows programmers to define objects incrementally by reusing previously defined
objects as the basis for new objects. There have been many different types of inheritance
mechanisms associated with OO systems . The most common inheritance mechanisms
include different forms of single and multiple inheritance.
Single inheritance allows a subclass to have only a single parent class. The subclass
extends the parent's definition. For example, when defining a new type of employee, e.g.,
student, the new employee type can inherit the common characteristics of being an employee
from a generic type of employee. Using this approach, the programmer only needs to be
concerned with the difference between student employees and generic employees. Existing
programming languages and OO methodologies permit extending the parent's definition including
the redefinition of some or all of the parent's properties [8, 14, 21, 27, 29, 36, 41, 45, 51]. With
redefinition capabilities, it is possible to introduce an inheritance conflict, i.e., a property of a
subclass with the same name as a property of a parent class.
Multiple inheritance occurs when a subclass may inherit from more than one parent class.
In this situation, the types of inheritance conflicts are multiplied. In addition to the possibility
of having an inheritance conflict between the subclass and one (or more) of its parent classes,
it is now possible to have conflicts between two (or more) parent classes. Inheritance conflicts
increase the difficulty of understanding an inheritance structure and individual objects in the
structure. Thus, there is a risk of increasing the complexity of an OO system, instead of
decreasing it, through the use of inheritance.
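The two kinds of inheritance conflict described above can be detected from class definitions alone. The Python sketch below is one possible way to do so; the function names and the example classes are hypothetical, and treating any shared non-dunder name as a conflict is a simplifying assumption.

```python
from itertools import combinations


def public_names(cls):
    """All names visible on cls, including inherited ones, minus dunder names."""
    return {name for name in dir(cls) if not name.startswith("__")}


def inheritance_conflicts(cls):
    """Return (redefined, parent_clashes) for cls.

    redefined: names defined in cls that a direct parent already provides.
    parent_clashes: names that two direct parents provide with different
    definitions (a name inherited from a shared ancestor is not counted).
    """
    parents = [base for base in cls.__bases__ if base is not object]
    own = {name for name in vars(cls) if not name.startswith("__")}
    redefined = {n for n in own if any(n in public_names(p) for p in parents)}
    parent_clashes = set()
    for first, second in combinations(parents, 2):
        for name in public_names(first) & public_names(second):
            if getattr(first, name) is not getattr(second, name):
                parent_clashes.add(name)
    return redefined, parent_clashes


# Hypothetical classes: StudentEmployee redefines compute_pay and
# inherits two different describe operations.
class Employee:
    def compute_pay(self):
        return 0

    def describe(self):
        return "employee"


class Student:
    def describe(self):
        return "student"

    def enroll(self):
        return True


class StudentEmployee(Employee, Student):
    def compute_pay(self):
        return 500


conflicts = inheritance_conflicts(StudentEmployee)
```

In Python the clash on describe is resolved silently by the method resolution order (Employee is listed first), which is exactly the kind of hidden rule a reader must know to understand the object.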
Snyder [48, 49] points out that the underlying cause of the inheritance conflict problem
is that most inheritance mechanisms violate encapsulation. For example, when the definition of
a superclass is modified, all of its subclasses are affected. This may introduce additional
inheritance conflicts in one (or more) of the superclass's subclasses. Therefore, programmers
must be aware of the effects of the modification not only in the superclass, but in each subclass
that inherits the modification.
Rumbaugh, et al. [45] suggest the following rules when using inheritance:
Query operations should not be redefined;
Operations should only be redefined if the overriding operation restricts the
semantics of the inherited operation; and
Redefining operations should never change the protocol or the underlying
semantics of the inherited operation.
However, as they point out, "The implementation and use of many existing object-oriented
languages violates these principles." [45, p. 65] Thus, inheritance conflicts caused by redefinition
capabilities and multiple inheritance mechanisms must be addressed when considering the overall
complexity of an individual object or OO system.
Another concern related to using inheritance occurs when a subclass does not utilize all
of its superclass(es)' properties. It has been suggested that this may indicate a subclass that has
been misclassified in the inheritance network . This also demonstrates that the use of
inheritance may work to increase the complexity of an OO system.
2.5. Measures Based on Information Flow and Data Bindings
Measures based on information flow and data bindings address system design complexity
for structured systems and are also consistent with the characteristics of good OO systems
design, the desirable properties of software measures, and the software design fundamentals
presented above. The measures defined by research on information flow and data bindings
include fan-in and fan-out [3, 25], data bindings , input/output variables , and intra- and
inter-module complexity .
Fan-in is the number of modules that call a module; fan-out is the number of modules that
a module calls. It has been stated that the complexity of a structured system is proportional to
the number of connections between modules in the system [3, 25], i.e., the higher the coupling
of modules in a system the more complex the system.
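Fan-in and fan-out as defined above can be computed directly from a call graph. The following Python sketch assumes the call graph is available as a mapping from each module to the set of modules it calls; the function name and the module names in the example are hypothetical.

```python
def fan_in_fan_out(calls):
    """Compute fan-in and fan-out for every module in a call graph.

    calls maps a module name to the collection of module names it calls.
    Fan-out is the number of distinct modules a module calls; fan-in is
    the number of distinct modules that call it.
    """
    modules = set(calls)
    for callees in calls.values():
        modules.update(callees)

    fan_out = {module: len(set(calls.get(module, ()))) for module in modules}
    fan_in = {module: 0 for module in modules}
    for callees in calls.values():
        for callee in set(callees):
            fan_in[callee] += 1
    return fan_in, fan_out


# Hypothetical structured payroll program.
call_graph = {
    "main": {"read_input", "compute_pay", "print_report"},
    "compute_pay": {"lookup_rate"},
    "print_report": {"lookup_rate"},
}
fan_in, fan_out = fan_in_fan_out(call_graph)
```

Here main has the highest fan-out (3) and lookup_rate the highest fan-in (2), so by the connection-counting argument these are the modules whose coupling contributes most to system complexity.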
The complexity of a system also has been described as a function of the amount of work
that the system's individual modules perform. Data bindings and the number of input/output
variables describe the amount of work an individual module performs [11, 26]. These measures
address the cohesion of a module.
Card and Agresti  state that a system's complexity is best described by combining the
above. This combination results in an intra-module (data) and inter-module (structural)
complexity for each module in a system. The overall complexity of a structured system is then
defined as the sum of the average complexities (intra- and inter-) of each module. By identifying
both module level complexity and system level complexity, the overall amount of complexity can
be represented for a given system .
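The combination described above, a system value formed as the sum of the average inter-module and average intra-module complexities, can be sketched as follows. The text does not give the per-module formulas, so the ones used here (squared fan-out for the structural term, and input/output variables divided by fan-out plus one for the data term) are an assumption based on a commonly cited form of this kind of model and should be checked against the original source before use. The function name is hypothetical.

```python
def system_design_complexity(modules):
    """Return (structural, data, total) for a list of modules.

    modules is a list of (fan_out, io_variables) pairs, one per module.
    ASSUMED per-module formulas (not stated in the text above):
      inter-module (structural) term: fan_out ** 2
      intra-module (data) term:       io_variables / (fan_out + 1)
    The system value is the sum of the two per-module averages.
    """
    if not modules:
        raise ValueError("at least one module is required")
    count = len(modules)
    structural = sum(fan_out ** 2 for fan_out, _ in modules) / count
    data = sum(io_vars / (fan_out + 1) for fan_out, io_vars in modules) / count
    return structural, data, structural + data


# Two hypothetical modules: each calls one other module; they pass
# four and two input/output variables respectively.
structural, data, total = system_design_complexity([(1, 4), (1, 2)])
```

Keeping the two averages separate as well as summed makes it possible to see whether a system's complexity is dominated by coupling or by the work done inside modules.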
3. Model of Complexity for OO Systems
The four-level model explicitly addresses the inherent components of OO systems
including variables, methods, and objects, as well as the overall system. The proposed measures
account for the cohesion (intra-) and coupling (inter-) aspects of the system at each level (see Fig.
1). The combined set of measures from all levels results in a representation of OO system software
complexity applicable throughout the lifetime of an OO system. Variable level complexity and
method level complexity contribute to, i.e., influence by increasing or decreasing, the complexity
measured at the object level. Complexity at the object level influences complexity at the system
level.
3.1. Levels of Software Complexity for OO Systems
Similar to the system and module levels of software complexity defined for structured
systems , the complexity of OO systems can be represented by a set of measures defined at
different levels. For OO systems the variable, method, object, and system levels are necessary.
The linear model of system design complexity for structured systems provides support for the
existence of a similar model of OO systems complexity.
3.1.1. Definitions of the Levels
Variable level complexity is associated with the definition and use of variables throughout
the system. Method level complexity is associated with the definition and use of methods
throughout the system. Object level complexity combines variable and method complexity with
measures of the inheritance structure. System level complexity provides high level
representations of OO system size and organization. Aggregate measures for each level should
provide the basis for complexity tradeoffs. However, due to the lack of a theory of OO software
construction and evolution, aggregate measures and operational rules for making complexity
tradeoffs must be determined empirically [19, 45].
3.1.2. Justification of the Levels
Each level of complexity can be justified from the literature on OO systems, the literature
on software measures, and the perceptions of users of OO techniques. The literature on OO
systems and software measures was presented in the background sections of this paper (see
Section 2) and is mentioned only briefly here. The perceptions of users of OO techniques are
discussed in more detail.
First, many OO analysis and design techniques identify variables, methods, and objects
as components of OO systems [7, 14, 15, 45] because each of these concepts is inherent to OO
systems. Assuming that some combination of these components is a "system", all four levels are
justified from the literature on OO systems.
Second, as stated above, previous literature on software measurement provides support for
the idea of levels of complexity [11, 12]. This literature also identifies measures of data bindings
, module cohesion and coupling [3, 25], and summation across structure diagrams .
These ideas correspond directly to the variable, method, and object levels of the proposed model.
Abstraction of these concepts to the system level is also reasonable based on software measures
developed for structured systems [11, 12].
Finally, a study of the perceptions of OO software complexity held by users of OO techniques was
undertaken. Seven graduate students in information systems who had completed a course in OO
techniques that required analysis, design, and programming participated in the study. The group
identified 148 concepts that they believed contributed to the complexity of OO systems.
Classification of the 148 concepts by the authors, using a content analysis approach [17, 30],
identified 11 concepts that contained a reference to the variable level, 17 concepts that contained
a reference to the method level, 41 concepts that contained a reference to the object level, and
13 concepts that contained a reference to the system level.
From the 148 concepts the group identified 10 categories that organized the concepts by
similarity. The categories were then ranked by their importance to OO system complexity.
Group agreement on category importance was moderately high (Kappa coefficient of concordance
K = .70, p = .10). The categories are ranked in order of importance and presented, with their
definitions, in Table 1. Each participant then was asked to place each concept in the category
to which he or she believed it belonged.
High levels of agreement (6 or 7 participants) existed on 40% (59) of the concepts, and
most concepts, 80% (118), were placed in the ten categories by 4 or more participants, showing
a high degree of face validity. Only 4 of the 148 concepts were not assigned to some category
by all participants. The overall Kappa coefficient of agreement for all concept categorizations
is K = .50, indicating a moderate level of agreement about the categorization of the concepts. This
result is significantly different from zero agreement (calculated Z = 69.23, p < .001).
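For readers unfamiliar with agreement coefficients for several raters, the sketch below computes a kappa of the Fleiss type from a table of category counts. The text does not state which kappa variant was used, so this choice is an assumption; the function name and the toy data are hypothetical and are not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss-style kappa for several raters assigning items to categories.

    counts[i][j] is the number of raters who placed item i in category j.
    Every item must be rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Proportion of all assignments that fell in each category.
    category_share = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item: share of rater pairs that agree.
    item_agreement = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(item_agreement) / n_items
    expected = sum(share * share for share in category_share)
    return (observed - expected) / (1 - expected)


# Toy data: 3 raters, 2 items, 2 categories, perfect agreement.
perfect = fleiss_kappa([[3, 0], [0, 3]])
```

A value of 1 indicates complete agreement and 0 indicates agreement no better than chance, which is the scale on which the reported K = .50 is read as moderate.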
Review of the definitions in Table 1 shows that the class design, structure, method design,
and message passing categories are associated with the static representations of OO systems. The class
design and method design categories contain concepts that apply to the variable level; the method
design and message passing categories contain concepts that apply to the method level; the class
design, message passing, and structure categories contain concepts that apply to the object level;
and the structure, class design, and message passing categories contain concepts that apply to the
system level. These categories were ranked in the top half of categories on importance to OO
system complexity (see Table 1).
It should be noted that some categories (i.e., maintenance, project management,
methodology and tools, problem domain, reusability, and solution domain) contain few concepts
that apply to the proposed levels. This is an indication that the group perceived OO system
complexity as consisting of components that involve all aspects of the development process. The
current model involves only those categories that apply to static representations of OO systems,
i.e., designs and code.
Thus, the notion of levels of complexity is supported by the literature on OO systems, the
literature on software measures, and the perceptions of potential users of OO software complexity
measures.
3.1.3. Usefulness of the Levels
The model shows relationships that suggest a rationale
for making complexity tradeoffs between levels to rearrange (and thereby reduce) the complexity
of an OO system. Complexity tradeoffs involve increasing complexity in one level (or measure)
to reduce complexity in another level (or measure). The ability to measure in what part of the
system the complexity inherent to a problem exists is essential to controlling the complexity of
OO systems and the development of complexity tradeoff criteria.
The model provides measures for making two types of complexity tradeoffs throughout
the development of an OO system: design and implementation. Design tradeoffs are associated
with design decisions of the system. Tradeoffs of this type include manipulating object, variable,
and method levels to move complexity from the object level to the variable and method levels
(or vice versa), controlling inter-object measures, and determining the number of abstract versus
concrete classes at the system level. Decisions to have many classes with few methods and
variables or few classes with many methods and variables in each, or to have deep versus broad
inheritance structures are examples of design tradeoffs. These tradeoffs affect the measures,
which should reflect the complexity of the design representations of OO systems.
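To make the design tradeoff concrete, the following Python sketch profiles two alternative designs of the same employee example: a deeper hierarchy of small classes and a single large class. The ClassDesign record, the design_profile function, and the counts it reports are hypothetical illustrations of measures that could be read from a design representation; they are not the measures defined by the model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ClassDesign:
    """Minimal static description of one class in a design (hypothetical)."""
    name: str
    methods: Tuple[str, ...] = ()
    variables: Tuple[str, ...] = ()
    parent: Optional[str] = None
    abstract: bool = False


def design_profile(classes):
    """Summarize size and organization of a design given as ClassDesign records."""
    by_name = {c.name: c for c in classes}

    def depth(cls):
        # Count the links from cls up to the root of its hierarchy.
        levels = 0
        while cls.parent is not None:
            cls = by_name[cls.parent]
            levels += 1
        return levels

    count = len(classes)
    return {
        "classes": count,
        "abstract": sum(c.abstract for c in classes),
        "concrete": sum(not c.abstract for c in classes),
        "methods_per_class": sum(len(c.methods) for c in classes) / count,
        "variables_per_class": sum(len(c.variables) for c in classes) / count,
        "max_depth": max(depth(c) for c in classes),
    }


# Design A: several small classes in a deeper hierarchy.
many_small = [
    ClassDesign("Employee", ("compute_pay",), ("name",), abstract=True),
    ClassDesign("HourlyEmployee", ("compute_pay",), ("hours", "rate"), parent="Employee"),
    ClassDesign("StudentEmployee", ("compute_pay",), ("stipend",), parent="HourlyEmployee"),
]
# Design B: one large class with no hierarchy.
one_large = [
    ClassDesign(
        "Employee",
        ("compute_pay", "compute_hourly_pay", "compute_student_pay"),
        ("name", "kind", "hours", "rate", "stipend"),
    ),
]
profile_a = design_profile(many_small)
profile_b = design_profile(one_large)
```

Design A carries its complexity in the inheritance structure (depth 2, three classes), while Design B carries it inside one object (three methods and five variables), which is the kind of shift between levels that the tradeoff discussion describes.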