A Critique of Cyclomatic Complexity as a Software Metric

Abstract

McCabe's cyclomatic complexity metric (1976) is widely cited as a useful predictor of various software attributes such as reliability and development effort. This critique demonstrates that it is based upon poor theoretical foundations and an inadequate model of software development. The argument that the metric provides the developer with a useful engineering approximation is not borne out by the empirical evidence. Furthermore, it would appear that for a large class of software it is no more than a proxy for, and in many cases is outperformed by, lines of code.
by Martin Shepperd
Fig. 1 Derivation of v(G) for an example program
1 Introduction
The need for some objective measurement of software complexity has long been acknowledged. Two early contributions to this field are Halstead’s ‘software science’ (Ref. 1) and the cyclomatic complexity approach of McCabe (Ref. 2). Both metrics are based upon the premise that software complexity is strongly related to various measurable properties of program code.
Although initially well received by the software engineering community, software science based metrics have been increasingly subject to criticism. Attacks have been made upon the underlying psychological model (Refs. 3 and 4). The soundness of many empirical ‘validations’ has been questioned (Ref. 5) and difficulties noted with counting rules (Ref. 6). The ability of software science metrics to capture program complexity in general would thus appear to be in great doubt.
It is thus rather surprising that the cyclomatic complexity metric has not been subjected to a degree of scrutiny similar to that given to software science. This is particularly the case given the high degree of acceptance of the metric within the software engineering community. It is widely cited (Refs. 7-13), subjected to a ‘blizzard of refinements’ (Refs. 14-22), applied as a design metric (Ref. 23) and described in best-selling textbooks on software engineering (Refs. 24 and 25). Yet there have been comparatively few empirical studies; indeed, as a basic approach, the metric has been allowed to pass relatively unquestioned.
The hypothesis of a simple deterministic relationship between the number of decisions within a piece of software and its complexity is potentially of profound importance to the whole field of software engineering. This requires very careful evaluation.
The rest of the paper reviews the theories put forward by McCabe. Theoretical criticisms of the metric are outlined and the various empirical validations for the metric are reviewed, together with aspects of experimental design. It is concluded that cyclomatic complexity is questionable on both theoretical and empirical grounds. Therefore cyclomatic complexity is of very limited utility.
2 The cyclomatic complexity metric
Given the increasing costs of software development, McCabe considered that a ‘mathematical technique that will provide a quantitative basis for modularisation and allow us to identify software modules that will be difficult to test or maintain’ was required. Use of a lines of code (LOC) metric was rejected since McCabe could see no obvious relationship between length and module complexity. Instead, he suggested that the number of control paths through a module would be a better indicator, particularly as this appeared to be strongly related to testing effort. Furthermore, much of the work on ‘structured programming’ in the early 1970s concentrated on program control flow structures (Refs. 26 and 27).
Software Engineering Journal, March 1988
Unfortunately, the number of paths through any software with a backward branch is potentially infinite. Fortunately, the problem can be resolved by the application of graph theory. The control flow of any procedural piece of software can be depicted as a directed graph, by representing each executable statement (or group of statements where the flow of control is sequential) as a node, and the flow of control as the edges between them. The cyclomatic complexity of a graph is useful because, provided the graph is strongly connected, it indicates the number of basic paths (i.e. linearly independent circuits) contained within the graph, which, when used in combination, can generate all possible paths through the graph or program.
The cyclomatic complexity v of a program graph G is
    v(G) = e - n + 1    (1)
where e is the number of edges, and n is the number of nodes. A strongly connected graph is one for which, given any two nodes r and s, there exist paths from r to s and from s to r. Fig. 1 shows an example derivation of cyclomatic complexity from a simple program and its related control graph. Note that the program graph is made strongly connected by the addition of an edge connecting the END node to the BEGIN node.
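As an illustration of Eqn. 1, the following minimal Python sketch computes v(G) from an edge list after adding the END to BEGIN edge. The flowgraph, the node names and the function name `cyclomatic_number` are hypothetical and are not the program of Fig. 1.

```python
def cyclomatic_number(edges, entry, exit_node):
    """Return v(G) = e - n + 1 after closing the graph with an exit -> entry edge."""
    # Adding the extra edge makes a single-entry, single-exit flowgraph strongly connected.
    closed = set(edges) | {(exit_node, entry)}
    nodes = {node for edge in closed for node in edge}
    return len(closed) - len(nodes) + 1


# Hypothetical flowgraph: a loop ("test") whose body contains an IF/ELSE ("body").
edges = [
    ("BEGIN", "test"), ("test", "body"), ("test", "END"),
    ("body", "then"), ("body", "else"),
    ("then", "join"), ("else", "join"), ("join", "test"),
]

print(cyclomatic_number(edges, "BEGIN", "END"))  # two decisions plus one
```

For a purely sequential program the same function returns 1, whatever the number of statements.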
The process of adding an extra edge to the program graph can be bypassed by adding one to the cyclomatic complexity calculation. The calculation can be generalised for program graphs that contain one or more components, subject to the restriction that each component contains a single entry and a single exit node. For a graph S with a set of connected components the cyclomatic complexity is
    v(S) = e - n + 2p    (2)
where p is the number of connected components. A multi-component program graph is derived if the software contains separate subroutines. This is illustrated in Fig. 2.
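Eqn. 2 can be sketched in the same way. The example below is a hypothetical main program with one IF/ELSE plus a linear subroutine, not the program of Fig. 2. The components are counted with a small union-find, and no END to BEGIN edges are added because the 2p term accounts for them.

```python
def component_count(nodes, edges):
    """Count connected components, ignoring edge direction (union-find)."""
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(node) for node in nodes})


def cyclomatic_multi(edges):
    """Return v(S) = e - n + 2p for a graph with p connected components."""
    nodes = {node for edge in edges for node in edge}
    p = component_count(nodes, edges)
    return len(edges) - len(nodes) + 2 * p


main = [("BEGIN", "a"), ("a", "b"), ("a", "c"), ("b", "END"), ("c", "END")]
sub = [("SUB_BEGIN", "s1"), ("s1", "SUB_END")]

print(cyclomatic_multi(main), cyclomatic_multi(sub), cyclomatic_multi(main + sub))
```

The combined value is the sum of the per-component values, so every additional subroutine contributes at least one.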
As McCabe observed, the calculation reduces to a simple count of conditions plus one. He argued that since a compound condition, for example
    IF X < 1 AND Y < 2 THEN
was a thinly disguised nested IF, then each condition should contribute to module complexity, rather than merely counting predicates (see Figs. 3a and b). Likewise a case statement is viewed as a multiple IF statement (i.e. it contributes n - 1 to v(G), where n is the number of cases).
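The counting rule just described can be written down directly. In this sketch the input representation (a list of condition counts per predicate and a list of case-statement sizes) and the function name are assumptions made for illustration.

```python
def v_from_counts(conditions_per_predicate, case_sizes=()):
    """Conditions plus one, with each n-way case statement contributing n - 1."""
    from_conditions = sum(conditions_per_predicate)
    from_cases = sum(n - 1 for n in case_sizes)
    return from_conditions + from_cases + 1


# IF X < 1 AND Y < 2 THEN ...  -> one predicate made of two conditions
print(v_from_counts([2]))

# The same IF plus a four-way case statement
print(v_from_counts([2], case_sizes=[4]))
```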
McCabe saw a practical application of the metric in using it to provide an upper limit to module complexity, beyond which a module should be subdivided into simpler components. A value of v(G) ≤ 10 was suggested, although he accepted that in certain situations, notably large case structures, the limit might be relaxed.
3 Theoretical considerations
The counting rules for different control statements have been the subject of some controversy. Myers (Ref. 19) has argued that a complexity interval is a more effective measure of complexity than a simple cyclomatic number. The interval has a lower bound of decision statement count (i.e. predicate count) plus one and an upper bound of individual condition count plus one.
Fig. 2 Derivation of v(S) for a program with a subroutine
Myers used the following three examples to support his modified form of the cyclomatic complexity metric:
    IF X=0 THEN ... ELSE ...;    v(G) = 2    Myers = (2:2)
    IF X=0 AND Y>1 THEN ... ELSE ...;    v(G) = 3    Myers = (2:3)
    IF X=0 THEN IF Y>1 THEN ... ELSE ...; ELSE ...;    v(G) = 3    Myers = (3:3)
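The three intervals quoted above follow mechanically from the definition of the bounds. A minimal sketch, assuming each decision statement is described only by the number of individual conditions in its predicate (the function name is hypothetical):

```python
def myers_interval(conditions_per_predicate):
    """Return (lower, upper): predicate count plus one, condition count plus one."""
    lower = len(conditions_per_predicate) + 1
    upper = sum(conditions_per_predicate) + 1
    return lower, upper


print(myers_interval([1]))     # one IF with a single condition
print(myers_interval([2]))     # one IF with a compound (two-condition) predicate
print(myers_interval([1, 1]))  # two nested IFs, one condition each
```

The upper bound coincides with v(G) as McCabe counts it, so the interval only adds information when compound conditions are present.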
His argument is that it is intuitively obvious that the third example is more complex than the second, a distinction not made by the cyclomatic number. The idea underlying his modification appears to be that there is more potential for inserting additional ELSE clauses into a program with a larger number of IF statements. They are not counted by the McCabe metric, as is demonstrated by the following two program fragments, both of which have cyclomatic complexities of 2:
    IF X<1 THEN ...;    v(G) = 2
    IF X<1 THEN ... ELSE ...;    v(G) = 2
Fig. 3 Compound condition treated as a single decision and as separate decisions: (a) treated as a single decision; (b) treated as separate decisions
Since Myers’ complexity interval does not directly count ELSE statements it is arguable whether it represents much of an improvement over McCabe’s metric. However, the criticism of cyclomatic complexity remains, in that it fails to distinguish between selections with and without ELSE branches. From the standpoint of psychological complexity this is significant; however, since the number of basic paths remains unaltered, testing difficulty may not increase. Thus the failure of cyclomatic complexity to count ELSE branches is only a serious deficiency if the metric is intended to capture complexity of comprehension.
The treatment of case statements has also been subject to disagreement. Hansen (Ref. 15) has suggested that since they were easier to understand than the equivalent nested IFs they should only contribute one to the module complexity. Other researchers (Ref. 28) have suggested a log2(n) relationship, where n is the number of cases.
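The competing treatments of an n-way case statement can be set side by side. In this sketch the rule names are labels invented for illustration, and the logarithmic rule is assumed to be exactly log2(n); the source only says that such a relationship has been suggested.

```python
import math


def case_contribution(n, rule):
    """Contribution of an n-way case statement to module complexity under each rule."""
    if rule == "mccabe":   # treated as a multiple IF: n - 1
        return n - 1
    if rule == "hansen":   # a case statement counts once, whatever its size
        return 1
    if rule == "log2":     # assumed form of the logarithmic suggestion
        return math.log2(n)
    raise ValueError(f"unknown rule: {rule}")


for rule in ("mccabe", "hansen", "log2"):
    print(rule, case_contribution(8, rule))
```

For an eight-way case statement the three rules give 7, 1 and 3, which shows how far the variants diverge as the number of cases grows.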
Evangelist (Ref. 29) also encountered anomalies in the application of counting rules. Much of the difficulty stems from the fact that McCabe was originally thinking in terms of Fortran, whereas most of these difficulties arise from other languages, some of them more recent, such as Ada†. Here one has to contend with problems such as distinguishing between ‘IF y = 1 OR y = 3’ and ‘IF y = 0 OR ELSE x/y > 1’. The mapping from code to a program graph is ambiguous.
Another area of controversy is that v = 1 will remain true for a linear sequence of any length. Since the metric is insensitive to complexity contributed from linear sequences of statements, several researchers have suggested modifications to the simple use of cyclomatic complexity. Hansen has proposed a 2-tuple of cyclomatic complexity and operator count (defined to be arithmetical operators, function and subroutine calls, assignments, input and output statements and array subscription). Unfortunately, as Baker and Zweben (Ref. 30) point out, this approach does suffer from the problem of ‘comparing apples and oranges’. It is not clear how to rank in order of complexity the 2-tuples (i, j) and (l, k) where i > l and k > j.
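The ranking problem can be made concrete. Under a component-wise ordering, which is an assumption made here for illustration and not a rule proposed in the source, two pairs are comparable only when one is at least as large as the other in both positions.

```python
def compare(a, b):
    """Return '<', '>' or '=' for comparable pairs, or None if they cannot be ranked."""
    if a == b:
        return "="
    if all(x <= y for x, y in zip(a, b)):
        return "<"
    if all(x >= y for x, y in zip(a, b)):
        return ">"
    return None  # each pair is larger in one component


print(compare((2, 5), (3, 9)))    # both components agree
print(compare((5, 10), (3, 40)))  # i > l but k > j: no ranking
```

Any total ordering of such pairs has to weight one component against the other, which is exactly the ‘apples and oranges’ difficulty.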
Stetter (Ref. 21) suggests an alternative approach to this particular problem in the form of a cyclomatic flow complexity metric. Flow of data is considered in addition to flow of control. Complexity will generally increase with an increase in length of a linear sequence of statements since more data references will almost invariably be made.
A further objection to the cyclomatic complexity metric is its behaviour towards the structuring of software. A number of researchers (Refs. 30-33) argue that the cyclomatic complexity can increase when applying generally accepted techniques to improve program structure. Certainly the metric is insensitive to the use of unstructured techniques such as jumping in and out of loops, since all that is captured is the number of decisions plus one. Evangelist (Ref. 34) reports that the application of only 2 out of 26 of Kernighan and Plauger’s rules of good programming style (Ref. 35) invariably results in a decrease in cyclomatic complexity.
A development of the unstructured/structured argument is the objection that the metric ignores the context or environment of a decision. All decisions have a uniform weight, regardless of depth of nesting or relationship with other decisions. The complexity of a decision cannot be considered in isolation, but must take into account other decisions within its scope. This has resulted in variants of cyclomatic complexity which allow for nesting depth (Refs. 18, 32 and 36).
It is worth noting that all counting rule variants to the metric are based upon arguments along the lines that it is intuitively obvious that one example is more complex than another and therefore an adjustment must be made to the counting rules. Such arguments are based upon issues of cognitive complexity or ‘perplexity’ (Ref. 37), which is only one view of software complexity. Difficulty of testing is another aspect of software complexity and one with which McCabe was primarily concerned. These different interpretations of cyclomatic complexity have significant implications upon the validation and application of the metric.
A more fundamental objection to cyclomatic complexity is the inconsistent behaviour when measuring modularised software. As Eqn. 2 indicates, v(G) is sensitive to the number of subroutines within a program, because McCabe suggests that these should be treated as unconnected components within the control graph. This has the bizarre result of increasing overall complexity as a program is divided into more, presumably simpler, modules. In general, the complexity v of a program P’ will be:
    v(P’) = v(P) + i    (3)
where P is the equivalent program to P’ but with a single component, and i is the number of modules or subroutines used by P’ (i.e. the number of graph components - 1).
† Ada is a trademark of the US Government Ada Joint Program Office.
However, the relationship is further complicated by the observation that graph complexity may be reduced in a situation where modularisation eliminates code duplication. Thus
    v(P’) = v(P) + i - \sum_{j=1}^{i} (v_j - 1)(u_j - 1)    (4)
where v_j is the complexity of the jth module or subroutine, and u_j is the number of times the jth module is called.
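A small worked example may help in reading Eqn. 4. The sketch below assumes each module is given as a pair (v_j, u_j) and that v(P) is the complexity of the equivalent single-component program; the numbers and the function name are made up for illustration.

```python
def modularised_complexity(v_p, modules):
    """Eqn. 4: v(P') = v(P) + i - sum((v_j - 1) * (u_j - 1)) over the i modules."""
    i = len(modules)
    saving = sum((v_j - 1) * (u_j - 1) for v_j, u_j in modules)
    return v_p + i - saving


v_p = 10  # hypothetical complexity of the single-component program P

# A linear subroutine (v_j = 1) called three times: nothing is saved, one is added.
print(modularised_complexity(v_p, [(1, 3)]))

# A subroutine with v_j = 4 called three times: two duplicated copies of three decisions vanish.
print(modularised_complexity(v_p, [(4, 3)]))
```

When every module is called exactly once the sum is zero and the expression reduces to Eqn. 3.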
To summarise, general program complexity increases with the addition of extra modules but decreases with the factoring out of duplicate code. All other aspects of modularity are disregarded. If one were to be prescriptive on the basis of Eqn. 4, it would be to modularise only when a fragment of non-linear code (i.e. containing decisions) could be factored out. As a model with which to view general software complexity, this appears unacceptable.
Three classes of theoretical objection have been presented. First, there is the issue of the very simplistic approach to decision counting. Ease of program comprehension is unlikely to be completely orthogonal to software complexity. The ease of comprehending a decision is not invariant, and thus a constant weighting of one seems inappropriate. Secondly, the metric appears to be independent of generally accepted program structuring techniques. Since these are intended to reduce complexity this does not exactly inspire confidence. Thirdly, and most importantly, is the arbitrary impact of modularisation upon total program complexity. As a measure of inter-modular complexity, in other words for all non-trivial software, cyclomatic complexity would seem unsatisfactory on theoretical grounds.
4 Empirical validation of the metric
Many early validations of the metric were merely based upon intuitive notions of complexity. For example, McCabe states that ‘the complexity measure v is designed to conform to our intuitive notion of complexity’ (Ref. 2). Hansen (Ref. 15) argues that a good measure of program complexity should satisfy several criteria, including that of relating ‘intuitively to the psychological complexity of programs’. He does not suggest that there is a need for any objective validation. Likewise, Myers (Ref. 19) treats intuition as sufficient grounds for employing the metric.
This seems a rather curious approach: if intuition is a reliable arbiter of complexity this eliminates the need for a quantitative measure. On the other hand, if intuition cannot be relied upon, it hardly provides a reasonable basis for validation. Clearly a more objective approach to validation is required.
The theoretical objections to the metric, that it ignores other aspects of software such as data and functional complexity, are not necessarily fatal. It is easy to construct certain pathological examples, but this need not invalidate the metric if it is possible to demonstrate that in practice it provides a useful engineering predictor of factors that are associated with complexity. Researchers have usually taken these to include effort involved in testing and maintenance, error incidence and ability to recall code.
A number of empirical studies have been carried out. These are summarised in Table 1. A difficulty that arises in interpretation of many of these studies is that there is no explicit hypothesis being evaluated. Two possible a posteriori hypotheses with which to examine the empirical work are as follows:
• Hypothesis 1: Total program cyclomatic complexity can be used to predict various useful software characteristics (for example development time, incidence of errors and program comprehension).
• Hypothesis 2: Programs comprising modules with low v(G) (< 10) are easier to test and maintain than those for which this is not the case (McCabe’s original hypothesis).
As Table 1 indicates, the results of various empirical validation studies do not give a great deal of support to either hypothesis. In general the results are not very compelling, either at the program level (hypothesis 1) or for the studies
Table 1
(study name missing) P = 0.47
(study name missing) P = 0.41, 0.81, 0.79
Davis (Ref. 43): r is -ve, +ve
Feuer (Ref. 44): P = 0.90***
Gaffney (Ref. 45): P = 0.60
Henry (Ref. 46): P = 0.84***; P = 0.92***
Kitchenham (Ref. 47): F = 0.86, 0.88; r2 = 0.46, 0.49, 0.21***
Paige (Ref. 48): a = 0.90
Schneiderman (Ref. 49): r2 = 0.61***; P = 0.32*****
Shen (Ref. 50): P = 0.78***
Sheppard (Ref. 51): r2 = 0.79; P = 0.38; r = 0.35
Sunohara (Ref. 52): r* = 0.4, 0.38; P = 0.72, 0.7
Wang (Ref. 53): P = 0.62; P = 0.59
Woodfield (Ref. 54): P = 0.26; R = 0
Woodward (Ref. 22): r2 = 0.90
P = Pearson moment
R = indirect error count (i.e. version count), or program change count
* r was ‘improved’ … for potentially ‘aberrant’ results
** correlated with Mead’s token count
using log-log transformations
that deal with individual modules (hypothesis 2), such as Basili and Perricone (Ref. 38). The major exception is the Henry et al. (Ref. 46) study of 165 procedures from the UNIX‡ operating system, where the results show a strong correlation between v(G) and module error rates. This result may be slightly artificial since they appear to have filtered out all error-free modules.
Based upon the observation that large modules tend to contain more errors than small modules, the Basili and Perricone (Ref. 38) study uses error density (i.e. errors per thousand LOC) as a size-normalised metric of software error-proneness. Their rather surprising finding was that error density diminishes with increasing cyclomatic complexity. Work by Shen et al. (Ref. 50) gives support to this result, although there is disagreement as to whether error density is an appropriate means of size normalisation since module size and error density do not appear to be independent. Nevertheless, this strongly underlines the deficiency of a simple intra-modular complexity metric.
The clearest result from the empirical studies is the strong relationship between cyclomatic complexity and LOC. Even in the study of Henry et al. there appears to be a fairly strong association. Ironically it was the ‘inadequacy’ of LOC as a module complexity metric that led to McCabe proposing cyclomatic complexity as an alternative. A considerable number of studies (Refs. 41, 47, 48, 53 and 55) indicate that LOC actually outperforms cyclomatic complexity.
The most reasonable inference that can be drawn from the above studies is that there exists a significant class of software for which v(G) is no more than a proxy for LOC. A suggestion of Henry et al. (Ref. 46) that software can be characterised as either decision or computation bound could have a considerable bearing upon interpretation of empirical studies. In cases of decision-bound software such as UNIX, v(G) will closely correspond to LOC. In computation-bound software, with sizeable portions of linear code, this correspondence will be very marginal, and possibly accounts substantially for the erratic results of Table 1.
An interesting development of this point has been made by Humphreys (Ref. 56), who argues that there exists a trade-off between decision or control flow complexity and data structure complexity. One such example is the use of decision tables to replace multiple IF or CASE statements (a common technique in systems programming). The consequence of this is that the cyclomatic complexity for the decision table solution will be substantially lower than for the alternative solution. Yet, he argues, the two pieces of software appear to have similar complexities. More significantly, they will require a similar amount of testing effort since they have the same number of boundary conditions to contend with. Thus the claimed association between testing difficulty and v(G) in many cases is distinctly tenuous. The suggestion has been made (Ref. 57) that this is due to McCabe’s ambiguous mapping function of program control flow to a program graph. Either way it does not bode well for cyclomatic complexity as a predictor of testing effort.
‡ UNIX is a trademark of AT&T Bell Laboratories.
Most of the studies reported above place reliance upon obtaining high correlation coefficients. Use of Pearson’s product moment, which is the most widely used correlation coefficient in the studies above, requires the assumption that the data is drawn from a population with a roughly normal distribution. This creates a particular problem when examining module error rates. The impossibility of a negative error count results in a pronounced skew in the error distribution. This skew can be reduced by various transformation techniques, for instance by using the square root or logarithm. Studies such as Refs. 40 and 46 would be more meaningful if one of these techniques were applied to obtain a more normal distribution so we could place a higher degree of confidence in the correlation coefficients produced.
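The effect of the two transformations on skew can be sketched with a handful of made-up error counts (the data below are invented for illustration and come from none of the cited studies). Skewness is taken here as the third standardised moment, and log(x + 1) is used instead of log(x) as an assumption to cope with error-free modules.

```python
import math


def skewness(xs):
    """Third standardised moment (population form)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Invented per-module error counts: many small values and one very error-prone module.
errors = [0, 0, 0, 1, 1, 2, 3, 5, 8, 40]
rooted = [math.sqrt(x) for x in errors]
logged = [math.log(x + 1) for x in errors]

for name, data in (("raw", errors), ("sqrt", rooted), ("log", logged)):
    print(name, round(skewness(data), 2))
```

On these counts both transformations reduce the skew, the logarithm more strongly than the square root.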
There are two alternative empirical approaches; both have considerable difficulties associated with them. The first is large-scale passive observation, where the researcher has little if any influence. The second is more carefully controlled experimentation, which out of practical necessity tends to be very small scale; see for example Refs. 22, 41, 42 and 53.
Large-scale passive observation is based upon the notion that the variance introduced into the study from uncontrolled factors such as differences in individual ability, task difficulty and differing environments is compensated by the large sample size involved. Problems include the difficulty of obtaining accurate measurements: the results of Ref. 39 showed some improvement when restricted to data validated by various cross-checks. More significant is the problem of variation in individual ability (Ref. 58). Brooks (Ref. 59) suggests that differences in ability of up to 25 to 1 between individuals from similar backgrounds make it very difficult to obtain statistically significant results.
The second approach, as typified by Ref. 54, is more carefully controlled since the timescales and number of subjects are relatively small. Here measurements are potentially more accurate; however, variance from external factors is still a major difficulty. Use of within-subject experimental design is a partial solution, although it does not address a number of factors, such as the subject’s familiarity with the problem and the comparability of tasks. The small size of tasks being undertaken is another problem area; frequently programs of less than 300 LOC (Refs. 43, 44, 53 and 54) are used. These programs are, by software engineering standards, trivial. In such situations the onus is upon the researcher to demonstrate that results at a small scale are equally applicable for large systems. Such a finding would be counter to current directions in software engineering.
To summarise, many of the empirical validations of McCabe’s metric need to be interpreted with caution. First, the use of correlation coefficients on skewed data causes artificially high correlations. Secondly, the assumption of causality would seem doubtful given the consistently high association between cyclomatic complexity and LOC. Thirdly, the high variation in programmer ability reduces the statistical significance of correlation coefficients.
However, despite the above reservations, some trends in the results are apparent. The strong association between LOC and cyclomatic complexity gives the impression that the latter may well be no more than a proxy for the former. The ability of v(G) to predict error rates, development time and program recall is quite erratic. Most damning is the outperforming of v(G) by a straightforward LOC metric in over a third of the studies considered.
5 Conclusions
A severe difficulty in evaluating McCabe’s metric and associated empirical work is the lack of an explicit model upon which cyclomatic complexity is based. The implicit model appears to be that the decomposition of a system into suitable components (or modules) is the key issue. The decomposition should be based upon ease of testing individual components. Testing difficulty is entirely determined by the number of basic paths through a program’s flowgraph.
Unfortunately, and perhaps not surprisingly, different investigators have interpreted cyclomatic complexity in a variety of ways. For example, some studies treat cyclomatic complexity at a program level by summing individual module complexities (Ref. 54), while others consider complexity purely at a module level (Ref. 46). Naturally this state of affairs does not facilitate the comparison of results.
An important distinction is made between intra- and inter-modular complexity. Eqns. 3 and 4 suggest that cyclomatic complexity is rather suspect in the latter area. Thus the only possible role for cyclomatic complexity is as an intra-modular complexity metric. Even this is made to look doubtful in the light of the work of Basili and Perricone. In any case, many researchers (Ref. 60) would argue that the problem of how to modularise a program is better resolved by considerations of ‘coupling’ and ‘cohesion’ (i.e. inter-modular complexity), which are not adequately captured by the metric.
As noted earlier, most of the empirical work has relied upon obtaining high correlation coefficients to substantiate McCabe’s metric. However, a high correlation coefficient between two variables does not necessarily imply causality, as illustrated by the well known, if slightly apocryphal, example of the spatial distribution of ministers of religion and prostitutes! Setting aside quibbles of experimental methodology (Refs. 59 and 61), the fundamental problem remains that without an explicit underlying model the empirical ‘validation’ is meaningless and there is no hypothesis to be refuted.
Even if we disregard all the above problems and accept the correlation coefficients at face value, the results are distinctly erratic. Cyclomatic complexity fails to convince as a general software complexity metric. This impression is strengthened by the close association between v(G) and LOC and the fact that for a significant number of studies LOC outperforms v(G).
The majority of modifications to McCabe’s original metric remain untested. To what extent do validations of cyclomatic complexity impinge upon these modified metrics, many of which appear to be very minor variants? Prather (Ref. 32), in an attempt to provide some unifying framework, suggests a set of axioms which a ‘proper’ complexity metric must satisfy:
• Axiom 1: The complexity of the whole must not be less than the sum of the complexities of the parts.
• Axiom 2: The complexity of a selection must be greater than the sum of all the branches (i.e. the predicate must contribute complexity).
• Axiom 3: The complexity of an iteration must be greater than the iterated part (for the same reason as axiom 2).
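One way to read the three axioms is as checks over a structured program tree. The sketch below is an interpretation made for illustration, not Prather’s formalism: the node classes, the two candidate metrics and the reading of ‘parts’ as the immediate components of a sequence are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stmt:
    """A simple (non-control) statement."""


@dataclass
class Seq:
    parts: List[object]


@dataclass
class Select:
    branches: List[object]


@dataclass
class Loop:
    body: object


def toy_metric(node):
    """Hypothetical metric: one per statement, plus one per selection or loop predicate."""
    if isinstance(node, Stmt):
        return 1
    if isinstance(node, Seq):
        return sum(toy_metric(p) for p in node.parts)
    if isinstance(node, Select):
        return 1 + sum(toy_metric(b) for b in node.branches)
    return 1 + toy_metric(node.body)  # Loop


def statement_count(node):
    """Hypothetical metric that ignores predicates altogether."""
    if isinstance(node, Stmt):
        return 1
    if isinstance(node, Seq):
        return sum(statement_count(p) for p in node.parts)
    if isinstance(node, Select):
        return sum(statement_count(b) for b in node.branches)
    return statement_count(node.body)  # Loop


def satisfies_axioms(metric, node):
    """Check axioms 1 to 3 at this node and recursively at every sub-node."""
    if isinstance(node, Seq):
        ok = metric(node) >= sum(metric(p) for p in node.parts)    # axiom 1
        return ok and all(satisfies_axioms(metric, p) for p in node.parts)
    if isinstance(node, Select):
        ok = metric(node) > sum(metric(b) for b in node.branches)  # axiom 2
        return ok and all(satisfies_axioms(metric, b) for b in node.branches)
    if isinstance(node, Loop):
        ok = metric(node) > metric(node.body)                      # axiom 3
        return ok and satisfies_axioms(metric, node.body)
    return True


program = Seq([Stmt(), Select([Stmt(), Stmt()]), Loop(Seq([Stmt(), Stmt()]))])
print(satisfies_axioms(toy_metric, program))       # predicates contribute
print(satisfies_axioms(statement_count, program))  # predicates ignored
```

The second metric fails because a selection then weighs no more than its branches, which is what axiom 2 rules out.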
Although an interesting idea, a number of problems remain. First, the axioms are limited to structured programs. Secondly, the axioms provide very little constraint upon the imaginations of software complexity metrics designers. Thirdly, the axioms, however reasonable, are based purely upon arguments of intuition. This is particularly the case for Prather’s suggestion of an upper bound of twice the lower bound for axioms 2 and 3. Finally, the underlying model is incomplete, inasmuch as there are no connections with observable events in the software development process.
This axiomatic approach has been further developed (Refs. 30, 62 and 63) such that any program may be reduced into a hierarchy of irreducibles (prime trees). The benefits are the removal of subjectivity over the issue of counting rules and the ability to draw comparisons between different metrics. Still unresolved are the problems of using intuition when deriving actual complexity values from different irreducibles and the construction of a complete model of the relevant world for a complexity metric. The difficulty is, of course, that the ‘real world’ is not entirely formal, in the sense that we cannot model it with precise mathematical relationships. The best we can hope for is engineering approximations.
It is arguable that the search for a general complexity metric based upon program properties is a futile task. Given the vast range of programmers, programming environments, programming languages and programming tasks, to unify them into the scope of a single complexity metric is an awesome task. A more fruitful approach might be to derive metrics from the more abstract notations and concepts of software designs. This would have the additional advantage that design metrics are available earlier on in the software development process.
For a software complexity metric to be treated seriously by the software engineering community, considerably more emphasis must be placed on the validation process. It may well be ‘intellectually very appealing’ (Ref. 22) but this is insufficient. Following from the suggestion (Ref. 55) that the LOC metric be regarded as a ‘baseline’ for the evaluation of metrics, there must exist considerable doubts about the utility of McCabe’s cyclomatic complexity metric.
6 Acknowledgments
The author would like to thank Prof. Darrel Ince of the Open University, Milton Keynes, England, for the many useful suggestions and kind help he has given during the preparation of this paper. He would also like to record his thanks towards the referee who provided constructive criticism and a number of additional insights.
7 References
1
HALSTEAD, M.H.: ‘Elements of software science’ (North-
Holland, 1977)
2 McCABE, T.J.:
‘A
complexity measure’,
I€€€
Transactions
on
Software Engineering, 1976,
2,
(4). pp. 308
-
320
3
COULTER,
N.S.:
‘Software science and cognitive psychology’,
/€E€
Transactions on Software Engineering, 1983,
9,
(2), pp.
4 SHEN, V.Y., CONTE,
S.D.,
and DUNSMORE, H.E.: ‘Software
science revisited: a critical analysis of the theory and its em-
pirical support’,
/€€E
Transactions on Software Engineering,
1983,
9,
(2), pp. 155
-
165
5
HAMER,
PG.,
and FREWIN, G.D.:
‘M.H.
Halstead’s software
science
-
a critical examination’. Proceedings of Sixth Inter-
national Conference on Software Engineering, Tokyo, Japan,
1982
6 LASSEZ, J.L.,
VAN
DER KNIJFF,
D.,
SHEPHERD,
J.,
and
LASSEZ, C.:
‘A
critical examination of software science’, Journal
of Systems
&
Software, 1981,
2,
pp.
105
-
112
7 ARTHUR, L.J.: ‘Measuring programmer productivity and soft-
ware quality’ (Wiley-lnterscience, 1985)
8
COBB, G.W.:
‘A
measurement of structure for unstructured
languages’. Proceedings of ACM SIGMETRICSISIGSOFT
Software Quality Assurance Workshop, 1978
9 DE MARCO, T.: ‘Controlling software projects: management,
measurement and estimation’ (Yourdon Press, 1982)
10
DUNSMORE,
H.E.:
‘Software metrics: an overview of an evolving
methodology’, lnformation Processing
&
Management, 1984,
11
HARRISON, W., MAGEL, K., KLUCZNY, R., and DE KOCK, A.:
’Applying software complexity metrics to program mainten-
ance’, Computer, 1982,
15,
Sept., pp. 65
-
79
12
SCHNEIDEWIND, N.F.: ‘Software metrics for aiding program
development and debugging’. National Computer Conference,
New York,
NY,
USA,
Jun. 1979, AFIPS Conference Proceedings
13
TANIK, M.M.:
‘A
comparison
of
program complexity prediction
models’,
SlGSOFT
Software Engineering Notes, 1980,
5,
(4),
14 CURTIS,
B.:
’Software metrics: guest editor’s introduction’,
lEEE
Transactions on Software Engineering, 1983,
9,
(6),
pp. 637
-
638
15
HANSEN, W.J.: ‘Measurement of program complexity by the
pair (cyclomatic number, operator count)’, SlGPLAN Notices,
166
-
171
20,
(1
-
2), pp.
183
-
192
Vol. 48, pp. 989
-
994
pp.
10
-
16
1978,
13,
(3), pp. 29
-
33
35
16 HARRISON, W., and MAGEL, K.: ‘A complexity measure based on nesting level’, SIGPLAN Notices, 1981, 16, (3), pp. 63-74
17 IYENGAR, S.S., PARAMESWARAN, N., and FULLER, J.: ‘A measure of logical complexity of programs’, Computer Languages, 1982, 7, pp. 147-160
18 MAGEL, K.: ‘Regular expressions in a program complexity metric’, SIGPLAN Notices, 1981, 16, (7), pp. 61-65
19 MYERS, G.J.: ‘An extension to the cyclomatic measure of program complexity’, SIGPLAN Notices, 1977, 12, (10), pp. 61-64
20 OVIEDO, E.: ‘Control flow, data flow and program complexity’. Proceedings of COMPSAC 80 Conference, Buffalo, NY, USA, Oct. 1980, pp. 146-152
21 STETTER, F.: ‘A measure of program complexity’, Computer Languages, 1984, 9, (3), pp. 203-210
22 WOODWARD, M.R., HENNELL, M.A., and HEDLEY, D.A.: ‘A measure of control flow complexity in program text’, IEEE Transactions on Software Engineering, 1979, 5, (1), pp. 45-50
23 HALL, N.R., and PREISER, S.: ‘Combined network complexity measures’, IBM Journal of Research & Development, 1984, 28, (1), pp. 15-27
24 PRESSMAN, R.S.: ‘Software engineering. A practitioner’s approach’ (McGraw-Hill, 1987), Second Edition
25 WIENER, R., and SINCOVEC, R.: ‘Software engineering with Modula-2 and Ada’ (Wiley, 1984)
26 DAHL, O.J., DIJKSTRA, E.W., and HOARE, C.A.R.: ‘Structured programming’ (Academic Press, 1972)
27 DIJKSTRA, E.W.: ‘Goto statement considered harmful’, Communications of ACM, 1968, 18, (8), pp. 453-457
28 BASILI, V.R., and REITER, R.W.: ‘Evaluating automatable measures of s/w development’. Proceedings of IEEE Workshop on Quantitative Software Models, 1979, pp. 107-116
29 EVANGELIST, W.M.: ‘Relationships among computational, software and intuitional complexity’, SIGPLAN Notices, 1983, 18, (12), pp. 57-59
30 BAKER, A.L., and ZWEBEN, S.: ‘A comparison of measures of control flow complexity’, IEEE Transactions on Software Engineering, 1980, SE-6, (6), pp. 506-511
31 OULSNAM, G.: ‘Cyclomatic numbers do not measure complexity of unstructured programs’, Information Processing Letters, 1979, 8, pp. 207-211
32 PRATHER, R.E.: ‘An axiomatic theory of software complexity metrics’, Computer Journal, 1984, 27, (4), pp. 340-347
33 SINHA, P.K., JAYAPRAKASH, S., and LAKSHMANAN, K.B.: ‘A new look at the control flow complexity of computer programs’, in BARNES, D., and BROWN, P.: ‘Software engineering 86’. Proceedings of BCS-IEE Software Engineering 86 Conference, Southampton, England, Sept. 1986 (Peter Peregrinus, 1986), pp. 88-102
34 EVANGELIST, W.M.: ‘Software complexity metric sensitivity to program structuring rules’, Journal of Systems & Software, 1982, 3, pp. 231-243
35 KERNIGHAN, B.W., and PLAUGER, P.: ‘The elements of programming style’ (McGraw-Hill, 1978)
36 PIOWARSKI, P.: ‘A nesting level complexity measure’, SIGPLAN Notices, 1982, 17, (9), pp. 40-50
37 WHITTY, R.W., and FENTON, N.E.: ‘The axiomatic approach to systems complexity’, in ‘Designing for system maturity’. Pergamon Infotech State of the Art Report (Pergamon Press, 1985)
38 BASILI, V.R., and PERRICONE, B.T.: ‘Software errors and complexity: an empirical investigation’, Communications of ACM, 1983, 27, (1), pp. 42-52
39 BASILI, V.R., SELBY, R.W., and PHILLIPS, T.Y.: ‘Metric analysis and data validation across Fortran projects’, IEEE Transactions on Software Engineering, 1983, SE-9, (6), pp. 652-663
40 BOWEN, J.: ‘Are current approaches sufficient for measuring software quality?’. Proceedings of Software Quality Assurance Workshop, 1978, pp. 148-155
41 CURTIS, B. et al.: ‘Measuring the psychological complexity of software maintenance tasks with the Halstead and McCabe metrics’, IEEE Transactions on Software Engineering, 1979, SE-5, (2), pp. 96-104
42 CURTIS, B., SHEPPARD, S.B., and MILLIMAN, P.: ‘Third time charm: stronger prediction of programmer performance by software complexity metrics’. Proceedings of Fourth IEEE International Conference on Software Engineering, New York, NY, USA, 1979
43 DAVIS, J.S.: ‘Chunks: a basis for complexity measurement’, Information Processing & Management, 1984, 20, (1-2), pp. 119-127
44 FEUER, A.R., and FOWLKES, E.B.: ‘Some results from an empirical study of computer software’. Proceedings of Fourth IEEE International Conference on Software Engineering, Munich, West Germany, pp. 351-355
45 GAFFNEY, J.E.: ‘Program control, complexity and productivity’. Proceedings of IEEE Workshop on Quantitative Software Models for Reliability, 1979, pp. 140-142
46 HENRY, S., KAFURA, D., and HARRIS, K.: ‘On the relationship among three software metrics’, ACM SIGMETRICS Performance Evaluation Review, Spring 1981, pp. 81-88
47 KITCHENHAM, B.A.: ‘Measures of programming complexity’, ICL Technical Journal, 1981, 2, (3), pp. 298-316
48 PAIGE, M.: ‘A metric for software test planning’. Proceedings of COMPSAC 80 Conference, Buffalo, NY, USA, Oct. 1980, pp. 499-504
49 SCHNEIDERMAN, N.F.: ‘An experiment in software error data collection and analysis’, IEEE Transactions on Software Engineering, 1979, SE-5, (3), pp. 276-286
50 SHEN, V.Y., YU, T.-J., THEBAUT, S.M., and PAULSEN, L.R.: ‘Identifying error-prone software - an empirical study’, IEEE Transactions on Software Engineering, 1985, SE-11, (4), pp. 317-323
51 SHEPPARD, S.B., CURTIS, B., MILLIMAN, P., BORST, M.A., and LOVE, T.: ‘First-year results from a research program on human factors in software engineering’. National Computer Conference, New York, NY, USA, Jun. 1979, AFIPS Conference Proceedings Vol. 48, pp. 1021-1027
52 SUNOHARA, T., TAKANO, A., UEHARA, K., and OHKAWA, T.: ‘Program complexity measure for software development management’. Proceedings of Fifth IEEE International Conference on Software Engineering, San Diego, CA, USA, March 1981, pp. 100-106
53 WANG, A.S., and DUNSMORE, H.E.: ‘Back-to-front programming effort prediction’, Information Processing & Management, 1984, 20, (1-2), pp. 139-149
54 WOODFIELD, S.N., SHEN, V.Y., and DUNSMORE, H.E.: ‘A study of several metrics for programming effort’, Journal of Systems & Software, 1981, 2, pp. 97-103
55 BASILI, V.R., and HUTCHENS, D.H.: ‘An empirical study of a syntactic complexity family’, IEEE Transactions on Software Engineering, 1983, SE-9, (6), pp. 664-672
56 HUMPHREYS, R.A.: ‘Control flow as a measure of program complexity’. UK Alvey Programme Software Reliability and Metrics Club Newsletter 4, 1986, pp. 3-7
57 WHITTY, R.: ‘Comments on “Control flow as a measure of program complexity”’. UK Alvey Programme Software Reliability and Metrics Club Newsletter 5, 1987, pp. 1-2
58 SCHNEIDER, G.M., SEDLMEYER, R.L., and KEARNEY, J.: ‘On the complexity of measuring software complexity’. National Computer Conference, Chicago, IL, USA, May 1981, AFIPS Conference Proceedings Vol. 50, pp. 317-322
59 BROOKS, R.E.: ‘Studying programmer behaviour experimentally: the problems of proper methodology’, Communications of ACM, 1980, 23, (4), pp. 207-213
60 STEVENS, W.P., MYERS, G.J., and CONSTANTINE, L.L.: ‘Structured design’, IBM Systems Journal, 1974, 13, (2), pp. 115-129
61 SAYWARD, F.G.: ‘Experimental design methodologies in software science’, Information Processing & Management, 1984, 20, (1-2), pp. 223-227
62 FENTON, N.E., and WHITTY, R.W.: ‘Axiomatic approach to software metrification through program decomposition’, Computer Journal, 1986, 29, (4), pp. 330-340
63 PRATHER, R.E.: ‘On hierarchical software metrics’, Software Engineering Journal, 1987, 2, (2), pp. 62-65
M. Shepperd is with the School of Computing & Information Technology, Wolverhampton Polytechnic, Wolverhampton WV1 1LY, England, and is also with the Computing Discipline, Faculty of Mathematics, The Open University, Walton Hall, Milton Keynes MK7
Software Engineering Journal, March 1988