Defining and Validating Metrics for Assessing the Maintainability of Entity-Relationship Diagrams
Marcela Genero 1, Geert Poels 2 and Mario Piattini 1

1 ALARCOS Research Group, Department of Computer Science
University of Castilla-La Mancha
Paseo de la Universidad, 4 - 13071 - Ciudad Real (Spain)

2 Department of Management Information, Operations Management and Technology Policy
Faculty of Economics and Business Administration, Ghent University
Hoveniersberg 24, 9000 Ghent (Belgium)

Corresponding author: Geert Poels, Faculty of Economics and Business Administration, Ghent University, geert.poels@UGent.be, tel: +32 9 264 34 97.
Abstract
Database and data model evolution is a significant problem in the highly dynamic business
environment that we experience these days. To support the rapidly changing data requirements
of agile companies, conceptual data models, which constitute the foundation of database design,
should be sufficiently flexible to be able to incorporate changes easily and smoothly. In order to
understand what factors drive the maintainability of conceptual data models and to improve
conceptual modelling processes, we need to be able to assess conceptual data model properties
and qualities in an objective and cost-efficient manner. The scarcity of early available and
thoroughly validated maintainability measurement instruments motivated us to define a set of
metrics for Entity-Relationship (ER) diagrams, which are a relevant graphical formalism of the
conceptual data modelling method. In this paper we show that these objective and easily
calculated metrics, measuring internal properties of ER diagrams related to their structural
complexity, can be used as indirect measures (hereafter called indicators) of the maintainability
of the diagrams. These metrics may replace more expensive, subjective, and hence potentially
unreliable maintainability measurement instruments that are based on expert judgement.
Moreover, these metrics are an alternative to direct measurements that can only be obtained
during the actual process of data model maintenance. Another result is that the validation of the
metrics as early maintainability indicators opens up the way for an in-depth study of structural
complexity as a major determinant of conceptual data model maintainability.
Apart from the definition of a metrics suite, a contribution of this study is the methodological
approach that was followed to theoretically validate the proposed metrics as structural
complexity measures and to empirically validate them as maintainability indicators. This
approach is based both on Measurement Theory and on an experimental research methodology,
stemming mainly from current research in the field of empirical software engineering. In the
paper we specifically emphasize the need to conduct a family of related experiments, improving
and confirming each other, to produce relevant, empirically supported knowledge on the
validity and usefulness of metrics.
Keywords: conceptual data model, entity relationship diagrams, model evolution, model
quality, maintainability, structural complexity, metrics, prediction, theoretical validation,
empirical validation, experimentation
1. Introduction
In today's highly dynamic business environment, existing business
models become obsolete at an ever-increasing pace and must therefore be designed
with flexibility in mind to satisfy the needs of agile companies. The constant reshaping
of business models increases the volatility of the data requirements that must be
supported by companies' data resource management policies and technologies. The
ability to rapidly change databases and their underlying data models to support the
needs of changing business models is a main concern of today's information managers.
Database evolution is a significant problem, high on the research agenda of information systems researchers.
It has been observed recently that conceptual data model quality, which is postulated as
a major determinant of the efficiency and effectiveness of data model evolution, is a
main topic in current conceptual modelling research (Olivé, 2002). In particular, the
quality characteristic of model maintainability, i.e. the ability to easily change a model
(ISO, 2001), seems to be a key factor in conceptual data model evolution. In order to
evaluate and, if necessary, improve the maintainability of conceptual data models, data
analysts need instruments to assess the maintainability characteristics of the models they
are producing. The earlier this assessment can be done, the better, as it has proven
much more economical to evaluate and, if necessary, improve quality aspects during the
development process than afterwards (Boehm, 1981).
Maintainability is, however, an external quality property meaning that it can only be
assessed with respect to some operating environment or context of use (Fenton and
Pfleeger, 1997). The maintainability of a conceptual data model will, for instance,
depend on the model's understandability, which depends in turn on model properties
such as structure, clarity and self-expressiveness, but as well on the data analyst's
familiarity with the model (as for instance evidenced by the software cost drivers
included in the COCOMO II model for software cost estimation (Boehm et al., 1995)).
Likewise, the model's modifiability, another maintainability sub-characteristic, may
depend on factors external to the model, like the type of modifications that must be
applied. Therefore, external qualities such as maintainability are hard to measure
objectively early on in the modelling process. They generally need to be assessed in a
subjective way, for instance using expert opinions expressed through a formal or
informal scoring system.
For a more objective (and more cost-efficient) assessment of external quality attributes
such as maintainability, an indirect measurement based on internal model properties is
required (Fenton and Pfleeger, 1997). If a significant relationship with conceptual data
model maintainability can be demonstrated, then measures of internal model properties
(i.e. “metrics”) can be used as indicators of sufficient or insufficient maintainability.
Once constructed and validated, a measurement-fed maintainability prediction model
can be implemented and employed at relatively low cost compared to an a-posteriori maintainability assessment.
Therefore, the early availability of metrics would allow database designers to perform:
− a quantitative comparison of design alternatives, and therefore an objective selection
between several conceptual data model alternatives with equivalent semantic content;
− an early assessment of conceptual data model maintainability, even during the
modelling activity, and therefore a better resource allocation based on this
assessment (e.g. redesigning high-risk models with respect to maintainability).
In this paper we present a set of metrics for measuring the structural complexity of ER
diagrams and present three controlled experiments we carried out in order to gather
empirical evidence that these metrics are related to ER diagram maintainability. Our
focus on ER modelling is justified by the observation that in today's database design
world, it is still the dominant method of conceptual data modelling (Muller, 1999). Our
focus on structural complexity as an internal model property that is potentially related to
ER diagram maintainability is motivated by similar research in the field of empirical
software engineering, where structural complexity has been shown to be a major
determinant of external software qualities, including maintainability (Li and Henry,
1993; Harrison et al., 2000; Briand et al., 2001a; Fioravanti and Nesi, 2001; Poels and
Dedene, 2001; Bandi et al., 2003). Moreover, as a syntactic model property, the
measurement of ER diagram structural complexity can be automated, which assures the
cost-efficiency, the consistency and the repeatability of the (indirect) maintainability assessment.
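To illustrate what such automated measurement might look like, the following minimal sketch (in Python) computes a few illustrative count-based indicators of structural complexity from a simple in-memory representation of an ER diagram. The representation, the metric names and the example diagram are our own assumptions for illustration purposes only; the metrics actually proposed and validated in this paper are defined in section 4.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Entity:
    name: str
    attributes: List[str] = field(default_factory=list)

@dataclass
class Relationship:
    name: str
    participants: List[str] = field(default_factory=list)  # names of the entities involved
    attributes: List[str] = field(default_factory=list)

@dataclass
class ERDiagram:
    entities: List[Entity] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)

def structural_complexity_metrics(d: ERDiagram) -> Dict[str, int]:
    """Compute a few illustrative count-based structural complexity indicators."""
    return {
        "num_entities": len(d.entities),
        "num_relationships": len(d.relationships),
        "num_attributes": sum(len(e.attributes) for e in d.entities)
                          + sum(len(r.attributes) for r in d.relationships),
        "num_nary_relationships": sum(1 for r in d.relationships if len(r.participants) > 2),
    }

if __name__ == "__main__":
    diagram = ERDiagram(
        entities=[Entity("Employee", ["id", "name"]), Entity("Department", ["code"])],
        relationships=[Relationship("WorksIn", ["Employee", "Department"], ["since"])],
    )
    print(structural_complexity_metrics(diagram))
    # {'num_entities': 2, 'num_relationships': 1, 'num_attributes': 4, 'num_nary_relationships': 0}
```

Because such counts are derived purely from the diagram's syntax, they can be collected at modelling time without any subjective judgement.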
Apart from the proposal of ER diagram metrics to be used as maintainability indicators,
a contribution of this paper is a methodological approach to metric definition, based on
suggestions provided in Briand et al. (1995) and Calero et al. (2001), paying extensive
attention to the validation of the metrics as measures of internal model properties and as
indicators of external quality. For the validation of the metrics as ER diagram structural
complexity measures we use an approach grounded in Measurement Theory that was
proposed by Poels and Dedene (2000) for the theoretical validation of software metrics.
For the validation of the metrics as ER diagram maintainability indicators a series of
three related experiments was conducted in which the relationship between the metrics
values and direct maintainability assessments was statistically analysed for significance.
As far as we know, this is the first extensive empirical investigation of the relationship
between ER diagram maintainability and ER diagram internal properties related to
structural complexity. Apart from being able to assess ER diagram maintainability in
an objective and cost-efficient manner, our study may help towards understanding what
factors make a model maintainable and what factors may prove a hindrance to model
evolution, in particular with respect to maintainability characteristics. From this
understanding, quality-focused model design knowledge may be derived which can then
be incorporated into the modelling language, method and process.
This paper is structured as follows. In section 2 we discuss related work in the field of
metrics-based conceptual data model quality assurance. Our metric definition and
validation methodology is described and motivated in section 3. After proposing our
metrics suite in section 4, we proceed with the theoretical and empirical validation of
the metrics in sections 5 and 6 respectively. Finally, concluding remarks and an outline
of our future work are presented in section 7.
2. Related work
Even though several quality frameworks for conceptual data models have been
proposed (Lindland et al., 1994; Moody and Shanks, 1994; Krogstie et al., 1995;
Schuette and Rotthowe, 1998) most of them do not include quantitative measures (i.e.
metrics) to evaluate the quality of conceptual data models in an objective and cost-
efficient way. Moreover, the metrics that can be found in these frameworks have not
been thoroughly validated. The evidence that they measure the quality properties that
they are purported to measure is scarce.
In order to illustrate this, we will briefly comment on the existing studies related to metrics
for conceptual data models.
Eick (1991) proposed a single quality metric applied to S-diagrams1 (Eick and
Lockemann, 1985; Eick and Raupp, 1991). Three quality aspects, expressiveness,
complexity and normality, are meant to be captured by this metric. To
our knowledge there is no published work confirming the theoretical or empirical
validity of Eick's metric.
Gray et al. (1991) proposed a suite of metrics for evaluating the quality characteristics
of ER diagrams (complexity and deviation from third normal form). These authors
commented that empirical validation of these metrics has been performed, but they do
not provide the results. Independently, Ince and Shepperd have carried out a theoretical
validation of the metrics using the algebraic specification language OBJ (Ince and
Shepperd, 1991) demonstrating the correctness of the underlying syntax of the metrics.
1 The S-diagram is a data model which was influenced by the work on the binary
relation model (Abrial, 1974) and the Semantic Database Model (SDM) (Hammer and McLeod, 1981).
Kesh (1995) proposed a single measure for the quality of ER diagrams, combining
different measures of the ontological quality of ER diagrams. In most cases these
measures are Likert scales that need to be rated in a subjective way. Kesh's proposal
also requires that, to obtain an overall quality assessment, each measurement for each
ontological quality factor be weighted, but he did not suggest how to determine the
weights. The Kesh measure has not been theoretically validated. After a real world
application of the quality model, Kesh concluded that his model provides a useful
framework for analysing and making revisions to ER diagrams. Apart from this
demonstration experiment, no empirical evidence on the validity and usefulness of the
measure has been collected.
Moody et al. (1998) proposed an extensive framework for evaluating the quality of ER
diagrams. This framework includes a set of 25 measures which capture different quality
characteristics of conceptual data models. Some of them are objectively calculated
while others are based only on subjective expert ratings. After having conducted an
action research study (Moody, 2000), Moody concluded that only the metrics counting the
number of entities and relationships and the development cost estimate have benefits that
outweigh their cost of collection.
Si-Said Cherfi et al. (2002) defined a framework considering three dimensions of
quality: usage, specification and implementation. For each dimension, they defined
quality criteria and their corresponding measures (not all of them are objective). This
framework was used in a case study, with the aim of choosing between alternative ER
diagrams modelling the same reality. Even though the obtained results demonstrate that
the best ER diagram can be chosen using this framework, we do not agree that the ER
diagrams used in the study have an equivalent semantic content (as claimed by the
authors), which threatens the validity of the study.
Table 1 compares the current proposals of conceptual data model quality measures. The
first column of the table contains a reference to the study where the proposal has been
published. In the second column, the quality focus of the measures (i.e. the quality
properties that are measured) is presented. The third column refers to the scope of the
measures, meaning the kind of data model that is measured. The fourth column shows
whether the measures are objective (e.g. metrics) or subjective (e.g. quality scores or
rating assigned by 'expert judges'). The fifth and sixth columns reflect whether there are
published studies in which the theoretical and the empirical validation of the metrics has
been attempted. The last column reflects whether an automated tool exists for computing the metrics.
Authors | Quality focus | Scope | Objective/Subjective | Theoretical validation | Empirical validation | Automated tool
Eick (1991) | Expressiveness, complexity, normality | S-diagrams | Objective | N | N | N
Gray et al. (1991) | Complexity, deviation from third normal form | ER diagram | Objective | Partially | Partially | Y
Kesh (1995) | Ontological quality | ER diagram | Objective/Subjective | N | N | Y
Moody et al. (1998) | Completeness and other quality characteristics (25 measures) | ER diagram | Objective/Subjective | N | Partially | Y
Si-Said Cherfi et al. (2002) | Usage, specification and implementation quality | ER diagram | Objective/Subjective | N | Partially | Y

Table 1. Summary of quality measures for conceptual data models
Summarising the related work, we can conclude that:
− Most of the existing proposals of quality measures for conceptual data models have
not gone beyond the definition stage. There are few published studies about their
empirical validity and even fewer on their theoretical validity. The validation
studies that have been carried out are demonstration experiments, case-studies or
action research studies and are of a one-of-a-kind nature. A rigorous empirical
validation approach based on controlled and replicated experimentation has not been applied so far.
− Eick's metric and Kesh's metric as single measures for quality are relatively simple,
but do not realistically capture conceptual data model quality. Fenton and Pfleeger
(1997) do not recommend the use of this type of metric, which intends to capture all the
aspects of quality in a single number. Furthermore, combining different
measurement values into one aggregate value is of limited applicability in practice if
the proposal is unclear about the combination rule, weights, etc. (as in Kesh (1995))
or if the choice of weights or coefficients is left to the discretion of the practitioner.
Gray et al. (1991), for instance, remark that the values of the coefficients are likely
to be highly dependent on the local environment, and may vary from one case to another.
3. Metric definition and validation methodology
In recent literature, a large number of measures have been proposed for capturing
properties of software artifacts in a quantitative way. However, few of these 'software
metrics' have successfully survived the initial definition phase and are actually used in
industry. This is due to a number of problems related to the theoretical and empirical
validity of software metrics, the most relevant of which are summarised below (Briand
et al., 2002):
− Measures are not always defined in the context of some explicit, well-defined
measurement goal derived from the industrial interest they help to address, e.g. reduction of
development effort or of faults present in the software products.
− Even if the goal is explicit, the experimental hypotheses are often not made explicit,
e.g. What do you expect to learn from the analysis and can you believe it?
− Measurement definitions do not always take into account the environment or context
in which they will be applied, e.g. Would you use a complexity measure that was
defined for non-object oriented software in an object oriented context?
− A reasonable theoretical validation of the measure is often not possible because the
attribute that a measure aims to quantify is often not well defined, e.g. Are you using
a measure of complexity that corresponds to your intuition about complexity?
− A large number of measures have never been subject to an empirical validation, e.g.
How do you know which measures of size best predict effort in your environment?
This situation has frequently led to some degree of fuzziness in measure definitions,
properties, and underlying assumptions, making the use of the measures difficult, their
interpretation hazardous, and the results of the various validation studies somewhat difficult to compare.
These problems do not imply that progress cannot be made in the measurement
field. For this purpose metrics must be defined in a methodological and disciplined way.
In this work we follow the definition and validation method proposed by Calero et al.
(2001), which tries to alleviate some of the problems mentioned above. The main tasks of
this method are shown in figure 1.
Figure 1. Method for the definition and validation of software metrics
In the rest of this section we will take a closer look at the different steps of the method
for obtaining valid metrics.
3.1. Metric definition
Metric definition should be based on clear measurement goals. It is advised to take a
goal-oriented measure definition approach such as GQM (Basili and Rombach, 1988;
Basili and Weiss, 1984; Van Solingen and Berghout, 1999; Briand et al., 2002) to
ensure that the metric selection or definition effort, the validation effort and the
subsequent measurement effort all contribute to the achievement of a well-defined goal.
Often this measurement goal relates to external quality attributes (e.g. maintainability,
reusability) that cannot be measured directly at the desired moment (i.e. when the
information is needed), and for which indirect, early available and easy to collect
measures need to be found.
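As an illustration, a measurement goal for a study of this kind could be instantiated along the lines of the GQM goal template (object of study, purpose, quality focus, viewpoint, context). The wording below is a hypothetical sketch, not a quotation from this paper:

```python
# Hypothetical instantiation of the GQM goal template (Basili and Rombach, 1988):
# "Analyse <object of study> for the purpose of <purpose> with respect to <quality focus>
#  from the viewpoint of <viewpoint> in the context of <context>."
measurement_goal = {
    "analyse": "Entity-Relationship diagrams",
    "for_the_purpose_of": "assessment and prediction",
    "with_respect_to": "maintainability, via internal properties related to structural complexity",
    "from_the_viewpoint_of": "data analysts and database designers",
    "in_the_context_of": "conceptual data modelling for evolving information systems",
}
```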
The metric definition activity further involves the precise definition of the object of
measurement, including its constituents. The unambiguous definition of the domain of
a metric is a prerequisite for the validation of the metric, as well as for the practical use
of the metric afterwards.
Finally, as metrics are functions that take an argument (i.e. the object of measurement)
and produce a value, there must be a precise description of the mathematical, logical or
other formulation and notation that is used to define and denote the function
prescriptions (i.e. how to calculate the value for the given argument).
3.2. Theoretical validation
The main goal of theoretical validation is to assess whether a metric actually measures
what it purports to measure (Fenton and Pfleeger, 1997). In the context of an empirical
study, the theoretical validation of metrics establishes their construct validity, i.e. it
'proves' that they are valid measures for the constructs that are used as variables in the study.
The validity of the measurement instruments used for the variables of an empirical
study is a key factor in the overall study validity. On the one hand, the theoretical
validity of the ER diagram metrics proposed in section 4 is used to claim their construct
validity when used as measures of ER diagram structural complexity, which is the
independent variable in the empirical studies described in section 6. On the other hand,
knowledge of the scale type of the metrics helps when choosing the appropriate
statistical technique(s) to analyse the data obtained in the experiments. The preferred
theoretical validation approach is one in which measures can be used as ratio scales,
allowing the use of a wide range of data analysis techniques, including parametric
statistics, and thus providing maximum flexibility to the researcher (Fenton and Pfleeger, 1997).
Unfortunately, as Van den Berg and Van den Broek (1996) remark, even though several
attempts have been made at proposing methods and principles to carry out the
theoretical validation of metrics (mainly in the context of software engineering), there is
not yet a standard, accepted way of theoretically validating a software metric.
Work on theoretical validation has followed two paths:
− Measurement-theory based approaches such as those proposed by Whitmire (1997),
Zuse (1998), and Poels and Dedene (2000).
− Property-based approaches (also called axiomatic approaches), such as those
proposed by Weyuker (1988) and Briand et al. (1996, 1997).
In this work the theoretical validation of the ER diagram structural complexity metrics
will be performed using the DISTANCE framework. This framework, proposed by
Poels and Dedene (2000), is a conceptual framework for software metric validation
grounded in Measurement Theory (Krantz et al., 1971; Roberts, 1979). This theory,
originating from the discipline called Philosophy of Science, is a normative theory
prescribing the conditions that must be satisfied in order to use mathematical functions
as “measures”. Measurement theoretic approaches to software metric validation, such
as DISTANCE, propose methods to verify whether these conditions hold for software
metrics. Their use of Measurement Theory as the reference theory for measurement
distinguishes these approaches from property-based approaches to metric validation
(e.g. Weyuker, 1988; Briand et al., 1996, 1997), which are based on argumentation,
subjective experience or even intuition, instead of having a well-established theoretical
base (Poels and Dedene, 1997).
The DISTANCE framework offers a measure construction procedure to model
properties of software artefacts and define the corresponding software metrics. In this
sense, the framework has an added value above other measurement theoretic approaches
(e.g. Whitmire (1997); Zuse (1998)) that focus on metric validation and cannot be used
for metric definition. The framework is called DISTANCE as it uses the mathematical
concept of distance and its measurement theoretic interpretation as the main cornerstone
of the validation approach. The basic idea of DISTANCE is to define properties of
objects in terms of distances between the objects and other objects that serve as
reference points (or norms) for measurement. The larger these distances, the greater the
extent to which the objects are characterised by the properties (or for quantitative
properties, called quantities by Ellis (1968), the higher their values are). This particular
definition of properties of objects allows them to be measured by functions that are
called “metrics” in Mathematics. Metrics are functions that satisfy the metric axioms,
i.e. the set of axioms that are necessary and sufficient to define distance measures (in
the sense of Measurement Theory) (Suppes et al., 1989). The measurement theoretic
theorems associated with distance measurement are part of the framework, meaning that
the conditions specified by these theorems are met when carrying out the measure
construction procedure. This ensures that the theoretical validity of the measures
obtained with DISTANCE is formally proven within the framework of Measurement
Theory. An important pragmatic consequence of this explicit link with Measurement
Theory is that the resulting measures define ratio scales.
In this section we will summarise the measure construction procedure of DISTANCE
for ease of reference. A detailed discussion of the process model and the measurement
theoretic foundations of DISTANCE is outside the scope of the paper, but can be found
in (Poels and Dedene, 2000; Poels, 1999).
3.2.1. The DISTANCE measure construction procedure
The measure construction procedure prescribes five activities. The procedure is
triggered by a request to construct a measure for a property that characterises the
elements of some set of objects. There might, for instance, be a request that expresses
the need for a measure of some structural complexity aspect (i.e. a property) of ER diagrams.
The activities of the DISTANCE procedure are briefly summarised below. For
notational convenience, let P be a set of objects that are characterised by some property
pty for which a measure needs to be constructed.
3.2.1.1. Finding a measurement abstraction
The objects of interest must be modelled in such a way that the property for which a
measure is needed is emphasised (Zuse, 1998). This is actually a requirement of all
model-based approaches to metrication since the object of interest needs a
representation before a metric can be applied (Dumke, 1998). A suitable representation,
called measurement abstraction hereafter, should allow us to observe to what extent an
object is characterised by the property. By comparing measurement abstractions,
we should be able to tell whether an object is more, equally, or less characterised by the
property than another object (at least for quantities (Ellis, 1968)).
The outcome of this activity is a set of objects M that can be used as measurement
abstractions for the objects in P with respect to the property pty. Let abs: P → M be the
function that formally describes the rules of the mapping.
3.2.1.2. Defining distances between measurement abstractions
This activity is based on a generic definition of distance that holds for elements in a set.
To define distances between elements in a set, the concept of an ‘elementary
transformation function’ is used. This is a homogeneous function on a set representing
an atomic change of an element in some prescribed way. By applying an elementary
transformation function to an element of a set, the element is transformed into another
element of the set by changing it. Moreover, this change is atomic, meaning that it
cannot be subdivided into a series of ‘smaller’ changes. Elements of a set can be
changed in multiple ways. The second activity of the DISTANCE procedure requires
that a set Te of elementary transformation functions be found that is sufficient to change
any element of M into any other element of M (for a discussion on the conditions
required for defining such a set of elementary transformation functions, we refer to
(Poels, 1999)). If such a set Te is found, then the distance between two elements of M is
defined as a shortest sequence of elementary transformations (i.e. applications of the
elementary transformation functions in Te) to transform one element into the other. In
general, there are multiple sequences of elementary transformations to go from one
element to another. For the notion of distance used by the procedure, only the shortest
sequences are taken into account, i.e. those that require the minimum number of elementary transformations.
3.2.1.3. Quantifying distances between measurement abstractions
This activity requires the definition of a distance measure for the elements of M.
Basically this means that the distances defined in the previous activity are now
quantified by representing (i.e. measuring) them as the number of elementary
transformations in the shortest sequences of elementary transformations between the elements.
Formally, the activity results in the definition of a metric (in the mathematical sense) δ:
M × M → ℜ that can be used to map (the distance between) a pair of elements in M to a real number.
3.2.1.4. Finding a reference abstraction
This activity requires a kind of thought experiment. We need to determine what the
measurement abstraction for the objects in P would look like if they were characterised
by the theoretical lowest amount of pty (again, on condition that the property is a
quantity). If such a hypothetical measurement abstraction (i.e. an object in M) can be
found (for necessary conditions we refer to (Poels, 1999)), then this object is called the
reference abstraction for P with respect to pty.
The idea of DISTANCE is now to use this reference abstraction as a reference point for
measurement. More particularly, DISTANCE uses the distance between the
measurement abstraction of an object p in P and the reference abstraction of P as a
formal definition of the property pty of the object p. This means that the larger the
distance between the measurement abstraction and the reference abstraction, the more
the property characterises the object (i.e. the greater the amount of pty in p). Let ref: P
→ M be the function that describes the rules of the mapping.
3.2.1.5. Defining a measure for the property
The final activity consists of defining a measure for pty. Since properties are formally
defined as distances, and these distances are quantified with a metric function, the
formal outcome of this activity is the definition of a function µ: P → ℜ such that ∀ p ∈
P: µ(p) = δ(abs(p), ref(p)).
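To make the five activities concrete, the following sketch applies the construction pattern to a deliberately simple, hypothetical property: the 'entity richness' of an ER diagram, abstracted as the set of its entity type names (reusing the toy ERDiagram representation sketched in the introduction). The property, the abstraction and the code are ours, purely for illustration; the application of DISTANCE to the metrics proposed in this paper is given in section 5.

```python
from typing import Set

# Activity 1 (measurement abstraction): abs maps each diagram p in P to the set of
# its entity type names, so M is the set of all finite sets of names.
def abs_(diagram) -> Set[str]:
    return {e.name for e in diagram.entities}

# Activity 2 (elementary transformations): adding one name to a set and removing one
# name from a set; any finite set can be turned into any other by such atomic steps.

# Activity 3 (quantifying distances): the shortest sequence of elementary
# transformations between two sets has length equal to their symmetric difference.
def delta(a: Set[str], b: Set[str]) -> int:
    return len(a ^ b)

# Activity 4 (reference abstraction): the abstraction corresponding to the lowest
# theoretical amount of the property is the empty set.
def ref(diagram) -> Set[str]:
    return set()

# Activity 5 (the measure): mu(p) = delta(abs(p), ref(p)); for this toy property it
# reduces to a count of entity types and, by construction, defines a ratio scale.
def mu(diagram) -> int:
    return delta(abs_(diagram), ref(diagram))
```

For less trivial properties the same pattern applies, with richer measurement abstractions (e.g. sets of attributes or of relationship participations) and correspondingly defined elementary transformations.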
3.3. Empirical validation
Common wisdom, intuition, speculation, and proofs of concept are not reliable sources
of credible knowledge (Basili et al., 1999); hence it is necessary to subject the metrics
to empirical validation. Empirical validation is crucial for the success of any
software measurement project (Schneidewind, 1992; Kitchenham et al., 1995; Basili et
al., 1999). Through empirical validation we can demonstrate with real evidence that the
measures we have proposed serve the purpose they were defined for and that they are
useful in practice, i.e. related to some external attribute worth studying and therefore
helping to reach some goal (e.g. assessment, prediction).
In our case we want to demonstrate the empirical validity of the ER diagram structural
complexity metrics, which means that we have to corroborate that they are really related to
ER diagram maintainability.
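One way to operationalise such a corroboration, sketched below under our own assumptions about the data that would be collected, is to test whether the metric values are significantly associated with a directly measured maintainability indicator, such as the time subjects need to complete maintenance tasks on the diagrams. The data and the choice of Spearman's correlation are purely illustrative; the actual design and analysis of our experiments are described in section 6.

```python
# Illustrative analysis (fictitious data): is a structural complexity metric of an
# ER diagram associated with a directly measured maintainability indicator?
from scipy.stats import spearmanr

metric_values = [4, 7, 9, 12, 15, 18, 22, 27]                 # e.g. relationship counts per diagram
maintenance_time = [3.1, 4.0, 4.8, 6.2, 6.9, 8.4, 9.1, 11.5]  # minutes per maintenance task (made up)

rho, p_value = spearmanr(metric_values, maintenance_time)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.4f}")
# A significant positive correlation would support using the metric as an early,
# indirect indicator of (poor) maintainability.
```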
3.3.1. Type of empirical studies
There are three major types of empirical research strategy (Robson, 1993):
− Experiments. Experiments are formal, rigorous and controlled investigations. They
are launched when we want control over the situation and want to manipulate
behaviour directly, precisely and systematically. Hence, the objective is to
manipulate one or more variables and control all other variables at fixed levels. An
experiment can be carried out in an off-line situation, for example in a laboratory
under controlled conditions, where the events are organised to simulate their
appearance in the real world. Experiments may alternatively be carried out on-line,
which means that the research is executed in the field under normal conditions
(Babbie, 1990). Experiments in the context of empirical software engineering
research are further discussed in Wohlin et al. (2000).
− Case studies. A case study is an observational study, i.e. it is carried out by the
observation of an on-going project or activity. The case study is normally aimed at
tracking a specific attribute or establishing relationships between different attributes.
The level of control is lower in a case study than in an experiment. Case study
research is further discussed in Robson (1993), Stake (1995), Pfleeger (1994;1995)
and Yin (1994).
− Surveys. A survey is often an investigation performed in retrospect, when for
example, a tool or technique, has been in use for a while. The primary means of
gathering qualitative or quantitative data are interviews or questionnaires. These are
completed by taking samples which are representative of the population to be
studied. The results from the survey are then analysed to derive descriptive or
explanatory conclusions. Surveys provide no control over the execution or the
measurement, though it is possible to compare them to similar ones. Surveys are
discussed further in Babbie (1990), Robson (1993) and Pfleeger and Kitchenham (2001).
The prerequisites for an investigation limit the choice of the research strategy. A
comparison of the strategies can be made bearing in mind the following factors:
− Execution control. Describes how much control the researcher has over the study.
For example, in a case study data is collected during the execution of a project. If
management decides to stop the studied project due to, for example, economic
reasons, the researcher cannot continue the case study. The opposite is the
experiment where the researcher is in control of the execution.
− Measurement control. Is the degree to which the researcher can decide upon which
measures are to be collected, included or excluded during execution of the study.
− Investigation cost. Depending on which strategy is chosen, the cost differs. This cost
is related to, for example, the size of the investigation and the need for resources.
The difference between case studies and experiments is that if we choose to
investigate a project in a case study, the outcome of the project is some form of the
product that may be retailed, i.e. it is an on-line investigation. In an off-line
experiment the outcome is some form of experience which is not profitable in the
same way as a product.
− Ease of replication. Refers to the ease with which we can replicate the basic
situation we are investigating. If replication is not possible, then you cannot carry
out an experiment. Due to the possibility of replicating experiments, their results
are more generalisable.
Table 2 shows each of the factors mentioned above for each empirical strategy and can
be used as a general guideline to decide whether to perform a case study, a survey or an experiment.
Factor | Survey | Case study | Experiment
Execution control | No | No | Yes
Measurement control | No | Yes | Yes
Investigation cost | High | Medium | High
Ease of replication | High | Low | High

Table 2. Research strategy factors (Wohlin et al., 2000)
Looking at the comparison of empirical strategies shown in table 2 and the available
resources, we decided to undertake controlled experiments to validate the ER diagram
metrics proposed in section 4. For this reason we will now look in detail at the steps
involved in the experiment process.