
Evaluating Software Reuse Alternatives: A Model and Its Application to an Industrial Case Study



We propose a model that enables software developers to systematically evaluate and compare all possible alternative reuse scenarios. The model supports the clear identification of the basic operations involved and associates a cost component with each basic operation in a focused and precise way. The model is a practical tool that assists developers to weigh and evaluate different reuse scenarios, based on accumulated organizational data, and then to decide which option to select in a given situation. The model is currently being used at six different companies for cost-benefit analysis of alternative reuse scenarios; we give a case study that illustrates how it has been used in practice.
Amir Tomer, Member, IEEE, Leah Goldin, Senior Member, IEEE, Tsvi Kuflik, Member, IEEE,
Esther Kimchi, and Stephen R. Schach, Member, IEEE Computer Society
Index Terms—Reuse models, cost estimation, maintenance management, software libraries, process metrics, process measurement,
Software reuse is a major component of many software
productivity improvement efforts because reuse can
result in higher quality software at a lower cost and
delivered within a shorter time [1]. One of the critical reuse
success factors is the adoption of the product-line approach
of Boehm [2]. This approach is detailed in [3], which
describes a product line organized to operate in two circles:
core-asset development and product development. Core
assets are software artifacts that are developed or acquired
(during operation of the first circle), are owned by the
product line, and are available for acquisition as building
blocks for specific products. However, it is at the discretion
of product developers, performing in the other circle, to
decide whether to acquire (buy) core assets from the
product-line repository or develop their software artifacts
from scratch (make). This decision is critical, not only for
each specific product, but also for the entire product line
because the central activity of core-asset development is
driven by the needs of specific products, both present and
future. In order to evaluate the alternatives correctly, these
make/buy decisions should be based on data collected from
past reuse and development efforts and on estimation of
future activities. On the other hand, it is not trivial to
measure all reuse activities precisely because reuse can be
performed both centrally and individually for specific
products. Moreover, artifacts can be transferred from one
product to another, as part of opportunistic reuse efforts,
and control is lost over both the configuration and the cost.
In this paper, we propose a model by which reuse
activities in product lines may be systematically measured
or estimated and alternative reuse scenarios may be
evaluated and compared for effective support of the
make/buy process. In Section 2, we give background
material on software reuse and software reuse cost models.
In Section 3, we describe and demonstrate, in a case study,
our conceptual model of reuse operations and scenarios. In
Section 4, we add cost elements to the model and address
typical reuse scenarios and their costs. In Section 5, we
describe how the model is being used in practice and
present an industrial case study. Section 6 presents
conclusions and future work.
2.1 Reuse Concepts
Reuse takes place when an existing artifact is utilized to
facilitate the development or maintenance of the target
product. The scope of reuse can vary from the narrowest
possible range, namely, from one product version to
another, to a wider range such as between two different
products within the same line of products or even between
products in different product lines. The scope of reuse is
limited, in general, by the nature of and constraints on a
product line; for example, it is unwise to reuse a desktop
application in a mission-critical system. The granularity of
reuse can vary from the narrowest range of reusing a single
artifact, such as a component, document, or test case, to the
widest possible range, namely, a whole product. Reuse can
take place during any phase of the life cycle, including
marketing and proposals, requirements, analysis, design,
coding, and testing. Performing reuse at a certain level of
reuse (life-cycle phase) usually carries with it reuse at all
subsequent levels. In black-box reuse, the artifact is reused
unchanged, whereas, in white-box reuse, the artifact is
modified to fit the target product.
• A. Tomer is with RAFAEL Ltd., PO Box 2250/1P, Haifa 31021, Israel.
• L. Goldin is with Golden Solutions, PO Box 6017, Kfar Saba 44641, Israel.
• T. Kuflik is with the Department of Management Information Systems,
  University of Haifa, Mount Carmel, Haifa 31905, Israel.
• E. Kimchi can be reached at 46 Ben Gurion St. Ramat Hasharon 47321,
  Israel. E-mail:
• S.R. Schach is with the Department of Electrical Engineering and
  Computer Science, Vanderbilt University, Station B Box 351679,
  Nashville, TN 37235. E-mail:
Manuscript received 10 Feb. 2004; revised 14 May 2004; accepted 2 July 2004.
Recommended for acceptance by A. Mili.
For information on obtaining reprints of this article, please send e-mail to:, and reference IEEECS Log Number TSE-0022-0204.
0098-5589/04/$20.00 © 2004 IEEE Published by the IEEE Computer Society
In order to be able to apply reuse systematically, there is
a need to study and analyze the domain in which software
reuse will take place in order to identify software artifacts
that are candidates for future reuse. This initial task is called
domain analysis [4], a process of systematic studying,
modeling, and analysis of the domain. Domain analysis
starts with domain information gathering and yields a
domain model, architecture, and list of artifacts (existing
and future) that are candidates for future reuse.
2.2 Reuse Costs and Costing Policies
In many ways, reuse-based software development is similar
to the usual software development process that is per-
formed by each software organization in terms of its own
software life-cycle model. Costing all the life-cycle activities
involved in the development of new software artifacts for a
product is therefore done in terms of the life-cycle costing
policy of that organization. However, incorporating reuse
into the process involves the use of artifacts and an
organizational infrastructure that have been developed
and established out of the context of the current product.
In the following, we will list the main costs involved in
reuse-based development, as well as various policies that
may be employed for the costing of the entire product
development effort. Most organizations are aware of these
individual costs. However, some companies have difficulty
in integrating these costs into a precise company-wide
costing policy. Moreover, even when a company has such a
policy in place, such a costing policy is company-specific.
This was our experience, for example, at the beginning of
the ISWRIC project (described in detail in Section 5.1), when
the seven companies involved were unable to share and
compare reuse cost data because each company computed
its costs in its own way. This was the main motivation for
developing the model introduced in this paper.
The issue of reuse-based development costs and costing
(or funding) policies is detailed in [5]. Although we take a
similar general approach, we have refined the cost categor-
ization in order to highlight the costing difficulties, which
are resolved by our model.
The overall costs of reuse-based software development,
from the organizational or product-line point of view, may
be divided into three categories: product construction, core-
asset construction, and infrastructure. The following are the
costs included in each category.
2.2.1 Product Construction Costs
• Asset acquisition cost includes the cost of purchasing the asset from a
  source outside the product, as well as the effort invested in seeking the
  appropriate asset, whether it is stored in a core-asset repository or is
  available elsewhere in the organization, in the public domain, or the
  market.
• Asset development cost includes the cost of the analysis, design, and
  coding of new or modified artifacts, as well as the cost incurred by all
  the verification and validation activities performed directly on the
  asset, such as design reviews, code walkthroughs, and unit tests. When
  existing artifacts are reused as white boxes, the relevant development
  cost is only the cost of modifications made to those artifacts. However,
  in all instances of reuse, including black box and COTS, extra effort may
  have to be invested in modifying other artifacts of the product in order
  to be able to integrate the reused asset into that product.
• Product integration, verification, and validation cost includes the cost
  of partial and full integrations, design reviews, subsystem and system
  testing, and acceptance testing. It also includes the effort put in the
  development of the testing environment, such as test equipment,
  simulators, automated test procedures, etc.
2.2.2 Core-Asset Construction Costs
• Asset acquisition cost is the same type of cost as for product assets.
• Asset development cost is similar to the development cost of product
  asset development. Although the repository is not expected to produce
  complete products, and therefore the product integration, verification,
  and validation cost does not apply, it might produce compound assets,
  which may be constructed in a similar way to a complete product. In this
  case, the cost of integration, verification, and validation incurred
  until the core asset is ready for use may be considered entirely as this
  asset’s development cost.
2.2.3 Infrastructure Costs
• Repository establishment and maintenance cost includes the cost of
  database analysis and design, tool development or purchase, database
  administration, etc.
• Repository storage and cataloging cost includes the cost of the effort
  needed for approving the artifacts for the repository and determining the
  metadata required in order to enable efficient search and retrieval of
  core assets in the catalog.
• Domain Analysis (DA) cost is incurred by a set of activities needed to
  define the scope and the contents of candidate reuse assets, together
  with locating them in past products and making them available for the
  core-asset repository. One important activity in DA is to define the
  categories into which core assets are classified in order to check
  commonality among candidate reusable artifacts. These categories are
  later incorporated into the catalog metadata accompanying the assets in
  the repository. Another DA activity is the “mining” of candidate assets
  from past products, in terms of the defined scope and categorization of
  artifacts within the repository. DA is usually performed when a product
  line is initiated and established. However, because DA may be reapplied
  whenever the scope of the product-line changes (for example, when
  entering a new business niche), it may be considered an on-going
  activity, with accumulated costs.
Because the core asset construction costs and the
infrastructure costs previously mentioned are incurred
centrally, at the organization (or the product-line) level, a
costing policy is needed that defines the way these costs are
distributed among target products. A number of examples
of such costing policies (“funding strategies”) are found in
[5]. However, these strategies are explained in words,
providing no practical tool for calculating the costs.
The cost model introduced in this paper provides a
systematic and straightforward way of calculating the
overall cost of various reuse alternatives in order to select
the one that is the most cost-effective. When our model is
described later, we will refer to the costs mentioned in this
section and show how they should be considered in terms
of a specific costing policy.
2.3 Related Work
Software reuse is not merely a technical issue. On the
contrary, it is widely accepted that the organizational
challenges of software reuse outweigh the technical ones,
as summarized by the “STARS” program report, for
example [6]. As a result, metrics are needed in order to
make “business decisions possible by quantifying and
justifying the investment necessary to make reuse happen”
[7]. The importance of economic models and metrics of
software reuse is widely recognized, as can be seen from
numerous existing models, such as the measure put
forward by Barnes and Bollinger [8], who suggested
analytical approaches to making good reuse investments,
based on cost-benefit analysis. Another example is the cost-
benefit analysis with net present value model of Malan and
Wentzel [9]. There are many more models, as surveyed by
Wiles [10], Poulin [7], Lim [11], and others.
Various economic models can be used to show that
successful software reuse is possible in theory. In fact, there
are many software reuse success stories in which the
economic benefits are evident in practice. For example,
substantial costs were saved due to implementation of
software reuse in the “STARS” demonstration project. The
first system that was developed cost 43 percent of a
reference baseline and a second cost only 10 percent [6].
Another example is the experience at Hewlett-Packard
described in [11] where, by applying software reuse, a
defect reduction of 15 percent was achieved and produc-
tivity increased by 57 percent. Several additional reuse
success stories are presented by Poulin [7].
Organizations are usually in various stages of “the
incremental adoption of reuse” described by Jacobson et al.
[1] as a practical approach for transition toward becoming a
domain-specific reuse-driven organization. However, the
transition phase is lengthy, and every organization needs to
find its own way; it is not obvious that the same approach is
good for everyone. Hence, organizations are at different
stages of the transition phase, and project managers must
decide what course to take at every point in time, based on
their analysis of the current situation. None of the
previously mentioned models and evaluations provides a
means to analyze and weigh alternative approaches for
software development with various types of reuse (includ-
ing no reuse). In most instances, the point of reference of
these models is software development without any reuse at
all. Even when reuse is an alternative, it is common to find
examples of only white-box reuse.
What is needed, in addition to a detailed economic
model, is a framework for analysis and comparison of
development approaches: without reuse, with white-box
reuse, and with black-box reuse. We believe that the
economic measurements used to assess the benefits of the
entire software reuse effort can also be used to help
developers in the “make versus buy” assessment. A
systematic definition of various reuse scenarios that will
provide a means to weigh the different alternatives, based
on past experience and estimation, is now needed.
The reuse cost model described in this paper is based on a
three-dimensional model, introduced in [12], in which all
the activities of software construction in product lines, as
well as the evolution of all the artifacts involved, are
described along three axes: development, maintenance, and
reuse (see Fig. 1). A key point in the underlying three-
dimensional model and in the product-line approach in
general is the existence of a core-asset repository and the
discipline that artifacts cannot be freely transferred between
specific products without first being stored and cataloged in
the core-asset repository. The activity of fetching reuse
candidates from specific products and copying them into
the repository is called mining, whereas the inverse
operation of copying artifacts from the repository into
specific products, in order to reuse them, is called acquisition
[3]. For simplicity, the model described in this paper is only
two-dimensional; specifically, we distinguish only between
the reuse operation, that is, copying artifacts from one plane
to another along the reuse axis, and all other operations
along the development and maintenance axes that are
concerned with making modifications to artifacts within the
same plane. In other words, for simplicity, we have unified
the development and maintenance axes.
Fig. 1. The three-dimensional evolution of a software product line.
3.1 Asset Types
We assume that there is a core-assets repository in which
artifacts and their relevant metadata are indexed. There is
no need, however, for the artifacts themselves to be
physically stored in the repository—they may reside in
any specific product library, provided that each artifact
preserves its unique identification, namely, its type and the
identification of the specific product to which it belongs [13].
We define two kinds of artifacts, as follows:
• Repository Assets: Artifacts that are cataloged in the core-asset
  repository, including their metadata. These artifacts are available
  either for acquisition by specific products or for modification (for
  further reuse purposes) within the core-asset repository itself. One
  important attribute of an artifact is its size or some other measure by
  which the complexity of two different assets may be compared.
• Private Assets: Artifacts that are contained within a specific product
  and are available either for mining and cataloging or for private
  modification, but only within the environment of the same specific
  product.
Representatives of the two asset types appear in Fig. 2.
The thick perimeter of repository assets denotes that they
are “wrapped” in metadata (catalog information).
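The two asset kinds can be sketched as a small data structure. The Python sketch below is illustrative only: the names `AssetId`, `Asset`, `size`, `metadata`, and `catalog` are assumptions introduced here, not part of the model, and the sample values are hypothetical. It treats an asset as a repository asset exactly when catalog metadata is attached, and it identifies every artifact by its type and owning product, so that cataloging does not require moving the artifact.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AssetId:
    """Unique identification: artifact type plus the product it belongs to."""
    artifact_type: str
    product: str


@dataclass
class Asset:
    asset_id: AssetId
    size: float                       # any measure that lets two assets be compared
    metadata: Optional[dict] = None   # catalog information; None for a private asset

    @property
    def is_repository_asset(self) -> bool:
        # A repository asset is "wrapped" in metadata; a private asset is not.
        return self.metadata is not None


def catalog(asset: Asset, metadata: dict) -> Asset:
    """Return a cataloged view of the asset; its identification is preserved."""
    return Asset(asset.asset_id, asset.size, dict(metadata))


# Hypothetical example values.
private_queue = Asset(AssetId("package", "ProductA"), size=350.0)
repo_queue = catalog(private_queue, {"keywords": ["queue", "Ada"]})
```

Keeping the identification unchanged when cataloging mirrors the statement that an artifact may stay in its product library as long as its type and product can still be resolved.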
Next, we define a number of reuse-related operations,
which typically are performed during software development.
3.2 Elementary Operations
In terms of the underlying three-dimensional model, we
identify an elementary operation as an activity for obtaining
an asset, performed along a single axis. However, in contrast to
the usual type of software product construction, the scope
of reuse-based software construction extends beyond a
single product. Based on the assumption that each product
development is performed within its own budget, this
extension is significant with respect to cost. We therefore
unified the development and maintenance operations of the
original model into one category of elementary operations,
denoted as Transformation Operations (along the horizontal
axis), whereas reuse operations (along the vertical axis) will
be denoted here as Transition Operations (see Fig. 2). Each of
the operations described here may be performed indepen-
dently of other operations. Later, we will define sequences
of operations as scenarios.
3.2.1 Transformation Operations
Adaptation for Reuse (AR): Modifying an existing repository
asset $R$, resulting in another reusable repository asset $R'$.
Note that both $R$ and $R'$ must reside in the repository.
New for Reuse (NR): Constructing a new repository
asset $R'$ from scratch. It is expected that the asset will be
developed in conformance with applicable standards that
make it effectively reusable.
White-Box reuse (WB): Modifying an existing private
asset $P$ into another private asset $P'$ within the same
product. Both $P$ and $P'$ must reside in the product library as
revisions of the same artifact.
New Development (ND): Constructing a new private
asset $P'$ from scratch.
3.2.2 Transition Operations
Cataloged asset Acquisition (CA): Acquiring a copy of a
repository asset R for a specific product as a private asset P.
It is assumed that P needs to undergo further modifications
(white-box reuse) within the product in contrast to black-
box reuse (see next reuse operation).
Black-Box reuse (BB): Acquiring a copy of a repository
asset $R'$ “as is” (that is, with no modifications) for a specific
product as a private asset $P'$. Ideally, this should be an
elementary copy operation; in practice, however, this
operation may require some overhead activities as a
consequence of adapting the architecture of the target
product in order for the imported asset to fit. This is also the
case when acquiring COTS assets for the product.
Mining and Cataloging (MC): Identifying and acquiring an
existing private asset P, from a certain product, and then
storing and cataloging it formally as a repository asset R.
Copy and Paste (CP): Acquiring a copy of a private asset P
for a specific product. The source asset is not cataloged in
the repository, and awareness of its existence is based on
personal knowledge.
eXternal Acquisition (XA): Acquiring an asset from some
external source and cataloging it as a repository asset R.
This is the case with COTS artifacts.
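The nine elementary operations can be summarized as a lookup table. The following Python sketch is one possible encoding, not the authors' tooling: the `Kind` enum, the `OPERATIONS` dictionary, and the helper `operations_producing` are hypothetical names. Each entry records the operation family and the kind of its source and target asset as read from the definitions above. Modeling CP as private-to-private is an assumption based on its description as copying an uncataloged private asset.

```python
from enum import Enum


class Kind(Enum):
    NULL = "null"       # no source asset: built from scratch or obtained outside
    REPOSITORY = "R"    # cataloged in the core-asset repository
    PRIVATE = "P"       # owned by one specific product


# operation -> (family, source kind, target kind), read off Sections 3.2.1 and 3.2.2.
# Assumption: CP is modeled as private -> private.
OPERATIONS = {
    "AR": ("transformation", Kind.REPOSITORY, Kind.REPOSITORY),
    "NR": ("transformation", Kind.NULL, Kind.REPOSITORY),
    "WB": ("transformation", Kind.PRIVATE, Kind.PRIVATE),
    "ND": ("transformation", Kind.NULL, Kind.PRIVATE),
    "CA": ("transition", Kind.REPOSITORY, Kind.PRIVATE),
    "BB": ("transition", Kind.REPOSITORY, Kind.PRIVATE),
    "MC": ("transition", Kind.PRIVATE, Kind.REPOSITORY),
    "CP": ("transition", Kind.PRIVATE, Kind.PRIVATE),
    "XA": ("transition", Kind.NULL, Kind.REPOSITORY),
}


def operations_producing(target: Kind, family: str) -> list:
    """List the operations of one family that yield an asset of the given kind."""
    return sorted(
        op for op, (fam, _source, tgt) in OPERATIONS.items()
        if fam == family and tgt == target
    )
```

For instance, `operations_producing(Kind.REPOSITORY, "transformation")` lists the two ways of obtaining a repository asset without crossing the reuse axis, namely AR and NR.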
3.3 Reuse Scenarios
We define any sequence of elementary operations as a
scenario. As an example for such a scenario, we start this
section with a short case study, which will be utilized in the
sections that follow.
Fig. 2. Assets and reuse operations.
Case Study 1. Two software products are being developed
simultaneously, each of which needs to employ a queue of
elements. Although one of the products is to be programmed
in Java and the other in C++, the software manager realizes
that both need a queue class design, which will apparently be
identical in both products. Moreover, a previous product has
implemented a queue package in Ada, which provides the same
functionality. Also, anticipating that a queue class may be
useful for further products in the future, the software manager
decides to assign a software engineer to reverse-engineer the
Ada package and turn it into an object-oriented queue class
design. The entire effort may now be decomposed in terms of
the elementary reuse operations, as follows (see Fig. 3):
1. First, the Ada queue package ($Q^{P}_{\mathrm{Ada}}$) is identified in the
   source product, copied into the repository, and complemented with
   metadata, resulting in a repository queue package
   ($Q^{R}_{\mathrm{Ada}}$). This operation was previously defined as Mining
   and Cataloging (MC).
2. Because there was no reusable design available, the repository queue
   package is reverse-engineered, obtaining a queue class design in the
   repository ($Q^{R}_{\mathrm{Class}}$), ready for reuse by both products.
   This operation was previously defined as Adaptation for Reuse (AR).
3. Next, a copy of the queue class design is obtained by both the Java
   product ($Q^{J}_{\mathrm{Class}}$) and the C++ product
   ($Q^{C}_{\mathrm{Class}}$). These operations have been defined as
   Catalog Acquisition (CA).
4. Finally, the products need the class design to be turned into code
   classes in Java ($Q^{J}_{\mathrm{Code}}$) and C++
   ($Q^{C}_{\mathrm{Code}}$), respectively. This is considered to be
   White-Box reuse (WB) because private modifications were required for
   both products.
The sequence of operations performed, namely,
MC → AR → CA → WB, is the scenario the software engineers
performed in order to obtain the two queue classes
from the Ada queue package. Later, we will compare the
cost of this scenario with the cost of other possible scenarios
for achieving the same result.
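A scenario such as the one above can be written down as a list of operation abbreviations and checked mechanically. The sketch below is a minimal illustration under an added assumption: the text defines a scenario as any sequence of elementary operations, whereas the check here additionally requires that each operation start from the kind of asset the previous one produced. The names `SIGNATURES` and `is_chain` are hypothetical, and CP is assumed to go from a private asset to a private asset.

```python
# Source and target asset kind per operation:
# "P" = private asset, "R" = repository asset, "-" = null asset (nothing to start from).
SIGNATURES = {
    "AR": ("R", "R"), "NR": ("-", "R"), "WB": ("P", "P"), "ND": ("-", "P"),
    "CA": ("R", "P"), "BB": ("R", "P"), "MC": ("P", "R"), "CP": ("P", "P"),
    "XA": ("-", "R"),
}


def is_chain(scenario):
    """Return True if every operation consumes the kind of asset produced before it."""
    for previous, following in zip(scenario, scenario[1:]):
        produced = SIGNATURES[previous][1]
        consumed = SIGNATURES[following][0]
        if produced != consumed:
            return False
    return True


case_study_1 = ["MC", "AR", "CA", "WB"]
```

Here `is_chain(case_study_1)` holds because mining yields a repository asset, adaptation keeps it in the repository, acquisition turns it into a private asset, and white-box reuse modifies that private asset. A sequence such as `["MC", "WB"]` fails the check, since white-box reuse cannot start from a repository asset.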
A reuse scenario is any sequence of elementary operations
performed while practicing reuse. In order to determine the
optimal scenario through which the final target can be
obtained from the original source assets, we must be able to
compare the relative cost of alternative scenarios. Accord-
ingly, next we define the cost of the elementary operations
from which the cost of typical reuse scenarios can be
computed or estimated and compared.
4.1 Elementary Cost Components
We can now associate a cost factor with each of the nine
basic reuse operations of Section 3.2. In general, each
operation involves two assets: a source asset, the asset to be
copied or modified, and a target asset, the copy or the
modification obtained from the source. New assets, though,
are obtained from scratch. However, in order to deal
uniformly with all cases, we will assume the existence of a
null asset, denoted by $\emptyset$, which is considered to be the
source asset in cases in which the target asset is new. This is
reflected in Fig. 4.
4.1.1 Cost of Transformation Operations
$C_{AR}(R, R')$ is the cost of Adaptation for Reuse (AR), derived
from the direct effort (in person-hours, for example)
invested in obtaining repository asset $R'$ from another
repository asset $R$, in a way that will result in the
compatibility of $R'$ with the reuse standards applicable for
core assets in the product line.
$C_{NR}(\emptyset, R')$ is the cost of developing $R'$ from
scratch as New for Reuse (NR).
$C_{WB}(P, P')$ is the cost of White-Box reuse (WB), derived
from the direct effort invested in performing private
modifications to $P$, at the specific product level, to obtain
$P'$. In practice, the cost of white-box reuse is usually smaller
than adaptation for reuse. Private modification focuses on
only those changes needed to adapt the private asset for its
specific needs, within the context of the specific product,
whereas adaptation for reuse takes into consideration
several products that might reuse that asset.
$C_{ND}(\emptyset, P')$ is the cost of constructing the
target private artifact $P'$ from scratch as New Development (ND).
4.1.2 Cost of Transition Operations
$C_{CA}(R, P)$ is the cost of Catalog Acquisition (CA), which is
the cost involved in replicating a copy of the repository
asset $R$ as a private asset $P$; this cost also includes the search
and evaluation efforts before an asset is selected. Although
in most cases this cost is expected to be close to zero, license
or royalty fees may apply as well.
$C_{CP}(\emptyset, P)$ is the cost of Copy and Paste (CP),
incurred in searching for and acquiring an uncataloged
artifact from some external source.
$C_{MC}(P, R)$ is the cost of Mining (M) a copy of a private
asset $P$ from a specific product and Cataloging (C) it as a
repository asset $R$. This cost typically includes determining
$R$’s metadata and storing the metadata in the repository for
reuse. MC does not include any changes made to the artifact.
$C_{XA}(\emptyset, R)$ is the cost of eXternally Acquiring
(XA) a COTS asset $R$ and cataloging it in the repository.
This is usually the purchase price.
$C_{BB}(R', P')$ is the cost of Black-Box (BB) reuse. It is the
cost of the direct effort invested in acquiring a cataloged
repository asset $R'$ and integrating it into a specific product.
Because the reused artifact itself is not changed, there is no
modification cost as such. However, extra effort may be
required to adapt the architecture, interfaces, or other
artifacts of the target product in order to enable the asset to
be integrated. This is a typical cost when COTS artifacts are
considered for a product.
Fig. 3. Reuse scenario of Case Study 1.
Fig. 4. The elementary components of the cost of reuse.
4.2 Applying a Costing Policy
In Section 2.2, we addressed the typical costs incurred
during reuse-based development (reuse costs). In the
following, we show how costing policies are used to
incorporate these costs in our model. We assume that the
reuse costs can be measured or estimated by the company,
but the difficult question is how to apply these costs to
specific reuse scenarios.
We view a costing policy as a mapping between the
reuse costs of Section 2.2 and the costs of the reuse-related
operations of Section 4.1. Table 1 shows an example of such
a policy. Using this table, the cost of each basic operation is
the sum of the applicable reuse costs, along the relevant
row, as follows:
• An empty entry denotes that the corresponding cost is not applicable to
  the operation in question and, therefore, its contribution to the total
  is zero. For example, asset acquisition cost is not applicable to any
  transformation operations because they are performed only on artifacts
  that already exist in the product or in the repository.
• “Full” means that the measured or estimated cost is charged in full to
  the cost of the operation. This is usually the case when all of the cost
  is incurred in obtaining the resulting artifact. For example, the full
  product development cost is charged to the white-box reuse operation
  ($C_{WB}(P, P')$), reflecting the direct cost of obtaining $P'$ from $P$.
• “Part” means that part of the measured or estimated cost is charged to
  the cost of the operation. This is usually the case when costs are
  incurred during activities that do not necessarily involve the resulting
  artifact. For example, the cost of repository establishment and
  maintenance is partially charged to any transition operation accessing
  the repository.
Table 1. A Costing Policy
We remark that the concept of the costing policy table
can be easily adapted to any other cost directives affecting
the process (training, for example) by adding a new column
to the table and deciding how to allocate the total cost to the
basic reuse operations. We also believe that this tool can be
easily employed in formalizing the various funding
strategies of [5].
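The mapping idea can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not the policy of Table 1: the cost figures, the category keys, the 0.25 share, and the names `reuse_costs`, `policy`, and `operation_cost` are all hypothetical. A "Full" entry is represented as a share of 1.0, a "Part" entry as a share between 0 and 1, and an empty entry is simply absent from the row.

```python
# Measured or estimated reuse costs per category, in person-hours (hypothetical figures).
reuse_costs = {
    "asset_acquisition": 4.0,
    "asset_development": 40.0,
    "repository_maintenance": 200.0,
}

# One row per operation: the share of each cost category charged to that operation.
# 1.0 stands for "Full", a value between 0 and 1 for "Part"; missing keys are empty entries.
policy = {
    "WB": {"asset_development": 1.0},
    "CA": {"asset_acquisition": 1.0, "repository_maintenance": 0.25},
}


def operation_cost(operation, policy, reuse_costs):
    """Sum the applicable reuse costs along the policy row of one operation."""
    row = policy[operation]
    return sum(share * reuse_costs[category] for category, share in row.items())


wb_cost = operation_cost("WB", policy, reuse_costs)
ca_cost = operation_cost("CA", policy, reuse_costs)
```

With these made-up numbers, white-box reuse is charged the full 40 person-hours of development, while catalog acquisition is charged its own acquisition cost plus a quarter of the repository maintenance cost. Adding a further cost directive such as training would amount to adding one more key to `reuse_costs` and to the affected rows.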
4.3 Typical Reuse Scenarios and Their Costs
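In Case Study 1, we demonstrated a specific reuse
scenario, whose total cost, according to the costing policy
of Table 1, is
$$C = C_{MC}(Q^{P}_{\mathrm{Ada}}, Q^{R}_{\mathrm{Ada}}) + C_{AR}(Q^{R}_{\mathrm{Ada}}, Q^{R}_{\mathrm{Class}}) + C_{CA}(Q^{R}_{\mathrm{Class}}, Q^{J}_{\mathrm{Class}}) + C_{CA}(Q^{R}_{\mathrm{Class}}, Q^{C}_{\mathrm{Class}}) + C_{WB}(Q^{J}_{\mathrm{Class}}, Q^{J}_{\mathrm{Code}}) + C_{WB}(Q^{C}_{\mathrm{Class}}, Q^{C}_{\mathrm{Code}}),$$
where
1. $C_{MC}(Q^{P}_{\mathrm{Ada}}, Q^{R}_{\mathrm{Ada}})$ is the cost of Mining and
   Cataloging the Ada queue package $Q^{P}_{\mathrm{Ada}}$ into the
   repository queue package $Q^{R}_{\mathrm{Ada}}$,
2. $C_{AR}(Q^{R}_{\mathrm{Ada}}, Q^{R}_{\mathrm{Class}})$ is the cost of
   reverse-engineering the queue class design $Q^{R}_{\mathrm{Class}}$,
3. $C_{CA}(Q^{R}_{\mathrm{Class}}, Q^{J}_{\mathrm{Class}})$ and
   $C_{CA}(Q^{R}_{\mathrm{Class}}, Q^{C}_{\mathrm{Class}})$ are the costs of
   acquiring $Q^{R}_{\mathrm{Class}}$ for the specific products, and
4. $C_{WB}(Q^{J}_{\mathrm{Class}}, Q^{J}_{\mathrm{Code}})$ and
   $C_{WB}(Q^{C}_{\mathrm{Class}}, Q^{C}_{\mathrm{Code}})$ are the costs of
   programming the Java and the C++ queue classes, respectively.
The total cost can easily be compared to the cost of
developing both $Q^{J}_{\mathrm{Code}}$ and $Q^{C}_{\mathrm{Code}}$ from scratch for the
purpose of estimating the cost-effectiveness of reusing $Q^{P}_{\mathrm{Ada}}$
in this fashion.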
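The comparison can be illustrated numerically. The following Python sketch uses invented person-hour estimates, so every figure is an assumption and not data from the case study. It adds up the six operations of the reuse scenario and sets the result against developing the two queue classes from scratch with New Development (ND). The names `reuse_steps`, `scratch_steps`, and `total` are hypothetical.

```python
# Hypothetical person-hour estimates for the six operations of Case Study 1.
reuse_steps = [
    ("MC", "Ada queue package -> repository", 6),
    ("AR", "reverse-engineer into a queue class design", 30),
    ("CA", "class design -> Java product", 1),
    ("CA", "class design -> C++ product", 1),
    ("WB", "class design -> Java queue class", 16),
    ("WB", "class design -> C++ queue class", 20),
]

# Hypothetical estimates for building both classes from scratch.
scratch_steps = [
    ("ND", "Java queue class", 45),
    ("ND", "C++ queue class", 50),
]


def total(steps):
    """Total cost of a scenario: the sum of the costs of its elementary operations."""
    return sum(hours for _operation, _description, hours in steps)


reuse_total = total(reuse_steps)
scratch_total = total(scratch_steps)
saving = scratch_total - reuse_total
```

With these made-up numbers, the reuse scenario costs 74 person-hours against 95 for pure development, a saving of 21. Different estimates could just as well reverse the outcome, which is why the model asks for the comparison to be computed rather than assumed.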
We now generalize the case study and consider a variety
of typical reuse scenarios. In each of these scenarios, there
are n products in which product i requires target software
component $T_i$, $i = 1, \ldots, n$. We further assume that there is
significant commonality among the $T_i$, $i = 1, \ldots, n$. In
addition, there exists a private component S in an existing
product from which every $T_i$ may be obtained by performing
certain modifications. We consider the following typical
scenarios, from pure development to systematic reuse:
Pure Development (PD) scenario. This is the case in which
each group responsible for one of the n products is unaware
of the existence of S and, therefore, develops its target
component $T_i$ from scratch (see Fig. 5). In Figs. 5, 6, 7, 8, and
9, shadowed arrows represent repeated application of the
same activity in separate products.
Assuming that the cost of developing a new component $T_i$
is $C_{ND}(\emptyset, T_i)$, the total cost of the PD scenario is
$$C_{PD} = \sum_{i=1}^{n} C_{ND}(\emptyset, T_i).$$
Opportunistic Reuse (OR) scenario. In this case, each group
responsible for one of the n products knows that there exists
a viable source S, but it is not stored in any shared
repository or registered in a public catalog. Therefore, each
of the products invests independently in searching for S,
obtaining a copy of it as a private source asset $S_i$ (at a cost of
$C_{CP}(\emptyset, S_i)$) and then modifying it (white-box
reuse) for utilization in target component $T_i$ (see Fig. 6).
Thus, the cost of the entire OR scenario is
$$C_{OR} = \sum_{i=1}^{n} \left( C_{CP}(\emptyset, S_i) + C_{WB}(S_i, T_i) \right).$$
Although opportunistic reuse may result in some savings
for a specific product, it is nevertheless performed on an
individual product basis. Opportunistic reuse therefore
does not benefit from common activities or a public catalog.
Controlled Reuse (CR) scenario. Suppose that a core-asset
repository has been established in which assets are stored or
cataloged for the benefit of other products. Suppose further
that a group, independent of any specific product, is
responsible for looking for private assets in specific products,
mining them, and cataloging them in the repository. In this
scenario, a copy of the private component S is first mined and
cataloged as a repository asset $S_R$ (at the cost of $C_{MC}(S, S_R)$).
Other than the mining and cataloging effort, no other effort is
invested in modifying S, and it is made available for other
products for acquisition in its original form. Thus, each of the
products will acquire a copy $S_i$ of $S_R$ (at a cost of $C_{CA}(S_R, S_i)$)
and then reuse $S_i$ (white-box reuse), similarly to the previous
scenario. The total cost of the CR scenario is then (see Fig. 7)
$$C_{CR} = C_{MC}(S, S_R) + \sum_{i=1}^{n} \left( C_{CA}(S_R, S_i) + C_{WB}(S_i, T_i) \right).$$
Fig. 5. Pure development scenario.
Fig. 6. Opportunistic reuse scenario.
Fig. 7. Controlled reuse scenario.
Fig. 8. Systematic reuse scenario.
Fig. 9. Alternative systematic reuse scenario.
At first sight, the savings of the CR scenario over the
OR scenario do not appear to be significant. However, in
practice, the mining and cataloging effort is performed
just once in the CR scenario, in contrast to n times in the
OR scenario. Furthermore, it is likely that cost of the
catalog acquisition is close to zero.
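The comparison can be made explicit by subtracting the two totals. The sketch below is illustrative rather than part of the model's definition: $C_{OR}$ and $C_{CR}$ are shorthand introduced here for the two scenario totals, the extra subscript $i$ marks the cost incurred by product $i$, and the white-box effort is assumed (for this sketch only) to be the same in both scenarios.

```latex
\begin{align*}
C_{OR} - C_{CR}
  &= \sum_{i=1}^{n}\bigl(C_{CP,i} + C_{WB,i}\bigr)
     - \Bigl(C_{MC} + \sum_{i=1}^{n}\bigl(C_{CA,i} + C_{WB,i}\bigr)\Bigr) \\
  &= \sum_{i=1}^{n}\bigl(C_{CP,i} - C_{CA,i}\bigr) - C_{MC}.
\end{align*}
```

Under that assumption, controlled reuse is the cheaper option whenever the $n$ separate search-and-copy efforts cost more than one mining and cataloging effort plus $n$ catalog acquisitions. If acquisition is close to free, the condition reduces to $\sum_{i} C_{CP,i} > C_{MC}$.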
Systematic Reuse (SR) scenario. The ultimate goal of reuse
processes is to have a set of assets that are readily available
for reuse in all future products without further modification.
In our case, a copy of $S$ will first be mined and
cataloged as a repository asset $S_R$, at the cost of $C_{MC}(S, S_R)$.
Then, $S_R$ will undergo modifications to convert it into a
reusable asset $S_R'$, at the cost of $C_{AR}(S_R, S_R')$. Finally, each
of the products will acquire a copy of $S_R'$ as its private
target asset $T_i$ (black-box reuse). The cost of integrating $S_R'$
into product $i$ is $C_{BB}(S_R', T_i)$, $i = 1, \ldots, n$. The total cost of
the SR scenario is therefore (see Fig. 8)
$$C_{MC}(S, S_R) + C_{AR}(S_R, S_R') + \sum_{i=1}^{n} C_{BB}(S_R', T_i).$$
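An alternative form of systematic reuse is based on New
for Reuse development (instead of adaptation). In this case,
shown in Fig. 9, the cost would be
$$C_{NR}(S_R') + \sum_{i=1}^{n} C_{BB}(S_R', T_i),$$
where $S_R'$ is the reusable repository asset.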
5.1 Implementation by the Software Industry
A version of this model was implemented and deployed by
ISWRIC (Israel SoftWare Reuse Industrial Consortium), a
joint project of seven leading Israeli industrial companies
supported, in part, by the Chief Scientist of the Israeli
Ministry of Trade and Industry. Three of these companies
are mainly defense contractors, developing specific custo-
mer-oriented systems, whereas the other four produce off-
the-shelf or customizable commercial products. The model
is deployed as a common measurement tool with which
data from industrial pilot projects are collected and
This two-phase, three-year project commenced in June
2000. During the first phase, a common software reuse
methodology was developed, based on existing approaches
and practices, but modified to take into account the specific
requirements of the consortium members. During the
second phase, all seven participating companies implemen-
ted the methodology in real projects. Each company tailored
the model to the specific needs of its pilot projects and
evaluated the methodological aspects relevant to the pilot
projects and the company.
The model is currently being used to compute and
compare the potential costs and benefits of alternative reuse
scenarios. It is also used for periodical reports of economic
aspects of software reuse programs. Of the seven participat-
ing companies, six have defined and evaluated their reuse
scenarios using the model and have used it as an aid to
justify their selected reuse approach. Scenarios defined by
these six companies include:
• Systematic reuse, implemented by five companies.
Of these companies, three are implementing systematic
reuse based on new development of reusable
assets, and two are implementing systematic reuse
based on existing assets adapted for reuse.
• Controlled reuse, implemented by one company.
This company has selected a reuse policy in which
predefined degree of maturity from where each
product can adapt these assets (by means of white-
box reuse) to its specific needs.
The data were gathered, presented, and analyzed
periodically by the consortium members using the model,
as part of periodic project reports and information
Use of the model in the industry to date has demon-
strated the potential benefits of the various reuse scenarios.
The model also provided a common ground for the
presentation and discussion of results at ISWRIC meetings.
5.2 Description of an Industrial Case Study
In this section, we show how the cost model was used at
Tadiran Electronic Systems Ltd. (TES), one of the seven
companies participating in ISWRIC. The results are based
on data collected at TES from 2000 to 2003.
TES is a systems house that develops large-scale soft-
ware-intensive systems to meet specific requirements as
defined by its customers. Most projects have a well-defined
time scale and a budget under a strict contract. Products are
divided into “families” of significant similarity. During the
course of time, the company has acquired many software
assets that are potentially reusable. In practice, however, the
only reuse performed has been opportunistic, based on
individual knowledge. In view of the uncertainty of the
market, it is difficult to anticipate the potential number of
future reuses of an asset and, therefore, the company faces a
constant conflict between the immediate needs of products
versus the need to invest in a long-term infrastructure.
As part of TES’s participation in ISWRIC, TES decided to
adopt the Controlled Reuse scenario for the most part, but
the Systematic Reuse scenario was followed in several
specific cases. The company established a reuse repository
in which assets were cataloged along with their metadata
(but not necessarily ready for black-box reuse). It was
decided to allocate resources to each product team in order
to perform the Adaptation for Reuse activity within the
scope of their product, thus allowing the team to take into
account the needs of all current and future anticipated
reuses of the asset.
The reuse cost model was implemented in one depart-
ment of TES. As a first step, experienced systems and
software engineers performed a thorough domain analysis
in the department. As a result of this analysis, a software
architecture common to most products in the department
was defined, and assets that share a high degree of
commonality among products were identified as candidates
for reuse. These assets were analyzed in depth and later
were cataloged in a reuse repository, along with their
metadata. Eventually, it was decided that some of them
would be adapted for reuse (by adding functionality,
enveloping, improving quality and documentation, etc.),
whereas the others would be rewritten as new assets
(because of their poor quality, technological changes, etc.).
The adaptation for reuse was performed within the context
of three large-scale projects that were running concurrently.
5.3 Results of the Industrial Case Study
The cost model was used to evaluate the cost-effectiveness
of the selected reuse scenarios, over other alternative
scenarios, with respect to seven assets. Four of the assets
(numbered 1 through 4) were software modules that
interface with special-purpose sophisticated hardware
elements, whereas the other three (numbered 5 through 7)
were generic human-computer interface modules that have
to be configured for specific systems.
First, the cost (in person-hours) of each of the basic
operations was determined, as shown in Table 2. The costs
in boldface were actually measured, whereas the others
were estimated by experienced software and systems
engineers. The total cost of the domain analysis, 770 per-
son-hours, was evenly distributed among the seven assets.
Assets 1, 2, 3, 4, and 7 were adapted for black-box reuse and
then were integrated into their target products. Assets 5 and
6 were stored in the repository in their original form, after
domain analysis, and later were acquired by their target
products and adapted for specific use by applying white-
box reuse. The table also includes the number of products
that reused each asset in the pilot study.
As previously mentioned, the seven assets went through
two different scenarios (see Section 3.3): Assets 1, 2, 3, 4, and
7 went through the Systematic Reuse scenario, comprising
mining and cataloging (MC), adaptation for reuse (AR), and
black-box reuse (BB) activities. Assets 5 and 6 went through
the Controlled Reuse scenario, comprised of mining and
cataloging (MC), catalog acquisition (CA), and white-box
reuse (WB). The calculation of the respective costs of these
scenarios, based on the data of Table 2, was performed
using the following formulas, where $n$ represents the
number of reuses of the asset in the pilot study:
$$\text{Systematic Reuse cost (adapted assets)} = C_{MC} + C_{AR} + n \cdot C_{BB}$$
$$\text{Controlled Reuse cost} = C_{MC} + n \cdot (C_{CA} + C_{WB})$$
The costs of the actual scenarios for each of the seven
assets are shown in boldface in the first and third lines
of Table 3. In order to evaluate the cost-effectiveness of
these scenarios, they were compared with three other
possible scenarios (see Section 3.3):
• Systematic Reuse of newly developed assets, comprised
of Mining and Cataloging (MC) (which had
been done anyway), New for Reuse (NR), and
Black-Box reuse (BB);
• Opportunistic Reuse, comprised of (separately for
each product) copy and paste (CP) and White-Box
reuse (WB); and
• Pure Development, comprised of New Development (ND).
Table 2. The Cost of Basic Operations for Seven Assets. Costs in boldface were measured; the others were estimates.
The calculation of the respective costs of these scenarios,
based on the data of Table 2, was performed using the
following formulas, where $n$ represents the number of
reuses of the asset in the pilot study:
$$\text{Systematic Reuse cost (new assets)} = C_{MC} + C_{NR} + n \cdot C_{BB}$$
$$\text{Opportunistic Reuse cost} = n \cdot (C_{CP} + C_{WB})$$
$$\text{Pure Development cost} = n \cdot C_{ND}$$
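The scenario formulas of this section are simple enough to evaluate mechanically. The following Python sketch is illustrative only: the function name `scenario_costs`, the dictionary keys, and the sample person-hour figures are assumptions made for the example and are not taken from Table 2.

```python
from typing import Dict


def scenario_costs(c: Dict[str, float], n: int) -> Dict[str, float]:
    """Total cost of each reuse scenario for one asset reused n times.

    `c` maps basic-operation abbreviations (MC, AR, NR, BB, CA, WB, CP, ND)
    to their cost in person-hours. Per-reuse costs are assumed to be the
    same for every product, as in the case-study formulas.
    """
    return {
        "Systematic Reuse (adapted)": c["MC"] + c["AR"] + n * c["BB"],
        "Systematic Reuse (new)": c["MC"] + c["NR"] + n * c["BB"],
        "Controlled Reuse": c["MC"] + n * (c["CA"] + c["WB"]),
        "Opportunistic Reuse": n * (c["CP"] + c["WB"]),
        "Pure Development": n * c["ND"],
    }


# Hypothetical person-hour figures, NOT taken from Table 2.
example = {"MC": 110, "AR": 200, "NR": 500, "BB": 50,
           "CA": 5, "WB": 150, "CP": 20, "ND": 400}

costs = scenario_costs(example, n=3)
cheapest = min(costs, key=costs.get)

# List the scenarios from cheapest to most expensive.
for name, value in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{name:28s} {value:7.0f} p-h")
print("Cheapest:", cheapest)
```

With measured or estimated values in place of the sample figures, the same function yields one column of a cost comparison such as Table 3.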
We deduce from Table 3 that, except for Asset 5, the
reuse scenario that was implemented in practice achieved
the least cost over all its alternatives. As for Asset 5, Table 3
shows that the cost would have been less if Systematic
Reuse (adapted) had been implemented instead of Con-
trolled Reuse.
In addition to the costs themselves, we can now give a
more quantitative meaning to cost-effectiveness. Suppose
that we chose to implement reuse scenario $S$ over
alternative reuse scenario $S'$. Suppose also that we
measured $C_S$, the cost of $S$, while estimating $C_{S'}$, the cost
of $S'$. As a consequence of our choice, we nominally saved
$C_{S'} - C_S$, which represents $100 \cdot (C_{S'} - C_S)/C_{S'}$
percent savings.
For example, the cost of Systematic Reuse for Asset 1 was
810 p-h, whereas the alternative Pure Development scenario
was estimated at 1,400 p-h. Therefore, the savings were
(1,400 - 810)/1,400, or 42 percent.
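The relative-savings calculation can be written as a one-line function. This is a minimal sketch; the name `relative_savings` and the input check are assumptions of the example, while the two figures are the Asset 1 values quoted in the text.

```python
def relative_savings(actual_cost: float, alternative_cost: float) -> float:
    """Percent saved by the chosen scenario relative to the alternative."""
    if alternative_cost <= 0:
        raise ValueError("alternative_cost must be positive")
    return (alternative_cost - actual_cost) / alternative_cost * 100.0


# Asset 1: Systematic Reuse cost 810 p-h; Pure Development estimated at 1,400 p-h.
savings = relative_savings(810, 1400)
print(f"{savings:.0f} percent")
```

A negative result means that the chosen scenario cost more than the alternative it is compared with.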
Table 4 presents the cost-effectiveness of the actual reuse
scenarios implemented for each of the assets relative to
three other scenarios: the counterpart scenario, Opportu-
nistic Reuse (which has been popular in the company), and
Pure Development (which always appears to be the natural
We can see from Table 4 that, if the choice were between
only Systematic Reuse and Controlled Reuse, then the
largest relative savings (63 percent) would have resulted
from the Systematic Reuse of Asset 7. Conversely, the worst
choice would have been the Controlled Reuse of Asset 5 at a
cost of 28 percent relative to the alternative. In comparison
to the other referenced scenarios, we can see that the relative
savings obtained by implementing the preferred scenario
over Opportunistic Reuse were between 1 and 65 percent. In
comparison to Pure Development, the relative savings were
even more dramatic—between 41 and 81 percent.
Table 3. The Cost of Alternative Reuse Scenarios for Seven Assets
Table 4. Relative Savings of Alternative Reuse Scenarios
The data and the analysis of the industrial case study of
TES validate the capability of our model to express reuse
cost-effectiveness in a simple and straightforward way.
Nevertheless, the most significant power of the model lies
in its ability to predict the relative cost-effectiveness of
future reuse, in order to select the best scenario. This
prediction is calculated, as previously shown, by aggregat-
ing estimations of simple and well-defined activities. It is
not surprising that the most significant parameter is the
anticipated number of future reuses of the relevant asset. In
general, we cannot predict this number accurately. How-
ever, calculating the costs of a number of alternative
scenarios for a variable number of future reuses may yield
the cost-effectiveness trend, revealing the break-even point
for each of those alternatives.
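A break-even analysis of this kind might be sketched as follows. The code assumes that each scenario can be described by a one-time investment plus a constant cost per reuse; the function names and all person-hour figures are hypothetical and are not taken from the case study.

```python
from typing import Optional


def cumulative_cost(fixed: float, per_reuse: float, n: int) -> float:
    """Cost of a scenario after n reuses: one-time investment plus n repeats."""
    return fixed + n * per_reuse


def break_even(fixed_a: float, per_a: float,
               fixed_b: float, per_b: float,
               max_reuses: int = 100) -> Optional[int]:
    """Smallest n in 1..max_reuses at which scenario A is cheaper than B."""
    for n in range(1, max_reuses + 1):
        if cumulative_cost(fixed_a, per_a, n) < cumulative_cost(fixed_b, per_b, n):
            return n
    return None  # A never becomes cheaper within the range examined


# Hypothetical person-hour figures, not taken from the case study.
# A: systematic reuse (MC + AR up front, BB per reuse).
# B: opportunistic reuse (nothing up front, CP + WB per reuse).
sr_fixed, sr_per = 310.0, 50.0
or_fixed, or_per = 0.0, 170.0

# Tabulate both cumulative costs for 1 to 7 anticipated reuses.
for n in range(1, 8):
    print(n,
          cumulative_cost(sr_fixed, sr_per, n),
          cumulative_cost(or_fixed, or_per, n))

first_n = break_even(sr_fixed, sr_per, or_fixed, or_per)
print("Systematic reuse becomes cheaper at reuse number:", first_n)
```

Because the anticipated number of reuses is uncertain, tabulating the costs over a range of values, rather than computing a single figure, shows how sensitive the choice of scenario is to that estimate.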
Fig. 10 shows the cumulative cost of reusing Asset 4, for
all five scenarios of Table 3, for anticipated number of
reuses from 1 to 7. It is clear that the Systematic Reuse
scenario, based on adapting existing assets, is the most
beneficial from the second reuse onward. However, when
only one reuse is anticipated, then Opportunistic Reuse is
preferred over Controlled Reuse; Systematic Reuse can be
The model described in this paper is a powerful tool that
can be utilized to evaluate and compare all possible
scenarios of reuse-based development, regardless of the
specific costing policies used. The major contribution of this
model is the clear identification of the basic operations
involved, together with the ability to associate a cost
component with each basic operation in a focused and
accurate way. The evaluation is largely based on estimated
data and is therefore subject to the inaccuracies of
estimation. Moreover, in different reuse contexts (such as
different units in an organization, applying reuse at
different times, or different individuals applying reuse),
estimation might vary significantly. However, when this
model is applied consistently within an organization, the
accumulated data can be analyzed for better estimates in the
future. We are currently analyzing the data obtained from
the six companies in order to define learning curves and
derive “organizational factors” that can be applied for
better estimations.
There are two other contributions. First, the cost model
provides a practical tool for developers to weigh and
evaluate different reuse scenarios, based on accumulated
organizational data, and decide what option to choose in a
given situation. Second, it provides a systematic mechanism
for management to analyze and evaluate various reuse
alternatives at the organizational level.
As described in Section 3, this model is derived from the
3D evolution-tree model [11]. In fact, it is a projection of the
entire evolution tree onto the 2D plane defined by the
transition operations (reuse) axis and the transformation
operations (development and maintenance) axis. However,
we are currently investigating a broader model, which
covers all aspects of reuse-based development.
This paper focuses on comparison of cost, which is
usually the most important factor in reuse-based develop-
ment. However, other factors, such as time-to-market and
product quality, are also expected to improve by reusing
software assets. We are currently working on extending our
model to include these other factors, too.
Fig. 10. Alternative reuse scenarios for Asset 4.
The discussion of cost in this paper is viewed from the
standpoint of the entire product line and considers the
cumulative cost of the entire reuse effort. However,
managers of an individual product are concerned with the
direct cost of that product within the context of the
appropriate infrastructure (such as a repository) and central
activities (such as domain analysis). It is often expected that
these will be financed centrally, by the organizational R&D
group, for example. The development of a cost model,
based on that costing policy, is another direction for future
The model was developed within the framework of the
Israeli Software Reuse Industrial Consortium, a group of
seven leading Israeli systems developers. The mutual
lessons learned by the consortium, the different software
reuse scenarios applied by them, and the use of our model
in the different scenarios are other issues for future
The authors would like to thank the members of both the
management and technical committees of the ISWRIC
project who worked together and contributed numerous
useful suggestions toward clarifying and stabilizing the
model: Michael Vinokur (IAI—Israel Aircraft Industry),
Itzhak Lavi (IAI), Varda Barzilay (ECI Telecom), Arieh
Stroul (Creo), Shlomit Morad (Orbotech), Guy Pe’er
(Orbotech), Rami Rashkowitz (Rafael), Anat Grynberg
(NICE), and Moshe Salem (Iltam). The work of Stephen R.
Schach was supported in part by the US National Science
Foundation under grant number CCR-0097056.
[1] I. Jacobson, M. Griss, and P. Johnsson, Software Reuse, Architecture,
Process, and Organization for Business Success. Addison-Wesley,
[2] B. Boehm, “Managing Software Productivity and Reuse,” Compu-
ter, vol. 32, no. 9, pp. 111-113, Sept. 1999.
[3] P. Clements and L.M. Northrop, Software Product Lines: Practices
and Patterns. Addison-Wesley, 2001.
[4] Domain Analysis and Software Systems Modeling, R. Prieto-Díaz and
G. Arango, eds. IEEE CS Press, 1991.
[5] “A Framework for Software Product Line Practice, Version 4.1,”
Software Eng. Inst., Carnegie Mellon Univ., 2004, http://www.
[6] “Software Technology for Adaptable, Reliable Systems, Air Force/
STARS Demonstration Project Experience Report,” Version 3.1,
vol. I, USAF Material Command, Electronics Systems Center,
Hanscom AFB, Apr. 1996.
[7] J.S. Poulin, Measuring Software Reuse. Addison-Wesley, 1997.
[8] B.H. Barnes and T.B. Bollinger, “Making Reuse Cost-Effective,”
IEEE Software, vol. 8, no. 1, pp. 13-24, Jan./Feb. 1991.
[9] R. Malan and K. Wentzel, “Economics of Software Reuse
Revisited,” Technical Report HPL-93-31, Hewlett-Packard, 1993.
[10] E. Wiles, “Economic Models of Software Reuse: A Survey,
Comparison and Partial Validation,” Technical Report UWA-
DCS-99-032, Dept. of Computer Science, Univ. of Wales,
Aberystwyth, U.K., Apr. 1999.
[11] W. Lim, “Reuse Economics: A Comparison of Seventeen Models
and Directions for Future Research,” Proc. Fourth Int’l Conf.
Software Reuse, pp. 41-51, Apr. 1996.
[12] S.R. Schach and A. Tomer, “Development/Maintenance/Reuse:
Software Evolution in Product Lines,” Proc. First Software Product
Line Conf., pp. 437-450, Aug. 2000.
[13] A. Tomer and S.R. Schach, “A Three-Dimensional Model for
System Design Evolution,” Systems Eng., vol. 5, no. 4, pp. 264-273,
Amir Tomer received the BSc and MSc degrees
in computer science from the Technion, Israel,
and the PhD degree in computing from Imperial
College, London, United Kingdom. He is the
director of Systems and Software Engineering
Processes at RAFAEL Ltd., Israel, where he has
been since 1982, holding a variety of systems
and software engineering positions, both tech-
nical and managerial. He also teaches software
engineering at the Technion and other colleges.
He is a member of the IEEE.
Leah Goldin received the PhD degree in
computer science from the Technion, Israel,
where her research focused on requirements
engineering. As the CEO of Golden Solutions,
she is an independent consultant specializing in
software engineering, process, and quality. She
has accumulated more than 20 years of experi-
ence developing embedded systems. During
that period, she has fulfilled various manage-
ment and technical roles, including software
development, SQA, and process improvement, at Rafael, IAI, Com-
verse, and Nice. She currently divides her time between consulting to
high-tech companies and teaching in academia; she was the head of the
Software Engineering Department at the Jerusalem College of
Engineering. She is a senior member of the IEEE and currently serves
as the chair of the Israeli Chapter of the IEEE Computer Society.
Tsvi Kuflik received the BSc and MSc degrees
in computer science and the PhD degree in
industrial engineering (information systems)
from Ben-Gurion University, Israel. He currently
works in the Department of Management In-
formation Systems at the University of Haifa,
Israel. He has 20 years of practical experience in
software and systems engineering and develop-
ment, as well as practical experience in the
design and development of personalized, adap-
table information systems. His research interests include software
engineering, artificial intelligence, and information retrieval. He is a
member of the IEEE and the IEEE Computer Society.
Esther Kimchi received the BSc and MSc
degrees in mathematics from the Technion in
Haifa, Israel, and the PhD degree in mathe-
matics from Tel Aviv University, Israel. She has
10 years of experience in mathematical research
at Tel Aviv University, Israel, and Columbia
University, New York, where she took comple-
mentary studies in computer science. She has
more than 21 years of practical experience in
systems and software engineering and develop-
ment, project management, and company methodology definition at
Tadiran Electronic Systems in Israel. She has actively participated in
various conferences on software engineering and testing. She now
works as an independent consultant.
Stephen R. Schach received the PhD degree
from the University of Cape Town. He is an
associate professor in the Department of Elec-
trical Engineering and Computer Science at
Vanderbilt University, Nashville, Tennessee.
He is the author of more than 115 refereed
research papers. He has written 10 software
engineering textbooks, including Object-Oriented
and Classical Software Engineering, sixth
edition (McGraw-Hill, 2005). He consults inter-
nationally on software engineering topics. His research interests are in
software maintenance and open-source software engineering. He is a
member of the IEEE Computer Society.
. For more information on this or any other computing topic,
please visit our Digital Library at
... Additional costs for mining and the procurement of reusable materials are: 1) the expense of the technical staff and consultants to determine the necessary components for the application; 2) any costs incurred in ensuring that the reusable feature or device performs properly in order to test its potential re-use in the new application (including media transformations, implementation discrepancies, noncurrent documentation, and costs of the existing functionality evaluation for potential reuse); 3) the production of a specification document for the preparation and execution of the re-use of components; and 4) the purchase price and repair cost from outside the organization [ 6,7,8,9,18 ]. ...
... For example, a reusable component built in C #uses the Microsoft Access database. The new application is built in Java using the Oracle database, so the improvements needed to reuse the part the require a great deal of effort [ 6,9]. ...
... Product testing costs include: 1) the development of a test environment; 2) unit testing and debugging; 3) acceptance testing; 4) subsystem and system testing; and 5) testing of functionality by a quality control engineer [6,9]. The interaction of the reused component with the system must be thoroughly tested to guarantee that the functionality is performing as expected. ...
Conference Paper
Measuring the software reusability has become a prime concern in maintaining the quality of the software. Several techniques use software related metrics and measure the reusability factor of the software, but still face a lot of challenges. This work develops the software reusability estimation model for efficiently measuring the quality of the software components over time. Here, the Rider based Neural Network has been used along with the hybrid optimization algorithm for defining the reusability factor. Initially, nine software related metrics are extracted from the software. Then, a holoentropy based log function identifies the Measuring the software reusability has become a prime concern in maintaining the quality of the software. Several techniques use software related metrics and measure the reusability factor of the software, but still face a lot of challenges. This work develops the software reusability estimation model for efficiently measuring the quality of the software components over time. Here, the Rider based Neural Network has been used along with the hybrid optimization algorithm for defining the reusability factor. Initially, nine software related metrics are extracted from the software. Then, a holoentropy based log function identifies the normalized metric function and provides it to the proposed Cat Swarm Rider Optimization based Neural Network (C-RideNN) algorithm for the software reusability estimation. The proposed C-RideNN algorithm uses the existing Cat Swarm Optimization (CSO) along with the Rider Neural Network (RideNN) for the training purpose. Experimentation results of the proposed C-RideNN are evaluated based on metrics, such as Magnitude of Absolute Error (MAE), Mean Magnitude of the Relative Error (MMRE), and Standard Error of the Mean (SEM). The simulation results reveal that the proposed C-RideNN algorithm has improved performance with 0.0570 as MAE, 0.0145 as MMRE, and 0.6133 as SEM.
... Tomer et al. [50] presented for applicable scenarios for software reuse, and proposed a model for evaluating these scenarios in terms of cost. Seven industrial assets were used in evaluating these scenarios and comparing them with the cost of the normal development. ...
... AL-Badareen et al. [32], [33] proposed a framework for extracting, storing and retrieving normal and reusable components during software development lifecycle. The study presents new scenarios for software reuse, which are not considered in Tomer et al. [50]. Therefore, AL-Badareen et al. [19] proposed new model for evaluating the cost of software reuse taking into account the new scenarios. ...
... In each reuse process, the applicable scenarios are identified based on the type of the available components and their sources. In this section, the cost of the reuse processes is discussed based on the dataset published in [50]. The dataset presents the cost of the basic operations for developing seven industrial software components. ...
Full-text available
For many decades, the cost, time and quality are the main concern of software engineering. The main objective of any software organization is to produce high quality software product within a shorter time and minimum cost. Software reuse is one of the main strategies concerns about using available resources to enhance the productivity of software development and the quality of software products. It aims at using existing software products and components in the development of new software systems. However, various types of software components available in different sources are used in the reuse strategy. This makes the reuse strategy confusable and its efficiency and effectiveness debatable. Selecting unsuitable component or scenario makes the reuse inefficient and ineffective. This study discusses the types of software components, their sources, characteristics and applicable scenarios for developing and reusing these components. A dataset from the literature is used to calculate and compare the cost of reuse processes. The results show that software reuse is an efficient strategy comparing with the normal development. Although, considering the reusability of software components required extra cost to the normal development, it could efficiently save the cost of the development of new software system. Moreover, using existing software components in the development of new reusable component is the most efficient strategy, which required even less than the cost of developing normal component.
... The third part will be the effort in reusing the asset in white-box style. Simply, Tomer, et al. [99] defined [24,97,264]. However, the salary or experience of developers is regarded as another aspect of cost in measurement. ...
... Study of software modularity for open source software (OSS) showed that the high level of software modularity within the OSS community should provide motivation to firms to leverage existing OSS code, thereby partly mitigating the high upfront investment cost of building non-firm-specific internal software modularized components inventory. Tomer et al. [99] proposed cost model for comparing different types of software modularity mode. The model was used to compare the potential costs and benefits of alternative software modularity scenarios, which led cost model more independent. ...
Full-text available
There are two critical elements to software development, i.e. quality and effort. Quality is not the final goal for software development. A more important idea behind quality is the ability to fix the problem, maintain and upgrade the software rapidly. Generally, practitioners refer a bug to the failure or error in software programs. Bugs may seriously interfere with the program functionality or user experience. The effort, instead, is less related to the end-users; however, it critically decides if the software could be released in time. The notorious Brooks’ law raises the idea of adding manpower to a late software project makes it even later. The implied characteristics of software effort are difficult to control in practice. Therefore, software practitioners are still calling for more well-established methods to estimate quality and effort. There are a plenty of research that identifies and discusses potential project factors on software economic elements. Former studies have identified multiple factors of team and project that determine software economics. However, there is serious conclusive inconsistency. Team size, as a rudimentary factor during software development, is often neglected in any effort or quality estimation process. Former empirical research is still lacking holistic investigation between team size and overall economic elements. This part shall provide empirical evidence of the impact of team size and its interactions on software economics. The major difficulty is to identify and investigate the “role players” that impacts quality and effort in the development. This thesis reports the research that aims to estimate quality and efforts of software development from a holistic perspective, including missing data, team size, language, platform, reuse and other project factors. The 1st part of this thesis aims at identifying the potential effect of software reuse in the context of embedded software development. 
Software reuse has been advocated as a technique with a great capacity to improve product quality and reduce development effort and cost. However, the benefit of reuse is still doubted by serious conclusive and methodological inconsistency. Experts are still calling for more solid empirical studies with objective data on the effects of software reuse on new product performance. The validation of the benefits could build a strong guidance for the software industry. The 2nd part deals with missing data issues in software estimation. With known critical software project factors, appropriate preprocessing is necessary for further machine learning (ML) based empirical software engineering (SE). Historical datasets are widely II used to build models for prediction. However, the missingness inside dataset seriously affects the ability to discover knowledge from constructing effective analogy-based estimation model. Literature review reveals that listwise deletion gains the most popularity but reduce the sample size. And the issue of missing data in empirical software engineering is less addressed. The 3rd part of this thesis investigates and improves one commonly adopted data imputation technique: k nearest neighbor (kNN) based method, instead of ignoring missing observations to make data incomplete. KNN based imputation is improved to predict each missing value with special parameter settings under various missing data patterns. The optimization strategy includes multiple parameter combinations and feature relevance technique. We compare the novel imputation techniques with mean imputation (MEI) and other commonly used kNN ones. Then we conduct various estimation learners on eight real famous software quality datasets to discover the impact of the kNN based imputation methods. To solve and provide better missing data imputation methods helps use possible data for estimation. 
The 4th part of this thesis explores the best data preprocessing (DP) combination for various ML methods to maximize the utility of project factors. Due to the complex nature of the software development process, traditional parametric models and statistical methods often appear inadequate to model the increasingly complicated relationship between project development effort and the project features (or effort drivers). ML methods, with several reported successful applications, have gained popularity for software effort estimation in recent years. Many researchers have claimed that DP is a fundamental stage of ML methods; however, very few works have focused on the effects of DP techniques. This part addresses this issue from the perspective of data mining. The thesis (1) reports a real-life study of the impact of reuse on quality, effort and related economic consequences of embedded software development, based on first-hand objective data from 30 projects in a small-sized company; (2) validates the empirical relationships between team/project factors and software economic measurements, including productivity, quality, effort, and time-to-market, with the data analysis based on a renowned dataset, ISBSG (The International Software Benchmarking Standards Group); (3) validates and improves classic imputation techniques on well-known datasets with full project factors in the context of empirical ML based SE; and (4) systematically assesses the effectiveness of DP techniques on ML methods in the context of software effort estimation. In this thesis, we first conduct a literature survey of recent publications using DP techniques, followed by a systematic empirical study to analyse the strengths and weaknesses of individual data preprocessing techniques as well as their combinations. This thesis reveals that (1) a higher reuse rate enhances productivity and quality and reduces the cost of embedded software development. 
(2) Multiple factors, including team size, language type, and organization type, turn out to have a significant impact on software economics; (3) the proposed cross-validation based kNN imputation performs better in the context of software quality prediction; (4) DP techniques may significantly influence the final prediction, and sometimes have a negative impact on the prediction performance of ML methods. To improve prediction models, meticulous parameter selection and tuning are necessary according to the characteristics of the ML methods as well as the datasets used for software effort estimation. Future work includes (1) mining software reuse repositories to discover more knowledge of software reuse benefits, (2) further quantifying the relationship between team size and software economic elements, (3) further improving kNN imputation in the domain of both effort and quality estimation, and (4) gathering more empirical findings on DP combinations in empirical SE.
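The kNN-based imputation discussed in the abstract above can be illustrated with a minimal sketch. This is not the thesis's optimized, cross-validation based variant, whose details are not given here; it is plain kNN mean imputation. The function name `knn_impute`, the distance normalization, the default `k`, and the sample rows are all assumptions made for illustration.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Hypothetical helper for illustration; rows is a list of equal-length lists
    of floats, with None marking a missing value.
    """

    def distance(a, b):
        # Root-mean-square difference over the features observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows in which feature j is observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Assumed toy data: the second row is missing its second feature.
sample = [[1.0, 2.0], [1.0, None], [10.0, 20.0], [1.2, 2.4]]
filled = knn_impute(sample, k=2)
```

Unlike listwise deletion, this keeps the incomplete row in the dataset, so the sample size is preserved; the cost is that the filled value depends on the choice of `k` and of the distance measure.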
... To address the problems of reducing costs and shortening software development time, the literature [4][5][6][7][8] contains proposals for creating and applying reusable solutions during software development. ...
... Our vision is for patterns to be inferred from the browsing history of users and constructed from a set of previously developed applications. As we look to the future, we can employ existing studies on reuse scenarios and design space exploration (Hamid, 2015; Hegedüs et al., 2015; Tomer et al., 2004). We would also like to study the integration of our tools with other MDE tools. ...
Several development approaches have been proposed to handle the growing complexity of software system design. The most popular methods use models as the main artifacts to construct and maintain. The desired role of such models is to facilitate, systematize and standardize the construction of software-based systems. In our work, we propose a model-driven engineering (MDE) methodological approach associated with a pattern-based approach to support the development of secure software systems. We address the idea of using patterns to describe solutions for security as recurring security problems in specific design contexts and present a well-proven generic scheme for their solutions. The proposed approach is based on metamodeling and model transformation techniques to define patterns at different levels of abstraction and generate different representations according to the target domain concerns, respectively. Moreover, we describe an operational architecture for development tools to support the approach. Finally, an empirical evaluation of the proposed approach is presented through a practical application to a use case in the metrology domain with strong security requirements, which is followed by a description of a survey performed among domain experts to better understand their perceptions regarding our approach.
... The components can be considered as white-box reuse if the code can be modified to some extent where the component is applied (Chuang, 1996) (Dash, 2009). Previous research showed that with black-box reuse higher reuse levels can be achieved than with white-box reuse (Tomer et al. 2004) (Fingar, 2009 ...
Background: Cloud computing is a challenging task for many software engineering projects, especially for those projects which need development with reusability. Cloud computing is a style of computing in which virtualized resources are provided as a service over the Internet. Software as a Service (SaaS) is one layer of cloud computing, and it can be used for providing different types of business services. Objective: In this paper, we present a new service named Software Component as a Service “SCaaS” to be made available in the cloud computing environment. Results: This new layer will be used to reuse software components, as reusable and modular software components are expected to play a vital role in improving software construction processes and in reducing software time to market. Conclusion: The core of this service has been implemented and used to automatically generate code for several programming languages. Experimental results showed that using SCaaS reduced cost and improved time to market.
Measuring and estimating the reusability of software components is important for finding reusable candidates. Researchers have shown that software metrics can be effectively used to assess software reusability. This work provides a systematic literature review to investigate the main factors that influence software reusability and how these identified factors can be quantified using software metrics. This paper also investigates tool availability for the identified software metrics. Based on the extensive study, we narrowed down 44 factors that could positively or negatively affect the reusability of software systems. In terms of software metrics, we report our findings through five main families of metrics, namely coupling, cohesion, complexity, inheritance, and size. We found that most of the metrics examine reusability at the class level, and the availability of software tools is limited. Furthermore, not all factors that affect reusability are equally impactful in assessing the reusability of software components. While existing studies often discussed the impact of complexity on software reusability, we found that only a handful of complexity metrics were meant for assessing reusability. We have identified several open challenges and gaps in the area, in particular the lack of quantifiable measurement for reusability, limited software tools, and limited metrics that directly measure reusability.
Reusable code helps to decrease code errors, code units and therefore development time. It serves to improve quality and productivity frameworks in software development. The question is not HOW to make the code reusable, but WHICH amount of software components would be most beneficial (i.e. cost-effective in terms of reuse), and WHAT method should be used to decide whether to make a component reusable or not. If we had unlimited time and resources, we could write any code unit in a reusable way. In other words, its reusability would be 100%. However, in real life, resources and time are limited. Given these constraints, decisions regarding reusability are not always straightforward. The current chapter focuses on decision-making rules for investing in reusable code. It attempts to determine the parameters that should be taken into account in decisions relating to degrees of reusability. Two new models are presented for decision-making relating to reusability: (i) a restricted model, and (ii) a non-restricted model. Decisions made by using these models are then analyzed and discussed.
Patterns, micro-patterns, and nano-patterns have many applications: program comprehension, code transformations, documentation aids, improving code robustness, etc. This work revisits the notion of nano-patterns, originally an obiter dictum of the work on micro-patterns. Nano-patterns here are taken as more general than their previous definition in the literature: predicates on short code snippets that represent some common and elementary programming missions such as “for each m ∈ M do...”, or, “use x (but if x is null, y is a substitute)”, which represent small and recurring programming idioms. With this generalization, we offer a taxonomized language of nano-patterns for Java. We also describe the process of pattern harvesting we used and the underlying rationale, including our proposed prevalence threshold criterion, which, by capitalizing on Hirsch’s famous h-index, makes a robust yardstick of the pattern’s significance. An empirical survey of 78 Open Source Java projects indicates that the nano-patterns of our proposed language described here have a substantial prevalence in the code. About a third of the commands (executable statements) and half of the methods are instances of nano-patterns in the proposed language. Also, the language’s prevalence is typically higher than that of languages harvested in a project-specific, automated machine learning process. Nano-patterns are implementation/language level details for most high level software engineering purposes. One contribution made by the present paper is in identifying the clutter made by the snippets, appreciating its presence, and imposing order on it. The language, the nano-patterns in it, and the contributed automatic tool for tracing nano-patterns in code may help to deal systematically with this low level, yet significant, portion of code.
Context: The term software reuse was first used in 1968 at the NATO conference. Since then, works in the scientific literature have stated that the application of software reuse offers benefits such as increases in quality and productivity. Nonetheless, in spite of many publications reporting software reuse experiences, evidence that such benefits have reached industrial settings is scarce. Objective: To identify and classify the benefits transferred to real-world settings by the application of software reuse strategies. Method: We conducted a systematic mapping study (SMS). Our search strategies retrieved a set of 2,413 papers out of which 49 were selected as primary studies. We defined five facets to classify these studies: a) the type of benefit, b) the reuse process, c) the industry's domain, d) the type of reuse and e) the type of research reported. Results: Quality increase (28 papers) and Productivity increase (25 papers) were the two most mentioned benefits. Component-Based Development (CBD) was the most reported reuse strategy (41%), followed by Software Product Lines (SPL, 30%). The selected papers mentioned fourteen industrial domains, of which four stand out: aerospace and defense, telecommunications, electronics and IT services. The application of systematic reuse was reported in 78% of the papers. Regarding the research type, 50% use evaluation research as the investigation method. Finally, 13 papers (27%) reported validity threats for the research method applied. Conclusions: The literature analyzed presents a lack of empirical data, making it difficult to evaluate the effective transfer of benefits to the industry. This work did not find any relationship between the reported benefits and the reuse strategy applied by the industry or the industry domain. Although the most reported research method was industrial case studies (25 works), half of these works (12) did not report threats to validity.
Until reuse is better understood, significant reductions in the cost of building large systems will not be possible. This assertion is based primarily on the belief that the defining characteristic of good reuse is not the reuse of software per se, but the reuse of human problem solving. Analytical approaches for making good reuse investments are suggested in terms of increasing a quality-of-investment measure, Q, which is simply the ratio of reuse benefits to reuse investments. The first strategy for increasing Q is to increase the level of consumer reuse. The second technique for increasing Q is to reduce the average cost of reusing work products by making them easy and inexpensive to reuse. The third strategy is to reduce investment costs. Reuse strategies, and reuse and parameterizations, are discussed.
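The quality-of-investment measure Q described in the abstract above, and the three ways of raising it, can be sketched numerically. The abstract defines only the ratio itself; the benefit model (cost avoided per reuse times number of reuses), the function names, and all figures below are assumptions chosen for illustration.

```python
def reuse_benefit(n_reuses, build_cost, reuse_cost):
    """Hypothetical benefit model: cost avoided each time the asset is reused."""
    return n_reuses * (build_cost - reuse_cost)


def quality_of_investment(benefit, investment):
    """Q = reuse benefits / reuse investments, as stated in the abstract."""
    if investment <= 0:
        raise ValueError("investment must be positive")
    return benefit / investment


# Assumed baseline: 5 reuses, 100 to build from scratch, 20 to reuse, 200 invested.
baseline = quality_of_investment(reuse_benefit(5, 100.0, 20.0), 200.0)

# Strategy 1: increase the level of consumer reuse (5 -> 10 reuses).
more_reuse = quality_of_investment(reuse_benefit(10, 100.0, 20.0), 200.0)

# Strategy 2: reduce the average cost of reusing the work product (20 -> 10).
cheaper_reuse = quality_of_investment(reuse_benefit(5, 100.0, 10.0), 200.0)

# Strategy 3: reduce the investment cost (200 -> 100).
lower_investment = quality_of_investment(reuse_benefit(5, 100.0, 20.0), 100.0)
```

With these assumed figures the baseline Q is 2.0, and each strategy raises it: doubling the number of reuses or halving the investment gives 4.0, while halving the per-reuse cost gives 2.25.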
We represent the life cycle of the design of a system in a three-dimensional space with engineering, reengineering, and reuse axes. The three-dimensional model is evolution-oriented. It incorporates not only the evolution that occurs after the product has been produced and delivered, but also three types of system design evolution that take place before the product is produced. Associated with each axis are a mathematical operator and its inverse. These operators, together with their inverses, can describe the various systems engineering activities. The model can be used to describe the life cycle of a product line, the evolution of an individual product within that product line, and even the evolution of an individual artifact. The model can be used in conjunction with any life-cycle model and any set of artifacts. © 2002 Wiley Periodicals, Inc. Syst Eng 5, 264–273, 2002
This article is excerpted from the first two chapters of A Framework for Software Product Line Practice, Version 2.0. The framework is intended to be a living document that will aid the software development and acquisition communities. Each version represents an incremental attempt to capture information about successful product line practices. This information has been gleaned from studies of organizations that have built product lines, from direct collaborations on software product lines with customer organizations, and from leading practitioners in software product lines. In the full document, available on the Web at http://www.sei.cmu.edu/plp/framework.html, Chapter 3 provides a detailed description of how product line practices could be applied to software engineering, technical management, and organizational management practice areas. "Not all of the practice areas have been defined, but our goal is to release the framework in increments to get the information out sooner and to get feedback and contributions," writes co-author Linda Northrop, director of the Product Line Systems Program at the SEI. "Future versions will build upon the current foundation by completing still other practice area descriptions, and by describing a small number of product line scenarios involving the development, acquisition, and/or evolution of a software product line." Northrop requests that readers provide feedback and make contributions to the framework by contacting her at lmn@sei.cmu.edu.
The evolution tree model is a two-dimensional model that describes how the versions of the artifacts of a software product evolve. The propagation graph is a data structure that can be used for effective control of the evolution of the artifacts of a software product. In this paper we extend the evolution tree model and propagation graph to handle the evolution of a software product line. Software product lines are characterized by large-scale reuse, especially of core assets. We show how a third dimension can be added to the evolution tree model to handle this reuse. In particular, the new model incorporates bidirectional reuse within product lines. That is, the new model can handle the transfer of an artifact from the core assets repository to a specific product (acquiring a core asset) as well as the transfer of a specific asset from a specific product to the core assets repository (mining an existing asset).
The paper baselines the state of the art in reuse economic modeling by surveying and comparing seventeen economic models and presenting conclusions and recommendations for further research. The analyses indicate a great deal of commonality among the set of models. While this may indicate that researchers are arriving at similar models independently, it may also suggest that we should direct our efforts at forging new ground in reuse economics. Five areas for further research in reuse economics are recommended, and general guidelines for helping organizations decide on a suitable economic model are discussed.