Characterizing and Identifying Composite Refactorings:
Concepts, Heuristics and Patterns
Leonardo Sousa
Electrical & Computer Engineering
Carnegie Mellon University, USA
leo.sousa@sv.cmu.edu
Diego Cedrim
Amazon
Brazil
dccedrim@amazon.com
Alessandro Garcia, Willian Oizumi
PUC-Rio, Brazil
{afgarcia,woizumi}@inf.puc-rio.br
Ana C. Bibiano, Daniel Oliveira
PUC-Rio, Brazil
{abibiano,doliveira}@inf.puc-rio.br
Miryung Kim
UCLA, USA
miryung@cs.ucla.edu
Anderson Oliveira
PUC-Rio, Brazil
aoliveira@inf.puc-rio.br
ABSTRACT
Refactoring consists of a transformation applied to improve the program's internal structure, for instance, by helping to remove code smells. Developers often apply multiple interrelated refactorings, called a composite refactoring. Even though composite refactoring is a common practice, an investigation from different points of view on how composite refactoring manifests in practice is missing. Previous empirical studies also neglect how different kinds of composite refactorings affect the removal, prevalence or introduction of smells. To address these matters, we provide a conceptual framework and two heuristics to respectively characterize and identify composite refactorings within and across commits. Then, we mined the commit history of 48 GitHub software projects. We identified and analyzed 24,911 composite refactorings involving 104,505 single refactorings. Amongst several findings, we observed that most composite refactorings occur in the same commit and have the same refactoring type. We found that several refactorings are semantically related to each other: they occur in different parts of the system but are still related to the same task. Our study is the first to reveal that many smells are introduced in a program due to “incomplete” composite refactorings. Our study is also the first to reveal 111 patterns of composite refactorings that frequently introduce or remove certain smell types. These patterns can be used as guidelines for developers to improve their refactoring practices as well as for designers of recommender systems.
CCS CONCEPTS
• Software and its engineering → Software design engineering.
ACM Reference Format:
Leonardo Sousa, Diego Cedrim, Alessandro Garcia, Willian Oizumi, Ana C. Bibiano, Daniel Oliveira, Miryung Kim, and Anderson Oliveira. 2020. Characterizing and Identifying Composite Refactorings: Concepts, Heuristics and Patterns. In 17th International Conference on Mining Software Repositories (MSR ’20), October 5–6, 2020, Seoul, Republic of Korea. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3379597.3387477
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
MSR ’20, October 5–6, 2020, Seoul, Republic of Korea
© 2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-7517-7/20/05...$15.00
https://doi.org/10.1145/3379597.3387477
1 INTRODUCTION
Software refactoring is a widely used technique in practice [8, 10, 13, 19, 20, 33, 52]. Refactoring consists of a program transformation used to improve software structure, such as removing code smells [14]. Well-known refactoring types include Extract Method, Rename Method, and Move Method. Since the term refactoring first appeared in the literature [14, 40], studies have been actively investigating it [2, 3, 7, 10, 13, 19, 20, 25, 32, 33, 47, 52]. Most of these studies analyze the characteristics and the impact of each single refactoring on the software structure.
However, 40% to 60% of the time, developers apply more than one refactoring in conjunction [6, 33], even for removing simple code smells, such as Long Methods [14]. In other words, developers often apply what we call here composite refactoring. A composite refactoring (from now on also called a composite) comprises two or more interrelated refactorings that affect one or more elements [6, 8, 34, 46]. There are two broad categories of composites: (i) temporally-related composites, i.e., refactorings that are applied in the same commit and are likely to be related to the same developer's task, and (ii) spatial composites, i.e., sets of refactorings applied to structurally related code elements, regardless of whether they are performed in the same change (commit) or not.
Recent studies (e.g., [6, 8, 44, 53]) have analyzed a single category of composite at a time. For example, Palomba et al. [44] and Tufano et al. [53] analyze temporally-related composites, while Bibiano et al. [6] and Brito et al. [8] explore spatial composites. As no study analyzes these different categories together, we may be missing a more comprehensive understanding of composites. For example, while certain complex smells (e.g., a God Class) are likely to be fully removed over time through a spatial composite refactoring, other smells (e.g., Shotgun Surgery) may be removed in a single commit, but require changes in non-structurally related parts of the program. As composite categories have been studied only from a single perspective, we have the opportunity to investigate the impact of refactoring on the program structure from different perspectives.
To investigate composite refactorings, we mined the commit history of 48 GitHub software projects to identify (i) the characteristics of different categories of composite refactorings, and (ii) their effect on either removing or introducing smells. To support our study, we provide a conceptual framework and two heuristics for detecting composites. The heuristics are named commit-based and range-based heuristics, and they serve to automatically identify composites in software projects. The first supports the analysis of refactorings that have a temporal relation. The second intends to capture refactorings that have a spatial relation. These heuristics enabled us to investigate composites and their impact on smells from different perspectives. We expect that our contributions and study findings can help tool builders by uncovering the blind spots on the relation between composite refactoring and smells.
Our contributions and study findings can be summarized as follows. First, we provide a formal and unambiguous definition for composites, which serves to guide researchers who aim to further investigate composites. Second, our heuristics enabled us to reveal characteristics of composites that were not investigated by related studies [6, 8, 33]. Some of these characteristics are reported below. We observe that nearly 41% of composites are complex, i.e., are composed of 3 to 20 interrelated refactorings, which contradicts a recent finding [6]. The majority of the composites are confined to the same commit and homogeneously formed by refactorings of the same type, e.g., various syntactically related method extractions. There is also a non-negligible frequency of: (i) heterogeneous and cross-commit composites, and (ii) semantically related composites within the same commit, i.e., sequences of refactorings located in different parts of the code, but still related to the same task (e.g., removing non-trivial, scattered smells).
Contradicting previous findings [5, 6, 10, 49], we observe that refactorings do have a considerable effect on smells. We found that nearly 50% of composites either remove or introduce smells. Previous studies often suggest otherwise. For instance, Bavota et al. [5] stated that refactorings are not related to smell removal. Cedrim et al. [10] and Bibiano et al. [6] reported that refactorings are most often neutral, i.e., they neither introduce nor remove smells. These studies analyze either each single refactoring in isolation or multiple refactorings affecting only a single element.
We used our heuristics to identify patterns of composites that recurrently introduce or remove specific smell types, which have not been reported in the literature. A manual analysis confirmed a total of 111 composite-smell patterns: 84 smell-removing patterns and 27 smell-introducing patterns. As refactoring tools tend to be underused [33], these patterns can be used to improve recommendation systems [18, 24, 31, 37, 41] by leveraging the removal patterns that developers apply in practice. This strategy would increase the chance of such developers adopting automated refactoring tools. We also provide a replication package [43], which includes the scripts that we used to implement the proposed heuristics and the catalog of composite-smell patterns for 11 smell types. Our dataset is available for other researchers who are interested in studying composites and their effects on smells.
2 RELATED WORK AND EXAMPLE
Diverse views on composite refactoring. Many researchers have investigated composites [6, 8, 28, 33, 50, 53, 54]. However, they use different terms (e.g., batch refactoring [6]) or definitions to refer to composite refactoring. Some studies consider a composite as a set of two or more interrelated refactorings applied by the same developer [6, 24, 31, 33, 51]. Other studies define a composite as a set of refactorings applied by multiple developers [20, 28, 50]. Bibiano et al. [6] consider the scope of a composite refactoring as an individual code element. Other studies consider that a composite refactoring may be applied in the scope of multiple elements [20, 28, 31, 33, 50, 51]. One study assumes time constraints to define a composite [33]. There are also studies that propose approaches to recommend composite refactorings [24, 31, 34, 51].
[Figure 1: Refactorings applied to the Mobile Media. The diagram shows three commits. In commit1, UserCtrl holds userDao, mediaDao, saveUser(u:User) and saveMedia(m:Media). In commit2, UserCtrl keeps userDao and saveUser(u:User), while MediaCtrl holds mediaDao and saveMedia(m:Media). In commit3, UserCtrl and MediaCtrl share the superclass AbstractCtrl. The refactorings r1 to r7 are labeled Extract Superclass, Move Method, Extract Method and Move Attribute; the God Class and Speculative Generality smells are annotated.]
Bibiano et al. [6] and Vassalo et al. [54] are representative examples of recent studies that explicitly investigated composites. However, they investigated composites from a single perspective. For example, Bibiano et al. [6] provided a partial view on composite refactoring since they analyze only composites in the scope of individual code elements. Hence, composite refactorings that crosscut two or more elements were not completely investigated. Thus, their findings may not yield a comprehensive understanding of more complex forms of composites. Next, we illustrate how relying on such a definition can compromise a researcher's study.
Effect of composites on smells. In the example of Figure 1, the researcher wants to investigate the effect of composites on smells. The figure shows three commits of Mobile Media (MM), a software product line to derive mobile applications [59]. A developer performed seven refactorings, $r_1, r_2, \ldots, r_7$, along these commits. According to Bibiano et al.'s definition [6], a composite comprises two or more refactorings within the scope of a single element. If the researcher follows this definition, s/he would consider $cr_1 = [r_1, r_2, r_3, r_4, r_5]$ and $cr_2 = [r_3, r_6, r_7]$ as composites. This definition forces the researcher to restrict composites to those occurring in the context of an element, which may be inappropriate to investigate the effects of composites on smells. For example, in Figure 1, the refactorings $r_1$ and $r_2$ removed the God Class. As these refactorings belong to the composite $cr_1$, the researcher would conclude that composites have a positive effect on the program structure since $cr_1$ reduced the incidence of smells. However, this conclusion is misleading due to the use of a composite definition that does not properly cover cases such as the one discussed above.
Let us consider the $r_3$ refactoring (Extract Superclass), which crosscuts multiple elements. This refactoring creates a superclass (AbstractCtrl) shared by UserCtrl and MediaCtrl, which led to the introduction of the Speculative Generality smell [14]. Since the smell is introduced in the scope of another element, Bibiano et al.'s definition would not consider it when assessing the effect of a composite: their definition does not consider the scope of all elements affected by the refactorings. In this scenario, the composite removed a smell (God Class) but introduced another (Speculative Generality). Therefore, the researcher should have concluded that composites have no effect on the introduction or removal of smells. To have a better understanding of composite refactorings and their effect on smells, the researcher would need other heuristics (Section 3.3) to identify composites that affect the scope of multiple elements. In addition, although several works study the complex nature of code smells [12, 17, 35, 36, 38, 39, 55], they do not address the relationship between composite refactorings and smells.
3 CHARACTERIZING AND IDENTIFYING COMPOSITE REFACTORING
We define here basic concepts needed to study composites (Section 3.1). We rely on these concepts to characterize an existing heuristic (Section 3.2) and to propose two new ones (Section 3.3).
3.1 A Conceptual Framework
This section presents a conceptual framework for composite refac-
toring. We used this framework to provide a foundation for our
heuristics (Section 3.3) and our empirical study. Other researchers
can also use it to conduct studies based on unambiguous concepts.
3.1.1 Composite Refactoring. Composite refactoring occurs when two or more interrelated refactorings are applied to a set of code elements. Thus, $cr = [r_1, r_2, \cdots, r_n]$ is a composite of size $n$ if $n \geq 2$. The notion of interrelation depends on the composite scope (Section 3.1.4). Most studies restrict the composite to refactorings applied by the same developer [6, 33, 42, 48]. However, developers can work together to apply a composite [20]. This scenario can happen, for example, when they have to team up to plan and perform a major restructuring in the system, or when they create branches to apply refactoring exclusively [20].
3.1.2 Composite Uniformity. The refactorings in a composite may or may not all have the same type, a property we define as composite uniformity. In this context, $type(r_i)$ is a function that returns the type of the refactoring $r_i$. In our example of Figure 1, $type(r_1) = \text{Move Method}$. The composite $cr = [r_1, r_2, \cdots, r_n]$ is heterogeneous if and only if $|type(r_1) \cup type(r_2) \cup \cdots \cup type(r_n)| > 1$. If $|type(r_1) \cup type(r_2) \cup \cdots \cup type(r_n)| = 1$, then the composite is homogeneous. Most studies do not require all refactorings in a composite to have the same type [33, 42, 45, 48].
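The uniformity test can be illustrated with a minimal Python sketch. The `Refactoring` record and the `uniformity` function are hypothetical names introduced here for illustration, not part of the replication package. Only the type of r1 (Move Method) and of r3 (Extract Superclass) comes from the text; the type given to r2 is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Refactoring:
    """Hypothetical record for one detected refactoring."""
    rid: str    # identifier such as "r1"
    rtype: str  # refactoring type, e.g. "Move Method"


def uniformity(composite):
    """Return 'homogeneous' or 'heterogeneous' for a composite of size n >= 2."""
    if len(composite) < 2:
        raise ValueError("a composite needs at least two refactorings")
    # Collecting the types into a set plays the role of the union in the text.
    distinct_types = {r.rtype for r in composite}
    return "homogeneous" if len(distinct_types) == 1 else "heterogeneous"


# r2's type is assumed for the sake of the example.
same = [Refactoring("r1", "Move Method"), Refactoring("r2", "Move Method")]
mixed = same + [Refactoring("r3", "Extract Superclass")]
print(uniformity(same))
print(uniformity(mixed))
```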
3.1.3 Composite Timespan. A developer can start a composite in a commit and finish it in the same commit or in subsequent commits. In this sense, composite timespan indicates whether the composite is single-commit or cross-commit. To identify the timespan, let us define the function $commit(r)$, which returns the commit where the refactoring $r$ was performed. Thus, a composite $cr = [r_1, r_2, \cdots, r_n]$ is cross-commit if and only if $|commit(r_1) \cup \cdots \cup commit(r_n)| > 1$. Similarly, if $|commit(r_1) \cup \cdots \cup commit(r_n)| = 1$, then $cr$ is single-commit. Existing refactoring studies consider only major versions [5], a single commit [10], or the entire project history [6].
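As a rough illustration of the timespan definition, the sketch below classifies a composite by counting the distinct commits of its refactorings. The function name `timespan`, the dictionary `COMMIT_OF` and the commit labels are assumptions made for this example; the assignment of refactorings to commits follows the grouping that Section 3.3.1 reports for Figure 1.

```python
def timespan(composite, commit_of):
    """Classify a composite as 'single-commit' or 'cross-commit'.

    composite: list of refactoring ids.
    commit_of: dict mapping each id to the commit where the refactoring
               was performed (the role of commit(r) in the text).
    """
    if len(composite) < 2:
        raise ValueError("a composite needs at least two refactorings")
    commits = {commit_of[r] for r in composite}
    return "single-commit" if len(commits) == 1 else "cross-commit"


# Hypothetical commit labels for the seven refactorings of Figure 1.
COMMIT_OF = {
    "r1": "c2", "r2": "c2",
    "r3": "c3",
    "r4": "c4", "r5": "c4", "r6": "c4", "r7": "c4",
}

print(timespan(["r1", "r2"], COMMIT_OF))
print(timespan(["r1", "r2", "r3"], COMMIT_OF))
```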
3.1.4 Refactoring and Composite Scope. Elements directly affected by the refactoring constitute the refactoring scope. Given a refactoring $r$, $scope(r)$ is a function that returns the set of elements belonging to the scope of $r$. For instance, the refactoring $r_1$ in Figure 1 (Move Method) moved the method mediaDao from class UserCtrl to MediaCtrl. Hence, the refactoring scope is $\{mediaDao, UserCtrl, MediaCtrl\}$. Similar to a single refactoring, composites also have a scope. The composite scope is the set of code elements affected by the refactorings within a composite. The composite scope also indicates how the refactorings within the composite are interrelated.
One might naturally say that the union of all refactoring scopes from a composite determines the composite scope, but this is not necessarily true in all scenarios. Related studies have different ways to define the composite scope. In general, these studies can be divided into two groups: those in which a composite refactoring affects only the scope of a single element [22, 30, 45] and those in which it affects the scope of multiple elements [20, 42]. In the first group, all refactorings within the composite are related to each other because they affect the same element. In the second group, if a refactoring crosscuts two elements, then all refactorings in one element will be related to the refactorings in the other element. For example, suppose a developer applied refactoring $r_1$ to class $A$ and $r_2$ to class $B$. These two refactorings are not related; thus, they do not form a composite. However, suppose the developer then applied a refactoring $r_3$, which moves a method from $A$ to $B$. The three refactorings thereby become related to each other, creating a composite. In this case, the composite scope includes both classes.
3.1.5 Composite Synthesis. The process of grouping interrelated refactorings to find composites is defined as composite synthesis. To synthesize a composite, we first need to detect the refactorings that occurred in the system. Related studies have different strategies to identify refactorings applied by developers. One strategy is to analyze the commit message to identify the refactorings [47]. Another strategy is to use a tool that compares two subsequent commits to identify refactorings [52]. For the sake of explanation, let us assume that a refactoring detection tool implements a function $R$, where $R(c_i, c_{i+1})$ returns the refactorings detected between the subsequent commits $c_i$ and $c_{i+1}$. The refactoring history $H$ of a system $s$ is then composed of all refactorings detected between subsequent pairs of commits: $H(s) = \bigcup_{i=1}^{|Commits(s)|-1} R(c_i, c_{i+1})$. To illustrate the output of function $H(s)$, let us revisit the MM system presented in Figure 1. This system has four commits, three of which are represented in the figure. The fourth one is produced as the result of applying the refactorings $\{r_4, r_5, r_6, r_7\}$. Hence, $H(s_1) = R(c_1, c_2) \cup R(c_2, c_3) \cup R(c_3, c_4)$. In other words, $H(s_1)$ contains all refactorings presented in Figure 1, which are $\{r_1, r_2, r_3, r_4, r_5, r_6, r_7\}$.
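The construction of the refactoring history can be sketched as a union over consecutive commit pairs. In this sketch, `refactoring_history` is a hypothetical helper, and `fake_detect` is a stub standing in for a detection tool; it does not call any real tool, and the per-pair detections are assumed so that the result matches the history described for Figure 1.

```python
from typing import Callable, Sequence, Set


def refactoring_history(
    commits: Sequence[str],
    detect: Callable[[str, str], Set[str]],
) -> Set[str]:
    """Union of the refactorings detected between each pair of subsequent commits."""
    history: Set[str] = set()
    # zip(commits, commits[1:]) yields (c1, c2), (c2, c3), ..., as in H(s).
    for before, after in zip(commits, commits[1:]):
        history |= detect(before, after)
    return history


# Stub detector with assumed results for the four commits of the example.
FAKE_DETECTIONS = {
    ("c1", "c2"): {"r1", "r2"},
    ("c2", "c3"): {"r3"},
    ("c3", "c4"): {"r4", "r5", "r6", "r7"},
}


def fake_detect(before: str, after: str) -> Set[str]:
    return FAKE_DETECTIONS.get((before, after), set())


h = refactoring_history(["c1", "c2", "c3", "c4"], fake_detect)
print(sorted(h))
```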
3.2 Element-Based Heuristic
This section presents a formal definition of the element-based heuristic [6], which we will use in our study.
Formal Definition. The element-based heuristic synthesizes composites using an individual code element, i.e., either a method or a class, as scope. The goal of this heuristic is to investigate how composites affect a specific element. Formally, a given composite $cr = [r_1, r_2, \cdots, r_n]$ is synthesized by the element-based heuristic if and only if there is an element $e$ such that $e \in scope(r_i)\ \forall r_i \in cr$. For instance, let $CR_e(h)$ be the function that implements the element-based heuristic over a particular refactoring history $h$. For the history of Figure 1, $CR_e(H(s_1)) = \{cr_a = [r_1, r_2, r_3, r_4, r_5], cr_b = [r_3, r_6, r_7]\}$. Thus, this heuristic synthesizes two composites. The first one, $cr_a$, is a composite because $[r_1, r_2, r_3, r_4, r_5]$ affected the same element: UserCtrl. The second composite, $cr_b$, affects the MediaCtrl class.
Scope. In this heuristic, the composite scope is determined by the element used to synthesize the composites. In this way, $scope(cr_a) = \{UserCtrl\}$, and $scope(cr_b) = \{MediaCtrl\}$.
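A minimal sketch of the element-based heuristic groups refactorings by each element found in their scopes and keeps the groups with at least two refactorings. The function name `element_based` and the per-refactoring scopes below are assumptions: the scopes are chosen by hand, at class level only, so that the output mirrors the two composites named in the text. They are not read from Figure 1.

```python
from collections import defaultdict


def element_based(history):
    """Synthesize one composite per element touched by two or more refactorings.

    history: list of (refactoring_id, scope) pairs in chronological order,
             where scope is a set of element names.
    Returns a dict mapping each element to the list of refactoring ids
    whose scope contains it.
    """
    by_element = defaultdict(list)
    for rid, scope in history:
        for element in scope:
            by_element[element].append(rid)
    # A single refactoring on an element is not a composite (n >= 2).
    return {e: rids for e, rids in by_element.items() if len(rids) >= 2}


# Assumed class-level scopes for the seven refactorings.
HISTORY = [
    ("r1", {"UserCtrl"}),
    ("r2", {"UserCtrl"}),
    ("r3", {"UserCtrl", "MediaCtrl", "AbstractCtrl"}),
    ("r4", {"UserCtrl"}),
    ("r5", {"UserCtrl"}),
    ("r6", {"MediaCtrl"}),
    ("r7", {"MediaCtrl"}),
]

composites = element_based(HISTORY)
for element, rids in sorted(composites.items()):
    print(element, rids)
```

With these assumed scopes, AbstractCtrl is touched only by r3 and therefore yields no composite.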
The element-based heuristic focuses on the element to find composites. Focusing on the element is a strength as it allows us to investigate what occurs with the element during its evolution. At the same time, focusing on the element is also a weakness. The scope of some refactoring types goes beyond a single element. Suppose a developer applies an Extract Method in class $A$, and then a Move Method from class $A$ to $B$. The heuristic will only synthesize a composite in class $A$. Since class $B$ is out of scope, the effects of the composite in $B$ will not be considered. As the effect in each element will be treated independently, this heuristic may not be entirely appropriate to investigate the effect of composites on smells.
3.3 Composite Synthesis Heuristics
We propose here two heuristics to synthesize composites.
3.3.1 Commit-Based Heuristic. The composite scope also indicates
how the refactorings are interrelated (Section 3.1.4). Sometimes the
refactorings are not structurally related to each other but they occur
in the same context. For example, a developer may apply several
refactorings to address a task associated with a commit. Hence,
it makes sense to group these refactorings. For this purpose, this
heuristic considers a single commit as the timespan (Section 3.1.3).
In fact, there is a commit policy, widely accepted in the community, that recommends developers not to perform code changes for multiple tasks in the same commit [21]. Thus, each commit should have refactorings somehow related to the same task.
Formal Definition. The commit-based composite heuristic synthesizes as a composite all refactorings performed within a commit. The goal of this heuristic is to capture a temporal relation among the refactorings made in the time frame of a single commit. Formally, a composite $cr = [r_1, r_2, \cdots, r_n]$ is synthesized if and only if $|commit(r_1) \cup commit(r_2) \cup \cdots \cup commit(r_n)| = 1$. For instance, consider $H(s_1) = [r_1, \cdots, r_7]$ (Figure 1). Now, let $CR_c(h)$ be the function that implements the commit-based heuristic over a refactoring history $h$. Thus, the commit-based heuristic produces two composites: $CR_c(H(s_1)) = \{cr_c = [r_1, r_2], cr_d = [r_4, r_5, r_6, r_7]\}$.
Scope. The composite scope includes the elements affected by the refactorings within the commit. Thus, $scope(cr_c) = \{UserCtrl, MediaCtrl\}$, and $scope(cr_d) = \{UserCtrl, MediaCtrl, AbstractCtrl\}$.
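The commit-based heuristic amounts to grouping refactorings by commit and discarding groups of size one. The sketch below assumes a history given as (refactoring id, commit id) pairs; the name `commit_based` and the commit labels are hypothetical.

```python
from collections import defaultdict


def commit_based(history):
    """Group refactorings by commit; keep groups with two or more refactorings.

    history: list of (refactoring_id, commit_id) pairs in chronological order.
    Returns a list of composites, each a list of refactoring ids.
    """
    by_commit = defaultdict(list)
    for rid, commit in history:
        by_commit[commit].append(rid)
    # Dicts preserve insertion order, so composites come out chronologically.
    return [rids for rids in by_commit.values() if len(rids) >= 2]


# Assumed commit labels; r3 is alone in its commit, so it forms no composite.
HISTORY = [
    ("r1", "c2"), ("r2", "c2"),
    ("r3", "c3"),
    ("r4", "c4"), ("r5", "c4"), ("r6", "c4"), ("r7", "c4"),
]

composites = commit_based(HISTORY)
print(composites)
```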
The commit-based heuristic is useful to observe the effect of all refactorings that occur in a commit. Assuming that all the changes within a commit are related to the same task [21], researchers can use this heuristic to understand how refactorings affect elements related to a task. This heuristic partially solves the limitation of the element-based heuristic. Instead of considering only the scope of a single element, it considers all elements affected by the refactorings performed along the commit's task. Thus, this heuristic does not discard refactorings that crosscut elements. However, there are cases in which the commit-based heuristic discards refactorings that it should not. A developer can start a composite in a commit and finish it in subsequent commits. For example, a developer can start a composite, commit the changes, and then continue refactoring the same elements. In this case, the commit-based heuristic would synthesize two composites rather than one.
3.3.2 Range-Based Heuristic. Some refactorings are structurally related to each other because they affect elements that are located in the same part of the source code. Thus, if we want to understand the effect of composites on the program structure, we need to analyze how these structurally related refactorings affect the elements. For example, if a refactoring crosscuts two elements, both elements should be analyzed to understand the effect of the refactoring. We propose the range-based heuristic to identify composites whose refactorings affect the same location in the code.
Formal Definition. The range-based composite heuristic considers the notion of refactoring scope to synthesize composites. In this heuristic, the scopes of all refactorings form the composite scope. A composite starts with an arbitrary refactoring $r_a$. A second refactoring $r_b$ is part of the same composite if and only if $\exists e \in scope(r_b)$ such that $e \in scope(r_a)$. A possible third refactoring $r_c$ will be added to the composite if $\exists e \in scope(r_c)$ such that $e \in scope(r_a)$ or $e \in scope(r_b)$. This process continues until all refactorings in a particular history are explored.
Scope. In this heuristic, the composite scope is determined by the union of the scopes of all refactorings: $\bigcup_{i=1}^{n} scope(r_i)$. The $r_1$ and $r_2$ refactorings in Figure 1 moved elements from the UserCtrl class to the MediaCtrl class. Hence, $scope(r_1) = scope(r_2) = \{UserCtrl, MediaCtrl\}$. The composite synthesis in this example starts with $r_1$. As $r_2$ was applied to one element of $scope(r_1)$, the composite grows and turns into $[r_1, r_2]$. The $r_3$ refactoring affects elements of $scope(r_1)$, so the composite is now $[r_1, r_2, r_3]$. The same reasoning can be used for the remaining refactorings, so the composite synthesis produces the composite $cr_e = [r_1, r_2, r_3, r_4, r_5, r_6, r_7]$.
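One way to read the range-based definition is as grouping refactorings whose scopes overlap, directly or transitively. The sketch below follows that reading; it is an interpretation, not the authors' implementation. The name `range_based` is hypothetical. Only scope(r1) and scope(r2) come from the text; the other scopes, and the unrelated refactoring r8 on a made-up `Logger` class, are assumptions added to show that isolated refactorings are left out.

```python
def range_based(history):
    """Group refactorings whose scopes overlap, directly or transitively.

    history: list of (refactoring_id, scope) pairs in chronological order,
             where scope is a set of element names.
    Returns a list of (refactoring_ids, composite_scope) pairs, one per
    composite with at least two refactorings.
    """
    position = {rid: i for i, (rid, _) in enumerate(history)}
    groups = []  # (ids, elements) pairs whose element sets stay pairwise disjoint
    for rid, scope in history:
        scope_set = set(scope)
        ids, elements = [rid], set(scope_set)
        untouched = []
        for other_ids, other_elements in groups:
            if other_elements & scope_set:
                # The new refactoring bridges into this group: merge it.
                ids.extend(other_ids)
                elements |= other_elements
            else:
                untouched.append((other_ids, other_elements))
        groups = untouched + [(ids, elements)]
    return [
        (sorted(ids, key=position.__getitem__), elements)
        for ids, elements in groups
        if len(ids) >= 2
    ]


HISTORY = [
    ("r1", {"UserCtrl", "MediaCtrl"}),
    ("r2", {"UserCtrl", "MediaCtrl"}),
    ("r3", {"UserCtrl", "MediaCtrl", "AbstractCtrl"}),  # assumed scope
    ("r4", {"UserCtrl"}),                               # assumed scope
    ("r5", {"UserCtrl"}),                               # assumed scope
    ("r6", {"MediaCtrl"}),                              # assumed scope
    ("r7", {"MediaCtrl"}),                              # assumed scope
    ("r8", {"Logger"}),                                 # hypothetical, unrelated
]

composites = range_based(HISTORY)
print(composites)
```

The same function also covers the A/B example of Section 3.1.4: two unrelated refactorings on classes A and B become one composite once a third refactoring touches both classes.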
4 STUDY PLANNING
4.1 Research Questions
In the previous section, we proposed heuristics to identify composites. These heuristics allow one to analyze composites from different, albeit complementary, perspectives. To propose them, we formally defined concepts that characterize a composite. Our goal is to use these concepts to understand (i) how composites manifest in software systems and (ii) their effect on smells. To achieve this goal, we aim to answer the following research question:
RQ1. What are the characteristics of composites in software systems?
We address RQ1 by applying the heuristics to identify three categories of composites: element-based, commit-based, and range-based composites. The concepts defined in our conceptual framework allow us to compare these categories of composites. Thus, we can also have a better understanding of the effect of composites on the program structure. For this purpose, we use the following research question to investigate if composites affect the incidence of smells:
RQ2. How do composites affect the incidence of smells?
Notice that answering RQ2 is not trivial. First, we need to identify the elements affected by each category of composite, taking into consideration their composite scope. Then, we analyze what happened with the smells before and after developers apply the composites. To support this analysis, we classify each composite according to its effect on the incidence of smells. We classify a composite as positive if it reduces the number of code smells. Conversely, we classify it as negative if it increases the number of smells. Otherwise, we classify it as neutral. Other empirical studies applied this type of analysis [6, 9–11].
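The classification rule can be sketched as a comparison of smell counts before and after a composite. The function name `classify_composite` and the representation of a smell as an (element, smell type) pair are assumptions made for illustration; the example data follows the Figure 1 discussion in Section 2, where a God Class is removed and a Speculative Generality is introduced.

```python
def classify_composite(smells_before, smells_after):
    """Classify a composite by its effect on the number of smells in its scope.

    smells_before, smells_after: collections of (element, smell_type) pairs
    detected in the composite scope before and after the composite.
    """
    before, after = len(smells_before), len(smells_after)
    if after < before:
        return "positive"
    if after > before:
        return "negative"
    return "neutral"


before = {("UserCtrl", "God Class")}
after = {("AbstractCtrl", "Speculative Generality")}

# One smell removed and one introduced: the count is unchanged.
print(classify_composite(before, after))
# Looking only at UserCtrl hides the newly introduced smell.
print(classify_composite(before, set()))
```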
As a complement to RQ2, understanding and distinguishing the effect of specific types of composites on smells is an essential investigation. First, our investigation may help tool builders by uncovering the blind spots on the relation between refactoring and smells. Second, this investigation aims (i) to identify topics that require further investigation and (ii) to contrast the results with findings established in the literature. For example, Fowler [14] presented a catalog of composite types that can be used to remove code smells, each of which we call a composite-smell pattern. A composite-smell pattern establishes a frequently observed relationship between a composite type and the introduction or removal of a smell type. For instance, suppose that there is a method affected by the Feature Envy code smell. In this case, Fowler recommends applying a composite pattern composed of an Extract Method followed by a Move Method. Unfortunately, we do not know if developers apply this composite pattern in practice. More specifically, we do not know which patterns govern the relation between refactorings and smells. These patterns are the focus of our next research question:
RQ3. What are the patterns governing composites and smells?
We address RQ3 by investigating creational and removal patterns. A creational pattern represents a recurring case where the composite tends to introduce a code smell. A removal pattern represents a recurring case where the composite tends to remove a smell. There is no empirical study in the literature that reports composites that typically remove or introduce smells. By answering RQ3, we are able to reveal composites used by developers not only to remove, but also to inadvertently introduce smells. The knowledge about creational patterns makes developers aware of the risks of introducing certain smells during composite refactoring. The removal patterns can be useful to implement recommendation systems to support developers when removing smells.
4.2 Study Phases
This section presents the five phases of the study design.
Phase 1: Dataset Acquisition. In this phase, we choose a set $S$ of software projects to analyze. We established GitHub as the source of projects. To select them, we followed criteria based on closely related studies [6, 10]. We selected projects with (1) different levels of popularity, based on the number of GitHub stars, (2) an active issue tracking system, and (3) at least 90% of code written in Java. These criteria allowed us to select 48 projects with a diversity of structure, domain, size and popularity. The replication package contains information about them [43].
Phase 2: Smell and Refactoring Detection. In this phase, we detected (i) the refactorings in all subsequent pairs of commits $c_i$ and $c_{i+1}$, and (ii) all smells in each commit $c_i \in Commits(s)$. We chose Refactoring Miner [52] to detect refactorings for two reasons. First, the tool has a precision of 98% and a recall of 87% as reported by Tsantalis et al. [52], which leads to a low rate of false positives and false negatives. Second, the tool identifies the most common refactoring types applied by developers [33]. We considered all 14 refactoring types identified by the tool. Refactoring Miner gives us as output a list of refactorings $R(c_i, c_{i+1}) = \{r_1, \cdots, r_k\}$ as defined before, where $k$ is the number of identified refactorings.
Code smells are often detected with metric-based strategies [4]. Each strategy is defined based on a set of metrics and thresholds. After collecting metrics for all projects, we applied the rules to detect smells [5, 23, 27]. These rules were used because: (i) they represent refinements of well-known rules proposed by Lanza et al. [23], which are used in related studies [6, 10, 29, 57]; and (ii) they have, on average, a precision of 72% and a recall of 81% [26]. We collected 19 smells: Brain Class, Brain Method, Class Data Should Be Private, Complex Class, Data Class, Dispersed Coupling, Divergent Change, Feature Envy, God Class, Intensive Coupling, Large Class, Lazy Class, Long Method, Long Parameter List, Message Chain, Refused Bequest, Shotgun Surgery, Spaghetti Code, Speculative Generality.
Phase 3: Manual Validation. We randomly sampled refactorings from each type to validate them manually. To ensure an acceptable confidence level in the results, we calculated the sample size of each refactoring type based on a confidence level of 95% and a confidence interval of 5 points. We recruited ten undergraduate students from another research group to also analyze the samples. The samples were divided into ten disjoint sets, and each student validated one. For each pair of elements, they had to mark it as a valid refactoring or not. Thus, we estimated the number of false positives generated by Refactoring Miner [52]. We highlight that our goal was to ensure the trustworthiness of the tool for our set of systems. For that matter, we relied on students, familiar with refactoring, to validate the tool. After the manual validation, we observed that the tool achieved high precision for all refactoring types, with a median of 88.36%. The precision for all refactoring types is within one standard deviation (7.73). Applying the Grubbs outlier test (alpha=0.05), we did not find any outlier. This result indicates that no refactoring type is strongly influencing the median precision. Thus, the precision for all the refactorings in the validated sample lends trustworthiness to our results.
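The sample size for a 95% confidence level and a confidence interval of 5 points can be computed in several ways. The sketch below uses Cochran's formula with a finite population correction, which is an assumption: the text does not state which formula was applied. The population sizes in the usage lines are made up.

```python
import math


def sample_size(population: int, z: float = 1.96, margin: float = 0.05,
                p: float = 0.5) -> int:
    """Cochran's sample size with finite population correction.

    z      : z-score for the confidence level (1.96 for 95%)
    margin : confidence interval half-width (0.05 for 5 points)
    p      : assumed proportion; 0.5 is the most conservative choice
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)        # finite population correction
    return math.ceil(n)


# Hypothetical numbers of detected refactorings for two refactoring types.
for population in (100, 1000):
    print(population, sample_size(population))
```

The correction matters for rare refactoring types: the smaller the population, the larger the fraction of it that has to be inspected.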
Some smells can be introduced by functional changes, such as the implementation of a new feature. Thus, we also validated whether the smells were introduced or removed by the refactorings. First, we ran the eGit plugin and the Linux diff tool to find changes between commits. Then, we manually analyzed each change. We also analyzed the commit message to verify if there was any sign that the developer applied a pure refactoring. When we identified a functional change, we classified it as non-pure refactoring [33]; otherwise, we classified it as pure refactoring. We validated 1,168 pure refactorings and 3,817 non-pure refactorings. We used the pure refactorings to confirm some results in Sections 5 and 6.
Phase 4: Synthesis and Classification of Composites. The heuristics to synthesize composites require a refactoring history as input (Section 3.3). We collected this history for each project in Phase 2. Each refactoring history was submitted to the algorithms that implement the heuristics, allowing us to collect: (i) element-based, (ii) range-based, and (iii) commit-based composites. After collecting the composites, we classified them according to their effect on smells. Thus, composites were classified as positive, negative, and neutral. Finally, we identified composite patterns related to the introduction and removal of specific types of smell. More details
MSR ’20, October 5–6, 2020, Seoul, Republic of Korea Sousa et al.
about the composite patterns are provided in Section 6. The al-
gorithms (scripts) that implement the heuristics and classify the
composites are available in the replication package [43].
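The classification step can be sketched as a comparison of the smells present before and after a composite. This is a minimal sketch under an assumption: a composite is taken to be positive when it leaves fewer smell instances than it found, negative when it leaves more, and neutral otherwise. The authoritative definition is the one in Section 4.1 and the replication package; the function name and the input format are hypothetical.

```python
def classify_composite(smells_before, smells_after):
    """Classify a composite by the change in the number of smell instances.

    Both arguments are lists of smell names detected in the elements
    touched by the composite (assumed input format).
    """
    delta = len(smells_after) - len(smells_before)
    if delta < 0:
        return "positive"   # fewer smells after the composite
    if delta > 0:
        return "negative"   # more smells after the composite
    return "neutral"        # same number of smells


print(classify_composite(["Feature Envy", "Long Method"], ["Long Method"]))
print(classify_composite(["Long Method"], ["Long Method", "God Class"]))
print(classify_composite(["Long Method"], ["Long Method"]))
```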
Phase 5: Systematic Validation of Composite Patterns. To increase the reliability of our results, we conducted a systematic manual validation of a random sample of composites. First, we selected 130 composites associated with the introduction and removal of Feature Envy and God Class. We focused on these smells since they are the ones with the most complex composites (Section 6). Then, we randomly divided the composites among 4 researchers. For each composite, the researcher conducted the following steps.
(1) Select the GitHub project where the composite happened;
(2) Identify the commits where the composite occurs;
(3) Validate the refactorings and the smells in the elements;
(4) Confirm if the composite is a creational or removal pattern;
(a) If yes: confirm if the composite explicitly introduced/removed the smell or if it is at least associated with the smell introduction/removal.
(b) If no: verify if the composite is an incomplete one, i.e., if one or more refactorings in the removal pattern would have removed the smell.
(5) Analyze the commit messages to find the developer’s intention when performing the composite.
(6) Verify if the refactorings within a commit-based composite are semantically related. For this purpose, we analyzed the commit message and also whether the refactorings addressed a task associated with the commit.
We validated 40 creational patterns, 43 removal patterns and 47 incomplete composites. We will use the validated composites to exemplify our discussions. In these cases, we will identify the composite by the “#” symbol followed by its id (e.g., composite #21517). Our replication package contains all the validated instances and the detailed steps and information to validate them.
5 COMPOSITES: OCCURRENCE AND EFFECT
We identified 27,911 composites in our dataset. We present their characteristics (Section 5.1) and smell effects (Section 5.2).
5.1 Synthesized Composites
5.1.1 Quantity and Size. This section addresses our RQ1. Table 1 shows, for each heuristic (1st column), the quantity (2nd column) and size of composites.
Table 1: Quantity and size of composites by heuristic
Heur.    № Comp.  Ref. in Comp.  Size Min  Size Med.  Size Max  Size Avg  Std. Dev.  Grubbs Score  № Elem.
Element  12,636   28,394 (54%)   2         2          333       3.9       6.6        49.89538      4,579
Commit   11,545   47,218 (91%)   2         3          2,562     8.0       44.4       57.76980      51,472
Range    3,761    28,883 (55%)   2         2          2,556     7.7       62.2       41.09278      18,132
Providing a broader view on the composites. In Section 3.2, we discussed that the element-based heuristic proposed by Bibiano et al. [6] may not be appropriate for researchers who want to investigate the effect of composites on smells. The reason is that there are several elements affected by the refactorings that this heuristic would ignore by definition. Indeed, the number of refactored elements in the element-based composites is lower when compared to the other categories of composites (last column in Table 1). When we compare the average size of element-based composites with the commit- and range-based composites (7th column), we notice a difference in the number of refactorings in each category of composite. Comparing the number of elements with the average size, we notice that the commit- and range-based composites are fragmented in the element-based composites. This result shows how the element-based heuristic only provides a partial view of the composites. The analysis of refactored elements leads to our first finding:
Finding 1: Commit- and range-based heuristics allow a broader assessment of the interrelation among refactored elements.
Capturing complex composites. Our heuristics are helpful to find complex composites. A composite is complex when it is composed of a high number of refactorings, usually affecting multiple elements. When we consider the average number of refactorings in a composite (7th column), the size of commit-based (8.0) and range-based (7.7) composites is nearly twice the size of element-based composites (3.9). This comparison shows that the number of interrelated refactorings (in commit-based or range-based composites) is much larger than any occurrence in the context of a single element. We also found that 1,545 (41%) out of 3,761 composites of the range-based heuristic, and 5,793 (50%) out of 11,545 composites of the commit-based heuristic have 3 to 20 interrelated refactorings in conjunction. Therefore, studies that investigated only single refactorings or only refactorings affecting an element [5, 7, 10, 11, 13, 15, 16, 47, 58] are not able to identify complex composites. Thus, they are oversimplifying the study of refactoring. This result leads us to our next finding:
Finding 2: There is a non-ignorable frequency of complex composites that most empirical studies missed.
Most refactorings are interrelated. After applying the heuristics, a given refactoring will be either classified as a single refactoring or interrelated with others in a composite. In this vein, the 3rd column of Table 1 presents the quantity of interrelated refactorings. As expected, the commit-based heuristic was the one that grouped the highest number of interrelated refactorings. The heuristic synthesized 11,545 composites, totaling 47,218 interrelated refactorings, which represents 91% of the total of refactorings in our dataset. Previous empirical studies [10, 33] reported that Extract Method and Rename Method are the most common refactoring types applied by developers. These studies may give the simplistic impression that developers tend to most commonly apply single refactorings with a strict scope, i.e., refactorings that affect one or two methods of a single class. However, this is not the case.
Even though Extract and Rename Method are the most com-
mon refactoring types, they are most often interrelated with other
refactorings and they tend to be complex. For example, when we
manually validated the 130 composite instances, we found that
when these two refactoring types are applied, they are frequently
part of a much more complex transformation that goes beyond the
scope of a single method or class. For instance, when developers
had the intention to improve the source code, all the refactorings
were associated with the same task: code improvement (e.g., composites #22691 and #22703; these composites are available in our replication package [43]). This is even clearer for the commit-based composites. Since most of the refactorings occur within a commit (91%), the refactorings are associated with the task’s commit.
Finding 3: Refactoring composites are much more complex than what existing empirical studies suggest.
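The share of interrelated refactorings under the commit-based view can be illustrated with a small grouping sketch. It assumes that a commit-based composite is any group of two or more refactorings detected in the same commit (consistent with the minimum size of 2 in Table 1); the record format, commit ids and element names are made up for the example.

```python
from collections import defaultdict

# Toy refactoring history; commit ids and element names are made up.
history = [
    {"commit": "c1", "type": "Extract Method", "element": "A.foo"},
    {"commit": "c1", "type": "Rename Method", "element": "B.bar"},
    {"commit": "c1", "type": "Move Method", "element": "C.baz"},
    {"commit": "c2", "type": "Extract Method", "element": "A.foo"},
    {"commit": "c3", "type": "Inline Method", "element": "D.qux"},
]


def commit_based_composites(history):
    """Group refactorings by commit; keep groups with two or more refactorings."""
    by_commit = defaultdict(list)
    for refactoring in history:
        by_commit[refactoring["commit"]].append(refactoring)
    return {c: refs for c, refs in by_commit.items() if len(refs) >= 2}


def interrelated_share(history, composites):
    """Fraction of refactorings that belong to some composite."""
    grouped = sum(len(refs) for refs in composites.values())
    return grouped / len(history)


composites = commit_based_composites(history)
print(len(composites), interrelated_share(history, composites))
```

Note that the three refactorings in commit `c1` touch structurally unrelated elements, which is exactly the situation an element-based grouping would split into single refactorings.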
Semantic relation among refactorings. When we analyze the commit-based composites, only 9% of the refactorings do not belong to a composite. This result indicates that 91% of the refactorings are interrelated. Thus, either these refactorings are part of range-based composites (55%) or they occur in elements that are not structurally related to each other. This result indicates that when developers are working on a task, there are several refactorings that are not syntactically related to each other. As the refactorings in the commit-based composites are not syntactically related, we investigated if they had any relation. We found that several of these refactorings are semantically associated with the task that the developer is addressing in the commit. For example, several of the refactorings were applied to remove smells in different elements. These refactorings were not structurally related to each other, but they were semantically related to each other since they aimed to remove smells (Section 5.2). Notice that if one analyzes only the range-based composites, s/he would not be able to identify the semantic relation between the refactorings. This result leads us to our next finding:
Finding 4: Several commit-based composites contain refactorings that are semantically related to each other.
This finding may jeopardize most refactoring recommendation systems [18, 24, 31, 37, 41, 42]. These systems tend to consider only the structurally related refactorings to learn how to recommend refactorings. However, they do not explore the semantic relation among refactorings. Only considering structurally related refactorings may not suffice to provide recommendations for developers.
Our dataset also contains extremely large composites (Table 1). However, we consider them as outliers, since they are rare. For the commit-based heuristic, for example, 87% of them are composed of 10 or fewer refactorings. Only 0.004% of the commit-based composites have more than 100 refactorings. To confirm that the largest composites are outliers, we applied the Grubbs test for one outlier. Table 1 shows the Grubbs score in the penultimate column. The score is calculated as the highest composite size minus the mean, divided by the standard deviation. We can accept the hypothesis that the highest sizes of all heuristics are outliers since for all of them the Grubbs scores were higher than the critical values. Besides that, we observed p-values smaller than 0.00001 for all heuristics, which means that the results are statistically significant. In our replication package [43], we have a manual analysis of these outliers.
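The Grubbs score described above, G = (max − mean) / standard deviation, can be recomputed from the summary values in Table 1. The sketch below does only that; it does not compute the critical values, and because it starts from the rounded averages and deviations printed in the table, its results differ slightly from the reported scores. The function and variable names are hypothetical.

```python
def grubbs_statistic(max_value, mean, std_dev):
    """One-sided Grubbs statistic for the largest observation."""
    return (max_value - mean) / std_dev


# (max size, average size, standard deviation) as printed in Table 1.
table1 = {
    "Element": (333, 3.9, 6.6),
    "Commit": (2562, 8.0, 44.4),
    "Range": (2556, 7.7, 62.2),
}

for heuristic, (max_size, avg, sd) in table1.items():
    print(f"{heuristic}: G = {grubbs_statistic(max_size, avg, sd):.2f}")
```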
5.1.2 Heterogeneity and Timespan of Composites. Table 2 presents
the results about the timespan and uniformity of composites.
Table 2: Timespan and uniformity characteristics
         Timespan                         Uniformity
Heur.    Single-Commit    Cross-Commit    Homoge.         Heteroge.
Element  9,094 (72.0%)    3,542 (28.0%)   11,107 (87.9%)  1,529 (12.1%)
Commit   11,545 (100.0%)  0 (0.0%)        6,484 (56.0%)   5,061 (44.0%)
Range    3,486 (93.5%)    244 (6.5%)      2,875 (77.0%)   855 (23.0%)
Most composites are single-commit. Table 2 shows that most composites are single-commit. This occurs even in the case of the range-based composites, which may have a larger composite scope. We were expecting that developers could start a composite in a commit and finish it in the following commits. However, our results show that developers tend to limit the composites to a single commit. This suggests that they intend to perform all refactorings at once, without splitting the task into multiple commits.
Most composites are homogeneous. Table 2 shows that most composites are homogeneous, i.e., they have the same refactoring type. We were not expecting this result. Fowler [14], in his book, presents a catalog of multiple refactorings that can be applied to remove some smells. Hence, we assumed that developers would apply heterogeneous composites in practice. However, our assumption does not hold in practice since most composites are homogeneous. The highest incidence of heterogeneous composites is among the commit-based composites, which can be explained by the semantic relation among refactorings. As discussed, any refactoring performed in a given commit can be semantically related to the same task, even if these refactorings are applied to structurally unrelated elements. The result about uniformity indicates that developers frequently apply the same refactoring type when restructuring related elements. These results lead us to our next finding:
Finding 5: Even though homogeneous and single-commit composites are more frequent than their counterparts, heterogeneous and cross-commit composites occur with a non-ignorable frequency, which should not be overlooked.
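The two characteristics reported in Table 2 can be derived directly from the refactorings of a composite: it is homogeneous when all its refactorings share one type, and single-commit when they all come from one commit. The sketch below assumes a composite is represented as a list of records with hypothetical `type` and `commit` fields; the sample data is made up.

```python
def uniformity(composite):
    """'homogeneous' if every refactoring has the same type."""
    types = {r["type"] for r in composite}
    return "homogeneous" if len(types) == 1 else "heterogeneous"


def timespan(composite):
    """'single-commit' if every refactoring comes from the same commit."""
    commits = {r["commit"] for r in composite}
    return "single-commit" if len(commits) == 1 else "cross-commit"


# Made-up composites for illustration.
same_type_one_commit = [
    {"type": "Extract Method", "commit": "c1"},
    {"type": "Extract Method", "commit": "c1"},
]
mixed_types_two_commits = [
    {"type": "Extract Method", "commit": "c1"},
    {"type": "Move Method", "commit": "c2"},
]

for composite in (same_type_one_commit, mixed_types_two_commits):
    print(uniformity(composite), timespan(composite))
```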
5.2 Effect of Composites on Code Smells
To answer RQ2, we classified the composites as positive, negative or neutral (Section 4.1). Table 3 shows this classification.
Table 3: Composite classification by heuristic
Heuristic      Positive       Neutral         Negative
Element-based  751 (6.0%)     11,264 (89.1%)  621 (4.9%)
Commit-based   1,653 (14.3%)  6,019 (52.1%)   3,873 (33.6%)
Range-based    542 (14.5%)    2,020 (54.2%)   1,168 (31.3%)
Several positive and negative composites. Table 3 shows that the frequency of positive, negative and neutral composites from the element-based heuristic differs from the commit- and range-based heuristics. First, Bibiano et al. found similar values for the element-based heuristic. However, if we analyze only from the perspective of the element-based heuristic, we will conclude that the frequency of positive and negative composites is almost negligible. However, this conclusion is not correct. The other heuristics show that the positive and negative composites are almost as frequent as neutral composites. In fact, the frequency of positive, negative and neutral composites is higher than the results reported in the literature [5, 6, 10]. As discussed, the scope of some refactoring types goes beyond a single element. However, the element-based heuristic only considers
the scope of a single element. Thus, this heuristic is not entirely appropriate to investigate refactorings that crosscut elements. This limitation compromises the study of Bibiano et al. [6]. In their study, the effect of several refactorings out of the composite scope is ignored. This result leads to our next finding:
Finding 6: Effects of composites often can only be observed through the reasoning of refactoring’s relations in the scope of a range or a commit.
Negative composites are more likely than positive ones. We had an increase in the number of positive composites when we compare the element-based composites with the other categories. As discussed in Finding 4 (Section 5.1.1), several refactorings are not syntactically related to each other but are semantically related. This scenario occurred, for instance, when developers had the task of removing Duplicate Code smells scattered over different parts of the system. When we manually analyzed the commit messages for some of the refactorings, we noticed that the developers tagged the commits as “structural improvements.” In these commits, we found three distinct cases where each developer was removing Duplicate Code. All the commits were tagged with the structural improvement label, and the developer applied, throughout multiple commits, refactorings to remove the duplication.
We found several instances of the following commit-based composite cr1 = {Extract Superclass, Rename Method} to remove Duplicate Code. The developer applied the Extract Superclass to create a superclass for the classes with the smell. Then, s/he renamed the method in the superclass to be consistent with the functionality provided. We found a case in which a system had three different unrelated instances of Duplicate Code in the same commit. For each instance, the developer applied the composite cr1. Despite the increase in positive composites, developers are more likely to introduce smells, as shown in Table 3. This result leads to the next finding:
Finding 7: Even though most composites are neutral, a non-ignorable frequency of composites introduce smells.
Effect of the composite on the smell type. We relied on the classification of each composite to investigate its influence on the incidence of smells (Section 4.1). We found a case in which the developer applied a composite to a class that had two smells: Feature Envy and Message Chain. After the composite had been applied, we noticed that the developer removed the Message Chain, but s/he introduced a God Class. In this case, our classification scheme would classify the composite as neutral. However, a God Class would often be considered worse than a Message Chain. Hence, it would not be fair to label the composite as neutral. Considering the “criticality” of the smell, this composite is more likely to be considered negative because the structure is worse than before. To mitigate the risk of misclassifying neutral composites, we verified in our dataset the smells present before and after each neutral composite. We observed only 30 cases, in a set that contains 27,911 composites, in which a smell was replaced by another of a different type. This investigation leads to our next finding:
Finding 8: The refactorings in neutral composites very often do not replace a smell type with another type.
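The check for neutral composites that swap one smell type for another can be sketched as a multiset comparison. The example reuses the case described above (Message Chain removed, God Class introduced, Feature Envy unchanged); the function name and the list-based input format are assumptions for illustration.

```python
from collections import Counter


def replaced_smell_types(smells_before, smells_after):
    """Return (removed types, introduced types) for one composite.

    Counter subtraction keeps only positive counts, so a smell that is
    present both before and after cancels out.
    """
    before, after = Counter(smells_before), Counter(smells_after)
    removed = sorted(before - after)
    introduced = sorted(after - before)
    return removed, introduced


def is_type_replacement(smells_before, smells_after):
    """True for a neutral composite whose smell types changed."""
    removed, introduced = replaced_smell_types(smells_before, smells_after)
    same_count = len(smells_before) == len(smells_after)
    return same_count and bool(removed) and bool(introduced)


before = ["Feature Envy", "Message Chain"]
after = ["Feature Envy", "God Class"]
print(replaced_smell_types(before, after))
print(is_type_replacement(before, after))
```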
6 COMPOSITE-SMELL PATTERNS
To address RQ3, we analyzed removal and creational patterns emerging from the relationship between range-based composites and smells (Section 4.1). We focus on discussing here the patterns of range-based composites that affect Feature Envy and God Class. We discuss these smells because they are usually associated with the system structural degradation [1, 27, 56]. Patterns for the other smells and categories of composites are available in our replication package [43]. We manually inspected several instances of the patterns to understand what happened. In particular, we also confirmed whether the composites were directly related to the removal or introduction of the smell. We ended up identifying 111 composite-smell patterns: 84 removal patterns and 27 creational patterns.
6.1 Feature Envy
Feature Envy is a code smell that represents a method much more interested in the data of a class other than the one in which it is actually declared [14]. This smell is the most frequent one in our dataset. Figure 2 presents all 13 composite types related to Feature Envy. Green boxes represent the removal patterns; they appear on the right side of Figure 2. The red ones, on the left side, represent the creational patterns. The content of each box represents the type of composite involved in the pattern. There is a caveat regarding the repetition structure: the {n} symbol indicates the refactoring type was observed more than once in the composite structure.
The arrow weight indicates the frequency of a pattern with: (i) a removal behavior if the arrow is pointing to a green box, and (ii) creational behavior if the arrow is departing from a red box. For instance, the top-right green box indicates that 77% of the time a composite with more than one Inline Method followed by more than one Extract Method removes one instance of Feature Envy. The same rationale is used to interpret the creational patterns.
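The arrow weights can be read as per-pattern frequencies. The sketch below shows one way to compute them: each composite is reduced to a signature in which a run of the same refactoring type is written with the {n} suffix, and the weight is the fraction of composites with that signature that removed the smell. Collapsing only consecutive repetitions, the input format and the sample observations are all assumptions made for illustration; the scripts in the replication package are the authoritative implementation.

```python
from collections import Counter
from itertools import groupby


def signature(refactoring_types):
    """Collapse runs of one refactoring type into 'Type{n}'."""
    parts = []
    for ref_type, run in groupby(refactoring_types):
        count = sum(1 for _ in run)
        parts.append(ref_type + ("{n}" if count > 1 else ""))
    return ",".join(parts)


def removal_frequency(observations):
    """Map each signature to the fraction of its composites that removed the smell.

    observations: iterable of (refactoring_types, removed_smell) pairs.
    """
    total, removed = Counter(), Counter()
    for refactoring_types, removed_smell in observations:
        sig = signature(refactoring_types)
        total[sig] += 1
        removed[sig] += int(removed_smell)
    return {sig: removed[sig] / total[sig] for sig in total}


# Made-up observations: three of four composites of this type removed the smell.
inline_extract = ["InlineMethod", "InlineMethod", "ExtractMethod", "ExtractMethod"]
observations = [(inline_extract, True)] * 3 + [(inline_extract, False)]
observations.append((["ExtractMethod"], False))

frequencies = removal_frequency(observations)
# Keep only signatures that removed the smell in more than 50% of the cases.
patterns = {sig: f for sig, f in frequencies.items() if f > 0.5}
print(patterns)
```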
[Figure 2 content. Creational patterns (red boxes, left), with arrow weights between 0.60 and 0.70: MoveAttribute{n},ExtractMethod; MoveAttribute,ExtractMethod{n}; MoveAttribute,ExtractMethod; RenameMethod{n},ExtractMethod{n}. Removal patterns (green boxes, right), with arrow weights between 0.61 and 0.96: ExtractMethod,MoveAttribute{n}; InlineMethod{n},ExtractMethod{n}; InlineMethod{n},ExtractMethod; ExtractMethod,InlineMethod; InlineMethod,ExtractMethod; ExtractMethod{n},InlineMethod; ExtractMethod,MoveAttribute; ExtractMethod,MoveMethod; InlineMethod,ExtractMethod{n}.]
Figure 2: Feature Envy patterns
We discussed in Section 5.1.1 that Extract Method is one of the most common refactorings and it is most often interrelated with other refactorings. Indeed, Figure 2 shows that all patterns have at least one Extract Method (EM). Neither the discussion about Extract Method in Section 5.1.1, nor the identification of composite patterns would be possible if (i) we had only analyzed single refactorings or (ii) used the element-based heuristic.
Incomplete composites. We noticed cases of composites consistently introducing Feature Envies in 31 projects. Composites with Move Attribute, Extract Method introduced Feature Envies in more than 60% of the cases as shown in Figure 2. These creational patterns indicate that the composites are “incomplete”, which contributed to the introduction (rather than the removal) of the Feature Envy. An incomplete composite occurs when a set of refactorings affects the smelly structure, but is not sufficient to fully remove a smell. It may even worsen the smelly structure. For instance, the developers moved attributes in the first three creational patterns in Figure 2; however, they did not move the corresponding extracted methods to fully remove the envy structures. Consequently, the “unmoved methods” became more interested in the classes to which the attributes were moved. These composites led to the introduction of the Feature Envy because they were incomplete; i.e., a Move Method should also be part of such composites. Examples falling into this scenario include composites #22092, #22156 and #22419.
This type of scenario reinforces our discussion about the high number of negative composites (Finding 7). As we discussed in Section 5.2, our heuristics show that several composites are negative. This increase in the number of negative impacts is related to the incomplete composites. We found that developers are trying to improve the program structure during the refactoring process but, for different reasons, they are not necessarily completing the restructuring process to fully remove the smelly structure. As a consequence, incomplete composites lead to the introduction of smells, such as the Feature Envy. These incomplete composites were also observed in patterns for the other smell types.
Finding 9: Developers tend to introduce smells, such as Feature Envies, due to incomplete composites.
Avoiding misleading results. As discussed, Bibiano et al. [6] do not provide a broader understanding of the effect of composites on smells, which can lead to misleading results. The same occurs with studies that only focus on single refactorings [5, 10]. For example, Bavota et al. [5] did not find any relation between specific smells (e.g., Feature Envy) and specific refactorings (e.g., EM). To illustrate how these studies are not able to either provide a broader view or find relation between refactorings and smells, let us consider the EM refactoring since it occurs in all the patterns associated with the Feature Envy (Figure 2). We applied the Fisher’s Exact Test to investigate the relation between EM and Feature Envy (Table 4). For each heuristic (1st column), we present the number of composites containing EM that removed and introduced Feature Envies, 2nd and 3rd columns respectively. The 4th and 5th columns show the same information for composites without EM. The last two columns show the p-value and odds ratio (OR) for the Fisher’s Exact Test.
Table 4: Fisher’s test results for Feature Envy patterns
Heuristic  Positive With EM  Negative With EM  Positive Without EM  Negative Without EM  p-value    OR
Element    496               86                0                    0                    1          0
Commit     15,632            2,013             31,398               39,000               <0.000001  9.64
Range      360               110               25                   0                    0.002338   0
We ran the test with 95% confidence, which means that we can reject the null hypothesis (H0) when the p-value is smaller than 0.05. In our case, H0 is that the introduction or removal of Feature Envies by composites is independent of the presence of EM. Given the p-values, the element-based heuristic is the only case in which we cannot reject H0. Therefore, the element-based composites mislead us to believe that composites without EM will never remove or introduce Feature Envies. However, the results of the other heuristics show the opposite, especially in the case of commit-based composites. Thus, our heuristics were able to reveal that EM often “partially” contributes to the removal (and introduction) of Feature Envy, when performed with other refactorings (composites). In summary, only analyzing element-based composites [6] or single refactorings [5, 10] does not provide a broader understanding of composites, or, in the worst-case scenario, it can lead to an erroneous result. This discussion reinforces Finding 1 (Section 5.1.1).
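The test applied to each row of Table 4 operates on a 2x2 contingency table (with/without EM versus positive/negative). The sketch below is a minimal standard-library implementation of the sample odds ratio and of a two-sided Fisher's exact test, written for illustration; it is not the authors' script, the function names are hypothetical, and the exact test is only run on the smaller rows because this naive version would be slow for counts as large as those in the Commit row.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c) for the table [[a, b], [c, d]]."""
    numerator, denominator = a * d, b * c
    if denominator == 0:
        return float("nan") if numerator == 0 else float("inf")
    return numerator / denominator


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value: sum of the probabilities of all tables with the
    same margins that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def pmf(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = pmf(a)
    tolerance = observed * (1 + 1e-7)   # guard against float rounding
    total = sum(p for p in map(pmf, range(lowest, highest + 1)) if p <= tolerance)
    return min(1.0, total)


# Rows of Table 4: positive/negative with EM, positive/negative without EM.
print("Commit OR:", odds_ratio(15632, 2013, 31398, 39000))
print("Range OR:", odds_ratio(360, 110, 25, 0))
print("Element p:", fisher_exact_two_sided(496, 86, 0, 0))
print("Range p:", fisher_exact_two_sided(360, 110, 25, 0))
```

For the Element row no composite lacks EM, so only one table is compatible with the margins and the p-value is necessarily 1; this is why that row cannot say anything about composites without EM.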
6.2 God Class
Our second set of composite-smell patterns concerns the God Class. This smell exists when a class accumulates several responsibilities [14]. We found out that this smell is more frequent than one might expect. We found 425 distinct instances of God Class distributed across 26 projects. Figure 3 presents all the 12 patterns.
[Figure 3 content. Creational pattern, with arrow weight 0.81: RenameMethod{n},ExtractMethod{n}. Removal patterns, with arrow weights between 0.50 and 0.78: InlineMethod{n},ExtractMethod{n}; PullUpMethod{n},MoveMethod,PullUpMethod; MoveMethod{n}; PullUpMethod{n},MoveMethod,PullUpMethod{n}; ExtractMethod{n},InlineMethod; PullUpAttribute{n},PullUpMethod{n}; InlineMethod{n}; ExtractMethod{n}; PullUpMethod{n}; PullUpAttribute{n},PullUpMethod{n},MoveMethod,PullUpMethod; PullUpAttribute,PullUpMethod{n}.]
Figure 3: God Class patterns
Palomba et al. showed that when developers implement new features, they often apply complex refactorings to improve the code cohesion [44]. Our results provide a new perspective regarding this scenario. We found that developers tend to decrease the code cohesion when interleaving refactorings with additional changes. For example, when developers apply composites of Rename Methods and Extract Methods, they tend to introduce God Class, as shown in Figure 3. At first sight, this pattern is not intuitive. Developers are not expected to increase the size of classes while performing Rename and Extract Methods. We analyzed these composites to understand why they led to the God Class.
Inappropriate additional changes. We found that this creational pattern exists when developers interleave refactoring with additional changes and when these refactorings are not performed in conjunction with other refactorings (e.g., composites #21517 and #20932). The additional changes comprise the creation of new methods (Extract Methods), which are, unfortunately, implementing unrelated functionalities. As a consequence of these additions in the extracted
methods, developers have to change the methods’ names to express the new functionalities (Rename Methods). As new functionalities are introduced, the class cohesion decreases, which leads to the appearance of a God Class. The composites with Rename Methods and Extract Methods were not the main reason for the introduction of the God Class. Still, a recommender system can use this pattern to improve its refactoring recommendations. For example, if a developer is introducing non-structural changes along with Rename Methods and Extract Methods, the system can alert the developer that s/he may introduce a structural problem.
Moving data to remove the God Class. We identified 11 removal patterns associated with the God Class. This result shows that developers often apply a wide range of non-trivial composites to remove the smell across software projects. For example, as discussed in the previous paragraphs, the God Class was introduced when the composites of Rename Methods and Extract Methods occurred with additional changes. We found that these changes introduced pieces of code that should not be in the classes, contributing to the God Class. Later on, developers had to apply several refactorings to move these pieces of code to the classes that suit them better, removing the God Class. This behavior of applying refactorings that move data is reflected in the removal patterns. All the removal patterns had refactorings that moved data between classes, except for Inline Method and Extract Method. This scenario is another example of why an element-based heuristic fails to show the effect of composites on smells. To remove God Class, developers apply refactorings that affect multiple elements, such as the classes to which the data is moved. However, if we analyze only the scope of a single element, we would not be able to notice that composites moving data play a central role in the addition and removal of God Classes. This behavior leads us to our next finding:
Finding 10: The range-based heuristic detects how data is moved among classes to either introduce or remove God Class.
Providing knowledge based on practice. Although some patterns emerge in the element-based heuristic, they only provide a partial view of composite effects. Several of the composite patterns reported here and in the replication package can only be identified with range-based and commit-based heuristics. Even Fowler’s catalog [14], which lists common composites to remove smells, does not report our patterns. For example, Fowler’s catalog indicates that developers should apply Extract Class or Extract Subclass to remove a God Class. However, we noticed that developers much more often follow other strategies regarding the refactoring types: Inline Method, Extract Method, Pull Up Method and Attribute, and Move Method. Thus, our results suggest that existing refactoring catalogs [14] may not reflect the practice. We also observed that existing recommenders for code smell removal do not recommend these patterns [31, 41, 51]. They should refine their recommendations with our smell-removal composite patterns.
7 THREATS TO VALIDITY
We relied on the Refactoring Miner [52], which leads to a threat associated with the false positives generated by the tool. To minimize this threat, we manually validated each refactoring type (Section 4.2). We observed a high precision for each refactoring type.
Some findings are centered around the difference among positive, negative and neutral composites. However, if our classification procedure is somewhat inaccurate, then we have a major threat to the validity of our data. To mitigate that, we studied all the cases where the classification procedure could be inaccurate (Section 5.2). We found a risk of the classification scheme being wrong on 0.01% of the cases. Thus, this risk was mitigated by the data disposition.
Our proposed heuristics may have limitations regarding how they group refactorings (composite synthesis). For example, a reason for defining the range-based heuristic is to capture composites that would be incomplete from the commit-based perspective. Even so, the range-based heuristic still can miss refactorings; thus, an incomplete composite can be a complete one if we use another synthesis strategy. One can consider these limitations as opportunities for other researchers to define their synthesis strategy. One could also investigate a unified heuristic that infers, for each refactoring, the most appropriate scope in that particular case by exploring additional contextual information from where it occurs.
We presented several patterns that remove or introduce smells.
We computed them by verifying how often they happen in the
projects, so they might suffer from a lack of generality. To mitigate this
threat, we only reported patterns that happened in more than 50%
of the instances in our dataset. Additionally, to check whether all
three heuristics could find these patterns, we verified the intersection
among them. We found that 16 (out of 27) creational patterns
and 80 (out of 84) removal patterns were found by all heuristics.
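The frequency filter and the cross-heuristic check described in this paragraph can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the pattern identifiers and counts are hypothetical placeholders, and reading "more than 50% of the instances" as the share of a pattern's instances that exhibit the reported effect is an assumption.

```python
from functools import reduce


def frequent_patterns(stats, threshold=0.5):
    """Keep patterns whose share of instances with the effect exceeds threshold.

    `stats` maps a pattern id to (instances_with_effect, total_instances).
    The 0.5 default mirrors the "more than 50%" cut-off (assumed reading).
    """
    return {
        pattern
        for pattern, (with_effect, total) in stats.items()
        if total > 0 and with_effect / total > threshold
    }


def common_patterns(by_heuristic):
    """Return the patterns found by every heuristic (set intersection)."""
    sets = list(by_heuristic.values())
    if not sets:
        return set()
    return reduce(set.intersection, sets)


def missed_by_some(by_heuristic):
    """Return the patterns that at least one heuristic did not find."""
    all_patterns = set().union(*by_heuristic.values())
    return all_patterns - common_patterns(by_heuristic)


# Hypothetical counts and pattern sets; illustrative only, not study data.
example_stats = {"P1": (6, 10), "P2": (5, 10), "P3": (0, 0)}
patterns_by_heuristic = {
    "element-based": {"P1", "P2", "P3"},
    "commit-based": {"P1", "P2", "P4"},
    "range-based": {"P1", "P2", "P3", "P4"},
}

kept = frequent_patterns(example_stats)
shared = common_patterns(patterns_by_heuristic)
not_shared = missed_by_some(patterns_by_heuristic)
```

Under this reading, a pattern at exactly 50% is excluded, and a pattern counts as found by all heuristics only if it appears in each heuristic's set.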
8 CONCLUSION
Composite refactoring is common in practice, but broad empirical
knowledge about it is scarce. To tackle this issue, first, we provided
a conceptual characterization of composites and defined two
heuristics to identify composites in different categories. Second, we
investigated how composites manifest in practice, and how they
affect the program structure. Our results show that studying composites
requires relying on different heuristics: they are complementary
to each other, but most empirical studies tend to use only a single
heuristic. For example, the identification of the semantically-related
refactorings was only possible using the commit-based and range-based
heuristics together. Similarly, the identification of several
composite-smell patterns was only possible with the range-based
heuristic.
Our results can be useful both for researchers and practitioners.
In particular, our study helped to explain conflicting results in the
literature. For instance, different studies (e.g., [5] and [6]) have come
to different conclusions regarding the relation of refactoring types
with specific code smells. We provided new evidence that
there are composite patterns strongly related to the introduction
or removal of specific code smells (which explains the divergence in
their results). On the practical side, we contributed with insights
and a set of composite-smell patterns that are useful for improving
existing refactoring detection tools or recommender systems.
ACKNOWLEDGMENT
We want to thank the reviewers for their valuable suggestions. This
work is funded by CNPq (grants 434969/2018-4, 312149/2016-6),
CAPES (grant 175956), and FAPERJ (grant 22520-7/2016).
Characterizing and Identifying Composite Refactorings MSR ’20, October 5–6, 2020, Seoul, Republic of Korea
REFERENCES
[1]
M Abbes, F Khomh, Y Gueheneuc, and G Antoniol. 2011. An Empirical Study
of the Impact of Two Antipatterns, Blob and Spaghetti Code, on Program Com-
prehension. In Proceedings of the 15th European Software Engineering Conference;
Oldenburg, Germany. 181–190.
[2]
Vahid Alizadeh and Marouane Kessentini. 2018. Reducing Interactive Refactoring
Effort via Clustering-based Multi-objective Search. In Proceedings of the 33rd
ACM/IEEE International Conference on Automated Software Engineering (ASE 2018).
ACM, New York, NY, USA, 464–474. https://doi.org/10.1145/3238147.3238217
[3]
Eman Abdullah AlOmar, Mohamed Wiem Mkaouer, Ali Ouni, and Marouane
Kessentini. 2019. Do Design Metrics Capture Developers Perception of Quality?
An Empirical Study on Self-Affirmed Refactoring Activities. In 13th ACM/IEEE
International Symposium on Empirical Software Engineering and Measurement
(ESEM 2019).
[4]
Roberta Arcoverde, Isela Macia, Alessandro Garcia, and Arndt von Staa. 2012.
Automatically Detecting Architecturally-Relevant Code Anomalies. Proceedings
of the International Workshop on Recommendation Systems for Software Engineering
(2012), 90–91. https://doi.org/10.1109/RSSE.2012.6233419
[5]
Gabriele Bavota, Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, and
Fabio Palomba. 2015. An Experimental Investigation On The Innate Relationship
Between Quality And Refactoring. Journal of Systems and Software 107 (2015),
1–14. https://doi.org/10.1016/j.jss.2015.05.024
[6]
Ana Carla Bibiano, Eduardo Fernandes, Daniel Oliveira, Alessandro Garcia, Mar-
cos Kalinowski, Baldoino Fonseca, Roberto Oliveira, Anderson Oliveira, and
Diego Cedrim. 2019. A Quantitative Study on Characteristics and Effect of
Batch Refactoring on Code Smells. In 13th International Symposium on Empirical
Software Engineering and Measurement (ESEM). 1–11.
[7]
Arnaud Blouin, Valéria Lelli, Benoit Baudry, and Fabien Coulon. 2018. User
interface design smell: Automatic detection and refactoring of Blob listeners.
Information and Software Technology 102 (2018), 49–64. https://doi.org/10.1016/j.infsof.2018.05.005
[8]
Aline Brito, Andre Hora, and Marco Tulio Valente. 2020. Refactoring Graphs:
Assessing Refactoring over Time. In 2020 IEEE 27th International Conference on
Software Analysis, Evolution and Reengineering (SANER). IEEE.
[9]
Diego Cedrim, Leonardo da Silva Sousa, Alessandro F. Garcia, and Rohit Gheyi.
2016. Does Refactoring Improve Software Structural Quality? A Longitudinal
Study of 25 Projects. In Proceedings of the 30th Brazilian Symposium on Software
Engineering. ACM, New York, NY, USA, 73–82. https://doi.org/10.1145/2973839.2973848
[10]
Diego Cedrim, Alessandro Garcia, Melina Mongiovi, Rohit Gheyi, Leonardo
Sousa, Rafael de Mello, Baldoino Fonseca, Márcio Ribeiro, and Alexander Chávez.
2017. Understanding the Impact of Refactoring on Smells: A Longitudinal Study
of 23 Software Projects. In Proceedings of the 11th Joint Meeting on Foundations
of Software Engineering (ESEC/FSE 2017). ACM, New York, NY, USA, 465–475.
https://doi.org/10.1145/3106237.3106259
[11]
Alexander Chávez, Isabella Ferreira, Eduardo Fernandes, Diego Cedrim, and
Alessandro Garcia. 2017. How Does Refactoring Affect Internal Quality
Attributes? A Multi-Project Study. In Proceedings of the 31st Brazilian
Symposium on Software Engineering (SBES’17). ACM, New York, NY, USA, 74–83.
https://doi.org/10.1145/3131151.3131171
[12]
Rafael Maiani de Mello, Anderson G. Uchôa, Roberto Felicio Oliveira,
Willian Nalepa Oizumi, Jairo Souza, Kleyson Mendes, Daniel Oliveira, Baldoino
Fonseca, and Alessandro Garcia. 2019. Do Research and Practice of Code
Smell Identification Walk Together? A Social Representations Analysis. In 2019
ACM/IEEE International Symposium on Empirical Software Engineering and Mea-
surement, ESEM 2019, Porto de Galinhas, Recife, Brazil, September 19-20, 2019. IEEE,
1–6.
[13]
Danny Dig, Kashif Manzoor, Ralph Johnson, and Tien N. Nguyen. 2007.
Refactoring-Aware Configuration Management for Object-Oriented Programs.
In Proceedings of the 29th International Conference on Software Engineering
(ICSE ’07). IEEE Computer Society, Washington, DC, USA, 427–436.
https://doi.org/10.1109/ICSE.2007.71
[14] Martin Fowler, Kent Beck, John Brant, William Opdyke, and Don Roberts. 1999.
Refactoring: Improving The Design Of Existing Code (1st ed.). Addison-Wesley
Longman Publishing Co., Inc., Boston, MA, USA. 464 pages.
[15]
Kenji Fujiwara, Kyohei Fushida, Norihiro Yoshida, and Hajimu Iida. 2013.
Assessing Refactoring Instances and the Maintainability Benefits of Them from
Version Archives. Springer Berlin Heidelberg, Berlin, Heidelberg, 313–323.
https://doi.org/10.1007/978-3-642-39259-7_25
[16]
Birgit Geppert, Audris Mockus, and Frank Rossler. 2005. Refactoring for Change-
ability: A Way to Go?. In Proceedings of the 11th IEEE International Software
Metrics Symposium (METRICS ’05). IEEE Computer Society, Washington, DC,
USA, 13–. https://doi.org/10.1109/METRICS.2005.40
[17]
Everton T. Guimarães, Alessandro F. Garcia, and Yuanfang Cai. 2015.
Architecture-sensitive heuristics for prioritizing critical code anomalies. In Proceedings of the
14th International Conference on Modularity, MODULARITY 2015, Fort Collins, CO,
USA, March 16 - 19, 2015, Robert B. France, Sudipto Ghosh, and Gary T. Leavens
(Eds.). ACM, 68–80.
[18]
Mark Harman and Laurence Tratt. 2007. Pareto optimal search based refactoring
at the design level. In 9th Genetic and Evolutionary Computation Conference
(GECCO). 1106–1113.
[19]
Miryung Kim, Thomas Zimmermann, and Nachiappan Nagappan. 2012. A Field
Study of Refactoring Challenges and Benefits. In Proceedings of the ACM SIGSOFT
20th International Symposium on the Foundations of Software Engineering (FSE
’12). ACM, New York, NY, USA, Article 50, 11 pages.
https://doi.org/10.1145/2393596.2393655
[20]
Miryung Kim, Thomas Zimmermann, and Nachiappan Nagappan. 2014. An
Empirical Study of Refactoring Challenges and Benefits at Microsoft. IEEE
Transactions on Software Engineering 40, 7 (2014), 633–649.
https://doi.org/10.1109/TSE.2014.2318734
[21]
H. Kirinuki, Y. Higo, K. Hotta, and S. Kusumoto. 2016. Splitting Commits via
Past Code Changes. In 2016 23rd Asia-Pacific Software Engineering Conference
(APSEC). 129–136. https://doi.org/10.1109/APSEC.2016.028
[22]
Martin Kuhlemann, Liang Liang, and Gunter Saake. 2010. Algebraic and cost-
based optimization of refactoring sequences. In 2nd International Workshop on
Model-driven Product Line Engineering (MDPLE). 37–48.
[23]
Michele Lanza and Radu Marinescu. 2010. Object-Oriented Metrics in Practice:
Using Software Metrics to Characterize, Evaluate, and Improve the Design of Object-
Oriented Systems (1st ed.). Springer Publishing Company, Incorporated.
[24]
Yun Lin, Xin Peng, Yuanfang Cai, Danny Dig, Diwen Zheng, and Wenyun Zhao.
2016. Interactive and guided architectural refactoring with search-based recom-
mendation. In 24th International Symposium on Foundations of Software Engineer-
ing (FSE). 535–546.
[25]
Kui Liu, Dongsun Kim, Tegawendé F. Bissyandé, Taeyoung Kim, Kisub Kim,
Anil Koyuncu, Suntae Kim, and Yves Le Traon. 2019. Learning to Spot and
Refactor Inconsistent Method Names. In Proceedings of the 41st International
Conference on Software Engineering (ICSE ’19). IEEE Press, Piscataway, NJ, USA,
1–12. https://doi.org/10.1109/ICSE.2019.00019
[26]
Isela Macia. 2013. On The Detection Of Architecturally Relevant Code Anomalies
In Software Systems. Ph.D. Dissertation. Pontifical Catholic University of Rio de
Janeiro.
[27]
Isela Macia, Roberta Arcoverde, Alessandro Garcia, Christina Chavez, and Arndt
von Staa. 2012. On the Relevance of Code Anomalies for Identifying Architecture
Degradation Symptoms. Proceedings of the 16th European Conference on Software
Maintenance and Reengineering (2012), 277–286. https://doi.org/10.1109/CSMR.2012.35
[28]
Mehran Mahmoudi, Sarah Nadi, and Nikolaos Tsantalis. 2019. Are Refactorings
to Blame? An Empirical Study of Refactorings in Merge Conflicts. In 2019 IEEE
26th International Conference on Software Analysis, Evolution and Reengineering
(SANER). IEEE, 151–162.
[29]
Leandra Mara, Gustavo Honorato, Francisco Dantas Medeiros, Alessandro Garcia,
and Carlos Lucena. 2011. Hist-Inspect: A Tool for History-Sensitive Detection of
Code Smells. In Proceedings of the 10th International Conference on Aspect-oriented
Software Development Companion (AOSD ’11). ACM, New York, NY, USA, 65–66.
https://doi.org/10.1145/1960314.1960335
[30]
Panita Meananeatra. 2012. Identifying Refactoring Sequences For Improving
Software Maintainability. In Proceedings of the 27th IEEE/ACM International
Conference on Automated Software Engineering. ACM Press, New York, New
York, USA, 406–409. https://doi.org/10.1145/2351676.2351760
[31]
Mohamed Wiem Mkaouer, Marouane Kessentini, Slim Bechikh, Kalyanmoy Deb,
and Mel Ó Cinnéide. 2014. Recommendation system for software refactoring
using innovization and interactive dynamic optimization. In 29th International
Conference on Automated Software Engineering (ASE). 331–336.
[32]
E. Murphy-Hill and A. P. Black. 2008. Refactoring Tools: Fitness for Purpose.
IEEE Software 25, 5 (Sep. 2008), 38–44. https://doi.org/10.1109/MS.2008.123
[33]
E. Murphy-Hill, C. Parnin, and A. P. Black. 2012. How We Refactor, and How We
Know It. IEEE Transactions on Software Engineering 38, 1 (2012), 5–18.
https://doi.org/10.1109/TSE.2011.41
[34]
Mel Ó Cinnéide and Paddy Nixon. 2000. Composite refactorings for Java programs.
In Proceedings of the Workshop on Formal Techniques for Java Programs, co-located
with the 14th European Conference on Object-Oriented Programming (ECOOP).
1–6.
[35]
Willian Nalepa Oizumi, Leonardo da Silva Sousa, Anderson Oliveira, Alessandro
Garcia, O. I. Anne Benedicte Agbachi, Roberto Felicio Oliveira, and Carlos Lucena.
2018. On the identification of design problems in stinky code: experiences and
tool support. J. Braz. Comp. Soc. 24, 1 (2018), 13:1–13:30.
[36]
Willian Nalepa Oizumi, Alessandro F. Garcia, Leonardo da Silva Sousa, Bruno
Barbieri Pontes Cafeo, and Yixue Zhao. 2016. Code anomalies flock together:
exploring code anomaly agglomerations for locating design problems. In Pro-
ceedings of the 38th International Conference on Software Engineering, ICSE 2016,
Austin, TX, USA, May 14-22, 2016, Laura K. Dillon, Willem Visser, and Laurie
Williams (Eds.). ACM, 440–451.
[37]
Mark O’Keeffe and Mel Ó Cinnéide. 2008. Search-based Refactoring: An Empirical
Study. J. Softw. Maint. Evol. 20, 5 (Sept. 2008), 345–364.
https://doi.org/10.1002/smr.v20:5
[38]
Roberto Felicio Oliveira, Leonardo da Silva Sousa, Rafael Maiani de Mello, Natasha
M. Costa Valentim, Adriana Lopes, Tayana Conte, Alessandro F. Garcia, Edson
Cesar Cunha de Oliveira, and Carlos José Pereira de Lucena. 2017. Collaborative
Identification of Code Smells: A Multi-Case Study. In 39th IEEE/ACM International
Conference on Software Engineering: Software Engineering in Practice Track, ICSE-
SEIP 2017, Buenos Aires, Argentina, May 20-28, 2017. IEEE Computer Society,
33–42.
[39]
Roberto Felicio Oliveira, Rafael Maiani de Mello, Eduardo Fernandes, Alessandro
Garcia, and Carlos Lucena. 2020. Collaborative or individual identification of
code smells? On the effectiveness of novice and professional developers. Inf.
Softw. Technol. 120 (2020).
[40]
William F. Opdyke. 1992. Refactoring Object-oriented Frameworks. Ph.D. Disserta-
tion. Champaign, IL, USA. UMI Order No. GAX93-05645.
[41]
Ali Ouni, Marouane Kessentini, Mel Ó Cinnéide, Houari Sahraoui, Kalyanmoy
Deb, and Katsuro Inoue. 2017. MORE: A multi-objective refactoring
recommendation approach to introducing design patterns and fixing code smells. Journal
of Software: Evolution and Process 29, 5 (2017), e1843.
[42]
Ali Ouni, Marouane Kessentini, and Houari Sahraoui. 2013. Search-based refac-
toring using recorded code changes. In 17th European Conference on Software
Maintenance and Reengineering (CSMR). 221–230.
[43] 2020 Replication Package. 2020. https://figshare.com/s/81f7973d07ceb7e4796c.
[44]
Fabio Palomba, Andy Zaidman, Rocco Oliveto, and Andrea De Lucia. 2017. An
exploratory study on the relationship between changes and refactoring. In 2017
IEEE/ACM 25th International Conference on Program Comprehension (ICPC). IEEE,
176–185.
[45]
E. Piveta, J. Araujo, M. Pimenta, A. Moreira, P. Guerreiro, and R. T. Price. 2008.
Searching for Opportunities of Refactoring Sequences: Reducing the Search
Space. In 2008 32nd Annual IEEE International Computer Software and Applications
Conference. 319–326. https://doi.org/10.1109/COMPSAC.2008.119
[46]
K. Prete, N. Rachatasumrit, N. Sudan, and M. Kim. 2010. Template-Based Recon-
struction of Complex Refactorings. In Proceedings of IEEE International Conference
on Software Maintenance. 1–10. https://doi.org/10.1109/ICSM.2010.5609577
[47]
Jacek Ratzinger, Thomas Sigmund, and Harald C Gall. 2008. On The Relation of
Refactorings and Software Defect Prediction. In Proceedings of the International
Workshop on Mining Software Repositories. ACM Press, New York, New York, USA,
35–38. https://doi.org/10.1145/1370750.1370759
[48]
Veselin Raychev, Max Schäfer, Manu Sridharan, and Martin Vechev. 2013. Refac-
toring with synthesis. ACM SIGPLAN Notices 48, 10 (2013), 339–354.
[49]
Danilo Silva, Nikolaos Tsantalis, and Marco Tulio Valente. 2016. Why We Refac-
tor? Confessions of GitHub Contributors. In Proceedings of the 24th ACM SIGSOFT
International Symposium on Foundations of Software Engineering (FSE 2016). ACM,
New York, NY, USA, 858–870. https://doi.org/10.1145/2950290.2950305
[50]
Gábor Szőke, Gábor Antal, Csaba Nagy, Rudolf Ferenc, and Tibor Gyimóthy. 2017.
Empirical study on refactoring large-scale industrial systems and its effects on
maintainability. Journal of Systems and Software 129 (2017), 107–126.
[51]
Nikolaos Tsantalis, Theodoros Chaikalis, and Alexander Chatzigeorgiou. 2018.
Ten years of JDeodorant: Lessons learned from the hunt for smells. In 2018 IEEE
25th International Conference on Software Analysis, Evolution and Reengineering
(SANER). IEEE, 4–14.
[52]
Nikolaos Tsantalis, Matin Mansouri, Laleh M. Eshkevari, Davood Mazinanian, and
Danny Dig. 2018. Accurate and Efficient Refactoring Detection in Commit History.
In Proceedings of the 40th International Conference on Software Engineering (ICSE
’18). ACM, New York, NY, USA, 483–494. https://doi.org/10.1145/3180155.3180206
[53]
Michele Tufano, Fabio Palomba, Gabriele Bavota, Rocco Oliveto, Massimiliano
Di Penta, Andrea De Lucia, and Denys Poshyvanyk. 2015. When and Why Your
Code Starts to Smell Bad. In Proceedings of the 37th International Conference on
Software Engineering (ICSE ’15). IEEE Press, Piscataway, NJ, USA, 403–414.
[54]
Carmine Vassallo, Giovanni Grano, Fabio Palomba, Harald C. Gall, and Alberto
Bacchelli. 2019. A large-scale empirical exploration on refactoring activities in
open source software projects. Science of Computer Programming 180 (2019),
1–15. https://doi.org/10.1016/j.scico.2019.05.002
[55]
Santiago A. Vidal, Willian Nalepa Oizumi, Alessandro Garcia, J. Andres Diaz-Pace,
and Claudia Marcos. 2019. Ranking architecturally critical agglomerations of
code smells. Sci. Comput. Program. 182 (2019), 64–85.
[56]
Aiko Yamashita and Leon Moonen. 2013. Exploring the Impact of Inter-Smell
Relations on Software Maintainability: An Empirical Study. Proceedings of the
International Conference on Software Engineering (2013), 682–691.
https://doi.org/10.1109/ICSE.2013.6606614
[57]
Aiko Yamashita and Leon Moonen. 2013. To What Extent can Maintenance
Problems be Predicted by Code Smell Detection? An Empirical Study. Information
and Software Technology 55, 12 (2013), 2223–2242.
https://doi.org/10.1016/j.infsof.2013.08.002
[58]
Young Seok Yoon and Brad A. Myers. 2015. Supporting Selective Undo in a Code
Editor. In Proceedings of the 37th International Conference on Software Engineering
- Volume 1 (ICSE ’15). IEEE Press, Piscataway, NJ, USA, 223–233.
http://dl.acm.org/citation.cfm?id=2818754.2818784
[59]
Trevor J. Young. 2005. Using AspectJ to build a software product line for mobile
devices. Ph.D. Dissertation. https://doi.org/10.14288/1.0051632